CN112948155A - Model training method, state prediction method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112948155A
CN112948155A (application number CN201911268312.XA)
Authority
CN
China
Prior art keywords
prediction model
sample
energy
output sequence
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911268312.XA
Other languages
Chinese (zh)
Other versions
CN112948155B (en)
Inventor
叶尧罡
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201911268312.XA priority Critical patent/CN112948155B/en
Publication of CN112948155A publication Critical patent/CN112948155A/en
Application granted granted Critical
Publication of CN112948155B publication Critical patent/CN112948155B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766Error or fault reporting or storing
    • G06F11/0775Content or structure details of the error report, e.g. specific table structure, specific error fields
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079Root cause analysis, i.e. error or fault diagnosis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The embodiments of the present application disclose a system anomaly prediction model training method, a system state prediction method, a device, equipment and a storage medium, wherein the method includes the following steps: obtaining sample features; initializing the system anomaly prediction model according to set weight parameters; processing the sample features through the system anomaly prediction model to obtain a predicted energy; constructing an objective function based on the predicted energy; and, in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function. In this way, abnormal conditions of the system operation state can be predicted.

Description

Model training method, state prediction method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a system anomaly prediction model training method, a system state prediction method, an apparatus, a device, and a computer storage medium.
Background
With the development of big data technology, distributed computing clusters keep growing in size, and the complexity of distributed systems keeps increasing. Because a distributed cluster contains a large number of nodes, the probability that the cluster as a whole experiences a failure is considerable even though each individual node fails with low probability; for example, a large cluster is likely to have a hard disk fail every day. Such failures greatly affect the stable operation of large, complex systems and may even cause serious consequences. Therefore, for a large-scale system, faults must be discovered and resolved as soon as possible to effectively reduce the losses they cause, and predicting abnormal conditions of the system operation state has thus become a difficult problem in keeping such systems running normally.
Disclosure of Invention
The embodiment of the application provides a system anomaly prediction model training method, a system state prediction method, a device, equipment and a computer storage medium, which can predict the anomaly condition of the system operation state.
In order to achieve the above purpose, the technical solution of the embodiment of the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides a system anomaly prediction model training method, where the method includes:
obtaining sample characteristics;
initializing the system anomaly prediction model according to the set weight parameters;
processing the sample characteristics through the system anomaly prediction model to obtain predicted energy;
constructing an objective function based on the predicted energy;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
In some embodiments, the obtaining sample features comprises:
acquiring a log sequence, and processing the log sequence to obtain a vector sequence;
and performing feature extraction on the vector sequence to obtain sample features.
In some embodiments, the processing the sample features through the system anomaly prediction model to obtain the predicted energy includes:
obtaining a first output sequence based on the sample characteristics and a first decoder of a first submodel of the system anomaly prediction model;
obtaining a second output sequence based on the sample characteristics and a second decoder of a first submodel of the system anomaly prediction model;
and acquiring a third output sequence based on the first output sequence, the second output sequence and the sample characteristics.
In some embodiments, said obtaining a third output sequence based on said first output sequence, said second output sequence, and said sample features comprises:
determining a first reconstruction error based on the first output sequence and the sample features;
determining a second reconstruction error based on the second output sequence and the sample features;
and splicing the first reconstruction error, the second reconstruction error and the hidden space vector of the last time step of the encoder in the first submodel to obtain a third output sequence.
In some embodiments, the processing the sample features through the system anomaly prediction model to obtain the predicted energy includes:
clustering the third output sequence by using a second sub-model of the system anomaly prediction model to obtain K clusters, wherein K is a positive integer;
determining a mean and a covariance of the mth cluster based on samples within the mth cluster;
based on the mean and covariance, a predicted energy for the sample is determined.
In some embodiments, said determining a mean and covariance of said mth cluster based on samples within said mth cluster comprises:
estimating the third output sequence by using the second submodel of the system anomaly prediction model, and determining the probability that each sample in the third output sequence belongs to each distribution;
determining the mean and covariance of the mth cluster based on the samples within the mth cluster and the probability that each sample belongs to each distribution.
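The soft-assignment statistics described above can be sketched as follows; this is a minimal illustration under assumed names, not the patent's implementation. The mean and covariance of the mth cluster are estimated with the membership probabilities as weights:

```python
import numpy as np

def cluster_stats(Z, gamma_m):
    """Probability-weighted mean and covariance of one cluster.
    Z: (N, d) samples of the third output sequence; gamma_m: (N,)
    probability that each sample belongs to the mth distribution."""
    w = gamma_m / gamma_m.sum()          # normalized membership weights
    mu = w @ Z                           # weighted mean
    diff = Z - mu
    cov = (w[:, None] * diff).T @ diff   # weighted covariance
    return mu, cov
```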
In some embodiments, said constructing an objective function based on said predicted energy comprises:
determining a reconstruction loss based on the first output sequence, the second output sequence, and the sample characteristics;
and constructing an objective function based on the reconstruction loss and the predicted energy.
In a second aspect, an embodiment of the present application provides a method for predicting a system state, where the method includes:
acquiring sample characteristics of a system;
determining energy corresponding to the sample characteristics based on a system anomaly prediction model;
determining a state of the system based on the energy.
In some embodiments, determining a state of the system based on the energy comprises:
determining that the system is abnormal when the predicted energy is greater than a preset energy threshold;
or, determining that the system is normal when the predicted energy is less than or equal to the energy threshold.
In some embodiments, before determining the energy corresponding to the sample feature based on the system anomaly prediction model, the method further comprises:
acquiring training sample characteristics;
initializing the system anomaly prediction model according to the set weight parameters;
processing the training sample characteristics through the system anomaly prediction model to obtain the prediction energy of the training sample;
constructing an objective function based on the predicted energies of the training samples;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
In a third aspect, an embodiment of the present application provides a system anomaly prediction model training apparatus, which includes a first obtaining module, an initializing module, a first processing module, a constructing module, and an updating module, wherein,
the first obtaining module is used for obtaining sample characteristics;
the initialization module is used for initializing the system anomaly prediction model according to the set weight parameters;
the first processing module is used for processing the sample characteristics through the system abnormity prediction model to obtain predicted energy;
the construction module is used for constructing an objective function based on the predicted energy;
and the updating module is used for updating the weight parameters of the system anomaly prediction model through the objective function in back propagation.
In a fourth aspect, embodiments of the present application provide a system state prediction apparatus, which includes a second obtaining module, an energy determination module, and a state determination module, wherein,
the second acquisition module is used for acquiring the sample characteristics of the system;
the energy determination module is used for determining the energy corresponding to the sample characteristics based on a system anomaly prediction model;
the state determination module is configured to determine a state of the system based on the energy.
In a fifth aspect, embodiments of the present application provide an apparatus, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the steps of the method for training a system anomaly prediction model provided in any embodiment of the present application, and/or the steps of the method for predicting a system state provided in any embodiment of the present application.
In a sixth aspect, embodiments of the present application provide a computer storage medium, where a system anomaly prediction model training program and/or a system state prediction program are stored on the computer storage medium, where the system anomaly prediction model training program, when executed by a processor, implements the steps of the system anomaly prediction model training method provided in any of the embodiments of the present application, and the system state prediction program, when executed by the processor, implements the steps of the system state prediction method provided in any of the embodiments of the present application.
The system anomaly prediction model training method provided by this embodiment includes: obtaining sample features; initializing the system anomaly prediction model according to set weight parameters; processing the sample features through the system anomaly prediction model to obtain predicted energy; constructing an objective function based on the predicted energy; and, in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function. In this way, an end-to-end system anomaly prediction model is established, and because the objective function is constructed from the predicted energy, the model can be trained with unlabeled samples. This removes the manual labeling step, avoids the problem that a large number of positive samples are difficult to obtain (which would otherwise hurt model performance), and thus makes a large number of training samples available while preserving model performance. Meanwhile, back propagation with the objective function jointly optimizes the system anomaly prediction model, achieving optimal prediction performance.
Drawings
FIG. 1 is a schematic processing flow diagram illustrating a system anomaly prediction model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic processing flow diagram illustrating a system state prediction method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a structure of a system anomaly prediction model according to an embodiment of the present application;
FIG. 4 is a diagram illustrating a structure of a log sub-sequence according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating an exemplary system anomaly prediction model training apparatus according to the present application;
FIG. 6 is a schematic diagram of a system state prediction apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a system anomaly prediction model training device according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a system state prediction device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the following will describe the specific technical solutions of the present application in further detail with reference to the accompanying drawings in the embodiments of the present application. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Before describing the system anomaly prediction model training method provided by the embodiment of the present application in detail, the technology related to the present application will be briefly introduced.
A log file is a record file or collection of files used to record system operational events and plays an important role in handling historical data, tracking diagnostic issues, and understanding the activities of the system. At present, a programmer utilizes a log file to locate problems when developing a program, and an operation and maintenance worker utilizes the log file to locate problems when a system has a fault, wherein the log file is composed of a plurality of logs.
Because log files reflect the current state of the system in a timely manner, existing methods detect anomalies in the system's operating state by analyzing log data. Specifically, each log is split into individual words using natural language processing techniques, converted into a word list, and optionally further encoded; the encoding is usually implemented with TF-IDF (term frequency-inverse document frequency), BOW (Bag-of-Words) or word2vec (word to vector). Based on the encoded log vectors, anomalies are then detected by clustering or by time-series analysis. However, such anomaly detection methods can only detect that the system is currently abnormal, and may even detect the anomaly only after the system has been in an abnormal operating state for some time; they therefore cannot guarantee high stability and high availability of the system.
In one aspect, an embodiment of the present application provides a method for training a system anomaly prediction model. Referring to fig. 1, the method includes:
step 101, obtaining sample characteristics.
Here, the system anomaly prediction model training device encodes the samples for training, and obtains the sample characteristics that can be input into the system anomaly prediction model, that is, converts the sample format into a normalized format, so as to facilitate training of the system anomaly prediction model.
For example, the system anomaly prediction model training device encodes each sample used for training into a fixed-length vector and inputs the fixed-length vector into the system anomaly prediction model as the sample features.
Step 102, initializing the system anomaly prediction model according to the set weight parameters.
Here, the system anomaly prediction model training device sets the weight parameters of the system anomaly prediction model in advance and initializes the model based on the set weight parameters; the accuracy of the freshly initialized system anomaly prediction model is not yet high.
Step 103, processing the sample characteristics through the system anomaly prediction model to obtain predicted energy.
Here, the system anomaly prediction model training device inputs the sample features into the system anomaly prediction model, and the predicted energy of the samples is obtained through the processing of the system anomaly prediction model.
For example, the system anomaly prediction model captures sequential information of sample characteristics by using a recurrent neural network, and then performs parameter estimation on the sample by using a Gaussian model, so as to obtain prediction energy.
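The notion of "energy" can be illustrated with a minimal sketch. The single-Gaussian simplification and all names below are assumptions for demonstration, not the patent's implementation: a sample's energy is taken as its negative log-likelihood under a Gaussian estimated from normal data, so anomalous samples receive higher energy.

```python
import numpy as np

def gaussian_energy(z, mu, cov):
    """Energy of sample z under N(mu, cov): the negative log-likelihood.
    Higher energy means the sample is less likely, i.e. more anomalous."""
    d = z.shape[-1]
    diff = z - mu
    mahalanobis = diff @ np.linalg.inv(cov) @ diff
    log_det = np.linalg.slogdet(cov)[1]
    return 0.5 * (mahalanobis + d * np.log(2.0 * np.pi) + log_det)

# Estimate mu and cov from "normal" samples, then score new points.
rng = np.random.default_rng(0)
normal_samples = rng.standard_normal((500, 2))
mu = normal_samples.mean(axis=0)
cov = np.cov(normal_samples, rowvar=False)

e_typical = gaussian_energy(np.array([0.0, 0.0]), mu, cov)
e_outlier = gaussian_energy(np.array([8.0, 8.0]), mu, cov)
assert e_outlier > e_typical  # outliers receive higher energy
```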
Step 104, constructing an objective function based on the predicted energy.
Here, since the system anomaly prediction model training device constructs the objective function based on the predicted energy, the samples used for training the system anomaly prediction model may be unlabeled samples; the training of the system anomaly prediction model therefore belongs to unsupervised learning.
Step 105, updating the weight parameters of the system anomaly prediction model through the objective function in back propagation.
The system anomaly prediction model training device uses the objective function to continuously correct the weight parameters of the system anomaly prediction model during back propagation, so that the reconstruction loss of the model becomes smaller and smaller until convergence is reached.
In the above implementation, the obtained sample features are input into the system anomaly prediction model to obtain the predicted energy, an objective function is constructed based on the predicted energy, and the weight parameters of the system anomaly prediction model are updated during back propagation using the objective function. In this way, an end-to-end system anomaly prediction model is established, and because the objective function is constructed from the predicted energy, the model can be trained with unlabeled samples. This removes the manual labeling step, avoids the problem that a large number of positive samples are difficult to obtain (which would otherwise hurt model performance), and thus makes a large number of training samples available while preserving model performance. Meanwhile, back propagation with the objective function jointly optimizes the system anomaly prediction model, achieving optimal prediction performance.
In some embodiments, the step 101 of obtaining sample features includes:
the system abnormity prediction model training device obtains a log sequence, and the log sequence is processed to obtain a vector sequence.
Here, the samples are log sequences, the log sequences usually include various types of variables, and the system anomaly prediction model is directly trained by using the log sequences, which introduces noise interference caused by the variables in the log sequences, thereby reducing the accuracy of the system anomaly prediction model. Therefore, a log sequence needs to be processed to convert the log sequence into a vector sequence.
Wherein the processing the log sequence comprises preprocessing the log sequence and encoding the preprocessed log sequence. Specifically, the preprocessing the log sequence includes cleaning variables in the log sequence, for example, replacing the variable values of the log sequence by using a regular expression matching method based on a unique format and artificial experience of the log, so that each log in the log sequence is converted into a word list, and encoding each log by using an integer encoding method based on a natural language processing idea to obtain a vector sequence.
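This preprocessing step can be sketched as follows; the regular expressions, placeholder names and helper functions here are illustrative assumptions, not the patterns used by the patent. Variable fields are masked so only the constant log template remains, then each token is integer-encoded:

```python
import re

def log_to_tokens(line):
    # Mask variable fields (IPs, hex ids, numbers) with placeholders so
    # that only the constant log template remains.
    line = re.sub(r"\d+\.\d+\.\d+\.\d+", "<IP>", line)
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)
    line = re.sub(r"\d+", "<NUM>", line)
    return line.split()

def build_vocab(token_lists):
    # Assign each distinct token a positive integer code.
    vocab = {}
    for tokens in token_lists:
        for t in tokens:
            vocab.setdefault(t, len(vocab) + 1)
    return vocab

def encode(tokens, vocab):
    # 0 is reserved for tokens unseen at vocabulary-building time.
    return [vocab.get(t, 0) for t in tokens]
```

A log line such as `"Received block blk_123 of size 67108864 from 10.0.0.1"` is thus reduced to its template tokens before encoding, which removes the variable-induced noise described above.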
And the system anomaly prediction model training device extracts the characteristics of the vector sequence to obtain the characteristics of the sample.
Here, the system anomaly prediction model training device performs feature extraction on the vector sequence using a neural network to obtain the sample features. The neural network is trained in advance on massive amounts of data through a word embedding model.
In the above embodiment, when the sample is a log sequence, the log sequence needs to be processed to obtain a vector sequence, and feature extraction is performed on the vector sequence, so as to obtain sample features. Therefore, the log sequence is processed first, noise caused by variables in the log can be avoided when model training is carried out, and accuracy of the system abnormity prediction model is improved.
In some embodiments, the step 103 of processing the sample features through the system anomaly prediction model to obtain the predicted energy includes:
obtaining a first output sequence based on the sample characteristics and a first decoder of a first submodel of the system anomaly prediction model;
obtaining a second output sequence based on the sample characteristics and a second decoder of a first submodel of the system anomaly prediction model;
and acquiring a third output sequence based on the first output sequence, the second output sequence and the sample characteristics.
Here, the first submodel of the system anomaly prediction model includes an encoder and two decoders. The encoder of the first submodel is composed of unidirectional stacked GRUs (Gated Recurrent Units), so the first submodel can capture the sequential information of the samples. The first decoder is a normal decoder, and the second decoder is a decoder with an attention mechanism.
The encoder of the first submodel may include one or more coding layers. For example, the sample feature X = (x_1, x_2, …, x_n) obtained in step 101 is input to the encoder of the first submodel 302; after passing through the two stacked unidirectional GRU layers, a first hidden space vector h^1 = (h^1_1, h^1_2, …, h^1_n) of the first coding layer and a second hidden space vector h^2 = (h^2_1, h^2_2, …, h^2_n) of the second coding layer are obtained. The first hidden space vector h^1 and the second hidden space vector h^2 are concatenated to obtain the hidden space vector H = (h_1, h_2, …, h_n), where h_i = Concat(h^1_i, h^2_i), Concat denotes the concatenation function, and n is the number of time steps of the GRU.
The sample feature X obtained in step 101 is input to the first submodel, and the process of obtaining the third output sequence may be represented as:
z=Attentive_GRU(X) (1)
where z denotes the third output sequence and Attentive _ GRU denotes the non-linear transformation function of the first submodel.
Here, obtaining the first output sequence based on the sample features and the first decoder of the first submodel of the system anomaly prediction model includes: determining the current output of the first decoder according to the historical hidden space vector of the first decoder of the first submodel and the historical output of the first decoder.
For example, at time t, the hidden state s^1_{t-1} of the first decoder at the previous time step and the output y^1_{t-1} of the first decoder at the previous time step are input to the first decoder to obtain the output y^1_t at the current time step. This process can be expressed as y^1_t = GRU1(s^1_{t-1}, y^1_{t-1}), where GRU1 is the nonlinear transformation function of the first decoder and s_0 = h_n, with h_n the hidden space vector at the last time step of the encoder. In this way, the sample features are input into the first submodel, and the first output sequence Y^1 = (y^1_1, y^1_2, …, y^1_n) is obtained through the first decoder.
The update process of y^1_t = GRU1(s^1_{t-1}, y^1_{t-1}) is:

z^1_t = σ(W^1_z y^1_{t-1} + U^1_z s^1_{t-1}) (2)
r^1_t = σ(W^1_r y^1_{t-1} + U^1_r s^1_{t-1}) (3)
s̃^1_t = tanh(W^1_s y^1_{t-1} + U^1_s (r^1_t ⊙ s^1_{t-1})) (4)
s^1_t = (1 − z^1_t) ⊙ s^1_{t-1} + z^1_t ⊙ s̃^1_t (5)
y^1_t = σ(W^1_o · s^1_t) (6)

where W^1_z, W^1_r, W^1_s, W^1_o, U^1_z, U^1_r and U^1_s are the associated weight matrices, z^1_t is the update gate, r^1_t is the reset gate, s̃^1_t is the candidate hidden state, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, and ⊙ denotes element-wise multiplication.
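The update equations (2) to (6) can be sketched as a plain NumPy GRU decoder step; the class name and the random weight initialization below are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUDecoderCell:
    """One GRU decoder step following equations (2)-(6): update gate,
    reset gate, candidate state, new hidden state, and output."""
    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        init = lambda: rng.standard_normal((dim, dim)) * 0.1
        self.Wz, self.Uz = init(), init()
        self.Wr, self.Ur = init(), init()
        self.Ws, self.Us = init(), init()
        self.Wo = init()

    def step(self, y_prev, s_prev):
        z = sigmoid(self.Wz @ y_prev + self.Uz @ s_prev)              # (2) update gate
        r = sigmoid(self.Wr @ y_prev + self.Ur @ s_prev)              # (3) reset gate
        s_tilde = np.tanh(self.Ws @ y_prev + self.Us @ (r * s_prev))  # (4) candidate state
        s = (1 - z) * s_prev + z * s_tilde                            # (5) new hidden state
        y = sigmoid(self.Wo @ s)                                      # (6) output
        return y, s
```

Iterating `step` over n time steps, feeding each output and state back in, yields the first output sequence described above.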
Here, obtaining the second output sequence based on the sample features and the second decoder of the first submodel 302 of the system anomaly prediction model includes: determining the current output of the second decoder from the hidden space vector of the encoder, the historical hidden space vector of the second decoder, and the historical output of the second decoder.
For example, at time t, the hidden space vector of the encoder is passed through the attention module to obtain the weighted sum c_t of the attention mechanism. The weighted sum c_t, the hidden state s^2_{t-1} of the second decoder at the previous time step, and the output y^2_{t-1} of the second decoder at the previous time step are input to the second decoder to obtain the output y^2_t at the current time step. This process can be expressed as y^2_t = GRU2(s^2_{t-1}, y^2_{t-1}, c_t), where GRU2 is the nonlinear transformation function of the second decoder. In this way, the sample features are input into the first submodel 302, and the second output sequence Y^2 = (y^2_1, y^2_2, …, y^2_n) is obtained through the second decoder.
The weighted sum c_t of the attention mechanism is obtained as follows:

α_{tj} = exp(e_{tj}) / Σ_k exp(e_{tk}) (7)
c_t = Σ_j α_{tj} h_j (8)

Here, e_{tj} represents the correlation between the hidden space vector s^2_{t-1} of the second decoder and the encoder hidden space vector h_j, and α_{tj} is a weight coefficient representing the importance of the encoder hidden space vector h_j.
Wherein $y_t^2=\mathrm{GRU}_2(s_{t-1}^2,y_{t-1}^2,c_t)$, and the updating process of $y_t^2$ comprises the following steps:

$z_t^2=\sigma(W_z^2 y_{t-1}^2+U_z^2 s_{t-1}^2+C_z c_t)$ (9)

$r_t^2=\sigma(W_r^2 y_{t-1}^2+U_r^2 s_{t-1}^2+C_r c_t)$ (10)

$\tilde{s}_t^2=\tanh\!\left(W_s^2 y_{t-1}^2+U_s^2(r_t^2\odot s_{t-1}^2)+C_s c_t\right)$ (11)

$s_t^2=(1-z_t^2)\odot s_{t-1}^2+z_t^2\odot\tilde{s}_t^2$ (12)

$y_t^2=\sigma(W_o^2\cdot s_t^2)$ (13)

wherein $W_z^2$, $W_r^2$, $W_s^2$, $W_o^2$, $U_z^2$, $U_r^2$, $U_s^2$, $C_z$, $C_r$ and $C_s$ are the related weight matrices, $z_t^2$ is the update gate, $r_t^2$ is the reset gate, $\tilde{s}_t^2$ is the candidate hidden state, $\tanh$ represents the Tanh function, $\sigma$ represents the Sigmoid function, and $\odot$ represents element-wise multiplication of matrix elements.
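The attention weighted sum $c_t$ and the context-conditioned GRU update described above can be sketched in NumPy as follows. The additive score form $e_{tj}=v_a^{T}\tanh(W_a s_{t-1}^2+U_a h_j)$ is an assumption for illustration; the text only states that $e_{tj}$ measures the correlation between $s_{t-1}^2$ and $h_j$:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(s_prev, H, Wa, Ua, va):
    """Score each encoder hidden vector h_j against the decoder state,
    softmax the scores into weights alpha_tj, and return their weighted
    sum c_t together with the weights. H is (n, d); the additive score
    parameters Wa, Ua, va are hypothetical."""
    e = np.array([va @ np.tanh(Wa @ s_prev + Ua @ h) for h in H])
    alpha = softmax(e)
    return alpha @ H, alpha

def gru2_step(y_prev, s_prev, c, W, U, C, Wo):
    """One step of the attention decoder: the context c_t enters every
    gate through the C matrices."""
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    z = sig(W['z'] @ y_prev + U['z'] @ s_prev + C['z'] @ c)          # update gate
    r = sig(W['r'] @ y_prev + U['r'] @ s_prev + C['r'] @ c)          # reset gate
    s_tilde = np.tanh(W['s'] @ y_prev + U['s'] @ (r * s_prev) + C['s'] @ c)
    s = (1 - z) * s_prev + z * s_tilde                               # new hidden state
    y = sig(Wo @ s)                                                  # decoder output
    return y, s
```

The attention weights always sum to one, so $c_t$ is a convex combination of the encoder hidden vectors.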
In the above embodiment, the first sub-model includes a normal decoder and a decoder with an attention mechanism. By introducing the attention mechanism, different parts of the input sequence are given different importance, so that the correlation information of the samples in time sequence is captured more effectively, and the prediction accuracy of the system anomaly prediction model is improved.
In some embodiments, said obtaining a third output sequence based on said first output sequence, said second output sequence, and said sample features comprises:
determining a first reconstruction error based on the first output sequence and the sample features;
determining a second reconstruction error based on the second output sequence and the sample features;
and splicing the first reconstruction error, the second reconstruction error and the hidden space vector of the last time step of the encoder in the first submodel to obtain a third output sequence.
Here, the first reconstruction error $e_1$ is calculated as:

$e_1=\dfrac{1}{D}\lVert X-Y_1\rVert_2$ (14)

wherein $Y_1$ is the first output sequence output by the first decoder, $X$ represents the sample features input into the first submodel, and $D$ represents the dimension of $X$.
Here, the second reconstruction error $e_2$ is calculated as:

$e_2=\dfrac{1}{D}\lVert X-Y_2\rVert_2$ (15)

wherein $Y_2$ is the second output sequence output by the second decoder, $X$ represents the sample features input into the first submodel, and $D$ represents the dimension of $X$.
Here, the first reconstruction error $e_1$, the second reconstruction error $e_2$ and the hidden space vector $h_n$ of the last time step of the encoder in the first submodel are spliced to obtain a third output sequence $z=[h_n,e_1,e_2]$. The third output sequence is the output sequence of the first sub-model and serves as the input sequence of the second sub-model.
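A minimal sketch of assembling the third output sequence, assuming the reconstruction errors are the 1/D-normalized 2-norms of the residuals (D being the dimension of X, as stated in the text):

```python
import numpy as np

def third_output(X, Y1, Y2, h_n):
    """Build z = [h_n, e1, e2]: the two decoders' dimension-normalized
    reconstruction errors concatenated with the encoder's last hidden
    space vector h_n."""
    D = X.size
    e1 = np.linalg.norm(X - Y1) / D   # first decoder's reconstruction error
    e2 = np.linalg.norm(X - Y2) / D   # second decoder's reconstruction error
    return np.concatenate([h_n, [e1, e2]])
```

The resulting vector is what the second sub-model receives, so its dimension is that of $h_n$ plus two.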
In some embodiments, the step 103 of processing the sample features through the system anomaly prediction model to obtain the predicted energy includes:
the system anomaly prediction model training device clusters the third output sequence by using a second submodel of the system anomaly prediction model to obtain K clusters, wherein K is a positive integer;
determining a mean and a covariance of the mth cluster based on samples within the mth cluster, 0< m ≦ K;
based on the mean and covariance, a predicted energy for the sample is determined.
Here, the second submodel is used to perform parameter estimation on the sample characteristics, and the predicted energy of the sample is determined according to the estimated parameters.
For example, the second submodel includes a Gaussian mixture model (GMM) based on K-means clustering and an estimation network, and the second submodel is used to determine the parameters of the Gaussian distributions corresponding to the third output sequence, thereby determining the predicted energy of the sample.
In some embodiments, said determining a mean and covariance of said mth cluster based on samples within said mth cluster comprises:
estimating the third output sequence by using the second submodel of the system anomaly prediction model, and determining the probability that the samples in the third output sequence belong to each distribution;
determining a mean and covariance of the mth cluster based on samples within the mth cluster and a probability that the samples belong to each distribution.
Here, the third output sequence is estimated by using the estimation network of the second submodel of the system anomaly prediction model, and the probability $\hat{\gamma}$ of the third output sequence belonging to each distribution in the GMM is determined.

The estimation network is a multilayer fully-connected neural network, and the last layer of the neural network is a normalization function layer. When the probability that the third output sequence belongs to each distribution in the GMM is estimated by using the multilayer fully-connected neural network, estimating the probability $\hat{\gamma}$ that the third output sequence belongs to each distribution is modeled as a multi-classification problem, and the probability of each distribution is obtained by the multilayer fully-connected neural network:

$\hat{\gamma}=\mathrm{softmax}\!\left(\mathrm{MLP}(z)\right)$ (16)
the MLP is a nonlinear function of a multilayer fully-connected neural network, the number of neurons in an input layer of the fully-connected neural network is the same as the z dimension, the number of neurons in an output layer is K, an activation function is a normalization function, and the value of K is obtained by observing the distribution characteristics of a log sequence. For example, if the log sequence is divided into a normal log sequence and an abnormal log sequence, then K is taken to be 2.
It should be noted that the number of neurons in the output layer is equal to the number of clusters after clustering, and the samples in each cluster after clustering correspond to one distribution output by the multilayer fully-connected neural network, that is, the mth cluster corresponds to the mth distribution.
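The estimation network described above (a multilayer fully-connected network whose last layer is a normalization layer with K units) can be sketched as follows. The tanh hidden activation and the layer sizes in the test are assumptions, since the text only fixes the input width (the dimension of z) and the K output units:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def estimation_network(z, weights, biases):
    """Minimal multilayer fully-connected estimation network: tanh hidden
    layers followed by a softmax output layer whose K units give the
    membership probabilities of z in each of the K distributions."""
    h = z
    for Wl, bl in zip(weights[:-1], biases[:-1]):
        h = np.tanh(Wl @ h + bl)                  # hidden layers (assumed tanh)
    return softmax(weights[-1] @ h + biases[-1])  # normalization output layer
```

Because of the softmax layer, the output is a proper probability vector over the K distributions.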
Determining a mean and covariance of the mth cluster based on samples within the mth cluster and a probability that the samples belong to each distribution.
The probability $\gamma$ that each sample in the third output sequence belongs to each distribution is determined by using the estimation network. Based on the samples in the mth cluster and the probability that the samples in the mth cluster belong to the mth distribution, the mean $\hat{\mu}_m$ and covariance $\hat{\Sigma}_m$ of the mth cluster are determined according to the calculation method of the Gaussian mixture model parameters, for example:

$\hat{\mu}_m=\dfrac{\sum_{i}\gamma_{im}z'_i}{\sum_{i}\gamma_{im}}$ (17)

$\hat{\Sigma}_m=\dfrac{\sum_{i}\gamma_{im}(z'_i-\hat{\mu}_m)(z'_i-\hat{\mu}_m)^{T}}{\sum_{i}\gamma_{im}}$ (18)

wherein $\gamma_{im}$ represents the probability that the clustered intra-cluster sample $z'_i$ belongs to the mth distribution, $\hat{\mu}_m$ represents the mean of the mth cluster, and $\hat{\Sigma}_m$ represents the covariance of the mth cluster.
Here, the determining the predicted energy of the sample based on the mean and covariance comprises: determining the predicted energy of the sample based on the probability, mean and covariance of the K distributions. Specifically, according to the calculation method of the Gaussian distribution, the probabilities that all samples belong to the mth distribution are averaged to obtain the probability $\hat{\phi}_m$ of the mth distribution, and the formula of $\hat{\phi}_m$ is:

$\hat{\phi}_m=\dfrac{1}{W}\sum_{i=1}^{W}\gamma_{im}$ (19)

wherein $W$ is the number of samples input to the second submodel of the system anomaly prediction model.
Here, from the parameters obtained by the estimation, namely the probability $\hat{\phi}_m$ of each distribution, the mean $\hat{\mu}_m$ and the covariance $\hat{\Sigma}_m$, the predicted energy $E(z)$ of each sample may be calculated as follows:

$E(z)=-\log\!\left(\sum_{m=1}^{K}\hat{\phi}_m\dfrac{\exp\!\left(-\frac{1}{2}(z-\hat{\mu}_m)^{T}\hat{\Sigma}_m^{-1}(z-\hat{\mu}_m)\right)}{\sqrt{\lvert 2\pi\hat{\Sigma}_m\rvert}}\right)$ (20)
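A compact NumPy sketch of the Gaussian-mixture parameter estimation and per-sample energy described above. The small ridge term added to each covariance is an implementation detail to keep the matrices invertible, not something stated in the text:

```python
import numpy as np

def gmm_parameters(Z, gamma):
    """Estimate mixture probabilities phi_m, means mu_m and covariances
    Sigma_m from W samples Z (W x d) and membership probabilities
    gamma (W x K)."""
    Wn = gamma.sum(axis=0)                 # soft sample counts per distribution
    phi = Wn / Z.shape[0]                  # averaged membership probabilities
    mu = (gamma.T @ Z) / Wn[:, None]       # weighted means
    Sigma = []
    for m in range(gamma.shape[1]):
        diff = Z - mu[m]
        outer = np.einsum('ni,nj->nij', diff, diff)
        Sigma.append((gamma[:, m, None, None] * outer).sum(0) / Wn[m])
    return phi, mu, np.array(Sigma)

def sample_energy(z, phi, mu, Sigma, eps=1e-6):
    """Negative log of the GMM density at z; eps ridges the covariances
    and guards the log."""
    d = z.size
    dens = 0.0
    for m in range(len(phi)):
        S = Sigma[m] + eps * np.eye(d)
        diff = z - mu[m]
        quad = diff @ np.linalg.solve(S, diff)
        dens += phi[m] * np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(2 * np.pi * S))
    return -np.log(dens + eps)
```

A sample close to a cluster mean gets low energy, while a sample far from all clusters gets high energy, which is what the anomaly decision relies on.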
In the above embodiment, parameter estimation is performed on the samples by using the estimation network and the clustering model in the second sub-model, so that the estimated sample distribution is more accurate, and no labeled samples are required to participate in training.
In some embodiments, the step 104 of constructing an objective function based on the predicted energy comprises:
determining a reconstruction loss based on the first output sequence, the second output sequence, and the sample characteristics;
and constructing an objective function based on the reconstruction loss and the predicted energy.
In the first sub-model, the performance of the encoder and the decoders is described by the first reconstruction error and the second reconstruction error. Taking $W$ samples as the training samples for training the system anomaly prediction model, the formula of the reconstruction loss $L_{rc}$ is:

$L_{rc}=\dfrac{1}{W}\sum_{i=1}^{W}\left(\lVert x_i-y_i^1\rVert_2+\lVert x_i-y_i^2\rVert_2\right)$ (21)

wherein $\lVert\cdot\rVert_2$ denotes the 2-norm, $y_i^1$ represents the result of the first decoder reconstructing $x_i$, and $y_i^2$ represents the result of the second decoder reconstructing $x_i$.
And constructing an objective function based on the reconstruction loss and the predicted energy.
Here, the objective function $L$ is:

$L=L_{rc}+\dfrac{\lambda_1}{W}\sum_{i=1}^{W}E(z_i)$ (22)

wherein $\lambda_1$ is a hyper-parameter, and $\sum_{i=1}^{W}E(z_i)$ is the total predicted energy of the $W$ samples.
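The objective, combining the mean reconstruction loss of the two decoders with the mean sample energy, can be sketched as follows; the default value of the hyper-parameter lam1 is illustrative:

```python
import numpy as np

def objective(X, Y1, Y2, energies, lam1=0.1):
    """Mean per-sample reconstruction loss of both decoders plus lam1
    times the mean predicted energy over the W training samples."""
    W = len(X)
    L_rc = sum(np.linalg.norm(x - y1) + np.linalg.norm(x - y2)
               for x, y1, y2 in zip(X, Y1, Y2)) / W
    return L_rc + lam1 * np.mean(energies)
```

With perfect reconstruction and zero energies the objective is zero, and either term pushes it up independently.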
In the embodiment, in the system anomaly prediction model training, the reconstruction loss and the sample energy are introduced into the objective function to perform model training, and the required training samples are label-free data and belong to unsupervised learning.
In another aspect of the present invention, a method for predicting a system status is provided, please refer to fig. 2, where the method includes:
step 201, a system state prediction device acquires sample characteristics of a system;
step 202, the system state prediction device determines energy corresponding to the sample characteristics based on a system abnormity prediction model;
in step 203, the system state prediction device determines the state of the system based on the energy.
Here, the system state prediction means determines that the system is abnormal in a case where the energy is greater than a preset energy threshold, or determines that the system is normal in a case where the energy is less than or equal to the energy threshold.
For example, a sample with energy $E(z)>\theta$ is determined to be an abnormal sample, and the system state prediction device determines that an anomaly will occur during the operation of the system.
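The prediction-phase decision described above reduces to a threshold comparison, sketched here with theta standing for the preset energy threshold:

```python
def predict_state(energy, theta):
    """Energy strictly above the preset threshold theta flags the sample
    as abnormal; otherwise the system is considered normal."""
    return "abnormal" if energy > theta else "normal"
```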
In some embodiments, before said determining the energy corresponding to the sample feature based on the system anomaly prediction model in step 202, the method further comprises:
the system abnormity prediction obtains training sample characteristics;
initializing the system abnormity prediction model according to the set weight parameters;
processing the training sample characteristics through the system anomaly prediction model to obtain the prediction energy of the training sample;
constructing an objective function based on the predicted energies of the training samples;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
In the above embodiment, when the system state prediction apparatus performs prediction by using a trained system anomaly prediction model, it is determined whether an anomaly of the system is about to occur by calculating the energy of the sample, and comparing the energy of the sample with a preset energy threshold, so as to perform unsupervised anomaly prediction.
In another aspect of the embodiment of the present application, another method for training a system anomaly prediction model is provided, so as to further understand the model training method provided in the embodiment of the present application. The description takes as an example samples drawn from the log data generated by each component of a Hadoop system over a period of time, the log data being composed of a plurality of logs. Referring to fig. 3, the system anomaly prediction model includes a first sub-model 302 and a second sub-model 303, and the sample features are processed based on the first sub-model 302 and the second sub-model 303 to obtain the predicted energy, wherein the first sub-model 302 includes a GRU layer with a dual decoder and an attention mechanism, and the second sub-model 303 is a Gaussian mixture model based on K-means clustering and an estimation network. In addition, the system anomaly prediction model may further include an embedding layer 301, where the embedding layer 301 is used to obtain the sample features. The system anomaly prediction model training method applied to the system anomaly prediction model training device comprises the following steps:
step 401, obtaining sample characteristics by using an embedding layer 301;
step 402, initializing the system anomaly prediction model according to the set weight parameters;
step 403, processing the sample characteristics through the first submodel 302 and the second submodel 303 to obtain predicted energy;
step 404, constructing an objective function based on the predicted energy;
step 405, updating the weight parameters of the first submodel 302 and the second submodel 303 of the system anomaly prediction model through the objective function in back propagation.
Here, the objective function is used as the loss function, and an adaptive moment estimation (Adam) optimizer is used to update the weight parameters in the first sub-model and the second sub-model by calculating the gradient of the loss function.
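A single Adam update on one parameter tensor can be sketched as follows (standard Adam defaults; the model itself would hold one such optimizer state per weight matrix):

```python
import numpy as np

def adam_step(param, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and its square (v), bias-corrected, then a scaled step against the
    gradient. state = {'t': int, 'm': array, 'v': array} is mutated."""
    state['t'] += 1
    state['m'] = b1 * state['m'] + (1 - b1) * grad
    state['v'] = b2 * state['v'] + (1 - b2) * grad ** 2
    m_hat = state['m'] / (1 - b1 ** state['t'])   # bias correction
    v_hat = state['v'] / (1 - b2 ** state['t'])
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)
```

Each parameter moves against the sign of its gradient, with a per-coordinate step size adapted by the second-moment estimate.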
In the embodiment, vectorization, feature extraction and anomaly prediction of the log are integrated into the system anomaly prediction model, so that the system anomaly prediction model is more convenient to predict and is suitable for different data. And the target function constructed based on the predicted energy performs joint optimization on the whole through a back propagation algorithm, so that the final result can be ensured to be optimal.
In some embodiments, before the sample features are acquired by using the embedding layer 301 in step 401, the method further includes a step of preprocessing the samples.
Here, the preprocessing of the samples is to convert a sample format into a data format input by the system anomaly prediction model. Wherein the preprocessing comprises transforming and clustering the samples. Taking a sample as log data as an example for explanation, referring to table 1, a format of a log is (timestamp, level, class, information message), and the converting the log data includes: and cleaning each log in the log data and performing word segmentation on the cleaned log.
Table 1 log preprocessing example
Specifically, the cleaning of each log in the log data includes replacing variable values in the log by using a regular expression matching method based on the specific format of the log and manual experience. The variables to be replaced include IP (internet protocol) addresses, timestamps, log levels, paths/URLs (uniform resource locators), decimal numbers, hexadecimal numbers, the block number identifier block_id of the Hadoop Distributed File System (HDFS), the application number identifier application_id, the job number identifier jobid, the task number identifier task_id, the container number identifier container_id, and the like.
Specifically, the word segmentation of the cleaned log includes segmenting the log with the variables removed. Unlike the segmentation of natural language, the segmentation of logs uses more separators; in the embodiment of the present application, symbols such as "#", "\", "-", """, "(", ")", ",", "?" and "!" are used as word separators. It should be noted that the separators are not limited to the symbols exemplified above.
In addition, after the log data is converted, each log log_i is converted into a word list tokens_i = {t_1, t_2, …, t_n}, where t_j represents a character string of indefinite length. The word lists of all logs together form a list set total = {tokens_1, tokens_2, …, tokens_n}. Since log data contains many noise interferences, each tokens_i also needs to be filtered. The filtering rules may include the following: there are many randomly generated strings of indefinite length in the log, composed of lower-case letters and numbers; in the variable-removal step, the numbers in the log data are replaced by "#". For example, the random string "1a8fb23e6" is processed into "#a#fb#e#" by the variable-removal step, and after word segmentation {"a", "fb", "e"} is obtained. Therefore, for each tokens_i, the t_j of a preset length are removed. For example, since many random character strings composed of lower-case letters with very short lengths (mostly of length 1) are obtained after word segmentation, the preset length is set to 1, that is, each t_j of length 1 is removed from each tokens_i.
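The cleaning, word-segmentation and filtering steps can be sketched with regular expressions as follows. The patterns cover only a few variable kinds (IP addresses with an optional port, hexadecimal and decimal numbers) as an illustration; a real deployment would also replace timestamps, paths/URLs and the HDFS/YARN identifiers listed above:

```python
import re

# Illustrative variable patterns only: IP[:port], hexadecimal, decimal.
VAR_PATTERNS = [
    (re.compile(r'\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b'), '#'),
    (re.compile(r'0x[0-9a-fA-F]+'), '#'),
    (re.compile(r'\d+'), '#'),
]
# A subset of the separator symbols listed above, plus whitespace.
SEPARATORS = re.compile(r'[#\\\-"()\s,?!]+')

def clean_and_tokenize(log_line, min_len=2):
    """Replace variable values with '#', split on the separator set, and
    drop tokens shorter than min_len (the text removes the length-1
    fragments left over from random strings)."""
    for pattern, repl in VAR_PATTERNS:
        log_line = pattern.sub(repl, log_line)
    return [t for t in SEPARATORS.split(log_line) if len(t) >= min_len]
```

Running it on the random string example above leaves only the fragments of length at least two.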
Here, the logs are often different from one another; even logs output by the same log print statement in the application program differ depending on the variables in the log. In order to reduce the complexity of log analysis, the converted log data is clustered and each log in the processed log data is assigned to a corresponding cluster. After the clustering is finished, each cluster is given a label p_i, and the logs in each cluster are given the label p_i of that cluster. Thus, logs originating from similar log print statements in the application program are marked with the same label and treated as the same log in subsequent analysis, so that tens of millions of different logs can be reduced to thousands of different labels, which greatly reduces the complexity of analysis. For example, the edit distance is used as the distance measure for clustering, that is, each token list is regarded as a sentence composed of several words, and the distance between any two logs is the edit distance between their two token sequences. The converted log sequences are clustered using the OPTICS (Ordering points to identify the clustering structure) algorithm.
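The edit distance between two token lists, used above as the clustering distance measure, can be sketched with the standard dynamic-programming recurrence:

```python
def edit_distance(a, b):
    """Levenshtein distance between two token lists: each token list is
    treated as a sentence of words, and the distance counts the minimum
    number of token insertions, deletions and substitutions."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))             # distances from the empty prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,                               # deletion
                         cur[j - 1] + 1,                            # insertion
                         prev[j - 1] + (a[i - 1] != b[j - 1]))      # substitution
        prev = cur
    return prev[n]
```

The two-row formulation keeps memory linear in the length of the shorter operand's row, which matters when comparing many log pairs.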
So far, after all the log data are converted and clustered, the log sequence is obtained: the logs are sorted in time order, and each log in the log sequence is represented by the cluster label corresponding to that log rather than by its specific log content.
In some embodiments, the step 401 of obtaining sample features using the embedding layer 301 includes:
step 4011, obtaining a log sequence, and processing the log sequence by using the embedded layer 301 to obtain a vector sequence.
Here, acquiring the log sequence means acquiring the log sequences used for training the system anomaly prediction model. Specifically, the log sequence obtained by the preprocessing is divided into a plurality of sub-sequences according to a sliding time window. Referring to fig. 4, time windows TC and TP of fixed duration are taken and moved over the log sequence with a certain step t to obtain log sub-sequences. For example, with t = 3, the time window TC1 corresponds to the logs at times 1 to 30 and the time window TP1 corresponds to the logs at times 31 to 40; after moving over the log sequence with step 3, the time window TC2 corresponds to the logs at times 4 to 33 and the time window TP2 corresponds to the logs at times 34 to 43. The first log sub-sequence is the log sub-sequence corresponding to the time windows TC1 and TP1, and the second log sub-sequence is the log sub-sequence corresponding to the time windows TC2 and TP2. The log sub-sequence corresponding to the time window TC is the log sequence used for prediction, and the log sequence corresponding to the time window TP is the log sub-sequence whose state is to be predicted; that is, the system anomaly prediction model predicts whether an anomaly occurs in the system during the time window TP by using the log sequence corresponding to the time window TC. The log sub-sequences corresponding to the time windows TC obtained in this way are used as the log sequences for training the system anomaly prediction model.
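The sliding-window split can be sketched as follows, with tc, tp and step defaulting to the 30/10/3 example above; the sequence elements would be the per-log cluster labels:

```python
def sliding_windows(seq, tc=30, tp=10, step=3):
    """Cut a time-ordered label sequence into (TC, TP) sub-sequence pairs:
    TC is the observation window used for prediction, TP the window whose
    state is predicted."""
    pairs = []
    for start in range(0, len(seq) - tc - tp + 1, step):
        pairs.append((seq[start:start + tc],
                      seq[start + tc:start + tc + tp]))
    return pairs
```

With the default parameters, consecutive TC windows overlap by 27 positions, matching the example in the text.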
It should be noted that, because the log sequences used for training are all historical data, the label of each time window TC may be determined by the system operating state during the corresponding TP window, for example, as shown in fig. 4, if a system operation abnormality occurs during the TP window, the time window TC label is 1, and if the system operation is normal during the TP window, the TC label is 0. Therefore, the accuracy of the trained system anomaly prediction model can be verified by using the log subsequence corresponding to the marked TC.
Here, the processing of the log sequence to obtain a vector sequence includes encoding the log sequence by using an integer encoding method and converting the log sequence into the vector sequence. For example, a time window TC contains n logs, and each log corresponds to a label p; the log sub-sequence corresponding to the time window TC can then be represented by the label corresponding to each log as P = {p_log1, p_log2, …, p_logn}. Based on the idea of natural language processing, P is regarded as a sentence and each p_logi is regarded as a word, and a one-hot encoding method is adopted to encode each word p_logi, thereby converting the log sequence into a vector sequence.
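Encoding a window of cluster labels into a one-hot vector sequence can be sketched as follows; the label vocabulary is assumed to be fixed in advance:

```python
import numpy as np

def encode_window(labels, vocab):
    """Integer-encode the cluster labels of one TC window against a fixed
    label vocabulary, then expand each label into a one-hot vector,
    yielding the vector sequence fed to the embedding layer."""
    index = {p: i for i, p in enumerate(vocab)}   # integer encoding
    onehot = np.zeros((len(labels), len(vocab)))
    for t, p in enumerate(labels):
        onehot[t, index[p]] = 1.0                 # one-hot expansion
    return onehot
```

Each row of the result is the one-hot vector of one log's cluster label, in time order.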
And step 4012, extracting features of the vector sequence by using the embedding layer 301 to obtain sample features.
Here, the embedding layer 301 includes a neural network. The vector sequence obtained in step 4011 is input into the neural network of the embedding layer 301, and the vectors in the vector sequence are subjected to feature extraction based on a word embedding algorithm and converted into feature vectors x_i of fixed length, thereby obtaining the sample features. The word embedding algorithm may adopt Word2Vec (word to vector) or GloVe (Global Vectors for Word Representation), but is not limited to the above algorithms. Taking the vector sequence corresponding to one time window TC as an example, the vector sequence comprising n vectors, the neural network of the embedding layer 301 converts each vector into a 64-dimensional feature vector, and the sample features are represented by X = (x_1, x_2, …, x_n), where each x_i is 64-dimensional.
It should be noted that the model parameters of the neural network in the embedding layer 301 are trained in advance by a word embedding model using massive data. When the system anomaly prediction model is trained, these neural network parameters are not updated.
In some embodiments, referring to fig. 3, in step 403, the processing the sample features by the first sub-model 302 and the second sub-model 303 to obtain the predicted energy includes:
step 4031a, a first output sequence is obtained based on the sample characteristics and a first decoder of a first submodel 302 of the system anomaly prediction model;
step 4031b, a second output sequence is obtained based on the sample characteristics and a second decoder of the first submodel 302 of the system anomaly prediction model;
step 4032, a third output sequence is obtained based on the first output sequence, the second output sequence and the sample characteristics.
Here, the system anomaly prediction model training apparatus captures the sequential-order information of the log sequence by using the first sub-model 302. The first sub-model 302 includes an encoder and two decoders; specifically, the first sub-model includes a GRU layer with a dual decoder and an attention mechanism, and the encoder is composed of unidirectionally stacked GRUs, for example, with 64 GRU units per encoder layer.
The GRU-layer encoder may include one or more encoding layers. For example, in order to achieve better practical effects, the GRU-layer encoder includes two encoding layers; the description takes as an example the sample features of the log sub-sequence, containing n logs, corresponding to one time window TC. The sample features $X=(x_1,x_2,\ldots,x_n)$ obtained in step 401 are input into the encoder of the first sub-model 302 and, after passing through the two layers of unidirectionally stacked GRU units, the first hidden space vector $h^1=(h_1^1,h_2^1,\ldots,h_n^1)$ of the first encoding layer and the second hidden space vector $h^2=(h_1^2,h_2^2,\ldots,h_n^2)$ of the second encoding layer are obtained respectively. The first hidden space vector $h^1$ and the second hidden space vector $h^2$ are spliced to obtain the hidden space vector $H=(h_1,h_2,\ldots,h_n)$, wherein $h_i=\mathrm{Concat}(h_i^1,h_i^2)$, Concat represents the splicing function, and n is the number of time steps of the GRU.
The sample features X obtained in step 401 are input into the first submodel 302, and the process of obtaining the third output sequence may be represented as:
z=Attentive_GRU(X) (23)
where z denotes a third output sequence and Attentive _ GRU denotes a non-linear transformation function of the first submodel 302.
In some embodiments, the first decoder comprises a normal decoder, and the step 4031a of obtaining the first output sequence based on the sample features and the first decoder of the first submodel 302 of the system anomaly prediction model comprises: determining the current output of the first decoder according to the historical hidden space vector of the first decoder of the first submodel and the historical output of the first decoder.
For example, taking time t as an example, the hidden state $s_{t-1}^1$ of the first decoder at the previous time and the output $y_{t-1}^1$ of the first decoder at the previous time are input into the first decoder to obtain the output $y_t^1$ at the current time. The process can be expressed as $y_t^1=\mathrm{GRU}_1(s_{t-1}^1,y_{t-1}^1)$, wherein $\mathrm{GRU}_1$ is the nonlinear transformation function of the first decoder, $s_0^1=h_n$, and $h_n$ is the hidden space vector at the last time step of the encoder. Thus, the sample features are input into the first submodel 302, and a first output sequence $Y_1=(y_1^1,y_2^1,\ldots,y_n^1)$ is obtained by the first decoder.
Wherein $y_t^1=\mathrm{GRU}_1(s_{t-1}^1,y_{t-1}^1)$, and the updating process of $y_t^1$ comprises the following steps:

$z_t^1=\sigma(W_z^1 y_{t-1}^1+U_z^1 s_{t-1}^1)$ (24)

$r_t^1=\sigma(W_r^1 y_{t-1}^1+U_r^1 s_{t-1}^1)$ (25)

$\tilde{s}_t^1=\tanh\!\left(W_s^1 y_{t-1}^1+U_s^1(r_t^1\odot s_{t-1}^1)\right)$ (26)

$s_t^1=(1-z_t^1)\odot s_{t-1}^1+z_t^1\odot\tilde{s}_t^1$ (27)

$y_t^1=\sigma(W_o^1\cdot s_t^1)$ (28)

wherein $W_z^1$, $W_r^1$, $W_s^1$, $W_o^1$, $U_z^1$, $U_r^1$ and $U_s^1$ are the related weight matrices, $z_t^1$ is the update gate, $r_t^1$ is the reset gate, $\tilde{s}_t^1$ is the candidate hidden state, $\tanh$ represents the Tanh function, $\sigma$ represents the Sigmoid function, and $\odot$ represents element-wise multiplication of matrix elements.
In some embodiments, the second decoder comprises a decoder with an attention mechanism, and the obtaining a second output sequence based on the sample features and the second decoder of the first sub-model 302 of the system anomaly prediction model in step 4031b comprises determining a current output of the second decoder according to the hidden spatial vector of the encoder, the historical hidden spatial vector of the first decoder, and the historical output of the second decoder.
For example, taking time t as an example, after the hidden space vector of the encoder passes through the attention module, the weighted sum $c_t$ of the attention mechanism is obtained. The weighted sum $c_t$, the hidden state $s_{t-1}^2$ of the second decoder at the previous time, and the output $y_{t-1}^2$ of the second decoder at the previous time are input into the second decoder to obtain the output $y_t^2$ at the current time. The process can be represented as $y_t^2=\mathrm{GRU}_2(s_{t-1}^2,y_{t-1}^2,c_t)$, wherein $\mathrm{GRU}_2$ is the nonlinear transformation function of the second decoder. Thus, the sample features are input into the first submodel 302, and a second output sequence $Y_2=(y_1^2,y_2^2,\ldots,y_n^2)$ is obtained by the second decoder.
Wherein the weighted sum $c_t$ of the attention mechanism is obtained as follows:

$\alpha_{tj}=\dfrac{\exp(e_{tj})}{\sum_{k=1}^{n}\exp(e_{tk})}$ (29)

$c_t=\sum_{j=1}^{n}\alpha_{tj}h_j$ (30)

Here, $e_{tj}$ represents the correlation between the hidden space vector $s_{t-1}^2$ of the second decoder and the encoder hidden space vector $h_j$, and $\alpha_{tj}$ is a weight coefficient representing the importance of the encoder hidden space vector $h_j$ to the second decoder at time t.
Wherein $y_t^2=\mathrm{GRU}_2(s_{t-1}^2,y_{t-1}^2,c_t)$, and the updating process of $y_t^2$ comprises the following steps:

$z_t^2=\sigma(W_z^2 y_{t-1}^2+U_z^2 s_{t-1}^2+C_z c_t)$ (31)

$r_t^2=\sigma(W_r^2 y_{t-1}^2+U_r^2 s_{t-1}^2+C_r c_t)$ (32)

$\tilde{s}_t^2=\tanh\!\left(W_s^2 y_{t-1}^2+U_s^2(r_t^2\odot s_{t-1}^2)+C_s c_t\right)$ (33)

$s_t^2=(1-z_t^2)\odot s_{t-1}^2+z_t^2\odot\tilde{s}_t^2$ (34)

$y_t^2=\sigma(W_o^2\cdot s_t^2)$ (35)

wherein $W_z^2$, $W_r^2$, $W_s^2$, $W_o^2$, $U_z^2$, $U_r^2$, $U_s^2$, $C_z$, $C_r$ and $C_s$ are the related weight matrices, $z_t^2$ is the update gate, $r_t^2$ is the reset gate, $\tilde{s}_t^2$ is the candidate hidden state, $\tanh$ represents the Tanh function, $\sigma$ represents the Sigmoid function, and $\odot$ represents element-wise multiplication of matrix elements.
In some embodiments, the step 4032, acquiring a third output sequence based on the first output sequence, the second output sequence and the sample feature, includes:
determining a first reconstruction error based on the first output sequence and the sample features;
determining a second reconstruction error based on the second output sequence and the sample features;
and splicing the first reconstruction error, the second reconstruction error and the hidden space vector of the last time step of the encoder in the first submodel to obtain a third output sequence.
Here, the first reconstruction error $e_1$ is determined from the sample features $X=(x_1,x_2,\ldots,x_n)$ obtained in step 401 and the first output sequence $Y_1=(y_1^1,y_2^1,\ldots,y_n^1)$ obtained in step 4031a.

Wherein the first reconstruction error $e_1$ is calculated as:

$e_1=\dfrac{1}{D}\lVert X-Y_1\rVert_2$ (36)

Here, the second reconstruction error $e_2$ is determined from the sample features $X=(x_1,x_2,\ldots,x_n)$ obtained in step 401 and the second output sequence $Y_2=(y_1^2,y_2^2,\ldots,y_n^2)$ obtained in step 4031b.

Wherein the second reconstruction error $e_2$ is calculated as:

$e_2=\dfrac{1}{D}\lVert X-Y_2\rVert_2$ (37)

Here, the first reconstruction error $e_1$, the second reconstruction error $e_2$ and the hidden space vector $h_n$ of the last time step of the encoder in the first submodel are spliced to obtain the third output sequence $z=[h_n,e_1,e_2]$, wherein the third output sequence is the output sequence of the first submodel 302 and serves as the input sequence of the second submodel 303.
It should be noted that z is a third output sequence of sample features of the log sub-sequence corresponding to one time window TC.
In some embodiments, the step 403 of processing the sample features through the first sub-model 302 and the second sub-model 303 to obtain the predicted energy includes:

clustering the third output sequence by using the second sub-model of the system anomaly prediction model to obtain K clusters, wherein K is a positive integer;
determining a mean and a covariance of the mth cluster based on samples within the mth cluster, 0< m ≦ K;
based on the mean and covariance, a predicted energy for the sample is determined.
Here, the second sub-model includes a Gaussian mixture model (GMM) based on K-means clustering and an estimation network, and the second sub-model is used to determine the parameters of the Gaussian distributions corresponding to the third output sequence, so as to determine the predicted energy of the sample.
Taking the log subsequences corresponding to W time windows TC as the log sequences for training the system anomaly prediction model, the W third output sequences (z1, z2, …, zW) are input into the second sub-model of the system anomaly prediction model, and the W third output sequences are clustered based on a clustering algorithm to obtain K clusters, where the samples in each cluster are denoted (z'1, z'2, …).
For example, 100 third output sequences are input into the second submodel; using the K-means clustering algorithm with Euclidean distance as the similarity measure, 2 clusters are obtained, where the samples in one cluster are (z'1, z'2, …, z'35). Here, z'i corresponds to a third output sequence zr with r ≤ 100.
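The clustering step in the example above can be sketched as follows; a minimal self-contained K-means in numpy (illustrative only, not the patent's implementation), using Euclidean distance as the similarity measure:

```python
import numpy as np

def kmeans(Z, K, iters=20, seed=0):
    """Minimal K-means: cluster the W third output sequences (rows of Z)
    into K clusters using Euclidean distance."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=K, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each sample to its nearest center.
        labels = np.argmin(((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), axis=1)
        # Move each center to the mean of its assigned samples.
        for k in range(K):
            if np.any(labels == k):
                centers[k] = Z[labels == k].mean(axis=0)
    return labels, centers

# Two well-separated groups of third output sequences; K = 2 recovers them.
Z = np.vstack([np.zeros((5, 3)), 10.0 + np.zeros((5, 3))])
labels, centers = kmeans(Z, K=2)
```

Each of the two separated groups ends up in its own cluster, matching the "2 clusters" example in the text.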
Wherein the determining a mean and a covariance of the m-th cluster based on the samples within the m-th cluster comprises:
estimating the third output sequence by using the second submodel of the system anomaly prediction model, and determining the probability that each sample in the third output sequence belongs to each distribution.
Here, the third output sequence is estimated by the estimation network of the second submodel of the system anomaly prediction model, and the probability γ that the third output sequence belongs to each distribution in the GMM is determined.
The estimation network is a multilayer fully-connected neural network whose last layer is a normalization function layer. Estimating the probability γ that the third output sequence belongs to each distribution in the GMM with the multilayer fully-connected neural network is modeled as a multi-classification problem, and the probability of each distribution is obtained by the multilayer fully-connected neural network:
γ = softmax(MLP(z))
where MLP is the nonlinear function of the multilayer fully-connected neural network; the number of neurons in the input layer of the multilayer fully-connected neural network is the same as the dimension of z, the number of neurons in the output layer is K, and the activation function of the output layer is a normalization function. The value of K is obtained by observing the distribution characteristics of the log sequence. For example, if the log sequences are divided into normal log sequences and abnormal log sequences, then K is taken to be 2.
It should be noted that the number of neurons in the output layer is equal to the number of clusters after clustering, and the samples in each cluster after clustering correspond to one distribution output by the multilayer fully-connected neural network; that is, the m-th cluster corresponds to the m-th distribution.
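The estimation network described above can be sketched as a small fully-connected network whose final normalization layer is a softmax; a minimal numpy illustration under the stated assumptions (layer sizes and weights are placeholders, not from the patent):

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def estimation_network(z, W1, b1, W2, b2):
    """Fully-connected estimation network: input width equals dim(z),
    output width equals K, and the final softmax layer turns each output
    row into membership probabilities over the K distributions."""
    h = np.tanh(z @ W1 + b1)       # hidden layer
    return softmax(h @ W2 + b2)    # gamma: probability of each distribution

rng = np.random.default_rng(0)
d, hidden, K = 5, 8, 2             # e.g. K = 2 for normal vs. abnormal log sequences
Z = rng.normal(size=(10, d))       # ten third-output-sequence samples
gamma = estimation_network(Z,
                           rng.normal(size=(d, hidden)), np.zeros(hidden),
                           rng.normal(size=(hidden, K)), np.zeros(K))
```

Each row of gamma sums to 1, so it is directly usable as the per-sample distribution membership probability γ.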
Determining a mean and a covariance of the m-th cluster based on the samples within the m-th cluster and the probability that the samples belong to each distribution.
Here, the probability γ that each sample in the third output sequence belongs to each distribution is determined by the estimation network. Based on the samples in the m-th cluster and the probability that those samples belong to the m-th distribution, the mean μ̂m and the covariance Σ̂m of the m-th cluster are determined according to the calculation method of the Gaussian mixture model parameters. For example:
μ̂m = ( Σ_i γim z'i ) / ( Σ_i γim )
Σ̂m = ( Σ_i γim (z'i − μ̂m)(z'i − μ̂m)^T ) / ( Σ_i γim )
wherein γim denotes the probability that sample z'i belongs to the m-th distribution, μ̂m denotes the mean of the m-th cluster, and Σ̂m denotes the covariance of the m-th cluster.
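The γ-weighted mean and covariance above can be computed directly; a minimal numpy sketch (function name and toy data are illustrative, not from the patent):

```python
import numpy as np

def mixture_mean_cov(Z, gamma, m):
    """Gamma-weighted mean and covariance of the m-th distribution,
    following the standard GMM parameter estimates."""
    g = gamma[:, m]                                   # gamma_im for all samples i
    mu = (g[:, None] * Z).sum(axis=0) / g.sum()       # weighted mean
    diff = Z - mu
    cov = (g[:, None, None] * (diff[:, :, None] * diff[:, None, :])).sum(axis=0) / g.sum()
    return mu, cov

Z = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]])
gamma = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # hard memberships for clarity
mu0, cov0 = mixture_mean_cov(Z, gamma, 0)
```

With hard memberships the estimates reduce to the ordinary mean and (biased) covariance of the samples assigned to the cluster.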
Here, the determining the predicted energy of the sample based on the mean and the covariance includes: determining the predicted energy of the sample based on the probabilities, means, and covariances of the K distributions. Specifically, according to the calculation method of the Gaussian mixture model parameters, the probabilities that all samples belong to the m-th distribution are averaged to obtain the probability φ̂m of the m-th distribution:
φ̂m = (1/W) Σ_i γim
wherein W is the number of samples input to the second submodel of the system anomaly prediction model.
Here, from the estimated parameters, namely the probability φ̂m, the mean μ̂m, and the covariance Σ̂m of each distribution, the predicted energy E(z) of each sample can be calculated as follows:
E(z) = −log( Σ_m φ̂m · exp( −(1/2)(z − μ̂m)^T Σ̂m⁻¹ (z − μ̂m) ) / √|2π Σ̂m| )
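The energy formula above is the negative log-likelihood of z under the estimated Gaussian mixture; a minimal numpy sketch (single-component toy mixture, names illustrative):

```python
import numpy as np

def sample_energy(z, phi, mus, covs):
    """E(z) = -log( sum_m phi_m * N(z | mu_m, Sigma_m) )."""
    likelihood = 0.0
    for phi_m, mu, cov in zip(phi, mus, covs):
        diff = z - mu
        quad = diff @ np.linalg.inv(cov) @ diff              # Mahalanobis term
        norm = np.sqrt(np.linalg.det(2.0 * np.pi * cov))     # sqrt(|2*pi*Sigma|)
        likelihood += phi_m * np.exp(-0.5 * quad) / norm
    return -np.log(likelihood)

phi = [1.0]                 # single-component mixture for illustration
mus = [np.zeros(2)]
covs = [np.eye(2)]
e_near = sample_energy(np.zeros(2), phi, mus, covs)          # at the mean: low energy
e_far = sample_energy(np.array([5.0, 5.0]), phi, mus, covs)  # far from the mean: high energy
```

Samples far from every mixture component receive a higher energy, which is what the anomaly decision later relies on.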
in some embodiments, the step 404 of constructing an objective function based on the predicted energy comprises:
determining a reconstruction loss based on the first output sequence, the second output sequence, and the sample features.
Here, in the first sub-model 302, the performance of the encoder and the two decoders is described by the first reconstruction error and the second reconstruction error. Taking W log subsequences as the log sequences for training the system anomaly prediction model, the reconstruction loss Lrc is calculated as:
Lrc = (1/W) Σ_i ( ||xi − y1_i||₂² + ||xi − y2_i||₂² )
wherein || · ||₂ denotes the 2-norm, y1_i denotes the result of the first decoder reconstructing xi, and y2_i denotes the result of the second decoder reconstructing xi.
An objective function is then constructed based on the reconstruction loss and the predicted energy.
Here, the objective function L is:
L = Lrc + λ1 Σ_i E(zi)
wherein λ1 is a hyperparameter and Σ_i E(zi) is the total predicted energy of the W samples.
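The objective combining the reconstruction loss and the weighted total energy can be sketched as follows (a toy numpy illustration; the value of λ1 and the names are placeholders):

```python
import numpy as np

def objective(X, Y1, Y2, energies, lam1):
    """L = L_rc + lambda_1 * sum_i E(z_i): average squared reconstruction
    error of both decoders plus the weighted total predicted energy."""
    W = len(X)
    l_rc = sum(np.sum((x - y1) ** 2) + np.sum((x - y2) ** 2)
               for x, y1, y2 in zip(X, Y1, Y2)) / W
    return l_rc + lam1 * float(np.sum(energies))

X = [np.ones(3), np.zeros(3)]
loss = objective(X, X, X, energies=[2.0, 2.0], lam1=0.5)  # perfect reconstructions
```

With perfect reconstructions the loss reduces to the weighted energy term alone, so minimizing L pushes both decoders toward accurate reconstruction and the latent representation toward low energy.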
In some embodiments, when the trained system anomaly prediction model is used for prediction, the system state prediction apparatus inputs the log sequence for prediction into the system anomaly prediction model to obtain the energy of the prediction sample.
Here, the system state prediction apparatus inputs the log sequence for prediction into the embedding layer to obtain the sample features, inputs the sample features into the first submodel to obtain the third output sequence, and inputs the third output sequence into the estimation network of the second submodel to determine the probability that the sample belongs to each distribution; together with the mean and covariance of each distribution obtained during training, the energy E(z) of the predicted log sequence is determined.
The state of the system is then determined based on the energy.
Here, the system state prediction apparatus determines that the system is abnormal when the energy is greater than a preset energy threshold; alternatively, the system is determined to be normal when the energy is less than or equal to the energy threshold.
For example, the preset energy threshold is θ. When the log sequence corresponding to the time window TC is input into the anomaly prediction model, the energy E(z) is determined; when E(z) > θ, the prediction sample is determined to be an abnormal sample, and it is determined that the system will become abnormal during the TP period.
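The threshold decision above reduces to a one-line rule; a minimal sketch (names illustrative):

```python
def predict_state(energy, theta):
    """Threshold rule from the text: energy above theta means the system is
    predicted to become abnormal during the TP period, otherwise normal."""
    return "abnormal" if energy > theta else "normal"

state = predict_state(26.8, theta=10.0)  # high energy exceeds the threshold
```

Note the boundary case: energy exactly equal to θ is classified as normal, matching "less than or equal to the energy threshold" in the text.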
In another aspect of the present embodiment, a system anomaly prediction model training apparatus is further provided, referring to fig. 5, the apparatus includes a first obtaining module 401, an initializing module 402, a first processing module 403, a constructing module 404, and an updating module 405, wherein,
the first obtaining module 401 is configured to obtain sample characteristics;
the initialization module 402 is configured to initialize the system anomaly prediction model according to a set weight parameter;
the first processing module 403 is configured to process the sample features through the system anomaly prediction model to obtain prediction energy;
the constructing module 404 is configured to construct an objective function based on the predicted energy;
the updating module 405 is configured to update the weight parameter of the system anomaly prediction model through the objective function in the back propagation.
In some embodiments, the first obtaining module 401 comprises a vector obtaining unit and a feature extracting unit, wherein,
the vector acquisition unit is used for acquiring a log sequence and processing the log sequence to obtain a vector sequence;
and the feature extraction unit is used for extracting features of the vector sequence to obtain sample features.
In some embodiments, the first processing module 403 includes a first sequence obtaining unit, a second sequence obtaining unit, and a third sequence obtaining unit, wherein,
the first sequence obtaining unit is used for obtaining a first output sequence based on the sample characteristics and a first decoder of a first submodel of the system anomaly prediction model;
the second sequence obtaining unit is used for obtaining a second output sequence based on the sample characteristics and a second decoder of a first submodel of the system anomaly prediction model;
the third sequence obtaining unit is configured to obtain a third output sequence based on the first output sequence, the second output sequence, and the sample feature.
In some embodiments, the third sequence acquisition unit comprises a first error determination unit, a second error determination unit, and a stitching unit, wherein,
the first error determination unit is configured to determine a first reconstruction error based on the first output sequence and the sample characteristics;
the second error determination unit is configured to determine a second reconstruction error based on the second output sequence and the sample feature;
and the splicing unit is used for splicing the first reconstruction error, the second reconstruction error and the hidden space vector of the last time step of the encoder in the first sub-model to obtain a third output sequence.
In some embodiments, the first processing module 403 comprises a clustering unit, a determination unit, a prediction energy determination unit, wherein,
the clustering unit is used for clustering the third output sequence by using a second sub-model of the system anomaly prediction model to obtain K clusters, wherein K is a positive integer;
the determining unit is used for determining the mean value and the covariance of the mth cluster based on the samples in the mth cluster, wherein 0< m is less than or equal to K;
the prediction energy determination unit is used for determining the prediction energy of the sample based on the mean value and the covariance.
In some embodiments, the determining unit comprises a probability determining unit and a mean covariance determining unit, wherein,
the probability determining unit is configured to estimate the third output sequence by using the second sub-model of the system anomaly prediction model, and determine the probability that a sample in the third output sequence belongs to each distribution;
the mean covariance determination unit is configured to determine a mean and a covariance of the mth cluster based on the samples in the mth cluster and a probability that the samples belong to each distribution.
In some embodiments, the construction module 404 includes a reconstruction loss determination unit and a function construction unit, wherein,
the reconstruction loss determining unit is used for determining the reconstruction loss according to the first output sequence, the second output sequence and the sample characteristics;
the function construction unit is used for constructing an objective function based on the reconstruction loss and the predicted energy.
In another aspect of the present embodiment, a system state prediction apparatus is further provided. Referring to fig. 6, the apparatus includes a second obtaining module 501, an energy determining module 502, and a state determination module 503, wherein,
the second obtaining module 501 is configured to obtain sample characteristics of the system;
the energy determining module 502 is configured to determine energy corresponding to the sample characteristics based on a system anomaly prediction model;
the state determination module 503 is configured to determine a state of the system based on the energy.
In some embodiments, the state determination module 503 is specifically configured to determine that the system is abnormal when the energy is greater than a preset energy threshold; alternatively, the system is determined to be normal if the energy is less than or equal to the energy threshold.
In some embodiments, the system state prediction device further comprises a training module, wherein,
the second obtaining module 501 is further configured to obtain training sample characteristics;
the energy determining module 502 is further configured to process the training sample features through the system anomaly prediction model to obtain the predicted energy of the training sample;
the training module is specifically used for initializing the system anomaly prediction model according to a set weight parameter; constructing an objective function based on the predicted energies of the training samples; and, in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
In another aspect of the embodiments of the present application, an apparatus is also provided. Referring to fig. 7, the computer apparatus includes at least one processor 601 and at least one memory 605, wherein the memory 605 is configured to store a computer program executable on the processor 601, and the processor 601 is configured to execute, when running the computer program, a system anomaly prediction model training method, the method comprising:
obtaining sample characteristics;
initializing the system anomaly prediction model according to the set weight parameters;
processing the sample characteristics through the system anomaly prediction model to obtain predicted energy;
constructing an objective function based on the predicted energy;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
The processor 601 is further configured to execute, when the computer program runs, the following steps: the obtaining sample features includes:
acquiring a log sequence, and processing the log sequence to obtain a vector sequence;
and performing feature extraction on the vector sequence to obtain sample features.
The processor 601 is further configured to execute, when the computer program runs, the following steps: the processing the sample characteristics through the system anomaly prediction model to obtain the prediction energy comprises:
obtaining a first output sequence based on the sample characteristics and a first decoder of a first submodel of the system anomaly prediction model;
obtaining a second output sequence based on the sample characteristics and a second decoder of a first submodel of the system anomaly prediction model;
and acquiring a third output sequence based on the first output sequence, the second output sequence and the sample characteristics.
The processor 601 is further configured to execute, when the computer program runs, the following steps: obtaining a third output sequence based on the first output sequence, the second output sequence, and the sample features includes:
determining a first reconstruction error based on the first output sequence and the sample features;
determining a second reconstruction error based on the second output sequence and the sample features;
and splicing the first reconstruction error, the second reconstruction error and the hidden space vector of the last time step of the encoder in the first submodel to obtain a third output sequence.
The processor 601 is further configured to execute, when the computer program runs, the following steps: the processing the sample characteristics through the system anomaly prediction model to obtain the prediction energy comprises:
clustering the third output sequence by using a second sub-model of the system anomaly prediction model to obtain K clusters, wherein K is a positive integer;
determining a mean and a covariance of the mth cluster based on samples within the mth cluster;
based on the mean and covariance, a predicted energy for the sample is determined.
The processor 601 is further configured to execute, when the computer program runs, the following steps: the determining a mean and a covariance of the mth cluster based on the samples within the mth cluster comprises:
estimating the third output sequence by using the second submodel of the system anomaly prediction model, and determining the probability that each sample in the third output sequence belongs to each distribution;
determining a mean and covariance of the mth cluster based on samples within the mth cluster and a probability that the samples belong to each distribution.
The processor 601 is further configured to execute, when the computer program runs, the following steps: the constructing an objective function based on the predicted energy comprises:
determining a reconstruction loss based on the first output sequence, the second output sequence, and the sample characteristics;
and constructing an objective function based on the reconstruction loss and the predicted energy.
In some embodiments, the device further comprises a system bus 602, a user interface 603, and a communication interface 604, wherein the system bus 602 is configured to enable connection and communication between these components; the user interface 603 may include a display screen, and the communication interface 604 may include standard wired and wireless interfaces.
In another aspect of the embodiments of the present application, an apparatus is also provided. Referring to fig. 8, the computer apparatus includes at least one processor 701 and at least one memory 705, wherein the memory 705 is configured to store a computer program executable on the processor 701, and the processor 701 is configured to execute, when running the computer program, a system state prediction method, the method comprising:
acquiring sample characteristics of a system;
determining energy corresponding to the sample characteristics based on a system anomaly prediction model;
determining a state of the system based on the energy.
In some embodiments, the processor 701 is configured to execute, when running the computer program, the following: the determining a state of the system based on the energy includes:
determining that the system is abnormal when the energy is greater than a preset energy threshold;
alternatively, the system is determined to be normal if the energy is less than or equal to the energy threshold.
In some embodiments, the processor 701, when running the computer program, is configured to perform:
acquiring training sample characteristics;
initializing the system anomaly prediction model according to the set weight parameters;
processing the training sample characteristics through the system anomaly prediction model to obtain the prediction energy of the training sample;
constructing an objective function based on the predicted energies of the training samples;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
In some embodiments, the device further comprises a system bus 702, a user interface 703, and a communication interface 704, wherein the system bus 702 is configured to enable connection and communication between these components; the user interface 703 may include a display screen, and the communication interface 704 may include standard wired and wireless interfaces.
In yet another aspect of the embodiments of the present application, a computer readable storage medium is further provided, on which a system anomaly prediction model training program and/or a system state prediction program are stored, where the system anomaly prediction model training program, when executed by a processor, implements the steps of the system anomaly prediction model training method provided in any one of the embodiments of the present application, and the system state prediction program, when executed by the processor, implements the steps of the system state prediction method provided in any one of the embodiments of the present application.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. The protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. A system anomaly prediction model training method is characterized by comprising the following steps:
obtaining sample characteristics;
initializing the system anomaly prediction model according to the set weight parameters;
processing the sample characteristics through the system anomaly prediction model to obtain predicted energy;
constructing an objective function based on the predicted energy;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
2. The method of claim 1, wherein the obtaining sample features comprises:
acquiring a log sequence, and processing the log sequence to obtain a vector sequence;
and performing feature extraction on the vector sequence to obtain sample features.
3. The method of claim 1, wherein the processing the sample features by the system anomaly prediction model to obtain a predicted energy comprises:
obtaining a first output sequence based on the sample characteristics and a first decoder of a first submodel of the system anomaly prediction model;
obtaining a second output sequence based on the sample characteristics and a second decoder of a first submodel of the system anomaly prediction model;
and acquiring a third output sequence based on the first output sequence, the second output sequence and the sample characteristics.
4. The method of claim 3, wherein obtaining a third output sequence based on the first output sequence, the second output sequence, and the sample features comprises:
determining a first reconstruction error based on the first output sequence and the sample features;
determining a second reconstruction error based on the second output sequence and the sample features;
and splicing the first reconstruction error, the second reconstruction error and the hidden space vector of the last time step of the encoder in the first submodel to obtain a third output sequence.
5. The method of claim 3, wherein the processing the sample features by the system anomaly prediction model to obtain a predicted energy comprises:
clustering the third output sequence by using a second sub-model of the system anomaly prediction model to obtain K clusters, wherein K is a positive integer;
determining a mean and a covariance of the mth cluster based on samples within the mth cluster, 0< m ≦ K;
based on the mean and covariance, a predicted energy for the sample is determined.
6. The method of claim 5, wherein the determining the mean and covariance of the mth cluster based on the samples in the mth cluster comprises:
estimating the third output sequence by using the second submodel of the system anomaly prediction model, and determining the probability that each sample in the third output sequence belongs to each distribution;
determining a mean and covariance of the mth cluster based on samples within the mth cluster and a probability that the samples belong to each distribution.
7. The method of claim 4, wherein constructing an objective function based on the predicted energy comprises:
determining a reconstruction loss based on the first output sequence, the second output sequence, and the sample characteristics;
and constructing an objective function based on the reconstruction loss and the predicted energy.
8. A method for predicting a system state, the method comprising:
acquiring sample characteristics of a system;
determining energy corresponding to the sample characteristics based on a system anomaly prediction model;
determining a state of the system based on the energy.
9. The method of claim 8, wherein the determining the state of the system based on the energy comprises:
determining that the system is abnormal when the energy is greater than a preset energy threshold;
alternatively, the system is determined to be normal if the energy is less than or equal to the energy threshold.
10. The method according to claim 8 or 9, wherein before the determining the energy corresponding to the sample feature based on the system anomaly prediction model, the method further comprises:
acquiring training sample characteristics;
initializing the system anomaly prediction model according to the set weight parameters;
processing the training sample characteristics through the system anomaly prediction model to obtain the prediction energy of the training sample;
constructing an objective function based on the predicted energies of the training samples;
in back propagation, updating the weight parameters of the system anomaly prediction model through the objective function.
11. A system anomaly prediction model training device, characterized by comprising a first acquisition module, an initialization module, a first processing module, a construction module and an updating module, wherein,
the first obtaining module is used for obtaining sample characteristics;
the initialization module is used for initializing the system anomaly prediction model according to the set weight parameters;
the first processing module is used for processing the sample characteristics through the system anomaly prediction model to obtain predicted energy;
the construction module is used for constructing an objective function based on the predicted energy;
and the updating module is used for updating the weight parameters of the system anomaly prediction model through the objective function in back propagation.
12. A system state prediction apparatus comprising a second acquisition module, an energy determination module, and a state determination module, wherein,
the second acquisition module is used for acquiring the sample characteristics of the system;
the energy determining module is used for determining energy corresponding to the sample characteristics based on a system anomaly prediction model;
the state determination module is configured to determine a state of the system based on the energy.
13. An apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of a method of training a system anomaly prediction model according to any one of claims 1 to 7 and/or a method of predicting a system state according to any one of claims 8 to 10.
14. A storage medium having stored thereon a system anomaly prediction model training program which, when executed by a processor, implements the steps of the system anomaly prediction model training method according to any one of claims 1 to 7, and/or a system state prediction program which, when executed by a processor, implements the steps of the system state prediction method according to any one of claims 8 to 10.
CN201911268312.XA 2019-12-11 2019-12-11 Model training method, state prediction method, device, equipment and storage medium Active CN112948155B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911268312.XA CN112948155B (en) 2019-12-11 2019-12-11 Model training method, state prediction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911268312.XA CN112948155B (en) 2019-12-11 2019-12-11 Model training method, state prediction method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112948155A true CN112948155A (en) 2021-06-11
CN112948155B CN112948155B (en) 2022-12-16

Family

ID=76234082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911268312.XA Active CN112948155B (en) 2019-12-11 2019-12-11 Model training method, state prediction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112948155B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254255A (en) * 2021-07-15 2021-08-13 苏州浪潮智能科技有限公司 Cloud platform log analysis method, system, device and medium
CN115410638A (en) * 2022-07-28 2022-11-29 南京航空航天大学 Magnetic disk fault detection system based on contrast clustering
WO2023272851A1 (en) * 2021-06-29 2023-01-05 未鲲(上海)科技服务有限公司 Anomaly data detection method and apparatus, device, and storage medium
CN117150407A (en) * 2023-09-04 2023-12-01 国网上海市电力公司 Abnormality detection method for industrial carbon emission data

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8554703B1 (en) * 2011-08-05 2013-10-08 Google Inc. Anomaly detection
CN106656637A (en) * 2017-02-24 2017-05-10 国网河南省电力公司电力科学研究院 Anomaly detection method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8554703B1 (en) * 2011-08-05 2013-10-08 Google Inc. Anomaly detection
CN106656637A (en) * 2017-02-24 2017-05-10 国网河南省电力公司电力科学研究院 Anomaly detection method and device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023272851A1 (en) * 2021-06-29 2023-01-05 未鲲(上海)科技服务有限公司 Anomaly data detection method and apparatus, device, and storage medium
CN113254255A (en) * 2021-07-15 2021-08-13 苏州浪潮智能科技有限公司 Cloud platform log analysis method, system, device and medium
CN115410638A (en) * 2022-07-28 2022-11-29 南京航空航天大学 Magnetic disk fault detection system based on contrast clustering
CN115410638B (en) * 2022-07-28 2023-11-07 南京航空航天大学 Disk fault detection system based on contrast clustering
CN117150407A (en) * 2023-09-04 2023-12-01 国网上海市电力公司 Abnormality detection method for industrial carbon emission data

Also Published As

Publication number Publication date
CN112948155B (en) 2022-12-16

Similar Documents

Publication Publication Date Title
CN112948155B (en) Model training method, state prediction method, device, equipment and storage medium
CN114169330B (en) Chinese named entity recognition method integrating time sequence convolution and transform encoder
CN112966074B (en) Emotion analysis method and device, electronic equipment and storage medium
US10891540B2 (en) Adaptive neural network management system
JP6793774B2 (en) Systems and methods for classifying multidimensional time series of parameters
CN111382555B (en) Data processing method, medium, device and computing equipment
CN110659742A (en) Method and device for acquiring sequence representation vector of user behavior sequence
CN110674673A (en) Key video frame extraction method, device and storage medium
CN110766070A (en) Sparse signal identification method and device based on cyclic self-encoder
EP3798916A1 (en) Transformation of data samples to normal data
Azzalini et al. A minimally supervised approach based on variational autoencoders for anomaly detection in autonomous robots
CN111475622A (en) Text classification method, device, terminal and storage medium
CN114090326B (en) Alarm root cause determination method, device and equipment
CN110781818B (en) Video classification method, model training method, device and equipment
CN116402352A (en) Enterprise risk prediction method and device, electronic equipment and medium
CN115795038A (en) Intention identification method and device based on localization deep learning framework
CN116561748A (en) Log abnormality detection device for component subsequence correlation sensing
Behnaz et al. DEEPPBM: Deep probabilistic background model estimation from video sequences
CN113378178A (en) Deep learning-based graph confidence learning software vulnerability detection method
Wakchaure et al. A scheme of answer selection in community question answering using machine learning techniques
CN115035455A (en) Cross-category video time positioning method, system and storage medium based on multi-modal domain resisting self-adaptation
CN113392929A (en) Biological sequence feature extraction method based on word embedding and self-encoder fusion
CN115512693A (en) Audio recognition method, acoustic model training method, device and storage medium
CN112866257A (en) Domain name detection method, system and device
CN111967253A (en) Entity disambiguation method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant