Disclosure of Invention
In view of the above, an object of the present invention is to provide a method and a system for monitoring faults of a software system based on deep convolution transfer learning. The method still achieves a good fault monitoring result when few fault samples are available under multiple loads or when samples of a certain fault type are missing, and it saves considerable time because the network model does not have to be retrained for a data set collected under a new load. Transfer learning is a machine learning method that uses existing knowledge to solve problems in different but related fields: knowledge learned in one field (the source domain) is transferred to another field (the target domain), so that knowledge is shared between fields and the poor model performance caused by few training samples and unbalanced sample distribution in the target domain is addressed. Compared with methods such as incremental learning, multi-task learning and self-learning, transfer learning emphasizes the correlation between learning tasks and uses that correlation to transfer knowledge. The concept of deep learning originates from machine learning within the field of artificial intelligence, and a deep neural network (DNN) model composed of multiple hidden layers is its defining characteristic. Compared with a shallow neural network model, a DNN can combine low-level features into more abstract high-level feature representations, thereby discovering implicit feature expressions of the data and effectively extracting and representing its information through layer-by-layer transformation of the data features.
In order to achieve the purpose, the invention provides the following technical scheme:
In one aspect, the invention provides a software system fault monitoring method based on deep convolution transfer learning, which comprises the following steps:
S1: collecting a software system load data set under the existing load S, and constructing a source domain sample data set;
S2: performing point segmentation on each group of original response times to construct a source domain data set;
S3: constructing a target domain sample data set, and performing point segmentation on each group of original response times in the target domain sample data set to construct a target domain data set;
S4: performing fault monitoring on the software system by applying deep convolution transfer learning to the source domain data set and the target domain data set.
Further, in step S1, the software system load sample data set under the existing load S is classified into w states according to fault type, and the original response time under each fault type is recorded as a set of signals, where w denotes the data class, w = 1, 2, 3, …, n, and $x_0$ to $x_n$ denote the 1st to (n+1)-th groups of fault signals in the w-th fault state.
Further, in the step S2, the source domain data set construction method includes the following steps:
S21: setting a window sliding step length s and a window length l according to the number N of data points, and generating t samples, where each sample is $d_i = \{X_0, X_1, X_2, \ldots, X_L\}$, $i = 1, 2, 3, \ldots, t$; the samples form the source domain data set $M_s = \{d_1, d_2, d_3, \ldots, d_t\}$;
S22: dividing the source domain data set into a source domain test set and a source domain training set according to a ratio r, so that the source domain training set contains $a = t \cdot r$ samples and the source domain test set contains $b = t \cdot (1 - r)$ samples.
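The point segmentation of step S21 can be illustrated with a minimal sketch. It assumes that the window length counts the number of points in each sample and that windows running past the end of the signal are discarded; the function name `segment_signal` and the toy values are hypothetical and do not appear in the original disclosure.

```python
def segment_signal(signal, window_length, step):
    """Cut one response-time signal into overlapping fixed-length samples."""
    if window_length <= 0 or step <= 0:
        raise ValueError("window_length and step must be positive")
    samples = []
    start = 0
    # Slide the window until it would run past the end of the signal.
    while start + window_length <= len(signal):
        samples.append(signal[start:start + window_length])
        start += step
    return samples


# Toy signal with N = 10 response-time points (values are made up).
response_times = [12, 15, 11, 40, 42, 13, 12, 90, 95, 14]
source_samples = segment_signal(response_times, window_length=4, step=2)
print(len(source_samples))   # number of samples t
print(source_samples[0])     # first sample d_1
```

Under these assumptions the number of samples is floor((N - l) / s) + 1, so a smaller step length yields more, and more strongly overlapping, samples from the same signal.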
Further, in step S3, the machine response time of the software system under different loads is collected in four states: normal operation, program data abnormality, local error, and downtime. A target domain sample data set is constructed from the machine response time, where w' = 1, 2, 3, 4 denotes the normal operation, program data abnormality, local error, and downtime states, respectively.
Further, the window sliding step length and the window length are set to construct a target domain data set $M_T$, and the proportion of the test set to the training set is set to construct a target domain data training set and a target domain data test set.
Further, in step S4, the fault monitoring includes the following steps:
S41: inputting the source domain data training set into a one-dimensional deep convolutional neural network I for pre-training and initializing the network parameters, and testing the network on the source domain test set; if the test result is satisfactory, the parameters are fixed and pre-training ends; otherwise, the network continues to be adjusted through back propagation and the parameters are updated until the network achieves a satisfactory result on the test set;
S42: performing transfer learning on the target domain data set $M_T$ by using the hierarchical structure of the convolutional neural network: freezing the weight parameters of the feature extraction module of the one-dimensional deep convolutional neural network I and of the global mean pooling layer $L_G$ and the fully connected layer $L_F$ in its feature classification module, and adding a new Softmax layer so that network model I adapts to the target domain data set, thereby completing the network-level adjustment and constructing a new network $I_2$;
S43: fine-tuning network $I_2$ by locking the weight parameters of the feature classification module D and of layers $L_1$, $L_2$ and $L_3$ and unfreezing the parameters of layer $L_4$, thereby obtaining the fine-tuned network $I_3$;
S44: acquiring original fault signals of the software system in real time and feeding them to network $I_3$ to obtain the fault monitoring result for the current software system.
Further, the one-dimensional depth convolution neural network I model construction method in step S41 includes the following steps:
(1) constructing a convolution pooling layer $L_j$:
$L_j = \{C_j, P_j, B_j\}$
where $C_j$, $P_j$ and $B_j$ are the convolution layer, the pooling layer and the normalization layer, respectively, used for feature extraction, and j is the index of the convolution pooling module;
(2) stacking 4 convolution pooling layers to construct a feature extraction module S, $S = \{L_1, L_2, L_3, L_4\}$;
(3) adding a feature classification module D, $D = \{L_G, L_F, L_{\mathrm{softmax}}\}$, which comprises a global mean pooling layer $L_G$, a fully connected layer $L_F$ and a Softmax layer $L_{\mathrm{softmax}}$, thereby completing the network construction.
Further, in step S43, the target domain training data set is used to train network $I_3$, so that the network extracts deep abstract features from the target domain data set; the features pass through the fully connected layer $L_F$ and the Softmax layer, which outputs the probability distribution over the fault types of the target domain, and the fault type with the maximum probability is taken as the diagnosis result.
In another aspect, the invention provides a software system load diagnosis system based on deep convolution transfer learning, which comprises a source domain sample data set construction module, a source domain data set construction module, a target domain data set construction module and a fault monitoring module;
the source domain sample data set construction module collects a software system load data set under the existing load S and constructs a source domain sample data set;
the source domain data set construction module performs point number segmentation on each group of original response time to construct a source domain data set;
the target domain data set construction module constructs a target domain sample data set, and performs point segmentation on each group of original response times in the target domain sample data set to construct a target domain data set;
and the fault monitoring module carries out fault monitoring on the software system by using the deep convolution transfer learning on the source domain data set and the target domain data set.
Further, fault monitoring by the fault monitoring module includes: inputting the source domain data training set into a one-dimensional deep convolutional neural network I for pre-training and initializing the network parameters, and testing the network on the source domain test set; if the test result is satisfactory, the parameters are fixed and pre-training ends; otherwise, the network continues to be adjusted through back propagation and the parameters are updated until the network achieves a satisfactory result on the test set; performing transfer learning on the target domain data set $M_T$ by using the hierarchical structure of the convolutional neural network: freezing the weight parameters of the feature extraction module S of the one-dimensional deep convolutional neural network I and of the global mean pooling layer $L_G$ and the fully connected layer $L_F$ in its feature classification module, and adding a new Softmax layer so that network model I adapts to the target domain data set, thereby completing the network-level adjustment and constructing a new network $I_2$; and acquiring original fault signals of the software system in real time and feeding them to the fine-tuned network $I_3$ to obtain the fault monitoring result for the current software system.
The invention has the following beneficial effects: 1. Compared with traditional fault monitoring methods, the deep convolution transfer learning method provided by the invention still achieves high fault monitoring accuracy when faced with data sets that contain few samples or lack samples of some fault type. 2. Compared with traditional fault monitoring methods, the deep convolution transfer learning method provided by the invention performs transfer learning through the hierarchical structure of the convolutional neural network, and can save a large amount of time when training for a new load.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are provided for illustration only; they are schematic representations and shall not be construed as limiting the invention. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
As shown in fig. 1, the present invention provides a software system fault monitoring method based on deep convolution transfer learning, which includes the following steps:
S1, collecting the software system load data set under the existing load S, and constructing the source domain sample data set.
The software system load sample data set under the existing load S is classified into w states according to fault type, and the original response time under each fault type is recorded as a set of signals, where w denotes the data class, w = 1, 2, 3, …, n, and $x_n$ denotes the n-th group of fault signals in the w-th fault state.
S2, performing point segmentation on each group of original response times to construct a source domain data set.
Taking one group of signals $x_1$ in the source domain sample data set as an example, point segmentation is performed on $x_1$ to construct the source domain data set, with the following specific steps:
S21, setting a window sliding step length s and a window length l according to the number N of data points, and generating t samples, where each sample is $d_i = \{X_0, X_1, X_2, \ldots, X_L\}$, $i = 1, 2, 3, \ldots, t$; the samples form the source domain data set $M_s = \{d_1, d_2, d_3, \ldots, d_t\}$;
S22, dividing the source domain data set into a source domain test set and a source domain training set according to a ratio r, so that the source domain training set contains $a = t \cdot r$ samples and the source domain test set contains $b = t \cdot (1 - r)$ samples; in the present embodiment, the ratio r is preferably 0.3.
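The split of step S22 can be sketched as follows. The sketch assumes that t · r is rounded to the nearest integer and that the samples are split in their original order without shuffling; neither detail is stated in the disclosure, and the function name `split_samples` is hypothetical.

```python
def split_samples(samples, train_ratio):
    """Split samples into (training set, test set) by the ratio r."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must lie strictly between 0 and 1")
    t = len(samples)
    a = round(t * train_ratio)   # training samples, a = t * r (rounded)
    # The remaining b = t - a samples, i.e. about t * (1 - r), form the test set.
    return samples[:a], samples[a:]


samples = [[i, i + 1, i + 2] for i in range(10)]   # t = 10 toy samples
train_set, test_set = split_samples(samples, train_ratio=0.3)
print(len(train_set), len(test_set))
```

With t = 10 and r = 0.3, the training set receives 3 samples and the test set the remaining 7, matching a = t · r and b = t · (1 - r).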
S3, constructing a target domain sample data set, and performing point segmentation on each group of original response times in the target domain sample data set to construct a target domain data set.
The original response time of the software system under different loads is collected in four states: normal operation, program data abnormality, local error, and downtime. A target domain sample data set is constructed from the response time, where w' = 1, 2, 3, 4 denotes the normal operation, program data abnormality, local error, and downtime states, respectively.
The window sliding step length and the window length are set according to step S21 to construct the target domain data set $M_T$, and the ratio of the test set to the training set is set according to step S22 to construct a target domain data training set and a target domain data test set.
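One possible way to attach the four state labels to the segmented target domain samples is sketched below. The data layout (a dictionary from label to a list of signals), the function name `build_labeled_dataset` and the response-time values are illustrative assumptions rather than part of the disclosed method.

```python
# State labels 1..4 as listed in the text.
STATE_LABELS = {
    1: "normal operation",
    2: "program data abnormality",
    3: "local error",
    4: "downtime",
}


def build_labeled_dataset(signals_by_state, window_length, step):
    """Window every signal and tag each sample with its state label."""
    dataset = []
    for label, signals in signals_by_state.items():
        if label not in STATE_LABELS:
            raise KeyError(f"unknown state label: {label}")
        for signal in signals:
            last_start = len(signal) - window_length
            for start in range(0, last_start + 1, step):
                dataset.append((signal[start:start + window_length], label))
    return dataset


# Made-up response times (ms) for two of the four states.
toy_signals = {
    1: [[10, 11, 10, 12, 11, 10]],
    4: [[900, 950, 990, 999, 999, 999]],
}
target_dataset = build_labeled_dataset(toy_signals, window_length=4, step=2)
for sample, label in target_dataset:
    print(STATE_LABELS[label], sample)
```

A state with no recorded signals simply contributes no samples, which corresponds to the missing-fault-sample situation the method is intended to tolerate.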
S4, performing fault monitoring on the software system by applying deep convolution transfer learning to the source domain data set and the target domain data set, with the following specific steps:
S41, inputting the source domain data training set into a one-dimensional deep convolutional neural network I for pre-training and initializing the network parameters, and testing the network on the source domain test set; if the test result is satisfactory, the parameters are fixed and pre-training ends; otherwise, the network continues to be adjusted through back propagation and the parameters are updated until the network achieves a satisfactory result on the test set. The initialization of the internal network parameters includes setting the learning rate, the activation function, the weight parameters, feature extraction, and the like.
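The train-then-test loop of step S41 can be sketched generically. The disclosure does not quantify what counts as an ideal test effect, so the accuracy threshold `target_accuracy` and the epoch budget `max_epochs` are assumptions, and the two callables are stand-ins for a real back propagation pass and a real evaluation on the source domain test set.

```python
def train_until_ideal(train_step, evaluate, target_accuracy, max_epochs):
    """Repeat training epochs until the test accuracy reaches the target."""
    history = []
    for epoch in range(1, max_epochs + 1):
        train_step()                 # one pass of back propagation updates
        accuracy = evaluate()        # accuracy on the held-out test set
        history.append(accuracy)
        if accuracy >= target_accuracy:
            return epoch, history    # test effect reached the target: stop
    return None, history             # budget exhausted without reaching it


# Stand-in model: accuracy improves by 0.25 per epoch (purely illustrative).
state = {"accuracy": 0.0}


def fake_train_step():
    state["accuracy"] = min(1.0, state["accuracy"] + 0.25)


def fake_evaluate():
    return state["accuracy"]


stopped_at, accuracies = train_until_ideal(
    fake_train_step, fake_evaluate, target_accuracy=0.9, max_epochs=10
)
print(stopped_at, accuracies)
```

The epoch budget is included only so that the loop terminates even if the threshold is never reached.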
As shown in fig. 2, the method for constructing the one-dimensional deep convolutional neural network I model includes:
(1) constructing a convolution pooling layer $L_j$:
$L_j = \{C_j, P_j, B_j\}$
where $C_j$, $P_j$ and $B_j$ are the convolution layer, the pooling layer and the normalization layer, respectively, used for feature extraction, and j is the index of the convolution pooling module.
(2) Stacking 4 convolution pooling layers to construct a feature extraction module S, $S = \{L_1, L_2, L_3, L_4\}$.
(3) Adding a feature classification module D, $D = \{L_G, L_F, L_{\mathrm{softmax}}\}$, which comprises a global mean pooling layer $L_G$, a fully connected layer $L_F$ and a Softmax layer $L_{\mathrm{softmax}}$, thereby completing the network construction.
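The layer structure above can be illustrated with a deliberately simplified, single-channel sketch in plain Python. The kernel values, the pooling size, the ReLU activation and the use of per-sample standardization as a stand-in for the normalization layer $B_j$ are all assumptions; the disclosure does not specify these hyperparameters, and a practical implementation would use a deep learning framework with learned multi-channel kernels.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as in most CNN libraries)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def normalize(values, eps=1e-5):
    """Standardize to zero mean and unit variance (stand-in for B_j)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def conv_pool_block(signal, kernel, pool_size):
    """One block L_j = {C_j, P_j, B_j}: convolution, pooling, normalization."""
    activated = [max(0.0, v) for v in conv1d(signal, kernel)]  # ReLU (assumed)
    return normalize(max_pool(activated, pool_size))


def global_mean_pool(values):
    """Global mean pooling L_G: collapse a feature map to a single value."""
    return sum(values) / len(values)


signal = [float(v) for v in range(64)]       # toy input of 64 points
features = signal
for _ in range(4):                           # S = {L_1, L_2, L_3, L_4}
    features = conv_pool_block(features, kernel=[0.5, 0.5], pool_size=2)
print(len(features), global_mean_pool(features))
```

Each block shortens the feature map, so the window length chosen in step S21 must be large enough to survive four rounds of convolution and pooling.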
The source domain data training set passes through the convolution kernel operation, pooling operation and normalization operation of $C_j$, $P_j$ and $B_j$ in each convolution pooling layer of the feature extraction module to output features; stacking the 4 convolution pooling layers $S = \{L_1, L_2, L_3, L_4\}$ yields the final features. The final features pass through the global mean pooling layer, which outputs the feature value $y_{fg}$; the fully connected layer performs feature combination and a Dropout operation on $y_{fg}$ and outputs the feature value $y_t$; $y_t$ is then input to the Softmax classifier, which outputs the probability distribution over the fault types of the source domain, and the fault type with the maximum probability is taken as the diagnosis result.
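The final classification step, in which the Softmax output is converted into a diagnosis, can be sketched as follows. The score values and the class names are hypothetical; only the softmax-then-maximum rule comes from the description above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diagnose(logits, class_names):
    """Return the most probable class and the full probability distribution."""
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities


# Hypothetical scores y_t from the fully connected layer for four fault types.
names = ["fault type 1", "fault type 2", "fault type 3", "fault type 4"]
diagnosis, probs = diagnose([0.2, 2.5, 0.1, -1.0], names)
print(diagnosis, [round(p, 3) for p in probs])
```

Subtracting the largest score before exponentiation leaves the probabilities unchanged and avoids overflow for large scores.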
S42, performing transfer learning on the target domain data set $M_T$ by using the hierarchical structure of the convolutional neural network: freezing the weight parameters of the feature extraction module S of network model I and of the global mean pooling layer $L_G$ and the fully connected layer $L_F$ in the feature classification module, and adding a new Softmax layer so that network model I adapts to the target domain data set, thereby completing the network-level adjustment and constructing a new network $I_2$, as shown in fig. 3.
The target domain data training set is used to train the new network $I_2$ and update the Softmax layer, and the network is tested on the target domain test set; if the test result is satisfactory, the transfer learning is finished; otherwise, network iteration and back propagation continue until the network achieves a satisfactory result on the test set.
S43, fine-tuning network $I_2$ by locking the weight parameters of the feature classification module D and of layers $L_1$, $L_2$ and $L_3$ and unfreezing the parameters of layer $L_4$, thereby obtaining the fine-tuned network $I_3$, as shown in fig. 4.
The target domain training data set is used to train network $I_3$, so that the network extracts deep abstract features from the target domain data set; the features pass through the fully connected layer $L_F$ and the Softmax layer, which outputs the probability distribution over the fault types of the target domain, and the fault type with the maximum probability is taken as the diagnosis result.
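The freezing schedule across the three networks can be summarized as a table of trainable flags. In this sketch the stage names, the layer identifiers and the assumption that every layer is trainable during pre-training are illustrative; in a deep learning framework the flags would be applied to the corresponding parameter groups.

```python
LAYERS = ["L1", "L2", "L3", "L4", "L_G", "L_F", "softmax"]


def trainable_layers(stage):
    """Map each layer to True (trainable) or False (frozen) for one stage."""
    if stage == "I":        # pre-training on the source domain (assumed: all trainable)
        unfrozen = set(LAYERS)
    elif stage == "I2":     # transfer: only the new Softmax layer is trained
        unfrozen = {"softmax"}
    elif stage == "I3":     # fine-tuning: D and L1-L3 locked, L4 unfrozen
        unfrozen = {"L4"}
    else:
        raise ValueError(f"unknown stage: {stage}")
    return {name: name in unfrozen for name in LAYERS}


for stage in ("I", "I2", "I3"):
    flags = trainable_layers(stage)
    print(stage, [name for name, on in flags.items() if on])
```

Keeping most layers frozen is what lets training for a new load touch only a small fraction of the parameters.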
S44, acquiring the original fault signals of the software system of step S3 in real time and feeding them to network $I_3$ obtained in step S43 to obtain the fault monitoring result for the current software system.
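Real-time monitoring as in step S44 might be organized as a small streaming buffer that collects incoming response times and classifies each full window. The class name `StreamMonitor`, the threshold rule in `toy_classifier` and the sample values are hypothetical; the classifier passed in stands for the fine-tuned network and is not a trained model here.

```python
from collections import deque


class StreamMonitor:
    """Buffer incoming response times and classify each full window."""

    def __init__(self, classify, window_length, step):
        self.classify = classify            # stand-in for the fine-tuned network
        self.window = deque(maxlen=window_length)
        self.step = step
        self.since_last = 0                 # points received since last result

    def push(self, value):
        """Add one measurement; return a diagnosis when a window is due."""
        self.window.append(value)
        self.since_last += 1
        if len(self.window) == self.window.maxlen and self.since_last >= self.step:
            self.since_last = 0
            return self.classify(list(self.window))
        return None


def toy_classifier(sample):
    """Placeholder rule, NOT the trained network: flag slow windows."""
    return "downtime" if sum(sample) / len(sample) > 500 else "normal operation"


monitor = StreamMonitor(toy_classifier, window_length=4, step=2)
results = [monitor.push(v) for v in [10, 12, 11, 13, 900, 950, 990, 999]]
print(results)
```

Using the same window length and step length as in training keeps the live samples comparable to the samples the network was trained on.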
The invention also provides a software system load diagnosis system based on deep convolution transfer learning, which comprises: a source domain sample data set construction module, a source domain data set construction module, a target domain data set construction module and a fault monitoring module;
the source domain sample data set construction module collects a software system load data set under the existing load S and constructs a source domain sample data set;
the source domain data set construction module performs point number segmentation on each group of original response time to construct a source domain data set;
the target domain data set construction module constructs a target domain sample data set, and performs point segmentation on each group of original response time in the target domain sample data set to construct a target domain data set;
and the fault monitoring module carries out fault monitoring on the software system by using the deep convolution transfer learning on the source domain data set and the target domain data set.
In the above embodiment, in the fault monitoring module, the fault monitoring includes the following steps:
inputting the source domain data training set into a one-dimensional deep convolutional neural network I for pre-training and initializing the network parameters, and testing the network on the source domain test set; if the test result is satisfactory, the parameters are fixed and pre-training ends; otherwise, the network continues to be adjusted through back propagation and the parameters are updated until the network achieves a satisfactory result on the test set;
performing transfer learning on the target domain data set $M_T$ by using the hierarchical structure of the convolutional neural network: freezing the weight parameters of the feature extraction module S of network model I and of the global mean pooling layer $L_G$ and the fully connected layer $L_F$ in the feature classification module, and adding a new Softmax layer so that network model I adapts to the target domain data set, thereby completing the network-level adjustment and constructing a new network $I_2$;
fine-tuning network $I_2$ by locking the weight parameters of the feature classification module D and of layers $L_1$, $L_2$ and $L_3$ and unfreezing the parameters of layer $L_4$, thereby obtaining the fine-tuned network $I_3$;
acquiring original fault signals of the software system in real time and feeding them to network $I_3$ to obtain the fault monitoring result for the current software system.
In conclusion, the invention constructs a one-dimensional deep convolutional neural network, performs transfer learning by using the hierarchical structure of that network, and thereby provides a software system fault monitoring method based on deep convolution transfer learning. The invention uses the existing source domain data set to pre-train the one-dimensional convolutional neural network, and uses the hierarchical structure of the one-dimensional convolutional neural network to complete the transfer learning on the target domain data set.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.