CN114609994B - Fault diagnosis method and device based on multi-granularity regularized rebalancing incremental learning - Google Patents


Info

Publication number
CN114609994B
Authority
CN
China
Prior art keywords
granularity
model
fault
new
old
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210174747.3A
Other languages
Chinese (zh)
Other versions
CN114609994A (en)
Inventor
王煜
陈慧彤
胡清华
Current Assignee
Tianjin University
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202210174747.3A priority Critical patent/CN114609994B/en
Publication of CN114609994A publication Critical patent/CN114609994A/en
Application granted granted Critical
Publication of CN114609994B publication Critical patent/CN114609994B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05B - CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 - Testing or monitoring of control systems or parts thereof
    • G05B23/02 - Electric testing or monitoring
    • G05B23/0205 - Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0259 - Electric testing or monitoring characterized by the response to fault detection
    • G05B23/0262 - Confirmation of fault detection, e.g. extra checks to confirm that a failure has indeed occurred
    • G05B2219/00 - Program-control systems
    • G05B2219/20 - Pc systems
    • G05B2219/24 - Pc safety
    • G05B2219/24065 - Real time diagnostics


Abstract

The invention discloses a fault diagnosis method and device based on multi-granularity regularized rebalancing incremental learning. The method comprises the following steps: constructing a continuous label with multi-granularity information and optimizing it with a KL-divergence loss; using a feature extraction layer to obtain feature expression vectors of the new and old classes, constraining the decision output of the current model through knowledge distillation to match the output distribution of the model before incremental learning, and applying relatively low weights to new classes with many samples based on a multi-granularity regularization term, balancing the gap between the gradient updates of the new and old fault classes and alleviating the class imbalance; and adopting a two-stage training strategy in which the first stage trains on the currently available data and updates the feature extraction layer, while the second stage decouples the classifier from the feature extraction layer, i.e., the parameters of the feature extraction layer are frozen and the classifier is retrained with a resampled, class-balanced training subset. The device comprises a processor and a memory. The invention improves the continuous-learning ability of the fault diagnosis model, thereby improving the fault recognition ability of equipment.

Description

Fault diagnosis method and device based on multi-granularity regularized rebalancing incremental learning
Technical Field
The invention relates to the field of intelligent fault diagnosis with machine learning, in particular to a fault diagnosis method and device based on multi-granularity regularized rebalancing incremental learning.
Background
With the rapid development of modern technology, large equipment is continuously renewed and is widely used in fields such as transportation, energy, electric power and aerospace; its safe, stable, reliable and efficient operation is closely tied to national development and national defense construction. At the same time, as running time accumulates, equipment inevitably develops serious failures that can cause disastrous accidents, and the complexity and precision of large equipment make its failures difficult to detect by simple means. How to design intelligent fault diagnosis for large equipment has therefore become an urgent problem for industrial intelligence.
Existing intelligent fault diagnosis methods collect sensor information from the equipment at different moments, design and train a deep neural network on the curated data, deploy it, and predict on newly collected sensor data to judge whether the equipment is currently in a normal state or some fault state. Although this approach can achieve good performance, the intelligent fault diagnosis model cannot be updated once deployed. Generally, the types of failures of equipment change gradually with age, and failures that have never occurred before will appear. Because the existing methods can recognize only the old fault types already learned, they are difficult to use in practice.
Incremental learning is a way of updating a model to learn a new task: the model learns the new task on the basis of previous tasks, achieving the ability to process both new and old tasks. In general, an incremental learning model retains knowledge of old tasks by keeping a small portion of representative old-task data while learning a large amount of new-task data. However, the data volume of the new task is far higher than that of the old tasks, so when learning the new task the deep neural network changes important parameters related to the old tasks and develops a severe bias between new and old tasks: the parameters are biased toward learning the new task, and the recognition ability on the old tasks drops greatly.
To solve this problem, existing methods design deviation-correction mechanisms such as adding a linear correction layer or cosine-normalizing the outputs of the new and old classes, but these depend to a great extent on assumptions about the deviation relationship between new and old classes. In practical applications such assumptions are hard to establish, so model performance suffers and it is difficult to adapt to complex real-world application scenarios. How to solve the forgetting of old tasks in incremental learning is therefore the key to whether an intelligent fault diagnosis model can be effectively applied at scale.
Disclosure of Invention
The invention provides a fault diagnosis method and device based on multi-granularity regularized rebalancing incremental learning. Addressing the problem that, in the incremental learning of fault diagnosis tasks, no prior assumption about the deviation distribution between new and old fault classes is available, a multi-granularity regularized rebalancing method is designed so that the constrained model learns both the balanced data distribution and the correlations among different fault classes. The model thereby accurately learns to recognize new faults while reducing the forgetting of knowledge about already-learned old faults, improving the continuous-learning ability of the fault diagnosis model, and in turn improving the fault recognition ability of large equipment, greatly increasing equipment safety and reducing the failure rate. Details are described below:
In a first aspect, a fault diagnosis method based on multi-granularity regularized rebalancing incremental learning comprises:
carrying out category word vector representation on the divided data set, obtaining word vectors corresponding to semantic tags of the data set, and clustering by using a K-means algorithm to obtain a two-layer multi-granularity structure; constructing a continuous label with multi-granularity information, and optimizing by utilizing KL divergence loss;
the feature extraction layer is used for obtaining feature expression vectors of new and old types, decision output of the current model is constrained to be identical to output distribution of the model before incremental learning through knowledge distillation, relatively low weight is applied to the new failure types with a large number of samples based on a multi-granularity regularization term, gap between the new failure types and gradient update of the old failure types is balanced, and type imbalance is relieved;
and adopting a two-stage training strategy, performing a first-stage training by using the data which can be acquired currently, updating the feature extraction layer, decoupling the classifier from the feature extraction layer during a second-stage training, namely freezing parameters of the feature extraction layer, and retraining the classifier by adopting a resampled balance training subset.
The first-stage training process is as follows: the new-class samples D_new and the old-class exemplar set D′_old are mixed, denoted D_t, and used as the input of the model; the output of the network is obtained through the feature extraction layer and the classifier, and the optimization target is to minimize the weighted sum of the distillation loss and the loss of the multi-granularity regularized rebalancing module.
Further, the second-stage training process is as follows: the data are resampled into class-balanced training subsets, the parameters of the feature extraction layer are frozen, and the classifier is retrained alone; the optimization objective is again to minimize the weighted sum of the distillation loss and the loss of the multi-granularity regularized rebalancing module.
In a second aspect, a fault diagnosis apparatus based on multi-granularity regularized rebalancing incremental learning comprises a processor and a memory in which program instructions are stored; the processor invokes the program instructions stored in the memory to cause the apparatus to perform the method steps of any of the first aspect.
In a third aspect, a computer readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method steps of any of the first aspects.
The technical scheme provided by the invention has the beneficial effects that:
1. the invention considers rebalancing modeling and the correlation between fault classes at the same time, further improving the learning ability for new and old classes, alleviating the catastrophic forgetting phenomenon of the model, and improving the recognition ability of fault diagnosis; this in turn improves the fault recognition ability of large equipment, greatly increases equipment safety, and reduces the equipment failure rate;
2. the invention provides a new multi-granularity regularized rebalancing method for the intelligent fault diagnosis task of large equipment in industrial scenarios; the method needs no prior assumption about the deviation relation between new and old classes, and mitigates the catastrophic forgetting of old tasks in the incremental learning of an intelligent fault diagnosis model;
3. the invention designs a multi-granularity regularized rebalancing module that constrains the influence of the rebalancing module on model updating according to the number of samples of the different fault classes, and embeds hierarchy-based class correlation into the learning process by constructing a hierarchical structure over the fault classes;
4. the invention reaches state-of-the-art performance in incremental learning tasks such as fault diagnosis; compared with the comparison methods, the average-accuracy improvement reaches up to 7.47% and 9.99%.
Drawings
FIG. 1 is a flow chart of the fault diagnosis method based on multi-granularity regularized rebalancing incremental learning;
FIG. 2 is a schematic diagram of the multi-granularity structure construction of the large fault diagnosis dataset FARON;
FIG. 3 is a comparative schematic on the large fault diagnosis dataset: compared with a method adopting only rebalancing, multi-granularity regularization improves the accuracy of all classes, and the accuracy of most classes improves markedly;
FIG. 4 is a schematic structural diagram of the fault diagnosis device based on multi-granularity regularized rebalancing incremental learning;
table 1 shows the comparison of the accuracy and average accuracy of the final stage on the published data set for the different methods;
table 2 shows the comparison of the accuracy and average accuracy of the last stage on the fault diagnosis data set collected in the real scene by different methods;
table 3 shows the effectiveness of the various components of the present method for an ablation experiment at a FARON-22/22 setting.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in further detail below.
Example 1
Only a small number of samples of the learned old fault classes are saved, while a large number of samples are available for learning the new fault classes; incrementally learning new fault classes in a fault diagnosis task therefore causes severe degradation of the ability to recognize old fault classes. To solve this problem stated in the background art, the embodiment of the invention provides a fault diagnosis method based on multi-granularity regularized rebalancing incremental learning, which maintains the ability to recognize previously learned old fault classes while incrementally learning new faults, so that the fault diagnosis model is updated and continuously learns quickly and efficiently. The method comprises the following steps:
101: loss of weight balance;
wherein, the re-balance loss learns the updating weight of the category based on the sample number of the new and old fault category, and the identification capability is maintained by increasing the gradient updating of the old fault category with sparse samples.
102: a multi-granularity regularization term;
the new class fault learning is affected only by using the re-balance loss, so that a multi-granularity regularization term is provided, and the relation between the fault classes is learned by using the hierarchical structure information constraint model of the fault classes, so that the new class and the old class can be identified.
103: the classifier is trained.
A two-stage training strategy is adopted: the first-stage training uses the currently available data and updates the feature extraction layer; in the second-stage training the classifier is decoupled from the feature extraction layer, i.e., the parameters of the feature extraction layer are frozen and the classifier is retrained with a resampled, class-balanced training subset.
In summary, through steps 101-103 the embodiment of the invention considers rebalancing modeling and the correlation between fault classes at the same time, further improving the learning ability for new and old classes, alleviating the catastrophic forgetting phenomenon of the model, improving the recognition ability of fault diagnosis, greatly increasing the safety of equipment, and reducing the equipment failure rate.
Example 2
The scheme of example 1 is further described in conjunction with fig. 1, the calculation formulas, and the tables, and is described in detail below:
1. model frame
The model framework consists of a feature extraction module, knowledge distillation, multi-granularity regularized rebalancing modeling, and a classifier, and is trained in two stages, as shown in fig. 1. The following two components are the core of the method:
(1) Rebalancing modeling: a rebalancing strategy is used in the training process to mitigate the effect of data imbalance;
The foregoing rebalancing strategy is well known to those skilled in the art, and the embodiment of the invention does not describe it further here.
(2) Multi-granularity regularization: because using only the rebalancing strategy decreases the classification accuracy of the new classes (i.e., under-fits them), the embodiment of the invention designs a multi-granularity regularization term that lets the model take class correlation into account.
2. Data partitioning, sampling and construction of multi-granularity structure
1. Data set partitioning
The embodiment of the invention refers to the classes the model has already learned in its current state as old classes, and the classes not yet learned but to be learned at this stage as new classes. For the incremental-learning fault diagnosis task, the classes of the experimental dataset must be divided into batches to simulate the arrival of new classes. To ensure experimental fairness, the class learning sequence was set uniformly according to comparative methods such as iCaRL and BiC (known to those skilled in the art), and the random seed was set to 1993. A division is identified as n/m, where n is the number of old classes learned in the initial stage and m is the number of new classes learned in each later round, until all classes of the dataset have been learned. The CIFAR10, CIFAR100 and MiniImageNet datasets and the FARON fault diagnosis dataset are divided as follows:
(1) The CIFAR10 dataset has 10 classes in total, each containing 5000 training samples and 1000 test samples. The dataset was divided under two settings, 1/1 and 2/2, i.e., the 10 classes are split evenly into incremental batches of 1 class or of 2 classes.
(2) The CIFAR100 dataset has 100 classes in total, each containing 500 training samples and 100 test samples. 50 of the 100 classes are used for initial learning, and the remaining 50 classes are divided into incremental batches of 5 or of 10 classes, recorded as the 50/5 and 50/10 division modes.
(3) The MiniImageNet dataset contains 100 classes of 600 samples each. The embodiment of the invention divides the 600 samples of each class into a training set and a test set at a fixed ratio. The experiments use the 10/10 and 20/20 division modes.
(4) The FARON fault diagnosis dataset. The dataset is large-equipment fault diagnosis data; its 121 sensor output dimensions describe the current running state of the system from 121 different aspects. For effective fault-diagnosis modeling, the data used must include normal operation data and fault data under different working conditions. The normal operation data cover 5.32 hours of running, with 76632 signal samples collected in total. The dataset has 66 classes in total, including 1 normal class and 65 fault classes. The random seed of the dataset learning sequence is set to 36, ensuring that the normal class is learned in the initial stage. The experiments use the 11/11, 22/22 and 33/33 division modes.
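The n/m division described above can be sketched in a few lines (the helper name is hypothetical; the fixed seed mirrors the seeded class order stated in the text):

```python
import random

def incremental_batches(num_classes, n, m, seed=1993):
    """Shuffle the class order with a fixed seed, then cut it into an
    initial batch of n classes followed by batches of m new classes."""
    order = list(range(num_classes))
    random.Random(seed).shuffle(order)  # reproducible class learning sequence
    batches = [order[:n]]
    for start in range(n, num_classes, m):
        batches.append(order[start:start + m])
    return batches
```

For example, the FARON-22/22 setting yields three batches of 22 classes, and CIFAR100 under 50/10 yields one batch of 50 followed by five batches of 10.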
2. Sampling strategy
The experiments use a random sampling strategy and assume the storage space for saved samples is fixed; the maximum number of stored samples is denoted K. For the public datasets, K is set to 2000; for the fault diagnosis dataset, K is 8000.
Limited by this storage space, the training sets of the old classes and of the new classes differ in per-class sample count, while the old-class and new-class validation sets have the same per-class sample count and are used in the balanced training stage. The total number of stored old-class training and validation samples is at most the storage upper limit K. Specifically, the embodiment of the invention sets the sample ratio of the old-class training set to the old-class validation set to 9:1, and the sizes of the old-class training set D′_old, the old-class validation set, the new-class validation set, and the new-class training set D_new follow accordingly.
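A minimal sketch of the exemplar-budget arithmetic, under the assumption (not fixed by the text) that the budget K is split evenly across the learned old classes and then 9:1 into training and validation parts:

```python
def exemplar_quota(K, n_old_classes):
    """Per-old-class stored-sample counts under a fixed total budget K.
    Assumes an even per-class split, then a 9:1 train/validation ratio."""
    per_class = K // n_old_classes      # stored samples per old class
    train = per_class * 9 // 10         # 9 parts kept for training
    val = per_class - train             # 1 part kept for balanced validation
    return train, val

# e.g. public datasets: exemplar_quota(2000, n); fault dataset: exemplar_quota(8000, n)
```

The total stored count never exceeds K, matching the constraint stated above.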
3. constructing a multi-granularity structure for partitioned data sets
GloVe (Global Vectors for Word Representation) is a commonly used word-vector representation method based on global word-frequency statistics, mapping words into a high-dimensional space. Using a pre-trained word-vector library, each word can be converted into a 300-dimensional word vector. After obtaining the representation of the words in the feature space, the embodiment of the invention clusters the semantic vectors with a clustering algorithm, which builds the classes seen so far into a two-layer hierarchical structure. The embodiment uses the K-means algorithm to cluster the word vectors converted from the labels.
The optimization objective of the K-means algorithm is:
arg min_S Σ_{i=1..k} Σ_{x∈S_i} ||x − μ_i||²
where x = (x_1, x_2, ..., x_n) denotes the feature representation of each point and μ_i is the mean of all points in cluster S_i. The algorithm seeks clusters S_i satisfying the above: the n points are divided into k clusters such that the sum of squared errors of all points within a cluster is minimized. The smaller the parameter k, the fewer superclasses are formed and the weaker the correlation between classes under the same superclass; the larger k, the more superclasses are formed and the stronger the correlation between classes belonging to one superclass.
The algorithm flow first loads a pre-trained GloVe model to obtain the word vectors corresponding to the semantic labels, then clusters them with the K-means algorithm to obtain a two-layer hierarchical structure.
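As a toy illustration of this clustering step, the K-means objective above can be minimized with Lloyd iterations; the sketch below runs on 2-D points rather than 300-D GloVe vectors, and the function name is hypothetical:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternately assign points to their nearest center
    and recompute each center as the mean of its cluster, which decreases
    the within-cluster sum of squared errors."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise with k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # keep the old center if a cluster happens to be empty
        centers = [tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters
```

Applied to label word vectors, each resulting cluster plays the role of one superclass in the two-layer hierarchy.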
3. Feature extraction module
The embodiment of the invention uses a residual network (ResNet) as the feature extractor: ResNet32 and ResNet18 are used to extract features for the CIFAR and MiniImageNet datasets, respectively. For the large fault diagnosis dataset, a CNN is used to extract features.
Wherein, resNet32, resNet18 and CNN networks are all well known to those skilled in the art, and the embodiments of the present invention will not be described in detail.
4. Knowledge distillation module
The model that learned N_old classes in the previous round serves as the teacher model, and the model learning the N_old + N_new classes of the current round is called the student model. The input of the student model is the union of the sampled old-class exemplars and the new-class samples, written D_t = D′_old ∪ D_new. After the same data pass through the teacher model, the teacher's pre-softmax outputs are each divided by the temperature coefficient T and softmax-normalized; this result is recorded as the soft label π(z′). The probabilities output by the student model are likewise computed from logits divided by the temperature T and denoted π(z). Next, the first N_old prediction probabilities of π(z) are taken and the distillation loss is computed:
L_D = − Σ_{i=1..N_old} π_i(z′) log π_i(z)
where the prime (′) denotes a sampled exemplar set, e.g. D′_old denotes the sampled old-class samples. T is the distillation temperature: the larger T, the more uniform the generated probability distribution; when T is set to 1, the distillation loss is equivalent to the cross-entropy loss. The embodiment of the invention sets T to 2.
Algorithm 1 below summarizes the operation of the knowledge distillation technique applied to the invention.
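Since the algorithm listing itself is not reproduced here, the following is a minimal pure-Python sketch of temperature-scaled distillation; the function names are hypothetical, and normalizing over the old classes only (rather than slicing after a full softmax) is an assumption not fixed by the text:

```python
import math

def softened_probs(logits, T):
    """pi(z): softmax of the logits divided by the distillation temperature T."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, n_old, T=2.0):
    """Cross-entropy between the teacher's soft labels pi(z') and the
    student's softened predictions pi(z), over the first n_old classes."""
    p_teacher = softened_probs(teacher_logits[:n_old], T)
    p_student = softened_probs(student_logits[:n_old], T)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
```

A student whose old-class logits match the teacher's incurs the minimum possible loss (the teacher's entropy); diverging from the teacher increases it.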
5. Multi-granularity regularized rebalancing module
The multi-granularity regularized rebalancing module comprises two components. First is the modeling of the rebalance. A rebalancing strategy is used in the training process to mitigate the effects of data imbalance; and secondly, multi-granularity regularization. Because the use of only the rebalancing strategy results in a decrease in the accuracy of classification of new classes, a multi-granularity regularization term is designed to allow the model to take class correlation into account.
Rebalancing modeling applies relatively high weights to sample-sparse classes and relatively low weights to classes with many samples, balancing the gap between the gradient updates of the new and old classes and reducing the class imbalance. In the class-weight formulation, z_i denotes the network output (logit) for class i, n_i denotes the number of samples of class i, γ is a hyper-parameter modulating the weights, N denotes the total number of classes, and the predicted probability that a sample belongs to class j is derived from the network outputs z_j; the rebalance loss L_CB is the resulting class-weighted classification loss.
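The exact weighting formula is not recoverable from the text above; as a stand-in, this hedged sketch uses simple inverse-frequency weighting, w_i ∝ n_i^(−γ), which reproduces the stated behavior (sparse old classes get high weight, sample-rich new classes get low weight):

```python
def rebalance_weights(class_counts, gamma=1.0):
    """Assumed inverse-frequency class weights, normalised so the weights
    sum to the number of classes; the true formula in the patent may differ."""
    raw = [1.0 / (n ** gamma) for n in class_counts]
    scale = len(class_counts) / sum(raw)
    return [w * scale for w in raw]
```

These weights would multiply the per-class terms of the classification loss to form L_CB.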
the embodiment of the invention improves the supervision intensity of the old category by means of the re-balance loss, so that the negative effect of the new category and the old category due to the obvious difference of the sample number is further weakened. Besides using a rebalancing method, the embodiment of the invention further designs a multi-granularity regularization term according to the influence of the sample number of different fault categories on model update, and class correlation based on the hierarchical structure is embedded into a learning process by constructing the hierarchical structure of the fault category.
First, the class hierarchy is obtained from the dataset or by clustering the semantics: the semantic labels are converted into word vectors, and semantic clustering with the K-means algorithm yields a two-layer hierarchy. Subsequently, continuous labels with multi-granularity information are constructed and optimized with a KL-divergence loss.
The construction of the continuous label with multi-granularity information uses the following notation: C denotes the node of the true label, A the current class node, M the multi-granularity label, and N the set of leaf nodes. β is a hyper-parameter with value range (0, +∞) that controls the label distribution: the smaller β, the more uniform the label distribution; the larger β, the sharper the label distribution. The function d(N_i, N_j) measures the distance between two fine-grained classes in the hierarchical structure; LCS denotes the least common subtree of the two nodes, and Height(B) denotes the height of the subtree whose root is node B. The continuous label with multi-granularity information is denoted M.
The multi-granularity regularization loss is computed with the KL divergence between the multi-granularity label M and the predictive probability q_i:
L_H = KL(M ‖ q) = Σ_i M_i log(M_i / q_i)
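The text fixes only how β shapes the label distribution; an exponential-decay form M_i ∝ exp(−β·d(C, N_i)) is one assumption consistent with those limits, sketched below together with the KL divergence:

```python
import math

def soft_label(dists, beta):
    """dists[i] = tree distance d(C, N_i) from the true leaf C to leaf N_i
    (so d = 0 for the true class).  Smaller beta -> more uniform label;
    larger beta -> sharper label.  The exponential form is an assumption."""
    raw = [math.exp(-beta * d) for d in dists]
    s = sum(raw)
    return [r / s for r in raw]

def kl_divergence(m, q):
    """L_H: KL divergence between the multi-granularity label m and prediction q."""
    return sum(mi * math.log(mi / qi) for mi, qi in zip(m, q) if mi > 0)
```

The true class receives the largest label mass, siblings under the same superclass receive more mass than distant classes, and β = 0 degenerates to a uniform label.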
The loss function of the multi-granularity regularized rebalancing module is:
L_M = (1 − λ)L_CB + L_H    (13)
where L_CB is the rebalance loss and L_H is the multi-granularity regularization loss.
algorithm 2 below summarizes the rebalancing module of multi-granularity regularization.
6. Classifier
The classifier is a single fully connected layer; the number of output neurons equals the number of all classes seen in the current and previous stages, and the corresponding number of neurons is added as incremental learning proceeds.
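Growing the output layer as new classes arrive can be sketched as appending rows to the fully connected weight matrix (the helper name and zero initialisation are assumptions; old-class rows are left untouched):

```python
def expand_classifier(weights, n_new, n_features, init=0.0):
    """weights: one row of fully connected weights per class seen so far.
    Appends n_new rows (output neurons) for the newly arriving classes
    while preserving the existing rows."""
    return weights + [[init] * n_features for _ in range(n_new)]
```

The preserved rows keep the old-class decision boundaries; the new rows are then trained in the current incremental stage.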
7. Training strategy
The model is trained in two stages. In the first stage, the new-class samples D_new and the old-class exemplar set D′_old are mixed, denoted D_t, and used as the input to the model. The output of the network is obtained through the feature extraction layer and the classifier, and the optimization target is to minimize the weighted sum of the distillation loss and the loss of the multi-granularity regularized rebalancing module.
The total loss function formula is shown in (14):
L = L_M + αλL_D    (14)
where λ adjusts the degree of old-knowledge retention: as incremental learning proceeds, λ grows progressively larger and the retention of old-class knowledge becomes progressively stronger. α = 10^(−x) adjusts the loss L_D to the same order of magnitude as L_M; x is typically assigned the value 0 or 1.
After the training on unbalanced data ends, the second-stage training adds an extra step to eliminate classifier bias: the CNN (feature-extraction) layers are decoupled from the classifier, their parameters are frozen, and the classifier is retrained. Concretely, after the balanced validation subset has been constructed, the CNN parameters are frozen and the classifier alone is retrained on that balanced subset. This step is performed at every stage except the initial one (always after training with the unbalanced samples), since the samples of the initial stage are already balanced.
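The total loss of formula (14) can be sketched with an assumed λ schedule, λ = N_old / (N_old + N_new), which is consistent with the statement that λ grows as incremental learning proceeds; the exact λ formula is not recoverable from the text:

```python
def total_loss(l_m, l_d, n_old, n_new, x=0):
    """L = L_M + alpha * lambda * L_D (formula 14).
    lambda (assumed) grows with the share of old classes, strengthening
    knowledge retention over successive incremental stages;
    alpha = 10**(-x) scales L_D to the order of magnitude of L_M."""
    lam = n_old / (n_old + n_new)
    alpha = 10.0 ** (-x)
    return l_m + alpha * lam * l_d
```

Under this schedule, later stages (more old classes) weight the distillation term more heavily, as the text requires.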
In summary, the embodiment of the invention realizes that the correlation between the rebalance modeling and the fault category is considered at the same time through the parts, so that the learning ability of new and old categories is further improved, the catastrophic forgetting phenomenon of the model is lightened, the identification ability of fault diagnosis is improved, the safety of equipment is greatly improved, and the fault rate of the equipment is reduced.
Example 3
The feasibility of the schemes in Examples 1 and 2 is verified with specific experimental data in conjunction with figs. 2 and 3, as described in detail below:
the verification of the method is first performed using the public dataset. The hierarchical structure of the data sets is gradually built by semantic clustering.
For CIFAR10 and CIFAR100, the learning rate starts at 0.01 and is divided by 10 at iterations 70 and 140. For MiniImageNet, the batch size is set to 128, the learning rate starts at 0.01 and is multiplied by 0.1 after every 30 iterations, and the SGD optimizer is adopted with weight decay 2e-4 and momentum 0.9. The number of training epochs for CIFAR10, CIFAR100 and MiniImageNet is set to 200, 200 and 100, respectively. The data batch sizes are all set to 128, using the Adam optimizer. For the memory limit K, K = 2,000 is set on all public datasets.
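The step-decay learning-rate schedule described above can be sketched as a plain function (milestones and decay factor taken from the CIFAR settings; a framework scheduler such as PyTorch's `MultiStepLR` would normally be used instead):

```python
def stepped_lr(epoch, base_lr=0.01, milestones=(70, 140), gamma=0.1):
    """Step-decay schedule used for CIFAR10/100: start at 0.01 and
    multiply by 0.1 (i.e. divide by 10) at epochs 70 and 140."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

For MiniImageNet the analogous call would use `milestones=(30, 60, 90, ...)`, matching the decay every 30 iterations.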
Secondly, the method is verified using the fault diagnosis dataset. FIG. 2 is a schematic diagram of the multi-granularity structure construction for the large fault diagnosis dataset. On the right side of FIG. 2, it can be seen that the categories numbered 1, 5, 27 and 28 share similar characteristics and fall under the coarse-grained category of service pump failure. Taking the FARON-22/22 experimental setup as an example, the fine-grained fault category numbered 5 is learned in the initial stage, and the categories numbered 1, 27 and 28 are learned in the last incremental learning stage. Through multi-granularity regularization, the structural information of the data can be injected into incremental learning, and the correlation between fault categories is maintained.
In the embodiment of the invention, the random seed of the dataset learning order is set to 36 to ensure that the normal category is learned in the initial stage. For this dataset, the embodiment adopts the Adam optimizer with a learning rate of 1e-5 and trains for 50 epochs. The memory upper limit is set to K = 8,000.
The experiments in the embodiment of the invention follow a standard evaluation protocol: the model's retention of old knowledge and its ability to learn new knowledge are evaluated through the incremental accuracy and the average accuracy, where the average accuracy is the mean of the accuracies over all incremental learning stages, the initial stage included. The higher the average incremental accuracy and the average accuracy, the stronger the model's ability to learn new-class samples and to retain old-class knowledge. The experimental results are compared with six existing methods: iCaRL, LwF, LUCIR, MT, BiC and PODNet.
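The two reported metrics reduce to a simple computation over the per-stage accuracies; a minimal sketch (function and key names are illustrative):

```python
def summarize_incremental_run(stage_accuracies):
    """Return the two metrics reported in the tables: 'Last' (accuracy
    at the final incremental stage) and 'Avg' (mean accuracy over all
    stages, the initial stage included)."""
    last = stage_accuracies[-1]
    avg = sum(stage_accuracies) / len(stage_accuracies)
    return {"Last": last, "Avg": avg}
```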
The accuracy and average accuracy at the final incremental stage on the CIFAR10, CIFAR100 and MiniImageNet datasets are shown in Table 1, which is a comprehensive representation of the performance of the model provided by the embodiments of the present invention. Table 2 shows the accuracy and average accuracy of each incremental stage on the actual fault diagnosis dataset, which represents the performance of the model provided by the embodiment of the invention under the actual complex scene fault. The embodiment of the invention considers the unbalanced property of the categories and utilizes the correlation among the categories to apply constraint on the model, thereby improving the performance of the model.
Table 1 incremental learning results (accuracy) over three common data sets.
Where last represents the accuracy of the last stage and Avg represents the average incremental accuracy.
TABLE 2 incremental learning results (accuracy) of fault diagnosis data task
In addition, the validity of each component is verified in the embodiment of the invention, as shown in Table 3. FIG. 3 visualizes the per-category accuracy improvement when using the rebalancing plus multi-granularity regularization method (MGRB) versus using only the rebalancing method (RB).
TABLE 3 results of ablation experiments on FARON Large Fault diagnosis data sets
Where CE is an acronym for the cross-entropy loss, CB is the class-balanced classification loss part of the multi-granularity regularized rebalancing module, MG denotes the multi-granularity regularization part of that module, and Decoupling indicates whether the second-stage training is performed.
Example 4
A fault diagnosis apparatus based on multi-granularity regularized rebalancing incremental learning, see fig. 3, the apparatus comprising: a processor 1 and a memory 2, the memory 2 having stored therein program instructions, the processor 1 calling the program instructions stored in the memory 2 to cause the apparatus to perform the following method steps of Embodiment 1:
a fault diagnosis method based on multi-granularity regularized heavy balance increment learning, the method comprising:
a word-vector representation is performed on the divided dataset to obtain the word vectors corresponding to the semantic labels, and the K-means algorithm is used for clustering to obtain a two-layer multi-granularity structure; a continuous label carrying multi-granularity information is constructed and optimized with a KL-divergence loss;
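The clustering step can be sketched with a minimal K-means over label word-vectors. The implementation below is a plain NumPy stand-in for a library routine, and the word vectors themselves would come from a pretrained embedding that is not shown here:

```python
import numpy as np

def kmeans(vectors, k, iters=20, seed=0):
    """Minimal K-means used to group semantic-label word vectors into
    k coarse-grained clusters, yielding a two-layer (coarse/fine)
    hierarchy: each fine-grained class label becomes a leaf under the
    coarse cluster it is assigned to.
    """
    rng = np.random.default_rng(seed)
    # initialize centers on k distinct input points
    centers = vectors[rng.choice(len(vectors), size=k, replace=False)]
    assign = np.zeros(len(vectors), dtype=int)
    for _ in range(iters):
        # assign every vector to its nearest center
        d = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # move each center to the mean of its assigned vectors
        for j in range(k):
            pts = vectors[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return assign
```

In practice a routine such as scikit-learn's `KMeans` would replace this loop; the sketch only shows how the coarse layer of the hierarchy is obtained from the label embeddings.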
the feature extraction layer is used to obtain feature expression vectors of the new and old categories; knowledge distillation constrains the decision output of the current model to match the output distribution of the model before incremental learning; based on the multi-granularity regularization term, relatively low weights are applied to categories with more samples, balancing the gap between the gradient updates of the new and old fault categories and alleviating the class imbalance;
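The distillation constraint can be sketched as follows; π(z′) and π(z) are the temperature-softened teacher and student outputs described later in the text, and the default temperature T = 2 is an illustrative assumption, since the patent does not fix its value:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax of logits z at temperature T."""
    e = np.exp(z / T - np.max(z / T))
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Distillation term L_D: the logits of both models are divided by
    the temperature T before the softmax layer, and the student's
    softened output pi(z) is pulled toward the teacher's soft labels
    pi(z') via a cross-entropy penalty.
    """
    p_teacher = softmax(teacher_logits, T)   # soft labels pi(z')
    p_student = softmax(student_logits, T)   # pi(z)
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum())
```

The loss is smallest when the student reproduces the teacher's softened distribution, which is what anchors the old-class behavior during incremental updates.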
a two-stage training strategy is adopted: the first-stage training is performed with the currently available data; in the second-stage training the classifier is decoupled from the feature extraction layer, i.e. the parameters of the feature extraction layer are frozen, and the classifier is retrained with a resampled, class-balanced training subset.
The construction of the continuous tag with multi-granularity information specifically comprises the following steps:
wherein C represents the node of the true label, A represents the current class node, M represents the multi-granularity label, N represents the set of leaf nodes, and beta represents a hyperparameter of the function; the formula d(N_i, N_j) measures the distance between two fine-grained categories in the hierarchy, LCS represents the smallest common subtree containing the two nodes, and Height(B) represents the height of the subtree rooted at node B; the result is recorded as the continuous label carrying multi-granularity information
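Since the label-construction formula itself is not reproduced legibly here, the sketch below is only one plausible instantiation consistent with the description: the tree distance d between two fine-grained classes comes from the height of their least common subtree (1 if they share a coarse parent, 2 otherwise in a two-layer hierarchy), each class receives weight exp(-β·d), and the weights are normalized into a continuous label:

```python
import math

def soft_label(leaves, coarse_of, true_leaf, beta=2.0):
    """One plausible instantiation of the continuous multi-granularity
    label (the exact formula is not legible in the source text):
    d = 0 for the true class, d = 1 for classes under the same
    coarse-grained parent, d = 2 across coarse-grained parents,
    weight exp(-beta * d), normalized to sum to 1.
    """
    weights = []
    for leaf in leaves:
        if leaf == true_leaf:
            d = 0
        elif coarse_of[leaf] == coarse_of[true_leaf]:
            d = 1                      # siblings under the same LCS
        else:
            d = 2                      # LCS is the root of the hierarchy
        weights.append(math.exp(-beta * d))
    s = sum(weights)
    return [w / s for w in weights]
```

Such a label keeps most mass on the true class while giving same-coarse-category faults more weight than unrelated faults, which is what the KL-divergence loss then enforces on the model output.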
Further, the loss function of the multi-granularity regularization term is calculated by the following steps:
the loss function of the multi-granularity regularized rebalancing is:
L M =(1-λ)L CB +L H
the total loss function is:
L=L M +αλL D
wherein α = 10^{-x} is used to adjust the loss L_D to the same order of magnitude as L_M.
The first-stage training process comprises: mixing the new-class samples D_new and the old-class exemplar set D_old, denoted D_t, as the input of the model; the output of the network is obtained through the feature extraction layer and the classifier, and the optimization target is to minimize the weighted sum of the distillation loss and the loss of the multi-granularity regularized rebalancing module.
Further, the second stage training process is: sampling the data into training subsets with balanced categories, freezing parameters of a feature extraction layer, and independently retraining a classifier; the optimization objective is to minimize the weighted sum of distillation losses and the loss of the multi-granularity regularized re-balancing module.
It should be noted that the device description in the above embodiment corresponds to the method description in the earlier embodiments and is therefore not repeated here.
The execution main bodies of the processor 1 and the memory 2 may be devices with computing functions, such as a computer, a singlechip, a microcontroller, etc., and in particular implementation, the execution main bodies are not limited, and are selected according to the needs in practical application.
Data signals are transmitted between the memory 2 and the processor 1 via the bus 3, which is not described in detail in the embodiment of the present invention.
Example 5
Based on the same inventive concept, the embodiment of the present invention also provides a computer-readable storage medium, the storage medium comprising a stored program which, when run, controls the device where the storage medium is located to execute the method steps in the above embodiments.
The computer readable storage medium includes, but is not limited to, flash memory, hard disk, solid state disk, and the like.
It should be noted that the readable-storage-medium description in the above embodiment corresponds to the method description in the earlier embodiments and is not repeated here.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions; when these are loaded and executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part.
The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted across a computer-readable storage medium. Computer readable storage media can be any available media that can be accessed by a computer or data storage devices, such as servers, data centers, etc., that contain an integration of one or more available media. The usable medium may be a magnetic medium or a semiconductor medium, or the like.
The embodiment of the invention does not limit the types of other devices except the types of the devices, so long as the devices can complete the functions.
Those skilled in the art will appreciate that the drawings are schematic representations of only one preferred embodiment, and that the above-described embodiment numbers are merely for illustration purposes and do not represent advantages or disadvantages of the embodiments.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (3)

1. A fault diagnosis method based on multi-granularity regularized rebalancing incremental learning, the method comprising:
word vector representation is carried out on the divided fault diagnosis data set, word vectors corresponding to semantic tags are obtained, and the word vectors converted by the semantic tags are clustered by using a K-means algorithm to obtain a two-layer multi-granularity structure;
constructing a continuous label with multi-granularity information, and optimizing by utilizing KL divergence loss;
the feature extraction layer is used for obtaining feature expression vectors of the new and old fault categories, wherein the fault categories already learned by the model in its current state are called old fault categories, and the fault categories that have not been learned and are to be learned at the current stage are called new fault categories;
the decision output of the current model is constrained to be the same as the output distribution of the model before incremental learning through knowledge distillation, relatively low weight is applied to the categories with more samples based on a multi-granularity regularization term, and the gap between the new fault category and the old fault category gradient update is balanced;
adopting a two-stage training strategy, performing a first-stage training by using the currently acquired data, updating a feature extraction layer, decoupling the classifier from the feature extraction layer during a second-stage training, namely freezing parameters of the feature extraction layer, and retraining the classifier by adopting a resampled balance training subset;
the construction of the continuous tag with multi-granularity information comprises the following steps:
wherein C represents the node of the true label, A represents the current class node, M represents the multi-granularity label, N represents the set of leaf nodes, and beta represents a hyperparameter of the function; the formula d(N_i, N_j) measures the distance between two fine-grained categories in the hierarchy, LCS represents the smallest common subtree containing the two nodes, and Height(B) represents the height of the subtree rooted at node B; the result is recorded as the continuous label carrying multi-granularity information;
The calculation mode of the loss function of the multi-granularity regularization term is as follows:
the loss function of the multi-granularity regularized rebalancing is:
L M =(1-λ)L CB +L H
the class-balanced classification loss is expressed as:
wherein w_i represents the class weight, z_i represents the network output for class i, N represents the total number of categories, and z_j represents the network output for class j;
the total loss function is:
L=L M +αλL D
wherein α = 10^{-x} is used to adjust the distillation loss L_D to the same order of magnitude as L_M;
wherein the model that learned the N_old classes in the previous round is used as the teacher model, and the model that needs to learn the N_old + N_new classes in the current round is called the student model; the input of the student model is the union of the sampled old-class samples and the new-class samples, denoted D_t = D_old ∪ D_new, and the corresponding student and teacher outputs are obtained by passing the same data through the two models; before the softmax layer, each term of the teacher model's probability output is divided by the temperature coefficient T, and the result is denoted the soft label π(z′); each term of the student model's output divided by the temperature T is denoted π(z); the first-stage training process is: mixing the new fault-class samples D_new with the old fault-class exemplar set D_old, denoted D_t, as the input of the model; the output of the network is obtained through the feature extraction layer and the classifier, and the optimization target is to minimize the weighted sum of the distillation loss and the loss of the multi-granularity regularized rebalancing module;
the second stage training process is as follows: sampling the data into training subsets with balanced categories, freezing parameters of a feature extraction layer, and independently retraining a classifier; the optimization objective is to minimize the weighted sum of distillation losses and the loss of the multi-granularity regularized re-balancing module.
2. A fault diagnosis device based on multi-granularity regularized rebalancing incremental learning, the device comprising: a processor and a memory, wherein the memory stores program instructions, and the processor calls the program instructions stored in the memory to cause the device to perform the method steps of claim 1.
3. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method steps of claim 1.
CN202210174747.3A 2022-02-24 2022-02-24 Fault diagnosis method and device based on multi-granularity regularized rebalancing increment learning Active CN114609994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210174747.3A CN114609994B (en) 2022-02-24 2022-02-24 Fault diagnosis method and device based on multi-granularity regularized rebalancing increment learning


Publications (2)

Publication Number Publication Date
CN114609994A CN114609994A (en) 2022-06-10
CN114609994B true CN114609994B (en) 2023-11-07

Family

ID=81859844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210174747.3A Active CN114609994B (en) 2022-02-24 2022-02-24 Fault diagnosis method and device based on multi-granularity regularized rebalancing increment learning

Country Status (1)

Country Link
CN (1) CN114609994B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115965057B (en) * 2022-11-28 2023-09-29 北京交通大学 Brain-like continuous learning fault diagnosis method for train transmission system
CN116089883B (en) * 2023-01-30 2023-12-19 北京邮电大学 Training method for improving classification degree of new and old categories in existing category increment learning
CN116108346A (en) * 2023-02-17 2023-05-12 苏州大学 Bearing increment fault diagnosis life learning method based on generated feature replay
CN116977635B (en) * 2023-07-19 2024-04-16 中国科学院自动化研究所 Category increment semantic segmentation learning method and semantic segmentation method

Citations (3)

Publication number Priority date Publication date Assignee Title
CN113834656A (en) * 2021-08-27 2021-12-24 西安电子科技大学 Bearing fault diagnosis method, system, equipment and terminal
CN113887580A (en) * 2021-09-15 2022-01-04 天津大学 Contrast type open set identification method and device considering multi-granularity correlation
CN113946920A (en) * 2021-10-22 2022-01-18 大连海事大学 Rolling bearing fault diagnosis method with unbalanced data and data set deviation

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN107884706B (en) * 2017-11-09 2020-04-07 合肥工业大学 Analog circuit fault diagnosis method based on vector value regular kernel function approximation

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN113834656A (en) * 2021-08-27 2021-12-24 西安电子科技大学 Bearing fault diagnosis method, system, equipment and terminal
CN113887580A (en) * 2021-09-15 2022-01-04 天津大学 Contrast type open set identification method and device considering multi-granularity correlation
CN113946920A (en) * 2021-10-22 2022-01-18 大连海事大学 Rolling bearing fault diagnosis method with unbalanced data and data set deviation

Non-Patent Citations (2)

Title
A Novel Intelligent Fault Diagnosis Method for Rolling Bearings Based on Compressed Sensing and Stacked Multi-Granularity Convolution Denoising Auto-Encoder; CHUANG LIANG; IEEE Access (full text) *
Yu Jun; Ding Bo; He Yongjun. Rolling bearing fault diagnosis based on average multi-granularity decision rough sets and NNBC. Journal of Vibration and Shock, 2019 (full text). *

Also Published As

Publication number Publication date
CN114609994A (en) 2022-06-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant