CN117407781B - Equipment fault diagnosis method and device based on federated learning

Info

Publication number
CN117407781B
CN117407781B
Authority
CN
China
Prior art keywords
sample
model
preset
data
training
Prior art date
Legal status
Active
Application number
CN202311714817.0A
Other languages
Chinese (zh)
Other versions
CN117407781A (en)
Inventor
马雪红
狄加学
曹梅
Current Assignee
Shandong Energy Shuzhiyun Technology Co., Ltd.
Original Assignee
Shandong Energy Shuzhiyun Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Shandong Energy Shuzhiyun Technology Co., Ltd.
Priority to CN202311714817.0A
Publication of CN117407781A
Application granted
Publication of CN117407781B
Status: Active

Classifications

    • G06F18/2411 Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/24323 Tree-organised classifiers
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06N20/20 Ensemble learning
    • G06N3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N3/098 Distributed learning, e.g. federated learning
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound


Abstract

The invention provides an equipment fault diagnosis method and device based on federated learning, relating to the technical field of data processing, and comprising the following steps: acquiring data to be diagnosed corresponding to a target device; determining a target model corresponding to the data to be diagnosed based on the stability of the data; inputting the data into the target model and determining, through the target model, the fault classification result corresponding to the data; and performing equipment fault diagnosis on the target device based on the fault classification result. Because the target model is built on a preset federated learning model and a training sample set that has undergone dimensionality reduction and sample expansion, the invention can use distributed data for equipment fault diagnosis, protect data privacy and security, adapt to changes in the operating state of the equipment, and improve the accuracy of equipment fault diagnosis.

Description

Equipment fault diagnosis method and device based on federated learning
Technical Field
The invention relates to the technical field of data processing, and in particular to a federated-learning-based equipment fault diagnosis method and a federated-learning-based equipment fault diagnosis apparatus.
Background
Equipment failure is a common problem across industries, particularly where there are large numbers of devices and complex systems, such as in manufacturing, power systems, and other industrial fields. Equipment failure can halt production, reduce efficiency, and even cause safety accidents. Accurate and rapid equipment fault diagnosis is therefore important for keeping production running and for the safety of personnel and equipment.
However, conventional equipment fault diagnosis often relies on manual inspection and analysis, which is inefficient for large-scale equipment and data and whose accuracy is limited by individual experience and judgment. In addition, with the growth of data and the emergence of distributed data sources, effectively utilizing distributed data resources while ensuring data security and privacy has also become a challenge.
Based on this, the prior art has the following technical problems: (1) many existing methods struggle to diagnose and classify equipment faults from imbalanced, small-sample data; (2) many existing methods have difficulty performing effective feature selection and extraction on equipment fault data, so algorithms struggle to identify and classify faults accurately; (3) in real equipment fault identification settings, fault data are hard to acquire and the data for each fault type are usually imbalanced, and existing algorithms suffer from weak execution capability, poor adaptability, and poor stability; (4) with growing numbers of devices and the distributed storage of data, effectively using distributed data for equipment fault diagnosis while protecting data privacy and security remains a challenge for the prior art.
Disclosure of Invention
In view of the above, the present invention aims to provide a federated-learning-based equipment fault diagnosis method and apparatus that can use distributed data for equipment fault diagnosis, protect data privacy and security, adapt to changes in the operating state of the equipment, and improve the accuracy of equipment fault diagnosis.
In a first aspect, an embodiment of the present invention provides a federated-learning-based equipment fault diagnosis method, including: acquiring data to be diagnosed corresponding to a target device, the data to be diagnosed being vibration signal data acquired by a preset sensor at a preset position of the target device; determining a target model corresponding to the data to be diagnosed based on the stability of the data to be diagnosed, the target model being built from a preset federated learning model and a training sample set that has undergone dimensionality reduction and sample expansion; inputting the data to be diagnosed into the target model and determining, through the target model, the fault classification result corresponding to the data; and performing equipment fault diagnosis on the target device based on the fault classification result.
The embodiment of the invention has the following beneficial effects. The federated-learning-based equipment fault diagnosis method and apparatus determine, from the stability of the data to be diagnosed, whether the pre-trained model should be updated, and use the corresponding model for equipment fault diagnosis. The invention can thus exploit the information in the training data while also updating the model in real time to adapt to changes in the operating state of the equipment, improving the accuracy of equipment fault diagnosis. In addition, the initial model is built on a federated learning framework: a distributed model training architecture composed of a central server and multiple clients, so that every client participating in the equipment fault diagnosis task can train a model over the global data set without sharing its data. The trained model parameters are uploaded to the central server under homomorphic encryption, which effectively prevents data leakage. Moreover, the target model is built from a training sample set that has undergone dimensionality reduction and sample expansion; it can diagnose and classify equipment faults from imbalanced, small-sample data and can also perform efficient feature selection and extraction on equipment fault data. Execution capability, adaptability, and stability at real equipment fault identification sites are also effectively improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings. In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a federated-learning-based equipment fault diagnosis method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a federated learning framework according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the training flow of a federated-learning-based equipment fault diagnosis algorithm according to an embodiment of the present invention;
FIG. 4 is a flowchart of another federated-learning-based equipment fault diagnosis method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a federated-learning-based equipment fault diagnosis apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another federated-learning-based equipment fault diagnosis apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purposes of clarity, technical solutions, and advantages of the embodiments of the present disclosure, the following description describes embodiments of the present disclosure with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure herein. It will be apparent that the described embodiments are merely some, but not all embodiments of the present disclosure. The disclosure may be embodied or practiced in other different specific embodiments, and details within the subject specification may be modified or changed from various points of view and applications without departing from the spirit of the disclosure. It should be noted that the following embodiments and features in the embodiments may be combined with each other without conflict. All other embodiments, which can be made by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure are intended to be within the scope of this disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described in this disclosure may be embodied in a wide variety of forms and that any specific structure and/or function described in this disclosure is illustrative only. Based on the present disclosure, one skilled in the art will appreciate that one aspect described in this disclosure may be implemented independently of any other aspects, and that two or more of these aspects may be combined in various ways. For example, apparatus may be implemented and/or methods practiced using any number of the aspects set forth in this disclosure. In addition, such apparatus may be implemented and/or such method practiced using other structure and/or functionality in addition to one or more of the aspects set forth in the disclosure. It should also be noted that the illustrations provided in the following embodiments merely illustrate the basic concepts of the disclosure by way of illustration, and only the components related to the disclosure are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, and the form, number and proportion of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated. In addition, in the following description, specific details are provided in order to provide a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the invention provides an equipment fault diagnosis method and device based on federated learning, which can use distributed data for equipment fault diagnosis, protect data privacy and security, adapt to changes in the operating state of the equipment, and improve the accuracy of equipment fault diagnosis.
For ease of understanding, the federated-learning-based equipment fault diagnosis method according to an embodiment of the present invention is first described in detail. Specifically, FIG. 1 shows a flowchart of a federated-learning-based equipment fault diagnosis method according to an embodiment of the present invention; as shown in FIG. 1, the method includes the following steps:
step S102, obtaining data to be diagnosed corresponding to target equipment.
Step S104, determining a target model corresponding to the data to be diagnosed based on the data stability condition corresponding to the data to be diagnosed.
In a specific implementation, the data to be diagnosed are vibration signal data acquired by a preset sensor at a preset position of the target equipment. In one embodiment, the data to be diagnosed are vibration signal data acquired from a measurement point at the drive end of the rotating equipment of a coal mine hoist. In the embodiment of the invention, fault diagnosis is performed on the data using a model, and the target model is either a pre-built initial model or the updated model corresponding to the initial model. The target model is built from a preset federated learning model and a training sample set that has undergone dimensionality reduction and sample expansion.
The method adopts a hybrid approach to model selection and prediction. Specifically, when the operating-state data of the equipment are relatively stable, the trained model is used directly for prediction; when the operating-state data change significantly, online learning is applied to update the model using the real-time data of the equipment. Whether the data are relatively stable is judged as follows: acquire the equipment features at each moment of the data to be diagnosed and determine the feature variation between adjacent moments; compare the feature variation with a preset variation threshold; when the feature variation satisfies the variation threshold, the target model corresponding to the data to be diagnosed is the initial model; when the feature variation does not satisfy the variation threshold, the target model corresponding to the data to be diagnosed is the updated model corresponding to the initial model.
In a specific implementation, the embodiment of the invention sets a threshold $\delta$ to determine whether online learning is required; $\delta$ may be set based on the historical data and service demands of the equipment. Let $F_{t-1}$ be the equipment features at time $t-1$ and $F_t$ the features at time $t$; the feature variation is $\Delta F = \lVert F_t - F_{t-1} \rVert_2$, where $\lVert \cdot \rVert_2$ denotes the 2-norm. When $\Delta F > \delta$, the operating state of the equipment is considered to have changed significantly and online learning is required to update the model. Online learning can be expressed as $f_{\text{new}} = U(f_{\text{old}}, x_t, y_t)$, where $U$ is the online update function, $f_{\text{old}}$ the model before the update, $f_{\text{new}}$ the updated model, and $y_t$ the real state of the equipment at time $t$, used to update the model during online learning. This hybrid approach can exploit the information in the training data while also updating the model in real time to adapt to changes in the operating state of the equipment, thereby improving the accuracy of equipment fault diagnosis.
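As an illustration of the stability test and model-selection rule above, the following is a minimal Python sketch; the function and parameter names (`update_fn` for $U$, `delta` for $\delta$) are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def select_model(f_prev, f_curr, trained_model, update_fn, delta, x_t=None, y_t=None):
    """Pick the trained model when the state is stable, otherwise update online:
    ||F_t - F_{t-1}||_2 > delta  =>  f_new = U(f_old, x_t, y_t)."""
    variation = np.linalg.norm(np.asarray(f_curr) - np.asarray(f_prev), ord=2)
    if variation <= delta:
        return trained_model                    # operating state is stable
    return update_fn(trained_model, x_t, y_t)   # significant change: online update
```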
The initial model is built on a federated learning framework. The federated learning framework of the embodiment of the invention is a distributed model training architecture composed of a central server and multiple clients, so that every client participating in the equipment fault diagnosis task can train a model over the global data set without sharing its data. First, each client stores its own data set; the parties then train machine learning models locally on their local data, and the central server connects all clients through separate secure communication channels. The model parameters obtained after training are uploaded to the central server under homomorphic encryption, which effectively prevents data leakage. After receiving the clients' model parameters, the central server performs weighted aggregation over all of them and returns the global model to each client; each client decrypts the model parameters from the central server and applies them to its local model. These steps are iterated until the loss function converges. The federated learning framework employed in the present invention is shown in FIG. 2.
In the federated learning framework provided by the invention, the federated averaging (FedAvg) algorithm is adopted for collaborative model training. Federated averaging is an aggregation algorithm in federated learning that realizes the co-training of local models through global iterations. After each client trains the model on its local data, the model is uploaded to the central server, where federated fusion of the models is performed.
In the federated learning framework of the invention, the criteria for stopping model training are reaching a preset number of training rounds or model convergence. A number of training rounds can be set before training starts, and once it is reached, model training stops. In addition, when the training and validation loss values stop decreasing, or their decrease becomes insignificant, or the accuracy of the model no longer improves noticeably, the model can be considered converged and training is stopped.
Thus, the model inference process described above can be expressed as:

$$f_{\text{used}} = \begin{cases} f_{\text{trained}}, & \lVert F_t - F_{t-1} \rVert_2 \le \delta \\ U(f_{\text{old}}, x_t, y_t), & \lVert F_t - F_{t-1} \rVert_2 > \delta \end{cases}$$

where the threshold $\delta$ can be adjusted manually according to the characteristics and service requirements of the specific equipment. In each global iteration, assume the number of clients is $K$ and the total number of samples they hold is $n$; client $k$ holds $n_k$ samples, and the model parameters $w$ lie in the real space $\mathbb{R}^d$. The objective function to be optimized can thus be expressed as:

$$\min_{w \in \mathbb{R}^d} f(w)$$
where:

$$f_i(w) = \ell(x_i, y_i; w)$$

Here $f(w)$ is the loss function of the federated learning model, $w$ the model parameters, $f_i$ the prediction loss for the $i$-th sample $(x_i, y_i)$, and $\ell$ the loss function. For client $k$, its loss function $F_k(w)$ is defined as:

$$F_k(w) = \frac{1}{n_k} \sum_{i \in P_k} f_i(w)$$

where $P_k$ is the set of data indices of client $k$. Therefore, the loss function of the federated learning model $f(w)$ can be expressed as:

$$f(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w)$$
when the user terminalGradient of->The learning rate is->In the case of->Updating after iteration to obtain the latest global model parameter +.>The method comprises the following steps:
wherein,respectively +.>First->And updating the obtained global model parameters after the iteration. User terminalThe local model parameter update mode of (a) can be expressed as:
wherein,user side->In->First->And updating the obtained local model parameters after the iteration. Therefore, when federal learning is performed, the objective to be optimized of the user side is determined first, and then the loss function of each user side and the loss function of the federal learning model are determined. Further, at the acquisition client->Gradient of->After that, get->Local model parameters at the time of iterative update +.>The global model parameter update may be expressed as:
In the first placeAfter several iterations, user side +.>Is set to +.>The iterative computation is then continued until the training is completed. The training flow of the federal learning-based equipment fault diagnosis algorithm is shown in fig. 3.
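The update rules above can be illustrated with a minimal federated averaging round in Python with NumPy; `grad_fn`, the client data layout, and the omission of homomorphic encryption and secure channels are simplifying assumptions of the sketch.

```python
import numpy as np

def local_update(w_global, grad_fn, data_k, lr):
    """Local step on client k: w_{t+1}^k = w_t - eta * g_k."""
    g_k = grad_fn(w_global, data_k)      # g_k = grad F_k(w_t)
    return w_global - lr * g_k

def fedavg_round(w_global, client_datasets, grad_fn, lr=0.01):
    """Global step: w_{t+1} = sum_k (n_k / n) * w_{t+1}^k."""
    n = sum(len(d) for d in client_datasets)
    w_next = np.zeros_like(w_global)
    for data_k in client_datasets:
        w_k = local_update(w_global, grad_fn, data_k, lr)
        w_next += (len(data_k) / n) * w_k    # sample-count-weighted aggregation
    return w_next
```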
Step S106, inputting the data to be diagnosed into a target model, and determining a fault classification result corresponding to the data to be diagnosed through the target model.
Step S108, performing equipment fault diagnosis on the target equipment based on the fault classification result.
According to the federated-learning-based equipment fault diagnosis method provided above, whether the pre-trained model is updated is determined by the stability of the data to be diagnosed, and the corresponding model is used for fault diagnosis. The method can exploit the information in the training data while also updating the model in real time to adapt to changes in the operating state of the equipment, thereby improving the accuracy of equipment fault diagnosis. In addition, the initial model of the embodiment of the invention is built on a federated learning framework: a distributed model training architecture composed of a central server and multiple clients, so that every client participating in the equipment fault diagnosis task can train a model over the global data set without sharing its data. The trained model parameters are uploaded to the central server under homomorphic encryption, which effectively prevents data leakage. Moreover, the target model is built from a training sample set that has undergone dimensionality reduction and sample expansion; it can diagnose and classify equipment faults from imbalanced, small-sample data and can perform efficient feature selection and extraction on equipment fault data. Execution capability, adaptability, and stability at real equipment fault identification sites are also effectively improved.
Furthermore, on the basis of the above embodiment, the embodiment of the invention also provides another federated-learning-based equipment fault diagnosis method, which explains how the initial model is constructed. Specifically, FIG. 4 shows a flowchart of another federated-learning-based equipment fault diagnosis method according to an embodiment of the present invention; as shown in FIG. 4, the method includes the following steps:
step S202, a pre-constructed training sample set is obtained.
The training sample set is obtained by expanding the samples of a pre-acquired initial sample set by sample type through a preset expansion algorithm. Embodiments of the invention include one-stage data expansion (constructing a first expanded sample set) and two-stage data expansion (further constructing a second expanded sample set). It will be appreciated that the data of each client may exhibit an imbalance in sample counts across classes; the invention therefore proposes an improved SMOTE algorithm in the one-stage data expansion for expanding the minority sample classes. In the two-stage data expansion, the constraints of the SMOTE algorithm are further refined to improve the model's performance on imbalanced data sets, especially on fine-grained fault characteristics of the equipment.
First, the one-stage data expansion is described. The traditional SMOTE algorithm proceeds as follows: for any sample $x$ of a minority class, calculate the distance from $x$ to the other minority-class samples (the Euclidean distance between samples), obtaining the $k$ nearest neighbours of $x$. From the $k$ neighbours, randomly select a neighbour sample, denoted $\tilde{x}$, and construct a new minority-class sample by random linear interpolation between $x$ and $\tilde{x}$. The interpolation formula is:

$$x_{\text{new}} = x + \mathrm{rand}(0,1) \times (\tilde{x} - x)$$

where $x_{\text{new}}$ denotes the artificially constructed minority-class sample and $\mathrm{rand}(0,1)$ is a random number uniformly distributed on the interval $(0,1)$. The traditional SMOTE algorithm improves the classification of imbalanced data sets to some extent, but the value of $k$ must be set manually, which introduces a degree of blindness. Moreover, for edge points, the sample points expanded by SMOTE may themselves still be edge points, marginalizing the expanded data and easily blurring the boundary between positive and negative classes. The present invention therefore proposes an improved SMOTE algorithm for sample expansion.
In a specific implementation, the embodiment of the invention first acquires the pre-collected initial sample set, then partitions it into safe points and noise points, and interpolates the samples marked as safe points using a preset interpolation algorithm, thereby realizing sample expansion and constructing the training sample set of the embodiment of the invention.
Specifically, the training sample set is constructed from historical fault data acquired by preset sensors, and contains a first sample label and a plurality of second sample labels, where the first sample label indicates that the equipment is in a normal operating state and a second sample label indicates that the equipment is in a fault state. In the federated learning framework provided by the invention, the data used by each client are acquired by sensors and comprise first data for the normal operating state and second data for a plurality of abnormal operating states, each datum containing a number of characteristic parameters. In particular, the selected data set can consist of historical fault data of a coal mining shearer: it covers 22 operating states of the shearer, namely 1 normal state and 21 common fault states, each operating state marked with a unique tag. In this data set, each operating state of the shearer includes a fixed number of characteristic parameters, expressed as vibration signals of the equipment acquired by vibration signal sensors. After the equipment data are acquired, they are labelled manually for subsequent data processing and model training.
Further, the initial sample set includes a plurality of initial samples covering several sample classes. The sample classes comprise first-type samples (minority classes) and second-type samples (majority classes), the first-type samples being fewer in number than the second-type samples. The embodiment of the invention determines, from the initial sample set, the neighbour samples of a given initial sample and the proportion of first-type samples among those neighbours; it then determines from this sample proportion whether the initial sample is a safe point. If so, the sampling ratio of the initial sample is calculated and the sample is expanded according to that ratio by a preset interpolation algorithm, constructing the first expanded sample set.
In a specific implementation, the training sample data set is first partitioned into safe points and noise points. Suppose that in the preprocessed data set each point $x_i$ has a class label $y_i$, where $y_i$ is 1 (the minority class, i.e., a first-type sample) or 0 (the majority class, i.e., a second-type sample). For any point $x$, calculate its distance to all other points and find the $k$ nearest points. The distance can be expressed as:

$$d(x, x') = \sqrt{\sum_{j=1}^{p} (x_j - x'_j)^2}$$

where $x_j$ and $x'_j$ denote the values of $x$ and $x'$ on the $j$-th feature and $p$ is the number of features. Further, calculate the proportion of the minority class among the $k$ nearest neighbours, i.e., the sample proportion:

$$\rho(x) = \frac{1}{k} \sum_{x_i \in N_k(x)} y_i$$

If $\rho(x)$ reaches a preset proportion threshold, mark $x$ as a safe point; otherwise mark $x$ as a noise point. When the marking result shows that the initial sample is a safe point, its sampling ratio is calculated and the sample is expanded by the preset interpolation algorithm according to that ratio. Specifically, the number of samples of the initial sample within a preset region is determined and the sampling ratio is calculated from this sample count; the sampling ratio is computed from a preset boundary sensitivity factor and the sample density corresponding to the sample count, the boundary sensitivity factor being determined from a pre-computed decision boundary. Interpolated samples corresponding to the initial sample are then generated using the sampling ratio and the preset interpolation algorithm, and the first expanded sample set is constructed from the interpolated samples and the initial samples.
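A minimal sketch of the safe-point marking step, assuming binary labels (1 for the minority class) and a preset proportion threshold `rho0`; the threshold value is an assumption, since the patent leaves it preset.

```python
import numpy as np

def mark_safe_points(X, y, k=5, rho0=0.5):
    """Mark sample i as a safe point when the minority-class proportion
    among its k nearest neighbours reaches the threshold rho0."""
    y = np.asarray(y)
    safe = np.zeros(len(X), dtype=bool)
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)   # distances to all points
        nn = np.argsort(d)[1:k + 1]            # exclude the point itself
        safe[i] = y[nn].mean() >= rho0         # minority proportion test
    return safe
```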
Specifically, for a safe-point sample $x$, the number of samples within radius $r$, $N_r(x)$, can be expressed as:

$$N_r(x) = \sum_{i=1}^{m} I\big(\lVert x_i - x \rVert \le r\big)$$

where $I$ is an indicator function: $I = 1$ when $\lVert x_i - x \rVert \le r$, otherwise $I = 0$; $m$ is the total number of samples. Further, the density of the sample, $\rho_d(x)$, can be expressed as:

$$\rho_d(x) = \frac{N_r(x)}{V_r}$$

where $V_r$ is the volume of a hypersphere of radius $r$; for a $p$-dimensional space it is computed as:

$$V_r = \frac{\pi^{p/2}}{\Gamma\!\left(\frac{p}{2} + 1\right)} r^p$$

where $\Gamma$ is the gamma function and $p$ is the number of features. The density $\rho_d(x)$ is then used to adjust the sampling ratio: for a sample $x$, its sampling ratio $w$ can be calculated as:

$$w(x) = \frac{\beta(x)}{\rho_d(x)}$$

where $\beta(x)$ is a boundary sensitivity factor whose role is to increase the weight of samples near the decision boundary, addressing the increased classification error rate caused by boundary samples being harder to classify.
Specifically, calculating the boundary sensitivity factor first requires a decision boundary. The invention uses a classifier pre-trained on the training data (e.g., a support vector machine classifier) to determine the decision boundary. For the decision function $f_c(x)$ generated by the classifier, $|f_c(x)|$ is regarded as the distance of $x$ from the decision boundary, i.e., $d_b(x) = |f_c(x)|$. This distance is then converted into the boundary sensitivity factor $\beta(x)$ used to adjust the sampling ratio of the sample:

$$\beta(x) = e^{-\gamma\, d_b(x)}$$

where $\gamma$ is a manually preset hyperparameter controlling how quickly the weight decays. Finally, new samples are generated according to the sampling ratio $w$ of each sample: for a sample $x$, $w(x) \times m$ new samples are generated.
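The density and boundary-factor computations can be sketched as follows; the combination rule $w(x) = \beta(x)/\rho_d(x)$ is an assumption consistent with the description (density adjusts the ratio, $\beta$ up-weights boundary samples), and `decision_fn` stands in for the classifier's decision function $f_c$.

```python
import numpy as np
from math import gamma, pi

def sampling_ratio(X, x, decision_fn, r=1.0, gamma_coef=1.0):
    """Sketch of the density- and boundary-adjusted sampling ratio:
    rho_d(x) = N_r(x) / V_r,  beta(x) = exp(-gamma * |f_c(x)|),
    w(x) = beta(x) / rho_d(x)  (combination rule assumed)."""
    p = X.shape[1]
    n_r = int(np.sum(np.linalg.norm(X - x, axis=1) <= r))   # samples within r
    v_r = pi ** (p / 2) * r ** p / gamma(p / 2 + 1)         # hypersphere volume
    rho_d = n_r / v_r                                       # local density
    beta = np.exp(-gamma_coef * abs(decision_fn(x)))        # boundary factor
    return beta / max(rho_d, 1e-12)
```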
In a specific implementation, the embodiment of the invention determines several target initial samples from the initial sample set and determines the sample centre point of the initial samples whose class label is 1; generated samples corresponding to the target initial samples are then produced from predetermined computation parameters and the sample centre point. Specifically, for the samples marked as safe points, a nonlinear triangular interpolation is used to generate new samples. For any two safe points $x_a$ and $x_b$ and the centre point $c$ of the minority-class samples (numerically, the centre point is the mean of the minority-class samples), the generation of a new sample $x_{\text{new}}$ can be expressed as:

$$x_{\text{new}}^{(j)} = \lambda_1 x_a^{(j)} + \lambda_2 x_b^{(j)} + (1 - \lambda_1 - \lambda_2)\, c^{(j)}$$

where $\lambda_1$ and $\lambda_2$ are randomly generated parameters satisfying $\lambda_1 \in [0,1]$, $\lambda_2 \in [0,1]$ and $\lambda_1 + \lambda_2 \le 1$, and $j$ indicates the $j$-th feature. In the embodiment of the invention, the new sample is generated from the two safe points and the centre point of the minority-class samples, so that new samples can be added and the goal of training sample expansion achieved. It will be appreciated that the data of each client may exhibit an imbalance in sample counts across classes; the invention proposes this improved SMOTE algorithm precisely for expanding the data of the minority sample classes.
Further, phase calibration is performed: the generated sample is phase-calibrated from the phase values corresponding to the initial samples and the phase value of the sample centre point. Let each sample have a phase value and assume the phase varies continuously; the phase value of the new sample, $\varphi_{\text{new}}$, can be expressed as:

$$\varphi_{\text{new}} = \lambda_1 \varphi_a + \lambda_2 \varphi_b + (1 - \lambda_1 - \lambda_2)\, \varphi_c$$

where $\varphi_c$ denotes the phase value of the centre point $c$. Interpolated samples corresponding to the initial sample are accepted once the phase value of a generated sample satisfies a preset phase threshold and the number of generated samples meets the requirement given by the sampling ratio: for a sample $x$, the embodiment of the invention generates $w(x) \times m$ new samples according to its sampling ratio $w(x)$.
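A minimal sketch of the nonlinear triangular interpolation; the sampling of $\lambda_1, \lambda_2$ shown here merely satisfies the stated constraints (it is not uniform over the triangle), and the same coefficients can be reused to interpolate the phase values.

```python
import numpy as np

def triangular_sample(x_a, x_b, center, rng=None):
    """Convex combination of two safe points and the minority-class centre:
    x_new = lam1*x_a + lam2*x_b + (1 - lam1 - lam2)*center,
    with lam1, lam2 in [0, 1] and lam1 + lam2 <= 1."""
    rng = rng or np.random.default_rng()
    lam1 = rng.uniform(0.0, 1.0)
    lam2 = rng.uniform(0.0, 1.0 - lam1)   # enforces lam1 + lam2 <= 1
    return lam1 * x_a + lam2 * x_b + (1.0 - lam1 - lam2) * center
```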
The first expanded sample set is obtained through the above steps; it is then stratified into hierarchies, and sample expansion is performed according to the stratification result to construct the second expanded sample set. The training sample set is constructed by combining the first expanded sample set and the second expanded sample set. In a specific implementation, the second expanded sample set is constructed as follows:
a) Classify the first expanded sample set to determine a number of hierarchy samples; then identify the sample class of each hierarchy sample and determine the samples to be expanded according to the sample class. Specifically, the method comprises the following steps:
1. Data set hierarchical analysis. First, define the notation: $x$ represents a raw data point or sample; $p$ is the number of data features; $w_j$ is the weight of the $j$-th feature; $g_j$ is the classification function of the $j$-th feature; $D_1$ represents the data set after one-stage expansion; $M$ is the minority-class identification index; $I(\cdot)$ returns 1 when its condition is met and 0 otherwise; $c_m$ is the target minority class; $L$ is the objective function in the Lagrangian multiplier method; $\lambda$ is the Lagrangian multiplier; $b$ is the constant in the constraint; $h(x)$ is the constraint function of sample $x$; $\lambda'$ is the updated Lagrangian multiplier and $\lambda$ the original one; $\eta$ is the learning rate; $\nabla_\lambda L$ is the gradient of $L$ with respect to $\lambda$; $\mu$ is the mean; $\sigma$ is the standard deviation; $\kappa$ is the degree-of-freedom parameter in the Chebyshev inequality; $\Phi$ is the feature-space optimization function; $x_{\text{new}}$ is a generated new sample; $x$ is the original sample; $x_{nn}$ is the nearest-neighbour sample of $x$; $\delta_r$ is a randomly generated interpolation factor; $\alpha_h$ is a factor adjusted according to the data hierarchy; $V_{md}$ is the multidimensional verification index; $x_f$ is a fused sample; $\theta$ is the fusion scale factor; $Q$ is the data set quality evaluation function; $R(s)$ is the representativeness metric of sample $s$; $Dv(s)$ is the diversity metric of sample $s$; $D'$ is the updated data set; $D$ is the original data set; $\eta_s$ is the optimization step length; and $\nabla_D Q$ is the gradient of $Q$ with respect to $D$.
Further, the data set after one-stage expansion is stratified; in one embodiment, a preliminary classification is performed based on device type, energy-consumption level, and other key factors to obtain a number of hierarchy samples. At the same time, the minority-class and majority-class data within each hierarchy are identified, i.e., the sample class of each hierarchy sample. Specifically, the hierarchical classification can be expressed as:

$$H(x) = \sum_{j=1}^{p} w_j\, g_j(x_j)$$

where $x$ represents a data point after one-stage expansion, $p$ is the number of data features, $w_j$ is the feature weight, and $g_j$ is the classification function of the $j$-th feature. Further, the samples to be expanded are determined according to the sample class. The embodiment of the invention performs two-stage expansion on the minority-class samples; specifically, the minority-class identification index is calculated as:

$$M = \sum_{x \in D_1} I\big(y(x) = c_m\big)$$

where $D_1$ represents the data set after one-stage expansion (i.e., the first expanded sample set), $I$ is the indicator function, and $c_m$ is the target minority class. Further, the feature classification function $g_j$ is a threshold-based function and can be expressed as:

$$g_j(x_j) = \begin{cases} 1, & x_j \ge \tau_j \\ 0, & x_j < \tau_j \end{cases}$$

where $\tau_j$ is the threshold specific to the $j$-th feature.
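The hierarchical score and the threshold-based feature classification can be sketched as follows; the weight and threshold vectors are assumed to be given.

```python
import numpy as np

def hierarchy_score(x, weights, thresholds):
    """H(x) = sum_j w_j * g_j(x_j), with g_j(x_j) = 1 if x_j >= tau_j else 0."""
    g = (np.asarray(x) >= np.asarray(thresholds)).astype(float)
    return float(np.dot(weights, g))
```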
b) Calculate the relaxation parameters of the samples to be expanded by a preset Lagrangian relaxation method; optimize the sample distribution of the samples to be expanded in the multidimensional feature space through the Chebyshev inequality and determine the target sample distribution. Specifically, after the samples to be expanded are determined, the following operations are performed on them:
2. Hierarchical application of the Lagrangian relaxation method. Since there may be one or more minority classes, the embodiment of the invention applies the Lagrangian relaxation method separately on each data hierarchy, adjusting the flexibility of sample generation, and sets different relaxation parameters for different hierarchies to accommodate differences in their characteristics. Specifically, the Lagrangian objective can be expressed as:

$$L(x, \lambda) = f(x) + \lambda \big(h(x) - b\big)$$

where $\lambda$ is the Lagrangian multiplier, $b$ is the constant in the constraint, and $h(x)$ is the constraint function of sample $x$. Further, the relaxation parameter is adjusted as:

$$\lambda' = \lambda + \eta\, \nabla_\lambda L$$

where $\eta$ is the learning rate and $\nabla_\lambda L$ denotes the gradient of $L$ with respect to $\lambda$. Further, the gradient of $L$ with respect to $\lambda$ can be expressed as:

$$\nabla_\lambda L = h(x) - b$$

where $h(x)$ is the constraint-condition function; specifically, it is a sample-balance metric.
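A minimal sketch of the relaxation-parameter update as dual ascent; treating $h(x)$ as a sample-balance metric supplied by the caller is an assumption of the sketch.

```python
def relaxation_step(lam, h_x, b, lr=0.01):
    """One dual-ascent update: grad_lambda L = h(x) - b, lam' = lam + eta * grad."""
    return lam + lr * (h_x - b)

def tune_multiplier(lam, balance_fn, samples, b, lr=0.01):
    """Iterate the update with h(x) given by a sample-balance metric."""
    for x in samples:
        lam = relaxation_step(lam, balance_fn(x), b, lr)
    return lam
```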
3. Optimizing the sample distribution based on Chebyshev theory and multidimensional space analysis. The sample distribution in the multidimensional feature space is optimized through the Chebyshev inequality, and the target sample distribution is determined. In a specific implementation, Chebyshev theory is applied in a customized way for each data hierarchy, taking into account the relevance and distribution of the features within the hierarchy. Specifically, the Chebyshev inequality can be expressed as:

$$P\big(|X - \mu| \ge \kappa \sigma\big) \le \frac{1}{\kappa^2}$$

where $\mu$ is the mean, $\sigma$ the standard deviation, and $\kappa$ the degree-of-freedom parameter. Further, in the feature-space optimization, the optimization function can be expressed as:

$$\Phi(x) = \sum_{j=1}^{p} \left(\frac{x_j - \mu_j}{\sigma_j}\right)^2$$

where $\Phi$ denotes the feature-space optimization function; it is the normalized square of the feature distance, reflecting the degree to which each feature deviates from its mean and standard deviation. Based on this function, the target sample distribution can be determined.
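A possible reading of the Chebyshev-based distribution check, sketched in Python; the acceptance rule $\Phi(x) \le \kappa^2 p$ is an assumption, since the patent only defines $\Phi$ and the inequality.

```python
import numpy as np

def chebyshev_filter(X_new, X_ref, kappa=3.0):
    """Keep generated samples whose normalized feature distance
    Phi(x) = sum_j ((x_j - mu_j) / sigma_j)^2 stays within a
    Chebyshev-style bound (acceptance rule assumed)."""
    mu = X_ref.mean(axis=0)
    sigma = X_ref.std(axis=0) + 1e-12
    phi = (((X_new - mu) / sigma) ** 2).sum(axis=1)
    p = X_ref.shape[1]
    return X_new[phi <= kappa ** 2 * p]
```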
c) Expand the samples to be expanded according to the relaxation parameters and the target sample distribution, and obtain the second expanded samples once the expanded samples pass multidimensional verification. In a specific implementation, the embodiment of the invention generates samples through the SMOTE algorithm: for each data hierarchy, the SMOTE algorithm is customized according to the hierarchy characteristics, and the hierarchy-specific features and data distribution pattern are taken into account when generating samples. Specifically, the sample generation function can be expressed as:

$$x_{\text{new}} = x + \delta_r\, (x_{nn} - x)$$

where $x_{nn}$ is the nearest-neighbour sample of $x$ and $\delta_r$ is a randomly generated interpolation factor selected in $[0,1]$, used to control the distance between the new sample and the original sample; $s_j$ denotes the feature influence score of the $j$-th feature at the current iteration. Further, the adjustment for the hierarchy characteristics can be expressed as:

$$x_{\text{new}} = x + \delta_r\, \alpha_h\, (x_{nn} - x)$$
where $\alpha_h$ is a factor adjusted according to the hierarchy characteristics. Further, when computing the feature influence scores, the invention adjusts the feature weights dynamically so that the algorithm can adaptively attend to the importance of different features. Specifically, at each iteration the feature weights are first initialized to equal weights:

$$w_j^{(0)} = \frac{1}{p}, \quad j = 1, \dots, p$$

Then, in the feature influence score calculation, the influence score $s_j$ of the $j$-th feature is computed as:

$$s_j = \left| x_j - \mu_j \right|$$

where $x_j$ is the value of the $j$-th feature and $\mu_j$ is the mean of the $j$-th feature. Further, the dynamic feature weight adjustment updates the weights according to the influence scores:

$$w_j^{(u+1)} = \frac{w_j^{(u)} s_j^{(u)}}{\sum_{j'=1}^{p} w_{j'}^{(u)} s_{j'}^{(u)}}$$

where $u$ denotes the iteration number. After each iteration, $\mu_j$ and $s_j$ are recalculated on the new data set $D'$.
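A sketch of the hierarchy-aware generation step with influence-score weighting; the normalization of the scores into per-feature weights is an assumption of the sketch.

```python
import numpy as np

def weighted_smote(x, x_nn, mu, alpha_h=1.0, rng=None):
    """Hierarchy-aware SMOTE step: influence scores s_j = |x_j - mu_j|
    reweight the interpolation,
    x_new = x + delta_r * alpha_h * s_w * (x_nn - x)."""
    rng = rng or np.random.default_rng()
    s = np.abs(x - mu)
    s_w = s / (s.sum() + 1e-12)          # assumed normalization of scores
    delta_r = rng.uniform(0.0, 1.0)      # random interpolation factor
    return x + delta_r * alpha_h * s_w * (x_nn - x)
```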
After the sample expansion, the embodiment of the invention also performs multidimensional verification on the expanded samples; when an expanded sample passes the multidimensional verification, a second expanded sample is obtained. Specifically, the steps of sample generation and multidimensional verification are as follows: new synthetic samples are generated at each hierarchy and subjected to multidimensional verification to ensure an effective distribution of the samples in the feature space. The multidimensional verification index can be computed as:

$$V_{md} = \sum_{j=1}^{p} \left(\frac{x_{\text{new},j} - \mu_j}{\sigma_j}\right)^2$$

where $V_{md}$ denotes the multidimensional verification index and each term is a multidimensional feature distance, i.e., the normalized distance of the newly generated sample in dimension $j$ relative to the original data set.
d) Cross-fuse the second expanded samples with the first expanded sample set to construct the second expanded sample set. In a specific implementation, the embodiment of the invention further performs cross-hierarchy sample fusion on the second expanded samples: samples generated at different hierarchies are cross-fused to increase the diversity and complexity of the data set, and the Lagrangian relaxation method is applied again during fusion to maintain sample diversity. Specifically, the fusion function can be expressed as:

$$x_f = \theta\, x^{(h_1)} + (1 - \theta)\, x^{(h_2)}$$

where $x^{(h_1)}$ and $x^{(h_2)}$ are samples from two different hierarchies and $\theta$ is a fusion scale factor that can be dynamically adjusted according to the importance of the hierarchies or their sample counts so as to balance their influence. Further, the final data set is constructed and optimized, yielding the training sample set.
In a specific implementation, the fused and optimized synthetic samples (i.e., the second expanded sample set) are combined with the original data set (i.e., the first expanded sample set) to obtain the training sample set, which is then comprehensively evaluated to ensure that the data quality meets the requirements of the prediction model. Specifically, the quality of the data set can be evaluated as:

$$Q(D) = \sum_{s \in D} \big( R(s) + Dv(s) \big)$$

where $R(s)$ denotes the representativeness of sample $s$, computed from the similarity of $s$ to the other samples of the data set, and $Dv(s)$ denotes the diversity of sample $s$, computed from the degree of difference between $s$ and the other samples of the data set.
Further, iterative optimization and adjustment are performed: the second expanded sample set is iteratively optimized according to the model's feedback on the training sample set, and the sample generation strategy of each hierarchy is adjusted to further improve the quality and applicability of the data set. Specifically, the optimization iteration can be expressed as:

$$D' = D + \eta_s\, \nabla_D Q(D)$$

where $\eta_s$ is the optimization step length and $\nabla_D Q$ denotes the gradient of $Q$ with respect to the data set $D$. Following the gradient of the quality function $Q$ with respect to $D$ identifies the direction that improves the quality of the whole data set.
Further, it will be appreciated that data in real tasks tend to be incomplete, noisy, and inconsistent. For this situation, in this embodiment each client cleans and corrects the sensor-acquired and generated data as follows. Missing-value handling: data loss is the most common problem in the data acquisition process; during equipment data acquisition, some sensor measurement points may not work properly, causing part of the collected equipment data to be lost. In this case, measures such as interpolation or deletion are taken according to the importance of the data. Abnormal-data handling: because of sensor failure or other causes, the uploaded data may contain unreasonable values; in the invention, such abnormal data are deleted. Data normalization: the equipment data comprise many types with different numerical ranges and value domains, so normalization is needed to better reflect the relationship between the equipment data and the fault diagnosis result and to reduce the influence of differing orders of magnitude on the diagnosis result. This embodiment therefore normalizes the data using range (min-max) normalization:

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x_{\min}$ and $x_{\max}$ denote the minimum and maximum values in the same set of data samples, $x$ the input datum, and $x_{\text{norm}}$ the normalized datum.
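The range normalization can be sketched per feature as follows; the small epsilon guarding against constant-valued features is an addition of the sketch.

```python
import numpy as np

def min_max_normalize(X):
    """Range normalization per feature: x_norm = (x - x_min) / (x_max - x_min)."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min + 1e-12)  # epsilon avoids divide-by-zero
```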
Step S204, feature extraction is performed on the training sample set through a preset feature extraction model, and the dimension-reduced sample feature parameters are determined.
The feature extraction model of the embodiment of the invention is built from an improved Bi-LSTM model, a preset learning algorithm, and an improved greedy algorithm. Specifically, the invention adopts the improved Bi-LSTM and a KSVD-learned dictionary combined with the improved greedy algorithm for feature extraction. In a specific implementation, the Bi-LSTM model is used to process the expanded training data set, whose data are time-series data.
Specifically, the training sample set is input into the improved Bi-LSTM model to determine the hidden states corresponding to the training sample set; the hidden states contain preset time-dimension information for the time-series data. For an input sequence $X = (x_1, x_2, \dots, x_T)$, where $x_t$ represents the input at time $t$, the hidden state of the Bi-LSTM, $h_t$, can be expressed as:

$$h_t = \big[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\big]$$

i.e., the concatenation of the forward and backward hidden states. In the improved Bi-LSTM of the invention, time information is added to the hidden state, improving the model's ability to learn the time-dependent relationships in the data. The new hidden state $h'_t$ can be expressed as:

$$h'_t = \big[h_t \,;\, \tau_t\big]$$

where $\tau_t$ is the timestamp of the datum, providing additional time-dimension information.
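A minimal PyTorch sketch of the time-augmented Bi-LSTM described above; the module name and the representation of timestamps as an extra input tensor are assumptions of the sketch.

```python
import torch
import torch.nn as nn

class TimeAwareBiLSTM(nn.Module):
    """Bi-LSTM whose hidden state is augmented with the timestamp,
    h'_t = [h_t ; tau_t]."""
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden_dim,
                              batch_first=True, bidirectional=True)

    def forward(self, x, timestamps):
        # x: (batch, T, in_dim); timestamps: (batch, T, 1)
        h, _ = self.bilstm(x)                      # (batch, T, 2*hidden_dim)
        return torch.cat([h, timestamps], dim=-1)  # append time information
```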
Further, the dictionary representation corresponding to the hidden states is determined by a preset learning algorithm. The preset learning algorithm includes an improved greedy algorithm, which alternately optimizes the initial dictionary and the sparse coding matrix corresponding to the hidden states to obtain the dictionary representation. A KSVD algorithm is used to learn the dictionary that represents the data; dictionary learning is an unsupervised learning method that can learn a dictionary from a large amount of data with which to represent it. Specifically, let the hidden-state matrix be $H$ and the dictionary be $D_c$; dictionary learning then aims at minimizing the following optimization problem:
$$J(D_c, A) = \lVert H - D_c A \rVert_F^2$$

where $J$ is the objective function of dictionary learning, $\lVert \cdot \rVert_F$ denotes the Frobenius norm, and $A$ is the sparse coding matrix indicating the representation of the hidden states $H$ over the dictionary $D_c$, i.e., the dictionary representation.
It will be appreciated that minimizing the above objective is an NP-hard problem, and conventional solutions may require significant computational resources. The invention therefore adopts an improved greedy algorithm: in each iteration, the dictionary $D_c$ is first fixed and the sparse coding matrix $A$ optimized; then $A$ is fixed and the dictionary $D_c$ optimized. This alternation continues until a preset number of iterations is reached, or the changes in the dictionary $D_c$ and the sparse coding matrix $A$ fall below a certain threshold.
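The alternating optimization can be sketched as follows; this is a simplified stand-in for the KSVD/greedy procedure, using a ridge-regularized coding step instead of a hard sparsity constraint.

```python
import numpy as np

def dictionary_learning(H, n_atoms, n_iter=20, lam=0.1, seed=0):
    """Alternating minimization of ||H - D A||_F^2 (simplified KSVD stand-in)."""
    d, _ = H.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_iter):
        # fix D, optimize A:  A = (D^T D + lam I)^{-1} D^T H
        A = np.linalg.solve(D.T @ D + lam * np.eye(n_atoms), D.T @ H)
        # fix A, optimize D:  D = H A^T (A A^T + lam I)^{-1}
        D = H @ A.T @ np.linalg.inv(A @ A.T + lam * np.eye(n_atoms))
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D, A
```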
Further, inputting the dictionary representation into a preset deep belief network, and performing feature dimension reduction on the dictionary representation until a preset iteration requirement is met, so as to obtain a dimension-reduced sample feature parameter. The deep belief network comprises a plurality of layers of sub-networks, dictionary representations are input to input layers of the deep belief network, and the layers of sub-networks of the deep belief network are pre-trained layer by using the dictionary representations to obtain training output when the deep belief network is in specific implementation; the error corresponding to the training output is reversely transmitted to an input layer, and parameter adjustment is carried out on the deep belief network; and obtaining the sample characteristic parameters of dimension reduction based on the current training output until the error corresponding to the training output meets a preset error threshold.
The training sample data set of dictionary representation obtained after the iteration is finished isFurther, will->And inputting the characteristics into a deep belief network to perform characteristic dimension reduction. The deep belief network is a generated probability map model formed by superposition of a plurality of restricted boltzmann machines. By training the deep belief network layer by layer, the high-level features of the data can be effectively extracted, and the high-level features refer to more abstract features Is a non-linear feature of (2).
The deep belief network comprises the following structures: the input layer is provided withA personal node, a first hidden layer with +.>A personal node, a second hidden layer with +.>And each node. Through training the deep belief network, the data after dimension reduction can be obtained. Is provided with->Dimension reduction of data to +.>Dimension (V) & gt>
Training the deep belief network comprises two steps, layer-by-layer pre-training and global fine-tuning. Layer-by-layer pre-training: first, the first restricted Boltzmann machine is trained on the input data until it reaches an equilibrium distribution; the hidden-layer output of the first restricted Boltzmann machine is then taken as the input of the next restricted Boltzmann machine, which is trained in the same way. This yields an initialized deep belief network. Global fine-tuning: the error of the output layer is back-propagated to the input layer using the back-propagation algorithm, thereby adjusting the parameters of the deep belief network. The energy function of a restricted Boltzmann machine is defined as:

$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j,$$

where $v$ and $h$ denote the states of the visible layer and the hidden layer respectively, $W$ denotes the connection weights between the visible layer and the hidden layer, and $a$ and $b$ denote the biases of the visible layer and the hidden layer respectively. By minimizing the free energy of the restricted Boltzmann machine, a representation of the data at the hidden layer is obtained; this hidden-layer representation is the dimension-reduced data feature.
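As an illustration of the layer-wise scheme, the sketch below trains a stack of restricted Boltzmann machines with one step of contrastive divergence (CD-1) and uses the top hidden activations as the reduced features. The layer sizes, learning rate, and epoch count are assumptions made for the example, not values from the patent, and inputs are assumed scaled to [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden, lr=0.05, epochs=10):
    """CD-1 training of one restricted Boltzmann machine on data V (rows = samples)."""
    n_visible = V.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a = np.zeros(n_visible)   # visible-layer bias
    b = np.zeros(n_hidden)    # hidden-layer bias
    for _ in range(epochs):
        # positive phase: sample hidden units given the data
        ph = sigmoid(V @ W + b)
        h = (rng.random(ph.shape) < ph).astype(float)
        # negative phase: one Gibbs step down to the visible layer and back up
        pv = sigmoid(h @ W.T + a)
        ph2 = sigmoid(pv @ W + b)
        # contrastive-divergence parameter updates
        W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
        a += lr * (V - pv).mean(axis=0)
        b += lr * (ph - ph2).mean(axis=0)
    return W, a, b

def dbn_reduce(X, layer_sizes=(64, 16)):
    """Stack RBMs layer by layer; the top hidden activations are the reduced features."""
    H = X
    for n_hidden in layer_sizes:
        W, _, b = train_rbm(H, n_hidden)
        H = sigmoid(H @ W + b)   # this RBM's hidden output feeds the next RBM
    return H
```

Global fine-tuning with back-propagation would follow this pre-training stage; it is omitted here to keep the sketch focused on the layer-wise step.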
Step S206, inputting the dimension-reduced sample characteristic parameters, the first sample label and the second sample label of the training sample set into a preset federal learning model, performing classification training on the federal learning model, and constructing an initial model based on the trained federal learning model so as to perform equipment fault diagnosis on the target equipment through the initial model.
The embodiment of the invention trains the classifier using the dimension-reduced training sample data set, where the classifier is trained by combining an extreme gradient boosting decision tree with a collective intelligence optimization method; in this way, the bias and variance of the model can be reduced simultaneously, further improving the model's prediction accuracy and stability. The classifier of the embodiment of the invention is constructed using a federal learning model; for the specific implementation of the federal learning model, refer to the initial model constructed based on the federal learning framework in the foregoing embodiment. In a specific implementation, the preset federal learning model comprises a central server and a plurality of user side classifiers. The sample feature parameters corresponding to each user side classifier are input into the corresponding user side classifier, which undergoes classification training to obtain a classification training result; the central server then performs weighted aggregation on the classification training results corresponding to the user side classifiers to obtain a global result. Whether the classification model loss function corresponding to the global result has converged is judged; if not, the parameters of the user side classifiers are updated until the classification model loss function converges, yielding the trained federal learning model from which the initial model is constructed.
1) First, the sample feature parameters corresponding to each user side classifier are input into the corresponding user side classifier, and the user side classifier undergoes classification training to obtain a classification training result. In the embodiment of the invention, the user side classifier is trained based on a preset gradient boosting decision tree algorithm and a preset collective intelligence optimization method.
In a specific implementation, the embodiment of the invention generates a population of solutions corresponding to the sample feature parameters based on the preset gradient boosting decision tree algorithm. The quality of each solution in the population is then evaluated based on the preset collective intelligence optimization method, and the optimal solution among them is determined. The target optimal solution is determined according to the decision tree weights corresponding to the gradient boosting decision tree algorithm, and the decision tree weights are updated based on a preset optimization function. The optimal solution is then updated according to preset rules, and the objective function value corresponding to the optimal solution is determined. A target optimal solution is selected according to the objective function value, and it is judged whether the user side loss function value corresponding to the current target optimal solution meets the preset user side loss value requirement. If so, the classification training result is obtained; otherwise, iterative training continues. Specifically, the main goal of the extreme gradient boosting decision algorithm is to minimize the objective function:

$$\mathcal{L} = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega\left(f_k\right),$$

where $l$ is a loss function measuring the gap between the predicted value $\hat{y}_i$ and the actual value $y_i$, $\Omega$ is the regularization term, and $f_k$ is the $k$-th tree model. When training the extreme gradient boosting trees, the parameters of the objective function are optimized in a collective intelligence fashion. The specific steps are as follows. Initialization: generate a population of solutions, each solution representing a possible parameter combination. Update: update each solution according to certain rules, for example by simulating "flying" or "foraging" behaviours. Selection: select the optimal solution according to the value of the objective function.
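A minimal sketch of such a population-based parameter search is shown below, using scikit-learn's GradientBoostingClassifier as a stand-in for the extreme gradient boosting classifier and log loss (cross entropy) as the solution-quality measure. The parameter ranges, population size, and the drift-plus-noise update rule are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def fitness(params, X_tr, y_tr, X_va, y_va):
    """Quality of one solution (parameter combination): validation cross entropy."""
    clf = GradientBoostingClassifier(
        n_estimators=int(params[0]), max_depth=int(params[1]),
        learning_rate=params[2], random_state=0)
    clf.fit(X_tr, y_tr)
    return log_loss(y_va, clf.predict_proba(X_va))

def swarm_search(X, y, pop_size=8, iters=5, seed=0):
    rng = np.random.default_rng(seed)
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=seed)
    lo, hi = np.array([50, 2, 0.01]), np.array([300, 6, 0.3])
    pop = rng.uniform(lo, hi, size=(pop_size, 3))            # initialization
    scores = np.array([fitness(p, X_tr, y_tr, X_va, y_va) for p in pop])
    for _ in range(iters):
        best = pop[np.argmin(scores)]
        # update: each solution drifts toward the best one, plus random "foraging"
        pop = pop + 0.5 * (best - pop) + rng.normal(0, 0.05, pop.shape) * (hi - lo)
        pop = np.clip(pop, lo, hi)
        scores = np.array([fitness(p, X_tr, y_tr, X_va, y_va) for p in pop])
    return pop[np.argmin(scores)], scores.min()              # selection
```

The drift-toward-best update mimics the "flying/foraging" behaviours mentioned above in the simplest possible form; any concrete swarm algorithm could be substituted.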
In the method, the collective intelligence optimization method is applied to the parameter optimization of the extreme gradient boosting decision tree classifier. In each update, the cross entropy loss function is used to evaluate the quality of each solution (parameter combination), and the optimal solution (parameter combination) is then selected for the next update. In this way, the optimal parameter combination can be searched over a wide range, thereby improving the performance of the extreme gradient boosting decision tree classifier. Specifically, this process can be expressed as the following optimization problem:

$$\min_{\theta}\ \sum_{i} l\left(\hat{y}_i(\theta),\, y_i\right),$$

where $\hat{y}_i(\theta)$ denotes the predicted value of the extreme gradient boosting decision tree classifier, and the $k$-th tree model $f_k$ is associated with the parameters $\theta$.
Further, a target optimal solution is selected according to the objective function value, and it is judged whether the user side loss function value corresponding to the current target optimal solution meets the preset user side loss value requirement. If not, the decision tree parameters of the gradient boosting decision tree algorithm are updated according to the function gradient indicated by the user side loss function value, and the preset gradient boosting decision tree algorithm is executed again to generate a population of solutions corresponding to the sample feature parameters. Once the user side loss function value meets the preset user side loss value requirement, the current target optimal solution is determined as the classification training result of the user side classifier. In a specific implementation, in each iteration the model parameters are optimized by minimizing the following loss function:

$$\mathcal{L} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega\left(f_k\right),$$

where $y_i$ is the true device state, $\hat{y}_i$ is the model's prediction, $l$ is the loss function between the predicted value and the actual value, $\Omega$ is the regularization term, and $f_k$ is the $k$-th tree model.
In each iteration, the training algorithm updates the parameters using the gradient of the loss function: it computes the gradient of the loss, multiplies it by the learning rate $\eta$, and applies the result as the update. This process can be formalized as the boosting update

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta\, f_t\left(x_i\right),$$

where $\hat{y}_i^{(t)}$ is the prediction result of the model at iteration $t$, $f_t$ is the $t$-th tree model, and $\eta$ is the learning rate. The learning rate $\eta$ is defined in terms of $\Delta \mathcal{L}$, the change in the value of the loss function over two successive iterations, together with a positive constant that is preset manually.
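The exact adaptive form of the learning rate is not spelled out above; the sketch below assumes one plausible reading, shrinking the step when the loss changes sharply between iterations, purely for illustration.

```python
def adaptive_eta(c, prev_loss, curr_loss):
    """Assumed form: eta shrinks as the loss change between two iterations grows.

    c is the manually preset positive constant; the 1/(1+|dL|) shape is an
    illustrative assumption, not the patent's stated definition.
    """
    d_loss = curr_loss - prev_loss
    return c / (1.0 + abs(d_loss))

# usage: inside a boosting loop, after evaluating the loss in two iterations
losses = [0.93, 0.71]                  # example loss values from two iterations
eta = adaptive_eta(c=0.3, prev_loss=losses[-2], curr_loss=losses[-1])
# the new tree's contribution would then be scaled by eta:
# y_pred = y_pred + eta * tree.predict(X)
```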
2) Next, the classification training results corresponding to the user side classifiers are weighted and aggregated by the central server to obtain a global result; whether the classification model loss function corresponding to the global result has converged is judged, and if not, the parameters of the user side classifiers are updated until the classification model loss function converges, yielding the trained federal learning model. This step is explained below.
In a specific implementation, the invention uses an innovative ensemble learning method that combines the advantages of bagging and boosting to make ensemble decisions when diagnosing equipment faults. A set of decision trees is first generated using a random forest. On this basis, iterative optimization in the style of a gradient boosting decision tree is then performed: in each iteration, no new decision tree is trained; instead, the weights of the existing decision trees are optimized. In this way, the bias and variance of the model can be reduced simultaneously, further improving the model's prediction accuracy and stability. Specifically, at the $t$-th round of iteration, the model prediction can be expressed as:

$$\hat{y}^{(t)}(x) = \sum_{m=1}^{M} w_m^{(t)}\, f_m(x),$$
where $x$ is the input, $f_m(x)$ is the prediction result of the $m$-th decision tree, and $w_m^{(t)}$ is the weight of the $m$-th decision tree at the $t$-th round of iteration. In each iteration, the objective function being optimized is:

$$\min_{w^{(t)}}\ \sum_{i=1}^{N} L\left(y_i,\, \hat{y}_i^{(t)}\right) + \lambda\, \Omega\left(w^{(t)}\right),$$

where $L$ is the loss function, $y_i$ is the true label of the $i$-th sample, $\hat{y}_i^{(t)}$ is the prediction for the $i$-th sample at the $t$-th round of iteration, and $\lambda$ is the regularization parameter. By solving this optimization problem, the optimal weight of each decision tree in each round of iteration can be obtained. After training, the final model prediction is:

$$\hat{y}(x) = \sum_{m=1}^{M} w_m^{*}\, f_m(x),$$

where $w_m^{*}$ is the optimal weight of the $m$-th decision tree. Further, after each user side has trained the model on its local data, the model is uploaded to the central server side, where federated fusion of the models is carried out.
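A compact sketch of this bagging-plus-weight-optimization idea is given below: a random forest supplies the trees, and the per-tree weights are then refined by projected gradient steps on a regularized loss. The squared-error surrogate, the L2 form of the regularizer, the step size, and the round count are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=12, random_state=0)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# per-tree probability of the positive class: shape (n_trees, n_samples)
P = np.stack([t.predict_proba(X)[:, 1] for t in forest.estimators_])

w = np.full(len(P), 1.0 / len(P))      # start from uniform bagging weights
lam, step = 0.1, 0.05
for _ in range(200):                   # iterative weight optimization, no new trees
    y_hat = w @ P                      # weighted ensemble prediction
    grad = P @ (y_hat - y) / len(y) + lam * w   # grad of squared loss + L2 term
    w = np.clip(w - step * grad, 0.0, None)     # keep weights non-negative
    w /= w.sum()                       # renormalise so weights stay comparable

final_pred = (w @ P > 0.5).astype(int)
print("ensemble accuracy (training data):", (final_pred == y).mean())
```

The key point the sketch mirrors is that the trees are frozen after the bagging stage and only their combination weights are iterated, which is what distinguishes this scheme from ordinary boosting.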
In the federal learning framework of the invention, the criteria for stopping model training include reaching a preset number of training rounds or model convergence. For the former, a number of training rounds can be set before training starts, and once that number is reached, model training stops. In addition, when the training and validation loss values of the model stop decreasing significantly, or the accuracy of the model no longer improves noticeably, the model may be considered to have converged and model training is stopped.
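For the server-side weighted aggregation and stopping logic described above, the following is a minimal federated-averaging-style sketch. The use of local sample counts as aggregation weights, the convergence tolerance, and the client callables are standard choices assumed here, not quoted from the patent.

```python
import numpy as np

def aggregate(client_params, client_sizes):
    """Server step: weighted average of client parameter vectors."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, client_params))

def train_federated(clients, init_params, max_rounds=50, tol=1e-4):
    """clients: list of (local_train, local_loss) callables, one pair per user side.

    local_train(global_params) -> (updated local params, local sample count)
    local_loss(global_params)  -> local loss value
    """
    global_params, prev_loss = init_params, np.inf
    for rnd in range(max_rounds):                      # stop: preset round count
        results = [fit(global_params) for fit, _ in clients]
        params = [p for p, _ in results]
        sizes = [n for _, n in results]
        global_params = aggregate(params, sizes)
        loss = np.mean([ev(global_params) for _, ev in clients])
        if abs(prev_loss - loss) < tol:                # stop: loss has converged
            break
        prev_loss = loss
    return global_params
```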
According to the equipment fault diagnosis method based on federal learning provided by the embodiment of the invention, a training sample set is obtained by performing sample expansion, by sample category, on a pre-acquired initial sample set through a preset expansion algorithm; an improved SMOTE algorithm is provided for this purpose, expanding the data within the minority sample categories. In the improved SMOTE algorithm, by distinguishing safety points from noise points, the algorithm can more accurately identify and handle the data points that most influence the classification result; at the same time, by calculating the density of sample points and adjusting the sampling proportion, in particular increasing the weight of samples near the decision boundary, misclassification near the boundary is reduced. Moreover, the improved SMOTE algorithm strengthens the consideration of a boundary-sensitivity factor, improving the representation of minority-class samples in the dataset and thereby their recognition rate. In addition, a nonlinear triangular interpolation method is used to generate new samples; compared with samples generated by the traditional linear interpolation method, these better match the actual data distribution, improving the generalization ability of the classifier. Further, the expanded sample set is expanded once more; in this second expansion, the improved constraints of the SMOTE algorithm further raise the model's performance on imbalanced data sets, in particular its ability to take fine-grained characteristics into account when predicting equipment behaviour over time.
Furthermore, a feature extraction model constructed based on an improved Bi-LSTM model, a preset learning algorithm, and an improved greedy algorithm is provided: feature extraction is performed using the improved Bi-LSTM and a KSVD-learned dictionary combined with the improved greedy algorithm, and a deep belief network is trained layer by layer so that feature dimension reduction can be performed with it, allowing high-level features of the data to be effectively extracted. In addition, the embodiment of the invention trains the classifier by combining an extreme gradient boosting decision tree with a collective intelligence optimization method; in this way, the bias and variance of the model can be reduced simultaneously, further improving the model's prediction accuracy and stability. On this basis, the embodiment of the invention can effectively improve the model's accuracy in equipment fault diagnosis.
Further, on the basis of the above method embodiment, the embodiment of the present invention further provides a device fault diagnosis apparatus based on federal learning, and fig. 5 shows a schematic structural diagram of the device fault diagnosis apparatus based on federal learning provided by the embodiment of the present invention, as shown in fig. 5, where the device includes: the data acquisition module 100 is configured to acquire data to be diagnosed corresponding to the target device; the data to be diagnosed are vibration signal data acquired by a preset sensor at a preset position of target equipment; the processing module 200 is configured to determine a target model corresponding to the data to be diagnosed based on the data stability condition corresponding to the data to be diagnosed; the target model is built based on a preset federal learning model and a training sample set after dimension reduction and sample expansion; the execution module 300 is configured to input data to be diagnosed into a target model, and determine a fault classification result corresponding to the data to be diagnosed through the target model; and an output module 400, configured to perform device fault diagnosis on the target device based on the fault classification result.
The equipment fault diagnosis device based on federal learning provided by the embodiment of the invention has the same technical characteristics as the equipment fault diagnosis method based on federal learning provided by the embodiment, so that the same technical problems can be solved, and the same technical effects can be achieved.
Further, on the basis of the foregoing embodiment, the embodiment of the present invention further provides another device fault diagnosis apparatus based on federal learning, and fig. 6 shows a schematic structural diagram of the device fault diagnosis apparatus based on federal learning provided by the embodiment of the present invention, where, as shown in fig. 6, the target model includes an initial model that is built in advance, or an update model corresponding to the initial model; the processing module 200 is further configured to obtain a device feature at each time of the data to be diagnosed, and determine a feature variation corresponding to the device feature at an adjacent time; comparing the characteristic variation with a preset variation threshold; when the characteristic variation meets a variation threshold, determining a target model corresponding to the data to be diagnosed as an initial model; and when the characteristic variation does not meet the variation threshold, determining the target model corresponding to the data to be diagnosed as an updated model corresponding to the initial model.
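As a small illustration of this model-selection rule, the sketch below compares the feature change between adjacent time steps against a threshold; the feature representation, the threshold value, and the model handles are placeholders assumed for the example.

```python
import numpy as np

def select_model(features, threshold, initial_model, updated_model):
    """Pick the target model from the stability of device features over time.

    features: array of shape (T, d), device features at successive moments.
    """
    # feature variation between adjacent moments
    deltas = np.linalg.norm(np.diff(features, axis=0), axis=1)
    if np.all(deltas <= threshold):
        return initial_model      # data is stable: use the pre-built initial model
    return updated_model          # data has drifted: use the updated model

# usage with placeholder models
features = np.random.default_rng(1).normal(size=(10, 4))
model = select_model(features, threshold=2.5,
                     initial_model="initial", updated_model="updated")
print("selected:", model)
```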
The device further comprises a model construction module 500 for obtaining a pre-constructed training sample set; the training sample set is obtained by carrying out sample expansion on a pre-acquired initial sample set based on sample types through a preset expansion algorithm; the training sample set is constructed based on historical fault data acquired by a preset sensor, and comprises a first sample label and a plurality of second sample labels, wherein the first sample label is used for representing that the equipment is in a normal running state, and the second sample label is used for representing that the equipment is in a fault state; performing feature extraction on the training sample set through a preset feature extraction model, and determining sample feature parameters of dimension reduction; the feature extraction model is constructed based on an improved Bi-LSTM model, a preset learning algorithm and an improved greedy algorithm; inputting the dimension-reduced sample characteristic parameters, the first sample label and the second sample label of the training sample set into a preset federal learning model, carrying out classification training on the federal learning model, and constructing an initial model based on the trained federal learning model so as to carry out equipment fault diagnosis on target equipment through the initial model.
The model construction module 500 is further configured to input a training sample set into the improved Bi-LSTM model, and determine a hidden state corresponding to the training sample set; the hidden state comprises preset time dimension information; determining dictionary representations corresponding to the hidden states through a preset learning algorithm; the method comprises the steps that a preset learning algorithm comprises an improved greedy algorithm, wherein the greedy algorithm is used for alternately optimizing an initial dictionary and a sparse coding matrix corresponding to a hidden state so as to obtain dictionary representation; inputting the dictionary representation into a preset deep belief network, and performing feature dimension reduction on the dictionary representation until the preset iteration requirement is met, so as to obtain sample feature parameters of dimension reduction.
Further, the deep belief network includes a multi-layer subnetwork; the model building module 500 is further configured to input dictionary representations to an input layer of the deep belief network, and perform layer-by-layer pre-training on a multi-layer sub-network of the deep belief network using the dictionary representations to obtain a training output; the error corresponding to the training output is reversely transmitted to an input layer, and parameter adjustment is carried out on the deep belief network; and obtaining the sample characteristic parameters of dimension reduction based on the current training output until the error corresponding to the training output meets a preset error threshold.
Further, the preset federal learning model comprises a central server and a plurality of client classifiers; the model building module 500 is further configured to input a sample feature parameter corresponding to each user side classifier into the corresponding user side classifier, and perform classification training on the user side classifier to obtain a classification training result; the method comprises the steps that a central server carries out weighted aggregation treatment on classification training results corresponding to each user side classifier respectively to obtain a global result; judging whether the classification model loss function corresponding to the global result is converged, if not, updating parameters of the user side classifier until the classification model loss function is converged, and obtaining a trained federal learning model so as to construct an initial model based on the current federal learning model.
Further, the user side classifier is trained based on a preset gradient boosting decision tree algorithm and a preset collective intelligence optimization method; the model building module 500 is further configured to generate a population of solutions corresponding to the sample feature parameters based on the preset gradient boosting decision tree algorithm; evaluate the quality of each solution in the population based on the preset collective intelligence optimization method, and determine an optimal solution among the solutions; update the optimal solution according to a preset rule, and determine an objective function value corresponding to the optimal solution; select a target optimal solution according to the objective function value, and judge whether a user side loss function value corresponding to the current target optimal solution meets a preset user side loss value requirement, the target optimal solution being determined according to the decision tree weights corresponding to the gradient boosting decision tree algorithm, and the decision tree weights being updated based on a preset optimization function; if not, update the decision tree parameters of the gradient boosting decision tree algorithm according to the function gradient indicated by the user side loss function value, and execute the preset gradient boosting decision tree algorithm to generate a population of solutions corresponding to the sample feature parameters; and, once the user side loss function value meets the preset user side loss value requirement, determine the current target optimal solution as the classification training result of the user side classifier.
The model building module 500 is further configured to obtain a pre-acquired initial sample set, the initial sample set comprising a plurality of initial samples, the initial samples covering a plurality of sample categories, the sample categories comprising a first type of sample and a second type of sample, with the number of first-type samples smaller than the number of second-type samples; determine a neighbor sample corresponding to a preset initial sample from the initial sample set, and determine the proportion of first-type samples among the neighbor samples; determine, according to this sample proportion, whether the initial sample is a safety point, and if so, calculate the sampling proportion corresponding to the initial sample and expand the initial sample according to the sampling proportion through a preset interpolation algorithm to construct a first expanded sample set; layer the first expanded sample set, and perform sample expansion on the first expanded sample set according to the layering result to construct a second expanded sample set; and combine the first expanded sample set and the second expanded sample set to construct the training sample set.
The model building module 500 is further configured to classify the first expanded sample set to determine a plurality of hierarchical samples; identify the sample category of each hierarchical sample, and determine the samples to be expanded according to the sample category; calculate a slack parameter corresponding to the samples to be expanded through a preset Lagrange relaxation method; optimize the distribution of the samples to be expanded in the multidimensional feature space through the Chebyshev inequality, and determine the target sample distribution; perform sample expansion on the samples to be expanded according to the slack parameter and the target sample distribution, obtaining second expanded samples once the expanded samples pass multidimensional verification; and cross-fuse the second expanded samples with the first expanded sample set to construct the second expanded sample set.
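To make the expansion flow above concrete, here is a hedged sketch of the safety-point test and a triangular (three-neighbour) interpolation step; the neighbour count, the safety threshold, the density-based sampling proportion, and the interpolation weights are illustrative assumptions, not the patent's prescribed values.

```python
import numpy as np

def expand_minority(X_min, X_all, y_all, minority_label, k=5, safe_ratio=0.5, seed=0):
    """Safety-point test plus triangular interpolation for minority samples."""
    rng = np.random.default_rng(seed)
    new_samples = []
    for x in X_min:
        # k nearest neighbours of x in the full set (index 0 is x itself)
        d = np.linalg.norm(X_all - x, axis=1)
        nn = np.argsort(d)[1:k + 1]
        ratio = np.mean(y_all[nn] == minority_label)
        if ratio < safe_ratio:
            continue                   # treated as a noise/boundary-risk point: skip
        # sampling proportion grows with local minority density (assumption)
        n_new = int(np.ceil(ratio * 3))
        min_nn = nn[y_all[nn] == minority_label]
        for _ in range(n_new):
            if len(min_nn) >= 2:
                a, b = rng.choice(min_nn, size=2, replace=False)
                # triangular interpolation: random convex combination of x
                # and two minority neighbours, rather than a straight segment
                w = rng.dirichlet([1.0, 1.0, 1.0])
                new_samples.append(w[0] * x + w[1] * X_all[a] + w[2] * X_all[b])
    return np.array(new_samples)
```

Sampling inside a triangle spanned by three real minority points is one way to read "nonlinear triangular interpolation"; it keeps the synthetic points inside the local minority region instead of on a single line segment.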
The embodiment of the invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the method shown in any one of the figures 1 to 4. Embodiments of the present invention also provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor performs the steps of the method shown in any of the above-mentioned figures 1 to 4. The embodiment of the present invention further provides a schematic structural diagram of an electronic device, as shown in fig. 7, where the electronic device includes a processor 71 and a memory 70, where the memory 70 stores computer executable instructions that can be executed by the processor 71, and the processor 71 executes the computer executable instructions to implement the method shown in any of the foregoing fig. 1 to 4.
In the embodiment shown in fig. 7, the electronic device further comprises a bus 72 and a communication interface 73, where the processor 71, the communication interface 73 and the memory 70 are connected by the bus 72. The memory 70 may include a high-speed random access memory (RAM, Random Access Memory), and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between the system network element and at least one other network element is achieved via at least one communication interface 73 (which may be wired or wireless), which may use the internet, a wide area network, a local network, a metropolitan area network, etc. The bus 72 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, or may be an AMBA (Advanced Microcontroller Bus Architecture, a standard for on-chip buses) bus, where AMBA defines three types of buses: the APB (Advanced Peripheral Bus), the AHB (Advanced High-performance Bus), and the AXI (Advanced eXtensible Interface) bus. The bus 72 may be classified as an address bus, a data bus, a control bus, and the like. For ease of illustration, only one bi-directional arrow is shown in fig. 7, but this does not mean there is only one bus or only one type of bus.
The processor 71 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuitry in hardware, or by instructions in the form of software, in the processor 71. The processor 71 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP for short), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field-Programmable Gate Array, FPGA for short) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or any conventional processor. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor. The software modules may be located in a random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor 71 reads the information in the memory and, in combination with its hardware, performs the method shown in any of the foregoing figures 1 to 4.
The embodiment of the invention provides a method and a device for diagnosing equipment faults based on federal learning, which comprises a computer readable storage medium storing program codes, wherein the instructions included in the program codes can be used for executing the method described in the previous method embodiment, and specific implementation can be referred to the method embodiment and is not repeated herein.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding process in the foregoing method embodiment for the specific working process of the above-described system, which is not repeated here. In addition, in the description of embodiments of the present invention, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "coupled" are to be construed broadly: for example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or a communication between two elements. The specific meaning of the above terms in the present invention will be understood by those skilled in the art according to the specific case.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that: the above examples are only specific embodiments of the present invention for illustrating the technical solution of the present invention, but not for limiting the scope of the present invention, and although the present invention has been described in detail with reference to the foregoing examples, it will be understood by those skilled in the art that the present invention is not limited thereto: any person skilled in the art may modify or easily conceive of the technical solution described in the foregoing embodiments, or perform equivalent substitution of some of the technical features, while remaining within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (7)

1. A federal learning-based equipment fault diagnosis method, the method comprising:
acquiring data to be diagnosed corresponding to target equipment; the data to be diagnosed are vibration signal data acquired by a preset sensor at a preset position of target equipment;
determining a target model corresponding to the data to be diagnosed based on the data stability condition corresponding to the data to be diagnosed; the target model is built based on a preset federal learning model and a training sample set after dimension reduction and sample expansion;
inputting the data to be diagnosed into the target model, and determining a fault classification result corresponding to the data to be diagnosed through the target model;
performing equipment fault diagnosis on the target equipment based on the fault classification result;
the method further comprises the steps of: acquiring a pre-constructed training sample set; the training sample set is obtained by carrying out sample expansion on a pre-acquired initial sample set based on sample types through a preset expansion algorithm; the training sample set is constructed based on historical fault data acquired by the preset sensor, and comprises a first sample label and a plurality of second sample labels, wherein the first sample label is used for representing that the equipment is in a normal running state, and the second sample label is used for representing that the equipment is in a fault state;
Performing feature extraction on the training sample set through a preset feature extraction model, and determining sample feature parameters of dimension reduction; the feature extraction model is constructed based on an improved Bi-LSTM model, a preset learning algorithm and an improved greedy algorithm;
inputting the dimension-reduced sample characteristic parameters, the first sample label and the second sample label of the training sample set into a preset federal learning model, performing classification training on the federal learning model, and constructing an initial model based on the trained federal learning model so as to perform equipment fault diagnosis on target equipment through the initial model;
the preset federal learning model comprises a central server and a plurality of user side classifiers; inputting the dimension-reduced sample characteristic parameters, the first sample label and the second sample label of the training sample set into a preset federal learning model, performing classification training on the federal learning model, and constructing an initial model based on the trained federal learning model, wherein the method comprises the following steps:
inputting sample characteristic parameters corresponding to each user side classifier into the corresponding user side classifier, and performing classification training on the user side classifier to obtain a classification training result;
The central server carries out weighted aggregation treatment on the classification training results corresponding to each user side classifier respectively to obtain a global result;
judging whether a classification model loss function corresponding to the global result is converged or not, if not, carrying out parameter updating on the parameters of the user side classifier until the classification model loss function is converged, and obtaining a trained federal learning model so as to construct an initial model based on a current federal learning model;
the user side classifier is trained based on a preset gradient boosting decision tree algorithm and a preset collective intelligence optimization method; inputting sample characteristic parameters corresponding to each user side classifier into the corresponding user side classifier, and performing classification training on the user side classifier to obtain a classification training result, comprises the following steps:

generating a population of solutions corresponding to the sample characteristic parameters based on the preset gradient boosting decision tree algorithm;

evaluating the quality of each solution in the population based on the preset collective intelligence optimization method, and determining an optimal solution among the solutions;

updating the optimal solution according to a preset rule, and determining an objective function value corresponding to the optimal solution;

selecting a target optimal solution according to the objective function value, and judging whether a user side loss function value corresponding to the current target optimal solution meets a preset user side loss value requirement; the target optimal solution is determined according to decision tree weights corresponding to the gradient boosting decision tree algorithm, and the decision tree weights are updated based on a preset optimization function;

if not, updating the decision tree parameters of the gradient boosting decision tree algorithm according to the function gradient indicated by the user side loss function value, and executing the preset gradient boosting decision tree algorithm to generate a population of solutions corresponding to the sample characteristic parameters;

and determining the current target optimal solution as the classification training result of the user side classifier until the user side loss function value meets the preset user side loss value requirement.
2. The method according to claim 1, wherein the target model comprises a pre-built initial model, or an updated model corresponding to the initial model;
based on the data stability condition corresponding to the data to be diagnosed, determining a target model corresponding to the data to be diagnosed, including:
Acquiring the equipment characteristics of each moment of the data to be diagnosed, and determining the characteristic variation corresponding to the equipment characteristics of the adjacent moment;
comparing the characteristic variation with a preset variation threshold;
when the characteristic variation meets the variation threshold, determining a target model corresponding to the data to be diagnosed as the initial model;
and when the characteristic variation does not meet the variation threshold, determining the target model corresponding to the data to be diagnosed as an updated model corresponding to the initial model.
3. The method according to claim 1, wherein the step of performing feature extraction on the training sample set through a preset feature extraction model to determine sample feature parameters of dimension reduction comprises:
inputting the training sample set into an improved Bi-LSTM model, and determining a hidden state corresponding to the training sample set; the hidden state comprises preset time dimension information;
determining dictionary representations corresponding to the hidden states through a preset learning algorithm; the preset learning algorithm comprises an improved greedy algorithm, and the greedy algorithm is used for alternately optimizing an initial dictionary and a sparse coding matrix corresponding to the hidden state so as to obtain dictionary representation;
Inputting the dictionary representation into a preset deep belief network, and performing feature dimension reduction on the dictionary representation until a preset iteration requirement is met, so as to obtain dimension-reduced sample feature parameters.
4. The method of claim 3, wherein the deep belief network comprises a multi-layer subnetwork; inputting the dictionary representation into a preset deep belief network, performing feature dimension reduction on the dictionary representation until a preset iteration requirement is met, and obtaining dimension-reduced sample feature parameters, wherein the method comprises the following steps of:
inputting the dictionary representation to an input layer of the deep belief network, and performing layer-by-layer pre-training on a multi-layer sub-network of the deep belief network by using the dictionary representation to obtain training output;
back-propagating errors corresponding to the training output to the input layer, and carrying out parameter adjustment on the deep belief network;
and obtaining the dimension-reduced sample characteristic parameters based on the current training output until the error corresponding to the training output meets a preset error threshold.
5. The method according to claim 1, wherein the method further comprises:
acquiring an initial sample set acquired in advance; the initial sample set comprises a plurality of initial samples, the initial samples cover a plurality of sample categories, the sample categories comprise a first type of sample and a second type of sample, and the number of samples of the first type is smaller than the number of samples of the second type;
Determining a neighbor sample corresponding to a preset initial sample from the initial sample set, and determining the sample proportion of a first type sample in the neighbor sample;
determining whether the initial sample is a safety point or not according to the sample proportion, if so, calculating the sampling proportion corresponding to the initial sample, and carrying out sample expansion on the initial sample according to the sampling proportion by a preset interpolation algorithm to construct a first expansion sample set;
layering the first expansion sample set, and carrying out sample expansion on the first expansion sample set according to layering results to construct a second expansion sample set;
and combining the first extended sample set and the second extended sample set to construct a training sample set.
6. The method of claim 5, wherein the steps of layering the first expanded sample set and sample expanding the first expanded sample set based on layering results to construct a second expanded sample set comprise:
classifying the first extended sample set to determine a plurality of hierarchical samples;
identifying a sample class of each level sample, and determining a sample to be expanded according to the sample class;
Calculating a sag parameter corresponding to the sample to be expanded by a preset Lagrange relaxation method; optimizing the sample distribution of the sample to be expanded in a multidimensional feature space through Chebyshev inequality, and determining target sample distribution;
sample expansion is carried out on the sample to be expanded according to the sag parameter and the target sample distribution, and a second expansion sample is obtained when the expanded sample passes multidimensional verification;
and carrying out cross fusion on the second expansion sample and the first expansion sample set to construct a second expansion sample set.
7. An apparatus for diagnosing a device failure based on federal learning, the apparatus comprising:
the data acquisition module is used for acquiring data to be diagnosed corresponding to the target equipment; the data to be diagnosed are vibration signal data acquired by a preset sensor at a preset position of target equipment;
the processing module is used for determining a target model corresponding to the data to be diagnosed based on the data stability condition corresponding to the data to be diagnosed; the target model is built based on a preset federal learning model and a training sample set after dimension reduction and sample expansion;
The execution module is used for inputting the data to be diagnosed into the target model, and determining a fault classification result corresponding to the data to be diagnosed through the target model;
the output module is used for carrying out equipment fault diagnosis on the target equipment based on the fault classification result;
the device also comprises a model construction module, a model analysis module and a model analysis module, wherein the model construction module is used for acquiring a pre-constructed training sample set; the training sample set is obtained by carrying out sample expansion on a pre-acquired initial sample set based on sample types through a preset expansion algorithm; the training sample set is constructed based on historical fault data acquired by the preset sensor, and comprises a first sample label and a plurality of second sample labels, wherein the first sample label is used for representing that the equipment is in a normal running state, and the second sample label is used for representing that the equipment is in a fault state; performing feature extraction on the training sample set through a preset feature extraction model, and determining sample feature parameters of dimension reduction; the feature extraction model is constructed based on an improved Bi-LSTM model, a preset learning algorithm and an improved greedy algorithm; inputting the dimension-reduced sample characteristic parameters, the first sample label and the second sample label of the training sample set into a preset federal learning model, performing classification training on the federal learning model, and constructing an initial model based on the trained federal learning model so as to perform equipment fault diagnosis on target equipment through the initial model;
The preset federal learning model comprises a central server and a plurality of user side classifiers; the model construction module is further used for inputting sample characteristic parameters corresponding to each user side classifier into the corresponding user side classifier, and performing classification training on the user side classifier to obtain a classification training result; the central server carries out weighted aggregation treatment on the classification training results corresponding to each user side classifier respectively to obtain a global result; judging whether a classification model loss function corresponding to the global result is converged or not, if not, carrying out parameter updating on the parameters of the user side classifier until the classification model loss function is converged, and obtaining a trained federal learning model so as to construct an initial model based on a current federal learning model;
the user side classifier is trained based on a preset gradient boosting decision tree algorithm and a preset collective intelligence optimization method; the model construction module is further used for generating a population of solutions corresponding to the sample characteristic parameters based on the preset gradient boosting decision tree algorithm; evaluating the quality of each solution in the population based on the preset collective intelligence optimization method, and determining an optimal solution among the solutions; updating the optimal solution according to a preset rule, and determining an objective function value corresponding to the optimal solution; selecting a target optimal solution according to the objective function value, and judging whether a user side loss function value corresponding to the current target optimal solution meets a preset user side loss value requirement, wherein the target optimal solution is determined according to decision tree weights corresponding to the gradient boosting decision tree algorithm, and the decision tree weights are updated based on a preset optimization function; if not, updating the decision tree parameters of the gradient boosting decision tree algorithm according to the function gradient indicated by the user side loss function value, and executing the preset gradient boosting decision tree algorithm to generate a population of solutions corresponding to the sample characteristic parameters; and determining the current target optimal solution as the classification training result of the user side classifier until the user side loss function value meets the preset user side loss value requirement.
CN202311714817.0A 2023-12-14 2023-12-14 Equipment fault diagnosis method and device based on federal learning Active CN117407781B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311714817.0A CN117407781B (en) 2023-12-14 2023-12-14 Equipment fault diagnosis method and device based on federal learning

Publications (2)

Publication Number Publication Date
CN117407781A CN117407781A (en) 2024-01-16
CN117407781B true CN117407781B (en) 2024-02-23

Family

ID=89494715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311714817.0A Active CN117407781B (en) 2023-12-14 2023-12-14 Equipment fault diagnosis method and device based on federal learning

Country Status (1)

Country Link
CN (1) CN117407781B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117668622B (en) * 2024-02-01 2024-05-10 山东能源数智云科技有限公司 Training method of equipment fault diagnosis model, fault diagnosis method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114580546A (en) * 2022-03-08 2022-06-03 联泓(山东)化学有限公司 Industrial pump fault prediction method and system based on federal learning framework
WO2023092792A1 (en) * 2021-11-29 2023-06-01 深圳前海微众银行股份有限公司 Optimization method for modeling based on federated learning, and electronic device, storage medium and program product
CN116432091A (en) * 2023-06-15 2023-07-14 山东能源数智云科技有限公司 Equipment fault diagnosis method based on small sample, construction method and device of model
CN116910493A (en) * 2023-09-12 2023-10-20 山东能源数智云科技有限公司 Construction method and device of equipment fault diagnosis model based on multi-source feature extraction
CN117008570A (en) * 2023-07-13 2023-11-07 北京航空航天大学 Cross-working condition fault diagnosis method and system based on open set federation reactance domain adaptation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Mechanical fault diagnosis based on a big-data deep transfer model; Zeng Degui; Zhao Jianming; Modular Machine Tool & Automatic Manufacturing Technique; 2020-09-20 (09); full text *

Also Published As

Publication number Publication date
CN117407781A (en) 2024-01-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant