CN116821774A

CN116821774A - Power generation fault diagnosis method based on artificial intelligence

Info

Publication number: CN116821774A
Application number: CN202311083408.5A
Authority: CN
Inventors: 孙贵杰; 马素真; 张丽红
Original assignee: Shandong Ligong Haoming New Energy Co ltd
Current assignee: Shandong Polytechnic College
Priority date: 2023-08-28
Filing date: 2023-08-28
Publication date: 2023-09-29
Anticipated expiration: 2043-08-28
Also published as: CN116821774B

Abstract

The invention discloses a power generation fault diagnosis method based on artificial intelligence, comprising the following steps: data collection and processing, data storage and searching, feature extraction and selection, establishment of a power generation fault diagnosis model, and interpretation of power generation fault diagnosis results. The invention belongs to the technical field of power engineering. The method adopts a consistent hashing algorithm and determines the node position of data by dividing a hash ring; groups data with a clustering algorithm and maps the original data to a high-dimensional feature space with a nonlinear feature mapping function to extract features; adopts an optimal strategy searching algorithm in which the decision process of power generation fault diagnosis interacts with the environment state and the diagnosis process is optimized according to a feedback reward function; and interprets the diagnosis result of a single sample with a local interpretation model, extracts decision rules with a rule extraction algorithm, and evaluates the generalization capability of the decision rules by W-fold cross-validation.

Description

Power generation fault diagnosis method based on artificial intelligence

Technical Field

The invention belongs to the field of power engineering, and particularly relates to a power generation fault diagnosis method based on artificial intelligence.

Background

Power generation fault diagnosis is the process of locating fault sources, detecting faults and classifying faults that may occur in a power generation system. By analyzing operation data, monitoring signals and maintenance records, it aims to discover potential faults in advance and to take corresponding repair and maintenance measures, so that serious faults and shutdown accidents of equipment are avoided and the reliability and availability of the power generation system are improved. However, existing power generation fault diagnosis has the following technical problems: the operation data of a power generation system is complex, so effective data storage and searching are difficult, which negatively affects fault diagnosis and operation and maintenance efficiency; during feature extraction and selection on the original data, the most representative data features are difficult to extract and the relationship between the features and power generation faults cannot be fully understood, which affects the accuracy of the diagnosis results; current power generation fault diagnosis models rely on a static model to model and predict the occurrence of faults and cannot perceive and adapt to a changing working environment in real time, so the diagnosis result lags and the maintenance cost of the power generation system increases; and black-box algorithms lack interpretability and cannot explain the internal operation mode, so engineers and operators have difficulty finding algorithm defects and improvement points and cannot carry out in-depth analysis, optimization and decision-making.

Disclosure of Invention

In view of the above situation, and to overcome the defects of the prior art, the invention provides an artificial intelligence-based power generation fault diagnosis method. To address the technical problem that the operation data of a power generation system is complex and difficult to store and search, which negatively affects fault diagnosis and operation and maintenance efficiency, a consistent hashing algorithm is adopted: the storage space and the data are mapped onto a hash ring and the node position of the data in the storage space is determined by dividing the hash ring, which narrows the data search range, achieves load balancing and efficient data storage, and effectively improves the data processing efficiency, fault tolerance and scalability of the system. To address the technical problem that, during power generation fault diagnosis, the most representative data features are difficult to extract and the relationship between the features and power generation faults is difficult to fully understand, which affects the accuracy of the diagnosis results, a clustering algorithm is adopted to calculate the center point of each feature dimension and group the data samples, a nonlinear feature mapping function maps the original data to a high-dimensional feature space, and feature extraction is performed in that feature space, which reduces the feature dimension, improves the discriminability and representativeness of the features, and yields the most representative data features. To address the technical problem that current power generation fault diagnosis models rely on a static model to model and predict the occurrence of faults and cannot perceive and adapt to a changing working environment in real time, so that the diagnosis result lags and the maintenance cost of the power generation system increases, an optimal strategy searching algorithm is adopted: the decision process of power generation fault diagnosis acts as the behavior of an agent that interacts with the environment state and learns according to a feedback reward function, and the diagnosis process is optimized according to the feedback reward observed in real time, so that the method responds quickly in scenarios with high real-time requirements and reduces the maintenance cost of the power generation system. To address the technical problem that black-box algorithms lack interpretability and cannot explain the internal operation mode, so that engineers and operators cannot find algorithm defects and improvement points or carry out in-depth analysis, optimization and decision-making, a rule extraction algorithm is adopted to extract easily understood decision rules, a decision tree structure is used to analyze feature importance and explain the decision process in the model, a local interpretation model interprets the diagnosis result of a single sample point, and the diagnosis results of the power generation fault diagnosis model are presented by visualization means, which helps engineers and operators understand the diagnosis results and improves working efficiency.

The technical scheme adopted by the invention is as follows: the invention provides a power generation fault diagnosis method based on artificial intelligence, which comprises the following steps:

Step S1: data collection and processing, specifically collecting power generation fault data, performing data cleaning and data preprocessing to obtain preprocessed power generation fault data, and labeling the preprocessed power generation fault data;

Step S2: data storage and searching, specifically adopting a consistent hashing algorithm, mapping the storage space and the preprocessed power generation fault data onto a hash ring, and determining the node position of the preprocessed power generation fault data in the storage space by dividing the hash ring;

Step S3: feature extraction and selection, specifically calculating the center point of each feature dimension with a clustering algorithm, grouping the preprocessed power generation fault data according to the center points, mapping the original data to a high-dimensional feature space with a nonlinear feature mapping function, and performing feature extraction in the high-dimensional feature space;

Step S4: establishing a power generation fault diagnosis model, specifically adopting an optimal strategy searching algorithm in which the decision process of power generation fault diagnosis acts as the behavior of an agent that interacts with the environment state and learns according to a feedback reward function, the diagnosis process is optimized according to the feedback reward observed in real time, and a quick response is made in scenarios with high real-time requirements;

Step S5: interpreting the power generation fault diagnosis results, specifically analyzing the importance of each feature sample in the power generation fault diagnosis model with a decision tree structure, interpreting the fault diagnosis result of a single sample point with a local interpretation model, extracting easily understood decision rules with a decision rule extraction algorithm to explain the decision process in the model, and presenting the results of the model by visualization means to help engineers and operators understand them.

Further, in step S1, the data collection and processing comprises the following steps:

Step S11: data acquisition, namely acquiring the required power generation fault data through database queries, sensors and monitoring equipment, wherein the power generation fault data comprises generator operation data, temperature data, fault logs, alarm records and external environment monitoring data;

Step S12: data preprocessing, namely performing data cleaning and data preprocessing on the collected power generation fault data, including eliminating noise, filling missing values and handling abnormal values, to obtain the preprocessed power generation fault data;

Step S13: data labeling, namely labeling data recorded during a fault as the fault category and data recorded during normal operation as the normal category, to obtain the target labels of the training model.
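
As an illustration of steps S12 and S13, the following minimal Python sketch fills missing readings, clips abnormal values and assigns category labels. The median fill, the clipping bounds and the function names `preprocess` and `label` are assumptions chosen for the example; the method itself does not prescribe them.

```python
from statistics import median


def preprocess(readings, low, high):
    """Fill missing values with the median, then clip values to [low, high].

    Both the median fill and the clipping bounds are illustrative assumptions.
    """
    valid = [r for r in readings if r is not None]
    fill = median(valid)
    filled = [fill if r is None else r for r in readings]
    return [min(max(r, low), high) for r in filled]


def label(fault_flags):
    """Map a fault flag to the 'fault' or 'normal' category (step S13)."""
    return ["fault" if flag else "normal" for flag in fault_flags]


# Hypothetical generator temperature readings; None marks a missing sample.
temps = [71.2, None, 70.8, 250.0, 69.9]
clean = preprocess(temps, low=0.0, high=120.0)
labels = label([False, False, False, True, False])
print(clean)
print(labels)
```

Clipping is only one way of handling abnormal values; dropping or flagging such samples would fit the same step.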

Further, in step S2, the data storing and searching includes the following steps:

Step S21: hash value mapping, namely mapping the input preprocessed power generation fault data to fixed-length hash values through the hash function SHA-3, and using the fixed-length hash values as indexes for data storage and searching, comprising the following steps:

Step S211: defining the input as a message M and the output as a hash value H, initializing a 1600-bit state array S₁, and dividing the 1600-bit state array S₁ into a 5×5 matrix;

Step S212: padding the message M, and dividing the padded message M into 1600-bit blocks;

Step S213: expanding each 1600-bit block into an expansion matrix A, and performing a bitwise OR operation between the expansion matrix A and the state array S₁;

Step S214: repeating step S213 12 times to obtain a matrix E;

Step S215: expanding the matrix E into a bit string, and taking a prefix of the bit string as the final hash value H;

Step S22: constructing the hash ring, namely mapping the hash space onto a ring to form the hash ring, wherein each node occupies a position on the hash ring and the position of each node is calculated by the hash function SHA-3;

Step S23: updating the hash ring, namely updating the hash ring when a new node joins the system or an old node leaves the system, wherein a joining node finds its own position on the hash ring through its hash value H, and the data held by a leaving node is redistributed to other positions on the hash ring;

Step S24: data storage, namely performing hash calculation on the data to be stored, finding the corresponding node position on the hash ring according to the calculation result, storing the data on the corresponding node, and keeping backup and redundant copies of the data;

Step S25: data searching, namely, when certain data needs to be found, first performing hash calculation on the data, finding the corresponding position of the data on the hash ring through the hash value H, and then finding the node responsible for storing the data through the consistent hashing algorithm;

Step S26: data migration, namely migrating the data on the affected node to other nodes through the consistent hashing algorithm when a node is added, removed or fails.
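
The storage and lookup flow of steps S22 to S26 can be sketched as follows. The sketch relies on Python's standard `hashlib.sha3_256` rather than re-implementing the hashing procedure of steps S211 to S215, keeps the first 8 bytes of the digest as the ring position, and assigns each key to the first node clockwise from it. The class name `HashRing` and the node and key names are hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Position on the ring: first 8 bytes of the SHA3-256 digest (an assumption)."""
    digest = hashlib.sha3_256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Minimal consistent hash ring without virtual nodes or replication."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of the nodes
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove_node(self, node):
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def node_for(self, key):
        """Return the first node clockwise from the key's ring position."""
        pos = ring_position(key)
        idx = bisect.bisect_right(self._positions, pos) % len(self._positions)
        return self._owners[self._positions[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"fault-record-{i}" for i in range(100)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # a new node joins (step S23)
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys migrated to the new node")
```

Only the keys that fall on the arc taken over by the new node change owner, which is the property that keeps migration in step S26 limited when nodes join or leave.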

Further, in step S3, the feature extraction and selection includes the steps of:

Step S31: data preparation, namely dividing the preprocessed power generation fault data into a training data set and a test data set, and defining the training data set as the original data, wherein the original data comprises the preprocessed power generation fault data and the target labels of the training model;

Step S32: calculating the center point of each feature dimension, namely calculating the average value of all sample points in each feature dimension and taking that average value as the center point of the corresponding feature dimension, using the following formulas:

$$u[j] = \frac{x[1][j] + x[2][j] + \cdots + x[n][j]}{n}$$

$$c[j] = u[j]$$

wherein x[i][j] is the value of the i-th sample point in the j-th feature dimension, u[j] is the average value of all sample points in the j-th feature dimension, c[j] is the center point of the j-th feature dimension, n is the number of sample points, and i is an integer in [1, n];

Step S33: calculating the mapping result, namely mapping the distance between each sample point and the center point of the feature dimensions to a high-dimensional feature space through a nonlinear feature mapping function, thereby introducing a nonlinear relation and obtaining the mapping result of the sample point in the high-dimensional feature space, using the following formula:

$$\psi(x) = \exp\left(-\frac{\lVert x_1 - \Gamma \rVert^2}{2\varepsilon^2}\right)$$

wherein $x_1$ is the feature vector of the original data, $\Gamma$ is the center of the nonlinear feature mapping function, $\varepsilon$ is the parameter controlling the width of the nonlinear feature mapping function, $\lVert x_1 - \Gamma \rVert^2$ is the square of the Euclidean distance between the feature vector of the original data and the center of the nonlinear feature mapping function, and $\psi(x)$ is the mapping result of the sample point in the high-dimensional feature space;

step S34: the feature representation, the mapping result of each sample point in the high-dimensional feature space is used as a new feature of the sample point, and then the new feature of each sample point is combined with the feature of the original data to obtain a feature vector alpha;
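
Steps S32 to S34 can be illustrated with a short Python sketch. It assumes a Gaussian radial basis function as the nonlinear feature mapping function, uses the per-dimension mean as the center and an illustrative width of 2.0. The function names and sample values are hypothetical.

```python
import math


def centre_points(samples):
    """Mean of every feature dimension over all sample points (step S32)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def rbf_map(x, centre, eps):
    """Gaussian mapping of the squared distance to the centre (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-sq_dist / (2 * eps ** 2))


# Hypothetical samples with two feature dimensions each.
samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
centre = centre_points(samples)
mapped = [rbf_map(x, centre, eps=2.0) for x in samples]

# Step S34: append the mapped value to the original features.
augmented = [x + [m] for x, m in zip(samples, mapped)]
print(centre)
print(augmented)
```

A sample lying exactly on the center maps to 1, and the mapped value decays toward 0 as the sample moves away, at a rate set by the width parameter.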

Step S35: training the feature extraction model, namely training the feature extraction model with the feature vector alpha and the target labels of the training model, comprising the following steps:

Step S351: preparing the training set, namely pairing the feature vector alpha with the target labels of the training model to form the training set;

Step S352: feature standardization, namely performing a feature standardization operation on the training set and mapping the value ranges of the features of the training set into the same range to obtain a standard training set, using the following formula:

$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein y is the value after feature standardization, with a value range of [0, 1], x is a feature of the training set, $x_{\min}$ is the minimum value of that feature in the training set, and $x_{\max}$ is the maximum value of that feature in the training set;

Step S353: calculating the Lagrange multipliers, namely obtaining the Lagrange multipliers through the sequential minimal optimization algorithm, using the following formula:

$$W(\alpha) = \sum_{i} \alpha_i - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$

wherein $W(\alpha)$ is the objective function of the support vector machine, $x_i$ and $x_j$ are training samples in the training set, $y_i$ is the label of $x_i$, $y_j$ is the label of $x_j$, $\alpha_i$ is the Lagrange multiplier to be solved, the range of $\alpha_i$ is $0 \le \alpha_i \le C$, C is the relaxation factor, $\alpha_i$ and $y_i$ satisfy the formula $\sum_{i} \alpha_i y_i = 0$, and $K(x_i, x_j)$ is the kernel function;

Step S354: constructing the classification decision function, namely introducing the Lagrange multipliers as multiplication factors and calculating the support vectors to obtain the classification decision function;

Step S355: feature screening, namely calculating the importance of each feature from the weights of the support vectors in the classification decision function, and screening the features to obtain the most representative data features.
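
The sketch below illustrates steps S352 to S355 on a two-sample toy set. It does not run sequential minimal optimization; instead it takes the multipliers (2, 2) and the bias -1, which are the hard-margin solution for this particular toy set, as given, and only evaluates the standard support vector machine dual objective, the decision function and the linear-kernel feature weights. The linear kernel and all function names are assumptions.

```python
def min_max_scale(column):
    """Map one feature column into [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def linear_kernel(a, b):
    return sum(p * q for p, q in zip(a, b))


def dual_objective(alpha, xs, ys, kernel):
    """W(alpha) = sum(alpha_i) - 1/2 * sum_ij alpha_i alpha_j y_i y_j K(x_i, x_j)."""
    n = len(xs)
    pair_sum = sum(
        alpha[i] * alpha[j] * ys[i] * ys[j] * kernel(xs[i], xs[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(alpha) - 0.5 * pair_sum


def decide(x, alpha, xs, ys, bias, kernel):
    """Sign of sum_i alpha_i y_i K(x_i, x) + bias."""
    score = sum(a * y * kernel(sv, x) for a, y, sv in zip(alpha, ys, xs)) + bias
    return 1 if score >= 0 else -1


# Hypothetical raw data: the first feature separates the classes, the second is constant.
raw = [[10.0, 5.0], [20.0, 5.0]]
columns = [min_max_scale(list(col)) for col in zip(*raw)]
xs = [list(row) for row in zip(*columns)]
ys = [-1, 1]

alpha = [2.0, 2.0]  # assumed multipliers (not computed by SMO here)
bias = -1.0         # assumed bias for the same solution

objective = dual_objective(alpha, xs, ys, linear_kernel)
# For a linear kernel the weight of feature j is sum_i alpha_i * y_i * x_i[j].
weights = [sum(a * y * x[j] for a, y, x in zip(alpha, ys, xs)) for j in range(2)]
print(objective, weights, decide([0.8, 0.0], alpha, xs, ys, bias, linear_kernel))
```

In this toy case the constant second feature receives a weight of zero, so a screening step based on the magnitude of the weights would keep only the first feature.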

Further, in step S4, the building of the power generation fault diagnosis model includes the following steps:

Step S41: designing a state representation of the power generation system according to the characteristics of the power generation system and the fault types to be diagnosed;

Step S42: defining the feedback reward function R(s, a, s'), which represents the reward obtained when the state s transitions to the state s' after the action a is taken;

Step S43: constructing the reinforcement learning environment, namely taking the state representation of the power generation system and the most representative data features as the environment state, taking the decision process of power generation fault diagnosis as the behavior of an agent that interacts with the environment state, and learning according to the feedback reward function;

Step S44: constructing the power generation fault diagnosis model, namely using a deep neural network as the function approximator of the power generation fault diagnosis model to learn the optimal decision strategy;

Step S45: training the power generation fault diagnosis model, namely using the optimal strategy searching algorithm, with the decision process of power generation fault diagnosis as the behavior of the agent, to train and optimize the model through interaction with the environment state, comprising the following steps:

Step S451: initializing the Z-value function, namely initializing the Z value of each state-action pair to 0;

Step S452: environment interaction, namely selecting an action according to the optimal decision strategy, interacting with the environment state, and observing the feedback reward and the next state;

Step S453: updating the Z-value function using the following formula:

$$Z(s, a) = Z(s, a) + \theta \left( R + \gamma \max_{a'} Z(s', a') - Z(s, a) \right)$$

wherein Z(s, a) is the Z value of the current state-action pair, θ is the learning rate, R is the current feedback reward, γ is the discount factor, s' is the next state, and a' is the optimal action in the next state;

Step S454: repeating step S452 and step S453 while gradually reducing the learning rate θ until the Z-value function converges, thereby solving the optimal Z-value function, obtaining the optimal strategy that maximizes the expected return, and making decisions and selecting actions according to the optimal strategy;

Step S46: real-time fault diagnosis, namely deploying the trained and optimized power generation fault diagnosis model in a real-time environment, performing fault prediction and diagnosis on data acquired in real time, and providing fault early warnings and maintenance advice.
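
The update rule of step S453 has the form of the tabular Q-learning update, sketched below on a hypothetical two-state environment. The sketch keeps the Z values in a dictionary instead of the deep neural network approximator of step S44, and the states, actions, reward of +1 or -1, exploration rate, discount factor of 0.5 and learning-rate schedule are all assumptions.

```python
import random

STATES = ["normal", "fault"]
ACTIONS = ["report_normal", "report_fault"]


def reward(state, action):
    """Assumed feedback reward: +1 for a correct diagnosis, -1 otherwise."""
    correct = (state == "fault") == (action == "report_fault")
    return 1.0 if correct else -1.0


def train(steps=2000, gamma=0.5, explore=0.2, seed=0):
    rng = random.Random(seed)
    # Step S451: every state-action pair starts with a Z value of 0.
    z = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for t in range(steps):
        theta = 1.0 / (1.0 + 0.01 * t)  # learning rate reduced gradually (step S454)
        # Step S452: mostly greedy action selection with some exploration.
        if rng.random() < explore:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: z[(state, a)])
        r = reward(state, action)
        next_state = rng.choice(STATES)  # toy environment: next state is random
        # Step S453: move Z(s, a) toward R + gamma * max Z(s', a').
        best_next = max(z[(next_state, a)] for a in ACTIONS)
        z[(state, action)] += theta * (r + gamma * best_next - z[(state, action)])
        state = next_state
    return z


z = train()
policy = {s: max(ACTIONS, key=lambda a: z[(s, a)]) for s in STATES}
print(policy)
```

The learned policy reports a fault in the fault state and normal operation in the normal state. A practical model would replace the dictionary with the function approximator and the toy environment with the state representation of step S41.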

Further, in step S5, the power generation failure diagnosis result interpretation includes the steps of:

Step S51: feature importance analysis, namely analyzing the importance of each feature sample in the power generation fault diagnosis model with a decision tree structure, and calculating the Gini coefficient of each feature sample to evaluate the importance of the feature, using the following formula:

$$\mathrm{Gini}(p) = 1 - \sum_{d} p_d^{2}$$

wherein Gini(p) is the Gini coefficient of the feature sample p, and $p_d$ is the proportion of class-d feature samples in the feature sample p;
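
To illustrate how a Gini coefficient can rank features in step S51, the following sketch assumes the standard Gini impurity, one minus the sum of squared class proportions, and scores a feature by the impurity reduction of a single threshold split. The feature names, thresholds and the helper `gini_gain` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels; an empty list counts as pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(values, labels, threshold):
    """Impurity reduction obtained by splitting on values <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical samples: temperature separates the classes, vibration does not.
labels = ["fault", "fault", "normal", "normal"]
temperature = [95.0, 90.0, 60.0, 65.0]
vibration = [1.0, 2.0, 1.0, 2.0]

temperature_gain = gini_gain(temperature, labels, threshold=80.0)
vibration_gain = gini_gain(vibration, labels, threshold=1.5)
print(gini(labels), temperature_gain, vibration_gain)
```

A larger reduction marks a feature as more important for separating fault from normal samples; here the temperature feature removes all impurity while the vibration feature removes none.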

Step S52: interpreting the fault diagnosis result, namely constructing a local interpretation model, obtaining the feature weights of the diagnosis result with a linear regression model, and interpreting the fault diagnosis result of a single sample point, comprising the following steps:

Step S521: selecting, through feature importance analysis, the subset of features that has an important influence on fault diagnosis from all feature samples;

Step S522: performing fault diagnosis on a single sample point with the power generation fault diagnosis model to obtain the fault diagnosis result of the sample point;

Step S523: sampling nearby sample points, namely sampling in the neighborhood of the single sample point to obtain a group of nearby sample points;

Step S524: constructing the local interpretation model, namely training a linear regression model as the local interpretation model, with the features of the single sample point and the feature data of the nearby sample points as inputs and the fault diagnosis results of the sample points as outputs;

Step S525: obtaining the feature weights of the fault diagnosis result of the sample point from the coefficients of the linear regression model, and interpreting the fault diagnosis result of the sample point;
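
Steps S522 to S525 can be sketched as a local linear surrogate fitted around one sample point. The sketch assumes that the diagnosis model returns a numeric fault score, samples neighbors with Gaussian noise, and fits ordinary least squares through the normal equations. The stand-in `black_box` model, the noise scale and all function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(xs, ys):
    """Least-squares fit with intercept; returns [intercept, w_1, ..., w_k]."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def explain(model, point, n_samples=200, scale=0.1, seed=0):
    """Feature weights of a linear model fitted near `point` (steps S523 to S525)."""
    rng = random.Random(seed)
    neighbours = [[v + rng.gauss(0.0, scale) for v in point] for _ in range(n_samples)]
    xs = [list(point)] + neighbours
    ys = [model(x) for x in xs]
    return fit_linear(xs, ys)[1:]


def black_box(x):
    """Hypothetical stand-in for the diagnosis model: returns a fault score."""
    return 3.0 * x[0] - 2.0 * x[1] + 0.5


weights = explain(black_box, [0.7, 0.2])
print(weights)
```

Because the stand-in model is itself linear, the recovered weights match its coefficients; for a nonlinear diagnosis model the weights would describe its behavior only near the chosen sample point.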

Step S53: interpreting the decision process, namely extracting easily understood decision rules from the power generation fault diagnosis model with a decision rule extraction algorithm to explain the decision process in the model, comprising the following steps:

Step S531: feature selection, namely inputting the power generation fault diagnosis model to be interpreted, and using the Gini coefficient to select from the model the feature Q related to the fault diagnosis result of the sample point;

Step S532: decision rule generation, namely mining the association between the fault diagnosis result and the feature Q with an association rule mining method to generate decision rules;

Step S533: decision rule evaluation, namely evaluating the accuracy of the decision rules according to the consistency between the fault diagnosis result of the sample point and the target label of the training model, and evaluating the interpretability of the decision rules according to their length and readability;

Step S534: decision rule screening, namely screening the decision rules with a heuristic strategy, in which local information of the feature Q guides the search direction, important features are searched along that direction, and the decision rules are screened according to the important features to obtain the screened decision rules;

Step S535: explaining the decision rules, namely presenting the results of the power generation fault diagnosis model with visualization techniques through the screened decision rules, and converting the decision rules into easily understood natural language and graphical displays;

Step S54: cross-validation and evaluation optimization, namely verifying the generalization capability of the decision rules by cross-validation and qualitatively analyzing their interpretability, comprising the following steps:

Step S541: W-fold cross-validation, namely dividing the test data set into W subsets of equal size, called folds; in each round, one fold serves as the validation fold and the remaining W-1 folds serve as the training folds; for each such training set, decision rules are extracted from the training folds with the decision rule extraction algorithm, the extracted decision rules are applied to the corresponding validation fold, and the generalization capability of the decision rules is evaluated;

Step S542: calculating the accuracy, recall and F1 value from the W-fold cross-validation results, and evaluating the overall performance of the decision rules;

Step S544: analysis and optimization, namely qualitatively analyzing the interpretability of the decision rules, observing whether the decision rule extraction algorithm provides clear explanations of the decision rules and whether the decision rules agree with the knowledge of domain experts, collecting feedback and opinions, and making timely adjustments and improvements.
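
The W-fold procedure of steps S541 and S542 can be sketched as follows. The rule extractor here is a deliberately simple stand-in that learns one threshold rule of the form "value above t means fault"; it is not the decision rule extraction algorithm of step S53. The data values, the fold assignment by striding and the function names are assumptions, and precision is reported alongside accuracy, recall and F1.

```python
def extract_rule(train):
    """Hypothetical rule extractor: threshold t that best fits 'value > t => fault'."""
    best_t, best_hits = None, -1
    for t in sorted({value for value, _ in train}):
        hits = sum((value > t) == is_fault for value, is_fault in train)
        if hits > best_hits:
            best_t, best_hits = t, hits
    return best_t


def scores(pairs):
    """Accuracy, precision, recall and F1 from (predicted, actual) pairs."""
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    accuracy = sum(p == a for p, a in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


def cross_validate(data, w):
    """Each fold is the validation fold once; the other W-1 folds train the rule."""
    folds = [data[i::w] for i in range(w)]
    pairs = []
    for i, validation in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        t = extract_rule(train)
        pairs += [(value > t, is_fault) for value, is_fault in validation]
    return scores(pairs)


# Hypothetical (temperature, is_fault) samples.
data = [
    (60.0, False), (62.0, False), (65.0, False), (70.0, False),
    (90.0, True), (95.0, True), (97.0, True), (99.0, True),
]
accuracy, precision, recall, f1 = cross_validate(data, w=4)
print(accuracy, precision, recall, f1)
```

In this toy run one normal sample is misclassified when its fold is held out, which shows how the validation folds expose a rule that was fitted too tightly to the training folds.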

The beneficial effects obtained by adopting the scheme of the invention are as follows:

(1) To address the technical problem that the operation data of a power generation system is complex and difficult to store and search effectively, which negatively affects fault diagnosis and operation and maintenance efficiency, a consistent hashing algorithm maps the storage space and the data onto a hash ring, and the node position of the data in the storage space is determined by dividing the hash ring; this narrows the data search range, achieves load balancing and efficient data storage, and effectively improves the data processing efficiency, fault tolerance and scalability of the system;

(2) To address the technical problem that, during power generation fault diagnosis, the most representative data features are difficult to extract and the relationship between the features and power generation faults is difficult to fully understand, which affects the accuracy of the diagnosis results, a clustering algorithm calculates the center point of each feature dimension to group the data samples, a nonlinear feature mapping function maps the original data to a high-dimensional feature space, and feature extraction is performed in that feature space; this reduces the feature dimension, improves the discriminability and representativeness of the features, and yields the most representative data features;

(3) To address the technical problem that current power generation fault diagnosis models rely on a static model to model and predict the occurrence of faults and cannot perceive and adapt to a changing working environment in real time, so that the diagnosis result lags and the maintenance cost of the power generation system increases, an optimal strategy searching algorithm is adopted in which the decision process of power generation fault diagnosis acts as the behavior of an agent that interacts with the environment state and learns according to a feedback reward function; the diagnosis process is optimized according to the feedback reward observed in real time, a quick response is made in scenarios with high real-time requirements, and the maintenance cost of the power generation system is reduced;

(4) To address the technical problem that black-box algorithms lack interpretability and cannot explain the internal operation mode, so that engineers and operators cannot find algorithm defects and improvement points or carry out in-depth analysis, optimization and decision-making, a rule extraction algorithm extracts easily understood decision rules, a decision tree structure is used to analyze feature importance and explain the decision process in the model, a local interpretation model interprets the diagnosis result of a single sample point, and the diagnosis results of the power generation fault diagnosis model are presented by visualization means; this helps engineers and operators understand the diagnosis results and improves working efficiency.

Drawings

FIG. 1 is a schematic flow chart of an artificial intelligence-based power generation fault diagnosis method;

FIG. 2 is a flow chart of step S2;

FIG. 3 is a flow chart of step S3;

FIG. 4 is a flow chart of step S4;

FIG. 5 is a flow chart of step S5.

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention; all other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In the description of the present invention, it should be understood that the terms "upper," "lower," "front," "rear," "left," "right," "top," "bottom," "inner," "outer," and the like indicate orientation or positional relationships based on those shown in the drawings, merely to facilitate description of the invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the invention.

Embodiment one: referring to FIG. 1, the invention provides a power generation fault diagnosis method based on artificial intelligence, which comprises steps S1 to S5 as set out above.

Embodiment two: referring to FIG. 1, based on the above embodiment, the data collection and processing in step S1 comprises steps S11 to S13 as set out above.

Embodiment three: referring to FIG. 1 and FIG. 2, based on the above embodiment, the data storage and searching in step S2 comprises steps S21 to S26 as set out above.

In operation, this scheme adopts a consistent hashing algorithm to map the storage space and the data onto a hash ring and determines the node position of the data in the storage space by dividing the hash ring. This narrows the data search range, achieves load balancing and efficient data storage, and effectively improves the data processing efficiency, fault tolerance and scalability of the system, thereby solving the technical problem that the operation data of a power generation system is complex and difficult to store and search effectively, which negatively affects fault diagnosis and operation and maintenance efficiency.

Embodiment four: referring to FIG. 1 and FIG. 3, based on the above embodiment, the feature extraction and selection in step S3 comprises steps S31 to S35 as set out above.

u[j]=（x[1][j]+x[2][j]+……+x[n][j]）/n；

c[j]=u[j]；

wherein x [ i ] [ j ] is a value representing the ith sample point in the jth feature dimension, u [ j ] is an average value of all sample points in the jth feature dimension, c [ j ] is a center point of the jth feature dimension, and n is the number of sample points;

step S33: calculating a mapping result, mapping the distance between each sample and the center point of the characteristic dimension to a high-dimensional characteristic space by using a nonlinear characteristic mapping function, thereby introducing a nonlinear relation to obtain the mapping result of the sample point in the high-dimensional characteristic space, wherein the formula is as follows:

；

wherein x is ₁ Is the feature vector of the original data, Γ is the center of the nonlinear feature mapping function, ε is the parameter that controls the width of the nonlinear feature mapping function,is the square of the Euclidean distance between the feature vector of the original data and the center of the nonlinear feature mapping function, and ψ (x) is the mapping result of the sample points in the high-dimensional feature space;

y = (x - x_min) / (x_max - x_min);
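The min-max normalization formula above can be illustrated with a short sketch. The function name `min_max_normalize` and the handling of a constant feature (mapped to 0.0) are assumptions; the source does not say how that case is treated.

```python
def min_max_normalize(values):
    """Scale one feature to [0, 1] using y = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    span = x_max - x_min
    return [(x - x_min) / span for x in values]
```

For the feature values 2.0, 4.0, and 6.0 this yields 0.0, 0.5, and 1.0.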

；

In operation, this scheme uses a clustering algorithm to calculate the center point of each feature dimension and group the data samples accordingly, then uses a nonlinear feature mapping function to map the original data to a high-dimensional feature space, in which feature extraction is performed. This reduces the feature dimension, improves the discriminability and representativeness of the features, and yields the most representative data features. It thereby solves the technical problem that, during feature extraction and selection in power generation fault diagnosis, the most representative data features are difficult to extract from the original data and the relationship between the features and the power generation faults is difficult to understand fully, which affects the accuracy of the diagnosis result.

Embodiment five, referring to fig. 1 and 4, the method for establishing a power generation fault diagnosis model in step S4 includes the following steps:

step S453: the Z-value function is updated using the following formula:

In operation, this scheme uses an optimal strategy search algorithm in which the decision process of power generation fault diagnosis is treated as the behavior of an agent that interacts with the environment state and learns according to a feedback reward function. The diagnosis process is optimized according to the feedback reward function observed in real time, so the system can react quickly in scenarios with high real-time requirements and the maintenance cost of the power generation system is reduced. This solves the technical problem that current power generation fault diagnosis models rely on static models to model and predict faults and cannot sense and adapt to a changing working environment in real time, so that diagnosis results lag and the maintenance cost of the power generation system increases.
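The update formula for the Z-value function is not reproduced in this text. As an illustration of how an agent can learn from a feedback reward, the sketch below uses the standard tabular Q-learning update as a stand-in. The learning rate `alpha`, the discount factor `gamma`, the function name `update_z`, and the state and action labels are all assumptions and may not match the formula in the original filing.

```python
from collections import defaultdict


def update_z(z, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One assumed Q-learning style update of the Z-value table.

    Z(s, a) <- Z(s, a) + alpha * (reward + gamma * max_a' Z(s', a') - Z(s, a))
    """
    best_next = max(z[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    z[(state, action)] += alpha * (target - z[(state, action)])
    return z[(state, action)]


# Hypothetical usage: one diagnosis decision followed by an observed reward.
z_table = defaultdict(float)
update_z(z_table, "vibration_high", "inspect_bearing", 1.0,
         "vibration_normal", ["inspect_bearing", "wait"])
```

Each call moves the stored value toward the observed reward plus the discounted best value of the next state, which is how feedback observed in real time can gradually reshape the diagnosis policy.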

Embodiment six, referring to fig. 1 and 5, based on the above embodiment, in step S5, the power generation failure diagnosis result interpretation includes the steps of:

；

step S52: interpreting the fault diagnosis result, namely constructing a local interpretation model, obtaining the feature weights of the diagnosis result using a linear regression model, and interpreting the diagnosis result of a single sample point, comprising the following steps:

step S524: constructing the local interpretation model, namely training a linear regression model that takes the features of the single selected sample point and the feature data of its adjacent sample points as inputs and the fault diagnosis results of those sample points as outputs;
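A minimal sketch of such a local interpretation model is given below: an ordinary least-squares fit over a sample point and its neighbours, whose coefficients serve as the feature weights. Solving the normal equations by Gauss-Jordan elimination, the absence of neighbour weighting, and the function name `fit_local_linear` are assumptions made for the example.

```python
def fit_local_linear(neighbors, outputs):
    """Fit y ~ bias + w1*x1 + ... + wd*xd; return [bias, w1, ..., wd].

    `neighbors` holds the selected sample point and its adjacent points,
    `outputs` the diagnosis results of the model for those points.
    """
    rows = [[1.0] + list(x) for x in neighbors]   # prepend a bias column
    d = len(rows[0])
    # Augmented normal equations (X^T X | X^T y).
    a = [[sum(r[i] * r[j] for r in rows) for j in range(d)]
         + [sum(r[i] * y for r, y in zip(rows, outputs))]
         for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate neighbourhood; use more varied samples")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(d):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][d] / a[i][i] for i in range(d)]
```

The magnitude and sign of each returned weight indicate how strongly, and in which direction, the corresponding feature influences the diagnosis result in the neighbourhood of the selected sample point.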

step S53: interpreting the decision process, namely using a decision rule extraction algorithm to extract easily understood decision rules from the complex model and thereby interpreting the decision process in the power generation fault diagnosis model, comprising the following steps:

step S541: w folds are cross-validated, a test data set is divided into W subsets with the same size, namely folds, for each fold, the W-1 folds are used as validation folds, for each cross-validated training set, a decision rule is extracted from the training folds by utilizing a decision rule extraction algorithm, the generated decision rule is applied to the corresponding validation folds, and the generalization capability of the decision rule is evaluated;

In operation, this scheme uses a rule extraction algorithm to extract easily understood decision rules, interprets the decision process of the model through feature importance analysis and the decision tree structure, interprets the diagnosis result of a single sample point with a local interpretation model, and presents the diagnosis results of the power generation fault diagnosis model by visual means. This helps engineers and operators understand the diagnosis results and improves working efficiency. It solves the technical problem that black-box algorithms lack interpretability and their internal operation cannot be explained, so that engineers and operators cannot easily find algorithm flaws and improvement points, or carry out in-depth analysis, optimization, and decision making.

It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

The invention and its embodiments have been described above without limitation; what is shown in the drawings is only one embodiment of the invention, and the actual construction is not limited to it. In summary, if a person of ordinary skill in the art, informed by this disclosure and without departing from the gist of the invention, devises structures and embodiments similar to this technical solution without creative effort, they shall fall within the scope of protection of the invention.

Claims

1. A power generation fault diagnosis method based on artificial intelligence, characterized in that the method comprises the following steps:

step S3: feature extraction and selection, specifically: calculating the center point of the feature dimension using a clustering algorithm, grouping the preprocessed power generation fault data according to the center point of the feature dimension, and mapping the original data to a high-dimensional feature space using a nonlinear feature mapping function for feature extraction;

step S4: establishing a power generation fault diagnosis model, specifically: using an optimal strategy search algorithm, treating the decision process of power generation fault diagnosis as the behavior of an agent that interacts with the environment state, learning according to a feedback reward function, and optimizing the diagnosis process according to the feedback reward function observed in real time;

step S5: interpreting the power generation fault diagnosis results, specifically: analyzing the importance of each feature sample in the power generation fault diagnosis model using a decision tree structure, interpreting the fault diagnosis result of a single sample point using a local interpretation model, extracting decision rules using a decision rule extraction algorithm, and interpreting the decision process in the power generation fault diagnosis model.

2. The artificial intelligence based power generation fault diagnosis method according to claim 1, wherein: in step S2, the data storage and searching includes the following steps:

step S214: repeating step S213 12 times to obtain a matrix E;

3. The artificial intelligence based power generation fault diagnosis method according to claim 1, wherein: in step S3, the feature extraction and selection includes the steps of:

u[j] = (x[1][j] + x[2][j] + ... + x[n][j]) / n;

c[j] = u[j];

；

wherein x₁ is the feature vector of the original data, Γ is the center of the nonlinear feature mapping function, ε is the parameter that controls the width of the nonlinear feature mapping function, ‖x₁ − Γ‖² is the square of the Euclidean distance between the feature vector of the original data and the center of the nonlinear feature mapping function, and ψ(x) is the mapping result of the sample point in the high-dimensional feature space;

y = (x - x_min) / (x_max - x_min);

wherein y is the value of the feature after normalization, which lies in the range [0, 1], x is a feature of the training set, x_min is the minimum value of that feature in the training set, and x_max is the maximum value of that feature in the training set;

；

4. The artificial intelligence based power generation fault diagnosis method according to claim 1, wherein: in step S4, the building of the power generation fault diagnosis model includes the following steps:

step S453: the Z-value function is updated using the following formula:

step S46: performing real-time fault diagnosis, namely deploying the trained and optimized power generation fault diagnosis model in a real-time environment and performing fault prediction and diagnosis based on data acquired in real time.

5. The artificial intelligence based power generation fault diagnosis method according to claim 1, wherein: in step S5, the power generation failure diagnosis result interpretation includes the steps of:

；

6. The artificial intelligence based power generation fault diagnosis method according to claim 1, wherein: in step S1, the data collection and processing includes the steps of: