CN115859191A

CN115859191A - Fault diagnosis method and device, computer readable storage medium and computer equipment

Info

Publication number: CN115859191A
Application number: CN202211467440.9A
Authority: CN
Inventors: 金渊; 刘秀兰; 陈平; 李香龙; 钱梓锋; 陈慧敏; 张倩; 关宇; 焦然; 杨芮
Original assignee: State Grid Corp of China SGCC; State Grid Beijing Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; State Grid Beijing Electric Power Co Ltd
Priority date: 2022-11-22
Filing date: 2022-11-22
Publication date: 2023-03-28

Abstract

The invention discloses a fault diagnosis method, a fault diagnosis device, a computer readable storage medium and computer equipment. The method comprises the following steps: acquiring initial data of a charging pile; obtaining fault characteristic data of the charging pile based on the initial data; matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result; calculating the probability that the fault characteristic data corresponds to each preset fault type to obtain a calculation result, and obtaining a fault prediction result based on the calculation result; and determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result. The invention solves the technical problems that the fault diagnosis method for the charging pile is not accurate enough, the efficiency is low, and the interpretability of the diagnosis result is poor.

Description

Fault diagnosis method and device, computer readable storage medium and computer equipment

Technical Field

The invention relates to the technical field of new energy and energy conservation, in particular to a fault diagnosis method and device, a computer readable storage medium and computer equipment.

Background

In the related art, a neural network model is generally used for diagnosing faults of the charging pile. With this approach, however, a large number of data samples are needed for model training in order to improve diagnosis precision, training takes a long time, and interpretability is poor. If an analytic model method is used instead, it is difficult to establish an accurate model of a complex nonlinear system.

Therefore, in the related art, there are technical problems that a fault diagnosis method for a charging pile is not accurate enough, efficiency is low, and interpretability of a diagnosis result is poor.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the invention provides a fault diagnosis method, a fault diagnosis device, a computer readable storage medium and computer equipment, which at least solve the technical problems that the fault diagnosis method for a charging pile is not accurate enough, the efficiency is low, and the interpretability of a diagnosis result is poor.

According to an aspect of an embodiment of the present invention, there is provided a fault diagnosis method including: acquiring initial data of a charging pile; obtaining fault characteristic data of the charging pile based on the initial data; matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result; calculating the probability of the fault characteristic data corresponding to the preset fault type to obtain a calculation result, and obtaining a fault prediction result based on the calculation result; and determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result.

Optionally, before matching the fault characteristic data with the preset two-classification rule to obtain a fault matching result, the method further includes: acquiring a plurality of target sample sets, wherein each target sample set in the plurality of target sample sets comprises sample fault characteristic data and sample fault labels corresponding to the sample fault characteristic data; according to the fault label types of the sample fault labels, carrying out two-classification on the plurality of target sample sets to obtain sample sets under a plurality of fault types; and obtaining the preset two-classification rule based on the corresponding relation between the sample fault characteristic data in the sample sets under the plurality of fault types and the fault types.

Optionally, obtaining a plurality of target sample sets includes: acquiring a plurality of groups of sample initial data; carrying out feature extraction on the multiple groups of sample initial data to obtain multiple groups of initial sample feature data; carrying out characteristic screening on the multiple groups of initial sample characteristic data to obtain multiple groups of sample fault characteristic data; determining sample fault labels respectively corresponding to the multiple groups of sample fault characteristic data; and determining a plurality of target sample sets based on the plurality of groups of sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data.

Optionally, determining the plurality of target sample sets based on the plurality of groups of sample fault characteristic data and the sample fault labels respectively corresponding to them includes: determining a plurality of initial sample sets based on the plurality of groups of sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data; respectively calculating the sample label proportions of the plurality of initial sample sets under each fault label type according to the fault label types of the sample fault labels; respectively oversampling the sample fault characteristic data in the plurality of initial sample sets based on the sample label proportions to obtain a plurality of groups of oversampled sample fault characteristic data; and obtaining the plurality of target sample sets based on the plurality of groups of oversampled sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of oversampled sample fault characteristic data.

Optionally, before calculating the probability that the fault characteristic data corresponds to the preset fault type to obtain a calculation result and obtaining a fault prediction result based on the calculation result, the method further includes: based on a random forest algorithm, a fault type prediction model corresponding to a preset fault type is constructed and obtained by utilizing a plurality of target sample sets.

Optionally, calculating the probability that the fault feature data corresponds to the preset fault type to obtain a calculation result, and obtaining a fault prediction result based on the calculation result, includes: determining data which fails to be matched with a preset two-classification rule in the fault characteristic data as fault prediction data; respectively calculating the probability of the fault prediction data under various preset fault types by using a fault type prediction model to obtain a plurality of probability calculation results; and determining the fault type with the highest probability value in the probability results as a fault prediction result.

According to another aspect of the embodiments of the present invention, there is also provided a fault diagnosis apparatus including: the first acquisition module is used for acquiring initial data of the charging pile; the first determining module is used for obtaining fault characteristic data of the charging pile based on the initial data; the matching module is used for matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result; the calculation module is used for calculating the probability of the fault characteristic data corresponding to the preset fault type to obtain a calculation result and obtaining a fault prediction result based on the calculation result; and the second determining module is used for determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result.

Optionally, the apparatus further comprises: the second acquisition module is used for acquiring a plurality of target sample sets, wherein the target sample sets in the plurality of target sample sets comprise sample fault characteristic data and sample fault labels corresponding to the sample fault characteristic data; the two-classification module is used for carrying out two-classification on the plurality of target sample sets according to the fault label types of the sample fault labels to obtain sample sets under a plurality of fault types; and the rule determining module is used for obtaining a preset two-classification rule based on the corresponding relation between the sample fault feature data in the sample set under the multiple fault types and the fault types.

According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium including a stored program, wherein when the program runs, the apparatus where the computer-readable storage medium is located is controlled to execute any one of the above fault diagnosis methods.

According to another aspect of the embodiments of the present invention, there is also provided a computer device, including: a memory and a processor, the memory storing a computer program; a processor for executing the computer program stored in the memory, the computer program when running causing the processor to perform any of the above fault diagnosis methods.

In the embodiment of the invention, a random forest algorithm is adopted. Fault characteristic data of the charging pile is determined based on the initial data of the charging pile. The fault characteristic data is then matched with the preset two-classification rule to determine the fault type that can be identified directly by matching, namely the fault matching result. The probability that the fault characteristic data corresponds to each preset fault type is also calculated to determine the fault type to which the charging pile possibly corresponds, namely the fault prediction result. Based on the fault matching result and the fault prediction result, the fault type of the charging pile can be determined rapidly and accurately. This achieves the technical effect of keeping the model's judgments interpretable while the model runs fast and reliably, and thereby solves the technical problems that the fault diagnosis method for the charging pile is not accurate enough, has low efficiency, and yields diagnosis results with poor interpretability.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flow chart of a fault diagnosis method provided according to an embodiment of the invention;

FIG. 2 is a flowchart of a charging pile fault diagnosis method based on a random forest algorithm according to an alternative embodiment of the present invention;

FIG. 3 is a schematic diagram of a decision tree structure provided in accordance with an alternative embodiment of the present invention;

FIG. 4 is a schematic diagram of a random forest structure provided in accordance with an alternative embodiment of the present invention;

FIG. 5 is a block diagram of a fault diagnosis apparatus provided according to an embodiment of the present invention.

Detailed Description

In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In accordance with an embodiment of the present invention, a method embodiment of fault diagnosis is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, for example as a set of computer executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the one given here.

Fig. 1 is a flowchart of a fault diagnosis method provided according to an embodiment of the present invention, and as shown in fig. 1, the method includes the following steps:

step S102, acquiring initial data of a charging pile;

step S104, obtaining fault characteristic data of the charging pile based on the initial data;

step S106, matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result;

step S108, calculating the probability of the fault characteristic data corresponding to the preset fault type to obtain a calculation result, and obtaining a fault prediction result based on the calculation result;

and step S110, determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result.

Through the above steps, a random forest algorithm is adopted. Fault characteristic data of the charging pile is determined based on the initial data of the charging pile. The fault characteristic data is then matched with the preset two-classification rule to determine the fault type that can be identified directly by matching, namely the fault matching result. The probability that the fault characteristic data corresponds to each preset fault type is also calculated to determine the fault type to which the charging pile possibly corresponds, namely the fault prediction result. Based on the fault matching result and the fault prediction result, the fault type of the charging pile can be determined rapidly and accurately. This achieves the technical effect of keeping the model's judgments interpretable while the model runs fast and reliably, and thereby solves the technical problems that the fault diagnosis method for the charging pile is not accurate enough, has low efficiency, and yields diagnosis results with poor interpretability.

It should be noted that the initial data may include a transaction record, an alarm record, an operation record, a defect elimination record, and the like of the charging pile.

As an optional embodiment, before matching the fault characteristic data with the preset two-classification rule to obtain a fault matching result, the method further includes: acquiring a plurality of target sample sets, wherein each target sample set in the plurality of target sample sets comprises sample fault characteristic data and sample fault labels corresponding to the sample fault characteristic data; according to the fault label types of the sample fault labels, carrying out two-classification on the plurality of target sample sets to obtain sample sets under a plurality of fault types; and obtaining the preset two-classification rule based on the corresponding relation between the sample fault characteristic data in the sample sets under the plurality of fault types and the fault types.

The preset two-classification rule may be determined based on sample data. For example, the sample fault characteristic data and the sample fault label corresponding to it are determined as one target sample set, and the plurality of target sample sets are divided into two classes according to the type of the sample fault label in each sample set: the target sample sets whose sample fault label is of type A are placed in a first class, and all other target sample sets are placed in the other class. In this way the multi-classification problem of the original fault diagnosis process is converted into a plurality of two-classification problems, and the preset two-classification rule may be obtained from the corresponding relation between the sample fault characteristic data in the sample sets under the plurality of fault types and the fault types.

When fault diagnosis of the charging pile is carried out based on the preset two-classification rule, the fault characteristic data of the charging pile can be matched directly against the rule. When the fault characteristic data conforms to a certain fault classification label, the matching succeeds, and the fault type corresponding to the charging pile, namely the fault matching result, can be determined directly.

As an alternative embodiment, obtaining a plurality of target sample sets comprises: acquiring a plurality of groups of sample initial data; carrying out feature extraction on the multiple groups of sample initial data to obtain multiple groups of initial sample feature data; carrying out characteristic screening on the multiple groups of initial sample characteristic data to obtain multiple groups of sample fault characteristic data; determining sample fault labels respectively corresponding to the multiple groups of sample fault characteristic data; and determining a plurality of target sample sets based on the plurality of groups of sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data.

Before obtaining the sample characteristic data based on the sample initial data, the sample initial data may be preprocessed. For example, data such as transaction records, alarm records, operation records, and defect elimination records may be converted into a data set in a matrix format suitable for a random forest algorithm, or the sample initial data may be subjected to data cleansing, for example, processing of abnormal values, missing values, and repeated data. For abnormal values, the 3σ principle can be adopted: for data that obeys a normal distribution, the region μ ± 3σ should contain 99.7% of the data, and data outside this interval are removed directly. Records containing missing values are removed directly, because the proportion of missing values in the data is small. Repeated data are also removed directly.

When the sample characteristic data is obtained based on the sample initial data, a characteristic related to the fault may be constructed based on the sample initial data, for example, the characteristic may be a description of a charging transaction process, a category of an alarm outside the transaction, an alarm start stage, an alarm end stage, a transaction electric quantity, a transaction amount, an alarm reason, and the like.

By carrying out feature screening on the multiple groups of initial sample characteristic data, features which have little influence on faults can be screened out, and the more essential characteristic data in the initial sample characteristic data, namely the multiple groups of sample fault characteristic data, can be obtained. For example, when variance filtering is adopted, a small variance for a feature indicates that the feature contributes little to distinguishing samples, so the feature can be filtered out. When relevance filtering is adopted, a feature that is strongly related to the labels can provide a large amount of information, so features that are only weakly related to the labels can be filtered out. The chi-square test can also be adopted to compare two or more sample rates and to analyze the association between two categorical variables.

When the sample fault labels respectively corresponding to the multiple groups of sample fault characteristic data are determined, a mode of generating labels based on the characteristic data may be adopted, for example, the sample fault labels may be generated through steps of charging equipment fault device keyword identification, process flow operation keyword identification and fault cause generation.

As an alternative embodiment, the determining the plurality of target sample sets based on the plurality of sets of sample fault feature data and the sample fault labels respectively corresponding to the plurality of sets of sample fault feature data includes: determining a plurality of initial sample sets based on the plurality of groups of sample fault characteristic data and sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data; respectively calculating the corresponding sample label proportion of the plurality of initial sample sets under the fault label type according to the fault label type of the sample fault label; respectively oversampling sample fault characteristic data in a plurality of initial sample sets based on the sample label proportion to obtain a plurality of groups of oversampled sample fault characteristic data; and obtaining a plurality of target sample sets based on the plurality of groups of over-sampled sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of over-sampled sample fault characteristic data.

After the plurality of initial sample sets are determined, it should be noted that sample sets with different sample label proportions train differently. For example, when the sample label proportion is relatively low, there are few samples under that label, and training directly on such data can cause overfitting and thereby reduce the accuracy of the fault diagnosis result. Therefore, in the present embodiment, the sample label proportions corresponding to the different initial sample sets are calculated respectively, and the sample sets with the sample label proportions lower than the predetermined proportion (e.g., 100.

As an alternative embodiment, before calculating the probability that the fault feature data corresponds to the preset fault type to obtain the calculation result and obtaining the fault prediction result based on the calculation result, the method further includes: based on a random forest algorithm, a fault type prediction model corresponding to a preset fault type is constructed and obtained by utilizing a plurality of target sample sets. By adopting a random forest algorithm, a decision tree corresponding to a preset fault type, namely a fault type prediction model, is constructed, wherein the preset fault type can be various, and similarly, a plurality of decision trees can be correspondingly obtained, and each decision tree can classify each fault feature data.

As an alternative embodiment, calculating the probability that the fault characteristic data corresponds to the preset fault type to obtain a calculation result, and obtaining the fault prediction result based on the calculation result, includes: determining data in the fault characteristic data which fails to match the preset two-classification rule as fault prediction data; respectively calculating the probability of the fault prediction data under each of the preset fault types by using the fault type prediction model to obtain a plurality of probability calculation results; and determining the fault type with the maximum probability value among the probability calculation results as the fault prediction result. If the fault characteristic data of the charging pile contains data that cannot be matched with the preset two-classification rule, fault diagnosis of the charging pile can continue by means of model prediction. For example, the fault type prediction model, namely the plurality of decision trees obtained based on the random forest algorithm, can be used to classify the fault prediction data and calculate the probability that it conforms to each fault type. After the plurality of probability calculation results are obtained, the fault type with the maximum probability value can be determined as the fault prediction result; alternatively, several fault types whose probability values exceed a preset threshold can be determined as the fault prediction result.
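The two-stage decision described above (rule matching first, model prediction only for data that fails to match) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the rule and tree callables, the field name `alarm_reason`, the fault type names, and the use of the fraction of tree votes as the probability estimate are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Sequence

Features = Dict[str, object]
Rule = Callable[[Features], bool]  # hypothetical: True when the record fits a fault type
Tree = Callable[[Features], str]   # hypothetical: one decision tree returning a fault type


def match_rules(x: Features, rules: Dict[str, Rule]) -> Optional[str]:
    """Return the first fault type whose two-classification rule matches, else None."""
    for fault_type, rule in rules.items():
        if rule(x):
            return fault_type
    return None


def forest_probabilities(x: Features, trees: Sequence[Tree]) -> Dict[str, float]:
    """Assumed estimate: the share of trees that vote for each fault type."""
    votes = Counter(tree(x) for tree in trees)
    return {fault: count / len(trees) for fault, count in votes.items()}


def diagnose(x: Features, rules: Dict[str, Rule], trees: Sequence[Tree],
             threshold: Optional[float] = None) -> List[str]:
    """Rule match if possible; otherwise the most probable type(s) from the forest."""
    matched = match_rules(x, rules)
    if matched is not None:
        return [matched]
    probs = forest_probabilities(x, trees)
    if threshold is not None:
        above = [f for f, p in probs.items() if p > threshold]
        if above:
            return sorted(above, key=lambda f: probs[f], reverse=True)
    return [max(probs, key=probs.get)]


# Toy rule set and toy "trees" (all hypothetical).
rules = {"charging gun fault": lambda x: x.get("alarm_reason") == "gun lock failure"}
trees = [
    lambda x: "charging module fault",
    lambda x: "charging module fault",
    lambda x: "TCU fault",
]

print(diagnose({"alarm_reason": "gun lock failure"}, rules, trees))
print(diagnose({"alarm_reason": "unknown"}, rules, trees))
print(diagnose({"alarm_reason": "unknown"}, rules, trees, threshold=0.3))
```

Passing `threshold` corresponds to the variant in which several fault types above a preset threshold are reported; omitting it reports only the most probable type.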

Based on the above embodiments and alternative embodiments, the present invention provides an alternative implementation, which is described below.

Fig. 2 is a flowchart of a charging pile fault diagnosis method based on a random forest algorithm according to an alternative embodiment of the present invention, which is described below as shown in fig. 2.

(1) Data pre-processing

The transaction records, alarm records, operation records and defect elimination records related to the fault diagnosis of the charging pile are converted into an original sample set (Sample set) suitable for a random forest algorithm.

1) Data cleansing

Data cleansing mainly handles abnormal values, missing values and repeated data. For abnormal values, the 3σ principle is adopted: for data that obeys a normal distribution, the region μ ± 3σ should contain 99.7% of the data, and data outside this interval are removed directly. Records containing missing values are removed directly, because the proportion of missing values in the data is small. Repeated data are also removed directly.
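The three cleansing rules can be sketched as below. This is an illustrative sketch only: the record layout (a list of dictionaries) and the field names `pile_id` and `energy_kwh` are assumptions, and the population standard deviation is used for σ.

```python
import statistics
from typing import Dict, List


def clean_records(records: List[Dict], numeric_field: str) -> List[Dict]:
    """Drop records with missing values, exact duplicates, and 3-sigma outliers."""
    # 1. Remove records that contain any missing value.
    complete = [r for r in records if all(v is not None for v in r.values())]

    # 2. Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for r in complete:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    if not unique:
        return []

    # 3. Keep only values inside mu +/- 3 sigma for the chosen numeric field.
    values = [r[numeric_field] for r in unique]
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return unique
    return [r for r in unique if abs(r[numeric_field] - mu) <= 3 * sigma]


# Toy data (hypothetical): 20 ordinary records, one outlier,
# one record with a missing value, and one exact duplicate.
records = [{"pile_id": i, "energy_kwh": 10.0 + (i % 3)} for i in range(20)]
records.append({"pile_id": 20, "energy_kwh": 1000.0})  # abnormal value
records.append({"pile_id": 21, "energy_kwh": None})    # missing value
records.append({"pile_id": 0, "energy_kwh": 10.0})     # repeated data

cleaned = clean_records(records, "energy_kwh")
print(len(cleaned))
```

Note that the 3σ rule only flags a value when enough samples are present; with very few records a single extreme value inflates σ so much that it stays inside the interval.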

2) Construction of data tags

The fault elimination process described in the defect elimination data is summarized, and the specific cause of each fault is generated as the label data of the diagnosis model. The processing flow can be divided into the following three steps: charging equipment fault device keyword identification, processing flow operation keyword identification, and fault cause generation.

1. Charging equipment fault device keyword identification. A charging pile contains a large number of components and, owing to human factors, the same device may be referred to by different names, which increases the difficulty of identifying the faulty device to some extent. It is therefore necessary to establish a complete and accurate device identification dictionary. Text mining is carried out on a large amount of historical defect elimination data using jieba (a Chinese word segmentation tool), keywords of charging equipment fault devices (such as charging module, charging gun, TCU and the like) are extracted, and a dictionary containing the key devices of the charging equipment is compiled.

2. Processing flow operation keyword identification. Text mining is carried out on a large amount of historical defect elimination data using jieba, operation keywords of the charging equipment (such as replacement, adjustment and the like) are extracted, and a dictionary containing the operation keywords of the charging equipment is compiled.

Jieba is a simple, efficient and flexible Python library for Chinese word segmentation. Its principle is based on a statistical dictionary: a prefix dictionary is constructed first; the input sentence is segmented using the prefix dictionary to obtain all possible segmentations, and a directed acyclic graph is constructed according to the segmentation positions; the maximum-probability path is then calculated by a dynamic programming algorithm, which finds the most probable segmentation based on word frequency. For unknown words, an HMM model based on the word-forming capability of Chinese characters is adopted, and the Viterbi algorithm is used for calculation and part-of-speech tagging; keywords are extracted based on the TF-IDF and TextRank models respectively.

Finally, a specific fault cause, namely the label of the sample set, is generated from the identified fault device and operation keywords according to the business knowledge of personnel.
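The dictionary-based segmentation principle described above (candidate words from a dictionary, a directed acyclic graph of segmentation positions, and a maximum-probability path found by dynamic programming) can be illustrated with a small self-contained sketch. It does not use jieba itself and omits the HMM handling of unknown words; the word frequencies, the two keyword dictionaries, and the label format are toy assumptions.

```python
import math
from typing import Dict, List


def build_dag(sentence: str, freq: Dict[str, int]) -> Dict[int, List[int]]:
    """For each start index, list the end indices of dictionary words starting there."""
    n = len(sentence)
    dag = {}
    for i in range(n):
        ends = [j for j in range(i, n) if freq.get(sentence[i:j + 1], 0) > 0]
        dag[i] = ends or [i]  # fall back to a single character
    return dag


def segment(sentence: str, freq: Dict[str, int]) -> List[str]:
    """Pick the segmentation with the highest total log-probability."""
    n = len(sentence)
    log_total = math.log(sum(freq.values()))
    dag = build_dag(sentence, freq)
    route = {n: (0.0, 0)}
    # Dynamic programming from the end of the sentence back to the start.
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(freq.get(sentence[i:j + 1], 0) or 1) - log_total
             + route[j + 1][0], j)
            for j in dag[i]
        )
    words, i = [], 0
    while i < n:
        j = route[i][1] + 1
        words.append(sentence[i:j])
        i = j
    return words


# Toy frequency dictionary and keyword dictionaries (hypothetical values).
freq = {"更换": 50, "充电": 80, "模块": 60, "充电模块": 40}
device_dict = {"充电模块", "充电枪", "TCU"}
operation_dict = {"更换", "调整"}

words = segment("更换充电模块", freq)
device = next((w for w in words if w in device_dict), None)
operation = next((w for w in words if w in operation_dict), None)
label = f"{device}:{operation}"  # illustrative label format only
print(words, label)
```

In this toy example the single word 充电模块 (charging module) is preferred over the split 充电 / 模块 because its path has the higher total log-probability, which is why a curated device dictionary matters for identifying the faulty device.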

3) Data merging

The known transaction records, alarm records and operation records are merged to obtain the relevant information of each charging process (including the pile number, the transaction time, the transaction amount, the alarm reason and the like).

4) Feature generation

Features related to the fault are constructed through business knowledge from the charging process information obtained in step 3) (for example, a description of the charging transaction process, the category of alarms outside the transaction, the alarm start stage, the alarm end stage and the like). These are combined with the original related information and the labels and converted into a data set Smtrigx in k × n matrix format, where k is the number of historical records and n is the number of fault diagnosis influencing factors plus 1. In the matrix-format data set, each row represents one record with a known fault cause, and the columns hold the factors that lead to the fault (such as transaction electric quantity, transaction amount, alarm reason, alarm start stage, alarm end stage and the like) together with the fault cause itself.

5) Feature screening

The features are filtered in sequence through variance filtering, correlation filtering and the chi-square test, and features which have little influence on the fault cause are screened out.

1. Variance filtering. A small variance for a feature indicates that the samples hardly differ on that feature: most values of the feature are the same, or the whole feature even takes a single value, so the feature contributes nothing to distinguishing samples.

2. Relevance filtering. In general, a feature that is strongly related to the label can provide a large amount of information. A feature that is unrelated to the label wastes computation and may also introduce noise into the model.

3. Chi-square test. The chi-square test is a widely used hypothesis testing method for count data. It is a nonparametric test and is mainly used to compare two or more sample rates and to analyze the association between two categorical variables. The idea of the chi-square test is to assume that the two variables are independent, calculate the distribution expected under independence, and compare it with the observed distribution. Whether the hypothesis is tenable is then determined by consulting the chi-square table, which provides a p-value corresponding to the calculated chi-square value. A significance level of 0.01 is used as the boundary for the p-value: a feature with a p-value less than 0.01 is regarded as a relevant feature.
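The three filters can be sketched in sequence as follows. This is an illustration under assumptions rather than the patented procedure: features and labels are taken to be 0/1 encoded, Pearson correlation stands in for the unspecified relevance measure, the thresholds `min_variance` and `min_abs_corr` and the feature names are hypothetical, and only the 0.01 significance level comes from the text. For a 2 × 2 table the chi-square statistic has one degree of freedom, so its p-value equals erfc(sqrt(chi2 / 2)).

```python
import math
from statistics import fmean, pvariance
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def chi_square_2x2(feature: Sequence[int], label: Sequence[int]) -> Tuple[float, float]:
    """Chi-square statistic and p-value for two 0/1 variables (1 degree of freedom)."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(r) for r in observed]
    col = [observed[0][c] + observed[1][c] for c in (0, 1)]
    chi2 = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n  # count expected under independence
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def screen(features: Dict[str, List[int]], label: List[int],
           min_variance: float = 0.0, min_abs_corr: float = 0.1,
           alpha: float = 0.01) -> List[str]:
    """Apply variance, correlation and chi-square filtering in that order."""
    kept = []
    for name, values in features.items():
        if pvariance(values) <= min_variance:
            continue  # variance filtering
        if abs(pearson(values, label)) < min_abs_corr:
            continue  # relevance filtering
        if chi_square_2x2(values, label)[1] >= alpha:
            continue  # chi-square test
        kept.append(name)
    return kept


label = [1] * 10 + [0] * 10
features = {
    "alarm_in_start_stage": [1] * 10 + [0] * 10,  # follows the label exactly
    "weekday_flag": [0, 1] * 10,                  # independent of the label
    "constant_flag": [1] * 20,                    # zero variance
}
print(screen(features, label))
```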

6) Generating a sample set

An original sample set suitable for the random forest algorithm is generated from the data feature sequence {x1, x2, …, xm} determined in step 5). The sample set Sample is a k×(m+1) matrix: each row represents the data of one record with a known fault reason, and the columns hold the core data feature sequence {x1, x2, …, xm} and the fault reason obtained in the step 15.

(2) Fault classification and oversampling

Fault diagnosis of the charging pile is a multi-classification problem with dozens of fault types. A one-vs-all (one-to-many) strategy is adopted to convert the multi-classification problem into a plurality of two-classification problems. Because different faults occur with different probabilities during operation of the charging pile, the samples are imbalanced, so oversampling is performed on all sample sets.

1) Let the number of label types constructed from the deletion records be a. The samples of one class are treated as one class and the samples of the remaining a−1 classes as the other class, which generates a sample sets {Sample set1, Sample set2, …, Sample seta} that have the same features as the original sample set but different labels.

2) For the sample set with the sample label proportion smaller than 100 in the sample set, the SMOTE oversampling method is adopted to solve the problem of sample imbalance; for the sample set with the sample label ratio larger than 100.
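The one-vs-all relabelling of step 1) and the imbalance check of step 2) can be sketched as below. The text does not define the sample label proportion precisely; this sketch assumes it is the majority count divided by the minority count, and the function names are hypothetical.

```python
from collections import Counter


def one_vs_all_sets(samples, labels):
    """Return {fault_label: (samples, binary_labels)}: 1 for that fault, 0 for all others."""
    return {
        target: (samples, [1 if y == target else 0 for y in labels])
        for target in sorted(set(labels))
    }


def imbalance_ratio(binary_labels):
    """Majority count divided by minority count (an assumed definition)."""
    counts = Counter(binary_labels)
    minority = min(counts.get(0, 0), counts.get(1, 0))
    majority = max(counts.get(0, 0), counts.get(1, 0))
    return float("inf") if minority == 0 else majority / minority
```

Every generated set shares the same feature rows and differs only in its label column, which matches the description of the a sample sets.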

SMOTE oversampling steps are as follows:

1. for each sample x in the minority class, calculating the Euclidean distance from x to all other samples in the minority class sample set to obtain the k nearest neighbors of x;

2. setting a sampling proportion according to the sample imbalance proportion to determine a sampling multiplying power N, randomly selecting a plurality of samples from k neighbors of each minority sample x, and assuming that the selected neighbors are xn;

3. for each randomly selected neighbor xn, a new sample is constructed according to the following formula with the original sample
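The formula referred to in step 3 is not reproduced in this text. The standard SMOTE interpolation is x_new = x + rand(0, 1) × (xn − x), where rand(0, 1) is a random number between 0 and 1; the sketch below assumes that form. The function names, the default k and the parameter n_per_sample (standing in for the sampling rate N) are illustrative choices.

```python
import math
import random


def k_nearest(x, minority, k):
    """Return the k minority samples closest to x by Euclidean distance, excluding x itself."""
    others = [s for s in minority if s is not x]
    return sorted(others, key=lambda s: math.dist(x, s))[:k]


def smote(minority, k=5, n_per_sample=1, seed=0):
    """Create n_per_sample synthetic samples for each minority sample."""
    rng = random.Random(seed)
    synthetic = []
    for x in minority:
        neighbours = k_nearest(x, minority, k)
        if not neighbours:  # a single minority sample has no neighbour to interpolate with
            continue
        for _ in range(n_per_sample):
            xn = rng.choice(neighbours)
            gap = rng.random()  # rand(0, 1) in the standard formula
            synthetic.append([a + gap * (b - a) for a, b in zip(x, xn)])
    return synthetic
```

Each synthetic sample lies on the line segment between a minority sample and one of its neighbours, so the new points stay inside the region already occupied by the minority class.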

(3) Model training

Training all sample sets by utilizing an ensemble learning random forest algorithm to obtain a plurality of charging pile fault diagnosis models aiming at different fault reasons and meeting error requirements.

1) Sampling a sample set

j samples (the sub-training set) are drawn at random from the original training set with replacement, in j draws, so that the probability of each sample being drawn in a single draw is 1/j. The samples that are never drawn constitute the out-of-bag (OOB) data set, which serves as the final test set.
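The sampling step can be sketched as follows, reading the text as a bootstrap in which the training set holds j samples and j draws are made with replacement. The function name is hypothetical and the code works on sample indices rather than on the records themselves.

```python
import random


def bootstrap_sample(n_samples, seed=None):
    """Draw n_samples indices with replacement; indices never drawn form the OOB set."""
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag
```

Because sampling is with replacement, some indices appear more than once in the in-bag list while others are left out entirely, and those left-out indices are what the text calls the out-of-bag data set.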

2) Extracting features

m features are randomly extracted from a feature set with a total of M features to form a feature subset, where m < M.

3) Feature selection

The Gini index of each feature in the node's data set is calculated with respect to that data set; the feature with the minimum Gini index and its corresponding split point are selected as the optimal feature and the optimal split point; two child nodes are generated from the node, and the remaining training data are distributed to the two child nodes.
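Selecting the split with the minimum Gini index can be sketched as below. The sketch assumes numeric features and splits of the form "value <= threshold", scores each candidate by the weighted Gini index of the two child nodes, and uses hypothetical function names.

```python
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, feature_indices):
    """Return (weighted_gini, feature_index, threshold) with the minimum weighted Gini index."""
    best = None
    n = len(rows)
    for f in feature_indices:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:  # skip splits that leave one child empty
                continue
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best
```

The feature_indices argument is where the random feature subset of step 2) would be passed in, for example the result of random.sample(range(M), m).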

4) Generating CART decision trees

Step 3) is repeated on the sample subset of each child node, and nodes are split recursively until all leaf nodes are generated. The decision tree is a very fast classification model and one of the most classical classification algorithms; it requires neither prior assumptions nor normalization of feature values.

Specifically, the decision tree algorithm divides the root node into two child nodes according to a set threshold under a given division rule, and repeats this step at the child nodes until a classification result is obtained. Fig. 3 is a schematic diagram of a decision tree structure according to an alternative embodiment of the present invention. As shown in fig. 3, when drawn as a flowchart the process resembles a binary tree: N0 is the root node, which has no incoming edge and zero or two outgoing edges pointing to internal nodes Ni (i = 1, 2, 3, 4); an internal node has one incoming edge and two outgoing edges, and the outgoing edges point to terminal nodes, i.e., the final classification results Cr.

5) Random forest

And repeating the steps 2) to 4) to obtain j different decision trees.

6) Test data

Each decision tree classifies each piece of data in the test set; the j classification results are counted, and the class with the most votes is the final class of the sample. Fig. 4 is a schematic diagram of a random forest structure provided in accordance with an alternative embodiment of the present invention.
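The voting step can be sketched as follows. Each trained decision tree is represented here by a plain callable that maps a sample to a class, which is a stand-in for illustration only; the function names are hypothetical.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class that received the most votes."""
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Classify a sample with every tree and return the majority class."""
    return majority_vote([tree(sample) for tree in trees])
```

The text does not say how ties are broken; in this sketch a tie goes to whichever class was voted for first.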

7) Determining a fault diagnosis model

The steps of (3) are performed on all sample sets generated in step (2) to obtain a − b fault diagnosis models, where a is the number of fault label types and b is the number of sample sets whose sample imbalance degree is larger than 100.

(4) Fault diagnosis

The data of the charging pile requiring fault diagnosis are input into all models to obtain the probability of each fault; the fault with the maximum probability is the fault diagnosis result.

1) The data of the charging pile requiring fault diagnosis are matched against the rules generated in step (2); if the data match a rule, the corresponding fault is taken as the fault diagnosis result;

2) The data that were not matched in the previous step are input into all fault diagnosis models to obtain the probability of each fault; the fault with the maximum probability is the fault diagnosis result.
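The two-stage diagnosis can be sketched as below: rules are tried first, and only unmatched data reach the models. The rule and model representations are assumptions made for illustration: a rule is a (predicate, fault type) pair, and each per-fault model is a callable returning a probability.

```python
def diagnose(record, rules, models):
    """Return (fault_type, source), where source is "rule" or "model".

    rules:  list of (predicate, fault_type) pairs; predicate(record) -> bool
    models: dict mapping fault_type -> callable(record) -> probability
    """
    # Stage 1: a record that matches a rule is diagnosed directly.
    for predicate, fault_type in rules:
        if predicate(record):
            return fault_type, "rule"
    # Stage 2: otherwise query every per-fault model and take the most probable fault.
    probabilities = {fault: model(record) for fault, model in models.items()}
    return max(probabilities, key=probabilities.get), "model"
```

A usage example with hypothetical field names and fault types:

```python
rules = [(lambda r: r["alarm"] == "E-stop", "emergency_stop")]
models = {"comm_fault": lambda r: 0.2, "overvoltage": lambda r: 0.7}
result = diagnose({"alarm": "none"}, rules, models)
```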

In summary, the alternative embodiments of the present invention have the following advantages:

1. in the charging pile fault diagnosis method based on the random forest algorithm provided by the optional embodiment of the invention, charging pile fault diagnosis is solved with a machine learning algorithm: the problem is abstracted into a mathematical multi-classification problem, the multi-classification problem is converted into a plurality of two-classification problems by a one-vs-all (one-to-many) strategy, and sample imbalance is handled by the SMOTE oversampling method. For a charging pile fault of unknown cause, the fault reason can be obtained simply by inputting the corresponding independent variables, so the method has the advantages of high operation speed, high reliability and the like;

2. the random forest algorithm adopted by the optional implementation mode is a novel algorithm with excellent performance in the field of artificial intelligence, and can be used for quickly positioning the fault reason, so that the diagnosis efficiency and accuracy are greatly improved, and the maintenance time is greatly reduced;

3. compared with deep learning and most other machine learning algorithms, the random forest algorithm is highly interpretable and offers some guidance for the business.

According to another aspect of the present invention, there is also provided a fault diagnosis apparatus, and fig. 5 is a block diagram of a structure of the fault diagnosis apparatus provided according to an embodiment of the present invention, as shown in fig. 5, the apparatus includes: a first obtaining module 51, a first determining module 52, a matching module 53, a calculating module 54 and a second determining module 55, which will be described below.

The first obtaining module 51 is configured to obtain initial data of the charging pile; a first determining module 52, connected to the first obtaining module 51, for obtaining fault characteristic data of the charging pile based on the initial data; a matching module 53, connected to the first determining module 52, for matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result; a calculating module 54, connected to the matching module 53, for calculating the probability that the fault characteristic data corresponds to the preset fault type, obtaining a calculation result, and obtaining a fault prediction result based on the calculation result; and a second determining module 55, connected to the calculating module 54, for determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result.

As an alternative embodiment, the apparatus further comprises: the second acquisition module is used for acquiring a plurality of target sample sets, wherein the target sample sets in the plurality of target sample sets comprise sample fault characteristic data and sample fault labels corresponding to the sample fault characteristic data; the two-classification module is used for carrying out two-classification on the plurality of target sample sets according to the fault label types of the sample fault labels to obtain sample sets under a plurality of fault types; and the rule determining module is used for obtaining a preset two-classification rule based on the corresponding relation between the sample fault feature data in the sample set under the multiple fault types and the fault types.

According to another aspect of the present invention, there is also provided a computer-readable storage medium including a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform any one of the above-mentioned fault diagnosis methods.

According to another aspect of the present invention, there is also provided a computer apparatus comprising: a memory and a processor, the memory storing a computer program; a processor for executing the computer program stored in the memory, the computer program when running causing the processor to perform any of the above fault diagnosis methods.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.

The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and amendments can be made without departing from the principle of the present invention, and these modifications and amendments should also be considered as the protection scope of the present invention.

Claims

1. A fault diagnosis method, comprising:

acquiring initial data of a charging pile;

obtaining fault characteristic data of the charging pile based on the initial data;

matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result;

calculating the probability of the fault characteristic data corresponding to a preset fault type to obtain a calculation result, and obtaining a fault prediction result based on the calculation result;

and determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result.

2. The method according to claim 1, further comprising, before the matching of the fault characteristic data with the preset two-classification rule to obtain the fault matching result:

obtaining a plurality of target sample sets, wherein a target sample set in the plurality of target sample sets comprises sample fault characteristic data and a sample fault label corresponding to the sample fault characteristic data;

according to the fault label types of the sample fault labels, performing two-classification on the plurality of target sample sets to obtain sample sets under a plurality of fault types;

and obtaining the preset two-classification rule based on the corresponding relation between the sample fault characteristic data in the sample sets under the plurality of fault types and the fault types.

3. The method of claim 2, wherein the obtaining a plurality of target sample sets comprises:

acquiring a plurality of groups of sample initial data;

performing feature extraction on the multiple groups of sample initial data to obtain multiple groups of initial sample feature data;

performing characteristic screening on the multiple groups of initial sample characteristic data to obtain multiple groups of sample fault characteristic data;

determining sample fault labels respectively corresponding to the multiple groups of sample fault characteristic data;

and determining the plurality of target sample sets based on the plurality of groups of sample fault characteristic data and sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data.

4. The method of claim 3, wherein the determining the plurality of target sample sets based on the plurality of groups of sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data comprises:

determining a plurality of initial sample sets based on the plurality of groups of sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of sample fault characteristic data;

respectively calculating sample label proportions corresponding to the plurality of initial sample sets under the fault label type according to the fault label type of the sample fault label;

respectively oversampling the sample fault characteristic data in the plurality of initial sample sets based on the sample label proportion to obtain a plurality of groups of oversampled sample fault characteristic data;

and obtaining the target sample sets based on the plurality of groups of over-sampled sample fault characteristic data and the sample fault labels respectively corresponding to the plurality of groups of over-sampled sample fault characteristic data.

5. The method according to claim 2, further comprising, before the calculating of the probability that the fault characteristic data corresponds to the preset fault type to obtain the calculation result and the obtaining of the fault prediction result based on the calculation result:

and constructing a fault type prediction model corresponding to the preset fault type by using the plurality of target sample sets based on a random forest algorithm.

6. The method of claim 5, wherein the calculating the probability of the fault characteristic data corresponding to the preset fault type to obtain a calculation result, and obtaining a fault prediction result based on the calculation result comprises:

determining data which fails to be matched with the preset two-classification rule in the fault characteristic data as fault prediction data;

respectively calculating the probability of the fault prediction data under various preset fault types by using the fault type prediction model to obtain a plurality of probability calculation results;

and determining the fault type with the highest probability value in the plurality of probability calculation results as the fault prediction result.

7. A fault diagnosis device, characterized by comprising:

the first acquisition module is used for acquiring initial data of the charging pile;

the first determining module is used for obtaining fault characteristic data of the charging pile based on the initial data;

the matching module is used for matching the fault characteristic data with a preset two-classification rule to obtain a fault matching result;

the calculation module is used for calculating the probability of the fault characteristic data corresponding to a preset fault type to obtain a calculation result and obtaining a fault prediction result based on the calculation result;

and the second determination module is used for determining a fault diagnosis result of the charging pile based on the fault matching result and the fault prediction result.

8. The apparatus of claim 7, further comprising:

the second acquisition module is used for acquiring a plurality of target sample sets, wherein the target sample sets in the plurality of target sample sets comprise sample fault characteristic data and sample fault labels corresponding to the sample fault characteristic data;

the two-classification module is used for carrying out two-classification on the plurality of target sample sets according to the fault label types of the sample fault labels to obtain sample sets under a plurality of fault types;

and the rule determining module is used for obtaining the preset two-classification rule based on the corresponding relation between the sample fault characteristic data in the sample sets under the multiple fault types and the fault types.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the fault diagnosis method according to any one of claims 1 to 6.

10. A computer device, comprising: a memory and a processor, wherein

the memory stores a computer program;

the processor is configured to execute the computer program stored in the memory, the computer program when executed causing the processor to perform the fault diagnosis method of any one of claims 1 to 6.