CN115408190A - Fault diagnosis method and device - Google Patents
- Publication number
- CN115408190A (application number CN202211063252.XA)
- Authority
- CN
- China
- Prior art keywords
- feature
- fault
- data
- network
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Library & Information Science (AREA)
- Quality & Reliability (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a fault diagnosis method and device, and relates to the technical field of fault analysis. The method comprises the following steps: constructing a fault diagnosis network model, wherein the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network; determining, according to work order sample data, the feature extraction network and a historical fault database, first feature data used for representing text semantics and second feature data used for representing a text fault theme; determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network and the feature classification network; and performing similarity calculation according to the fault diagnosis model and the historical fault database, and determining corresponding work order processing information in the historical fault database. The scheme of the invention can complete the diagnosis of different cases, improve the accuracy of automatic fault diagnosis, and enhance the interpretability of model feature learning and the diagnosis adaptability to different fault themes.
Description
Technical Field
The invention relates to the technical field of fault analysis, in particular to a fault diagnosis method and device.
Background
The massive fault records stored in manufacturing industries such as aerospace, automotive and processing have motivated the use of data mining and text mining in historical-data-driven fault diagnosis technologies. Fault records describe the mechanism of product failure together with the related parts and fault phenomena, and can help develop product analysis and guide workers through fault repair. Industrial fault records are divided into structured data (information such as part model serial numbers, operation signals, and common voltage and current observations, which can be diagnosed directly by a computer) and unstructured information (usually embedded in text form). Structured numerical data can be used directly by a computer; information retrieval and diagnosis over unstructured fault records, however, often rely on the experience of professional technicians and are time-consuming and inefficient. Mining and analyzing such text records can help maintenance workers judge the fault type, analyze the fault cause and retrieve the corresponding maintenance scheme. Faults of complex products often involve different parts, and each part can exhibit different fault phenomena under different fault causes, corresponding to different fault topic categories and solutions. In addition, common text description problems such as professional terms, ambiguous words and colloquial words may cause deviations when a computer translates the text and recognizes its features. The effect of fault feature extraction is also influenced by differences in the recording habits of different manufacturers and maintenance personnel, and since the extracted features serve as the input of the classifier and of similar-case retrieval, the extraction result directly determines the fault diagnosis performance of the model.
Therefore, a universal method capable of accurately extracting fault features and developing intelligent fault diagnosis is urgently needed for case records of different products.
Disclosure of Invention
The invention aims to provide a fault diagnosis method and a fault diagnosis device, so as to solve the problems of low automation and insufficient precision of fault diagnosis caused by the low efficiency and poor feature quality of fault text feature extraction techniques in the prior art.
To achieve the above object, an embodiment of the present invention provides a fault diagnosis method, including:
constructing a fault diagnosis network model; the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network;
according to the work order sample data, the feature extraction network and the historical fault database, determining first feature data used for representing text semantics and second feature data used for representing a text fault theme;
determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network and the feature classification network;
and performing similarity calculation according to the fault diagnosis model and the historical fault database, and determining corresponding work order processing information in the historical fault database.
Optionally, determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network, and the feature classification network includes:
performing vector product interaction on the first feature data and the second feature data, and determining a weight coefficient matrix after interaction;
determining a processed first weight coefficient according to the normalization function of the feature interaction network and the weight coefficient matrix;
determining third feature data by weighted summation of the first weight coefficient and the first feature data; the third feature data is used for representing feature data fusing text semantics and a theme;
and determining a trained fault diagnosis model according to the third feature data and the target loss function of the feature classification network.
Optionally, the target loss function of the feature classification network is determined by:
determining a first loss function of the second feature data passing through the feature classification network, a second loss function of the third feature data passing through the feature classification network, and a third loss function of model parameter regularization loss;
and determining the target loss function according to the weighted summation of the first loss function, the second loss function and the third loss function.
Optionally, the first loss function is determined according to a preset first cross entropy function and the number of training samples;
the second loss function is determined according to a preset second cross entropy function and the number of training samples;
wherein the first cross entropy function and the second cross entropy function each comprise: class labels of the samples and predictive values of the sample classifications.
Optionally, performing vector product interaction on the first feature data and the second feature data, and determining a weight coefficient matrix after the interaction, including:
performing vector product interaction on the first feature data and the second feature data to determine a semantic feature interaction matrix;
and performing a layer of convolution network processing on the semantic feature interaction matrix to determine the weight coefficient matrix.
Optionally, determining, according to the work order sample data, the feature extraction network, and the historical fault database, first feature data used for characterizing text semantics and second feature data used for characterizing a text fault theme, includes:
determining first feature data of the work order sample data through a first preset algorithm of the feature extraction network; the first feature data comprises a semantic length and an embedding vector dimension;
determining second feature data of the work order sample data according to the work order sample data and the historical fault database through a second preset algorithm of the feature extraction network; the second feature data comprises a theme number, a fault phenomenon theme, a fault reason theme and a fault measure theme.
Optionally, determining a trained fault diagnosis model according to the third feature data and the target loss function of the feature classification network, including:
determining a target fault classification loss value according to the third characteristic data and the target loss function;
and if the target fault classification loss value is not lower than a threshold value, optimizing the feature extraction network, the feature interaction network and the feature classification network according to a preset function until the target fault classification loss value is lower than the threshold value, and determining a trained fault diagnosis model.
Optionally, the method further includes:
acquiring N historical fault work orders, preprocessing the work orders and determining fault feature data, N being a positive integer;
constructing the historical fault database based on the fault feature data; the fault database comprises correspondences between the N historical fault work orders and N third feature data, and the third feature data are stored in the historical fault database in vector form;
wherein the fault feature data comprises one or more of: a failure mode, a failure cause, a failure impact, a failure detection method, a design improvement measure, and a usage compensation measure;
and the preprocessing comprises one or more of noise information rejection, data de-duplication and sensitive word filtering.
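The preprocessing described above (de-duplication and sensitive-word filtering) can be sketched minimally as follows; the function name, field layout, and sample texts are illustrative assumptions, not from the patent:

```python
# Hypothetical sketch of the work-order preprocessing step: collapse noisy
# whitespace, drop duplicate records, and mask sensitive words.

def preprocess_work_orders(work_orders, sensitive_words):
    """Deduplicate fault work orders and mask sensitive words."""
    seen = set()
    cleaned = []
    for text in work_orders:
        normalized = " ".join(text.split())      # collapse noisy whitespace
        if normalized in seen:                   # data de-duplication
            continue
        seen.add(normalized)
        for word in sensitive_words:             # sensitive-word filtering
            normalized = normalized.replace(word, "***")
        cleaned.append(normalized)
    return cleaned

orders = [
    "engine  misfire at cold start",
    "engine misfire at cold start",              # duplicate after normalization
    "battery overheat reported by ACME Corp",
]
print(preprocess_work_orders(orders, ["ACME Corp"]))
```

A real pipeline would also apply the noise-information rejection mentioned above (e.g. dropping malformed records) before this step.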
To achieve the above object, an embodiment of the present invention further provides a fault diagnosis apparatus, including:
the building module is used for building a fault diagnosis network model; the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network;
the first determination module is used for determining first feature data used for representing text semantics and second feature data used for representing a text fault theme according to work order sample data, the feature extraction network and the historical fault database;
the second determination module is used for determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network and the feature classification network;
and the third determining module is used for performing similarity calculation according to the fault diagnosis model and a historical fault database and determining corresponding work order processing information in the historical fault database.
To achieve the above object, an embodiment of the present invention further provides a readable storage medium on which a program or instructions are stored, the program or instructions implementing the steps in the fault diagnosis method as described in any one of the above when executed by a processor.
The technical scheme of the invention has the following beneficial effects:
according to the technical scheme, the first feature data used for representing text semantics and the second feature data used for representing text fault subjects are extracted, the first feature data and the second feature data are combined through the feature interaction network, the trained fault diagnosis model is determined through the feature classification network, the corresponding worksheet processing information in the historical fault database is determined according to the similarity calculation between the fault diagnosis model and the historical fault database, the accuracy of automatic fault diagnosis is improved, and the interpretability of model feature learning and the diagnosis adaptability to different fault subjects are enhanced.
Drawings
Fig. 1 is a flowchart of a fault diagnosis method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a fault diagnosis network model provided by an embodiment of the present invention;
FIG. 3 is a block diagram of a feature interaction network provided by an embodiment of the present invention;
fig. 4 is a second flowchart of a fault diagnosis method according to an embodiment of the present invention;
fig. 5 is a structural diagram of a fault diagnosis apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present invention, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In addition, the terms "system" and "network" are often used interchangeably herein.
In the embodiments provided herein, it should be understood that "B corresponding to a" means that B is associated with a from which B can be determined. It should also be understood that determining B from a does not mean determining B from a alone, but may also be determined from a and/or other information.
As shown in fig. 1, an embodiment of the present invention provides a fault diagnosis method, including:
Step 101, constructing a fault diagnosis network model; the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network.
The fault diagnosis network model, based on feature extraction and interaction for text mining, is called a topic-semantic interaction feature model (TCIM). As shown in fig. 2, the TCIM model can be divided into three parts according to the specific functions of its structures: a feature extraction network (Gg), a feature interaction network (Gf) and a feature classification network (Gd). The network structure of fig. 2 can be used in any of the following steps, and the arrows in fig. 2 represent the flow of the raw fault data during model training.
Step 102, determining first feature data used for representing text semantics and second feature data used for representing a text fault topic according to the work order sample data, the feature extraction network and a historical fault database. Here, the work order sample data is fault case text data, such as maintenance logs, manuals and other record data in text form.
In this embodiment, the first feature data may represent the global information of the mined text from the perspective of words or phrases, and may also represent local semantic features determined by an attention mechanism. The second feature data is feature data of the text fault topic and represents key information in the text; if a topic word matches a candidate keyword, the candidate content fully represents the subject of the text. The topic keywords are constructed using a Latent Dirichlet Allocation (LDA) topic model. The basis of topic feature scoring is whether a candidate keyword appears among the topic feature words: if so, its weight is doubled; otherwise the weight is unchanged.
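The weight-doubling scoring rule described above can be sketched in a few lines; the word lists and base weight are illustrative assumptions:

```python
# Minimal sketch of the topic-feature scoring rule: a candidate keyword's
# weight is doubled when it appears among the topic's feature words.

def score_keywords(candidates, topic_words, base_weight=1.0):
    """Return candidate-keyword weights, doubled for topic feature words."""
    return {
        word: base_weight * 2 if word in topic_words else base_weight
        for word in candidates
    }

topic_words = {"misfire", "ignition", "spark"}
print(score_keywords(["misfire", "coolant", "spark"], topic_words))
```

In the patent's pipeline the topic feature words themselves would come from the LDA topic-word distributions rather than a hand-written set.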
Step 103, determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network and the feature classification network.
In this embodiment, the first feature data and the second feature data interact through the feature interaction network, and the trained fault diagnosis model is then further determined through the feature classification network.
Step 104, performing similarity calculation according to the fault diagnosis model and the historical fault database, and determining corresponding work order processing information in the historical fault database.
In the embodiment of the invention, after the fault diagnosis model is determined, the work order data to be processed is input into the fault diagnosis model. The fault diagnosis model classifies the work order data to be processed and determines its classification data or key information; similarity calculation is then performed against the historical fault database according to this classification data or key information, so that similar work orders are found and their processing suggestions are returned.
It should be noted that the similarity process is as follows: third feature data containing both kinds of feature information is generated from the first feature data and the second feature data through the feature interaction network, and similarity calculation is performed between this third feature data and the third feature data set stored in the historical fault database, so as to find similar fault cases and help solve the current problem. The fault diagnosis model is used to complete the classified intelligent diagnosis of the fault.
Optionally, the similarity calculation adopts cosine similarity, judging the degree of correlation of two texts through the cosine measure: a larger cosine similarity indicates a smaller angle between the two variables and a higher degree of similarity between the two texts, while a smaller cosine similarity indicates a larger angle and a lower degree of similarity. After comparison with the historical fault database, the work order processing information with the highest similarity in the historical fault database is finally determined.
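The cosine-similarity retrieval step above can be sketched as follows; the toy vectors and work-order identifiers are assumptions, not from the patent:

```python
import math

# Sketch of cosine-similarity retrieval: compare the query case's feature
# vector against the third-feature vectors stored in the historical fault
# database and return the most similar work order.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_most_similar(query_vec, database):
    """database: list of (work_order_id, feature_vector) pairs."""
    return max(database, key=lambda item: cosine_similarity(query_vec, item[1]))[0]

db = [("WO-001", [1.0, 0.0, 0.2]), ("WO-002", [0.1, 0.9, 0.3])]
print(retrieve_most_similar([0.9, 0.1, 0.2], db))  # WO-001 is closest
```

Since cosine similarity compares angles rather than magnitudes, the stored third-feature vectors do not need to be length-normalized for retrieval to work.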
The scheme of the invention can complete diagnosis and retrieval tasks for different cases, improve the accuracy of automatic fault diagnosis, and enhance the interpretability of model feature learning and the diagnosis adaptability to different fault topics.
Optionally, step 102 includes:
Step 104, determining first feature data of the work order sample data through a first preset algorithm of the feature extraction network; the first feature data comprises a semantic length and an embedding vector dimension;
Step 105, determining second feature data of the work order sample data according to the work order sample data and the historical fault database through a second preset algorithm of the feature extraction network; the second feature data comprises a topic number, a fault phenomenon topic, a fault cause topic and a fault measure topic.
It should be noted that the number of topics and the prior topic distribution are parameters determined in advance (i.e., how many topics the research field is expected to be divided into); this determination can be made manually or by a machine algorithm. Here, the topics of the second feature data refer to the fault topics contained in the whole historical fault database; for the automotive field, for example: engine cylinder intake and exhaust faults, mechanical transmission faults, power battery faults, and other types. After the topic feature module of the feature extraction network has learned, all historical data are divided into categories corresponding to the number of topics, with corresponding words for each topic (namely the article-topic and topic-word probability distributions). The final second feature information, which represents the main fault phenomena and causes under each topic, is obtained by weighted averaging of the embedding vectors corresponding to the top high-frequency words under each topic.
In the embodiment of the present invention, the feature extraction network (Gg) is used to extract effective fault features from the original fault data text (work order sample data), and is divided into a semantic feature extraction module and a topic feature extraction module (both shown in fig. 2). The semantic feature extraction module extracts the semantic information of the text with a convolutional neural network based on learned word vectors; the required text semantic feature vector, namely the first feature data, is obtained through normalization, a linear rectification function (ReLU) and a pooling layer. The topic feature extraction module performs topic mining on the records in the case database based on the Latent Dirichlet Allocation (LDA) topic model widely applied in text mining, obtains a manually observable fault topic distribution and the high-frequency words under each topic, and performs weighted summation of the high-frequency word vectors under each topic to obtain the topic feature vector under each fault topic, namely the second feature data.
In an optional embodiment, the extraction of semantic features and topic category features is performed through the feature extraction network (Gg). The semantic feature (first feature data) is a semantic feature embedding vector D = {d_1, d_2, …, d_s} ∈ R^(s×e), output after each case text passes through a Word2vec (word-to-vector) model and a convolutional layer, where s is the length of the document and e is the configured embedding vector dimension. The topic category feature (second feature data) is a topic feature vector T = {t_1, t_2, …, t_k} ∈ R^(k×e), obtained by averaging the embedding vectors of the top five terms by probability under each topic according to the output (topic-term probability distribution) of the topic model, where k is the number of topics and e is the embedding vector dimension. Understandably, R^(k×e) represents a vector space with k components, each component being an e-dimensional vector.
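The construction of the topic feature vector T (averaging the embeddings of each topic's top-probability words) can be sketched as follows; the toy embeddings and top-word lists are assumptions, and a real system would take both from trained Word2vec and LDA models:

```python
# Sketch of topic-feature construction: each topic vector t_k is the average
# of the embedding vectors of that topic's top-probability words,
# giving T in R^(k x e).

def topic_feature_vectors(top_words_per_topic, embeddings):
    """top_words_per_topic: list of word lists, one per topic."""
    T = []
    for words in top_words_per_topic:
        vecs = [embeddings[w] for w in words]
        e = len(vecs[0])
        T.append([sum(v[i] for v in vecs) / len(vecs) for i in range(e)])
    return T

embeddings = {"leak": [1.0, 0.0], "seal": [0.0, 1.0], "short": [1.0, 1.0]}
print(topic_feature_vectors([["leak", "seal"], ["short"]], embeddings))
```

The patent also mentions weighted (rather than uniform) summation of the high-frequency word vectors; the weights would come from the topic-word probabilities.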
Optionally, step 103 includes:
Step 104, performing vector product interaction on the first feature data and the second feature data, and determining a weight coefficient matrix after interaction;
Step 105, determining a processed first weight coefficient according to the normalization function of the feature interaction network and the weight coefficient matrix;
Step 106, determining third feature data by weighted summation of the first weight coefficient and the first feature data; the third feature data is used for representing feature data fusing text semantics and topic;
Step 107, determining a trained fault diagnosis model according to the third feature data and the target loss function of the feature classification network.
In this embodiment, in the feature interaction network (Gf), the vector product of the extracted topic features and semantic features yields the topic-semantic relationship; that is, vector product interaction is performed on the first feature data and the second feature data to determine the weight coefficient matrix after interaction. After the weight coefficient matrix passes through a normalization function (softmax activation function), it gives the corresponding topic weight value of each text segment; weighting the semantic features then yields the third feature data of the text under the guidance of the topic categories, fusing the topic and semantic information of the case. The third feature data is in a vector form convenient for computer storage; the similarity between the current case and the historical fault cases in the case base can be obtained through cosine vector calculation, completing case retrieval and providing a solution for the current case.
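A simplified sketch of this interaction step follows (the patent's full version inserts a convolution layer before the softmax, omitted here for brevity; the toy matrices are assumptions):

```python
import math

# Simplified feature interaction: multiply semantic features D (s x e) with
# topic features T (k x e) to get an interaction matrix G = D @ T^T, softmax
# the per-word scores, and reweight D to obtain the fused third feature H.

def interact(D, T):
    # interaction matrix G = D @ T^T, shape (s, k)
    G = [[sum(d[i] * t[i] for i in range(len(d))) for t in T] for d in D]
    # collapse each word's topic scores to one score, then softmax over words
    scores = [sum(row) for row in G]
    exps = [math.exp(v) for v in scores]
    total = sum(exps)
    beta = [v / total for v in exps]             # topic-word weights
    # third feature: each word vector reweighted by its coefficient
    return [[beta[j] * x for x in D[j]] for j in range(len(D))]

D = [[1.0, 0.0], [0.0, 1.0]]                     # s=2 words, e=2
T = [[1.0, 0.0]]                                 # k=1 topic
print(interact(D, T))
```

The word aligned with the topic vector receives a larger softmax weight, so its semantic component dominates the fused feature.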
In step 107, the feature classification network (Gd) is composed of multiple layers of fully connected neural networks. Its inputs are the topic category feature vector extracted by the topic feature extraction module and the third feature data generated through the interaction network, and its output is the probability value corresponding to each fault category; training is driven by the target loss function of the feature classification network (Gd), so that the fault classification can be predicted from the case features and intelligent fault diagnosis completed.
Optionally, the target loss function of the feature classification network is determined by:
Step 108, determining a first loss function of the second feature data passing through the feature classification network, a second loss function of the third feature data passing through the feature classification network, and a third loss function of model parameter regularization loss;
step 109, determining the target loss function according to the weighted summation of the first loss function, the second loss function and the third loss function.
In this embodiment, the target loss function is mainly composed of the loss functions of the classifiers. The total loss function of the fault classifier, that is, the target loss function Lc, is obtained by weighted summation of the prediction loss of the topic features, the prediction loss of the interactive features and the regularization loss of the model parameters, and is defined as:
L_c = L_1 + P·L_2 + λ·R(W)   (Formula 1)
In Formula 1, L1 is the prediction loss of the interactive features through the classification module, namely the second loss function of the third feature data through the feature classification network; L2 is the prediction loss of the topic features through the classification module, namely the first loss function of the second feature data through the feature classification network; P and λ are the weight values corresponding to the two losses, each between 0 and 1 inclusive; and R(W) = ‖W‖² is the quadratic regularization value of all parameters to be learned in the model.
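The combined loss Lc = L1 + P·L2 + λ·R(W) can be sketched numerically as follows; the loss values, weights P and λ, and the parameter vector are illustrative assumptions:

```python
# Sketch of the target loss: weighted sum of the two prediction losses plus
# quadratic regularization R(W) = ||W||^2 over the model parameters.

def target_loss(l1, l2, weights, P=0.5, lam=0.01):
    r_w = sum(w * w for w in weights)        # quadratic regularization ||W||^2
    return l1 + P * l2 + lam * r_w

print(target_loss(l1=0.8, l2=0.4, weights=[1.0, -2.0]))  # 0.8 + 0.2 + 0.05
```

In a real model, l1 and l2 would be the cross-entropy losses of the two classifier heads and `weights` would span all learnable parameters.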
Specifically, the first loss function is determined according to a preset first cross entropy function and the number of training samples;
the second loss function is determined according to a preset second cross entropy function and the number of training samples;
wherein the first cross entropy function and the second cross entropy function each comprise: class labels of the samples and predictive values of the sample classifications.
In this embodiment, the fault prediction loss values of the second loss function L1 and the first loss function L2 are measured using the cross entropy between the prediction and the real label. The prediction loss formulas of L1 and L2 are as follows:
L_1 = (1/n)·Σ ℓ(y_l, y_h),   L_2 = (1/n)·Σ ℓ(y_l, y_t)
where n is the number of training samples; ℓ(·,·) is the cross-entropy function; y_l is the fault type of the current fault text; y_h = f(H) is the fault type prediction obtained by passing the interactive feature H through the classifier; and y_t = f(T) is the fault type prediction obtained by passing the topic feature vector T through the classifier.
The cross-entropy function is defined as follows:
ℓ(y, ŷ) = −Σ_i y^(i)·log ŷ^(i)
where y^(i) is the class label of the ith sample, i.e. the true value of the sample, and ŷ^(i) is the value predicted by the model for the classification of the ith sample.
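The cross-entropy computation above can be sketched for one-hot labels as follows; the toy labels and predictions are assumptions:

```python
import math

# Sketch of the cross-entropy loss used for L1 and L2: compare one-hot class
# labels with predicted class probabilities, averaged over n samples.

def cross_entropy(labels, preds):
    """labels: one-hot rows; preds: probability rows; returns mean loss."""
    n = len(labels)
    total = 0.0
    for y, p in zip(labels, preds):
        total += -sum(yc * math.log(pc) for yc, pc in zip(y, p) if yc > 0)
    return total / n

labels = [[1, 0], [0, 1]]
preds = [[0.9, 0.1], [0.2, 0.8]]
print(cross_entropy(labels, preds))
```

The `if yc > 0` guard skips zero label entries, avoiding log(0) for classes the sample does not belong to.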
In summary, the target loss function Lc can equivalently be written as:
L_c = (1/n)·Σ ℓ(y_l, f(H)) + (P/n)·Σ ℓ(y_l, f(T)) + λ·‖W‖²
optionally, as shown in fig. 3, the step 104 includes:
and 110, performing vector product interaction on the first characteristic data and the second characteristic data to determine a semantic characteristic interaction matrix.
In this embodiment, vector product interaction is performed on the first feature data and the second feature data after the work order sample data is extracted, that is, the first feature data and the second feature data are input to a feature interaction network, and a specific network structure of the feature interaction network is shown in fig. 3. Firstly, carrying out vector multiplication to complete interactive calculation to obtain a category semantic feature interactive matrix G = DT T ={g 1 ,g 2 ,…,g s }∈R s×k The semantic feature interaction matrix G enables the model to provide more feature information for the following tasks, the interpretability of the model is reserved, and the text can be re-expressed by introducing topic prior knowledge.
Step 111: performing a layer of convolutional network processing on the semantic feature interaction matrix to determine the weight coefficient matrix.
To obtain the association of topic-word pairs, the semantic feature interaction matrix G is further processed with a layer of convolutional network to capture the non-linear relationship of neighboring words.
For each vector of $G$ centered at position $d$, with a convolution kernel of size $n$ covering positions $d-n$ to $d+n$, a ReLU activation function is adopted to obtain the convolved vectors $U = \{u_1, u_2, \ldots, u_s\} \in \mathbb{R}^{s \times k}$. For the $d$-th convolution vector $u_d$, the convolution process is:

$$u_d = \mathrm{ReLU}\left(w_d\, G_{d-n:d+n} + b_d\right) \qquad \text{(formula five)}$$

wherein $w_d$ and $b_d$ are the parameters to be learned by the convolutional network model. Finally, max pooling is applied over $u_d$ to obtain the final weighted attention vector $V = \{v_1, v_2, \ldots, v_s\} \in \mathbb{R}^{s \times 1}$. $V$ is a vector of length $s$ storing the attention score between each word in the current text and the topics; these scores can be converted into a topic-word weight coefficient vector $\beta$ using the Softmax function, that is: $\beta = \mathrm{SoftMax}(V)$.
Wherein the $d$-th topic-word weight coefficient is calculated as:

$$\beta_d = \frac{\exp(v_d)}{\sum_{j=1}^{s} \exp(v_j)}$$

After the weight coefficient vector $\beta$ is obtained through calculation, new interactive features $H = \{h_1, h_2, \ldots, h_s\} \in \mathbb{R}^{s \times E}$ can be generated by weighting each word component of the original text semantic features. $H$ is the finally generated feature with the topic-word feature interaction relationship, that is, the third feature data, which contains both the text information and the category information of each case and is calculated as:

$$h_i = \beta_i\, d_i, \quad i = 1, \ldots, s$$

where $d_i$ is the semantic feature vector of the $i$-th word in $D$.
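Steps 110–111 and the weighting above can be sketched end-to-end. This is a toy sketch only: it uses a single shared convolution kernel rather than per-position parameters $w_d$, random weights rather than learned ones, and illustrative sizes.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def topic_word_attention(D, T, n=1, seed=0):
    """Interact semantics D (s, E) with topics T (k, E), convolve a window of
    2n+1 rows of G, max-pool to one score per word, and re-weight D."""
    s, E = D.shape
    k = T.shape[0]
    G = D @ T.T                                   # interaction matrix, (s, k)
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(2 * n + 1, k))           # shared conv kernel (assumption)
    b = 0.0
    Gp = np.pad(G, ((n, n), (0, 0)))              # pad so every center d has a full window
    U = np.stack([np.maximum(0.0, Gp[d:d + 2 * n + 1] * w + b).sum(axis=0)
                  for d in range(s)])             # ReLU convolution, (s, k)
    V = U.max(axis=1)                             # max pooling -> attention score per word
    beta = softmax(V)                             # topic-word weight coefficients
    H = beta[:, None] * D                         # third feature data, (s, E)
    return beta, H
```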
further, as shown in fig. 2, step 107 of the embodiment of the present invention includes:
step 112, determining a target fault classification loss value according to the third characteristic data and the target loss function;
and 113, if the target fault classification loss value is lower than a threshold value, optimizing the feature extraction network, the feature interaction network and the feature classification network according to a preset function, and determining a trained fault diagnosis model until the target fault classification loss value is greater than or equal to the threshold value.
In the embodiment of the invention, determining the trained fault diagnosis model involves optimizing the semantic feature extraction network, the feature interaction network and the feature classification network. The trained topic feature extraction module is fixed; semantic features are extracted from the original data, feature interaction and classification prediction are completed, and a target fault classification loss value is obtained after one round of training. If the target fault classification loss value is lower than the threshold value, the feature extraction network, the feature interaction network and the feature classification network are optimized according to a preset function, that is, the current loss is back-propagated and step 112 is repeated, until the target fault classification loss value is greater than or equal to the threshold value, at which point the model loss has converged and the trained fault diagnosis model is determined.
It should be noted that the optimization goal of the model is to minimize the overall classification loss of the fault classifier. On the one hand, the model needs to minimize $L_2$, so that the extracted topic features, constrained by this loss, fit the classification categories as closely as possible after weighting; on the other hand, after the semantic features are added, the model needs to minimize $L_1$, so that the generated features, constrained by this loss, represent the complete information of the case, the feature classification network can distinguish the fault type, and the generated fine-grained features have fault diagnosis capability. In the training process, the model achieves this optimization goal by calculating the target fault classification loss value in the feature classification network and back-propagating it to iteratively optimize the parameters of the feature extraction model and the feature interaction model.
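The iterative "compute loss, back-propagate, update parameters" cycle described above can be sketched generically. This is a toy stand-in, not the patent's networks: a single linear layer with a sigmoid and cross-entropy loss, stopping once the loss falls below a tolerance; all names and values are illustrative assumptions.

```python
import numpy as np

def train_until_converged(X, y, lr=0.5, tol=0.05, max_rounds=5000):
    """Gradient-descent loop: forward pass, loss, back-propagated update,
    repeated until the classification loss converges below tol."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=X.shape[1])
    loss = float("inf")
    for _ in range(max_rounds):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))            # forward pass (sigmoid)
        loss = float(-np.mean(y * np.log(p + 1e-12)
                              + (1 - y) * np.log(1 - p + 1e-12)))
        if loss < tol:                                # convergence check
            break
        grad = X.T @ (p - y) / len(y)                 # back-propagated gradient
        w -= lr * grad                                # parameter update
    return w, loss

# Tiny linearly separable toy dataset
X = np.array([[0.0, 1.0], [0.0, -1.0], [1.0, 1.0], [1.0, -1.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
w, final_loss = train_until_converged(X, y)
```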
It should be noted that, before step 103, the model must first be initialized. The weight parameters of the feature extraction network (Gg), the feature interaction network (Gf) and the feature classification network (Gd) are initialized; then the parameters of the topic feature extraction module in the feature extraction network (Gg) are tested and determined. Topic features are extracted from the original data, suitable topic model parameters are determined experimentally by visualizing high-frequency words and topic distributions, and the experience of relevant technical personnel is fully used so that the topic feature distribution approximates the real fault feature distribution as closely as possible.
Optionally, the method further includes:
step 114, acquiring N historical fault work orders, preprocessing the work orders and determining fault characteristic data; N is a positive integer;
step 115, building the historical fault database based on the fault characteristic data; the fault database comprises corresponding relations between N historical fault work orders and N third characteristic data; third feature data is stored in the historical fault database in a vector mode;
wherein the fault signature data comprises: one or more of a failure mode, a failure cause, a failure impact, a failure detection method, a design improvement measure, and a usage compensation measure;
the preprocessing comprises one or more of noise information rejection, data de-duplication and sensitive word filtering.
It should be noted that the historical fault cases (historical fault work orders) and the corresponding third features (generated during training and fusing the first feature data and the second feature data) are stored in the constructed historical fault database. Because the third feature data is stored in vector form, storage and similarity calculation between cases are more convenient than with conventional text storage and similar-text retrieval.
In this embodiment, N historical fault work orders are obtained and preprocessed, and fault feature data are determined. The fault feature data include a manual diagnosis result and a system diagnosis result, where the system diagnosis result is obtained by processing the fault feature data with a prestored fault diagnosis system; the N historical fault work orders include M manual diagnosis results, where M is a positive integer smaller than or equal to N and N is a positive integer. The fault feature data are obtained by processing the fault detail data of the historical fault work orders, and the fault detail data comprise data within a preset time period related to the occurrence time of the fault in the historical fault work order. The fault feature data include faulty product information (product name, product model, function, material, environmental load, performance parameters) and failure information (failure mode, failure cause, failure impact, failure detection method, design improvement measure, use compensation measure, etc.).
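A minimal record structure for the fault feature data described above; the field names are direct translations of the patent's list, while the dataclass itself and the example values are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class FaultFeatureRecord:
    """One historical fault work order's feature data (illustrative schema)."""
    product_name: str
    product_model: str
    failure_mode: str
    failure_cause: str
    failure_impact: str = ""
    detection_method: str = ""
    design_improvement: str = ""
    usage_compensation: str = ""

# Hypothetical example record
rec = FaultFeatureRecord(
    product_name="coolant pump",
    product_model="CP-100",
    failure_mode="overheating",
    failure_cause="bearing wear",
)
```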
In an alternative embodiment, a historical fault database is established which comprises three database tables for the product family/product platform, detailed product information and function description, and three database tables for the fault mode, detailed fault information and the fault mechanism. Products with similar functions and the same internal interfaces are organized in the form of product families and product trees, and product instances are formed by adding different personalized modules to a product platform. All fault knowledge belongs to a certain product instance or platform. The database further comprises two layers of tables, namely a relation layer table and an application layer table: the relation layer table stores known object relations, and the application layer table stores the data relations between product functions and faults, together forming the historical fault database.
In another embodiment, as shown in fig. 4, the present invention further provides an overall flow chart, comprising:
Step 1: extracting historical fault case data (historical fault work orders) from the database and performing text preprocessing, including the recognition of professional terms and the removal of stop words, to obtain a fault case corpus (historical fault database).
Step 2: extracting the text semantic features of each fault case and the overall fault topic category features of the case base in two stages with the feature extraction network, selecting suitable network layer structure parameters to generate features through the feature interaction network, and training the model by minimizing the overall loss function through the classifier in the feature classification network.
Step 3: for a new fault case, preprocessing the case test sample and inputting it directly into the trained model to obtain the feature representation of the current case.
Step 4: performing classification prediction and diagnosis on the case representation features, and performing similarity calculation against the case features in the historical fault database to find similar cases, thereby completing diagnosis and solution retrieval for the new fault case.
Step 5: after analysis and inspection, adding the case to the historical fault database so that the database is continuously updated.
In conclusion, the scheme of the invention extracts and interacts semantic and topic features in two stages to obtain fine-grained case features that account for the fault topic-semantic relationship, completes the diagnosis and retrieval tasks for different cases, improves the accuracy of automatic fault diagnosis, and enhances both the interpretability of model feature learning and the diagnostic adaptability to different fault topics.
As shown in fig. 5, an embodiment of the present invention further provides a fault diagnosis apparatus, including:
a building module 501, configured to build a fault diagnosis network model; the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network;
a first determining module 502, configured to determine, according to work order sample data, the feature extraction network, and a historical failure database, first feature data used for representing text semantics and second feature data used for representing a text failure topic;
a second determining module 503, configured to determine a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network, and the feature classification network;
a third determining module 504, configured to perform similarity calculation according to the fault diagnosis model and a historical fault database, and determine corresponding work order processing information in the historical fault database.
Optionally, the second determining module 503 includes:
the first determining submodule is used for carrying out vector product interaction on the first characteristic data and the second characteristic data and determining a weight coefficient matrix after interaction;
the second determining submodule is used for determining the processed first weight coefficient according to the normalization function of the characteristic interaction network and the weight coefficient matrix;
a third determining submodule for determining third feature data by weighted summation of the first weighting coefficient and the first feature data; the third feature data is used for representing feature data fusing text semantics and a theme;
and the fourth determining submodule is used for determining a trained fault diagnosis model according to the third feature data and the target loss function of the feature classification network.
Optionally, in the building module 501, the objective loss function of the feature classification network is determined by:
the first determining unit is used for determining a first loss function of second characteristic data passing through the characteristic classification network, a second loss function of third characteristic data passing through the characteristic classification network and a third loss function of model parameter regularization loss;
a second determining unit, configured to determine the target loss function according to a weighted summation of the first loss function, the second loss function, and the third loss function.
Specifically, the first determining unit is specifically configured to determine the first loss function according to a preset first cross entropy function and the number of training samples;
the second determining unit is specifically configured to determine the second loss function according to a preset second cross entropy function and the number of training samples;
wherein the first cross entropy function and the second cross entropy function each comprise: class labels of the samples and predictive values of the sample classifications.
Optionally, the first determining sub-module includes:
the third determining unit is used for performing vector product interaction on the first feature data and the second feature data to determine a semantic feature interaction matrix;
and the fourth determining unit is used for performing a layer of convolution network processing on the semantic feature interaction matrix to determine the weight coefficient matrix.
Optionally, the first determining module 502 includes:
a fifth determining unit, configured to determine first feature data of the work order sample data through a first preset algorithm of the feature extraction network; the first feature data comprises a semantic length and an embedded vector dimension;
a sixth determining unit, configured to determine, according to the work order sample data and the historical fault database, second feature data of the work order sample data through a second preset algorithm of the feature extraction network; the second characteristic data comprise theme number, failure phenomenon theme, failure reason theme and failure measure theme.
Optionally, the fourth determining submodule includes:
a seventh determining unit, configured to determine a target fault classification loss value according to the third feature data and the target loss function;
and the eighth determining unit is used for optimizing the feature extraction network, the feature interaction network and the feature classification network according to a preset function if the target fault classification loss value is lower than a threshold value, and determining a trained fault diagnosis model until the target fault classification loss value is greater than or equal to the threshold value.
In an embodiment of the present invention, the fault diagnosis apparatus further includes:
the acquisition module is used for acquiring N historical fault work orders, preprocessing the N historical fault work orders and determining fault characteristic data; n is a positive integer;
the second construction module is used for constructing the historical fault database based on the fault characteristic data; the fault database comprises corresponding relations between N historical fault work orders and N third characteristic data; third feature data is stored in the historical fault database in a vector mode;
wherein the fault signature data comprises: one or more of a failure mode, a failure cause, a failure impact, a failure detection method, a design improvement measure, and a usage compensation measure;
the preprocessing comprises one or more of noise information rejection, data de-duplication and sensitive word filtering.
The implementation embodiments of the fault diagnosis method are all applicable to the embodiment of the fault diagnosis device and achieve the same technical effects; to avoid repetition, details are not repeated here.
The readable storage medium of the embodiment of the present invention stores a program or instructions thereon, and the program or instructions, when executed by the processor, implement the steps in the fault diagnosis method described above, and can achieve the same technical effects, and in order to avoid repetition, the detailed description is omitted here.
The processor is the processor in the fault diagnosis method in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
In embodiments of the present invention, modules may be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, constitute the module and achieve the stated purpose of the module.
Indeed, a module of executable code may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Likewise, operational data may be identified within the modules and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network.
Where a module can be implemented in software, then, considering the level of existing hardware technology, such a module may (cost considerations aside) be realized by building corresponding hardware circuitry to implement the corresponding function; the hardware circuitry may comprise conventional Very Large Scale Integration (VLSI) circuits or gate arrays and existing semiconductors such as logic chips, transistors, or other discrete components. A module may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, or programmable logic devices.
The exemplary embodiments described above are described with reference to the drawings, and many different forms and embodiments of the invention may be made without departing from the spirit and teaching of the invention, therefore, the invention is not to be construed as limited to the exemplary embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. In the drawings, the size and relative sizes of components may be exaggerated for clarity. The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Unless otherwise indicated, a range of values, when stated, includes the upper and lower limits of the range and any subranges therebetween.
While the foregoing is directed to the preferred embodiment of the present invention, it will be appreciated by those skilled in the art that various changes and modifications may be made therein without departing from the principles of the invention as set forth in the appended claims.
Claims (10)
1. A fault diagnosis method, comprising:
constructing a fault diagnosis network model; the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network;
according to the work order sample data, the feature extraction network and the historical fault database, determining first feature data used for representing text semantics and second feature data used for representing a text fault theme;
determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network and the feature classification network;
and performing similarity calculation according to the fault diagnosis model and the historical fault database, and determining corresponding work order processing information in the historical fault database.
2. The method of claim 1, wherein determining a trained fault diagnosis model based on the first feature data, the second feature data, the feature interaction network, and the feature classification network comprises:
performing vector product interaction on the first characteristic data and the second characteristic data, and determining a weight coefficient matrix after interaction;
determining a processed first weight coefficient according to the normalization function of the feature interaction network and the weight coefficient matrix;
determining third feature data by weighted summation of the first weight coefficient and the first feature data; the third feature data is used for representing feature data fusing text semantics and a theme;
and determining a trained fault diagnosis model according to the third feature data and the target loss function of the feature classification network.
3. The method of claim 2, wherein the target loss function of the feature classification network is determined by:
determining a first loss function of second characteristic data passing through the characteristic classification network, a second loss function of third characteristic data passing through the characteristic classification network and a third loss function of model parameter regularization loss;
and determining the target loss function according to the weighted summation of the first loss function, the second loss function and the third loss function.
4. The method according to claim 3, wherein the first loss function is determined according to a preset first cross entropy function and the number of training samples;
the second loss function is determined according to a preset second cross entropy function and the number of training samples;
wherein the first cross entropy function and the second cross entropy function each comprise: class labels of the samples and predictive values of the sample classifications.
5. The method of claim 2, wherein performing vector product interaction on the first feature data and the second feature data, and determining an interacted weight coefficient matrix comprises:
performing vector product interaction on the first characteristic data and the second characteristic data to determine a semantic characteristic interaction matrix;
and performing a layer of convolution network processing on the semantic feature interaction matrix to determine the weight coefficient matrix.
6. The method of claim 1, wherein determining first feature data for characterizing text semantics and second feature data for characterizing text fault topics according to work order sample data, the feature extraction network and a historical fault database comprises:
determining first characteristic data of work order sample data through a first preset algorithm of the characteristic extraction network; the first feature data comprises a semantic length and an embedded vector dimension;
determining second characteristic data of the work order sample data according to the work order sample data and the historical fault database through a second preset algorithm of the feature extraction network; the second characteristic data comprises a theme number, a fault phenomenon theme, a fault reason theme and a fault measure theme.
7. The method of claim 2, wherein determining a trained fault diagnosis model based on the third feature data and an objective loss function of the feature classification network comprises:
determining a target fault classification loss value according to the third characteristic data and the target loss function;
and if the target fault classification loss value is lower than a threshold value, optimizing the feature extraction network, the feature interaction network and the feature classification network according to a preset function, and determining a trained fault diagnosis model until the target fault classification loss value is greater than or equal to the threshold value.
8. The method of claim 1, further comprising:
acquiring N historical fault work orders, preprocessing the work orders and determining fault characteristic data; n is a positive integer;
building the historical fault database based on the fault feature data; the fault database comprises corresponding relations between N historical fault work orders and N third characteristic data; third feature data is stored in the historical fault database in a vector mode;
wherein the fault signature data comprises: one or more of a failure mode, a failure cause, a failure impact, a failure detection method, a design improvement measure, and a usage compensation measure;
the preprocessing comprises one or more of noise information elimination, data de-duplication and sensitive word filtering.
9. A failure diagnosis device characterized by comprising:
the building module is used for building a fault diagnosis network model; the fault diagnosis network model comprises a feature extraction network, a feature interaction network and a feature classification network;
the first determination module is used for determining first feature data used for representing text semantics and second feature data used for representing a text fault theme according to work order sample data, the feature extraction network and the historical fault database;
the second determining module is used for determining a trained fault diagnosis model according to the first feature data, the second feature data, the feature interaction network and the feature classification network;
and the third determining module is used for performing similarity calculation according to the fault diagnosis model and the historical fault database and determining corresponding work order processing information in the historical fault database.
10. A readable storage medium on which a program or instructions are stored, characterized in that the program or instructions, when executed by a processor, implement the steps in the fault diagnosis method according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211063252.XA CN115408190A (en) | 2022-08-31 | 2022-08-31 | Fault diagnosis method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115408190A true CN115408190A (en) | 2022-11-29 |
Family
ID=84163964
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211063252.XA Pending CN115408190A (en) | 2022-08-31 | 2022-08-31 | Fault diagnosis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115408190A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116542252A (en) * | 2023-07-07 | 2023-08-04 | 北京营加品牌管理有限公司 | Financial text checking method and system |
CN116542252B (en) * | 2023-07-07 | 2023-09-29 | 北京营加品牌管理有限公司 | Financial text checking method and system |
CN116738323A (en) * | 2023-08-08 | 2023-09-12 | 北京全路通信信号研究设计院集团有限公司 | Fault diagnosis method, device, equipment and medium for railway signal equipment |
CN116738323B (en) * | 2023-08-08 | 2023-10-27 | 北京全路通信信号研究设计院集团有限公司 | Fault diagnosis method, device, equipment and medium for railway signal equipment |
CN118094185A (en) * | 2024-02-22 | 2024-05-28 | 远江盛邦(北京)网络安全科技股份有限公司 | Load feature extraction method and device, electronic equipment and storage medium |
CN118568251A (en) * | 2024-08-02 | 2024-08-30 | 山东亚微软件股份有限公司 | Similar work order recommendation system based on semantics and similarity |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115408190A (en) | Fault diagnosis method and device | |
CN111914090B (en) | Method and device for enterprise industry classification identification and characteristic pollutant identification | |
CN109492230B (en) | Method for extracting insurance contract key information based on interested text field convolutional neural network | |
CN112767106B (en) | Automatic auditing method, system, computer readable storage medium and auditing equipment | |
CN111427775B (en) | Method level defect positioning method based on Bert model | |
CN110046943B (en) | Optimization method and optimization system for network consumer subdivision | |
CN113806482A (en) | Cross-modal retrieval method and device for video text, storage medium and equipment | |
CN113076734A (en) | Similarity detection method and device for project texts | |
CN112906398B (en) | Sentence semantic matching method, sentence semantic matching system, storage medium and electronic equipment | |
Almiman et al. | Deep neural network approach for Arabic community question answering | |
CN110969015A (en) | Automatic label identification method and equipment based on operation and maintenance script | |
Westermann et al. | Computer-assisted creation of boolean search rules for text classification in the legal domain | |
CN116304020A (en) | Industrial text entity extraction method based on semantic source analysis and span characteristics | |
Singh et al. | SciDr at SDU-2020: IDEAS--Identifying and Disambiguating Everyday Acronyms for Scientific Domain | |
CN116680481B (en) | Search ranking method, apparatus, device, storage medium and computer program product | |
Li et al. | Evaluating the rationality of judicial decision with LSTM-based case modeling | |
CN117332858A (en) | Construction method of intelligent automobile fault diagnosis system based on knowledge graph | |
CN112069379A (en) | Efficient public opinion monitoring system based on LSTM-CNN | |
CN114707507B (en) | List information detection method and device based on artificial intelligence algorithm | |
CN116069874A (en) | Fault positioning method, device, equipment and storage medium based on knowledge graph | |
Anishaa et al. | Identifying similar question pairs using machine learning techniques | |
CN114595324A (en) | Method, device, terminal and non-transitory storage medium for power grid service data domain division | |
Choi et al. | Just-in-Time Defect Prediction for Self-driving Software via a Deep Learning Model | |
Naqvi et al. | Generating semantic matches between maintenance work orders for diagnostic decision support | |
CN117521673B (en) | Natural language processing system with analysis training performance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||