CN112269901B - Fault distinguishing and reasoning method based on knowledge graph - Google Patents


Info

Publication number
CN112269901B
CN112269901B (granted from application CN202010959082.8A)
Authority
CN
China
Prior art keywords
map, fault, event, data, graph
Prior art date
Legal status
Active
Application number
CN202010959082.8A
Other languages
Chinese (zh)
Other versions
CN112269901A
Inventor
孔小飞
王晨
程栋梁
刘海峰
Current Assignee
Hefei Zhongke Leinao Intelligent Technology Co ltd
Original Assignee
Hefei Zhongke Leinao Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hefei Zhongke Leinao Intelligent Technology Co ltd filed Critical Hefei Zhongke Leinao Intelligent Technology Co ltd
Priority to CN202010959082.8A priority Critical patent/CN112269901B/en
Publication of CN112269901A publication Critical patent/CN112269901A/en
Application granted granted Critical
Publication of CN112269901B publication Critical patent/CN112269901B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F16/83 Querying semi-structured data (e.g. markup language structured data such as SGML, XML or HTML)
    • G06F40/295 Named entity recognition
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N5/04 Inference or reasoning models
    • G06Q10/20 Administration of product repair or maintenance
    • G06Q50/06 Energy or water supply


Abstract

The invention provides a fault distinguishing and reasoning method based on a knowledge graph. The method comprises: obtaining equipment data, fault data and disposal scheme data; establishing an equipment map based on the equipment data, a fault map based on the fault data, and a disposal scheme map based on the disposal scheme data; performing map fusion, map completion and map inference on the equipment map, the fault map and the disposal scheme map based on an event extraction algorithm and the TransE algorithm to obtain a knowledge graph; and carrying out fault discrimination reasoning over the knowledge graph with a graph neural network. By building maps from a large amount of equipment data, historical fault data and disposal scheme data, then fusing and completing them, the final knowledge graph supports fault discrimination reasoning and automatically provides methods for distinguishing, diagnosing and overhauling transformer faults.

Description

Fault distinguishing and reasoning method based on knowledge graph
Technical Field
The invention belongs to the field of fault reasoning, and particularly relates to a fault distinguishing and reasoning method based on a knowledge graph.
Background
At present, transformer fault handling relies mainly on field professionals, and sometimes on dedicated experts; when a transformer fails it cannot always be repaired in time, the consequences are unpredictable, and maintenance costs are high. To reduce this dependence on manual troubleshooting, professional systems such as expert systems have been studied, which help maintenance personnel find the fault cause and a repair scheme more quickly. However, expert systems are expensive to build and maintain, only moderately accurate, and unable to adapt to new situations, which in practice leads to errors in locating fault causes and in the repair suggestions given.
Disclosure of Invention
To address these problems, the invention provides a fault discrimination inference method based on a knowledge graph, comprising:
acquiring equipment data, fault data and disposal scheme data;
establishing an equipment map based on the equipment data, establishing a fault map based on the fault data, and establishing a disposal scheme map based on the disposal scheme data;
performing map fusion, map completion and map inference on the equipment map, the fault map and the disposal scheme map based on an event extraction algorithm and the TransE algorithm to obtain a knowledge graph;
and carrying out fault discrimination reasoning over the knowledge graph by using a graph neural network.
Preferably, the establishing of the fault map based on the fault data specifically includes:
acquiring fault data;
screening sentences containing events in fault data, and labeling elements in the sentences in a tag-element form;
dividing the labeled fault data into a training set and a test set;
pre-training: mapping the training set into vectors by the pre-training language model to obtain word embedded vectors;
constructing an event extraction model: inputting the word embedding vector into an event extraction model, outputting sequence label information by the event extraction model, and establishing a loss function based on the sequence label information;
evaluation: evaluating the event extraction model by using the test set, if the evaluation score is lower than a preset target, repeating the step of constructing the event extraction model, and if the evaluation score reaches the preset target, terminating the step of constructing the event extraction model to obtain an event extraction model;
adjusting the composition of the training set and the test set multiple times, repeating the pre-training, event extraction model construction and evaluation steps to obtain a plurality of event extraction models, and selecting the event extraction model with the best evaluation result as the optimal model;
inputting new fault data into the optimal model, outputting a label corresponding to the new fault data by the optimal model, extracting formatted event data based on the label, and establishing a fault map based on the event data.
Preferably, the establishing a treatment plan map based on the treatment plan data specifically includes:
acquiring disposal scheme data;
screening sentences containing events in the disposal scheme data, and labeling elements in the sentences in a tag-element form;
dividing the annotated treatment plan data into a training set and a test set;
pre-training: mapping the training set into vectors by the pre-training language model to obtain word embedded vectors;
constructing an event extraction model: inputting the word embedding vector into an event extraction model, outputting sequence label information by the event extraction model, and establishing a loss function based on the sequence label information;
evaluation: evaluating the event extraction model by using the test set, if the evaluation score is lower than a preset target, repeating the step of constructing the event extraction model, and if the evaluation score reaches the preset target, terminating the step of constructing the event extraction model to obtain an event extraction model;
adjusting the composition of the training set and the test set multiple times, repeating the pre-training, event extraction model construction and evaluation steps to obtain a plurality of event extraction models, and selecting the event extraction model with the best evaluation result as the optimal model;
inputting the new treatment scheme data into the optimal model, outputting a label corresponding to the new treatment scheme data by the optimal model, extracting formatted event data based on the label, and establishing a treatment scheme map based on the event data.
Preferably, performing map fusion and map completion on the equipment map, the fault map and the treatment scheme map based on the TransE algorithm to obtain the knowledge graph specifically comprises the following steps:
the equipment map, the fault map and the treatment scheme map are all represented in a triple (h, r, t) form, h represents a head entity, r represents a relation, and t represents a tail entity;
initializing a head entity vector, a relation vector and a tail entity vector, each dimension of each vector taking a random value in
$$\left(-\frac{6}{\sqrt{k}}, \frac{6}{\sqrt{k}}\right)$$
where k is the dimension of the low-dimensional vectors;
taking the correct triplet (h, r, t) as the positive sample and constructing negative samples (h1, r, t1), (h2, r, t2), ... by replacing the head entity or tail entity of the correct triplet, then establishing a T-batch based on the positive and negative samples,
T-batch={[(h,r,t),(h1,r,t1)],[(h,r,t),(h2,r,t2)],……}
training the TransE model using the T-batch and adjusting parameters with a gradient descent strategy, where the objective function of the TransE model is:
$$L = \sum_{(h,r,t) \in S} \; \sum_{(h_i,r,t_i) \in S_1} \left[\gamma + d(h+r,\, t) - d(h_i+r,\, t_i)\right]_+$$
S denotes the positive samples and S1 the negative samples; γ > 0 is a margin (distance) parameter; d(h + r, t) is the distance between h + r and t and d(hi + r, ti) the distance between hi + r and ti; [·]+ denotes the positive part function max(0, ·);
acquiring vector representations of the triples using the trained TransE model;
calculating the similarity between entity vectors with cosine similarity and performing map fusion based on the similarity, where the cosine similarity formula is:
$$\cos(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|}$$
where A and B are the representation vectors of a head entity or a tail entity;
and, given h and r, calculating t with the trained TransE model, which completes the map completion.
Preferably, the performing of the map inference on the device map, the fault map and the treatment plan map based on the graph neural network specifically includes:
calculating the transfer weight (vector representation of r) of the meta-events in the equipment graph, the fault graph and the treatment scheme graph, where the transfer weight formula is:
$$w(e_j \mid e_i) = \frac{\operatorname{count}(e_i, e_j)}{\sum_k \operatorname{count}(e_i, e_k)}$$
where e_i, e_j and e_k denote different meta-events;
taking the mean over all dimensions of the relation vector r obtained by the TransE algorithm in the equipment map, the fault map and the disposal scheme map,
$$\bar{r} = \frac{1}{k} \sum_{m=1}^{k} r_m,$$
and summing it with the above w(e_j | e_i) to obtain the initial transfer weight of the new event;
initializing the meta-event representation vectors h_i with BERT word vectors, obtaining the adjacency matrices of the equipment map, the fault map and the disposal scheme map from the transfer weights, and inputting the adjacency matrix of the local graph together with the initialized meta-event and context representations into the graph neural network for training; the structure and training process of the model are as follows;
the adjacency matrix is as follows:
$$A_{ij} = w(e_j \mid e_i)$$
adding the previously obtained event representation information, inputting the information into the graph attention network,
for node vectors h_i of the local graph, with dimension F and N nodes:
$$z_i^{(l)} = W^{(l)} h_i^{(l)}$$
where W^{(l)} is an F′ × F matrix and l is the layer index; at each layer a representation is computed for all nodes of the local graph;
$$e_{ij}^{(l)} = \mathrm{LeakyReLU}\!\left( a^{(l)\top} \left( z_i^{(l)} \,\|\, z_j^{(l)} \right) \right)$$
where ∥ splices two vectors together, a^{(l)} is a vector of dimension 2F′, and the inner product of the two is taken.
$$\alpha_{ij}^{(l)} = \frac{\exp\!\left(e_{ij}^{(l)}\right)}{\sum_{k \in \mathcal{N}(i)} \exp\!\left(e_{ik}^{(l)}\right)}$$
the similarity (attention) coefficient of node j relative to node i is calculated by this formula;
$$h_i^{(l+1)} = \sigma\!\left( \sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(l)} z_j^{(l)} \right)$$
the representation of the node at the next layer is calculated by this formula, where σ is the sigmoid function;
$\bar{h}$ denotes the context event representation obtained from the equations above, and $h_c$ a candidate event; the similarity coefficient of the event is obtained as
$$s_c = g(\bar{h}, h_c)$$
and the most similar event is obtained through the similarity function g:
$$e^{*} = \arg\max_c \, g(\bar{h}, h_c).$$
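A simplified numpy sketch of one graph-attention layer consistent with the equations above; the variable names and the LeakyReLU slope (0.2) are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def gat_layer(H, A, W, a, eps=1e-12):
    """One graph-attention layer following the equations above:
    z_i = W h_i; e_ij = LeakyReLU(a^T [z_i || z_j]); alpha = row softmax
    over neighbours; h_i' = sigma(sum_j alpha_ij z_j), sigma = sigmoid.
    H: (N, F) node vectors, A: (N, N) adjacency (nonzero = edge),
    W: (F', F) weight matrix, a: (2F',) attention vector."""
    Z = H @ W.T                                    # (N, F') projected nodes
    N = Z.shape[0]
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            x = a @ np.concatenate([Z[i], Z[j]])   # inner product with [z_i || z_j]
            e[i, j] = x if x > 0 else 0.2 * x      # LeakyReLU (slope assumed 0.2)
    mask = A != 0
    e = np.where(mask, e, -np.inf)                 # attend only to neighbours
    e -= e.max(axis=1, keepdims=True)              # numerical stability
    exp_e = np.where(mask, np.exp(e), 0.0)
    alpha = exp_e / (exp_e.sum(axis=1, keepdims=True) + eps)  # softmax rows
    return 1.0 / (1.0 + np.exp(-(alpha @ Z))), alpha          # next-layer h, alpha
```

Stacking such layers over the adjacency matrix built from the transfer weights yields the node representations used for event matching.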
The fault distinguishing and reasoning method based on the knowledge graph establishes the graphs by utilizing a large amount of equipment data, historical fault data and disposal scheme data respectively, then performs fusion and completion operation on the graphs, performs fault distinguishing and reasoning by utilizing the final knowledge graph, and automatically provides methods for distinguishing, diagnosing and repairing the fault of the transformer.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 illustrates a top mode layer in an embodiment of the present invention;
FIG. 2 illustrates a partial view of a knowledge-graph that accomplishes graph fusion and graph completion;
FIG. 3 shows a partial view of a knowledge-graph that accomplishes graph fusion and graph completion.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a fault discrimination inference method based on a knowledge graph, which can be applied to the field of transformer fault maintenance and certainly can be applied to other technical fields, and the transformer fault maintenance is taken as an example for explanation.
An equipment map, a fault map, and a treatment plan map are first established. Each map consists of event elements, which form a network-shaped event graph through event relations. An event element comprises a core node and event attributes. The core node corresponds to an equipment fault and contains the name, type and keywords describing the fault event, together with a description that distinguishes it from other faults. The event attributes correspond to the equipment state and include basic information about the equipment: fault occurrence time, fault state information, maintenance scheme, place, people involved, equipment name, model, delivery date, service life, installation date, maintenance cycle, and so on. The coverage is therefore wide: the graph is an open network, and any entity related to an event can be associated with its event element. At the same time, the event graph is a dynamic network: events depend on time, and time is a process of change, so the event map can describe the dynamic evolution of events. Applied to equipment fault analysis, such a map can uncover the causes and consequences of events and play a larger role in the prevention and early warning of equipment failure.
Equipment data is acquired and an equipment map is established based on it. Specifically, the equipment data mainly comes from substation ledgers and is mostly semi-structured data, i.e. data in tables, databases and the like that can be processed and completed based on certain rules, such as data similar to the table below. Processing through programs and rules yields triples containing both relations and attribute relations; for example, one record is mapped, according to the columns "equipment area", "equipment name" and "equipment manufacturer", into triples linking "110 kV transformer substation" to "SF6 circuit breaker" and to "Satsu Rugao high-voltage electrical apparatus Co., Ltd". A large number of different triples constitute the equipment map.
[Table: example substation ledger records (equipment area, equipment name, equipment manufacturer, ...)]
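The ledger-to-triple mapping described above can be sketched as follows; the field names and the helper function are hypothetical, chosen only to mirror the 110 kV substation example:

```python
def ledger_to_triples(record, entity_key="equipment_area"):
    """Map every other column of one semi-structured ledger record to a
    (head, relation, tail) triple whose head is the value of entity_key
    and whose relation is the column name."""
    head = record[entity_key]
    return [(head, rel, tail) for rel, tail in record.items() if rel != entity_key]

row = {
    "equipment_area": "110 kV transformer substation",
    "equipment_name": "SF6 circuit breaker",
    "equipment_manufacturer": "Satsu Rugao high-voltage electrical apparatus Co., Ltd",
}
triples = ledger_to_triples(row)
```

Applying this to every ledger row yields the triple set that makes up the equipment map.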
Establishing a fault map based on fault data, wherein the fault data is from a fault log record, training a fault event extraction model by marking the fault data, and extracting event elements by using the fault event extraction model to obtain the fault map. For the corpus in the field of the transformer substation, an event based on chapter level, such as a specific fault event, is a main extraction target for basic information of equipment, fault occurrence time, fault state information, maintenance schemes and the like, and may be referred to as topic element extraction. Specifically, the meta-event included in the failure event is defined as follows: in an event in the field of a transformer substation, main components of the event are identified in a specific corpus: o-subject; a p-attribute; an s-object; v-verb. The main types of events: "status defect event", "operation defect event", "action defect event", "status accident event", and the like; the main types of event relationships: the relationship between events such as "cause and effect prevention", "cause and effect", "cause and effect with subject", "compliance with subject", "condition relationship", "possible juxtaposition", "upper and lower relationship", etc. The event extraction specifically includes the following steps.
Obtaining original corpora: the fault data to be extracted serve as original corpora, which can come from professional transformer fault books and fault record texts. The obtained corpora may be presented in different forms, such as picture or PDF formats, and need to be converted into plain text data; for example, OCR (optical character recognition) technology can convert non-plain-text data into plain text. The plain text data is then processed by programs and manual work and divided into separate texts, stored according to the specific transformer fault cases.
Data annotation: sentences containing events are screened; in this embodiment, events are sentences related to transformer faults, such as "main transformer oil temperature is high" or "iron core intermittent multipoint grounding". Elements in the sentences are labeled in tag-element form, the event elements mainly comprising "fault phenomenon", "specific fault equipment", "equipment production company" and the like, each element receiving a tag; in other literature, the tag-element form may also be written as [boundary position - element]. In this embodiment, the tag set is {B (element start), M (element interior), E (element end), S (single-character element)}, and all other parts of the event are marked "O".
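A minimal sketch of the {B, M, E, S, O} character-tagging scheme just described; the helper name and label strings are illustrative assumptions:

```python
def bmes_tags(sentence, element, label):
    """Tag each character of `sentence`: B/M/E across the span of `element`,
    S if the element is a single character, O elsewhere."""
    tags = ["O"] * len(sentence)
    start = sentence.find(element)
    if start == -1:
        return tags                      # element absent: everything stays O
    if len(element) == 1:
        tags[start] = f"S-{label}"
    else:
        tags[start] = f"B-{label}"
        for i in range(start + 1, start + len(element) - 1):
            tags[i] = f"M-{label}"
        tags[start + len(element) - 1] = f"E-{label}"
    return tags
```

For instance, tagging the equipment mention "主变" (main transformer) in "主变油温高" yields B/E tags on the first two characters and O on the rest.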
Data set allocation: the labeled corpus is divided into a training set and a test set; illustratively, a 4:1 ratio is used to allocate the training set (train.txt) and the test set (test.txt).
Pre-training: fine-tuning training is carried out with an existing large-scale pre-trained language model, which maps the Chinese characters in the training set into vectors to obtain word embedding vectors E ∈ R^{l×d}, so as to adapt the model to the transformer fault domain.
Constructing an event extraction model: and inputting the word embedding vector into an event extraction model, outputting sequence label information by the event extraction model, establishing a loss function based on the sequence label information, and finally obtaining a trained event extraction model by optimizing the value of the loss function.
Evaluation: and evaluating the event extraction model by using the test set, repeating the step of constructing the event extraction model to continue training if the evaluation result is lower than a preset target, terminating the step of constructing the event extraction model if the evaluation result reaches the preset target, obtaining the event extraction model, and storing the event extraction model.
The composition of the training set and the test set is adjusted multiple times: the data in the two files train.txt and test.txt are pooled and, counted by records, redistributed in a 4:1 ratio into a new training set (train.txt) and test set (test.txt), which serves to verify the validity of the event extraction model. The pre-training, event extraction model construction and evaluation steps are repeated to obtain several event extraction models, and the one with the best evaluation result is selected as the optimal model.
Event extraction: and inputting the text to be extracted into the trained event extraction model, wherein the text to be extracted can be fault data related to any transformer, and a labeling result of each character of the text is obtained. And then reading out the meanings represented by the labels correspondingly to form text information, splicing the text information to form a text sentence to obtain structured text information, or independently storing the structured text information in a data structure.
In this embodiment, BiLSTM + ATT + CRF (bidirectional long short-term memory network + attention mechanism + conditional random field) is used as the event extraction model. The pre-trained language model maps the Chinese characters of the labeled data into vectors; for example, the character vectors of "变", "压", "器" (the three characters of 变压器, "transformer") are fed in order into the forward LSTM to obtain three vectors (H_L0, H_L1, H_L2), and the character vectors of "器", "压", "变" are fed in order into the backward LSTM to obtain three vectors (H_R0, H_R1, H_R2); finally, the two are spliced into {[H_L0, H_R0], [H_L1, H_R1], [H_L2, H_R2]}. From the scores over all labels output by the BiLSTM network, the label with the maximum value is taken for each character and used as input to the following CRF layer (the preceding BiLSTM has already learned the relation between the text sequence and the labels, while the CRF layer learns the transition relations between labels, ensuring for instance that an E label is not produced before an M label, which would be a useless sequence). The final label sequence is obtained through the CRF layer, a loss function is built against the true label sequence, and the event extraction model is evaluated on the test set; when indices such as recall stop rising for a certain number of rounds, training is terminated. The structure of the data set is adjusted several times and the training steps repeated to finally obtain an optimal model; the original text is input into the trained event extraction model to obtain output labels corresponding to the input sequence, formatted event data is extracted according to the label prediction results, and the final event extraction result is obtained, improving the quality and reliability of the knowledge in the fault map.
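The "extract formatted event data based on the label" step can be sketched as a simple decoder from a B/M/E/S/O tag sequence back to (label, text) spans; the function name and tag format are assumptions:

```python
def decode_bmes(chars, tags):
    """Recover (label, text) spans from a BMES-O tag sequence, i.e. turn the
    model's per-character predictions back into structured event elements."""
    spans, buf, cur = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            buf, cur = [ch], tag[2:]           # open a new multi-character span
        elif tag.startswith("M-") and cur:
            buf.append(ch)                     # continue the open span
        elif tag.startswith("E-") and cur:
            buf.append(ch)                     # close the span and emit it
            spans.append((cur, "".join(buf)))
            buf, cur = [], None
        elif tag.startswith("S-"):
            spans.append((tag[2:], ch))        # single-character element
        else:
            buf, cur = [], None                # O or malformed: drop buffer
    return spans
```

The recovered spans can then be stored as structured text information or written into a separate data structure, as the description suggests.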
And establishing a disposal scheme map based on disposal scheme data, wherein the disposal scheme data is derived from a transformer overhaul manual, training a disposal scheme event extraction model by labeling the disposal scheme data, and extracting event elements by using the disposal scheme event extraction model to obtain the disposal scheme map. The process of training the treatment scheme event extraction model and obtaining the treatment scheme map is the same as the process of training the fault event extraction model and obtaining the fault map, and is not repeated.
After the equipment map, the fault map and the disposal scheme map are established, a top schema layer is built, as shown in FIG. 1, and the three maps are fused and completed to obtain the knowledge graph; specifically, map fusion and completion can be performed through the TransE algorithm, or multi-hop inference and event prediction can be performed with a graph neural network. The specific steps of performing map fusion and map completion on the equipment map, the fault map and the disposal scheme map based on the TransE algorithm to obtain the knowledge graph are as follows.
The equipment graph, the fault graph and the treatment plan graph are all represented as triples (h, r, t), where h is the head entity, r the relation and t the tail entity; head and tail entities are events in the graph, i.e. the relation expresses the relation between events. Triples are created so that nodes and relations in the graph can be represented as low-dimensional vectors, e.g. (iron core multipoint grounding, causes, overheating); "iron core multipoint grounding" is then no longer a single node but a vector such as (0.002, 0.006, 0.005, 0.008, 0.001); in practice a higher dimension, such as 50 or 100, is generally used.
The TransE model assumes that the vectors of a correct triplet should satisfy h + r ≈ t, and defines a distance function d(h + r, t) to measure the distance between h + r and t. In practice the L1 or L2 norm can be used; for a vector x = {x_1, x_2, ..., x_n} they are defined as follows:

L1 norm:
$$\|x\|_1 = \sum_{i=1}^{n} |x_i|$$

L2 norm:
$$\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$$

Given another vector y = {y_1, y_2, ..., y_n}, the L1 norm measures the distance between x and y, denoted d(x, y), as
$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$
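The norms and distance function above can be written out directly; `d` here is the distance d(h + r, t) used by the TransE score (a sketch, not the patented code):

```python
import math

def l1(x):
    """L1 norm: sum of absolute values of the components."""
    return sum(abs(v) for v in x)

def l2(x):
    """L2 norm: square root of the sum of squared components."""
    return math.sqrt(sum(v * v for v in x))

def d(h, r, t, norm=l1):
    """d(h + r, t): distance between the translated head h + r and the
    tail t, under either norm (L1 by default)."""
    return norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])
```

For a correct triplet the distance should be near zero, since the model is trained toward h + r ≈ t.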
The objective function of the TransE model is as follows:
$$L = \sum_{(h,r,t) \in S} \; \sum_{(h_i,r,t_i) \in S_1} \left[\gamma + d(h+r,\, t) - d(h_i+r,\, t_i)\right]_+$$
S denotes the positive triplet samples and S1 the negative triplet samples, obtained from S by replacing h or t; γ > 0 is a margin (distance) parameter; d(h + r, t) is the distance between h + r and t and d(hi + r, ti) the distance between hi + r and ti; [·]+ denotes the positive part function max(0, ·).
The training procedure for the TransE model is as follows:
setting the margin parameter γ and the learning rate λ, and initializing the head entity vectors, relation vectors and tail entity vectors, each dimension of each vector taking a random value in
$$\left(-\frac{6}{\sqrt{k}}, \frac{6}{\sqrt{k}}\right)$$
where k is the dimension of the low-dimensional vectors; after all vectors are initialized, they are normalized.
The correct triplet (h, r, t) is used as the positive sample S, and negative samples S1 are constructed by replacing its head entity or tail entity, S1 being specifically (h1, r, t1), (h2, r, t2), ...; a T-batch is established from the positive and negative samples,
T-batch={[(h,r,t),(h1,r,t1)],[(h,r,t),(h2,r,t2)],……}
Train the TranSE model on the T-batch, adjusting parameters with a gradient descent strategy, and use the trained model to obtain the vector representations of the nodes and relations in the map.
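The training procedure above can be sketched as a toy loop (all sizes, triples and hyperparameter values are hypothetical; the squared L2 distance is used here so the gradient is simple and exact):

```python
import numpy as np

rng = np.random.default_rng(1)
k, gamma, lam = 8, 1.0, 0.01        # dimension, margin gamma, learning rate lambda
n_ent, n_rel = 10, 3

b = 6 / np.sqrt(k)                  # uniform init in [-6/sqrt(k), 6/sqrt(k)]
E = rng.uniform(-b, b, (n_ent, k))
R = rng.uniform(-b, b, (n_rel, k))
E /= np.linalg.norm(E, axis=1, keepdims=True)   # normalize after initialization

S = [(0, 0, 1), (2, 1, 3), (4, 2, 5)]           # toy positive triples (h, r, t)

def d(h, r, t):
    """Squared L2 distance d(h + r, t)."""
    v = E[h] + R[r] - E[t]
    return float(v @ v)

for epoch in range(100):
    for (h, r, t) in S:
        # Negative sample: corrupt either the head or the tail entity.
        if rng.random() < 0.5:
            h1, t1 = int(rng.integers(n_ent)), t
        else:
            h1, t1 = h, int(rng.integers(n_ent))
        loss = gamma + d(h, r, t) - d(h1, r, t1)
        if loss <= 0:
            continue                # [x]+ : no update when the margin already holds
        g_pos = 2 * (E[h] + R[r] - E[t])    # gradient of the positive distance
        g_neg = 2 * (E[h1] + R[r] - E[t1])  # gradient of the negative distance
        E[h] -= lam * g_pos
        E[t] += lam * g_pos
        E[h1] += lam * g_neg
        E[t1] -= lam * g_neg
        R[r] -= lam * (g_pos - g_neg)
```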
Calculate the similarity between triples with the cosine similarity formula, and perform map fusion based on that similarity. The cosine similarity formula is:

$$\cos(A, B) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$

where A and B are representation vectors of head or tail entities, typically of a fixed dimension (100, 200, etc.).
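A short sketch of the fusion criterion (the vectors and the 0.95 merge threshold are illustrative assumptions, not values from the patent):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: dot product over the product of the L2 norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([0.1, 0.2, 0.3])
b = np.array([0.2, 0.4, 0.6])   # same direction as a, so similarity is 1

# Entities whose vectors are nearly parallel are merged during map fusion.
merge = cosine(a, b) > 0.95
```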
Based on h and r, the trained TranSE model computes t, completing the map. FIGS. 2-3 show a fused and complemented knowledge graph.
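Map completion as described above amounts to finding the entity nearest to h + r; a sketch with toy embeddings (the planted tail at index 3 is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
E = rng.normal(size=(6, 4))          # toy entity embedding table
h_vec = E[0]
r_vec = np.array([0.5, -0.1, 0.2, 0.0])
E[3] = h_vec + r_vec                 # plant a perfect tail entity at index 3

# Completion: the predicted tail t is the entity nearest to h + r.
scores = np.linalg.norm(E - (h_vec + r_vec), axis=1)
predicted_t = int(scores.argmin())
```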
Map fusion and map completion of the equipment map, the fault map and the treatment scheme map can also be performed with a graph neural network to obtain the knowledge map, as follows.
Calculate the transfer weight of each meta-event in the equipment map, the fault map and the treatment scheme map. The transfer weight formula is:

$$w(e_j \mid e_i) = \frac{\mathrm{count}(e_i, e_j)}{\sum_{k} \mathrm{count}(e_i, e_k)}$$

where e_i, e_j and e_k denote different meta-events and count(e_i, e_j) is the number of times event e_j follows event e_i.
Meanwhile, take the mean of all dimensions of the relation vector r obtained by the TranSE algorithm in the equipment map, the fault map and the treatment scheme map,

$$\bar{r} = \frac{1}{k}\sum_{i=1}^{k} r_i$$

and add it to the w(e_j | e_i) above to obtain the initial transfer weight of the new event.
Obtain the adjacency matrices of the equipment map, the fault map and the treatment scheme map from the transfer weights. To initialize a meta-event, first take the abstract representation of the extracted event, which contains the event elements o (subject), p (attribute), s (object) and v (verb), and initialize these elements with BERT to form the initial event representation. Then input the adjacency matrix of the local map together with the initialized event and context representations into the graph neural network for training. The structure and training process of the model are as follows:
The adjacency matrix collects the transfer weights:

$$A \in \mathbb{R}^{N \times N},\qquad A_{ij} = w(e_j \mid e_i)$$
Add the previously obtained event representation information and input it into the graph attention network.

The node vectors of the local graph are h_i, with dimension F and N nodes:

$$h^{(l)} = \{h_1^{(l)}, h_2^{(l)}, \dots, h_N^{(l)}\},\qquad h_i^{(l)} \in \mathbb{R}^F$$

$$z_i^{(l)} = W^{(l)} h_i^{(l)}$$

where W^{(l)} is an F′ × F matrix and l is the layer index of the network; at each layer a representation of all the nodes of the local graph is computed.

$$e_{ij}^{(l)} = \mathrm{LeakyReLU}\bigl(a^{(l)\top}\bigl(z_i^{(l)} \,\|\, z_j^{(l)}\bigr)\bigr)$$

where ‖ splices two matrices together and a^{(l)} is a vector of dimension 2F′; the inner product of the two is taken.

$$\alpha_{ij}^{(l)} = \frac{\exp\bigl(e_{ij}^{(l)}\bigr)}{\sum_{k \in \mathcal{N}(i)} \exp\bigl(e_{ik}^{(l)}\bigr)}$$

This formula computes the similarity coefficient of node j relative to node i.

$$h_i^{(l+1)} = \sigma\Bigl(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(l)} z_j^{(l)}\Bigr)$$

This expression computes the representation of the nodes at the next layer, where σ is the sigmoid function.
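One attention layer of this kind can be sketched in NumPy (sizes, adjacency and random weights are all illustrative assumptions; a fully connected toy graph is used):

```python
import numpy as np

rng = np.random.default_rng(3)
N, F, F2 = 4, 5, 3                   # nodes, input dim F, output dim F'
H = rng.normal(size=(N, F))          # node vectors h_i of the local graph
A = np.ones((N, N))                  # fully connected toy adjacency
W = rng.normal(size=(F2, F))         # W^(l), an F' x F matrix
a = rng.normal(size=2 * F2)          # a^(l), a vector of dimension 2F'

Z = H @ W.T                          # z_i = W h_i for every node

# e_ij = LeakyReLU(a^T [z_i || z_j]): attention logit for each node pair
logits = np.array([[np.concatenate([Z[i], Z[j]]) @ a for j in range(N)]
                   for i in range(N)])
logits = np.where(logits > 0, logits, 0.2 * logits)   # LeakyReLU

# alpha_ij: similarity coefficient of node j relative to node i (softmax over j)
alpha = np.exp(logits) * A
alpha /= alpha.sum(axis=1, keepdims=True)

sigmoid = lambda x: 1 / (1 + np.exp(-x))
H_next = sigmoid(alpha @ Z)          # next-layer node representations
```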
Let h_i denote the context event representations obtained by the above equations, and e_{c_j} the candidate events. The similarity coefficient of an event is obtained as

$$s_{ij} = g\bigl(h_i, e_{c_j}\bigr)$$

and the most similar event is the candidate maximizing the similarity function g:

$$c^{*} = \arg\max_{j} s_{ij}$$
That is, once the final event representation is obtained, its similarity to each candidate event is computed, the best match is taken as the final event, and the prediction of the event is complete.
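A sketch of the candidate selection step (the context vector, candidate events and the choice of cosine similarity for g are all assumptions for illustration):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

context = np.array([0.2, 0.1, 0.7])   # final context event representation

# Hypothetical candidate events and their vectors.
candidates = {
    "winding deformation": np.array([0.9, 0.1, 0.0]),
    "core overheating":    np.array([0.2, 0.1, 0.69]),
}

# g(., .) is taken to be cosine similarity here; the candidate with the
# highest similarity coefficient is returned as the predicted event.
best = max(candidates, key=lambda c: cosine(context, candidates[c]))
```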
Based on the fused and complemented knowledge graph, the following actual functions can be realized:
when the substation equipment develops a fault or defect, multi-step reasoning over the graph neural network can trace the likely source of the problem along the event chain. For example, if relation r1 holds between a and b and relation r2 holds between b and c, the direct relation corresponding to this two-step path is a relation r3 between a and c. Starting from the fault phenomenon of the equipment, the specific faulty equipment, the running state of the equipment and so on, the method thus reasons from the fault phenomenon to its possible cause.
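The two-step path composition above can be sketched as a simple graph walk (the events and relation labels are hypothetical examples):

```python
# Hypothetical event-chain edges: if a -r1-> b and b -r2-> c hold, the
# composed two-step path suggests a direct relation between a and c.
edges = {
    ("loose clamp", "vibration"): "causes",
    ("vibration", "core grounding fault"): "leads to",
}

def two_hop(graph, start):
    """Enumerate all two-step paths starting from `start`."""
    paths = []
    for (h1, t1), r1 in graph.items():
        if h1 != start:
            continue
        for (h2, t2), r2 in graph.items():
            if h2 == t1:
                paths.append((start, r1, t1, r2, t2))
    return paths

paths = two_hop(edges, "loose clamp")
```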
Similarly, based on historical fault events, the close relations between equipment defect and fault phenomena, fault sources and the like and part manufacturers are analyzed, and similar causal events are identified from faults and defect events whose causes have been determined. The observed equipment defects and faults, equipment running states and so on are then used as input to the knowledge map, and the multi-step reasoning method yields the likely problem sources together with their probabilities, improving the efficiency with which users analyze and localize equipment defects and faults.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (4)

1. A fault discrimination inference method based on knowledge graph is characterized in that,
acquiring equipment data, fault data and disposal scheme data;
establishing an equipment map based on the equipment data, establishing a fault map based on the fault data, and establishing a disposal scheme map based on the disposal scheme data;
performing map fusion, map completion and map inference on the equipment map, the fault map and the disposal scheme map based on an event extraction algorithm and a TranSE algorithm to obtain a knowledge map;
the method for obtaining the knowledge graph by performing graph fusion and graph completion on the equipment graph, the fault graph and the disposal scheme graph based on the TranSE algorithm specifically comprises the following steps of:
the equipment map, the fault map and the treatment scheme map are all represented in a triple (h, r, t) form, h represents a head entity, r represents a relation, and t represents a tail entity;
initializing the head entity vectors, relation vectors and tail entity vectors by drawing each dimension of each vector uniformly at random from

$$\Bigl[-\frac{6}{\sqrt{k}},\;\frac{6}{\sqrt{k}}\Bigr]$$

wherein k is the dimension of the low-dimensional vectors;
taking the correct triplets (h, r, t) as the positive sampling samples and constructing the negative sampling samples (h1, r, t1), (h2, r, t2), …… by replacing the head entity or the tail entity of a correct triplet, and establishing a T-batch based on the positive and negative sampling samples,
T-batch={[(h,r,t),(h1,r,t1)],[(h,r,t),(h2,r,t2)],……}
training a TranSE model by utilizing T-batch, and performing parameter adjustment by adopting a gradient descent strategy, wherein an objective function of the TranSE model is as follows:
$$L = \sum_{(h,r,t)\in S}\;\sum_{(h_i,r,t_i)\in S_1}\bigl[\gamma + d(h+r,\,t) - d(h_i+r,\,t_i)\bigr]_+$$

wherein S represents the positive samples, S1 represents the negative samples, γ represents a distance parameter with γ > 0, d(h + r, t) represents the distance between h + r and t, d(hi + r, ti) represents the distance between hi + r and ti, and [x]+ represents the positive part max(0, x);
acquiring vector representation of the triples by using a trained TranSE model;
calculating the similarity between entity vectors based on cosine similarity, and performing map fusion based on the similarity, wherein the cosine similarity formula is as follows:
$$\cos(A, B) = \frac{A \cdot B}{\|A\|\,\|B\|}$$
wherein A, B is a representation vector of a head entity or a tail entity;
based on h and r, calculating t by using the trained TranSE model to complete atlas completion;
and (4) carrying out fault discrimination reasoning of the knowledge graph by using the graph neural network.
2. The fault discrimination inference method based on knowledge-graphs according to claim 1, wherein the establishing of the fault-graph based on fault data specifically comprises:
acquiring fault data;
screening sentences containing events in fault data, and labeling elements in the sentences in a tag-element form;
dividing the labeled fault data into a training set and a test set;
pre-training: mapping the training set into vectors by the pre-training language model to obtain word embedded vectors;
constructing an event extraction model: inputting the word embedding vector into an event extraction model, outputting sequence label information by the event extraction model, and establishing a loss function based on the sequence label information;
evaluation: evaluating the event extraction model by using the test set, if the evaluation score is lower than a preset target, repeating the step of constructing the event extraction model, and if the evaluation score reaches the preset target, terminating the step of constructing the event extraction model to obtain an event extraction model;
adjusting the training set and the test set structure for multiple times, repeating the pre-training, the constructing of the event extraction model and the evaluation steps to obtain a plurality of event extraction models, and selecting the event extraction model with the best evaluation result as the optimal model;
inputting new fault data into the optimal model, outputting a label corresponding to the new fault data by the optimal model, extracting formatted event data based on the label, and establishing a fault map based on the event data.
3. The method of knowledge-graph-based fault-discriminating inference as claimed in claim 1, wherein said establishing a treatment-plan graph based on treatment-plan data specifically comprises:
acquiring disposal scheme data;
screening sentences containing events in the disposal scheme data, and labeling elements in the sentences in a tag-element form;
dividing the annotated treatment plan data into a training set and a test set;
pre-training: mapping the training set into vectors by the pre-training language model to obtain word embedded vectors;
constructing an event extraction model: inputting the word embedding vector into an event extraction model, outputting sequence label information by the event extraction model, and establishing a loss function based on the sequence label information;
evaluation: evaluating the event extraction model by using the test set, if the evaluation score is lower than a preset target, repeating the step of constructing the event extraction model, and if the evaluation score reaches the preset target, terminating the step of constructing the event extraction model to obtain an event extraction model;
adjusting the training set and the test set structure for multiple times, repeating the pre-training, the constructing of the event extraction model and the evaluation steps to obtain a plurality of event extraction models, and selecting the event extraction model with the best evaluation result as the optimal model;
inputting the new treatment scheme data into the optimal model, outputting a label corresponding to the new treatment scheme data by the optimal model, extracting formatted event data based on the label, and establishing a treatment scheme map based on the event data.
4. The fault discrimination inference method based on knowledge-graph according to claim 1, wherein the graph inference of the device graph, the fault graph and the treatment plan graph based on the graph neural network specifically comprises:
calculating the transfer weight of the meta-event in the equipment map, the fault map and the disposal scheme map, wherein the transfer weight calculation formula is as follows:
$$w(e_j \mid e_i) = \frac{\mathrm{count}(e_i, e_j)}{\sum_{k} \mathrm{count}(e_i, e_k)}$$

wherein e_i, e_j and e_k respectively represent different meta-events;

taking the mean $\bar{r}$ of all dimensions of the relation vector r obtained by the TranSE algorithm in the equipment map, the fault map and the treatment scheme map, and summing it with the w(e_j | e_i) above to obtain the initial transfer weight of the new event;
initializing the meta-event representation vectors h_i using BERT word vectors;
Obtaining an adjacent matrix of the equipment map, the fault map and the disposal scheme map according to the transfer weight, and inputting the adjacent matrix of the local equipment map, the fault map and the disposal scheme map and the initialized meta-event and context representation into a graph neural network for training, wherein the structure and the training process of the model are as follows;
the adjacency matrix being

$$A \in \mathbb{R}^{N \times N},\qquad A_{ij} = w(e_j \mid e_i);$$
adding the previously obtained event representation information, inputting the information into the graph attention network,
the node vectors of the local graph being h_i^{(l)} ∈ R^F, with dimension F and N nodes;

$$z_i^{(l)} = W^{(l)} h_i^{(l)}$$

wherein W^{(l)} is an F′ × F matrix and l represents the layer of the network, a representation of all the nodes of the local graph being computed at each layer;

$$e_{ij}^{(l)} = \mathrm{LeakyReLU}\bigl(a^{(l)\top}\bigl(z_i^{(l)} \,\|\, z_j^{(l)}\bigr)\bigr)$$

wherein ‖ splices two matrices together and a^{(l)} is a vector of dimension 2F′, the two being inner-multiplied;

$$\alpha_{ij}^{(l)} = \frac{\exp\bigl(e_{ij}^{(l)}\bigr)}{\sum_{k \in \mathcal{N}(i)} \exp\bigl(e_{ik}^{(l)}\bigr)}$$

the similarity coefficient of the j node relative to the i node being calculated by this formula;

$$h_i^{(l+1)} = \sigma\Bigl(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(l)} z_j^{(l)}\Bigr)$$

this formula calculating the representation of the nodes of the next layer, σ being a sigmoid function;

h_i being the context event representations obtained by the above equations and e_{c_j} being the candidate events, the similarity coefficient of an event being obtained as

$$s_{ij} = g\bigl(h_i, e_{c_j}\bigr)$$

and the most similar event being obtained through the similarity function g as

$$c^{*} = \arg\max_{j} s_{ij}.$$
CN202010959082.8A 2020-09-14 2020-09-14 Fault distinguishing and reasoning method based on knowledge graph Active CN112269901B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010959082.8A CN112269901B (en) 2020-09-14 2020-09-14 Fault distinguishing and reasoning method based on knowledge graph


Publications (2)

Publication Number Publication Date
CN112269901A CN112269901A (en) 2021-01-26
CN112269901B true CN112269901B (en) 2021-11-05

Family

ID=74349950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010959082.8A Active CN112269901B (en) 2020-09-14 2020-09-14 Fault distinguishing and reasoning method based on knowledge graph

Country Status (1)

Country Link
CN (1) CN112269901B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882911B (en) * 2021-02-01 2023-12-29 中电科网络空间安全研究院有限公司 Abnormal performance behavior detection method, system, device and storage medium
CN113092044B (en) * 2021-03-31 2022-03-18 东南大学 Rotary machine fault diagnosis method based on weighted level visible graph
CN113112164A (en) * 2021-04-19 2021-07-13 特变电工股份有限公司新疆变压器厂 Transformer fault diagnosis method and device based on knowledge graph and electronic equipment
CN113190651B (en) * 2021-04-23 2022-09-09 宁波乾睿导航科技有限公司 Electric power data global knowledge graph completion method based on quota knowledge graph technology
CN113283027B (en) * 2021-05-20 2024-04-02 南京航空航天大学 Mechanical fault diagnosis method based on knowledge graph and graph neural network
CN113190844B (en) * 2021-05-20 2024-05-28 深信服科技股份有限公司 Detection method, correlation method and correlation device
CN113590834B (en) * 2021-06-21 2023-04-21 安徽工程大学 Construction method of full life cycle knowledge graph of RV reducer
CN113420162B (en) * 2021-06-24 2023-04-18 国网天津市电力公司 Equipment operation chain state monitoring method based on knowledge graph
CN113342993B (en) * 2021-07-02 2023-10-03 上海申瑞继保电气有限公司 Power failure map generation method
CN113360679B (en) * 2021-07-08 2023-11-21 北京国信会视科技有限公司 Fault diagnosis method based on knowledge graph technology
CN114785674A (en) * 2022-04-27 2022-07-22 中国电信股份有限公司 Fault positioning method and device, and computer-storable medium
CN114912637B (en) * 2022-05-21 2023-08-29 重庆大学 Human-computer object knowledge graph manufacturing production line operation and maintenance decision method and system and storage medium
CN115366157B (en) * 2022-10-24 2023-02-03 北京奔驰汽车有限公司 Industrial robot maintenance method and device
CN116149297B (en) * 2023-01-18 2023-09-08 北京控制工程研究所 Performance-fault relation graph-based fault diagnosis capability assessment method and device
CN116910633B (en) * 2023-09-14 2024-01-23 北京科东电力控制系统有限责任公司 Power grid fault prediction method based on multi-modal knowledge mixed reasoning
CN117455745B (en) * 2023-12-26 2024-03-19 四川省大数据技术服务中心 Public safety event sensing method and system based on multidimensional fusion data analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509483A (en) * 2018-01-31 2018-09-07 北京化工大学 The mechanical fault diagnosis construction of knowledge base method of knowledge based collection of illustrative plates
CN109635280A (en) * 2018-11-22 2019-04-16 园宝科技(武汉)有限公司 A kind of event extraction method based on mark
CN110705710A (en) * 2019-04-17 2020-01-17 中国石油大学(华东) Knowledge graph-based industrial fault analysis expert system
CN111209472A (en) * 2019-12-24 2020-05-29 中国铁道科学研究院集团有限公司电子计算技术研究所 Railway accident fault association and accident fault reason analysis method and system
CN111339311A (en) * 2019-12-30 2020-06-26 智慧神州(北京)科技有限公司 Method, device and processor for extracting structured events based on generative network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033101B (en) * 2019-03-07 2021-02-12 华中科技大学 Hydroelectric generating set fault diagnosis method and system based on knowledge graph of fusion features
CN111311059B (en) * 2020-01-16 2023-08-29 成都大汇物联科技有限公司 Waterwheel house fault diagnosis method based on knowledge graph

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509483A (en) * 2018-01-31 2018-09-07 北京化工大学 The mechanical fault diagnosis construction of knowledge base method of knowledge based collection of illustrative plates
CN109635280A (en) * 2018-11-22 2019-04-16 园宝科技(武汉)有限公司 A kind of event extraction method based on mark
CN110705710A (en) * 2019-04-17 2020-01-17 中国石油大学(华东) Knowledge graph-based industrial fault analysis expert system
CN111209472A (en) * 2019-12-24 2020-05-29 中国铁道科学研究院集团有限公司电子计算技术研究所 Railway accident fault association and accident fault reason analysis method and system
CN111339311A (en) * 2019-12-30 2020-06-26 智慧神州(北京)科技有限公司 Method, device and processor for extracting structured events based on generative network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Joint Embedding Method for Entity Alignment of Knowledge Bases; Yanchao Hao; China Conference on Knowledge Graph and Semantic Computing; 2016-11-23; full text of the paper *

Also Published As

Publication number Publication date
CN112269901A (en) 2021-01-26

Similar Documents

Publication Publication Date Title
CN112269901B (en) Fault distinguishing and reasoning method based on knowledge graph
CN109918489A (en) A kind of knowledge question answering method and system of more strategy fusions
CN109949637B (en) Automatic answering method and device for objective questions
CN112069815B (en) Answer selection method and device for idiom filling-in-blank question and computer equipment
CN113191148A (en) Rail transit entity identification method based on semi-supervised learning and clustering
CN110276069A (en) A kind of Chinese braille mistake automatic testing method, system and storage medium
CN115858758A (en) Intelligent customer service knowledge graph system with multiple unstructured data identification
CN112257441A (en) Named entity identification enhancement method based on counterfactual generation
CN113239142A (en) Trigger-word-free event detection method fused with syntactic information
CN114443844A (en) Social network comment text sentiment analysis method and system fusing user sentiment tendency
CN113220768A (en) Resume information structuring method and system based on deep learning
CN113420543A (en) Automatic mathematical test question labeling method based on improved Seq2Seq model
CN115063119A (en) Recruitment decision system and method based on adaptivity of recruitment behavior data
CN113869055A (en) Power grid project characteristic attribute identification method based on deep learning
CN113312918B (en) Word segmentation and capsule network law named entity identification method fusing radical vectors
Firoozi et al. Using active learning methods to strategically select essays for automated scoring
CN114580418A (en) Knowledge map system for police physical training
CN113807519A (en) Knowledge graph construction method integrating teaching feedback and learned understanding
CN113283488A (en) Learning behavior-based cognitive diagnosis method and system
CN111966828A (en) Newspaper and magazine news classification method based on text context structure and attribute information superposition network
CN116910196A (en) Campus security emergency extraction method based on multi-task learning
CN114757183B (en) Cross-domain emotion classification method based on comparison alignment network
CN117216617A (en) Text classification model training method, device, computer equipment and storage medium
CN115935969A (en) Heterogeneous data feature extraction method based on multi-mode information fusion
CN113505603A (en) Multitask learning intelligent marking method and device suitable for judicial examination subjective questions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant