CN115129892A - Power distribution network fault disposal knowledge graph construction method and device - Google Patents

Power distribution network fault disposal knowledge graph construction method and device

Info

Publication number
CN115129892A
Authority
CN
China
Prior art keywords
data
distribution network
fault handling
defect
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210738046.8A
Other languages
Chinese (zh)
Inventor
尚磊
叶欣智
董旭柱
田野
方华亮
王波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Wuhan University WHU
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Wuhan University WHU
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Wuhan University WHU, Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202210738046.8A priority Critical patent/CN115129892A/en
Publication of CN115129892A publication Critical patent/CN115129892A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention provides a method and a device for constructing a power distribution network fault disposal knowledge graph. The method comprises the following steps: step 1, acquiring four kinds of data, namely distribution network equipment ledger data, scheduling procedure data, distribution network defect library data and fault handling data, through a regulation and control system of the distribution network, and preprocessing the data; step 2, arranging the first three kinds of data into triple form; step 3, labeling the fourth kind of data to obtain the named entities and entity types in each piece of data, yielding a labeled fault handling data set; step 4, using the labeled fault handling data set as the model training data set, constructing a BERTwithDic-BiLSTM-CRF entity recognition model by a pre-training method and fine-tuning it, recognizing complex professional entities in the power distribution network field, and arranging the fault handling data into triple form; and step 5, constructing the power distribution network fault disposal knowledge graph from the data in triple form.

Description

Power distribution network fault disposal knowledge graph construction method and device
Technical Field
The invention belongs to the technical field of power dispatching operation and maintenance, and particularly relates to a power distribution network fault handling knowledge graph construction method and device.
Background
The deep cyber-physical integration of power systems is making the fault patterns of the power distribution network increasingly complex. Existing distribution network regulation and control systems such as SCADA, DMS and D5000 mainly perform data collection, monitoring and analysis, and have accumulated a large amount of multi-source heterogeneous text data, while fault handling in the distribution network still depends mainly on the subjective decisions of regulation and control personnel. With knowledge graph technology, distribution network fault text data can be extracted and refined into knowledge and organized into a structured, visual representation that assists regulation and control personnel in making fault handling decisions.
Most existing power grid knowledge graph construction methods need a large amount of labeled data to train an entity recognition model, whereas fault handling data in the power distribution network are scarce and costly to label, and cannot support the training of a conventional deep learning model. In addition, existing power grid knowledge graph methods cannot effectively recognize the complex professional entities carrying numbers and symbols that occur in the power distribution network field, and have difficulty extracting the complex features of the text.
Disclosure of Invention
The invention aims to solve the above problems and provides a method and a device for constructing a power distribution network fault handling knowledge graph based on a BERTwithDic-BiLSTM-CRF model, which can efficiently recognize complex professional entities in the power distribution network field.
In order to achieve the purpose, the invention adopts the following scheme:
< method >
The invention provides a power distribution network fault disposal knowledge graph construction method which is characterized by comprising the following steps:
step 1, acquiring distribution network equipment ledger data, fault handling data, scheduling procedure data and distribution network defect library data through a regulation and control system of the distribution network, and performing data cleaning preprocessing on the data;
step 2, arranging the preprocessed distribution network equipment ledger data, distribution network defect library data and scheduling procedure data into a triple form;
step 3, labeling the preprocessed fault handling data to obtain a plurality of named entities and a plurality of entity types in each piece of data, and obtaining a labeled fault handling data set;
step 4, taking the fault handling data set labeled in step 3 as the model training data set, constructing a BERTwithDic-BiLSTM-CRF entity recognition model by a pre-training method, and performing fine-tuning training, wherein the BERTwithDic-BiLSTM-CRF entity recognition model comprises: a BERT layer, a Word2Vec word embedding layer, a bidirectional long short-term memory network layer (BiLSTM), a feature concatenation layer (Concatenate), a fully connected layer (Dense) and a conditional random field layer (CRF), so that complex professional entities in the power distribution network field are effectively recognized; and then arranging the fault handling data into triple form according to preset relationships between entities;
step 5, taking the distribution network equipment ledger data, distribution network defect library data and scheduling procedure data in triple form obtained in step 2 and the fault handling data in triple form obtained in step 4 as distribution network text data, and storing and visualizing the distribution network text data through a graph database to construct the power distribution network fault disposal knowledge graph.
Preferably, the method for constructing the power distribution network fault handling knowledge graph based on the BERT model provided by the invention may also have the following characteristics: in step 4, dictionary feature representations are extracted from a constructed power-domain dictionary based on a Word2Vec model; the input text is segmented into words, and the input sequence is embedded using the word vector matrix obtained by Word2Vec training, giving the dictionary feature representation vector sequence $D = [d_1, d_2, \dots, d_n]$; the power-domain dictionary comprises power professional vocabulary and the equipment names in the distribution network equipment ledger data; the output sequence $T$ of the BERT layer and the dictionary feature representation vector sequence $D$ are then concatenated to obtain the concatenated feature representation $E = [e_1, e_2, \dots, e_n]$, which is input into the BiLSTM layer to capture the context-based language features of each character, and a vector sequence $O = [o_1, o_2, \dots, o_n]$ that takes the context information of the characters into account is output.
Preferably, the method for constructing the power distribution network fault handling knowledge graph based on the BERT model provided by the invention may also have the following characteristics: in step 1, the power distribution network related data set is $\mathrm{data}_k(x, y)$, which denotes the value of the $y$-th attribute of the $x$-th record in the $k$-th class of the data set, with $k \in [1, L]$, $x \in [1, M]$, $y \in [1, N]$, where $L$ is the number of classes of distribution network related data, $M$ is the total number of records in a given class, and $N$ is the number of attributes of a given record.
Preferably, the method for constructing the power distribution network fault handling knowledge graph based on the BERT model provided by the invention may also have the following characteristics: in step 2, for the distribution network equipment ledger data, the equipment name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the actual connection relationships between different pieces of equipment:
$C_1 = (\mathrm{data}_1(x_1, y),\ R,\ \mathrm{data}_2(x_2, y))$,
where $\mathrm{data}_1(x_1, y)$ is device 1, $\mathrm{data}_2(x_2, y)$ is device 2, and $R$ is the inter-device relationship specified according to the actual connection relationship of the devices;
for the distribution network defect library data, the defect name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the relationship between each defect and the defective equipment it belongs to:
$C_2 = (\mathrm{data}_2(x_1, y),\ R,\ \mathrm{data}_1(x_1, y))$,
where $\mathrm{data}_2(x_1, y)$ is defect 1, $\mathrm{data}_1(x_1, y)$ is device 1, to which defect 1 belongs, and $R$ is the relationship specified between the defect and the defective equipment;
for the scheduling procedure data, a "keyword + short text" method is adopted to arrange the data into the following triple form:
$C_3 = (\mathrm{data}_3(x_1, y_1),\ R,\ \mathrm{data}_3(x_1, y_2))$,
where $\mathrm{data}_3(x_1, y_1)$ is a keyword in scheduling procedure record 1, $\mathrm{data}_3(x_1, y_2)$ is the short text under that keyword in scheduling procedure record 1, and $R$ is the relationship between the texts set according to the writing rules.
Preferably, the method for constructing the power distribution network fault handling knowledge graph based on the BERT model provided by the invention can also have the following characteristics: in step 5, an equipment topology knowledge graph, a fault plan knowledge graph, a defect library knowledge graph and a scheduling procedure knowledge graph are constructed for the distribution network text data based on the graph database, and the fault plan knowledge graph and the defect library knowledge graph are respectively linked to the equipment topology knowledge graph by taking a fault line and a defect line as correlation attributes.
< means >
Further, the present invention also provides a power distribution network fault handling knowledge graph construction apparatus capable of automatically implementing the above < method >, which is characterized by comprising:
the data acquisition part, which acquires the distribution network equipment ledger data, fault handling data, scheduling procedure data and distribution network defect library data through a regulation and control system of the distribution network, and performs data cleaning preprocessing on the data;
the processing part, which arranges the preprocessed distribution network equipment ledger data, distribution network defect library data and scheduling procedure data into triple form;
the labeling part, which labels the preprocessed fault handling data to obtain the named entities and entity types in each piece of data, yielding a labeled fault handling data set;
the model construction part, which uses the labeled fault handling data set as the model training data set, constructs a BERTwithDic-BiLSTM-CRF entity recognition model by a pre-training method and performs fine-tuning training, wherein the BERTwithDic-BiLSTM-CRF entity recognition model comprises: a BERT layer, a Word2Vec word embedding layer, a bidirectional long short-term memory network layer (BiLSTM), a feature concatenation layer (Concatenate), a fully connected layer (Dense) and a conditional random field layer (CRF), so that complex professional entities in the power distribution network field are effectively recognized, and which then arranges the fault handling data into triple form according to preset relationships between entities;
the graph construction part, which takes the distribution network equipment ledger data and distribution network defect library data in triple form arranged by the processing part and the fault handling data in triple form arranged by the model construction part as distribution network text data, and stores and visualizes the distribution network text data through a graph database to construct the power distribution network fault handling knowledge graph; and
the control part, which is communicatively connected with the data acquisition part, the processing part, the labeling part, the model construction part and the graph construction part and controls their operation.
Preferably, the power distribution network fault handling knowledge graph constructing device provided by the present invention may further include: and the input display part is in communication connection with the control part, enables a user to input information for detection and an operation instruction, and performs corresponding display according to the operation instruction.
Preferably, the power distribution network fault handling knowledge graph construction device provided by the invention further has the following characteristics: the input display part can display the preprocessed data acquired by the data acquisition part, the data in triple form arranged by the processing part, the fault handling data set labeled by the labeling part, the model constructed by the model construction part together with the recognized complex professional entities in the power distribution network field and the resulting data in triple form, and the power distribution network fault handling knowledge graph constructed by the graph construction part.
Preferably, the power distribution network fault handling knowledge graph construction device provided by the invention further has the following characteristics: in the model construction part, dictionary feature representations are extracted from a constructed power-domain dictionary based on a Word2Vec model; the input text is segmented into words, and the input sequence is embedded using the word vector matrix obtained by Word2Vec training, giving the dictionary feature representation vector sequence $D = [d_1, d_2, \dots, d_n]$; the power-domain dictionary comprises power professional vocabulary and the equipment names in the distribution network equipment ledger data; the output sequence $T$ of the BERT layer and the dictionary feature representation vector sequence $D$ are then concatenated to obtain the concatenated feature representation $E = [e_1, e_2, \dots, e_n]$, which is input into the BiLSTM layer to capture the context-based language features of each character, and a vector sequence $O = [o_1, o_2, \dots, o_n]$ that takes the context information of the characters into account is output.
Preferably, the power distribution network fault handling knowledge graph construction device provided by the invention can further have the following characteristics: in the processing part, for the distribution network equipment ledger data, the equipment name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the actual connection relationships between different pieces of equipment:
$C_1 = (\mathrm{data}_1(x_1, y),\ R,\ \mathrm{data}_2(x_2, y))$,
where $\mathrm{data}_1(x_1, y)$ is device 1, $\mathrm{data}_2(x_2, y)$ is device 2, and $R$ is the inter-device relationship specified according to the actual connection relationship of the devices;
for the distribution network defect library data, the defect name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the relationship between each defect and the defective equipment it belongs to:
$C_2 = (\mathrm{data}_2(x_1, y),\ R,\ \mathrm{data}_1(x_1, y))$,
where $\mathrm{data}_2(x_1, y)$ is defect 1, $\mathrm{data}_1(x_1, y)$ is device 1, to which defect 1 belongs, and $R$ is the relationship specified between the defect and the defective equipment;
for the scheduling procedure data, a "keyword + short text" method is adopted to arrange the data into the following triple form:
$C_3 = (\mathrm{data}_3(x_1, y_1),\ R,\ \mathrm{data}_3(x_1, y_2))$,
where $\mathrm{data}_3(x_1, y_1)$ is a keyword in scheduling procedure record 1, $\mathrm{data}_3(x_1, y_2)$ is the short text under that keyword in scheduling procedure record 1, and $R$ is the relationship between the texts set according to the writing rules.
Preferably, the power distribution network fault handling knowledge graph construction device provided by the invention can further have the following characteristics: in the graph construction part, an equipment topology knowledge graph, a fault plan knowledge graph, a defect library knowledge graph and a scheduling procedure knowledge graph are constructed from the distribution network text data based on a graph database, and the fault plan knowledge graph and the defect library knowledge graph are each linked to the equipment topology knowledge graph, using the fault line and the defect line respectively as the associating attribute.
Action and effects of the invention
According to the method and the device for constructing the power distribution network fault disposal knowledge graph, a BERTwithDic-BiLSTM-CRF model is constructed, in which dictionary feature representations are extracted through a Word2Vec layer to enhance the external information of power-domain entities, and the output of BERT is concatenated with the output features of the BiLSTM through a Concatenate layer. This effectively addresses the scarcity of trainable data and the high labeling cost in the power distribution network, achieves efficient recognition of complex professional entities in the power distribution network field, and makes knowledge graph application highly automated. Complex professional entities carrying numbers and symbols in the power distribution network field can be effectively recognized. The design also avoids the possible loss of part of the features obtained by BERT when the BiLSTM layer processes the output of the BERT layer, so that the complex features of text in the power distribution network field can be extracted to the greatest extent.
Drawings
FIG. 1 is a flowchart of a power distribution network fault handling knowledge graph construction method based on a BERT model according to an embodiment of the present invention;
FIG. 2 is a flow diagram of named entity recognition for fault handling data in an embodiment of the present invention;
FIG. 3 is a structure diagram of the BERTwithDic-BiLSTM-CRF model in an embodiment of the present invention;
FIG. 4 is a framework diagram of power distribution network fault handling knowledge graph construction according to an embodiment of the present invention.
Detailed Description
The following describes specific embodiments of the method and apparatus for constructing a power distribution network fault handling knowledge graph according to the present invention in detail with reference to the accompanying drawings.
< example >
As shown in fig. 1 to 4, the method for constructing a power distribution network fault handling knowledge graph based on a BERT model provided in this embodiment specifically includes:
Step 1, acquiring distribution network equipment ledger data, distribution network defect library data, scheduling procedure data and fault handling data from the power grid regulation and control system, and establishing the power distribution network related data set as follows:
$\mathrm{data}_k(x, y)$,
where $\mathrm{data}_k(x, y)$ is the value of the $y$-th attribute of the $x$-th record in the $k$-th class of the power distribution network data set, with $k \in [1, L]$, $x \in [1, M]$, $y \in [1, N]$; $L$ is the number of classes of distribution network related data, $M$ is the total number of records in a given class, and $N$ is the number of attributes of a given record;
the raw data set is cleaned as follows: 1) records lacking key attributes are deleted; 2) missing non-key attributes are filled with null values; 3) duplicate records are filtered out.
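The three cleaning rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`name`, `line`, `maker`) and the function `clean_records` are hypothetical, and the patent does not say which attributes count as key attributes.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def clean_records(records: List[Record], key_attrs: List[str],
                  all_attrs: List[str]) -> List[Record]:
    """Apply the three cleaning rules to a list of attribute dictionaries."""
    cleaned: List[Record] = []
    seen = set()
    for rec in records:
        # 1) delete records lacking any key attribute
        if any(not rec.get(attr) for attr in key_attrs):
            continue
        # 2) fill missing non-key attributes with a null value (None)
        full = {attr: (rec.get(attr) or None) for attr in all_attrs}
        # 3) filter out duplicate records
        fingerprint = tuple(full[attr] for attr in all_attrs)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(full)
    return cleaned


# Hypothetical ledger rows, used only to exercise the function.
rows = [
    {"name": "Switch A", "line": "Line 1", "maker": "X"},
    {"name": "", "line": "Line 1"},                        # missing key attribute
    {"name": "Switch B", "line": "Line 2"},                # missing non-key attribute
    {"name": "Switch A", "line": "Line 1", "maker": "X"},  # duplicate
]
cleaned_rows = clean_records(rows, ["name", "line"], ["name", "line", "maker"])
```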
Step 2, processing the different kinds of data in different ways, as follows:
for the distribution network equipment ledger data, the equipment name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the actual connection relationships between different pieces of equipment:
$C_1 = (\mathrm{data}_1(x_1, y),\ R,\ \mathrm{data}_2(x_2, y))$,
where $\mathrm{data}_1(x_1, y)$ is device 1, $\mathrm{data}_2(x_2, y)$ is device 2, and $R$ is the inter-device relationship specified according to the actual connection relationship of the devices.
For the distribution network defect library data, the defect name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the relationship between each defect and the defective equipment it belongs to:
$C_2 = (\mathrm{data}_2(x_1, y),\ R,\ \mathrm{data}_1(x_1, y))$,
where $\mathrm{data}_2(x_1, y)$ is defect 1, $\mathrm{data}_1(x_1, y)$ is device 1, to which defect 1 belongs, and $R$ is the relationship specified between the defect and the defective equipment.
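As a rough illustration of the two triple forms above, the following Python sketch builds (device 1, R, device 2) and (defect, R, device) triples. The relation names `connected_to` and `defect_of`, the record fields, and the sample values are all assumptions made for the example; the patent only says that R is specified from the actual relationships.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def device_triples(links: List[Tuple[str, str]],
                   relation: str = "connected_to") -> List[Triple]:
    """Build C_1-style triples (device 1, R, device 2) from connection pairs."""
    return [(a, relation, b) for a, b in links]


def defect_triples(defects: List[Dict[str, str]],
                   relation: str = "defect_of") -> List[Triple]:
    """Build C_2-style triples (defect, R, device) from defect records."""
    return [
        (d["defect"], relation, d["device"])
        for d in defects
        if d.get("defect") and d.get("device")
    ]


# Hypothetical sample data.
links = [("Feeder switch 1", "Transformer 1")]
defects = [{"defect": "Oil leakage", "device": "Transformer 1", "level": "general"}]
c1 = device_triples(links)
c2 = defect_triples(defects)
```

Fields other than the entity name (here `level`) would be kept as additional attributes of the entity rather than as triples.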
For the scheduling procedure data, which are unstructured but highly regular, a "keyword + short text" method is adopted to arrange the data into the following triple form according to the writing rules of the scheduling procedure:
$C_3 = (\mathrm{data}_3(x_1, y_1),\ R,\ \mathrm{data}_3(x_1, y_2))$,
where $\mathrm{data}_3(x_1, y_1)$ is a keyword in scheduling procedure record 1, $\mathrm{data}_3(x_1, y_2)$ is the short text under that keyword in scheduling procedure record 1, and $R$ is the relationship between the texts set according to the writing rules.
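The "keyword + short text" idea can be sketched as follows, assuming (hypothetically) that each clause of the procedure text has the form `keyword: short text`. The actual writing rules of the scheduling procedure are not given in the patent, so the separator, the relation name `specifies`, and the sample clauses are placeholders.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def procedure_triples(clauses: List[str], relation: str = "specifies",
                      sep: str = ":") -> List[Triple]:
    """Build C_3-style triples (keyword, R, short text) from procedure clauses."""
    triples: List[Triple] = []
    for clause in clauses:
        if sep not in clause:
            continue  # clause does not follow the assumed writing rule
        keyword, text = clause.split(sep, 1)
        keyword, text = keyword.strip(), text.strip()
        if keyword and text:
            triples.append((keyword, relation, text))
    return triples


# Hypothetical clauses, for illustration only.
clauses = [
    "Line trip: check the protection action before trial energizing",
    "General provisions",
]
c3 = procedure_triples(clauses)
```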
Step 3, using the YEDDA corpus annotation tool to label the fault handling data with BIO entity tags, where B marks the beginning of an entity, I marks the inside (continuation) of an entity, and O marks a non-entity. By analyzing the fault handling data, six entity types are defined: fault location, voltage level, fault cause, faulty equipment, fault handling, and post-fault action. Taking the fault alarm message "At 22:52, overcurrent stage-I protection of the 10 kV Blue Temple line at Youth substation operated, the switch tripped, and reclosing failed" as an example, the labeling result is {fault location: Blue Temple line}, {voltage level: 10 kV}, {fault cause: switch tripped, reclosing failed}.
The labeled fault handling data set is:
$\{\mathrm{data}_4(X, Y),\ E(x, y)\}$,
where $E(x, y)$ is the $y$-th entity of the $x$-th type, with $x \in [1, N_x]$, $y \in [1, N_y]$, $N_x$ the number of entity types, and $N_y$ the total number of entities of each type.
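To make the BIO scheme concrete, the sketch below converts entity spans into per-token BIO tags. It is not YEDDA's output format; the function `bio_tags`, the tag abbreviations (`VOLT`, `LOC`, `CAUSE`), and the word-level tokens are assumptions for readability (the patent labels Chinese text character by character).

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def bio_tags(tokens: List[str], spans: List[Span]) -> List[str]:
    """Return one BIO tag per token for the given non-overlapping entity spans."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining tokens inside the entity
    return tags


tokens = ["10", "kV", "Blue", "Temple", "line", "switch", "tripped", "at", "night"]
spans = [(0, 2, "VOLT"), (2, 5, "LOC"), (5, 7, "CAUSE")]
tags = bio_tags(tokens, spans)
```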
Step 4, constructing the BERTwithDic-BiLSTM-CRF entity recognition model shown in FIG. 3 and performing fine-tuning training, wherein the model comprises:
a BERT layer, a Word2Vec word embedding layer, a bidirectional long short-term memory network layer (BiLSTM), a feature concatenation layer (Concatenate), a fully connected layer (Dense) and a conditional random field layer (CRF);
first, the BERT layer is a model pre-trained on a general-purpose corpus; the original input text is first tokenized, a start tag [CLS] and an end tag [SEP] are inserted at the head and tail of the tokenization result, and the input sequence is then converted by the embedding layer (Embedding) into a vector sequence X that the neural network can process:
[Formula images BDA0003711588240000071 to BDA0003711588240000074 are not reproduced in this text.]
where H is the vector dimension, 768 in the BERT model; the three component terms are, respectively, the word embedding encoding of the character sequence, the position encoding of the character sequence, and the sentence (segment) encoding of the character sequence; the three encodings are mapped into a high-dimensional space of the same dimension and summed to obtain the input vector;
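Since the formula images are missing, the following is only a plausible reconstruction based on the surrounding description and on the standard BERT input representation. The symbol names $E^{tok}$, $E^{pos}$ and $E^{seg}$ are assumptions and may differ from those in the original figures; $n$ is the sequence length and $H = 768$.

```latex
X = E^{tok} + E^{pos} + E^{seg}, \qquad
X,\ E^{tok},\ E^{pos},\ E^{seg} \in \mathbb{R}^{n \times H}
```

Here $E^{tok}$ stands for the word embedding encoding, $E^{pos}$ for the position encoding, and $E^{seg}$ for the sentence (segment) encoding, all mapped to the same dimension $H$ before being summed.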
then, the vector sequence X enters the BERT layer (a stack of 12 Transformer encoders), which extracts rich text features and outputs the vector sequence $T = [t_1, t_2, \dots, t_n]$;
next, using the constructed power-domain dictionary, dictionary feature representations are extracted based on the Word2Vec model to enhance the external information of power-domain entities. The input text is likewise segmented into words, and the input sequence is embedded using the word vector matrix obtained by Word2Vec training, giving the dictionary feature representation vector sequence $D = [d_1, d_2, \dots, d_n]$. The power-domain dictionary consists of 6,000 power professional terms from Baidu Wenku, provided by the "AIIA" Cup State Grid power-domain vocabulary mining competition, together with the equipment names in the distribution network equipment ledger data;
next, the output sequence $T$ of the BERT layer and the dictionary feature representation vector sequence $D$ are concatenated to obtain the concatenated feature representation $E = [e_1, e_2, \dots, e_n]$, which is input into the BiLSTM layer to capture the context-based language features of each character, and a vector sequence $O = [o_1, o_2, \dots, o_n]$ that takes the context information of the characters into account is output;
then, the feature concatenation layer concatenates the feature representation $E$ with the output vector sequence $O$ of the BiLSTM layer to obtain the output vector sequence $T_{out} = [h_1, h_2, \dots, h_n]$, which the fully connected layer converts into a score sequence $S$ giving the score of each entity label for each character;
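The data flow of the two concatenations can be traced with a small shape-level sketch in plain Python. It assumes that $T$ and $D$ are already aligned position by position, and it takes the BiLSTM as an arbitrary callable; `toy_context_encoder` is explicitly NOT a BiLSTM, only a stand-in so the example runs without a deep learning library. All dimensions are toy values (BERT's real dimension is 768).

```python
from typing import Callable, List

Vector = List[float]
Sequence = List[Vector]


def concat_per_token(a: Sequence, b: Sequence) -> Sequence:
    """Concatenate two feature sequences position by position."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of positions")
    return [x + y for x, y in zip(a, b)]


def build_t_out(T: Sequence, D: Sequence,
                bilstm: Callable[[Sequence], Sequence]) -> Sequence:
    """E = [T; D], O = BiLSTM(E), T_out = [E; O]."""
    E = concat_per_token(T, D)      # BERT features joined with dictionary features
    O = bilstm(E)                   # context-aware features
    return concat_per_token(E, O)   # concatenation layer keeps E alongside O


def toy_context_encoder(E: Sequence) -> Sequence:
    """Stand-in for a BiLSTM: sums of features to the left and to the right."""
    return [
        [sum(sum(v) for v in E[: i + 1]), sum(sum(v) for v in E[i:])]
        for i in range(len(E))
    ]


T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy BERT outputs, dimension 3
D = [[0.5], [0.25]]                     # toy dictionary features, dimension 1
T_out = build_t_out(T, D, toy_context_encoder)
```

Because $E$ is carried through to $T_{out}$ unchanged, the BERT features remain available to the fully connected layer even if the recurrent layer does not preserve them.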
finally, the conditional random field layer automatically learns and updates a transition matrix of size $[n_{Label}, n_{Label}]$, computes the loss function from the score sequence and the transition matrix, and outputs the optimal label sequence;
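How an optimal label sequence can be obtained from a score sequence and a transition matrix is commonly done with Viterbi decoding; the patent does not name the decoding algorithm, so the sketch below is an assumption. It omits start and end transitions for brevity, and the label set (0 = O, 1 = B, 2 = I) and all numbers are toy values.

```python
from typing import List, Tuple


def viterbi(scores: List[List[float]],
            transitions: List[List[float]]) -> Tuple[List[int], float]:
    """Return the highest-scoring label path and its score.

    scores[t][j] is the score of label j at position t;
    transitions[i][j] is the score of moving from label i to label j.
    """
    n_labels = len(transitions)
    best = list(scores[0])
    backpointers: List[List[int]] = []
    for emit in scores[1:]:
        new_best, pointers = [], []
        for j in range(n_labels):
            candidates = [best[i] + transitions[i][j] for i in range(n_labels)]
            i_best = max(range(n_labels), key=candidates.__getitem__)
            new_best.append(candidates[i_best] + emit[j])
            pointers.append(i_best)
        best = new_best
        backpointers.append(pointers)
    last = max(range(n_labels), key=best.__getitem__)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


FORBIDDEN = -1e4  # discourages the invalid transition O -> I
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B
    [0.0, 0.0, 0.0],        # from I
]
scores = [
    [0.1, 2.0, 0.0],
    [0.0, 0.5, 1.5],
    [2.0, 0.0, 0.1],
]
path, score = viterbi(scores, transitions)
```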
the loss function of the BERTwithDic-BiLSTM-CRF model is as follows:
[Formula image BDA0003711588240000081 is not reproduced in this text.]
where $P_{RealPath}$ is the score of the true path, and the denominator is the total score of all possible paths.
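The formula image is missing, so the following is a hedged reconstruction based on the sentence above and on the usual CRF negative log-likelihood. The negative logarithm, the symbols $P_i$ and $N$, and the exponential form of the path score are assumptions not confirmed by the text.

```latex
\mathrm{Loss} = -\log \frac{P_{RealPath}}{P_1 + P_2 + \dots + P_N},
\qquad P_i = e^{S_i}
```

Here $S_i$ would be the sum of the label scores and transition scores along path $i$, and $N$ the number of all possible label paths. Minimizing this loss raises the share of the true path in the total score.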
During training, a pre-training plus fine-tuning approach is adopted, with the following fine-tuning strategy: 1) a small learning rate L_rate is set for the pre-trained BERT model, and a larger learning rate of 100 × L_rate is set for the BiLSTM and CRF layers; 2) hyperparameters such as batch_size, max_len and epoch are tuned repeatedly; 3) different learning rate decay strategies are tried.
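The layer-wise learning rate in item 1) can be expressed as two parameter groups, in the list-of-dictionaries shape that many optimizers accept. This is a framework-free sketch: the parameter names, the `bert.` prefix, and the value 2e-5 for L_rate are hypothetical, and placing every non-BERT parameter in the high-rate group is a simplification (the text names only the BiLSTM and CRF layers).

```python
from typing import Dict, List, Union

Group = Dict[str, Union[str, float, List[str]]]


def build_param_groups(param_names: List[str], l_rate: float,
                       head_multiplier: float = 100.0,
                       bert_prefix: str = "bert.") -> List[Group]:
    """Split parameters into a low-rate BERT group and a high-rate head group."""
    bert_params = [n for n in param_names if n.startswith(bert_prefix)]
    head_params = [n for n in param_names if not n.startswith(bert_prefix)]
    return [
        {"name": "bert", "params": bert_params, "lr": l_rate},
        {"name": "head", "params": head_params, "lr": l_rate * head_multiplier},
    ]


# Hypothetical parameter names, for illustration only.
names = ["bert.encoder.layer0.weight", "bilstm.weight_ih", "crf.transitions"]
groups = build_param_groups(names, l_rate=2e-5)
```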
The trained model is used to perform entity recognition on the fault handling data, and the fault handling data are arranged into triple form in combination with the preset relationships between entities.
Step 5, storing and visualizing the power distribution network text data arranged in triple form using a Neo4j database, and constructing an equipment topology knowledge graph, a fault plan knowledge graph, a defect library knowledge graph and a scheduling procedure knowledge graph. The equipment topology knowledge graph can be used to query the related lines and equipment after a fault occurs; the defect library knowledge graph helps locate the cause of a fault; the fault plan knowledge graph provides handling schemes for similar historical faults; and the scheduling procedure knowledge graph mainly contains general fault handling principles, scheduling rules, operation flows and the like.
In practical application, the entity recognition model performs entity recognition on real-time fault text, and knowledge is retrieved from the constructed power distribution network fault disposal knowledge graph using Cypher statements to assist the user in making fault handling decisions.
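As an illustration of storing triples and retrieving knowledge with Cypher, the sketch below only builds parameterized Cypher strings; it does not connect to a database. The node label `Entity`, the relationship type `RELATED`, and the property names are assumptions, since the patent does not describe its graph schema.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]
Statement = Tuple[str, Dict[str, str]]


def triple_to_cypher(triple: Triple) -> Statement:
    """Build a parameterized MERGE statement that stores one triple."""
    head, rel, tail = triple
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        "MERGE (h)-[:RELATED {type: $rel}]->(t)"
    )
    return query, {"head": head, "rel": rel, "tail": tail}


def neighbors_query(name: str) -> Statement:
    """Build a query that retrieves everything directly linked from an entity."""
    query = (
        "MATCH (h:Entity {name: $name})-[r:RELATED]->(t:Entity) "
        "RETURN r.type AS relation, t.name AS tail"
    )
    return query, {"name": name}


store_query, store_params = triple_to_cypher(
    ("Oil leakage", "defect_of", "Transformer 1")
)
lookup_query, lookup_params = neighbors_query("Oil leakage")
```

The relation name is stored as a relationship property because Cypher does not allow the relationship type itself to be passed as a query parameter. Each (query, parameters) pair could then be passed to a Neo4j client for execution.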
Further, the embodiment also provides a power distribution network fault handling knowledge graph construction device capable of automatically realizing the method, and the device comprises a data acquisition part, a processing part, a labeling part, a model construction part, a graph construction part, an input display part and a control part.
The data acquisition part performs step 1: it acquires the distribution network equipment ledger data, fault handling data, scheduling procedure data and distribution network defect library data through the regulation and control system of the distribution network, and applies data cleaning preprocessing to them.
The processing part performs step 2: it arranges the preprocessed distribution network equipment ledger data, distribution network defect library data and scheduling procedure data into triple form.
The labeling part executes the content described in the step 3, labels the preprocessed fault handling data to obtain a plurality of named entities and a plurality of entity types in each piece of data, and obtains a labeled fault handling data set.
The model construction part performs step 4: it takes the labeled fault handling data set as the model training data set, constructs a BERTwithDic-BiLSTM-CRF entity recognition model by a pre-training method and performs fine-tuning training. The BERTwithDic-BiLSTM-CRF entity recognition model comprises a BERT layer, a Word2Vec word embedding layer, a bidirectional long short-term memory network layer (BiLSTM), a feature concatenation layer (Concatenate), a fully connected layer (Dense) and a conditional random field layer (CRF), so that complex professional entities in the power distribution network field are effectively identified. The fault handling data are then arranged into triple form according to the preset relationships between entities.
The graph construction part performs step 5: it takes the distribution network equipment ledger data and distribution network defect library data, arranged into triple form by the processing part, together with the triple-form fault handling data, as the distribution network text data, and stores and visualizes the distribution network text data through the graph database to construct the power distribution network fault handling knowledge graph.
The input display part lets a user enter the information to be detected and operation instructions, and displays results according to the operation instructions. For example, the input display part can display the data acquired and preprocessed by the data acquisition part, the triple-form data arranged by the processing part, the fault handling data set labeled by the labeling part, the model constructed by the model construction part together with the complex professional entities it identifies in the power distribution network field and the resulting triple-form data, and the power distribution network fault handling knowledge graph constructed by the graph construction part.
The control part is communicatively connected to the data acquisition part, the processing part, the labeling part, the model construction part, the graph construction part and the input display part, and controls their operation.
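How the control part could sequence the other parts is sketched below. The class `ControlPart`, its constructor arguments and the stub callables are hypothetical; each stub stands in for a part described above and returns toy data so that the flow of steps 1 to 5 is visible.

```python
class ControlPart:
    """Runs the parts in the order of steps 1 to 5."""

    def __init__(self, acquire, process, label, build_model, build_graph):
        self.acquire = acquire
        self.process = process
        self.label = label
        self.build_model = build_model
        self.build_graph = build_graph

    def run(self):
        data = self.acquire()                      # step 1: acquire and clean
        structured = self.process(data)            # step 2: structured data to triples
        labeled = self.label(data)                 # step 3: label fault handling data
        fault_triples = self.build_model(labeled)  # step 4: train, recognize, to triples
        return self.build_graph(structured + fault_triples)  # step 5: build graph


# Stub parts with toy data (assumed shapes, for illustration only).
control = ControlPart(
    acquire=lambda: {
        "ledger": [("Line 12", "connected_to", "Switch 3")],
        "fault_text": ["Line 12 tripped"],
    },
    process=lambda data: list(data["ledger"]),
    label=lambda data: [(t, ["O"] * len(t.split())) for t in data["fault_text"]],
    build_model=lambda labeled: [("Line 12", "has_fault", "trip")],
    build_graph=lambda triples: set(triples),
)
graph = control.run()
print(graph)
```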
The above embodiments are merely illustrative of the technical solutions of the present invention. The method and apparatus for constructing a power distribution network fault handling knowledge graph according to the present invention are not limited to the content described in the above embodiments, but are subject to the scope defined by the claims. Any modification or supplement or equivalent replacement made by a person skilled in the art on the basis of this embodiment is within the scope of the invention as claimed in the claims.

Claims (10)

1. A power distribution network fault handling knowledge graph construction method, characterized by comprising the following steps:
step 1, acquiring distribution network equipment ledger data, fault handling data, scheduling procedure data and distribution network defect library data through the regulation and control system of the distribution network, and performing data cleaning preprocessing on the data;
step 2, arranging the preprocessed distribution network equipment ledger data, distribution network defect library data and scheduling procedure data into triple form;
step 3, labeling the preprocessed fault handling data to obtain a plurality of named entities and a plurality of entity types in each piece of data, thereby obtaining a labeled fault handling data set;
step 4, taking the fault handling data set labeled in step 3 as the model training data set, constructing a BERTwithDic-BiLSTM-CRF entity recognition model by a pre-training method, and performing fine-tuning training, wherein the BERTwithDic-BiLSTM-CRF entity recognition model comprises: a BERT layer, a Word2Vec word embedding layer, a bidirectional long short-term memory network layer (BiLSTM), a feature concatenation layer (Concatenate), a fully connected layer (Dense) and a conditional random field layer (CRF), so that complex professional entities in the power distribution network field are effectively identified; and then arranging the fault handling data into triple form according to the preset relationships between entities;
step 5, taking the triple-form distribution network equipment ledger data, distribution network defect library data and scheduling procedure data obtained in step 2 and the triple-form fault handling data obtained in step 4 as distribution network text data, and storing and visualizing the distribution network text data through a graph database to construct the power distribution network fault handling knowledge graph.
2. The power distribution network fault handling knowledge graph construction method according to claim 1, wherein:
in step 4, dictionary feature representations are extracted from a constructed power-domain dictionary based on a Word2Vec model: the input text is segmented into words, and the word-vector matrix obtained by Word2Vec training is used to perform word embedding on the input sequence, giving the dictionary feature representation vector sequence $D = [d_1, d_2, \ldots, d_n]$; the power-domain dictionary comprises power-industry professional vocabulary and the equipment names in the distribution network equipment ledger data; the output sequence $T$ of the BERT layer is then concatenated with the dictionary feature representation vector sequence $D$ to obtain the spliced feature representation $E = [e_1, e_2, \ldots, e_n]$, which is input into the BiLSTM layer to capture the context-dependent linguistic features of each character, and a vector sequence $O = [o_1, o_2, \ldots, o_n]$ that takes each character's context information into account is output.
3. The power distribution network fault handling knowledge graph construction method according to claim 1, characterized in that:
in step 1, the power distribution network data set is denoted data, where $\mathrm{data}_k(x, y)$ represents the value of the $y$-th attribute of the $x$-th record of the $k$-th class in the power distribution network data set, with $k \in [1, L]$, $x \in [1, M]$ and $y \in [1, N]$; $L$ is the number of classes of power distribution network data, $M$ is the total number of records in a given class, and $N$ is the number of attributes of a given record.
4. The power distribution network fault handling knowledge graph construction method according to claim 1, characterized in that:
in step 2, for the distribution network equipment ledger data, the equipment name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into triple form according to the actual connection relationships between different pieces of equipment, as follows:
$C_1 = (\mathrm{data}_1(x_1, y),\ R,\ \mathrm{data}_2(x_2, y))$,
where $\mathrm{data}_1(x_1, y)$ is device 1, $\mathrm{data}_2(x_2, y)$ is device 2, and $R$ is the inter-device relationship specified according to the actual connection relationship of the devices;
for the distribution network defect library data, the defect name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the relationship between a defect and the equipment it belongs to:
$C_2 = (\mathrm{data}_2(x_1, y),\ R,\ \mathrm{data}_1(x_1, y))$,
where $\mathrm{data}_2(x_1, y)$ is defect 1, $\mathrm{data}_1(x_1, y)$ is the device 1 to which defect 1 relates, and $R$ is the relationship specified according to the relationship between the defect and the defective equipment;
for the scheduling procedure data, a keyword plus short-text method is used to arrange the data into the following triple form:
$C_3 = (\mathrm{data}_3(x_1, y_1),\ R,\ \mathrm{data}_3(x_1, y_2))$,
where $\mathrm{data}_3(x_1, y_1)$ is a keyword in scheduling procedure data 1, $\mathrm{data}_3(x_1, y_2)$ is the short text under that keyword in scheduling procedure data 1, and $R$ is the relationship between the texts set according to the writing rules.
5. The power distribution network fault handling knowledge graph construction method according to claim 1, characterized in that:
in step 5, an equipment topology knowledge graph, a fault plan knowledge graph, a defect library knowledge graph and a scheduling procedure knowledge graph are constructed for the distribution network text data based on the graph database, and the fault plan knowledge graph and the defect library knowledge graph are respectively linked to the equipment topology knowledge graph by taking a fault line and a defect line as correlation attributes.
6. A power distribution network fault handling knowledge graph construction device, characterized by comprising:
a data acquisition part, which acquires distribution network equipment ledger data, fault handling data, scheduling procedure data and distribution network defect library data through the regulation and control system of the distribution network, and performs data cleaning preprocessing on the data;
a processing part, which arranges the preprocessed distribution network equipment ledger data, distribution network defect library data and scheduling procedure data into triple form;
a labeling part, which labels the preprocessed fault handling data to obtain a plurality of named entities and a plurality of entity types in each piece of data, thereby obtaining a labeled fault handling data set;
a model construction part, which takes the labeled fault handling data set as the model training data set, constructs a BERTwithDic-BiLSTM-CRF entity recognition model by a pre-training method and performs fine-tuning training, wherein the BERTwithDic-BiLSTM-CRF entity recognition model comprises: a BERT layer, a Word2Vec word embedding layer, a bidirectional long short-term memory network layer (BiLSTM), a feature concatenation layer (Concatenate), a fully connected layer (Dense) and a conditional random field layer (CRF), so that complex professional entities in the power distribution network field are effectively identified, and which then arranges the fault handling data into triple form according to the preset relationships between entities;
a graph construction part, which stores and visualizes distribution network text data through a graph database to construct the power distribution network fault handling knowledge graph, wherein the triple-form distribution network equipment ledger data, distribution network defect library data and scheduling procedure data arranged by the processing part are used as the distribution network text data; and
a control part, which is communicatively connected to the data acquisition part, the processing part, the labeling part, the model construction part and the graph construction part, and controls their operation.
7. The power distribution network fault handling knowledge graph construction device of claim 6, further comprising:
an input display part, which is communicatively connected to the control part, lets a user enter the information to be detected and operation instructions, and displays results according to the operation instructions.
8. The power distribution network fault handling knowledge graph construction device of claim 6, wherein:
the input display part can display the data acquired and preprocessed by the data acquisition part, the triple-form data arranged by the processing part, the fault handling data set labeled by the labeling part, the model constructed by the model construction part together with the identified complex professional entities in the power distribution network field and the resulting triple-form data, and the power distribution network fault handling knowledge graph constructed by the graph construction part.
9. The power distribution network fault handling knowledge graph construction device of claim 6, wherein:
in the model construction part, dictionary feature representations are extracted from a constructed power-domain dictionary based on a Word2Vec model: the input text is segmented into words, and the word-vector matrix obtained by Word2Vec training is used to perform word embedding on the input sequence, giving the dictionary feature representation vector sequence $D = [d_1, d_2, \ldots, d_n]$; the power-domain dictionary comprises power-industry professional vocabulary and the equipment names in the distribution network equipment ledger data; the output sequence $T$ of the BERT layer is then concatenated with the dictionary feature representation vector sequence $D$ to obtain the spliced feature representation $E = [e_1, e_2, \ldots, e_n]$, which is input into the BiLSTM layer to capture the context-dependent linguistic features of each character, and a vector sequence $O = [o_1, o_2, \ldots, o_n]$ that takes each character's context information into account is output.
10. The power distribution network fault handling knowledge graph construction device of claim 6, wherein:
in the processing part, for the distribution network equipment ledger data, the equipment name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into triple form according to the actual connection relationships between different pieces of equipment, as follows:
$C_1 = (\mathrm{data}_1(x_1, y),\ R,\ \mathrm{data}_2(x_2, y))$,
where $\mathrm{data}_1(x_1, y)$ is device 1, $\mathrm{data}_2(x_2, y)$ is device 2, and $R$ is the inter-device relationship specified according to the actual connection relationship of the devices;
for the distribution network defect library data, the defect name is taken as the entity and the remaining fields as additional attributes, and the data are arranged into the following triple form according to the relationship between a defect and the equipment it belongs to:
$C_2 = (\mathrm{data}_2(x_1, y),\ R,\ \mathrm{data}_1(x_1, y))$,
where $\mathrm{data}_2(x_1, y)$ is defect 1, $\mathrm{data}_1(x_1, y)$ is the device 1 to which defect 1 relates, and $R$ is the relationship specified according to the relationship between the defect and the defective equipment;
for the scheduling procedure data, a keyword plus short-text method is used to arrange the data into the following triple form:
$C_3 = (\mathrm{data}_3(x_1, y_1),\ R,\ \mathrm{data}_3(x_1, y_2))$,
where $\mathrm{data}_3(x_1, y_1)$ is a keyword in scheduling procedure data 1, $\mathrm{data}_3(x_1, y_2)$ is the short text under that keyword in scheduling procedure data 1, and $R$ is the relationship between the texts set according to the writing rules.
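The three triple forms recited above (equipment to equipment, defect to equipment, keyword to short text) can be illustrated with a short sketch. The record layouts, field names and relation labels (`connected_to`, `belongs_to`, `specifies`) are assumptions for illustration; the claims only fix the (head, R, tail) shape.

```python
def ledger_triples(devices):
    """C1: (device 1, R, device 2) for each actual connection between devices."""
    return [
        (dev["name"], "connected_to", other)
        for dev in devices
        for other in dev.get("connected_to", [])
    ]


def defect_triples(defects):
    """C2: (defect, R, device the defect relates to)."""
    return [(d["name"], "belongs_to", d["device"]) for d in defects]


def procedure_triples(procedures):
    """C3: (keyword, R, short text under that keyword)."""
    return [
        (p["keyword"], "specifies", text)
        for p in procedures
        for text in p["short_texts"]
    ]


# Toy records standing in for ledger, defect library and scheduling procedure data.
devices = [{"name": "Line 12", "connected_to": ["Switch 3"]}, {"name": "Switch 3"}]
defects = [{"name": "insulator crack", "device": "Line 12"}]
procedures = [{"keyword": "line trip", "short_texts": ["check protection action"]}]

c1 = ledger_triples(devices)
c2 = defect_triples(defects)
c3 = procedure_triples(procedures)
print(c1, c2, c3)
```

The remaining fields of each record would be stored as additional attributes on the entity rather than as separate triples.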

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210738046.8A CN115129892A (en) 2022-06-24 2022-06-24 Power distribution network fault disposal knowledge graph construction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210738046.8A CN115129892A (en) 2022-06-24 2022-06-24 Power distribution network fault disposal knowledge graph construction method and device

Publications (1)

Publication Number Publication Date
CN115129892A true CN115129892A (en) 2022-09-30

Family

ID=83379297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210738046.8A Pending CN115129892A (en) 2022-06-24 2022-06-24 Power distribution network fault disposal knowledge graph construction method and device

Country Status (1)

Country Link
CN (1) CN115129892A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115658930A (en) * 2022-12-26 2023-01-31 中建科技集团有限公司 Production line fault analysis method and device based on knowledge graph and storage medium
CN116010896A (en) * 2023-02-03 2023-04-25 南京南瑞继保电气有限公司 Wind driven generator fault diagnosis method based on countermeasure training and transducer
CN117650623A (en) * 2023-11-22 2024-03-05 广东天汇储能科技有限公司 Lithium battery energy storage power station operation and maintenance method and device based on light-weight large model

Similar Documents

Publication Publication Date Title
CN112269901B (en) Fault distinguishing and reasoning method based on knowledge graph
CN115129892A (en) Power distribution network fault disposal knowledge graph construction method and device
CN110598203A (en) Military imagination document entity information extraction method and device combined with dictionary
CN108549658A (en) A kind of deep learning video answering method and system based on the upper attention mechanism of syntactic analysis tree
CN114926150B (en) Digital intelligent auditing method and device for transformer technology compliance assessment
CN114781392A (en) Text emotion analysis method based on BERT improved model
CN112883693A (en) Method and terminal for automatically generating electric power work ticket
CN112329767A (en) Contract text image key information extraction system and method based on joint pre-training
CN114238652A (en) Industrial fault knowledge map establishing method for end-to-end scene
CN109299470A (en) The abstracting method and system of trigger word in textual announcement
CN115455194A (en) Knowledge extraction and analysis method and device for railway faults
CN116910633A (en) Power grid fault prediction method based on multi-modal knowledge mixed reasoning
CN113656569B (en) Context information reasoning-based generation type dialogue method
CN113065352B (en) Method for identifying operation content of power grid dispatching work text
CN116522165B (en) Public opinion text matching system and method based on twin structure
CN117131856A (en) Traffic accident text causal relation extraction method based on problem guidance
CN115270774B (en) Big data keyword dictionary construction method for semi-supervised learning
CN110619877A (en) Voice recognition man-machine interaction method, device and system applied to laser pen and storage medium
CN116523042A (en) Combined extraction method and system for power grid dispatching entity relationship
CN114579706B (en) Automatic subjective question review method based on BERT neural network and multi-task learning
CN115936001A (en) Power grid IT operation and maintenance entity identification method and system based on BERT-BilSTM-CRF model and attention
CN113590745B (en) Interpretable text inference method
CN115098687A (en) Alarm checking method and device for scheduling operation of electric power SDH optical transmission system
CN114298339A (en) Intelligent decision-making method and system for substation equipment alarm
CN110955768B (en) Question-answering system answer generation method based on syntactic analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination