CN115438190A - Power distribution network fault decision-making assisting knowledge extraction method and system - Google Patents


Info

Publication number
CN115438190A
Authority
CN
China
Prior art keywords
data
text data
entities
weight
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211086406.7A
Other languages
Chinese (zh)
Other versions
CN115438190B (en)
Inventor
李智
刘正祎
李默涵
张瑶瑶
张海
倪玉露
刘鑫蕊
裴玉杰
金银龙
王野
袁明阳
路学文
贾俊海
吴厚毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fushun Power Supply Co Of State Grid Liaoning Electric Power Supply Co ltd
State Grid Corp of China SGCC
Original Assignee
Fushun Power Supply Co Of State Grid Liaoning Electric Power Supply Co ltd
State Grid Corp of China SGCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fushun Power Supply Co Of State Grid Liaoning Electric Power Supply Co ltd, State Grid Corp of China SGCC filed Critical Fushun Power Supply Co Of State Grid Liaoning Electric Power Supply Co ltd
Priority to CN202211086406.7A priority Critical patent/CN115438190B/en
Publication of CN115438190A publication Critical patent/CN115438190A/en
Application granted granted Critical
Publication of CN115438190B publication Critical patent/CN115438190B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • G06N5/025Extracting rules from data
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a power distribution network fault auxiliary decision knowledge extraction method and system. Original text data are collated and aggregated, then vectorized to form a word vector set; entity extraction is performed on the word vector set and the obtained entities are labeled; a multi-classification principle is used to assign the labeled attribute relations of the same entity to relation classes, completing entity relation extraction; the relations between adjacent labeled entities are trained, labeling errors between entities are repaired, and the structured association relations between the repaired entities are output; the knowledge extraction result is evaluated according to the error range between the original text data and the structured association relations of the repaired entities. The system comprises a data preprocessing module, a Bi-LSTM module, a weight association model, an error correction module and a model evaluation module. The method and system extract knowledge from power distribution network fault handling texts, stay closer to the original semantics, and help optimize the knowledge extraction process.

Description

Power distribution network fault decision-making assisting knowledge extraction method and system
Technical Field
The invention relates to the technical field of power transmission and distribution, in particular to a power distribution network fault auxiliary decision knowledge extraction method and system.
Background
As national energy technology reform accelerates, power systems and users place increasingly strict requirements on electric energy transmission and distribution, which raises the demands and challenges for power grid dispatching operation. In the current power market environment, the grid faces geographically dispersed end users, large-scale grid connection of renewable energy, and the need to guarantee power supply reliability. In emergencies such as sudden faults, relying solely on dispatchers to analyze and calculate real-time fault data and to screen decisions from textual data and empirical knowledge wastes considerable resources and greatly prolongs the fault duration. In addition, the big data of the power Internet of Things is diverse and voluminous, so misjudgment and misoperation occur easily across different task scenarios, which affects the operating stability and reliability of the power system.
Disclosure of Invention
In order to improve the stability and reliability of the operation of the power system, make accurate judgment and operation under different task scenes and avoid prolonging the fault time, the invention provides a power distribution network fault auxiliary decision knowledge extraction method and system.
The adopted technical scheme is as follows:
In one aspect, the invention provides a power distribution network fault auxiliary decision knowledge extraction method, which comprises: vectorizing the obtained original text data to form a word vector set that retains the original semantics;
performing entity extraction on the word vector set, and labeling the obtained entities;
assigning, by a multi-classification principle, the labeled attribute relations of the same entity to relation classes, to complete the labeling of the entity's various attribute relations;
training the relations between adjacent labeled entities, repairing labeling errors between entities, and outputting the structured association relations between the repaired entities;
and evaluating the knowledge extraction result according to the error range between the original text data and the structured association relations of the repaired entities.
Further, vectorizing the obtained original text data to form a word vector set that retains the original semantics comprises: collating and summarizing the non-text data in the input original data into text data, either manually or with character conversion software; segmenting the text data with Python code, using punctuation marks as delimiters, to form segmented text data; and vectorizing the segmented text data with a word vector training tool, collecting the vectorized data into a data set after multiple cycles, to form a word vector set that retains the original semantics.
Preferably, the non-text data includes one or more of operating procedures, treatment plans, scheduling procedures, tables of fault information and scheduling instructions, pictures and voice.
Further, after the non-text data in the input original data are collated and summarized into text data manually or with character conversion software, missing value processing, abnormal value processing, repeated value processing and noise filtering are performed on the text data.
Further, performing entity extraction on the word vector set and labeling the obtained entities includes: learning the input with a Bi-LSTM combined model, extracting entities through the LSTM network, and labeling the entities extracted from the text data with the BIEOS entity labeling method. Specifically:
the LSTM network receives the word vector set as input and performs learning; it comprises a receiving gate $i_t$, a discard gate $f_t$, a result gate $o_t$ and a data recording gate. The discard gate processes the text data to be discarded, and the discarded content is determined by the following formula (reconstructed in standard LSTM notation from the definitions below; $\sigma$ is the Sigmoid function):
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right)$$
wherein:
$x_t$ is the receiving variable at time $t$;
$h_{t-1}$ is the deep-layer result of the previous period $t-1$;
$W_f$ is the weight of $x_t$;
$U_f$ is the weight of $h_{t-1}$;
$b_f$ is an offset;
the receiving gate $i_t$ calculates the information to be stored when the LSTM network performs a cell update, using the following formulas (reconstructed in standard LSTM notation from the definitions below):
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right)$$
$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
wherein:
$W_i$ is the weight of $x_t$;
$U_i$ is the weight of $h_{t-1}$;
$\tilde{c}_t$ is the hypothetical (candidate) cell state;
$W_c$ is the $x_t$ weight of $\tilde{c}_t$;
$U_c$ is the $h_{t-1}$ weight of $\tilde{c}_t$;
$b_i$ and $b_c$ are the error amounts (biases) of $i_t$ and $\tilde{c}_t$, respectively;
$c_t$ is the current cell state;
$c_{t-1}$ is the cell state of the previous period;
$\sigma(\cdot)$, as used in the discard gate $f_t$ and in $i_t$, denotes the activation vector (CAV) obtained by transforming the parameters with the Sigmoid function;
$\tanh(\cdot)$, as used in $\tilde{c}_t$, denotes the activation vector (CAV) obtained by transforming the parameters with the tanh function;
the result gate $o_t$ outputs the entities extracted by the Bi-LSTM combined model (formulas reconstructed in standard LSTM notation from the definitions below):
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right)$$
$$h_t = o_t \odot \tanh\left(c_t\right)$$
in the formula:
$W_o$ is the weight of $x_t$;
$U_o$ is the weight of $h_{t-1}$;
$b_o$ is the error amount (bias);
$h_t$ is the output result of the LSTM network;
$\sigma(\cdot)$, as used in $o_t$, denotes the activation vector (CAV) obtained by transforming the parameters with the Sigmoid function.
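The gate computations described above can be sketched for a single time step as follows. This is a minimal illustration in standard LSTM notation with scalar states; the function name `lstm_step`, the parameter layout and all numeric values are assumptions made for illustration and are not taken from the patent.

```python
import math


def sigmoid(z):
    """Logistic (Sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to its (W, U, b) triple."""
    def pre(gate):
        W, U, b = p[gate]
        return W * x_t + U * h_prev + b

    f_t = sigmoid(pre("f"))           # discard (forget) gate
    i_t = sigmoid(pre("i"))           # receiving (input) gate
    c_hat = math.tanh(pre("c"))       # hypothetical (candidate) cell state
    c_t = f_t * c_prev + i_t * c_hat  # updated cell state
    o_t = sigmoid(pre("o"))           # result (output) gate
    h_t = o_t * math.tanh(c_t)        # output of the step
    return h_t, c_t


# Hypothetical parameters: the same (W, U, b) for every gate.
params = {gate: (0.5, 0.5, 0.0) for gate in "fico"}
h, c = lstm_step(1.0, 0.0, 0.0, params)
```

With a zero previous state, the discard gate has nothing to discard, so the new cell state is simply the receiving gate times the candidate state.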
Further, a weight association mechanism is introduced to improve the weight parameters; the text data are trained under different weight parameters, and key content is screened for entity relationship extraction. Specifically:
the input text data are learned and trained with the following formula, and are selectively processed in parallel:
[Formula image not recoverable from the source text.]
in the formula, the symbols (lost with the formula image) denote, in order:
the relative importance of a hidden state;
the error amount (bias) of a given vector;
the weight automatically assigned by the weight association model;
the number of independent parameters in the Bi-LSTM network;
the activation vector (CAV) obtained by transforming the parameters with the tanh function;
the forward association of each text datum;
the backward association of each text datum;
an operator;
the obtained data are then substituted into the following formula to determine the structural relationship among the different extracted entities:
[Formula image not recoverable from the source text.]
in the formula, the symbols denote, in order:
the final output of the weight association model;
the weight that the weight association model assigns, at a given time, to a hidden state.
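The idea of assigning a weight to each hidden state and combining the states accordingly can be sketched as a softmax-weighted pool. The patent's own formula image is not available in this text, so the scoring form below (a tanh transform followed by a dot product with a context vector), the function name `attention_pool` and all values are assumptions for illustration only.

```python
import math


def attention_pool(hidden_states, w, b, context):
    """Return (weights, pooled vector) for a list of hidden-state vectors.

    Assumed scoring form: score_i = context . tanh(w * h_i + b),
    with w and b applied elementwise.
    """
    scores = []
    for h in hidden_states:
        u = [math.tanh(wj * hj + bj) for wj, hj, bj in zip(w, h, b)]
        scores.append(sum(cj * uj for cj, uj in zip(context, u)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[j] for a, h in zip(alphas, hidden_states)) for j in range(dim)
    ]
    return alphas, pooled


states = [[0.1, 0.2], [0.9, 0.8], [0.1, 0.2]]
alphas, pooled = attention_pool(
    states, w=[1.0, 1.0], b=[0.0, 0.0], context=[1.0, 1.0]
)
```

The weights sum to one, and the state with the largest score contributes most to the pooled output, which is how key content can be emphasized before relation extraction.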
Further, decoupling analysis is performed using the interrelation between adjacent labeled entities, so that the globally optimal sequence of the output data is solved and knowledge extraction from the power distribution network fault handling text is completed in turn; the probability of a correct output is calculated according to the following formula:
[Formula image not recoverable from the source text.]
wherein the symbols (lost with the formula image) denote, in order:
the result value (output) of the Linear-Chain model in the error correction module;
the received value (input) of the Linear-Chain model in the error correction module;
the emission probability;
the transition probability;
the number of elements in the parameter vector.
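Solving for a globally optimal label sequence from emission and transition scores is commonly done with the Viterbi algorithm. The sketch below assumes that technique; the patent text does not name the decoding algorithm, and the function name `viterbi`, the tag set and all scores are hypothetical.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions[t][tag]  : score of `tag` at position t
    transitions[a][b]  : score of moving from tag a to tag b
    """
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = ["B", "E", "O"]
# Hypothetical scores: position-wise, "B" looks best at both positions.
emissions = [
    {"B": 2.0, "E": 0.0, "O": 0.5},
    {"B": 1.0, "E": 0.8, "O": 0.2},
]
# "B" must not be followed by "B" or "O"; "E" may only follow "B".
transitions = {
    "B": {"B": -5.0, "E": 1.0, "O": -5.0},
    "E": {"B": 0.0, "E": -5.0, "O": 0.0},
    "O": {"B": 0.0, "E": -5.0, "O": 0.0},
}
greedy = [max(e, key=e.get) for e in emissions]
decoded = viterbi(emissions, transitions, tags)
```

Picking the best tag at each position independently yields the invalid sequence B, B, whereas taking the relation between adjacent labels into account repairs it to B, E.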
Further, the quality of the auxiliary decision knowledge extraction is evaluated with a reward and punishment mechanism, according to the error range between the obtained original text data and the structured association relations of the repaired entities. The evaluation function is as follows:
[Formula image not recoverable from the source text.]
wherein F is the reward and punishment result; the other symbols, lost with the formula image, denote in order:
the error threshold range;
the entity sequence error value;
the relation sequence error value;
the entity weight bias coefficient;
the relation weight bias coefficient;
the following formula is then combined to obtain the evaluation result of the decision reference value:
[Formula image not recoverable from the source text.]
wherein the symbols denote, in order:
the number of parameters of a quantity whose symbol is lost;
the total number of F;
the system error rate, expressed as a percentage.
In another aspect, the invention also provides a power distribution network fault auxiliary decision knowledge extraction system, which comprises the following modules:
the data preprocessing module is used for vectorizing the collated and aggregated original text data to form a word vector set that retains the original semantics;
the Bi-LSTM module is used for performing entity extraction and labeling of multiple attribute relations on the word vector set output by the data preprocessing module;
the error correction module is used for training the relations between adjacent labeled entities in the Bi-LSTM module, repairing the labeling errors present in the Bi-LSTM module, and outputting the structured association relations between the repaired entities;
and the model evaluation module is used for evaluating the accuracy of the model according to the error range between the collated and aggregated original text data and the structured association relations of the repaired entities.
Further, the system is also provided with a weight association model, which screens the weights of the entities extracted from the input text data, identifies and judges the relations among the entities, and extracts those relations; the structured association relations among the entities are then repaired by the error correction module.
The technical scheme of the invention has the following advantages:
A. The invention vectorizes the collated and summarized original text data to realize entity extraction and relation extraction, and adds the repair of labeling errors between entities to the extraction method. By training the relations between adjacent labeled entities, the globally optimal sequence can be solved for the output data of the weight association model. Knowledge extraction from power distribution network fault handling texts is thereby realized, system accuracy is improved, the result stays closer to the original semantics, and the knowledge extraction process is optimized.
B. The invention provides a power distribution network fault auxiliary decision knowledge extraction method and system based on weight association and error correction. It deeply learns the entities in power system text data together with their logic, organizational structure, operations and constraints, and analyzes fault states and fault data parameters in real time. Combined with natural language processing, it converts unstructured content such as operating procedures, treatment plans, scheduling procedures and fault information into a structured, inferable knowledge network representation. Relying on the data preprocessing module, the Bi-LSTM module, the weight association model, the error correction module and the model evaluation module, it processes unstructured and semi-structured data so that the text data are vectorized while their original semantics are retained, and outputs auxiliary decisions within a relatively short time. This is the greatest advantage of applying knowledge graphs in the power field: it helps power system dispatchers make decisions in the power, dispatching, and operation and inspection fields and complete the series of emergency response plans for distribution network fault handling, thereby advancing the intelligent control and dispatching of the power system.
C. The system provided by the invention lays a foundation for the subsequent construction of a power system fault decision knowledge graph. It maps the power grid topology (for example, the locations of power equipment and components and other related information form a natural topological graph) onto the knowledge graph, completing the preliminary work and ensuring the completeness of the knowledge graph's decision items. Compared with the traditional manual dispatching decision process, the system benefits from cloud computing and AI technology and can save considerable manpower, financial and material resources. Combined with review by professional dispatchers, it shortens the fault handling dispatching decision time while improving the professionalism and accuracy of decisions, enables low-cost, fast-response deployment and conversion, and maximizes the benefit to the power grid company.
Drawings
In order to more clearly illustrate the embodiments of the present invention, the drawings which are needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained from the drawings without inventive labor to those skilled in the art.
FIG. 1 is a flow chart of a power distribution network fault assistant decision knowledge extraction method provided by the invention;
FIG. 2 is a flow chart of a Bi-LSTM module provided by the present invention;
FIG. 3 is a flowchart of the weight relevance model provided by the present invention;
FIG. 4 is a flow chart of the linear chain model provided by the present invention;
FIG. 5 is a comparison of results with and without the error correction module provided by the present invention;
FIG. 6 is a flow chart of accident history knowledge extraction provided by the present invention;
FIG. 7 is a structural diagram of a power distribution network fault assistant decision knowledge extraction system provided by the invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, the invention provides a power distribution network fault decision-making assisting knowledge extraction method, which comprises the following steps:
[S01] Vectorize the collated and aggregated original text data to form a word vector set that retains the original semantics.
Because the data structure in the power system contains information such as entities, logics, organizational structures, operations, constraints and the like, and contains unstructured contents such as operation rules, treatment plans, scheduling rules, fault information and the like, the unstructured and semi-structured data can be converted into a structural knowledge network expression mode which can be inferred through a natural language processing technology, so that the original semantics of the text data can be kept while vectorization of the text data is carried out, auxiliary decisions are output within a relatively short time range, power system scheduling personnel are helped to carry out power distribution network fault processing, and the intelligent control and scheduling process of the power system is promoted.
The processing method for vectorizing the original data to form a word vector set that retains the original semantics comprises: collating and summarizing the input non-text data into text data, either manually or with character conversion software; segmenting the text data with Python code, using punctuation marks as delimiters; and vectorizing the segmented text data with a word vector training tool, collecting the data into a data set for storage after multiple cycles, to form a word vector set that retains the original semantics. The word vector training tool may be Word2Vec, fastText, or the like.
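The segmentation step can be sketched with the standard `re` module. The function name `split_sentences` and the choice of delimiter characters are assumptions for illustration; the patent only states that punctuation marks serve as the identifiers.

```python
import re


def split_sentences(text):
    """Split text after Chinese or Western sentence-level punctuation."""
    # The lookbehind keeps each punctuation mark attached to its segment.
    parts = re.split(r"(?<=[。！？；.!?;])\s*", text)
    return [p.strip() for p in parts if p.strip()]


segments = split_sentences(
    "Line 3 tripped. Reclosing failed; isolate the faulted section!"
)
```

A rule this simple also splits at decimal points, so numeric text such as voltage or current readings would need extra handling before segmentation.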
After the non-text data in the original data are converted and collated into text data, the text data are further cleaned in four steps: missing value processing, abnormal value processing, repeated value processing and noise filtering.
For text data, when a missing value exists in a record, different operations may be performed according to the number of attributes of the record. When the number of attributes is small (for example, only one associated attribute or none), the deletion method is selected, that is, the record is deleted directly. This way of handling missing data suits cases where the number of samples is large and the missing data make up a small percentage of the total, and it is simple and easy to operate. When the record has many attributes, the missing value may be filled by interpolation: the maximum likelihood estimation method is tried first, the mean interpolation method second, and the regression interpolation method last. The maximum likelihood estimation interpolation method calls the sample information in the existing database and, reasoning in reverse, calculates the parameter most likely to have produced the observed result. The method has a wide application range, covers most data conditions, and gives effective and consistent interpolation values. Its likelihood function is expressed as (formula image lost in the source; written here in standard notation for $n$ independent samples $x_i$ with density $p$):
$$L(\theta) = \prod_{i=1}^{n} p(x_i;\theta)$$
When the parameter takes the value $\hat{\theta}$ at which the function $L(\theta)$ reaches its maximum, the parameter $\hat{\theta}$ is the estimate sought.
The mean interpolation method collects the samples in the database into a data set and then either calls a weighted average function on it, or sorts and screens it to find the most frequently repeated value, and uses the result to replace the original data as the interpolation value. The calculation can be done using the following simplified equation:
[Formula image not recoverable from the source text.]
wherein the response parameter characterizes the response condition: a value of 1 is the default "yes" state and a value of 0 is the default "no" state; the remaining parameter characterizes the quantity.
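The two variants described for mean interpolation (an average, or the most frequently repeated value) can be sketched with the standard `statistics` module. The function name `fill_missing`, the use of `None` to mark a missing value and the unweighted mean are assumptions for illustration.

```python
from statistics import mean, mode


def fill_missing(values, strategy="mean"):
    """Replace None entries with the mean or the mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else mode(observed)
    return [fill if v is None else v for v in values]


filled_mean = fill_missing([1.0, None, 3.0])
filled_mode = fill_missing([2, 2, None, 5], strategy="mode")
```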
The regression interpolation method analyzes the variables that need to be filled in and the variables that are fully present in the existing data set, and estimates the missing variable by constructing a regression equation containing both. Let $x_i$ be the original independent variables of the system, $y$ the variable to be solved, $\beta_0$ a constant parameter, and $\beta_i$ the weight coefficient of the $i$-th independent variable; taking the random factor $\varepsilon$ into account, a data set is constructed and substituted into the following formula to obtain each missing interpolation value (formula reconstructed in standard linear-regression notation from these definitions):
$$y = \beta_0 + \sum_{i} \beta_i x_i + \varepsilon$$
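For a single independent variable, regression interpolation can be sketched with ordinary least squares. The function names `fit_line` and `regress_fill`, the one-variable restriction and the sample data are assumptions for illustration; the random factor is omitted when predicting.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def regress_fill(rows):
    """rows: list of (x, y) pairs in which y may be None (missing)."""
    complete = [(x, y) for x, y in rows if y is not None]
    b0, b1 = fit_line([x for x, _ in complete], [y for _, y in complete])
    return [(x, b0 + b1 * x if y is None else y) for x, y in rows]


rows = [(1, 2.0), (2, 4.0), (3, None), (4, 8.0)]
filled = regress_fill(rows)
```

The complete records here lie exactly on y = 2x, so the missing value at x = 3 is filled with approximately 6.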
[S012] Abnormal value processing. Irrelevant material in the text data is regarded as abnormal data. Directly discarding abnormal data is not appropriate here; data whose content is vaguely defined is passed to a professional for secondary adjustment, so that the content is made professional, polished and refined and meets the requirements for use.
[S013] Repeated value processing. For repeated text content in the data, first define the range of repeated content to be queried; group the repeated content using Group By; use Having to query the records whose count is greater than 1 and store them separately in a temporary file list; use a loop of 'Select Distinct' (select with de-duplication) and 'From address' (query expression) statements to delete all duplicate records in the temporary file list, obtaining a temporary file list without repeated records; then merge this list with the records that occur only once to obtain the final result set, thereby reducing unnecessary resource waste during system data processing.
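The Group By, Having and Select Distinct steps can be tried out with Python's built-in `sqlite3` module. The table name `fault_text`, the column name `content` and the sample rows are hypothetical; the patent does not specify a database or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fault_text (content TEXT)")
conn.executemany(
    "INSERT INTO fault_text VALUES (?)",
    [("breaker tripped",), ("breaker tripped",), ("line grounded",)],
)

# Records that occur more than once (Group By + Having).
repeated = conn.execute(
    "SELECT content, COUNT(*) FROM fault_text "
    "GROUP BY content HAVING COUNT(*) > 1"
).fetchall()

# One copy of every record (Select Distinct).
unique_rows = conn.execute(
    "SELECT DISTINCT content FROM fault_text ORDER BY content"
).fetchall()
conn.close()
```

In this sketch a single `SELECT DISTINCT` over the whole table already yields the merged result set, so the separate temporary list is not needed.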
[S014] Noise filtering. The raw data often contain random errors, which are referred to as noise. Noisy data strongly affect the fidelity of the data set; therefore, considering the diverse forms of the original data, the system filters noise with an outlier analysis method, a mean smoothing method and a regression method.
The outlier analysis method obtains a group of data sets through clustering, the data sets are called as a class of clusters, data in the clusters have high similarity, and outliers which fall outside the clusters due to large difference with values in the clusters are deleted, so that the purpose of noise reduction can be achieved.
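For one-dimensional values, the cluster-then-drop idea can be sketched by grouping sorted values whose neighbors lie within a gap threshold and discarding clusters that are too small. The function name `drop_outliers`, the gap-based clustering rule and the thresholds are assumptions; the patent does not specify a clustering algorithm.

```python
def drop_outliers(values, max_gap, min_cluster_size=2):
    """Cluster sorted values by neighbor gap and drop undersized clusters."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)   # close to the previous value: same cluster
        else:
            clusters.append([v])     # large gap: start a new cluster
    return [v for c in clusters if len(c) >= min_cluster_size for v in c]


cleaned = drop_outliers([10.1, 9.9, 10.0, 25.0, 10.2], max_gap=0.5)
```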
The mean smoothing method is commonly used to remove noise from images: the gray value of each pixel in a sequence is replaced by the mean of the data in its neighborhood, which effectively denoises and smooths the data.
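On a one-dimensional sequence, mean smoothing reduces to a moving average. The function name `mean_smooth`, the window radius and the shrinking window at the edges are assumptions for illustration.

```python
def mean_smooth(series, radius=1):
    """Replace each value with the mean of its neighborhood of +/- radius."""
    out = []
    for i in range(len(series)):
        # The window is truncated at the two ends of the series.
        window = series[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


smoothed = mean_smooth([1.0, 1.0, 10.0, 1.0, 1.0])
```

The spike of 10.0 is spread over its neighbors and reduced to 4.0, which illustrates both the denoising effect and the blurring that mean smoothing introduces.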
The regression method firstly needs to preliminarily judge the trend of original data by using a visualization method, and the central idea is to find a regression function to fit the original data, then carry out smoothing operation of the function, and replace the original data set with the data set obtained again.
[S015] Based on the size of the data set, the Continuous Bag-of-Words (CBOW) algorithm is selected to generate word vectors; the text content is trained into word vector format with the word vector training software Word2Vec, which converts text knowledge into a computer-recognizable form for the subsequent calculation and analysis and provides the features and textual representation for power distribution network fault auxiliary decisions.
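CBOW trains a model to predict each word from the words around it. The sketch below only builds the (context, target) training pairs and does not train any vectors; the function name `cbow_pairs`, the window size and the sample tokens are assumptions for illustration.

```python
def cbow_pairs(tokens, window=2):
    """Build (context words, target word) pairs as used by CBOW training."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window): i] + tokens[i + 1: i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


pairs = cbow_pairs(["breaker", "tripped", "on", "feeder"], window=1)
```

A trainer such as Word2Vec averages the context vectors of each pair and adjusts them so that the target word becomes more probable.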
[ S02 ] Perform entity extraction on the converted word vector set, output the overall features of the hidden layer, and label the obtained entities.
A Bi-LSTM combined model receives the established word vector set and learns from it; an LSTM network then performs the entity extraction of the knowledge extraction system. The Bi-LSTM network obtains two different features of each sentence, namely forward information and backward information, and outputs the overall features of the hidden layer through a certain algorithm; the BIEOS method is selected to label the entities extracted from the text data.
The invention introduces the concept of a 'gate' combined with a storage mechanism: after receiving information, the input gate calculates the information that the LSTM network needs to store, according to the cell state and the current weight settings, and updates it on a rolling basis in real time. The specific steps are as follows:
[ S021 ] The LSTM network is an improvement on the RNN structure model; it receives the preprocessed word vector set as input and learns from it. It consists of four parts, namely a receiving gate
Figure 368509DEST_PATH_IMAGE098
a discard gate
Figure 412688DEST_PATH_IMAGE099
a result gate
Figure 76888DEST_PATH_IMAGE100
and a data recording gate
Figure 599136DEST_PATH_IMAGE101
The discard gate processes the data to be discarded, and the discarded content is determined by the following formula:
Figure 484921DEST_PATH_IMAGE102
wherein:
Figure 559057DEST_PATH_IMAGE103
denotes the input variable at time
Figure 320339DEST_PATH_IMAGE104
;
Figure 787224DEST_PATH_IMAGE105
denotes the hidden-layer result of the previous time step
Figure 950352DEST_PATH_IMAGE106
;
Figure 398651DEST_PATH_IMAGE107
is the weight of
Figure 292569DEST_PATH_IMAGE103
;
Figure 156620DEST_PATH_IMAGE108
is the weight of
Figure 564468DEST_PATH_IMAGE109
;
Figure 121351DEST_PATH_IMAGE110
is the bias.
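The discard-gate formula is available only as an image in this publication. For orientation, the standard LSTM forget gate that matches the symbol definitions above has the form below. The symbol names $f_t$, $x_t$, $h_{t-1}$, $W_f$, $U_f$ and $b_f$ are assumptions and may differ from the notation of the original formula.

```latex
% Standard LSTM forget (discard) gate; all symbol names are assumptions.
% x_t: input at time t, h_{t-1}: previous hidden-layer result,
% W_f, U_f: weights of x_t and h_{t-1}, b_f: bias, \sigma: Sigmoid function.
f_t = \sigma\left(W_f\, x_t + U_f\, h_{t-1} + b_f\right)
```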
The receiving gate calculates the information to be stored when the LSTM network updates its cell; this information is determined by the following formula:
Figure 732592DEST_PATH_IMAGE111
wherein:
Figure 134755DEST_PATH_IMAGE112
is the weight of
Figure 600371DEST_PATH_IMAGE113
;
Figure 702057DEST_PATH_IMAGE114
is the weight of
Figure 659649DEST_PATH_IMAGE115
;
Figure 255715DEST_PATH_IMAGE116
is the candidate cell state;
Figure 247942DEST_PATH_IMAGE117
is, within
Figure 756415DEST_PATH_IMAGE116
the weight applied to
Figure 466882DEST_PATH_IMAGE118
;
Figure 866639DEST_PATH_IMAGE119
is, within
Figure 713372DEST_PATH_IMAGE116
the weight applied to
Figure 360123DEST_PATH_IMAGE120
;
Figure 416941DEST_PATH_IMAGE121
and
Figure 230176DEST_PATH_IMAGE122
denote the biases of
Figure 337941DEST_PATH_IMAGE123
and
Figure 312850DEST_PATH_IMAGE116
respectively;
Figure 122543DEST_PATH_IMAGE124
denotes the current cell state;
Figure 473890DEST_PATH_IMAGE125
denotes the cell state of the previous time step;
Figure 872379DEST_PATH_IMAGE126
denotes the CAVs obtained by transforming the parameters in
Figure 142823DEST_PATH_IMAGE127
with the Sigmoid function;
Figure 315179DEST_PATH_IMAGE128
denotes the CAVs obtained by transforming the parameters in
Figure 345583DEST_PATH_IMAGE129
with the Sigmoid function;
Figure 349311DEST_PATH_IMAGE130
denotes the CAVs obtained by transforming the parameters in
Figure 931602DEST_PATH_IMAGE116
with the tanh function (CAVs are activation vectors).
The result gate produces the output of the LSTM model:
Figure 693716DEST_PATH_IMAGE131
wherein:
Figure 386866DEST_PATH_IMAGE132
is the weight of
Figure 572996DEST_PATH_IMAGE115
;
Figure 326189DEST_PATH_IMAGE133
is the weight of
Figure 817344DEST_PATH_IMAGE134
;
Figure 642081DEST_PATH_IMAGE135
denotes the bias;
Figure 292505DEST_PATH_IMAGE136
denotes the output result of the LSTM;
Figure 590500DEST_PATH_IMAGE137
denotes the CAVs obtained by transforming the parameters in
Figure 959164DEST_PATH_IMAGE138
with the Sigmoid function.
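The interplay of the discard, receiving and result gates can be sketched as one step of a standard LSTM cell. For brevity the sketch uses scalar inputs and states; the parameter names (`W_f`, `U_f`, `b_f` and so on) and the function name `lstm_step` are assumptions, and the formulas are the textbook LSTM equations rather than a transcription of the image formulas of this publication.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps parameter names to floats."""
    # Discard (forget) gate: how much of the previous cell state to keep.
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])
    # Receiving (input) gate: how much new information to store.
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])
    # Candidate cell state, squashed by tanh.
    c_hat = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])
    # Data recording: update the cell state.
    c_t = f_t * c_prev + i_t * c_hat
    # Result (output) gate and the hidden-layer output.
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev + p["b_o"])
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t
```

With all parameters set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step.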
Because the LSTM network passes state in only one direction, from front to back, it can capture only the preceding context of the text and not the following context. The invention therefore adopts a Bi-LSTM network architecture that fuses, for each piece of data, both the preceding-context relation
Figure 384329DEST_PATH_IMAGE139
and the following-context relation
Figure 889260DEST_PATH_IMAGE140
as shown in FIG. 2; the final hidden-layer representation
Figure 859621DEST_PATH_IMAGE141
is then calculated, yielding a vectorized result set of the scheduling data of the knowledge extraction system that retains the original semantics.
The BIEOS coding scheme is a commonly used entity labeling scheme, in which B (begin) marks the beginning of an entity; I (inside) marks data located inside an entity; O (outside) marks data located outside any entity; E (end) marks the end of an entity; and S (single) marks data that forms an entity by itself. It is not described in further detail here.
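As an illustration of BIEOS labeling, the sketch below converts entity spans into one tag per token. The function name `bieos_tags`, the span format `(start, end, label)` with an exclusive end index, and the labels `DEV` and `ACT` used in the example are hypothetical.

```python
def bieos_tags(length, spans):
    """Return BIEOS tags for a sequence of `length` tokens.

    spans: iterable of (start, end, label), with `end` exclusive.
    """
    tags = ["O"] * length                      # default: outside any entity
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"         # single-token entity
        else:
            tags[start] = f"B-{label}"         # beginning of the entity
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"         # inside the entity
            tags[end - 1] = f"E-{label}"       # end of the entity
    return tags
```

For six tokens with a three-token entity at positions 0 to 2 and a one-token entity at position 4, `bieos_tags(6, [(0, 3, "DEV"), (4, 5, "ACT")])` tags the remaining tokens as `O`.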
[ S03 ] Focus on the key knowledge in the converted text, and assign the labeled attribute relations of the same entity to relation classes using a multi-classification principle, thereby completing entity relation extraction.
The multi-classification principle here means that one entity may have several attribute relations belonging to different relation classes, while the elements of one relation class may have several attributes. The invention therefore classifies according to the multi-classification principle. Attribute relations may be assigned to relation classes by content, such as ID, character, task, time, location, quantity, retrieval, status and the like; or by characteristic, such as simple and composite attributes; single-valued, multi-valued and null-valued attributes; derived attributes; and the like.
A weight association mechanism is introduced to refine the weight parameters and to carry out knowledge extraction selectively. It supports parallel computation, trains on the text data in a targeted manner according to the different weight parameters, and screens out the key content, improving running speed and efficiency. The specific steps are as follows:
[ S031 ] The input data is learned and trained using the following formula:
Figure 777899DEST_PATH_IMAGE142
wherein:
Figure 616542DEST_PATH_IMAGE143
denotes the relative importance of the hidden state
Figure 615459DEST_PATH_IMAGE144
;
Figure 146935DEST_PATH_IMAGE145
denotes the bias of a given vector
Figure 349246DEST_PATH_IMAGE146
;
Figure 726001DEST_PATH_IMAGE147
is the weight automatically assigned by the weight association model;
Figure 284152DEST_PATH_IMAGE148
denotes the number of independent parameters in the Bi-LSTM network;
Figure 845584DEST_PATH_IMAGE149
denotes the CAVs obtained by transforming the parameters in
Figure 410557DEST_PATH_IMAGE150
with the tanh function.
Substituting into the following equation gives the final result, as shown in FIG. 3:
Figure 964904DEST_PATH_IMAGE059
wherein:
Figure 564513DEST_PATH_IMAGE060
is the final output of the weight association model;
Figure 562425DEST_PATH_IMAGE061
is the weight that the weight association model assigns at time
Figure 349115DEST_PATH_IMAGE062
to the hidden state
Figure 411880DEST_PATH_IMAGE063
.
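The weight association mechanism described above resembles a standard attention layer over the Bi-LSTM hidden states. The sketch below is written under that assumption: each hidden state receives a tanh-transformed score, the scores are normalized into weights by a softmax, and the output is the weighted sum of the hidden states. The scoring form, the function name `attention_pool` and the parameters `w` and `b` are assumptions, since the original formulas are available only as images.

```python
import math

def attention_pool(hidden_states, w, b):
    """Return (weights, pooled_vector) for a list of hidden-state vectors."""
    # Relative importance of each hidden state: tanh(w . h + b).
    scores = [math.tanh(sum(wi * hi for wi, hi in zip(w, h)) + b)
              for h in hidden_states]
    # Softmax normalization (shifted by the maximum for stability).
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Final output: weighted sum of the hidden states.
    dim = len(hidden_states[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled
```

Hidden states with higher scores receive larger weights, which is how key content is emphasized over less relevant content.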
[ S04 ] Train the relationships between adjacent labeled entities, repair the labeling errors among the entities, and output the repaired structured association relationships among the entities.
The interdependence between adjacent entity labels is invoked and decoupling analysis is performed, thereby completing the globally optimal sequence solution for the output data of the weight association model and finally realizing knowledge extraction from distribution network fault-handling texts.
As shown in FIG. 4, the likelihood of a correct output is estimated according to the following equation:
Figure 990629DEST_PATH_IMAGE151
where the parameter
Figure 769229DEST_PATH_IMAGE152
denotes the output value of the linear-chain (Linear-Chain) model in the error correction module;
Figure 688556DEST_PATH_IMAGE153
denotes the input value of the Linear-Chain model in the error correction module;
Figure 210804DEST_PATH_IMAGE154
denotes the emission probability;
Figure 847321DEST_PATH_IMAGE155
denotes the transition probability;
Figure 672189DEST_PATH_IMAGE156
denotes the number of elements in the vectors of the parameters
Figure 699051DEST_PATH_IMAGE157
and
Figure 149624DEST_PATH_IMAGE158
.
Taking the logarithm of both sides of the equation gives the simplified form:
Figure 578331DEST_PATH_IMAGE159
The result sequence with the maximum overall probability in the prediction stage is then obtained:
Figure 275898DEST_PATH_IMAGE160
wherein:
Figure 649110DEST_PATH_IMAGE161
is a function of the specified predicted input value
Figure 513161DEST_PATH_IMAGE162
.
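Finding the label sequence with the maximum overall score in a linear-chain model is conventionally done with Viterbi decoding, sketched below in log-score form (emission and transition scores are added). The use of Viterbi decoding, the function name `viterbi`, and the score-matrix layout are assumptions for illustration; the original text does not name the decoding algorithm.

```python
def viterbi(emissions, transitions):
    """Return (best_path, best_score).

    emissions[t][k]:   score of tag k at position t.
    transitions[j][k]: score of moving from tag j to tag k.
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])       # best score of paths ending in each tag
    backptr = []                     # back-pointers for positions 1..T-1
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for k in range(n_tags):
            # Best previous tag for reaching tag k at position t.
            best_j = max(range(n_tags),
                         key=lambda j: score[j] + transitions[j][k])
            ptr.append(best_j)
            new_score.append(score[best_j] + transitions[best_j][k]
                             + emissions[t][k])
        score = new_score
        backptr.append(ptr)
    # Trace the best path backwards from the best final tag.
    last = max(range(n_tags), key=lambda k: score[k])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, max(score)
```

In the test case used here, choosing the best tag independently at each position would give the sequence 0, 1, 1, but the strongly negative transition scores between different tags make the globally optimal sequence 0, 0, 0. This is the sense in which dependencies between adjacent labels repair local labeling errors.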
[ S05 ] Evaluate the knowledge extraction result according to the error range between the original text data sorted and summarized in S01 and the repaired structured association relationships among the entities.
The degree of deviation of the result sequence from the original sequence is compared, and the accuracy of the preceding work is judged from two aspects: on the one hand, the entity sequence deviation and the relation sequence deviation are each calculated by comparison; on the other hand, the polarity of the overall change of the result sequence is obtained by comparison (i.e., whether the data has increased or decreased relative to the original state).
For the first aspect, a deviation in the entities greatly affects the accuracy of the scheduling decision in actual work, whereas a deviation in the relation data has a comparatively small influence; the different degrees of influence of relations and entities are therefore represented by bias coefficients.
For the second aspect, the form of the reward-and-punishment parameter is determined as follows. If the deviation is within the allowable error range, the decision is considered valid and relatively accurate, and the reward-and-punishment parameter takes the form of a reward factor; if the deviation is outside the allowable error range, the decision is unreliable, and the parameter takes the form of a penalty factor. If the sequence as a whole changes in the positive direction (i.e., the content of the result sequence has increased), the earlier decision omitted content, which in severe cases causes information loss, so the reward-and-punishment parameter enters the calculation as a quadratic function; if the sequence as a whole changes in the negative direction (i.e., the content of the result sequence has decreased), the decision shows no obvious omission, but the screening was imprecise and data redundancy exists, so the reward-and-punishment parameter enters the calculation as a linear function.
The final reward-and-punishment training is carried out using the following function, and the result serves as the criterion for evaluating the quality of the model:
Figure 937320DEST_PATH_IMAGE163
wherein F is the reward-and-punishment result;
Figure 228624DEST_PATH_IMAGE164
is the error threshold range;
Figure 089133DEST_PATH_IMAGE165
is the entity sequence error value;
Figure 334038DEST_PATH_IMAGE166
is the relation sequence error value;
Figure 471759DEST_PATH_IMAGE167
is the entity bias weight coefficient;
Figure 058598DEST_PATH_IMAGE168
is the relation bias weight coefficient.
The system quality evaluation result is then obtained:
Figure 547348DEST_PATH_IMAGE169
wherein:
Figure 894147DEST_PATH_IMAGE170
denotes the number of parameters of
Figure 214270DEST_PATH_IMAGE171
;
Figure 847376DEST_PATH_IMAGE172
denotes the total number of
Figure 197324DEST_PATH_IMAGE173
;
Figure 206868DEST_PATH_IMAGE174
denotes the system error rate, expressed as a percentage. The smaller the value of
Figure 381498DEST_PATH_IMAGE175
the higher the system accuracy and the higher the reference value of the decision; the larger the value of
Figure 592030DEST_PATH_IMAGE176
the higher the system error rate and the lower the reference value of the decision, in which case the dispatcher needs to check and judge carefully and revise the system decision manually.
In addition, the invention also provides a power distribution network fault assistant decision knowledge extraction system, comprising a data preprocessing module, a Bi-LSTM module, a weight association model, an error correction module and a model evaluation module, as shown in FIG. 7. The data preprocessing module performs a vectorization operation on the sorted and summarized text data to form a word vector set that retains the original semantics. It further comprises a missing value processing module, an abnormal value processing module, a duplicate value processing module and a noise filtering module: the missing value processing module directly deletes text data with few data attributes and fills in text data with many data attributes by interpolation; the abnormal value processing module discards irrelevant data in the text data; the duplicate value processing module deletes duplicated text content in the text data; and the noise filtering module filters out the random errors contained in the text data.
The Bi-LSTM module performs entity extraction and labeling on the word vector set output by the data preprocessing module; the weight association model screens the weights of the entities extracted from the input text data, identifies and judges the relations among the entities, and extracts those relations;
the error correction module trains the relationships between adjacent labeled entities from the Bi-LSTM module, repairs the labeling errors produced by the Bi-LSTM module, and outputs the repaired structured association relationships among the entities;
and the model evaluation module evaluates the accuracy of the model according to the error range between the original text data and the repaired structured association relationships among the entities.
Examples
To verify the application value of the knowledge extraction system designed herein, fault reports and historical scheduling decision text data of a certain area are used as samples for experimental verification. The fault handled is a power outage caused by a switch trip, and the scheduling decision text analyzed concerns the power restoration operation after the fault.
Firstly, the input non-text data is sorted and summarized into text data using character conversion software; the text data is segmented using Python code, with punctuation marks as delimiters; the segmented text data is then vectorized using a word vector training tool, and after multiple iterations a word vector set that retains the original semantics is formed.
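The punctuation-based segmentation step can be sketched as follows. The exact delimiter set used in the experiment is not given in the text, so the set of Chinese and ASCII sentence-level punctuation marks below and the function name `split_segments` are assumptions.

```python
import re

# Sentence-level delimiters: Chinese full stop, semicolons, exclamation
# marks, question marks and line breaks (an assumed delimiter set).
_DELIMITERS = re.compile(r"[。；;！!？?\n]+")

def split_segments(text):
    """Split text at punctuation marks and drop empty segments."""
    parts = _DELIMITERS.split(text)
    return [p.strip() for p in parts if p.strip()]
```

Each resulting segment would then be passed to the word vector training tool.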
Secondly, a Bi-LSTM combined model receives the information output by the previous module and learns from it; an LSTM network performs the entity extraction of the knowledge extraction system; and the BIEOS method is selected to label the entities extracted from the text data.
Then, the weight association model adjusts the weights, selectively processes the data input to the module in parallel, identifies and judges the relations that may exist between different entities, and completes relation extraction. Next, the error correction module repairs the small number of labeling errors produced by the Bi-LSTM module, establishes the association relationships among the output labels, and outputs the final result.
Finally, the accuracy of the designed model is evaluated from the result of reward-and-punishment training, according to the error range between the original sequence received at the input and the result sequence at the final output.
The specific steps in the power grid model are as follows:
setting experiment parameters: the BIEOS notation is used, with the entity tagging rules as shown in table 1.
TABLE 1 Entity tagging rules
Figure 524214DEST_PATH_IMAGE177
The values of the parameters required for the experiment are shown in table 2:
TABLE 2 System parameter settings
Figure 462083DEST_PATH_IMAGE178
To show that, compared with traditional methods, the combination of the Bi-LSTM module, the weight association model and the error correction module proposed by the invention performs entity extraction and relation extraction better, different test combinations are set up for comparison and the final results of the system are evaluated. The system accuracy under the different experimental conditions is shown in Table 3.
TABLE 3 comparison of accuracy rates for different models
Figure 758065DEST_PATH_IMAGE179
To verify that the error correction module completes the globally optimal sequence solution for the output data of the weight association model, finally realizes knowledge extraction from distribution network fault-handling texts, and improves system accuracy, a comparison experiment is set up: scheme one omits the error correction step, scheme two uses the error correction module proposed by the invention, and all other steps are identical in the two schemes. The results are shown in FIG. 5.
FIG. 6 visually shows the result of applying the knowledge extraction method of the invention, with the accident course and fault handling of a certain area as the raw data. The power distribution network fault assistant decision knowledge extraction system comprises the following modules:
the data preprocessing module is used for performing vectorization operation on the original text data after the completion and the summarization to form a word vector set which retains original semantics;
the Bi-LSTM module is used for performing entity extraction and multiple attribute relation labeling on the word vector set output by the data preprocessing module;
the weight association model is used for screening the weights of all the entities extracted from the input text data, identifying and judging the relations among the entities, and extracting those relations;
the error correction module is used for training the relationship between adjacent labeled entities in the Bi-LSTM module, repairing labeled errors in the Bi-LSTM module and outputting the structural association relationship between the repaired entities;
and the model evaluation module is used for evaluating the accuracy of the model according to the error ranges of the structured incidence relations between the sorted and aggregated original text data and the repaired entities.
Firstly, the data preprocessing module segments the input text data into a number of text segments; the segmented text data is then vectorized using a word vector training tool to form a word vector set that retains the original semantics, for example 'accept the client application, review it, confirm that the new-installation requirements are met, survey the site and determine a scheme, review and approve the scheme ……' in the text data. The Bi-LSTM module extracts entities from the word vector set, outputs the overall features of the hidden layer and labels the obtained entities. The weight association model focuses on the key knowledge in the text data, discards unnecessary knowledge, and assigns the attribute relations of the same entity to relation classes using the multi-classification principle, completing the labeling of the various attribute relations. The error correction module trains the relationships between adjacent entity labels to obtain the globally optimal text labeling, and finally forms a flow diagram of the structured association relationships, as shown in the right-hand diagram of FIG. 6. Finally, the model evaluation module evaluates the work of the system from the output result, assessing whether it accords with the meaning expressed by the original text data. The system of the invention performs text preprocessing and entity and entity-relation extraction on the content of the left-hand diagram of FIG. 6 and automatically generates a diagram of the structured association relationships between adjacent entities. Combined with review by professional dispatchers, the system shortens the decision time for fault-handling dispatch while improving the professionalism and accuracy of decisions, enables low-cost and fast-response deployment, and maximizes returns for the grid company.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the invention can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the scope of the invention.
Matters not described in the present invention follow the prior art.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom are intended to be within the scope of the present invention.

Claims (10)

1. A power distribution network fault auxiliary decision knowledge extraction method is characterized in that,
vectorizing the obtained original text data to form a word vector set retaining original semantics;
performing entity extraction on the word vector set, and labeling the obtained entities;
distributing the labeled attribute relation of the same entity to a certain relation class by adopting a multi-classification principle to finish labeling of various attribute relations of the entity;
training the relationship between adjacent labeling entities, repairing the labeling errors among the entities, and outputting the structural association relationship among the repaired entities;
and evaluating the knowledge extraction result according to the error range of the structured incidence relation between the original text data and each repaired entity.
2. The power distribution network fault assistant decision knowledge extraction method according to claim 1, wherein performing the vectorization operation on the obtained original text data to form a word vector set retaining the original semantics comprises: sorting and summarizing the non-text data in the input original data into text data, either manually or by using character conversion software; segmenting the text data using Python code, with punctuation marks as delimiters, to form segmented text data; and vectorizing the segmented text data using a word vector training tool, the data being collected into a data set after multiple iterations to form a word vector set that retains the original semantics.
3. The power distribution network fault assistant decision knowledge extraction method according to claim 2, wherein the non-text data includes one or more of tables, pictures and voice recordings of operation procedures, handling plans, scheduling procedures, fault information and scheduling instructions.
4. The power distribution network fault assistant decision knowledge extraction method according to claim 2, wherein after the non-text data in the input original data is sorted and summarized into text data, either manually or by using character conversion software, missing value processing, abnormal value processing, duplicate value processing and noise filtering are further performed on the text data.
5. The power distribution network fault assistant decision knowledge extraction method of claim 4, wherein performing entity extraction on the word vector set and labeling the obtained entities comprises:
learning the input with a Bi-LSTM combined model, extracting entities through an LSTM network, and labeling the entities extracted from the text data with the BIEOS entity labeling method, specifically comprising:
the LSTM network receives the word vector set as input and learns from it, and comprises a receiving gate
Figure 783008DEST_PATH_IMAGE001
a discard gate
Figure 297166DEST_PATH_IMAGE002
a result gate
Figure 489113DEST_PATH_IMAGE003
and a data recording gate
Figure 880649DEST_PATH_IMAGE004
the discard gate processes the text data to be discarded, and the discarded content is determined by the following formula:
Figure 765428DEST_PATH_IMAGE005
wherein:
Figure 376669DEST_PATH_IMAGE006
denotes the input variable at time t;
Figure 372307DEST_PATH_IMAGE007
denotes the hidden-layer result of the previous time step
Figure 510028DEST_PATH_IMAGE008
;
Figure 346134DEST_PATH_IMAGE009
is the weight of
Figure 834884DEST_PATH_IMAGE010
;
Figure 430951DEST_PATH_IMAGE011
is the weight of
Figure 501806DEST_PATH_IMAGE012
;
Figure 134913DEST_PATH_IMAGE013
is the bias;
the receiving gate
Figure 970014DEST_PATH_IMAGE014
calculates the information to be stored when the LSTM network updates its cell, according to the following formula:
Figure 245137DEST_PATH_IMAGE015
wherein:
Figure 471631DEST_PATH_IMAGE016
is the weight of
Figure 541218DEST_PATH_IMAGE017
;
Figure 598036DEST_PATH_IMAGE018
is the weight of
Figure 489900DEST_PATH_IMAGE019
;
Figure 191140DEST_PATH_IMAGE020
is the candidate cell state;
Figure 821841DEST_PATH_IMAGE021
is, within
Figure 241321DEST_PATH_IMAGE022
the weight applied to
Figure 169832DEST_PATH_IMAGE023
;
Figure 115791DEST_PATH_IMAGE024
is, within
Figure 792760DEST_PATH_IMAGE025
the weight applied to
Figure 574902DEST_PATH_IMAGE019
;
Figure 729940DEST_PATH_IMAGE026
and
Figure 733668DEST_PATH_IMAGE027
denote the biases of
Figure 689861DEST_PATH_IMAGE028
and
Figure 083933DEST_PATH_IMAGE029
respectively;
Figure 167295DEST_PATH_IMAGE030
denotes the current cell state;
Figure 963213DEST_PATH_IMAGE031
denotes the cell state of the previous time step;
Figure 326192DEST_PATH_IMAGE032
denotes the CAVs obtained by transforming the parameters in
Figure 535457DEST_PATH_IMAGE033
with the Sigmoid function;
Figure 032297DEST_PATH_IMAGE034
denotes the CAVs obtained by transforming the parameters in
Figure 056623DEST_PATH_IMAGE035
with the Sigmoid function;
Figure 246296DEST_PATH_IMAGE036
denotes the CAVs obtained by transforming the parameters in
Figure 677277DEST_PATH_IMAGE037
with the tanh function, the CAVs being activation vectors;
the result gate
Figure 587595DEST_PATH_IMAGE038
outputs the entities extracted by the Bi-LSTM combined model:
Figure 358105DEST_PATH_IMAGE039
wherein:
Figure 843313DEST_PATH_IMAGE040
is the weight of
Figure 433695DEST_PATH_IMAGE041
;
Figure 843642DEST_PATH_IMAGE042
is the weight of
Figure 593292DEST_PATH_IMAGE043
;
Figure 124768DEST_PATH_IMAGE044
denotes the bias;
Figure 077811DEST_PATH_IMAGE045
denotes the output result of the LSTM network;
Figure 454566DEST_PATH_IMAGE046
denotes the CAVs obtained by transforming the parameters in
Figure 261985DEST_PATH_IMAGE047
with the Sigmoid function.
6. The power distribution network fault assistant decision-making knowledge extraction method according to claim 5, wherein a weight association mechanism is introduced to improve weight parameters, text data are trained according to different weight parameters, and key contents are screened to extract entity relationships, and the method specifically comprises the following steps:
the input word vector set data is learnt and trained by adopting the following formula, and the input word vector set data is selectively processed in parallel:
Figure 338263DEST_PATH_IMAGE048
in the formula:
Figure 903237DEST_PATH_IMAGE049 characterizes the relative importance of the hidden state Figure 411578DEST_PATH_IMAGE050;
Figure 620974DEST_PATH_IMAGE051 characterizes the error amount of a given vector Figure 228673DEST_PATH_IMAGE052;
Figure 405576DEST_PATH_IMAGE053 is the weight automatically assigned by the weight association model;
Figure 655292DEST_PATH_IMAGE054 characterizes the number of independent parameters in the Bi-LSTM network;
Figure 217729DEST_PATH_IMAGE055 characterizes the CAVs obtained by transforming the parameters in Figure 324226DEST_PATH_IMAGE052 with the tanh function;
Figure 863791DEST_PATH_IMAGE056 characterizes the preceding-context relation of each text datum;
Figure 261406DEST_PATH_IMAGE057 characterizes the subsequent-context relation of each text datum;
Figure 835607DEST_PATH_IMAGE058 is an operator;
and the obtained data are substituted into the following formula to determine the structural relationships among the different extracted entities:
Figure 113004DEST_PATH_IMAGE059
in the formula:
Figure 513767DEST_PATH_IMAGE060 is the final output of the weight association model;
Figure 574127DEST_PATH_IMAGE061 is the weight that the weight association model assigns to the hidden state Figure 513450DEST_PATH_IMAGE063 at time Figure 861889DEST_PATH_IMAGE062.
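The weight association step reads like a conventional attention layer over the Bi-LSTM states. The sketch below assumes that form: each state is scored through a tanh transform, the scores are normalized into weights, and the weighted states are summed. The formula images are missing from this text, so the scoring vector `w`, the bias `b`, and the use of concatenation as the combining operator are all assumptions rather than the claimed formulas.

```python
import math


def combine_directions(forward, backward):
    """Assumed combining operator: concatenate forward and backward states."""
    return [f + b for f, b in zip(forward, backward)]


def attention_pool(hidden_states, w, b):
    """Assumed attention form for the weight association model.

    u_t     = tanh(w . h_t + b)              (score of each state)
    alpha_t = exp(u_t) / sum_k exp(u_k)      (automatically assigned weight)
    s       = sum_t alpha_t * h_t            (final weighted output)
    """
    scores = [
        math.tanh(sum(wi * hi for wi, hi in zip(w, h)) + b)
        for h in hidden_states
    ]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[i] for a, h in zip(alphas, hidden_states))
        for i in range(dim)
    ]
    return alphas, pooled
```

States that score higher receive larger weights, which is how key content would be emphasized before relation extraction.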
7. The power distribution network fault auxiliary decision knowledge extraction method according to claim 5, wherein decoupling analysis is performed by using the interrelation between adjacent labeled entities, the global optimal sequence solution of the output data and the knowledge extraction from the power distribution network fault handling text are completed in turn, and the probability of a correct output is calculated according to the following formula:
Figure 840658DEST_PATH_IMAGE064
wherein:
Figure 829342DEST_PATH_IMAGE065 characterizes the result value of the Linear-Chain model in the error correction module;
Figure 112556DEST_PATH_IMAGE066 characterizes the received value of the Linear-Chain model in the error correction module;
Figure 49200DEST_PATH_IMAGE067 characterizes the emission probability;
Figure 785075DEST_PATH_IMAGE068 characterizes the transition probability;
Figure 780713DEST_PATH_IMAGE069 is the number of elements in the parameter vector Figure 793799DEST_PATH_IMAGE070.
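A Linear-Chain model with emission and transition terms matches the usual linear-chain conditional random field. Under that assumption, the sketch below scores a label sequence, normalizes it into a probability with the forward algorithm, and finds the globally optimal sequence with Viterbi decoding. The function names and the log-score tables `emissions` and `transitions` are hypothetical, and the claimed formula itself is not reproduced here.

```python
import math


def sequence_score(emissions, transitions, tags):
    """Sum of emission and transition scores along one label sequence."""
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score


def log_partition(emissions, transitions):
    """Forward algorithm: log of the summed exp-scores of all sequences."""
    n_tags = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        new_alpha = []
        for j in range(n_tags):
            terms = [alpha[i] + transitions[i][j] for i in range(n_tags)]
            m = max(terms)
            lse = m + math.log(sum(math.exp(v - m) for v in terms))
            new_alpha.append(lse + emissions[t][j])
        alpha = new_alpha
    m = max(alpha)
    return m + math.log(sum(math.exp(a - m) for a in alpha))


def sequence_probability(emissions, transitions, tags):
    """Probability of one label sequence under the assumed CRF form."""
    score = sequence_score(emissions, transitions, tags)
    return math.exp(score - log_partition(emissions, transitions))


def viterbi(emissions, transitions):
    """Globally optimal label sequence by dynamic programming."""
    n_tags = len(emissions[0])
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            i_best = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(i_best)
            new_best.append(best[i_best] + transitions[i_best][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    path = [max(range(n_tags), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]
```

Because the transition table scores pairs of adjacent labels, a label that the Bi-LSTM picked in isolation can be overturned when its neighbours make it unlikely, which is the repair behaviour attributed to the error correction module.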
8. The power distribution network fault auxiliary decision knowledge extraction method according to claim 7, wherein the quality of the auxiliary decision knowledge extraction is evaluated with a reward and punishment mechanism according to the error range of the structured association relationship between the obtained original text data and the repaired entities, using the following evaluation function:
Figure 256004DEST_PATH_IMAGE071
wherein F is the reward and punishment result;
Figure 603809DEST_PATH_IMAGE072 is the error threshold range;
Figure 75242DEST_PATH_IMAGE073 is the entity sequence error value;
Figure 441370DEST_PATH_IMAGE074 is the relation sequence error value;
Figure 667952DEST_PATH_IMAGE075 is the entity weight bias coefficient;
Figure 378419DEST_PATH_IMAGE076 is the relation weight bias coefficient;
the evaluation result of the decision reference value is then obtained with the following formula:
Figure 997750DEST_PATH_IMAGE077
wherein:
Figure 375642DEST_PATH_IMAGE078 characterizes the number of parameters of Figure 304284DEST_PATH_IMAGE079;
Figure 502047DEST_PATH_IMAGE080 characterizes the total number of F;
Figure 626867DEST_PATH_IMAGE081 characterizes the system error rate, expressed as a percentage.
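The evaluation function and the error-rate formula are images that did not survive in this text, so their exact form is unknown. Purely as an illustration of how a reward and punishment check against an error threshold could be wired up, the sketch below assumes a weighted sum of the two error values, a reward of +1 inside the threshold, a penalty of -1 outside it, and an error rate equal to the share of penalties. Every name and every one of these choices is an assumption.

```python
def reward_penalty(entity_err, relation_err, entity_coef, relation_coef, threshold):
    """Assumed reward and punishment rule (not the patent's formula).

    The entity and relation sequence errors are combined with their
    weight bias coefficients; the result is rewarded (+1) when it stays
    within the error threshold and punished (-1) otherwise.
    """
    weighted = entity_coef * entity_err + relation_coef * relation_err
    return 1 if abs(weighted) <= threshold else -1


def system_error_rate(results):
    """Assumed error rate: percentage of punished results among all F values."""
    penalties = sum(1 for f in results if f < 0)
    return 100.0 * penalties / len(results)
```

For example, with coefficients 0.6 and 0.4 and a threshold of 0.2, errors of 0.1 and 0.2 would be rewarded while errors of 0.5 and 0.3 would be punished.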
9. A power distribution network fault auxiliary decision knowledge extraction system, characterized by comprising the following modules:
a data preprocessing module, used for vectorizing the completed and summarized original text data to form a word vector set that retains the original semantics;
a Bi-LSTM module, used for performing entity extraction and multi-attribute relation labeling on the word vector set output by the data preprocessing module;
an error correction module, used for training on the relationships between adjacent labeled entities in the Bi-LSTM module, repairing labeling errors in the Bi-LSTM module, and outputting the structured association relationships between the repaired entities;
and a model evaluation module, used for evaluating the accuracy of the model according to the error range of the structured association relationship between the summarized original text data and the repaired entities.
10. The power distribution network fault auxiliary decision knowledge extraction system according to claim 9, wherein the system is further provided with a weight association model, used for weight-screening each entity extracted from the input text data, identifying and determining the relations among the entities, and performing relation extraction; the structured association relationships among the entities are repaired by the error correction module.
CN202211086406.7A 2022-09-06 2022-09-06 Power distribution network fault auxiliary decision knowledge extraction method and system Active CN115438190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211086406.7A CN115438190B (en) 2022-09-06 2022-09-06 Power distribution network fault auxiliary decision knowledge extraction method and system

Publications (2)

Publication Number Publication Date
CN115438190A true CN115438190A (en) 2022-12-06
CN115438190B CN115438190B (en) 2023-06-06

Family

ID=84246442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211086406.7A Active CN115438190B (en) 2022-09-06 2022-09-06 Power distribution network fault auxiliary decision knowledge extraction method and system

Country Status (1)

Country Link
CN (1) CN115438190B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618603A (en) * 2022-10-14 2023-01-17 华能信息技术有限公司 Service monitoring method and system based on APM platform

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598001A (en) * 2019-08-05 2019-12-20 平安科技(深圳)有限公司 Method, device and storage medium for extracting association entity relationship
CN111241303A (en) * 2020-01-16 2020-06-05 东方红卫星移动通信有限公司 Remote supervision relation extraction method for large-scale unstructured text data
US20210240603A1 (en) * 2018-11-05 2021-08-05 Yangzhou University Entity and relationship joint extraction method oriented to software bug knowledge
CN113268452A (en) * 2021-05-25 2021-08-17 联仁健康医疗大数据科技股份有限公司 Entity extraction method, device, equipment and storage medium
CN113609857A (en) * 2021-07-22 2021-11-05 武汉工程大学 Legal named entity identification method and system based on cascade model and data enhancement
CN113901178A (en) * 2021-10-28 2022-01-07 北京航空航天大学 Entity relation extraction method of wind tunnel fault text knowledge

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
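Liu Kai; Fu Haidong; Zou Yuwei; Gu Jinguang: "Chinese medical weakly supervised relation extraction based on convolutional neural networks" (基于卷积神经网络的中文医疗弱监督关系抽取), Computer Science (计算机科学), no. 10, pages 249 - 253 *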

Also Published As

Publication number Publication date
CN115438190B (en) 2023-06-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant