CN113312494A - Vertical domain knowledge graph construction method, system, equipment and storage medium - Google Patents


Publication number
CN113312494A
Authority
CN
China
Prior art keywords
rule
rules
entity
business
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110594447.6A
Other languages
Chinese (zh)
Inventor
张中浩
谈元鹏
焦飞
徐会芳
仝杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and China Electric Power Research Institute Co Ltd CEPRI
Priority to CN202110594447.6A
Publication of CN113312494A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a system, equipment and a storage medium for constructing a vertical domain knowledge graph. The method comprises the following steps: acquiring business data from a unified data model of the power grid; sorting out and classifying the business rules in the business data; describing the relevant business rules in functional form to obtain business rule functions; embedding the business rule functions into a neural network and fusing them with a decision tree to form a text data stream classification model; organizing the business rules into rule parameters and forming an activation function that replaces the activation function in the original entity and relationship extraction algorithm, thereby obtaining a rule-fused entity relationship extraction algorithm; and performing entity relationship extraction on the unified data model with the rule-fused entity relationship extraction algorithm, and constructing a knowledge graph from the extraction result. Linking the common entities and associated businesses across different businesses ensures that information is interconnected among the different business graphs and improves the construction and application of knowledge graphs in the vertical domain.

Description

Vertical domain knowledge graph construction method, system, equipment and storage medium
Technical Field
The invention belongs to the technical field of vertical domain knowledge graph construction and application, and particularly relates to a vertical domain knowledge graph construction method, a vertical domain knowledge graph construction system, a vertical domain knowledge graph construction device and a storage medium.
Background
The unified data model is an enterprise data model built with object-oriented modeling techniques. It is a physical data model constructed with reference to international standards and industry best practice, combined with the core business requirements of the enterprise concerned, using a mix of a top-down approach driven by business requirements and a bottom-up approach driven by the current situation, and it contains data from a plurality of sub-business fields. The unified data model gathers data in one place, but the associations among the data are weak, so it is difficult to analyze data across multiple business fields. In recent years, the knowledge graph, as a technology for managing and applying knowledge-based data, has broken down the link barriers between business data through entity-relation-attribute triples, and it provides an important method for knowledge management of a unified data model. Knowledge-oriented transformation of the unified data model with knowledge graph technology therefore establishes a unified data service that interoperates at the semantic level, enables efficient and intelligent data query, ensures that data can be shared across businesses, and facilitates efficient application and intelligent analysis of the text data in the unified data model.
At present, when a knowledge graph is constructed for the business data in a unified information model, work usually starts from the algorithmic technique and rarely considers the business rules. As a result, the knowledge system becomes complicated during knowledge graph construction, and the entity relationship extraction results struggle to satisfy the business rules.
Disclosure of Invention
In order to solve the problems existing in the construction of the conventional knowledge graph, the invention provides a method, a system, equipment and a storage medium for constructing a vertical domain knowledge graph.
In order to achieve the purpose, the invention adopts the following technical scheme:
a vertical domain knowledge graph construction method comprises the following steps:
acquiring business data from a unified data model of the power grid;
sorting out the business rules in the business data, and expressing the business rules as functions to obtain business rule mapping functions;
embedding the rule mapping functions into a neural network and fusing them with a decision tree to form a text data stream classification model that assigns text data to different business types; performing entity and relationship extraction on the text data classified by the text data stream classification model to obtain an entity relationship extraction algorithm;
obtaining rule coefficients from the business rule mapping functions, and introducing the rule coefficients into the activation function of the entity relationship extraction algorithm to form an entity and relationship extraction algorithm fused with the business rules;
and performing entity relationship extraction on the power grid unified data model with the rule-fused entity relationship extraction algorithm, and constructing a knowledge graph from the entity relationship extraction result.
As a further improvement of the present invention, sorting out the business rules in the business data specifically includes:
analyzing the rules and constraint conditions of the text data of different businesses in the power grid unified data model, sorting out the rules to be followed when constructing the corpus relationship network, and listing a rule set;
sorting the rule set and dividing the rules into mechanistic rules, constraint rules and dependency rules; the mechanistic rule is that a place is often followed by a line and a station name; the constraint rule is that the text mainly uses noun + number expressions; the dependency rule is that vocabulary in the text relates to the power transmission field;
and counting the rules of each type, and integrating and describing the business rules of the text data.
As a further improvement of the present invention, expressing the business rules as functions to obtain the business rule mapping functions specifically includes:
converting the mechanisms and condition constraints in the business rules into functional form;
wherein, for the mechanistic rule: given different element types x, y and z in the text, the mechanism rule they satisfy is expressed by the functional relation:
n(x)=a*f(x)+b*g(y) (2)
wherein n(x) is the numerical representation of the element that immediately follows the place; f(x) and g(y) are the numerical representations of the line and the station, respectively; and a and b are the probability coefficients of x being connected to the line and to the station, respectively;
for the constraint rule: given different element types x and y in the text, the constraint condition is expressed as:
R2:f(x)∈{y|a<y<b} (3)
wherein f(x) is a numerical calculation relating x and y, and the relationships between elements are constrained by restricting the numerical range of x and y;
for the dependency rule: one or more elements in the text fall within a certain range, and the constraint condition is expressed as:
R3:x,y,z∈{f(a),g(a)} (4)
wherein f(a) and g(a) are the range boundaries defined by the element a in different functional forms;
the rules of different types are then represented functionally, and a rule set is established:
∑R:{R1,R2,...,Rn} (5)
where ∑R denotes the set of rule functions of a given type, and R1, R2, ..., Rn are the individual rule functions.
As a further improvement of the present invention, embedding the rule mapping functions into the neural network and fusing them with the decision tree to form the text data stream classification model is specifically:
integrating the rule function set into a new function that serves as the activation function of the neural network neurons, wherein the ReLU activation function is expressed as:
max(0, Σ(R_i(x) * a_i)) (6)
wherein R_i is a rule function in a given class of rules, a_i is the weight of that rule function, and x is the variable that represents the input text in functional form;
and obtaining a rule-fused activation function from the above formula, building a separate neural network to classify each business, combining the neural networks with the decision tree so that each network serves as a node of the tree, and classifying the input data stream layer by layer to obtain the text data stream classification model.
As a further improvement of the present invention, the rule-fused entity relationship extraction algorithm is obtained as follows:
the rule mechanism coefficient, the rule constraint coefficient and the sample boundary coefficient are derived from the business rule mapping functions;
the rule mechanism coefficient is obtained by organizing the business rules into rule parameters, and is calculated as follows:
[Formula (7): image BDA0003090429030000041, not reproduced in the text]
wherein α_i is the rule mechanism coefficient, and the formula relates the fitted value of the rule mechanism function, the actual value n(x), and the average value (the symbols for the fitted value and the average value appear only as images BDA0003090429030000042 and BDA0003090429030000043);
the rule constraint coefficient is calculated as follows:
β_i = Σ(f(x)_i * k) (8)
wherein β_i is the rule constraint coefficient, f(x)_i is the value computed by the constraint function of an element, and k is 1 when the constraint condition is satisfied and 0 otherwise;
the sample boundary coefficient is calculated as follows:
γ_i = Σ(Π f_k(x_j)) (9)
wherein γ_i is the sample boundary coefficient, x_j is a single element in the text, and f_k(x_j) is the result of the range calculation for that element;
the rule mechanism coefficient, the rule constraint coefficient and the sample boundary coefficient are introduced into the activation function of the long short-term memory (LSTM) network to replace the activation function in the entity relationship extraction algorithm, giving the rule-fused entity relationship extraction algorithm:
[Formula: image BDA0003090429030000044, not reproduced in the text]
wherein σ_r(z) is the activation function fused with the rules, α_i is the rule mechanism coefficient, β_i is the rule constraint coefficient, and γ_i is the sample boundary coefficient.
As a further improvement of the present invention, constructing the knowledge graph from the entity relationship extraction result specifically comprises:
after the entities and relationships in the text data stream of the unified data model are extracted, the relationships between entities fall into types that mirror the types of the rules, giving entity-relationship combination units that include mechanistic relationships, constraint relationships and dependency relationships; different entity nodes in the graph are connected through these multiple relationships to form a knowledge graph containing the entity-relationship combination units.
As a further improvement of the invention, the knowledge graph constructed from the unified data model is divided into blocks by sub-business, and associations are established between the sub-businesses according to shared entities or associated businesses.
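The linking of sub-business graphs through shared entities can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the triple layout, the function names `entities` and `shared_entities`, and the sample business names and entities are all hypothetical and do not come from the patent.

```python
from typing import Dict, List, Set, Tuple

# A triple is (head entity, relation, tail entity); this layout is an assumption.
Triple = Tuple[str, str, str]


def entities(triples: List[Triple]) -> Set[str]:
    """Collect every entity that appears as a head or a tail."""
    return {e for head, _, tail in triples for e in (head, tail)}


def shared_entities(subgraphs: Dict[str, List[Triple]]) -> Dict[Tuple[str, str], Set[str]]:
    """For each pair of sub-business graphs, list the entities they have in common."""
    links: Dict[Tuple[str, str], Set[str]] = {}
    names = sorted(subgraphs)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = entities(subgraphs[first]) & entities(subgraphs[second])
            if common:
                links[(first, second)] = common
    return links


# Hypothetical sample data: two sub-business graphs that both mention "Station B".
subgraphs = {
    "transmission": [("Line A", "connects", "Station B")],
    "maintenance": [("Crew 1", "inspects", "Station B")],
}
links = shared_entities(subgraphs)
```

Each key in `links` names a pair of sub-business graphs that could be joined, and the value lists the entities through which an association might be established.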
A vertical domain knowledge graph construction system, comprising:
the acquisition unit, used for acquiring business data from the unified data model of the power grid;
the business rule sorting unit, used for sorting out the business rules in the business data and expressing the business rules as functions to obtain business rule mapping functions;
the entity relationship establishing unit, used for embedding the rule mapping functions into the neural network and fusing them with the decision tree to form a text data stream classification model that assigns the text data to different business types, and for performing entity and relationship extraction on the text data classified by the text data stream classification model to obtain an entity relationship extraction algorithm;
the entity relationship fusion unit, used for obtaining rule coefficients from the business rule mapping functions and introducing the rule coefficients into the activation function of the entity relationship extraction algorithm to form an entity and relationship extraction algorithm fused with the business rules;
and the knowledge graph construction unit, used for performing entity relationship extraction on the power grid unified data model with the rule-fused entity relationship extraction algorithm and constructing the knowledge graph from the entity relationship extraction result.
An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the vertical domain knowledge graph construction method when executing the computer program.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the vertical domain knowledge graph construction method.
The invention has the beneficial effects that:
the invention effectively integrates and extracts the rules in the text data of the unified information model and provides a functional representation of them, so that the business rules can be effectively incorporated into an algorithm. The combined neural network and decision tree model, improved with the rule functions, overcomes the limitation that text data streams in a unified information model are difficult to divide automatically according to business rules. The entity and relationship extraction algorithm fused with the business rules improves the agreement between the entity relationships in the knowledge graph and the business rules. Linking the common entities and associated businesses across different businesses ensures that information is interconnected among the different business graphs and improves the construction and application of knowledge graphs in the vertical domain.
Drawings
FIG. 1 is a schematic flow chart of a vertical domain knowledge graph construction method according to a preferred embodiment of the present invention;
FIG. 2 shows step one: extracting the business rules in the unified data model of the power grid;
FIG. 3 shows the functional description of the business rules in step two;
FIG. 4 shows step three: embedding the rules in the neural network activation function and combining it with a decision tree to classify the data stream;
FIG. 5 shows the classification model of a text data stream;
FIG. 6 shows step four: sorting out the rule parameters and extracting entity relationships with the fused rules;
FIG. 7 shows step five: constructing a knowledge graph with the rule-fused entity relationship extraction algorithm;
FIG. 8 is a sample knowledge graph containing multiple types of relationships;
FIG. 9 is an example of associations between the sub-business graphs of a knowledge graph based on a unified data model;
FIG. 10 is a schematic diagram of the vertical domain knowledge graph construction system according to the preferred embodiment of the present invention;
FIG. 11 is a schematic structural diagram of an electronic device according to a preferred embodiment of the invention.
Detailed Description
The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The following detailed description is exemplary in nature and is intended to provide further details of the invention. Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention.
The invention belongs to vertical domain knowledge graph construction and application technology, and provides a vertical domain knowledge graph construction method based on a unified data model, as shown in FIG. 1.
First, the rules contained in the unified data model, for example a power grid unified data model, are sorted out and classified according to the business data in the model.
Second, a functional representation of each rule is designed according to the rule logic.
Third, the rule mapping functions are embedded into the activation function of a neural network model and combined with a decision tree to form a text data stream splitting model that splits the text into different business types.
Finally, an activation function that introduces the rule coefficients is designed to form an entity and relationship extraction algorithm fused with the business rules, and a knowledge graph is constructed from the entity relationship extraction result.
The invention provides a vertical domain knowledge graph construction method based on a unified data model, which can raise the technical level and the effectiveness of knowledge application for unified data models in various industries in three respects:
First, based on the text data in the unified data model, the business rules are sorted out, extracted and represented as functions, which addresses the difficulty of analyzing business rules by mathematical means.
Second, neural networks are used as decision tree nodes to build a business division model for the unified data model, which improves the efficiency of dividing vertical domain data by business and ties the original text corpus closely to the business characteristics.
Third, business rules are introduced into knowledge graph construction to form a rule-fused entity relationship extraction algorithm, so that the constructed knowledge graph conforms better to the business specifications of the vertical domain and better supports knowledge application in that domain.
The method specifically comprises the following steps:
acquiring business data from a unified data model of the power grid;
sorting out the business rules in the business data, and expressing the business rules as functions to obtain business rule mapping functions;
embedding the rule mapping functions into a neural network and fusing them with a decision tree to form a text data stream classification model that assigns text data to different business types; performing entity and relationship extraction on the text data classified by the text data stream classification model to obtain an entity relationship extraction algorithm;
obtaining rule coefficients from the business rule mapping functions, and introducing the rule coefficients into the activation function of the entity relationship extraction algorithm to form an entity and relationship extraction algorithm fused with the business rules;
and performing entity relationship extraction on the power grid unified data model with the rule-fused entity relationship extraction algorithm, and constructing a knowledge graph from the entity relationship extraction result.
Taking the electric power field as an example, the invention achieves this purpose through the following technical scheme:
the construction of the knowledge graph is mainly carried out through the following five steps.
Step one: sort out and extract the business rules of the text data in the unified data model of the power grid.
FIG. 2 shows step one: extracting the business rules in the unified data model of the power grid.
(1) Analyze the rules and constraint conditions of the text data in different businesses, sort out the rules to be followed when constructing the corpus relationship network, and list a rule set, for example:
rule one: a place is often followed by a line and a station name;
rule two: the text mainly uses noun + number expressions;
rule three: related words in the text such as "pole tower", "steel pipe pole" and "angle steel tower" belong to the power transmission field.
(2) Sort the rule set and divide the rules into mechanistic rules, constraint rules, dependency rules and so on. A mechanistic rule determines how different elements are associated with and drive one another, for example that a place is followed by a line and a station name. A constraint rule determines the form in which an element exists, for example that the text mainly uses noun + number expressions. A dependency rule determines the range to which an element belongs, for example that related words in the text such as "pole tower", "steel pipe pole" and "angle steel tower" belong to the power transmission field. The rules of each type are counted, and the main conditions of the rules are integrated and described.
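As an illustration of how the three example rules might be checked against a piece of text, the following Python sketch uses simple patterns. It is a rough sketch under stated assumptions, not the patent's procedure: the regular expressions, the function name `match_rules` and the place name "Anyang" in the sample sentence are all hypothetical.

```python
import re

# Hypothetical vocabulary for the dependency rule (rule three in the text).
TRANSMISSION_TERMS = {"pole tower", "steel pipe pole", "angle steel tower"}

# Hypothetical pattern for the constraint rule (rule two): a word followed by a number.
NOUN_NUMBER = re.compile(r"\b[A-Za-z]+\s*\d+\b")

# Hypothetical pattern for the mechanistic rule (rule one): a capitalized
# place name followed by "line" or "station".
PLACE_FOLLOWER = re.compile(r"\b([A-Z][a-z]+)\s+(line|station)\b")


def match_rules(text: str) -> dict:
    """Report which of the three rule types fire on a piece of text."""
    lowered = text.lower()
    return {
        "mechanistic": bool(PLACE_FOLLOWER.search(text)),
        "constraint": bool(NOUN_NUMBER.search(text)),
        "dependency": any(term in lowered for term in TRANSMISSION_TERMS),
    }


# Hypothetical sample sentence that triggers all three rule types.
result = match_rules("Tower 12 on the Anyang line is an angle steel tower")
```

Counting how often each key is true over a corpus would give the per-type rule statistics mentioned at the end of step one.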
Step two: sort out the mechanisms and condition constraints in the business rules, and describe the relevant rules in functional form.
FIG. 3 shows the functional description of the business rules in step two.
The mechanisms and condition constraints in the business rules are converted into functional form so that the rules can easily be fused with machine learning algorithms. Taking the mechanistic rule, the constraint rule and the dependency rule as examples, each is described functionally as follows:
Mechanistic rule: given different element types x, y and z in the text, the mechanism rule they satisfy can be represented by a functional relationship:
R1:F(x,y,z)=a (1)
where R1 denotes rule 1 and the function F defines the mathematical relationship among the elements x, y and z, which can be determined case by case. For the example "a place is often followed by a line and a station name", the functional form can be written as:
n(x)=a*f(x)+b*g(y) (2)
where n(x) is the numerical representation of the element that immediately follows the place; f(x) and g(y) are the numerical representations of the line and the station, respectively; and a and b are the probability coefficients of x being connected to the line and to the station, respectively.
Constraint rule: different elements in the text must satisfy certain constraint conditions. Given different element types x and y in the text, the constraint condition can be expressed as:
R2:f(x)∈{y|a<y<b} (3)
where f(x) is a numerical calculation relating x and y, and the relationships between elements are constrained by restricting the numerical range of x and y.
Dependency rule: one or more elements in the text fall within a certain range, and the constraint condition can be expressed as:
R3:x,y,z∈{f(a),g(a)} (4)
where f(a) and g(a) are the range boundaries defined by the element a in different functional forms.
As described above, rules of different types can be represented functionally, and a rule set is then established:
∑R:{R1,R2,...,Rn} (5)
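Formulas (2) to (5) can be sketched in Python as three small functions gathered into a rule set. This is a minimal sketch under assumptions: the function names are hypothetical, the numerical representations f(x) and g(y) are passed in as plain numbers, and formula (4) is read as an inclusive interval bounded by f(a) and g(a), which the text does not state explicitly.

```python
from typing import Callable, Dict, Iterable


def mechanistic_rule(f_x: float, g_y: float, a: float, b: float) -> float:
    """Formula (2): n(x) = a*f(x) + b*g(y)."""
    return a * f_x + b * g_y


def constraint_rule(f_x: float, a: float, b: float) -> bool:
    """Formula (3): f(x) must lie strictly between a and b."""
    return a < f_x < b


def dependency_rule(values: Iterable[float], lower: float, upper: float) -> bool:
    """Formula (4), read as an interval (assumption): every element lies
    between the boundaries f(a) = lower and g(a) = upper."""
    return all(lower <= v <= upper for v in values)


# Formula (5): the rule set, here a mapping from rule name to rule function.
rule_set: Dict[str, Callable] = {
    "R1": mechanistic_rule,
    "R2": constraint_rule,
    "R3": dependency_rule,
}

# Example values chosen only for illustration.
n_x = rule_set["R1"](1.0, 2.0, 0.5, 0.25)  # 0.5*1.0 + 0.25*2.0
```

Keeping the rules in a mapping makes it straightforward to select the functions of one rule type when they are later combined into an activation function.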
Step three: embed the rules into a neural network and fuse them with a decision tree to form a text data stream classification model.
FIG. 4 shows step three: the rules are embedded in the neural network activation function and combined with a decision tree to classify the data stream.
The rule function set is integrated into a new function that serves as the activation function of the neural network neurons, while ensuring that the network still meets the training requirements of gradient descent, so that the neural network can classify the data stream according to the business rules. Taking a ReLU-type activation function as an example, when the functions of a given class of business rules are embedded in the activation function, the ReLU activation function can be expressed as:
max(0, Σ(R_i(x) * a_i)) (6)
where R_i is a rule function in a given class of rules, a_i is the weight of that rule function, and x is the variable that represents the input text in functional form.
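Formula (6) can be written directly as a short Python function. The sketch below assumes scalar inputs and uses two made-up rule functions and weights purely for illustration; the name `rule_relu` is hypothetical.

```python
from typing import Callable, Sequence


def rule_relu(x: float,
              rules: Sequence[Callable[[float], float]],
              weights: Sequence[float]) -> float:
    """Formula (6): max(0, sum_i R_i(x) * a_i)."""
    total = sum(w * rule(x) for rule, w in zip(rules, weights))
    return max(0.0, total)


# Hypothetical rule functions R_1(x) = x and R_2(x) = x^2 with weights 1.0 and -0.5.
rules = [lambda x: x, lambda x: x * x]
weights = [1.0, -0.5]

positive = rule_relu(1.0, rules, weights)  # 1.0 - 0.5 = 0.5
clipped = rule_relu(4.0, rules, weights)   # 4.0 - 8.0 < 0, so clipped to 0.0
```

As with an ordinary ReLU, the output is zero whenever the weighted sum of rule values is negative, so the rule functions decide which inputs activate the neuron.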
The formula above gives an activation function fused with the rules. Because different business types contain different numbers and types of rules, a separate neural network is built to classify each business. These neural networks are combined with a decision tree so that each network serves as a node of the tree, and the input data stream is classified layer by layer, finally giving a classification model for the business type of the data stream.
The text classification process in this step is explained using text classification in the power domain as an example.
FIG. 5 shows the classification model of a text data stream.
In FIG. 5, the text field "power transmission line fault trip" is vectorized and fed into the neural network + decision tree classification model of step three. Business division is carried out node by node from the top of the decision tree downward: if the node at the current level does not yield a business classification, classification continues at the node of the next level, and this is repeated until a text classification is obtained. In this example, "power transmission line fault trip" belongs to the power transmission business, so the classification result can be obtained in the neural network NN1 at the first-level node.
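The level-by-level classification described for FIG. 5 can be sketched as a cascade in Python. This is only a sketch: simple scoring functions stand in for the neural networks NN1, NN2 and so on, and the threshold, the business labels other than power transmission, and the input vectors are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Each node pairs a business label with a scorer; the scorer stands in for
# the neural network at that level of the decision tree (an assumption).
Node = Tuple[str, Callable[[Sequence[float]], float]]


def classify(vector: Sequence[float],
             nodes: List[Node],
             threshold: float = 0.5) -> Optional[str]:
    """Walk the nodes from top to bottom and return the first label accepted."""
    for label, score in nodes:
        if score(vector) >= threshold:
            return label
    return None  # no node produced a business classification


# Hypothetical two-level cascade with stand-in scorers.
nodes: List[Node] = [
    ("transmission", lambda v: v[0]),
    ("substation", lambda v: v[1]),
]

first_level = classify([0.9, 0.1], nodes)   # accepted by the first node
second_level = classify([0.1, 0.8], nodes)  # passed on to the second node
unmatched = classify([0.1, 0.1], nodes)     # no node accepts it
```

A text that the first node accepts, like the "power transmission line fault trip" example, never reaches the lower levels.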
Step four: and sorting the business rules to form rule parameters, forming an activation function to replace the activation function in the original extraction algorithm of the entity and the relationship, and providing the entity relationship extraction algorithm of the fusion rule.
FIG. 6, step four: sorting out rule parameters and extracting entity relation of fusion rule
And aiming at the classified text data of a certain specific service field, performing entity and relation extraction on the classified text data. Combining the business rule functions in different fields, which are combed in the step one, sorting parameters such as rule mechanism coefficients, rule constraint coefficients, sample boundary coefficients and the like, and improving the activation function of the neurons of the long and short memory networks, wherein the improvement process comprises the following steps:
Rule mechanism coefficient:
Figure BDA0003090429030000111
where α_i is the rule mechanism coefficient, the quantity shown in Figure BDA0003090429030000112 is the fitted value of the rule-mechanism function, n(x) is the actual value, and the quantity shown in Figure BDA0003090429030000113 is the average value. The coefficient reflects the degree to which the business data complies with the rules.
Rule constraint coefficient:
β_i = ∑( f(x)_i * k ) (8)
where β_i is the rule constraint coefficient, f(x)_i is the value calculated by the constraint function for an element, and k is 1 when that value satisfies the constraint condition and 0 otherwise.
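Formula (8) can be worked through with a short sketch. It assumes that the constraint condition is the open interval a < y < b used by the constraint rule in formula (3); the sample values and interval bounds are hypothetical.

```python
from typing import Sequence


def rule_constraint_coefficient(
    constraint_values: Sequence[float], lower: float, upper: float
) -> float:
    """Compute beta = sum_i f(x)_i * k, with k = 1 inside (lower, upper), else 0."""
    total = 0.0
    for value in constraint_values:
        k = 1 if lower < value < upper else 0  # indicator of the constraint
        total += value * k
    return total


# Hypothetical constraint-function outputs f(x)_i for four text elements.
sample_values = [0.25, 0.5, 1.5, 0.125]
beta = rule_constraint_coefficient(sample_values, lower=0.0, upper=1.0)
```

Only the values that satisfy the constraint contribute to the sum, so the out-of-range value 1.5 is ignored here.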
Sample boundary coefficient:
γ_i = ∑( ∏ f_k(x_j) ) (9)
where γ_i is the sample boundary coefficient, x_j is a single element in the text, and f_k(x_j) is the result of the range calculation for that element.
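One possible reading of formula (9) is sketched below: for each element x_j the results of all range functions f_k are multiplied, and the products are summed over the elements. Treating each f_k as a 0/1 indicator of a range boundary is an assumption made for illustration, as are the concrete boundaries and sample elements.

```python
from math import prod
from typing import Callable, Sequence

RangeFn = Callable[[float], float]


def sample_boundary_coefficient(
    elements: Sequence[float], range_fns: Sequence[RangeFn]
) -> float:
    """Compute gamma = sum_j prod_k f_k(x_j)."""
    return sum(prod(f(x) for f in range_fns) for x in elements)


# Hypothetical range functions: each returns 1.0 when the element respects
# one boundary and 0.0 otherwise.
range_fns = [
    lambda x: 1.0 if x > 0.0 else 0.0,  # lower boundary
    lambda x: 1.0 if x < 5.0 else 0.0,  # upper boundary
]

gamma = sample_boundary_coefficient([1.0, 3.0, 6.0], range_fns)
```

Under this indicator reading, the coefficient counts the elements that lie inside every boundary at once.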
After the rule mechanism coefficient, the rule constraint coefficient and the sample boundary coefficient are obtained, they are introduced into the activation function of the long short-term memory (LSTM) network. The improved activation function is as follows:
Figure BDA0003090429030000114
where σ_r(z) is the activation function fused with the rules, α_i is the rule mechanism coefficient, β_i is the rule constraint coefficient, and γ_i is the sample boundary coefficient.
The rule coefficients are introduced into the forget gate, the input gate and the output gate of the LSTM network, so that the algorithm is effectively combined with the rules.
Step five: perform entity and relationship extraction on the unified data model using the rule-fused entity relationship extraction algorithm, and then construct the knowledge graph.
FIG. 7 illustrates step five: constructing the knowledge graph with the rule-fused entity relationship extraction algorithm.
After the entities and relationships in the text data stream of the unified data model are extracted, the relationships between entities fall into several types that mirror the rule types, such as mechanistic relationships, constraint relationships and dependency relationships. Entity-relationship pairs therefore form several kinds of combination units, and different entity nodes in the graph are connected through these relationships to form a knowledge graph containing multiple kinds of entity-relationship combination units. Taking the power field as an example, a knowledge graph containing mechanistic, constraint and dependency relationships is shown below:
FIG. 8 is a sample knowledge graph containing multiple types of relationships.
A unified data model in a given field often contains data from several sub-service ranges. A knowledge graph constructed from the unified data model is therefore usually partitioned into blocks by sub-service, and associations are established between sub-services through shared entities or associated services. The knowledge graph constructed from the unified data model thus takes the following form across the different sub-services:
FIG. 9 is an example of associations between knowledge graph sub-business graphs based on a unified data model.
Therefore, when the knowledge graph is constructed from the unified data model, entity and relationship extraction is first performed on the data of each sub-service, and the resulting sub-service knowledge is stored as a knowledge graph. The shared entities and associations among the sub-service knowledge graphs are then linked, so that the knowledge graph data of the different services is interconnected.
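The linking of sub-service knowledge graphs through shared entities can be sketched as follows. The triple representation, the sub-service names and the sample entities are hypothetical and serve only to illustrate the idea; the patent does not prescribe a storage format.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def find_shared_entities(subgraphs: Dict[str, List[Triple]]) -> Dict[str, Set[str]]:
    """Map every entity that occurs in more than one sub-service graph
    to the set of sub-services in which it appears."""
    services_by_entity: Dict[str, Set[str]] = defaultdict(set)
    for service, triples in subgraphs.items():
        for head, _relation, tail in triples:
            services_by_entity[head].add(service)
            services_by_entity[tail].add(service)
    return {e: s for e, s in services_by_entity.items() if len(s) > 1}


# Hypothetical sub-service graphs extracted from a unified data model.
subgraphs = {
    "transmission": [
        ("Line A", "connects_to", "Station B"),
        ("Line A", "has_fault", "trip"),
    ],
    "substation": [
        ("Station B", "contains", "Transformer T1"),
    ],
}

links = find_shared_entities(subgraphs)
```

Each entry in the result marks a point at which two sub-service graphs can be joined, here the station that appears in both the transmission and substation data.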
The processing in steps one to five together forms a vertical domain knowledge graph construction technique based on the unified data model.
The business rules of the text data in the unified data model are sorted out and expressed as functions according to the internal logic of the rules. The rule functions are introduced into the neuron activation function, and the text data streams in the unified data model are classified according to the business rule logic. The business rule functions are then sorted into rule parameters, and an activation function is constructed to replace the activation function in the original entity and relationship extraction algorithm, so that the constructed knowledge graph conforms more closely to the business rules and knowledge application in the vertical domain unified data model becomes more effective.
The advantages of the invention are as follows:
First, a method is provided for representing the business rules of text data in a unified data model as functions. The business rules of the text data in the unified model are sorted out and integrated, different types of rules such as mechanistic rules, constraint rules and dependency rules are extracted, the mechanisms and condition constraints in the business rules are analyzed, and the relevant rules are described in functional form.
Second, a fusion model of a neural network and a decision tree is designed to classify the text data streams in the unified data model. Based on the functionally expressed business rules, a new activation function for the neural network is derived, and a decision tree model with neural networks as nodes is designed to classify the business text data.
Third, a knowledge graph construction technique that fuses the business rules of the unified data model is designed. Based on the functionally expressed business rules, parameters such as the rule mechanism coefficient, the rule constraint coefficient and the sample boundary coefficient are derived, and the activation function of the long short-term memory (LSTM) network neurons used for entity and relationship extraction is improved, yielding a power-field knowledge graph construction technique that fuses the business rules.
As shown in FIG. 10, another objective of the present invention is to provide a vertical domain knowledge graph building system, comprising:
an acquisition unit for acquiring service data in the unified data model of the power grid;
a business rule sorting unit for sorting out the business rules in the service data and expressing them as functions to obtain a business rule mapping function;
an entity relationship establishing unit for embedding the rule mapping function into a neural network and fusing it with a decision tree to form a text data stream classification model that assigns the text data to different service types, and for performing entity and relationship extraction on the text data classified by the text data stream classification model to obtain an entity relationship extraction algorithm;
an entity relationship fusion unit for obtaining rule coefficients from the business rule mapping function and introducing the rule coefficients into the activation function of the entity relationship extraction algorithm to form an entity and relationship extraction algorithm fused with the business rules; and
a knowledge graph construction unit for performing entity relationship extraction on the power grid unified data model using the rule-fused entity relationship extraction algorithm and constructing the knowledge graph from the extraction result.
The business rule sorting unit specifically comprises:
a rule set arrangement unit for analyzing the rules and constraint conditions of the different business text data in the power grid unified data model, sorting out the rules to be followed in constructing the corpus relationship network, and listing a rule set;
a rule set classification unit for sorting the rule set and dividing the rules into mechanistic rules, constraint rules and dependency rules, where a mechanistic rule is that a place is often followed by a line name or a station name, a constraint rule is a "noun + number" expression in the text, and a dependency rule is vocabulary in the text related to the power transmission field; and
a business rule integration unit for counting the different types of rules and integrating them to obtain a description of the business rules of the text data.
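The three rule types handled by the rule set classification unit can be illustrated with a small token-level sketch. The place lexicon, the domain vocabulary, the English tokens and the regular expression are all hypothetical; the patent works on power-grid text and does not specify how rule instances are detected.

```python
import re
from typing import Dict, List

# Hypothetical lexicons used only for illustration.
PLACES = {"Northtown", "Eastville"}
FOLLOWERS = {"line", "station"}
DOMAIN_TERMS = {"tower", "insulator", "conductor"}
NOUN_NUMBER = re.compile(r"^[A-Za-z]+\d+$")  # e.g. "tower35"


def tag_rule_instances(tokens: List[str]) -> Dict[str, List[str]]:
    """Collect token spans that match each of the three rule types."""
    hits: Dict[str, List[str]] = {"mechanistic": [], "constraint": [], "dependency": []}
    for i, token in enumerate(tokens):
        following = tokens[i + 1] if i + 1 < len(tokens) else ""
        # Mechanistic rule: a place is followed by a line or station name.
        if token in PLACES and following.lower() in FOLLOWERS:
            hits["mechanistic"].append(f"{token} {following}")
        # Constraint rule: a "noun + number" expression.
        if NOUN_NUMBER.match(token):
            hits["constraint"].append(token)
        # Dependency rule: vocabulary from the power transmission field.
        if token.lower() in DOMAIN_TERMS:
            hits["dependency"].append(token)
    return hits


tagged = tag_rule_instances(["Northtown", "line", "tower35", "insulator", "damaged"])
```

Counting the entries under each key would then give the per-type rule statistics that the business rule integration unit combines.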
A third object of the present invention is to provide an electronic device, as shown in fig. 11, including a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the vertical domain knowledge graph construction method when executing the computer program.
The method comprises the following steps:
acquiring service data in a unified data model of the power grid;
sorting out and classifying the business rules in the service data, organizing the mechanisms and condition constraints in the business rules, and describing the relevant business rules in functional form to obtain a business rule function;
embedding the business rule function into a neural network and fusing it with a decision tree to form a text data stream classification model; sorting the business rules into rule parameters and forming an activation function to replace the activation function in the original entity and relationship extraction algorithm, thereby obtaining a rule-fused entity relationship extraction algorithm;
and performing entity relationship extraction on the unified data model using the rule-fused entity relationship extraction algorithm, and constructing a knowledge graph from the extraction result.
It is a fourth object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the vertical domain knowledge graph construction method.
The method comprises the following steps:
acquiring service data in a unified data model of the power grid;
sorting out and classifying the business rules in the service data, organizing the mechanisms and condition constraints in the business rules, and describing the relevant business rules in functional form to obtain a business rule function;
embedding the business rule function into a neural network and fusing it with a decision tree to form a text data stream classification model; sorting the business rules into rule parameters and forming an activation function to replace the activation function in the original entity and relationship extraction algorithm, thereby obtaining a rule-fused entity relationship extraction algorithm;
and performing entity relationship extraction on the unified data model using the rule-fused entity relationship extraction algorithm, and constructing a knowledge graph from the extraction result.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. A vertical domain knowledge graph construction method is characterized by comprising the following steps:
acquiring service data in a unified data model of the power grid;
sorting out the business rules in the service data, and expressing the business rules as functions to obtain a business rule mapping function;
embedding the rule mapping function into a neural network and fusing it with a decision tree to form a text data stream classification model that assigns text data to different service types; performing entity and relationship extraction on the text data classified by the text data stream classification model to obtain an entity relationship extraction algorithm;
obtaining rule coefficients from the business rule mapping function, and introducing the rule coefficients into the activation function of the entity relationship extraction algorithm to form an entity and relationship extraction algorithm fused with the business rules;
and performing entity relationship extraction on the power grid unified data model using the rule-fused entity relationship extraction algorithm, and constructing a knowledge graph from the extraction result.
2. The method of claim 1, wherein:
sorting out the business rules in the service data specifically comprises:
analyzing the rules and constraint conditions of the different business text data in the power grid unified data model, sorting out the rules to be followed in constructing the corpus relationship network, and listing a rule set;
sorting the rule set and dividing the rules into mechanistic rules, constraint rules and dependency rules, where a mechanistic rule is that a place is often followed by a line name or a station name, a constraint rule is a "noun + number" expression in the text, and a dependency rule is vocabulary in the text related to the power transmission field;
and counting the different types of rules, and integrating them to obtain a description of the business rules of the text data.
3. The method of claim 1, wherein:
expressing the business rules as functions to obtain the business rule mapping function specifically comprises:
converting the mechanisms and condition constraints in the business rules into functional form;
wherein the mechanistic rule is: for elements of different types x, y and z in the text, the mechanistic rule they satisfy is expressed by the functional relation:
n(x) = a*f(x) + b*g(y) (2)
where n(x) is the numerical representation of the element adjacent to and following the place, f(x) and g(y) are the numerical representations of the line and the station respectively, and a and b are the probability coefficients with which x is followed by a line and by a station respectively;
the constraint rule is: for elements of different types x and y in the text, the constraint condition is expressed as:
R2: f(x) ∈ { y | a < y < b } (3)
where f(x) is a numerical calculation relating x and y, and the relationship between different elements is constrained by constraining the numerical range of that calculation;
the dependency rule is: one or more elements in the text are contained within a certain range, and the constraint condition is expressed as:
R3: x, y, z ∈ { f(a), g(a) } (4)
where f(a) and g(a) are the range boundaries defined by the element a under different functional forms;
the rules of different types are then expressed as functions, and a rule set is established:
∑R: { R1, R2, ..., Rn } (5)
where ∑R represents the set of rule functions of a certain type, and R1, R2, ..., Rn are the rule functions in that set.
4. The method of claim 1, wherein:
embedding the rule mapping function into the neural network and fusing it with the decision tree to form the text data stream classification model specifically comprises:
integrating the rule function set to construct a new function that serves as the activation function of the neural network neurons, where the fused ReLU activation function is expressed as:
Max(0, ∑( R_i(x) * a_i )) (6)
where R_i is a rule function in a certain class of rules, a_i is the weight of that rule function, and x is the variable giving the functional representation of the input text;
and obtaining the rule-fused activation function from the above formula, constructing a separate neural network for classifying each service, combining the neural networks with the decision tree so that the different neural networks serve as nodes of the decision tree, and classifying the input data streams layer by layer to obtain the text data stream classification model.
5. The method of claim 1, wherein:
the specific method for obtaining the rule-fused entity relationship extraction algorithm comprises:
deriving the rule mechanism coefficient, the rule constraint coefficient and the sample boundary coefficient from the business rule mapping function;
the rule mechanism coefficient is a rule parameter formed by sorting out the business rules, and is calculated as follows:
Figure FDA0003090429020000031
where α_i is the rule mechanism coefficient, the quantity shown in Figure FDA0003090429020000032 is the fitted value of the rule-mechanism function, n(x) is the actual value, and the quantity shown in Figure FDA0003090429020000033 is the average value;
the rule constraint coefficient is calculated as follows:
β_i = ∑( f(x)_i * k ) (8)
where β_i is the rule constraint coefficient, f(x)_i is the value calculated by the constraint function for an element, and k is 1 when that value satisfies the constraint condition and 0 otherwise;
the sample boundary coefficient is calculated as follows:
γ_i = ∑( ∏ f_k(x_j) ) (9)
where γ_i is the sample boundary coefficient, x_j is a single element in the text, and f_k(x_j) is the result of the range calculation for that element;
introducing the rule mechanism coefficient, the rule constraint coefficient and the sample boundary coefficient into the activation function of the long short-term memory (LSTM) network to replace the activation function in the entity relationship extraction algorithm, thereby obtaining the rule-fused entity relationship extraction algorithm:
Figure FDA0003090429020000034
where σ_r(z) is the activation function fused with the rules, α_i is the rule mechanism coefficient, β_i is the rule constraint coefficient, and γ_i is the sample boundary coefficient.
6. The method of claim 1, wherein:
constructing the knowledge graph from the entity relationship extraction result specifically comprises:
after extracting the entities and relationships in the text data stream of the unified data model, obtaining relationships between entities whose types mirror the rule types, thereby obtaining entity-relationship combination units comprising mechanistic relationships, constraint relationships and dependency relationships, and connecting the different entity nodes in the graph through these relationships to form a knowledge graph containing the entity-relationship combination units.
7. The method of claim 6, wherein:
the knowledge graph constructed from the unified data model is partitioned into blocks by sub-service, and associations are established between the sub-services through shared entities or associated services.
8. A vertical domain knowledge graph building system, comprising:
an acquisition unit for acquiring service data in the unified data model of the power grid;
a business rule sorting unit for sorting out the business rules in the service data and expressing them as functions to obtain a business rule mapping function;
an entity relationship establishing unit for embedding the rule mapping function into a neural network and fusing it with a decision tree to form a text data stream classification model that assigns the text data to different service types, and for performing entity and relationship extraction on the text data classified by the text data stream classification model to obtain an entity relationship extraction algorithm;
an entity relationship fusion unit for obtaining rule coefficients from the business rule mapping function and introducing the rule coefficients into the activation function of the entity relationship extraction algorithm to form an entity and relationship extraction algorithm fused with the business rules; and
a knowledge graph construction unit for performing entity relationship extraction on the power grid unified data model using the rule-fused entity relationship extraction algorithm and constructing the knowledge graph from the extraction result.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the vertical domain knowledge graph construction method of any one of claims 1-7 when executing the computer program.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the vertical domain knowledge graph construction method of any one of claims 1-7.
CN202110594447.6A 2021-05-28 2021-05-28 Vertical domain knowledge graph construction method, system, equipment and storage medium Pending CN113312494A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110594447.6A CN113312494A (en) 2021-05-28 2021-05-28 Vertical domain knowledge graph construction method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110594447.6A CN113312494A (en) 2021-05-28 2021-05-28 Vertical domain knowledge graph construction method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113312494A true CN113312494A (en) 2021-08-27

Family

ID=77376462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110594447.6A Pending CN113312494A (en) 2021-05-28 2021-05-28 Vertical domain knowledge graph construction method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113312494A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114282011A (en) * 2022-03-01 2022-04-05 支付宝(杭州)信息技术有限公司 Knowledge graph construction method and device, and graph calculation method and device
CN114282011B (en) * 2022-03-01 2022-08-23 支付宝(杭州)信息技术有限公司 Knowledge graph construction method and device, and graph calculation method and device
CN117436420A (en) * 2023-12-18 2024-01-23 武汉大数据产业发展有限公司 Method and device for generating business process model based on natural language processing
CN117648975A (en) * 2023-12-19 2024-03-05 北京侏罗纪软件股份有限公司 Oil-gas enterprise knowledge graph construction method based on petroleum business model

Similar Documents

Publication Publication Date Title
CN113312494A (en) Vertical domain knowledge graph construction method, system, equipment and storage medium
WO2018014610A1 (en) C4.5 decision tree algorithm-based specific user mining system and method therefor
CN103761254B (en) Method for matching and recommending service themes in various fields
CN111309824A (en) Entity relationship map display method and system
CN107808278A (en) A kind of Github open source projects based on sparse self-encoding encoder recommend method
CN105117422A (en) Intelligent social network recommender system
CN103838857B (en) Automatic service combination system and method based on semantics
CN110288824B (en) Early-late peak congestion condition and propagation mechanism analysis method based on Granger cautuality road network
CN113326377A (en) Name disambiguation method and system based on enterprise incidence relation
CN110751355A (en) Scientific and technological achievement assessment method and device
CN113254669B (en) Knowledge graph-based power distribution network CIM model information completion method and system
CN110990718A (en) Social network model building module of company image improving system
CN108664509A (en) A kind of method, apparatus and server of extemporaneous inquiry
CN104331523A (en) Conceptual object model-based question searching method
CN112580902A (en) Object data processing method and device, computer equipment and storage medium
CN107729939A (en) A kind of CIM extended method and device towards newly-increased power network resources
CN112508726A (en) False public opinion identification system based on information spreading characteristics and processing method thereof
Du et al. Research on decision tree algorithm based on information entropy
CN109885797B (en) Relational network construction method based on multi-identity space mapping
CN114925165A (en) Consultation task decomposition method, system and platform
AU2021102006A4 (en) A system and method for identifying online rumors based on propagation influence
WO2024093468A1 (en) Risk evaluation method and system for windage yaw flashover, device, and readable storage medium
CN115334179B (en) Unknown protocol reverse analysis method based on named entity recognition
CN116541792A (en) Method for carrying out group partner identification based on graph neural network node classification
CN112560213B (en) System modeling method and system based on model system engineering and hyper-network theory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination