CN111090740A - Knowledge graph generation method for dialog system - Google Patents

Info

Publication number
CN111090740A
Authority
CN
China
Prior art keywords
node
knowledge graph
nodes
graph
determining
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911237107.7A
Other languages
Chinese (zh)
Other versions
CN111090740B (en
Inventor
余轲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lun Zi Technology Co ltd
Original Assignee
Beijing Lun Zi Technology Co ltd
Application filed by Beijing Lun Zi Technology Co ltd
Priority to CN201911237107.7A
Publication of CN111090740A
Application granted
Publication of CN111090740B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a method for generating a knowledge graph of a question-answering system. One embodiment of the method comprises: initializing a knowledge graph; acquiring an input sentence; determining each node in the initialization knowledge graph corresponding to the input sentence; determining structured features and unstructured features of each node; determining graph embedding features of each node in the initialization knowledge graph by using a belief propagation mechanism; and generating the knowledge graph of the question-answering system. In generating the knowledge graph, the method uses both the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system, and can therefore better assist in simulating and generating the dialogue sentences of a real speaker during question answering.

Description

Knowledge graph generation method for dialog system
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to the technical field of computer natural language processing, and particularly relates to a method for generating a knowledge graph of a question-answering system.
Background
An open question-answering system can, in an open dialogue environment and based on a knowledge base, learn the sentence embedding features of the current input sentence from the preceding dialogue sentences and thereby output a target sentence. A knowledge graph is a knowledge representation method that uses a graph model to describe knowledge and to model relationships between entities. A knowledge-graph-based question-answering system answers questions based on the knowledge graph: it maps the semantic understanding result onto the knowledge graph and generates graph embedding features in order to answer the question. Generating graph embedding features by embedding a knowledge graph means mapping the contents of the knowledge graph, including entities and relations, into a continuous vector space, so that each node has a corresponding graph embedding feature.
Because the knowledge graph uses a vector representation, numerical computation can be used to improve computational efficiency when the knowledge graph is applied in a question-answering system. The vector representation also allows currently popular machine learning methods, such as neural networks and deep learning, to be used effectively, which increases the diversity of question-answering system designs.
Existing knowledge graphs for question-answering systems can capture the open-ended character of dialogue, but because they lack structured dialogue-state features they cannot be applied directly to knowledge-interaction scenarios that depend on structured features. If a knowledge graph could be constructed from both the structured features and the unstructured features in the question-answering system, with new nodes added, contextual knowledge propagated, and the knowledge graph updated as sentences continue to be input, the dialogue quality of the question-answering system could be improved.
Disclosure of Invention
The embodiment of the application provides a method for generating a knowledge graph of a question answering system.
In a first aspect, an embodiment of the present application provides a method for generating a knowledge graph of a question-answering system, where the method includes: generating an initialization knowledge graph based on a dialogue sample library; acquiring an input sentence; determining each node in the initialization knowledge graph corresponding to the input sentence, determining structured features of each node, and determining unstructured features of each node; and determining graph embedding features of each node in the initialization knowledge graph based on the determined structured features and unstructured features by using a belief propagation mechanism, and generating the knowledge graph of the question-answering system.
In some embodiments, generating the initialization knowledge graph based on the conversation sample library includes: generating nodes and edges based on a structured knowledge base in the conversation sample base; updating the nodes and edges based on unstructured dialog statements in the dialog sample library; wherein the dialogue sample library comprises a structured knowledge base and unstructured dialogue sentences, and the nodes and edges form an initialization knowledge graph.
In some embodiments, generating nodes and edges based on a structured knowledge base in a conversational sample library includes: generating nodes in the initialization knowledge graph, the nodes comprising: project nodes, attribute nodes and entity nodes; generating edges in the initialization knowledge graph, wherein the edges represent relationships between different nodes.
In some embodiments, updating the nodes and edges based on unstructured dialogue sentences in the dialogue sample library includes: if an unstructured sentence in the dialogue sample library contains a node that is not in the initialization knowledge graph, adding the new node and updating the edges according to the node relationships.
In some embodiments, determining the structured features of each node comprises: determining a one-hot vector of the number of occurrences of each node, which represents the number of times the node occurs in all sentences stored in the question-answering system; determining a one-hot vector of the type of each node, which represents the type of the node; determining a one-hot vector of the occurrence status of each node, which indicates whether the node occurs in the input sentence; and concatenating the one-hot vectors of the number of occurrences, the type, and the occurrence status to determine the structured features of each node.
In some embodiments, determining the unstructured features of each node comprises: generating an entity set based on the input statement; determining sentence embedding characteristics of the input sentence; determining unstructured features of the each node based on the entity sets and the sentence embedding features.
In some embodiments, generating the entity set based on the input sentence comprises: initializing the entity set as an empty set; if the input sentence contains entity nodes in the initialization knowledge graph, determining the set of those entity nodes as the entity set; and if the input sentence does not contain any entity node in the initialization knowledge graph, using the entity set corresponding to the previous sentence as the entity set, where the previous sentence is the sentence that immediately precedes the input sentence among those stored in the question-answering system.
In some embodiments, determining sentence embedding characteristics of the input sentence comprises: taking the input statement as an input of a recurrent neural network; and outputting the value of the last hidden layer of the recurrent neural network as the sentence embedding feature of the input sentence.
In some embodiments, determining graph embedding features of each node in the initialization knowledge graph based on the determined structured features and unstructured features by using a belief propagation mechanism, and generating the knowledge graph of the question-answering system, comprises: layering the nodes in the initialization knowledge graph; determining the concatenation of the determined structured features and unstructured features as the graph embedding features of the layer-0 nodes in the initialization knowledge graph; updating the graph embedding features of the nodes of each layer in the initialization knowledge graph by using the belief propagation mechanism, where the belief propagation mechanism updates the knowledge graph by passing information between nodes; and concatenating the graph embedding features of the nodes of each layer to generate the knowledge graph of the question-answering system.
In some embodiments, the method further comprises: sending the knowledge graph of the question-answering system to a target display device, and controlling the target display device to display the knowledge graph.
In a second aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any implementation manner of the first aspect.
In a third aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method as described in any implementation manner of the first aspect.
In the method provided by the embodiments of the present application, a knowledge graph is initialized, an input sentence is acquired, each node in the initialization knowledge graph corresponding to the input sentence is determined, the structured features and unstructured features of each node are determined, and the graph embedding features of each node in the initialization knowledge graph are determined, thereby generating the knowledge graph for the question-answering system.
One of the above embodiments of the present application has the following beneficial effects: an initialization knowledge graph is generated based on the structured knowledge base and the unstructured dialogue sentences in the dialogue sample library, and the graph embedding features of each node in the initialization knowledge graph are determined using the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system. Because structured features and unstructured features are both taken into account when generating the graph embedding features of the knowledge graph nodes, the real dialogue environment information of the question-answering system can be captured effectively. The embodiments of the present application can therefore better assist in simulating and generating the dialogue sentences of a real speaker during question answering.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating a knowledge-graph of a question-answering system according to the present application;
FIG. 3 is a flow diagram for one embodiment of generating an initialization knowledge-graph according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating a knowledge-graph of a question-answering system in accordance with the present application;
FIG. 5 is a schematic block diagram of a computer system suitable for use in implementing an electronic device according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating information of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a text processing application, a natural language processing application, a question and answer system application, and the like.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide conversational speech input) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server that provides various services, such as a knowledge graph generation server that analyzes sentences input by the terminal devices 101, 102, 103 and generates corresponding knowledge graphs. The knowledge graph generation server may analyze and otherwise process received data such as sentences, and feed the processing result (e.g., the knowledge graph) back to the terminal device.
It should be noted that the method for generating the knowledge graph of the question and answer system provided in the embodiment of the present application is generally performed by the server 105, and accordingly, the apparatus for generating the knowledge graph of the question and answer system is generally disposed in the server 105.
It should be noted that the server 105 may also store the sentences locally, and may directly extract the locally stored dialogue sentences to generate the knowledge graph of the question-answering system, in which case the exemplary system architecture 100 may not include the terminal devices 101, 102, 103 and the network 104.
It should be noted that the terminal apparatuses 101, 102, and 103 may also have a knowledge-graph generation application installed therein, and in this case, the method for generating the knowledge graph of the question-answering system may also be executed by the terminal apparatuses 101, 102, and 103. At this point, the exemplary system architecture 100 may also not include the server 105 and the network 104.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide a knowledge graph generation service), or as a single piece of software or software module. No specific limitation is imposed here.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating a knowledge-graph of a question-answering system in accordance with the present application is shown. The method for generating the knowledge graph of the question answering system comprises the following steps:
Step 201, generating an initialization knowledge graph based on the dialogue sample library.
In this embodiment, the nodes and edges in the knowledge-graph are generated based on a structured knowledge base contained in the conversational sample library within a defined specific domain. The nodes and edges are updated based on unstructured conversational statements stored in a conversational sample library. The nodes and edges generated based on the dialog sample library constitute the initialization knowledge-graph.
Step 202, an input statement is obtained.
In this embodiment, the execution body of the method for generating a knowledge graph (e.g., the server shown in FIG. 1) may acquire an input sentence. Here, the input sentence may be a dialogue sentence of arbitrary content. For example, the input sentence may be a dialogue sentence about weather conditions.
The input sentence may be uploaded to the execution body by a terminal device (for example, terminal devices 101, 102, and 103 shown in FIG. 1) communicatively connected to the execution body through a wired or wireless connection, or may be stored locally in the execution body. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra-wideband) connection, and other wireless connection means now known or developed in the future.
Step 203, determining each node in the initialization knowledge graph corresponding to the input sentence, determining the structural feature of each node, and determining the non-structural feature of each node.
In the present embodiment, the execution subject (e.g., the server shown in fig. 1) determines each node in the initialization knowledge graph that matches each word in the input sentence according to the input sentence. And calculating the structural features of each node based on the input sentences and the conversation sample library, and calculating the unstructured features of each node based on the input sentences.
Step 204, determining graph embedding features of each node in the initialization knowledge graph based on the determined structured features and unstructured features by using a belief propagation mechanism, and generating the knowledge graph of the question-answering system.
In this embodiment, the execution body (for example, the server shown in FIG. 1) concatenates the structured features and the unstructured features of each node and, using a belief propagation mechanism, propagates them to the neighborhood nodes of each node in the initialization knowledge graph, so as to determine the graph embedding features of the nodes in the initialization knowledge graph and generate the knowledge graph of the question-answering system. The belief propagation mechanism updates the knowledge graph by passing information between nodes.
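The propagation described in step 204 can be illustrated with a minimal sketch. The update rule used here, in which each node takes the element-wise mean of its neighbours' previous-layer features, is a generic neighbour-averaging stand-in chosen for illustration and is an assumption, not the update formula of this application; the function name `propagate`, the toy graph, and the feature values are likewise hypothetical. Only the overall shape follows the text: layer 0 holds the concatenated structured and unstructured features, and the per-layer features are concatenated at the end.

```python
from typing import Dict, List

Vector = List[float]


def propagate(
    adjacency: Dict[str, List[str]],
    layer0: Dict[str, Vector],
    num_layers: int,
) -> Dict[str, Vector]:
    """Return, per node, the concatenation of its layer 0..num_layers features.

    ASSUMPTION: each round replaces a node's feature with the element-wise
    mean of its neighbours' previous-layer features. This is a stand-in,
    not the update rule of the application.
    """
    current = dict(layer0)
    result = {node: list(vec) for node, vec in layer0.items()}
    for _ in range(num_layers):
        nxt: Dict[str, Vector] = {}
        for node, vec in current.items():
            neighbours = adjacency.get(node, [])
            if not neighbours:
                nxt[node] = [0.0] * len(vec)  # isolated node receives no message
                continue
            nxt[node] = [
                sum(current[n][i] for n in neighbours) / len(neighbours)
                for i in range(len(vec))
            ]
        for node in result:
            result[node].extend(nxt[node])  # concatenate this layer's features
        current = nxt
    return result


# Toy graph: one project node linked to an attribute node and an entity node.
adjacency = {
    "item": ["attr", "entity"],
    "attr": ["item"],
    "entity": ["item"],
}
# Layer 0 = [structured feature, unstructured feature] concatenated (toy values).
layer0 = {"item": [1.0, 0.0], "attr": [0.0, 2.0], "entity": [0.0, 4.0]}
embeddings = propagate(adjacency, layer0, num_layers=1)
print(embeddings["item"])
```

With one layer, each node's final embedding is its own layer-0 feature followed by the average of its neighbours' layer-0 features, which shows how context attached to one node reaches the nodes connected to it.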
The embodiment presented in FIG. 2 has the following beneficial effects: an initialization knowledge graph is generated based on the structured knowledge base and the unstructured dialogue sentences in the dialogue sample library, and the graph embedding features of each node in the initialization knowledge graph are determined using the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system. Because structured features and unstructured features are both taken into account when generating the graph embedding features of the knowledge graph nodes, the real dialogue environment information of the question-answering system can be captured effectively. This embodiment can therefore better assist in simulating and generating the dialogue sentences of a real speaker during question answering.
With continued reference to FIG. 3, FIG. 3 illustrates a flow 300 of one embodiment of initializing a knowledge-graph according to the present application. The initialization process may include the steps of:
Step 301, determining a basic structure of the knowledge graph.
In this embodiment, the basic structure of the initialization knowledge-graph is first determined. A knowledge graph is a semantic network that characterizes relationships between entities, describes real-world objects and their interrelationships in a structured form, and stores the objects and their interrelationships as structured knowledge. In this embodiment, the basic structure of the knowledge-graph includes nodes and edges, where each node represents an entity of specific knowledge and the connecting edges of the nodes represent relationships between the entities.
Step 302, obtaining a dialogue sample library.
In the present embodiment, the question-answering system is implemented based on a dialogue sample library. The dialogue sample library comprises a structured knowledge base and unstructured dialogue sentences in a defined specific field, and the question answering system obtains answers by matching questions in the dialogue sample library. The structured knowledge base is stored in a list form, and specifically includes items and corresponding attributes thereof. Unstructured conversation statements are stored in the form of statements of a daily chat conversation. In this embodiment, the structured knowledge base of the dialogue sample base may contain public knowledge bases, such as knowledge bases of topics including natural language understanding, self-learning, scientific knowledge, and the like. The unstructured dialog statements may contain daily dialog statements collected in a defined application scenario, etc.
Step 303, generating nodes and edges based on the structured knowledge base in the dialogue sample library, and updating the nodes and edges based on the unstructured dialogue sentences in the dialogue sample library.
In this embodiment, the nodes in the initialization knowledge graph are generated based on the structured text in the dialogue sample library, and the nodes include: project nodes, attribute nodes, and entity nodes. The edges in the initialization knowledge graph are also generated based on the structured text in the dialogue sample library, where the edges represent relationships between different nodes. Triples of the form node-edge-node are thereby determined.
The nodes and edges are then updated based on the unstructured sentences in the dialogue sample library: if a sentence in the dialogue sample library contains a node that is not in the initialization knowledge graph, the new node is added and the edges are updated according to the relationship between the newly added node and the other nodes. If the sentence does not contain any node that is absent from the initialization knowledge graph, the nodes and edges are not updated.
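The two phases of step 303 can be sketched as follows. This is a minimal illustration under stated assumptions: the knowledge base is represented as a list of dictionaries, the edge labels (`has_attribute`, `mentioned_with`) are invented for the example, and the mentions extracted from dialogue sentences are supplied ready-made, since the text does not specify how they are detected. The function names `build_initial_graph` and `update_graph` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (node, edge label, node)
Mention = Tuple[str, str, str, str]  # (node name, node type, relation, related node)


def build_initial_graph(
    knowledge_base: List[Dict[str, str]], item_key: str
) -> Tuple[Dict[str, str], Set[Triple]]:
    """Create project, attribute and entity nodes plus node-edge-node triples."""
    nodes: Dict[str, str] = {}  # node name -> node type
    edges: Set[Triple] = set()
    for row in knowledge_base:
        item = row[item_key]
        nodes[item] = "project"
        for attribute, value in row.items():
            if attribute == item_key:
                continue
            nodes[attribute] = "attribute"
            nodes[value] = "entity"
            # ASSUMPTION: this edge-labelling scheme is illustrative only.
            edges.add((item, "has_attribute", attribute))
            edges.add((item, attribute, value))
    return nodes, edges


def update_graph(
    nodes: Dict[str, str], edges: Set[Triple], mentions: List[Mention]
) -> None:
    """Add nodes found in dialogue sentences that the graph does not yet contain."""
    for name, node_type, relation, related in mentions:
        if name in nodes:
            continue  # already known: nodes and edges stay unchanged
        nodes[name] = node_type
        if related in nodes:
            edges.add((related, relation, name))


kb = [{"name": "Alice", "city": "Beijing"}]
nodes, edges = build_initial_graph(kb, item_key="name")
update_graph(
    nodes,
    edges,
    [
        ("Shanghai", "entity", "mentioned_with", "Alice"),  # new node
        ("Beijing", "entity", "mentioned_with", "Alice"),  # already present
    ],
)
print(sorted(nodes))
```

Only the mention of "Shanghai" changes the graph; the mention of "Beijing" is skipped because that node already exists, matching the rule that nodes and edges are left untouched when a sentence introduces nothing new.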
Step 304, determining the obtained knowledge graph as the initialization knowledge graph.
In the present embodiment, a knowledge graph composed of the obtained nodes and edges is used as the initialization knowledge graph.
One embodiment presented in fig. 3 has the following beneficial effects: an initialization knowledge graph is generated based on the structured knowledge base and the unstructured dialogue statements in the dialogue sample base. The knowledge-graph is composed of nodes and edges. The knowledge base and the dialogue sentences in the dialogue sample base are considered simultaneously in the process of determining the nodes and the edges, so that the nodes in the knowledge base stored in a structured form and the nodes in the dialogue sentences stored in an unstructured form can be completely contained. Therefore, the knowledge graph based on the question answering process can be better generated by the embodiment of the application.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for generating a knowledge-graph of a question-and-answer system is shown. The flow 400 of the method for generating a knowledge-graph includes the steps of:
Step 401, acquiring an input sentence.
In this embodiment, the execution body of the method for generating a knowledge graph of a question-answering system (e.g., the server shown in FIG. 1) may acquire the input sentence. Here, the input sentence may be an arbitrary dialogue sentence.
Here, the input sentence may be uploaded to the execution body by a terminal device (for example, terminal devices 101, 102, and 103 shown in fig. 1) communicatively connected to the execution body through a wired connection or a wireless connection, or may be locally stored in the execution body.
Step 402, determining each node in the initialization knowledge graph corresponding to the input sentence.
In this embodiment, all words in the input sentence are matched with all nodes in the initialization knowledge graph, and each node corresponding to all words in the input sentence is determined.
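The matching in step 402 can be sketched as below. The text does not state how words are segmented or compared, so this example assumes lower-cased whitespace tokenisation with trailing punctuation stripped and exact string equality; the function name `match_nodes` and the sample nodes are hypothetical.

```python
from typing import Dict, List


def match_nodes(sentence: str, nodes: Dict[str, str]) -> List[str]:
    """Return graph nodes whose name equals a token of the sentence.

    ASSUMPTION: lower-cased whitespace tokenisation and exact string match.
    """
    lookup = {name.lower(): name for name in nodes}
    matched: List[str] = []
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in lookup and lookup[word] not in matched:
            matched.append(lookup[word])
    return matched


nodes = {"Beijing": "entity", "weather": "attribute", "Shanghai": "entity"}
matched = match_nodes("How is the weather in Beijing?", nodes)
print(matched)
```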
Step 403, concatenating the one-hot vectors of the occurrence status, the number of occurrences, and the type of each node to determine the structured features of each node.
In this embodiment, after the input sentence is acquired, the execution body may determine each node in the initialization knowledge graph corresponding to the input sentence. Based on the dialogue sample library and the input sentence, it determines whether each node appears in the input sentence, determines the type of each node, and counts the number of times each node appears in the dialogue sentences of the dialogue sample library stored by the question-answering system. The occurrence status, the type, and the number of occurrences of each node are each expressed as a one-hot vector, and the structured feature $F_t(v)$ of a node in the initialization knowledge graph is obtained by concatenating the three one-hot vectors, where v denotes a node, t denotes the calculation for the current input sentence, and F denotes the feature vector.
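The construction of the structured feature can be illustrated with a short sketch. The vector sizes and the concatenation order are assumptions made for the example: occurrence counts are capped at a hypothetical `MAX_COUNT` so that the count fits a fixed-length one-hot vector, and the three vectors are concatenated in the order count, type, occurrence status. The names `one_hot` and `structured_feature` are hypothetical.

```python
from typing import List

NODE_TYPES = ["project", "attribute", "entity"]
MAX_COUNT = 3  # ASSUMPTION: counts of MAX_COUNT or more share the last slot


def one_hot(index: int, size: int) -> List[int]:
    """Return a vector of zeros with a single 1 at position index."""
    vector = [0] * size
    vector[index] = 1
    return vector


def structured_feature(
    node: str,
    node_type: str,
    input_tokens: List[str],
    stored_sentences: List[List[str]],
) -> List[int]:
    """Concatenate one-hot vectors for count, type and occurrence status."""
    count = sum(tokens.count(node) for tokens in stored_sentences)
    count_vec = one_hot(min(count, MAX_COUNT), MAX_COUNT + 1)
    type_vec = one_hot(NODE_TYPES.index(node_type), len(NODE_TYPES))
    seen_vec = one_hot(1 if node in input_tokens else 0, 2)
    return count_vec + type_vec + seen_vec


stored = [["weather", "in", "beijing"], ["beijing", "is", "sunny"]]
feature = structured_feature("beijing", "entity", ["is", "beijing", "cold"], stored)
print(feature)
```

In this toy case the node occurs twice in the stored sentences, is of type entity, and appears in the input sentence, so exactly one position is set in each of the three segments of the resulting nine-element vector.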
Step 404, calculating the entity node condition in the input statement, and generating an entity set.
In this embodiment, an entity set $E_t$ is generated according to the entity-type nodes in the input sentence, where E denotes the entity set and t denotes the calculation for the current input sentence. The entity set is initialized as an empty set. If the input sentence contains entity nodes in the initialization knowledge graph, the corresponding entity nodes are included in the entity set. If the input sentence does not contain any entity node in the initialization knowledge graph, the entity set $E_{t-1}$ corresponding to the previous sentence is used as the entity set, where E denotes the entity set and t-1 denotes the calculation for the previous sentence. The previous sentence is the sentence that immediately precedes the input sentence in the question-answering system. As the question-answering system is used, input sentences are continuously received and the entity set is updated accordingly.
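The entity-set rule of step 404 can be sketched in a few lines. The example assumes pre-tokenised sentences and a dictionary mapping node names to node types; the function name `update_entity_set` and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def update_entity_set(
    input_tokens: List[str], nodes: Dict[str, str], previous: Set[str]
) -> Set[str]:
    """Return E_t: entity nodes in the sentence, or E_{t-1} if there are none."""
    current: Set[str] = set()  # the entity set starts as the empty set
    for token in input_tokens:
        if nodes.get(token) == "entity":
            current.add(token)
    return current if current else set(previous)


nodes = {"beijing": "entity", "weather": "attribute"}
e1 = update_entity_set(["weather", "in", "beijing"], nodes, set())
e2 = update_entity_set(["is", "it", "cold"], nodes, e1)
print(e1, e2)
```

The second sentence mentions no entity node, so the entity set of the first sentence is carried over, which keeps the dialogue anchored to the entity most recently discussed.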
Step 405, input the input sentence into the recurrent neural network, generating the embedded feature of the input sentence.
In this embodiment, the current input statement is
Figure BDA0002305174140000101
Where t denotes the calculation for the current input sentence, t-1 denotes the calculation for the previous sentence, ntIndicating a common inclusion of n in the input sentencetIndividual word (token), xtIt is shown that the input sentence,
Figure BDA0002305174140000102
representing words in the input sentence. The input sentence is input into the recurrent neural network, and the Long Short-Term Memory network (LSTM) is taken as an example in the embodiment:
ht,j=LSTM(ht,j-1,xt,j)
wherein x ist,jRepresenting the jth word in the input sentence, i.e. input sentence xtThe words are input into the recurrent neural network one by one. h ist,j-1And representing the output of the recurrent neural network corresponding to the j-1 th word of the input sentence, namely the embedded characteristic of the j-1 th word. h ist,jIndicating the embedded feature corresponding to the jth word of the input sentence. Wherein the content of the first and second substances,
Figure BDA0002305174140000111
namely, the recurrent neural network output of the previous sentence (t-1) is used as the input of the 1 st word of the input sentence (t) corresponding to the recurrent neural network, wherein the previous sentence is the previous sentence aiming at the input sentence in the question-answering system. The state value of the last hidden layer of the recurrent neural network is used as the output h of the recurrent neural networkt,j
Figure BDA0002305174140000112
As an input statement xtSentence embedding feature utWhere u represents the sentence embedding feature vector.
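The recurrence over the words of a sentence, with the state carried over from the previous sentence, can be sketched as follows. This is an illustration under stated assumptions: a plain tanh recurrent cell stands in for the LSTM cell to keep the example short, and the names `rnn_cell` and `embed_sentence`, the weights, and the word vectors are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def rnn_cell(h_prev: Vector, x: Vector,
             w_h: Sequence[Vector], w_x: Sequence[Vector]) -> Vector:
    """One recurrent step: h_j = tanh(W_h h_{j-1} + W_x x_j).

    Assumption: a plain tanh cell replaces the LSTM cell; the recurrence
    over the word index j has the same shape.
    """
    return [
        math.tanh(sum(a * b for a, b in zip(w_h[i], h_prev)) +
                  sum(a * b for a, b in zip(w_x[i], x)))
        for i in range(len(w_h))
    ]


def embed_sentence(words: Sequence[Vector], h_init: Vector,
                   w_h: Sequence[Vector], w_x: Sequence[Vector]) -> Vector:
    """Run the cell over all words; the last state is the sentence embedding."""
    h = list(h_init)  # initial state: final state of the previous sentence
    for x in words:
        h = rnn_cell(h, x, w_h, w_x)
    return h


# Hypothetical 2-dimensional weights and word vectors
w_h = [[0.5, 0.0], [0.0, 0.5]]
w_x = [[1.0, 0.0], [0.0, 1.0]]
u_prev = embed_sentence([[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0], w_h, w_x)
u_curr = embed_sentence([[0.5, -0.5]], u_prev, w_h, w_x)  # state carried over
```

Passing `u_prev` as the initial state of the second call is what links consecutive sentences; starting from a zero state instead would give a different embedding for the same words.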
Step 406, determining the unstructured features of each node by using the entity sets and the sentence embedding features.
In this embodiment, the unstructured feature of each node v is computed from $E_t$ and $u_t$: $M_t(v) = \lambda_t M_{t-1}(v) + (1 - \lambda_t) u_t$
Wherein M ist-1(v) And M is an unstructured feature set corresponding to the previous sentence, wherein the previous sentence is the previous sentence aiming at the input sentence in the question-answering system. Lambda [ alpha ]tIs a control parameter, where t represents the calculation for the current input statement:
Figure BDA0002305174140000113
wherein WincTo control the transformation matrix, σ is a scale parameter, which is not initialized when all words in the input sentence are initializedWhen nodes in the knowledge graph correspond, Mt(v)=Mt-1(v)。
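The interpolation between the previous unstructured feature and the sentence embedding can be sketched as follows. The gate used here is a hypothetical choice, since the defining equation for the control parameter is not reproduced in this text: the sketch assumes a sigmoid of a weighted sum over the concatenation of the previous feature and the sentence embedding for nodes in the entity set, and a gate value of 1 for all other nodes. The name `update_unstructured` and the use of a weight vector `w_inc` instead of a matrix are also assumptions.

```python
import math
from typing import Dict, List, Set

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def update_unstructured(m_prev: Dict[str, Vector], u_t: Vector,
                        entity_set: Set[str], w_inc: Vector) -> Dict[str, Vector]:
    """Compute M_t(v) = lam * M_{t-1}(v) + (1 - lam) * u_t for every node v.

    Assumption (hypothetical gate): lam = sigmoid(w_inc . [M_{t-1}(v), u_t])
    for nodes in the entity set, and lam = 1 for all other nodes, so nodes
    that are not mentioned keep their previous feature.
    """
    m_t: Dict[str, Vector] = {}
    for node, m in m_prev.items():
        if node in entity_set:
            lam = sigmoid(sum(w * f for w, f in zip(w_inc, m + u_t)))
        else:
            lam = 1.0
        m_t[node] = [lam * a + (1.0 - lam) * b for a, b in zip(m, u_t)]
    return m_t


# Hypothetical nodes with 2-dimensional features
m_prev = {"beijing": [0.0, 0.0], "hotel": [0.2, 0.4]}
u_t = [1.0, -1.0]
m_t = update_unstructured(m_prev, u_t, {"beijing"}, w_inc=[0.0, 0.0, 0.0, 0.0])
```

With an all-zero `w_inc` the gate equals 0.5, so the feature of the mentioned node moves halfway toward the sentence embedding while the unmentioned node is left unchanged.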
Step 407, determining graph embedding features of each node in the initialization knowledge graph based on the structured features and the unstructured features, and generating the knowledge graph of the question-answering system.
In this embodiment, the determined structured features and unstructured features are propagated to the neighborhood nodes in the initialization knowledge graph by a confidence propagation mechanism. The nodes in the initialization knowledge graph are layered into K layers, where the graph embedding feature of node v at the k-th layer is denoted $V_t^k(v)$ and t denotes the computation for the current input sentence. The graph embedding feature of a node at layer 0 is the concatenation of its determined structured feature and unstructured feature, $V_t^0(v) = [F_t(v), M_t(v)]$. The k-th layer feature $V_t^k(v)$ is computed as follows:
[Equation image BDA0002305174140000114: update rule for $V_t^k(v)$; not reproduced in this text]
where $N_t(v)$ denotes the set of neighborhood nodes of node v. The contribution of a neighborhood node $v' \in N_t(v)$ depends on the graph embedding feature $V_t^{k-1}(v')$ of v' at layer (k-1), the edge label $e_{v \to v'}$, and a parameter matrix $W_{mp}$, where $e_{v \to v'}$ is represented through an embedding function R. The graph embedding feature $V_t^k(v)$ is obtained by aggregating the contributions of all nodes in the neighborhood with an element-wise maximum (element-wise max).
The graph embedding features of all layers are concatenated to obtain the graph embedding feature $V_t(v)$ of each node:
$V_t(v) = [V_t^0(v), \dots, V_t^K(v)]$
The knowledge graph of the question-answering system is then generated from the initialization knowledge graph and the computed graph embedding features of each node.
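The layer-wise propagation and the final concatenation can be sketched as follows. This is a sketch under stated assumptions, not the update rule of the application: each neighbor message is assumed to be a tanh of the parameter matrix applied to the concatenation of the neighbor's previous-layer feature and the edge embedding, the embedding function R is modeled as a dictionary lookup, and a node without neighbors is assumed to receive a zero vector. The names `propagate`, `edge_emb`, and the example graph are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
# node -> list of (neighbor node, label of the edge to that neighbor)
Graph = Dict[str, List[Tuple[str, str]]]


def propagate(graph: Graph, v0: Dict[str, Vector],
              edge_emb: Dict[str, Vector], w_mp: List[Vector],
              num_layers: int) -> Dict[str, Vector]:
    """Return the concatenation of the layer 0..K features of every node.

    Assumptions: a neighbor message is tanh(W_mp [V^{k-1}(v'), R(e)]), with
    R a lookup in edge_emb; W_mp maps back to the layer-0 dimension so that
    layers can be chained.
    """
    out_dim = len(w_mp)
    layers = [v0]
    for _ in range(num_layers):
        prev = layers[-1]
        cur: Dict[str, Vector] = {}
        for node, neighbors in graph.items():
            messages = []
            for nb, label in neighbors:
                inp = prev[nb] + edge_emb[label]  # concatenation
                messages.append([math.tanh(sum(w * x for w, x in zip(row, inp)))
                                 for row in w_mp])
            # element-wise max over all neighbor messages
            cur[node] = ([max(col) for col in zip(*messages)]
                         if messages else [0.0] * out_dim)
        layers.append(cur)
    # concatenate the features of all layers for each node
    return {n: [x for layer in layers for x in layer[n]] for n in graph}


# Hypothetical two-node graph with 2-dimensional layer-0 features
graph = {"hotel": [("beijing", "located_in")], "beijing": [("hotel", "has")]}
v0 = {"hotel": [1.0, 0.0], "beijing": [0.0, 1.0]}
edge_emb = {"located_in": [0.5], "has": [-0.5]}
w_mp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
emb = propagate(graph, v0, edge_emb, w_mp, num_layers=2)
```

With K = 2 and 2-dimensional features, each node ends up with a 6-dimensional graph embedding feature: its own layer-0 feature followed by two layers of aggregated neighbor information.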
The embodiment presented in fig. 4 has the following beneficial effects: the graph embedding features of each node in the initialization knowledge graph are determined using both the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system. Because structured and unstructured features are considered together when generating the graph embedding features of the knowledge graph nodes, the real dialogue context of the question-answering system can be captured effectively, so this embodiment of the application can better assist in simulating and generating the dialogue sentences of a real speaker during question answering.
Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for use in implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 506 into a Random Access Memory (RAM) 503. The RAM 503 also stores various programs and data necessary for the operation of the system 500. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An Input/Output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: a storage section 506 including a hard disk and the like; and a communication section 507 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 507 performs communication processing via a network such as the Internet. A drive 508 is also connected to the I/O interface 505 as necessary. A removable medium 509 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 508 as necessary, so that a computer program read out therefrom is installed into the storage section 506 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 507 and/or installed from the removable medium 509. The computer program performs the above-described functions defined in the method of the present application when executed by the Central Processing Unit (CPU) 501. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquiring an input statement; determining nodes of the initialization knowledge graph corresponding to the input sentences based on the initialization knowledge graph obtained by initialization in advance; calculating the structural features and the non-structural features of each node; and determining graph embedding characteristics of all nodes in the initialization knowledge graph based on the determined structural characteristics and the non-structural characteristics by using a confidence coefficient propagation mechanism, and generating the knowledge graph of the question-answering system.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (12)

1. A knowledge graph generation method for a question-answering system comprises the following steps:
generating an initialization knowledge graph based on the conversation sample library;
acquiring an input statement;
determining each node in the initialization knowledge graph corresponding to the input sentence, determining structural features of each node, and determining unstructured features of each node;
and determining graph embedding characteristics of each node in the initialization knowledge graph based on the determined structural characteristics and the non-structural characteristics by using a confidence coefficient propagation mechanism, and generating the knowledge graph of the question-answering system.
2. The method of claim 1, the generating an initialization knowledge graph based on a conversation sample library, comprising:
generating nodes and edges based on a structured knowledge base in the conversation sample base;
updating the nodes and edges based on unstructured dialog statements in the dialog sample library;
wherein the dialogue sample library comprises a structured knowledge base and unstructured dialogue sentences, and the nodes and edges form an initialization knowledge graph.
3. The method of claim 2, the generating nodes and edges based on a structured knowledge base in the conversation sample library, comprising:
generating nodes in the initialization knowledge graph, the nodes comprising: project nodes, attribute nodes and entity nodes;
generating edges in the initialization knowledge graph, wherein the edges represent relationships between different nodes.
4. The method of claim 2, the updating the nodes and edges based on unstructured conversational statements in the conversational sample library, comprising:
and if the unstructured sentences in the conversation sample library contain nodes which are not in the initialization knowledge graph, adding new nodes, and updating edges according to node relations.
5. The method of claim 1, the determining structural features of the each node, comprising:
determining a one-hot vector of the number of occurrences of each node, wherein the one-hot vector of the number of occurrences represents the number of times each node occurs in all sentences stored in the question-answering system;
determining a one-hot vector for each node type, wherein the one-hot vector for the type represents the type of each node;
determining a one-hot vector of the occurrence of each node, wherein the one-hot vector of occurrence indicates whether each node occurs in the input statement;
and concatenating the one-hot vectors of the number of occurrences, the type, and the occurrence to determine the structural features of each node.
6. The method of claim 1, the determining unstructured features of the each node, comprising:
generating an entity set based on the input statement;
determining sentence embedding characteristics of the input sentence;
determining unstructured features of the each node based on the entity sets and the sentence embedding features.
7. The method of claim 6, the generating a set of entities based on the input sentence, comprising:
initializing an entity set to be an empty set;
if the input statement comprises the entity node in the initialization knowledge graph, determining an entity node set as the entity set;
and if the input statement does not contain the entity node in the initialization knowledge graph, using an entity set corresponding to a previous statement as the entity set, wherein the previous statement refers to the previous statement stored in the question-answering system and aiming at the input statement.
8. The method of claim 6, the determining sentence embedding characteristics of the input sentence, comprising:
taking the input statement as an input of a recurrent neural network;
and outputting the value of the last hidden layer of the recurrent neural network as the sentence embedding feature of the input sentence.
9. The method of claim 1, the determining graph embedding features for each node in the initialization knowledge-graph based on the determined structured features and unstructured features using a belief propagation mechanism to generate a knowledge-graph of a question-answering system, comprising:
layering nodes in the initialization knowledge graph;
determining the concatenation result of the determined structural features and unstructured features as graph embedding features of nodes in the initialization knowledge graph of the 0 th layer;
updating the graph embedding characteristics of each layer of nodes in the initialized knowledge graph by using a confidence coefficient propagation mechanism, wherein the confidence coefficient propagation mechanism updates the knowledge graph by using a method of mutually transmitting information between the nodes;
and connecting the graph embedding characteristics of each layer of nodes in series to generate a knowledge graph of the question answering system.
10. The method of claim 1, further comprising:
and sending the knowledge graph of the question-answering system to a target display device, and controlling the target display device to display the knowledge graph.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-10.
12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-10.
CN201911237107.7A 2019-12-05 2019-12-05 Knowledge graph generation method for dialogue system Active CN111090740B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911237107.7A CN111090740B (en) 2019-12-05 2019-12-05 Knowledge graph generation method for dialogue system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911237107.7A CN111090740B (en) 2019-12-05 2019-12-05 Knowledge graph generation method for dialogue system

Publications (2)

Publication Number Publication Date
CN111090740A true CN111090740A (en) 2020-05-01
CN111090740B CN111090740B (en) 2023-09-29

Family

ID=70394739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911237107.7A Active CN111090740B (en) 2019-12-05 2019-12-05 Knowledge graph generation method for dialogue system

Country Status (1)

Country Link
CN (1) CN111090740B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180144252A1 (en) * 2016-11-23 2018-05-24 Fujitsu Limited Method and apparatus for completing a knowledge graph
CN108197290A (en) * 2018-01-19 2018-06-22 桂林电子科技大学 A kind of knowledge mapping expression learning method for merging entity and relationship description
CN109344262A (en) * 2018-10-31 2019-02-15 百度在线网络技术(北京)有限公司 Architectonic method for building up, device and storage medium
CN109871542A (en) * 2019-03-08 2019-06-11 广东工业大学 A kind of text knowledge's extracting method, device, equipment and storage medium
CN110413760A (en) * 2019-07-31 2019-11-05 北京百度网讯科技有限公司 Interactive method, device, storage medium and computer program product

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163087A (en) * 2020-11-10 2021-01-01 山东比特智能科技股份有限公司 Method, system and device for solving intention conflict in conversation system
CN113032527A (en) * 2021-03-25 2021-06-25 北京轮子科技有限公司 Information generation method and device for question-answering system and terminal equipment
CN113032527B (en) * 2021-03-25 2023-08-22 北京轮子科技有限公司 Information generation method and device for question-answering system and terminal equipment

Also Published As

Publication number Publication date
CN111090740B (en) 2023-09-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant