CN113157891B - Knowledge graph path ordering method, system, equipment and storage medium

Knowledge graph path ordering method, system, equipment and storage medium

Info

Publication number
CN113157891B
CN113157891B · CN202110493759.8A · CN202110493759A
Authority
CN
China
Prior art keywords
attention
path
cross
data stream
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110493759.8A
Other languages
Chinese (zh)
Other versions
CN113157891A (en)
Inventor
李钊
赵凯
邓晓雨
刘岩
宋慧驹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd filed Critical Taikang Insurance Group Co Ltd
Priority to CN202110493759.8A priority Critical patent/CN113157891B/en
Publication of CN113157891A publication Critical patent/CN113157891A/en
Application granted granted Critical
Publication of CN113157891B publication Critical patent/CN113157891B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a knowledge graph path ordering method, system, equipment and storage medium, wherein the method comprises the following steps: acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream; performing a path query based on the entity set to obtain a candidate path set, and taking each candidate path in the candidate path set as a second data stream; inputting the first data stream together with each second data stream in turn into a dual-stream deep learning model, and obtaining the similarity between the first data stream and each second data stream output by the model; and ranking the candidate paths according to the similarity values output by the dual-stream deep learning model for the respective candidate paths. The method resolves the uncertainty of paths in the knowledge graph and improves the accuracy of knowledge graph path ordering.

Description

Knowledge graph path ordering method, system, equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, a system, an apparatus, and a storage medium for ordering a knowledge graph path.
Background
Open-domain text question answering (OpenQA) does not supply a single paragraph or document together with the question; instead, the answer must be found in a collection of documents or across the entire web. Because the knowledge involved in open-domain question-answer dialogue is extremely broad and can in theory be expanded without limit, traditional FAQ techniques are not suitable for open-domain scenarios.
When a knowledge graph is used as the knowledge source for question answering, the relevant techniques include syntactic analysis, entity recognition, graph database querying, graph path ordering and the like. The key to knowledge-graph question answering is to correctly identify the entities and semantics of the question and to find the correct path in the graph that determines the answer. A common existing approach is to identify the entity through named entity recognition, determine the attribute through keywords or sentence templates, and look up the answer using the entity-attribute relation. However, because of the diversity of natural language, the identified "entity-attribute" relation pairs may be ambiguous, or multiple relation pairs may be identified, which makes the confirmation of the graph path uncertain.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a knowledge graph path ordering method, system, equipment and storage medium that resolve the uncertainty of paths in a knowledge graph.
An embodiment of the invention provides a knowledge graph path ordering method, comprising the following steps:
acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream;
performing a path query based on the entity set to obtain a candidate path set, and taking each candidate path in the candidate path set as a second data stream;
inputting the first data stream together with each second data stream in turn into a dual-stream deep learning model, and obtaining the similarity between the first data stream and each second data stream output by the dual-stream deep learning model; and
ranking the candidate paths according to the similarity values output by the dual-stream deep learning model for the respective candidate paths.
In some embodiments, the dual-stream deep learning model includes a text self-attention module, a path self-attention module, a cross-attention module, and an output layer;
after the first data stream and each second data stream are input into the dual-stream deep learning model in turn, the first data stream is fed into the text self-attention module and the second data stream is fed into the path self-attention module; the first text feature output by the text self-attention module and the first path feature output by the path self-attention module are fed together into the cross-attention module; the output features of the cross-attention module are fed into the output layer; and the output layer outputs the similarity between the first data stream and the second data stream.
In some embodiments, the text self-attention module comprises a plurality of text self-attention encoders in series, each of the text self-attention encoders comprising a text self-attention layer and a text forward propagation layer;
the path self-attention module comprises a plurality of path self-attention encoders which are sequentially connected in series, and each path self-attention encoder comprises a path self-attention layer and a path forward propagation layer.
In some embodiments, the cross-attention module includes a plurality of cross-attention encoders connected in series, each cross-attention encoder including a first cross-attention unit and a second cross-attention unit; the first cross-attention unit receives the text features output by the previous layer and outputs text features provided to the next layer, the second cross-attention unit receives the path features output by the previous layer and outputs path features provided to the next layer, and cross-attention computation is performed between the first cross-attention unit and the second cross-attention unit;
the first dimension of the features output by the first cross-attention unit is the cross feature, the features other than the first dimension are the second text features, and the features output by the second cross-attention unit are the second path features.
In some embodiments, the first cross-attention unit comprises a first cross-attention layer, a first self-attention layer, and a first forward propagation layer in series, and the second cross-attention unit comprises a second cross-attention layer, a second self-attention layer, and a second forward propagation layer in series, with cross-attention calculations between the first cross-attention layer and the second cross-attention layer.
In some embodiments, feeding the output features of the cross-attention module into the output layer and having the output layer output the similarity between the first data stream and the second data stream comprises the following step:
combining the second text feature, the cross feature and the second path feature output by the cross-attention module into a total feature, feeding the total feature into the output layer, and having the output layer output the similarity between the first data stream and the second data stream.
In some embodiments, feeding the output features of the cross-attention module into the output layer and having the output layer output the similarity between the first data stream and the second data stream comprises the following steps:
feeding the second text feature and the second path feature output by the cross-attention module into the output layer, the output layer computing the similarity between the second text feature and the second path feature as the similarity between the first data stream and the second data stream; or
feeding the cross feature output by the cross-attention module into the output layer, the output layer classifying based on the cross feature to obtain a classification category, and taking the probability of the classification category as the similarity between the first data stream and the second data stream.
The embodiment of the invention also provides a knowledge graph path ordering system for implementing the above knowledge graph path ordering method, the system comprising:
the entity set acquisition module, used for acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream;
the path set acquisition module, used for performing a path query based on the entity set to obtain a candidate path set, each candidate path in the candidate path set being taken as a second data stream;
the similarity calculation module, used for inputting the first data stream together with each second data stream in turn into a dual-stream deep learning model to obtain the similarity between the first data stream and each second data stream output by the model;
and the candidate path ordering module, used for ranking the candidate paths according to the similarity values output by the dual-stream deep learning model for the respective candidate paths.
The embodiment of the invention also provides a knowledge graph path ordering device, which comprises:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the knowledge graph path ordering method via execution of the executable instructions.
The embodiment of the invention also provides a computer readable storage medium for storing a program which, when executed by a processor, implements the steps of the knowledge graph path ordering method.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The knowledge graph path ordering method, system, equipment and storage medium of the invention have the following beneficial effects:
The entity set and the candidate path set are first obtained separately; they are then fed into the dual-stream deep learning model as the first and second data streams, and the candidate paths are ranked based on the similarities output by the model. This resolves the uncertainty of paths in the knowledge graph, improves the accuracy of knowledge graph path ordering, and thereby improves the user experience of various upper-layer applications (such as question-answering robots). The scheme of the invention can be applied to question-answering scenarios, and also to other scenarios involving knowledge graph path ordering, such as artificial intelligence and machine learning applications.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings.
FIG. 1 is a flow chart of a knowledge-graph path ordering method, according to an embodiment of the invention;
FIG. 2 is a flow chart of an open domain graph question-answering implementation based on the knowledge graph path ordering method, according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a dual stream deep learning model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a self-attention module according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a self-attention encoder according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the data flow of a self-attention encoder of an embodiment of the present invention;
FIG. 7 is a schematic diagram of a cross-attention module according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of the structure of a cross-attention encoder in accordance with an embodiment of the present invention;
FIG. 9 is a schematic diagram of the data flow of a cross-attention encoder of an embodiment of the present invention;
FIG. 10 is a schematic diagram of a knowledge-graph path ordering system, according to an embodiment of the invention;
FIG. 11 is a schematic diagram of a knowledge-graph path ordering apparatus according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only and not necessarily all steps are included. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
As shown in fig. 1, an embodiment of the present invention provides a knowledge graph path ordering method, comprising the following steps:
S100: acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream.
Here, the type of input text differs across application scenarios. For example, in a question-answering scenario the input text may be a question-answering text, while in an information search scenario it may be a search keyword text, and so on.
S200: performing a path query based on the entity set to obtain a candidate path set, and taking each candidate path in the candidate path set as a second data stream.
S300: inputting the first data stream together with each second data stream in turn into a dual-stream deep learning model, and obtaining the similarity between the first data stream and each second data stream output by the model.
S400: ranking the candidate paths according to the similarity values output by the dual-stream deep learning model for the respective candidate paths.
In this knowledge graph path ordering method, the entity set and the candidate path set are first obtained through steps S100 and S200; the entity set and the candidate paths are then fed into the dual-stream deep learning model as the first and second data streams in step S300; and the candidate paths are ranked in step S400 based on the similarities output by the model. This resolves the uncertainty of paths in the knowledge graph and improves the accuracy of knowledge graph path ordering.
Fig. 2 is a schematic flow chart of the open-domain graph question answering in this embodiment. Before open-domain graph question answering can be provided, the preparatory work involves building a knowledge graph of the relevant domain, typically stored in the form of triples &lt;subject, predicate, object&gt;. The constructed knowledge graph is imported into a graph database (Graph DB) for storage so that subsequent path queries can be made against it.
When the user asks a question, the input query string Q is obtained through a front-end page (web page, application, APP, etc.) or an API interface; this query string Q corresponds to the input text described above. Then, through step S100, the query string Q is subjected to syntactic analysis, entity analysis and recognition, and entity linking to obtain an entity set E.
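As an illustration only, the sketch below shows one way step S100 might be wired together; the helpers `ner_model` and `link_entity` are hypothetical stand-ins for whatever syntactic-analysis, entity-recognition and entity-linking components are actually used (the patent does not name specific tools).

```python
def get_entity_set(query: str, ner_model, link_entity) -> list:
    """Step S100 sketch: extract mentions from the query and link them to graph entities."""
    mentions = ner_model(query)          # hypothetical output, e.g. [("泰康", "ORG"), ...]
    entity_set = []
    for mention, _label in mentions:
        node_id = link_entity(mention)   # map the surface form to a canonical graph node id
        if node_id is not None:
            entity_set.append(node_id)
    return entity_set                    # this set E is used as the first data stream
```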
Then, through step S200, an m-degree path query is performed in the graph database based on the entity set E, the top n results of the query are taken, and these form the candidate path set {Pn}.
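A minimal sketch of step S200 follows, assuming the triples were imported into Neo4j and queried with Cypher; the patent only says "graph database", so the driver, node properties, relationship direction and LIMIT value are all illustrative assumptions.

```python
from neo4j import GraphDatabase

def query_candidate_paths(entity_ids, m=2, top_n=5,
                          uri="bolt://localhost:7687", auth=("neo4j", "password")):
    """Step S200 sketch: m-degree path query from the linked entities, keep the top n paths."""
    cypher = (
        "MATCH p = (e)-[*1..%d]->(o) WHERE e.id IN $ids "
        "RETURN [n IN nodes(p) | n.name] AS nodes, "
        "       [r IN relationships(p) | type(r)] AS rels "
        "LIMIT $top_n" % m
    )
    driver = GraphDatabase.driver(uri, auth=auth)
    with driver.session() as session:
        paths = []
        for rec in session.run(cypher, ids=entity_ids, top_n=top_n):
            tokens = []
            for node, rel in zip(rec["nodes"], rec["rels"] + [None]):
                tokens.append(node)
                if rel is not None:
                    tokens.append(rel)
            # each candidate path P_n is serialized as "entity relation entity ..." text
            paths.append(" ".join(tokens))
    driver.close()
    return paths
```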
Then, through step S300, the query string Q and each path Pn are fed in one-to-one pairs into the two inputs of the dual-stream deep learning model. Finally, through step S400, the paths are ranked according to the model output to obtain the ranked path set {Prank}, which is returned to the user through the front-end page.
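Putting steps S300 and S400 together, the ranking loop could look like the sketch below; `model`, `encode_text` and `encode_path` are assumed wrappers around the dual-stream model and its two encoding modules, not names taken from the patent.

```python
def rank_paths(query, candidate_paths, model, encode_text, encode_path):
    """Steps S300/S400 sketch: score each (Q, P_n) pair and sort by similarity."""
    scored = []
    q_feat = encode_text(query)              # first data stream (encoded query)
    for path in candidate_paths:
        p_feat = encode_path(path)           # second data stream (encoded candidate path)
        similarity = model(q_feat, p_feat)   # scalar similarity from the output layer
        scored.append((path, float(similarity)))
    # {P_rank}: candidate paths ordered by predicted similarity, best first
    return sorted(scored, key=lambda item: item[1], reverse=True)
```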
As shown in fig. 3, in this embodiment the dual-stream deep learning model includes a text self-attention module Z2 (self-attention), a path self-attention module Z4 (self-attention), a cross-attention module M2 (cross-attention), and an output layer.
After the first data stream and each second data stream are input into the dual-stream deep learning model in turn, the first data stream, i.e. the text input Q, is encoded by a text encoding module Z1 and then fed into the text self-attention module Z2, while the second data stream is encoded by a path encoding module Z3 and then fed into the path self-attention module Z4. The first text feature output by the text self-attention module Z2 and the first path feature output by the path self-attention module Z4 are fed together into the cross-attention module M2; the output features of the cross-attention module M2 are fed into the output layer, and the output layer outputs the similarity between the first data stream and the second data stream. As shown in fig. 3, the output features of the cross-attention module include a second text feature V1, a cross feature V2 and a second path feature V3.
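For orientation, here is a minimal PyTorch sketch of the overall architecture in fig. 3. The hidden size, layer count, pooling of V1/V3 and the encoder classes SelfAttentionEncoder and CrossAttentionEncoder (sketched after the module descriptions below) are assumptions, not details specified by the patent.

```python
import torch
import torch.nn as nn

class DualStreamRanker(nn.Module):
    """Sketch of Z2/Z4/M2 plus the output layer; embeddings from Z1/Z3 are assumed precomputed."""
    def __init__(self, d_model=256, n_layers=2):
        super().__init__()
        self.text_self_attn = nn.ModuleList(SelfAttentionEncoder(d_model) for _ in range(n_layers))  # Z2
        self.path_self_attn = nn.ModuleList(SelfAttentionEncoder(d_model) for _ in range(n_layers))  # Z4
        self.cross_attn = nn.ModuleList(CrossAttentionEncoder(d_model) for _ in range(n_layers))     # M2
        self.output = nn.Linear(3 * d_model, 2)  # output layer over the combined V1/V2/V3 features

    def forward(self, text_emb, path_emb):        # [batch, seq, d_model] tensors from Z1 / Z3
        for enc in self.text_self_attn:
            text_emb = enc(text_emb)              # last-layer output = first text feature
        for enc in self.path_self_attn:
            path_emb = enc(path_emb)              # last-layer output = first path feature
        for enc in self.cross_attn:
            text_emb, path_emb = enc(text_emb, path_emb)
        cross_feat = text_emb[:, 0]               # first position of the text stream = cross feature V2
        text_feat = text_emb[:, 1:].mean(dim=1)   # remaining positions = second text feature V1 (pooled)
        path_feat = path_emb.mean(dim=1)          # second path feature V3 (pooled)
        logits = self.output(torch.cat([text_feat, cross_feat, path_feat], dim=-1))
        return torch.softmax(logits, dim=-1)[:, 1]  # probability of class "1" used as the similarity
```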
As shown in fig. 4 and 5, in this embodiment the text self-attention module Z2 and the path self-attention module Z4 share the same structure: a self-attention module M1 consists of a plurality of self-attention encoders M11 connected in series, and each self-attention encoder M11 comprises a self-attention layer and a forward propagation layer. Fig. 6 shows the data flow inside a self-attention encoder M11, which contains multi-head attention, addition & normalization (Add & Norm) and feed-forward sublayers. In other words, the text self-attention module comprises a plurality of text self-attention encoders connected in series, each comprising a text self-attention layer and a text forward propagation layer; the path self-attention module comprises a plurality of path self-attention encoders connected in series, each comprising a path self-attention layer and a path forward propagation layer. Each self-attention encoder takes the feature vector of the text/path as input and outputs a new feature vector after the self-attention operation. Specifically:
The first-layer self-attention encoder receives the text/path feature vector output by the text/path encoding module; the self-attention encoders of the remaining layers receive the text/path feature vector output by the previous-layer encoder. The output of the last-layer self-attention encoder is the output of the whole text/path self-attention module; that is, the output of the last-layer text self-attention encoder is the first text feature, and the output of the last-layer path self-attention encoder is the first path feature.
In this embodiment, as shown in fig. 7, the cross-attention module M2 includes a plurality of cross-attention encoders connected in series. Specifically, each cross-attention encoder includes a first cross-attention unit and a second cross-attention unit; the first cross-attention unit receives the text features output by the previous layer and outputs text features provided to the next layer, the second cross-attention unit receives the path features output by the previous layer and outputs path features provided to the next layer, and cross-attention computation is performed between the first cross-attention unit and the second cross-attention unit.
The first dimension of the features output by the first cross-attention unit is the cross feature V2, the features other than the first dimension are the second text features V1, and the features output by the second cross-attention unit are the second path features V3.
Fig. 8 is a schematic structural diagram of each cross-attention encoder M21 of this embodiment. Each cross-attention encoder M21 has a "dual-stream" architecture: each "stream" consists of a cross-attention layer, a self-attention layer and a forward propagation layer connected in series, and the two streams perform the cross-attention operation through their cross-attention layers. Each cross-attention encoder simultaneously receives the feature vectors of the text stream and the path stream as its "dual-stream" input and outputs new feature vectors for both streams. Fig. 9 shows the data flow inside the cross-attention encoder M21 of this embodiment. In this embodiment, the first cross-attention unit comprises a first cross-attention layer, a first self-attention layer and a first forward propagation layer connected in series, and the second cross-attention unit comprises a second cross-attention layer, a second self-attention layer and a second forward propagation layer connected in series, with the cross-attention computation performed between the first cross-attention layer and the second cross-attention layer. Specifically:
The first-layer cross-attention encoder simultaneously receives, as its "dual-stream" input, the first text feature output by the last-layer text self-attention encoder and the first path feature output by the last-layer path self-attention encoder. The cross-attention encoders of the remaining layers receive the text-stream and path-stream feature vectors output by the previous-layer cross-attention encoder. The feature vectors output by the last-layer cross-attention encoder are the output of the whole cross-attention module.
In one implementation of this embodiment, feeding the output features of the cross-attention module into the output layer and having the output layer output the similarity between the first data stream and the second data stream comprises the following step:
combining the second text feature, the cross feature and the second path feature output by the cross-attention module into a total feature, feeding the total feature into the output layer, and having the output layer output the similarity between the first data stream and the second data stream.
In another implementation of this embodiment, feeding the output features of the cross-attention module into the output layer and having the output layer output the similarity between the first data stream and the second data stream comprises the following steps:
feeding the second text feature and the second path feature output by the cross-attention module into the output layer, the output layer computing the similarity between the second text feature and the second path feature as the similarity between the first data stream and the second data stream; or
feeding the cross feature output by the cross-attention module into the output layer, the output layer classifying based on the cross feature to obtain a classification category, and taking the probability of the classification category as the similarity between the first data stream and the second data stream.
In this embodiment, either of the two output-layer processing schemes described above can be chosen according to the situation. The output layer may include a fully connected layer and a softmax layer. For example, the choice may depend on the type of label, as follows:
a. For discrete labels: the total feature vector V is fed into the fully connected layer and the softmax layer to output the predicted label ŷ; during training, cross entropy is used as the loss function and the model loss is computed against the true label y.
b. For continuous labels: the similarity between the text feature V1 and the path feature V3 can be taken as the predicted value ŷ, or the cross feature V2 can be passed through a softmax layer to turn the task into a binary classification problem, with the probability of one class taken as the predicted value ŷ; during training, the model loss is computed with a squared loss function (Mean Square Error, MSE) against the true value y.
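The two label-handling options could be sketched as follows; the function names, pooling and layer sizes are assumptions, while the loss choices (cross entropy for discrete labels, MSE for continuous labels) follow the text above.

```python
import torch
import torch.nn.functional as F

def output_discrete(v_total, fc, label=None):
    """Option a: total feature V -> fully connected + softmax -> predicted label; cross-entropy loss."""
    logits = fc(v_total)                                        # fc: nn.Linear(d_total, num_classes)
    probs = torch.softmax(logits, dim=-1)
    loss = F.cross_entropy(logits, label) if label is not None else None
    return probs, loss

def output_continuous(v1, v2, v3, fc2, y=None, use_cross_feature=False):
    """Option b: similarity of V1 and V3, or binary softmax over the cross feature V2; MSE loss."""
    if use_cross_feature:
        y_hat = torch.softmax(fc2(v2), dim=-1)[:, 1]            # probability of one class as prediction
    else:
        y_hat = F.cosine_similarity(v1, v3, dim=-1)             # similarity of V1 and V3 as prediction
    loss = F.mse_loss(y_hat, y) if y is not None else None      # squared loss (MSE) against the true value y
    return y_hat, loss
```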
The following describes the implementation of the knowledge graph path ordering method of the present invention in detail through a concrete example, which includes the following steps:
Preparation in advance: an open-domain knowledge graph.
In the first step, a query string Q is obtained from the front end, several examples being listed in Table 1.
Second, corresponding to step S100: the query string Q undergoes syntactic analysis, entity analysis and recognition, and entity linking to obtain the entity set {E}.
Third, corresponding to step S200: a 2-degree path query is performed on the entity set {E} in the graph database, the top 5 query results are taken, and they form the candidate path set {Pn} (see Table 1 below).
Table 1: Query strings and candidate path sets
Fourth, corresponding to step S300: the query string Q and each element Pn of the candidate path set {Pn} are combined one-to-one and fed into the dual-stream depth ranking model. Using a "0/1" classification prediction, the probability of the class-"1" label after softmax is output as the similarity p between path Pn and the query string Q (see Table 2 below).
Table 2: Dual-stream depth model data output
Fifth, corresponding to step S400: the prediction results for each candidate path in {Pn} are collected in turn, and the paths in {Pn} are sorted according to the predicted values to obtain the ranked path set {Prank}.
Sixth, {Prank} is returned to the front-end interface or API interface (see Table 3 below).
Table 3: Path ordering results
As shown in fig. 10, the embodiment of the present invention further provides a knowledge graph path ordering system for implementing the above knowledge graph path ordering method, the system comprising:
the entity set acquisition module M100 is used for acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream;
the path set acquisition module M200 is configured to perform path query based on the entity set to obtain a candidate path set, and each candidate path in the candidate path set is used as a second data stream respectively;
the similarity calculation module M300 is used for inputting the first data stream and each second data stream into a double-stream deep learning model together in sequence to obtain the similarity of the first data stream and each second data stream output by the double-stream deep learning model;
and the candidate path sorting module M400 is used for sorting the candidate paths according to the similarity values corresponding to the candidate paths output by the double-flow deep learning model.
According to the knowledge graph path sorting system, firstly, the entity set acquisition module M100 and the path set acquisition module M200 are used for respectively acquiring the entity set and the candidate path set, then the entity set and the candidate path set are used as a first data stream and a second data stream to be input into the double-stream deep learning model through the similarity calculation module M300, and the candidate paths are sorted through the candidate path sorting module M400 based on the similarity output by the model, so that the uncertainty of the paths in the knowledge graph is solved, and the accuracy of the sorting of the knowledge graph paths is improved.
In the knowledge-graph path sorting system of the present invention, the functions of each module may be implemented by adopting the specific implementation manner of the knowledge-graph path sorting method described above, which is not described herein.
The embodiment of the invention also provides a knowledge graph path ordering device, which comprises a processor and a memory having stored therein executable instructions of the processor, wherein the processor is configured to perform the steps of the knowledge graph path ordering method via execution of the executable instructions.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "platform."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 11. The electronic device 600 shown in fig. 11 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 11, the electronic device 600 is in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different system components (including the memory unit 620 and the processing unit 610), a display unit 640, etc.
Wherein the storage unit stores program code that is executable by the processing unit 610 such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above-described knowledge-graph path ordering method section of the present specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The memory unit 620 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In the knowledge graph path ordering device, when the program in the memory is executed by the processor, the steps of the knowledge graph path ordering method are carried out, so the device achieves the technical effects of the knowledge graph path ordering method.
The embodiment of the invention also provides a computer readable storage medium for storing a program, which when being executed by a processor, realizes the steps of the knowledge graph path sorting method. In some possible embodiments, the aspects of the present invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the above description of the method of ranking knowledge-graph paths, when said program product is executed on a terminal device.
Referring to fig. 12, a program product 800 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be executed on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable storage medium may also be any readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
When the program in the computer storage medium is executed by the processor, the steps of the knowledge graph path ordering method are carried out, so the computer storage medium likewise achieves the technical effects of the knowledge graph path ordering method.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (6)

1. A knowledge graph path ordering method, characterized by comprising the following steps:
acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream;
performing a path query based on the entity set to obtain a candidate path set, and taking each candidate path in the candidate path set as a second data stream;
inputting the first data stream together with each second data stream in turn into a dual-stream deep learning model, and obtaining the similarity between the first data stream and each second data stream output by the dual-stream deep learning model;
ranking the candidate paths according to the similarity values output by the dual-stream deep learning model for the respective candidate paths;
wherein the dual-stream deep learning model comprises a text self-attention module, a path self-attention module, a cross-attention module and an output layer;
after the first data stream and each second data stream are input together into the dual-stream deep learning model in turn, the first data stream is fed into the text self-attention module, the second data stream is fed into the path self-attention module, a first text feature output by the text self-attention module and a first path feature output by the path self-attention module are fed together into the cross-attention module, the output features of the cross-attention module are fed into the output layer, and the output layer outputs the similarity between the first data stream and the second data stream;
feeding the output features of the cross-attention module into the output layer and having the output layer output the similarity between the first data stream and the second data stream comprises the following step:
combining the second text feature, the cross feature and the second path feature output by the cross-attention module into a total feature, feeding the total feature into the output layer, and having the output layer output the similarity between the first data stream and the second data stream;
the cross-attention module comprises a plurality of cross-attention encoders connected in series, each cross-attention encoder comprising a first cross-attention unit and a second cross-attention unit, the first cross-attention unit receiving the text features output by the previous layer and outputting text features provided to the next layer, the second cross-attention unit receiving the path features output by the previous layer and outputting path features provided to the next layer, and cross-attention computation being performed between the first cross-attention unit and the second cross-attention unit;
the first dimension of the features output by the first cross-attention unit is the cross feature, the features other than the first dimension are the second text features, and the features output by the second cross-attention unit are the second path features.
2. The knowledge-graph path ordering method of claim 1, wherein the text self-attention module comprises a plurality of text self-attention encoders in series, each of the text self-attention encoders comprising a text self-attention layer and a text forward propagation layer;
the path self-attention module comprises a plurality of path self-attention encoders which are sequentially connected in series, and each path self-attention encoder comprises a path self-attention layer and a path forward propagation layer.
3. The knowledge graph path ordering method according to claim 1, wherein the first cross attention unit comprises a first cross attention layer, a first self attention layer and a first forward propagation layer connected in series in turn, the second cross attention unit comprises a second cross attention layer, a second self attention layer and a second forward propagation layer connected in series in turn, and cross attention calculation is performed between the first cross attention layer and the second cross attention layer.
4. A knowledge-graph path ordering system, comprising:
the entity set acquisition module is used for acquiring an input text, obtaining an entity set of the input text, and taking the entity set as a first data stream;
the path set acquisition module is used for carrying out path query based on the entity set to obtain a candidate path set, and each candidate path in the candidate path set is respectively used as a second data stream;
the similarity calculation module is used for inputting the first data stream together with each second data stream in turn into a dual-stream deep learning model to obtain the similarity between the first data stream and each second data stream output by the dual-stream deep learning model;
the candidate path ordering module is used for ranking the candidate paths according to the similarity values output by the dual-stream deep learning model for the respective candidate paths;
wherein the dual-stream deep learning model comprises a text self-attention module, a path self-attention module, a cross-attention module and an output layer;
after the first data stream and each second data stream are input together into the dual-stream deep learning model in turn, the first data stream is fed into the text self-attention module, the second data stream is fed into the path self-attention module, a first text feature output by the text self-attention module and a first path feature output by the path self-attention module are fed together into the cross-attention module, the output features of the cross-attention module are fed into the output layer, and the output layer outputs the similarity between the first data stream and the second data stream;
feeding the output features of the cross-attention module into the output layer and having the output layer output the similarity between the first data stream and the second data stream comprises the following step:
combining the second text feature, the cross feature and the second path feature output by the cross-attention module into a total feature, feeding the total feature into the output layer, and having the output layer output the similarity between the first data stream and the second data stream;
the cross-attention module comprises a plurality of cross-attention encoders connected in series, each cross-attention encoder comprising a first cross-attention unit and a second cross-attention unit, the first cross-attention unit receiving the text features output by the previous layer and outputting text features provided to the next layer, the second cross-attention unit receiving the path features output by the previous layer and outputting path features provided to the next layer, and cross-attention computation being performed between the first cross-attention unit and the second cross-attention unit;
the first dimension of the features output by the first cross-attention unit is the cross feature, the features other than the first dimension are the second text features, and the features output by the second cross-attention unit are the second path features.
5. A knowledge-graph path ordering apparatus, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the knowledge-graph path ordering method of any one of claims 1 to 3 via execution of the executable instructions.
6. A computer-readable storage medium storing a program, characterized in that the program when executed by a processor implements the steps of the knowledge-graph path ordering method of any one of claims 1 to 3.
CN202110493759.8A 2021-05-07 2021-05-07 Knowledge graph path ordering method, system, equipment and storage medium Active CN113157891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110493759.8A CN113157891B (en) 2021-05-07 2021-05-07 Knowledge graph path ordering method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110493759.8A CN113157891B (en) 2021-05-07 2021-05-07 Knowledge graph path ordering method, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113157891A CN113157891A (en) 2021-07-23
CN113157891B true CN113157891B (en) 2023-11-17

Family

ID=76873768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110493759.8A Active CN113157891B (en) 2021-05-07 2021-05-07 Knowledge graph path ordering method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113157891B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893523A (en) * 2016-03-31 2016-08-24 华东师范大学 Method for calculating problem similarity with answer relevance ranking evaluation measurement
CN110347798A (en) * 2019-07-12 2019-10-18 之江实验室 A kind of knowledge mapping auxiliary understanding system based on spatial term technology
CN111324769A (en) * 2020-01-20 2020-06-23 腾讯科技(北京)有限公司 Training method of video information processing model, video information processing method and device
WO2020214299A1 (en) * 2019-04-17 2020-10-22 Microsoft Technology Licensing, Llc Live comments generating
CN112231350A (en) * 2020-10-13 2021-01-15 汉唐信通(北京)科技有限公司 Enterprise business opportunity mining method and device based on knowledge graph
CN112417170A (en) * 2020-11-23 2021-02-26 南京大学 Relation linking method for incomplete knowledge graph
CN112650840A (en) * 2020-12-04 2021-04-13 天津泰凡科技有限公司 Intelligent medical question-answering processing method and system based on knowledge graph reasoning
CN112687388A (en) * 2021-01-08 2021-04-20 中山依数科技有限公司 Interpretable intelligent medical auxiliary diagnosis system based on text retrieval

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893523A (en) * 2016-03-31 2016-08-24 华东师范大学 Method for calculating problem similarity with answer relevance ranking evaluation measurement
WO2020214299A1 (en) * 2019-04-17 2020-10-22 Microsoft Technology Licensing, Llc Live comments generating
CN110347798A (en) * 2019-07-12 2019-10-18 之江实验室 A kind of knowledge mapping auxiliary understanding system based on spatial term technology
WO2020233261A1 (en) * 2019-07-12 2020-11-26 之江实验室 Natural language generation-based knowledge graph understanding assistance system
CN111324769A (en) * 2020-01-20 2020-06-23 腾讯科技(北京)有限公司 Training method of video information processing model, video information processing method and device
CN112231350A (en) * 2020-10-13 2021-01-15 汉唐信通(北京)科技有限公司 Enterprise business opportunity mining method and device based on knowledge graph
CN112417170A (en) * 2020-11-23 2021-02-26 南京大学 Relation linking method for incomplete knowledge graph
CN112650840A (en) * 2020-12-04 2021-04-13 天津泰凡科技有限公司 Intelligent medical question-answering processing method and system based on knowledge graph reasoning
CN112687388A (en) * 2021-01-08 2021-04-20 中山依数科技有限公司 Interpretable intelligent medical auxiliary diagnosis system based on text retrieval

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A knowledge subgraph fusion method based on short text similarity computation (一种基于短文本相似度计算的知识子图融合方法); Zheng Zhiyun; Wu Jianping; Li Dun; Liu Yun; Mi Gaoyang; Journal of Chinese Computer Systems (小型微型计算机系统), No. 01; pp. 8-13 *

Also Published As

Publication number Publication date
CN113157891A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN109885672B (en) Question-answering type intelligent retrieval system and method for online education
WO2021082953A1 (en) Machine reading understanding method and apparatus, storage medium, and device
CN112633419B (en) Small sample learning method and device, electronic equipment and storage medium
CN112464641A (en) BERT-based machine reading understanding method, device, equipment and storage medium
CN110647614A (en) Intelligent question and answer method, device, medium and electronic equipment
CN111078837B (en) Intelligent question-answering information processing method, electronic equipment and computer readable storage medium
CN110598078B (en) Data retrieval method and device, computer-readable storage medium and electronic device
EP3913521A1 (en) Method and apparatus for creating dialogue, electronic device and storage medium
CN112287069B (en) Information retrieval method and device based on voice semantics and computer equipment
CN111625634A (en) Word slot recognition method and device, computer-readable storage medium and electronic device
CN111666416A (en) Method and apparatus for generating semantic matching model
US11461613B2 (en) Method and apparatus for multi-document question answering
CN109145083B (en) Candidate answer selecting method based on deep learning
JP2022169743A (en) Information extraction method and device, electronic equipment, and storage medium
US20230008897A1 (en) Information search method and device, electronic device, and storage medium
CN111274822A (en) Semantic matching method, device, equipment and storage medium
CN112949758A (en) Response model training method, response method, device, equipment and storage medium
JP2022091122A (en) Generalization processing method, apparatus, device, computer storage medium, and program
CN117057173B (en) Bionic design method and system supporting divergent thinking and electronic equipment
CN111125550A (en) Interest point classification method, device, equipment and storage medium
CN117454884A (en) Method, system, electronic device and storage medium for correcting historical character information
CN116402166B (en) Training method and device of prediction model, electronic equipment and storage medium
CN113157891B (en) Knowledge graph path ordering method, system, equipment and storage medium
CN117009516A (en) Converter station fault strategy model training method, pushing method and device
CN114742062B (en) Text keyword extraction processing method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant