CN113590774A - Event query method, device and storage medium - Google Patents

Info

Publication number
CN113590774A
Authority
CN
China
Prior art keywords
event
node
nodes
information
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110691962.6A
Other languages
Chinese (zh)
Other versions
CN113590774B (en
Inventor
黄佳艳
陈玉光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110691962.6A
Publication of CN113590774A
Application granted
Publication of CN113590774B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides an event query method, an event query device, and a storage medium, and relates to the field of knowledge graphs. The specific implementation scheme is as follows: acquiring a keyword; matching the keyword with each event node in an event causal relationship graph to obtain a first event node matched with the keyword; determining an inference path according to the first event node and the associated nodes having a causal relationship with the first event node in the event causal relationship graph; screening the event nodes in the inference path according to mutual information between each event node contained in the inference path and target information, wherein the target information includes at least one of the keyword and the first event node; and determining the event queried by the keyword according to the event nodes retained in the inference path. The method and the device improve the efficiency of event query.

Description

Event query method, device and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, in particular to the field of knowledge graph technologies, and more specifically to an event query method, apparatus, and storage medium.
Background
In recent years, more and more work has focused on reasoning about various aspects of events, such as event causality inference and script event prediction. Event causality inference derives new knowledge or conclusions from existing information, such that the derived knowledge and conclusions stand in a semantic causal relationship with that information.
In the related art, topic deviation in the inference result is generally limited by controlling the number of inference steps. This does not fundamentally solve the problem, because whether the topics of two event nodes are consistent is not necessarily related to the number of steps separating them on the graph: event nodes that are close together may still have inconsistent topics.
Disclosure of Invention
The disclosure provides a method, a device and a storage medium for event query.
According to a first aspect of the present disclosure, there is provided an event query method, including:
acquiring a keyword;
matching the keyword with each event node in an event causal relationship graph to obtain a first event node matched with the keyword;
determining an inference path according to the first event node and the associated nodes of the first event node having causal relationship in the event causal relationship graph;
screening the event nodes in the inference path according to mutual information between the event nodes and target information contained in the inference path; wherein the target information includes at least one of the keyword and the first event node;
and determining the event queried by the keyword according to the event nodes retained in the inference path.
According to a second aspect of the present disclosure, there is provided an event query apparatus, including:
an acquisition module, configured to acquire a keyword;
a matching module, configured to match the keyword with each event node in an event causal relationship graph to obtain a first event node matched with the keyword;
an inference module, configured to determine an inference path according to the first event node and the associated nodes having a causal relationship with the first event node in the event causal relationship graph;
a screening module, configured to screen the event nodes in the inference path according to mutual information between each event node contained in the inference path and target information, wherein the target information includes at least one of the keyword and the first event node;
and a query module, configured to determine the event queried by the keyword according to the event nodes retained in the inference path.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the event query method of the first aspect of the disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the event query method of the first aspect of the present disclosure.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the event query method of the first aspect of the present disclosure.
According to the event query method, the event query device, and the storage medium provided by the embodiments of the present disclosure, the inference path can be determined according to the first event node matched with the keyword and the associated nodes having a causal relationship with it in the event causal relationship graph, and the event nodes in the inference path are screened according to the mutual information between each event node contained in the inference path and the keyword or the first event node, so as to determine the queried event. After screening, the mutual information between each event node retained in the inference path and the keyword or the first event node meets the set requirement, so that semantic consistency is maintained, topic deviation is reduced, and the efficiency of event query is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of an event query method according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram of another event query method according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of an information completion method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an event cause and effect relationship map according to an embodiment of the present disclosure;
FIG. 5 is a block diagram of an event query device according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of another event query device according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device for implementing an event query method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Events are an important kind of knowledge, and in recent years more and more work has focused on extracting structured event knowledge from open-domain or domain-specific text. Beyond event extraction, which is itself a difficult task, a growing number of researchers have begun to study event reasoning. Mining and reasoning about relations among events spans many research areas; unlike the core task of event extraction, event reasoning is developing in a variety of directions and is a promising application that builds on information extraction.
In the related art, there is no good solution to the problem of topic deviation in event causal reasoning. Topic deviation in the inference result is generally limited by controlling the number of inference steps, but this does not fundamentally solve the problem, because whether the topics of two event nodes are consistent is not necessarily related to the number of steps separating them on the graph: event nodes that are close together may still have inconsistent topics.
Based on this, an embodiment of the present disclosure provides an event query method. Referring to the flowchart of the event query method shown in FIG. 1, the method may be executed by any electronic device with data processing capability (the electronic device executing the method is not limited in this embodiment) and mainly includes the following steps S102 to S110:
Step S102, acquiring a keyword.
The keyword may be input by a user, extracted automatically by the system from the user's input, or acquired automatically by the system as needed. It should be noted that the keyword includes at least one word obtained by word segmentation.
Step S104, matching the keyword with each event node in the event causal relationship graph to obtain a first event node matched with the keyword.
The event causal relationship graph is a graph capable of representing causal relationship between events, and comprises at least one event node. It should be noted that, in the event cause and effect relationship graph, if a cause and effect relationship exists between two event nodes, an edge exists between the two event nodes, and the cause node points to the result node.
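The graph structure described above can be sketched as a directed adjacency map. This is a minimal illustration only: the class name `EventCausalGraph`, its attributes, and the example events are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


class EventCausalGraph:
    """Directed graph in which an edge cause -> result encodes a causal relationship."""

    def __init__(self):
        self.results = defaultdict(set)  # cause node -> set of result nodes
        self.causes = defaultdict(set)   # result node -> set of cause nodes
        self.nodes = set()

    def add_causal_edge(self, cause, result):
        # The edge points from the cause node to the result node.
        self.nodes.update((cause, result))
        self.results[cause].add(result)
        self.causes[result].add(cause)


# Hypothetical example events, for illustration only.
graph = EventCausalGraph()
graph.add_causal_edge("heavy rain", "river floods")
graph.add_causal_edge("river floods", "roads closed")
```

Keeping both a forward and a reverse map is a design choice of this sketch; it makes it cheap to look up either the results or the causes of a node.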
In the embodiment of the present disclosure, the first event node may be the best-matching node found by using an Expert System (ES) and an Artificial Neural Network (ANN), may be obtained by calculating the mutual information between the keyword and each event node in the event causal relationship graph, or may be obtained with another feasible matching algorithm (such as the KM algorithm) or rule selected as needed. An expert system is a program system with expert-level problem-solving capability in a specific field, which handles problems in that field by using the knowledge and problem-solving methods of human experts.
Step S106, determining an inference path according to the first event node and the associated nodes having a causal relationship with the first event node in the event causal relationship graph.
The inference path is a sub-graph of an event causal relationship graph including at least one event node, and the associated node is a node having a causal relationship with the first event node. The subgraph comprises an association node, a first event node and a connecting edge used for indicating causal relation among the event nodes. In the embodiment of the present disclosure, the number of event nodes included in the inference path may be determined according to the inference step number, or may be determined by other possible rules and algorithms.
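One way to bound the inference path by a number of inference steps is a breadth-first traversal from the first event node. The sketch below assumes that the graph is given as a plain dictionary from each cause node to its result nodes and that only cause-to-result edges are followed; the function name `inference_path` and both assumptions are illustrative, not part of the disclosure.

```python
from collections import deque


def inference_path(results, first_node, max_steps):
    """Collect the sub-graph reachable from first_node within max_steps causal hops.

    `results` maps each cause node to the nodes it causes (hypothetical layout).
    Returns (nodes, edges) of the sub-graph.
    """
    depth = {first_node: 0}
    edges = set()
    queue = deque([first_node])
    while queue:
        node = queue.popleft()
        if depth[node] == max_steps:
            continue  # do not expand beyond the step limit
        for nxt in results.get(node, ()):
            edges.add((node, nxt))
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return set(depth), edges
```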
Step S108, according to mutual information between each event node and target information contained in the inference path, screening the event nodes in the inference path; wherein the target information includes at least one of the keyword and the first event node.
In a first possible implementation manner, mutual information between the keyword and each event node in the inference path is calculated, and event nodes and subsequent nodes of the event nodes, of which the mutual information does not meet the preset condition, are filtered out.
In a second possible implementation manner, mutual information between the first event node and each event node in the inference path is calculated, and event nodes and subsequent nodes of the event nodes, of which mutual information does not meet preset conditions, are filtered out.
In a third possible implementation manner, the mutual information between the keyword and each event node in the inference path and the mutual information between the first event node and each event node in the inference path are calculated respectively, and the event nodes whose mutual information does not meet the preset condition, together with their subsequent nodes, are filtered out. For example, the sum of the mutual information between the keyword and an event node and the mutual information between the first event node and that event node may be used as the screening basis; as another example, a weighted average of these two values may be used as the screening basis.
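The two screening bases mentioned for the third implementation manner (a plain sum or a weighted average) can be sketched as a single helper. The function name `combined_score` and the `weight` parameter are hypothetical; the disclosure does not specify how the weights are chosen.

```python
def combined_score(mi_keyword, mi_first_node, weight=None):
    """Combine the two mutual-information values into one screening score.

    With weight=None the plain sum is returned; otherwise a weighted average
    is returned, where `weight` (between 0 and 1) is applied to the keyword term.
    """
    if weight is None:
        return mi_keyword + mi_first_node
    return weight * mi_keyword + (1.0 - weight) * mi_first_node
```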
Mutual Information (MI) measures the correlation (mutual dependence) between two event sets. Mutual information is the expected value of pointwise mutual information (PMI). PMI is a statistic that measures the strength of association between two events: a larger value indicates a stronger correlation, and a smaller value indicates a weaker correlation.
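For reference, the standard textbook relationship between the two quantities can be written as follows. The random variables A and B and their values a and b are notation introduced here for illustration and do not appear in the source.

```latex
\mathrm{PMI}(a, b) = \log \frac{P(a, b)}{P(a)\,P(b)}, \qquad
I(A; B) = \sum_{a} \sum_{b} P(a, b)\,\mathrm{PMI}(a, b)
```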
Step S110, determining the event queried by the keyword according to the event nodes retained in the inference path.
In the embodiment of the present disclosure, the event nodes retained in the inference path are all the event nodes that remain after the screening in step S108 is completed.
In the method provided by the embodiment of the present disclosure, an inference path centered on the event node matched with the keyword is obtained, the mutual information between the keyword or the first event node and each event node in the inference path is calculated, the event nodes that do not meet the preset condition are filtered out, and the queried event is determined. After screening, the mutual information between each event node retained in the inference path and the keyword or the first event node meets the preset requirement, so the queried event nodes are topically consistent, semantic consistency is maintained, topic deviation is reduced, and the effect of event causal inference is improved.
Referring to FIG. 2, a flowchart of another event query method, the method mainly includes the following steps S202 to S214:
Step S202, acquiring a keyword.
Step S204, matching the keyword with each event node in the event causal relationship graph to obtain a first event node matched with the keyword.
Step S206, determining an inference path according to the first event node and the associated nodes having a causal relationship with the first event node in the event causal relationship graph.
Step S208, obtaining mutual information between each event node contained in the inference path and the keyword.
In the embodiment of the present disclosure, the mutual information between the keyword and each event node contained in the inference path may be obtained according to a co-occurrence frequency dictionary.
Optionally, the co-occurrence frequency dictionary may be preset, or may be generated according to description information of each event node in the event cause and effect relationship graph. The description information of the event node includes a subject, an object, a trigger word, and the like of the event, and the description information of the event may be extracted from an unstructured text.
It should be noted that the co-occurrence frequency is generally used to measure the semantic association degree of any two words. The co-occurrence frequency dictionary comprises at least one group of co-occurrence words and co-occurrence frequencies of the at least one group of co-occurrence words.
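A co-occurrence frequency dictionary of this kind could be built by counting, for each text, the distinct words it contains and the distinct word pairs that appear together. The sketch below assumes the texts are already segmented into token lists; the function name `build_cooccurrence` and the choice to key pairs by a sorted tuple are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(documents):
    """Count word and word-pair occurrences over word-segmented texts.

    `documents` is an iterable of token lists. Returns (total, word_counts,
    pair_counts), where pair_counts is keyed by an alphabetically sorted pair.
    """
    word_counts = Counter()  # number of texts containing each word
    pair_counts = Counter()  # number of texts containing both words of a pair
    total = 0
    for tokens in documents:
        total += 1
        unique = sorted(set(tokens))  # count each word once per text
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return total, word_counts, pair_counts
```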
Optionally, determining mutual information between each event node included in the inference path and the keyword according to a co-occurrence frequency dictionary may include:
segmenting the description information of each event node into words; querying the co-occurrence frequency dictionary to obtain the co-occurrence frequency between each segmented word and the keyword; determining the mutual information between each segmented word and the keyword according to the co-occurrence frequency; and determining the mutual information between the corresponding event node and the keyword according to the mutual information between the keyword and the segmented words belonging to the same event node.
Assume a keyword Q and an event node E in the inference path. Q is segmented into words, and X is one of the resulting words; the description information of E is segmented into words, and Y is one of the resulting words.
Then the pointwise mutual information PMI(X, Y) of X and Y is:
$$\mathrm{PMI}(X, Y) = \log \frac{P(X, Y)}{P(X)\,P(Y)} = \log \frac{N \cdot N(X, Y)}{N(X)\,N(Y)}$$
where P(X, Y) is the probability that X and Y occur together, P(X) is the probability that X occurs, P(Y) is the probability that Y occurs, N is the total number of texts in the corpus, N(X, Y) is the number of texts in which X and Y co-occur, N(X) is the number of texts in which X occurs, and N(Y) is the number of texts in which Y occurs. Because X and Y may never co-occur in the corpus, in which case N(X, Y) = 0 and the logarithm is undefined, the following form is used so that the formula always holds:
Figure BDA0003127121890000071
Finally, the mutual information of the keyword Q and the event node E is:
$$\mathrm{PMI}(Q, E) = \sum_{x \in Q,\; y \in E} \mathrm{PMI}(x, y)$$
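The word-level and node-level computations can be sketched as follows. The smoothed formula used in the source is not recoverable from the text, so this sketch substitutes add-one smoothing on every count; that choice, the natural logarithm, the function names `pmi` and `pmi_keyword_event`, and the sorted-tuple pair key are all assumptions made for illustration.

```python
import math


def pmi(x, y, total, word_counts, pair_counts):
    """PMI(x, y) = log(N * N(x, y) / (N(x) * N(y))) with add-one smoothing (assumed)."""
    pair = tuple(sorted((x, y)))
    n_xy = pair_counts.get(pair, 0) + 1  # assumption: add-one smoothing
    n_x = word_counts.get(x, 0) + 1      # assumption: add-one smoothing
    n_y = word_counts.get(y, 0) + 1      # assumption: add-one smoothing
    return math.log(total * n_xy / (n_x * n_y))


def pmi_keyword_event(keyword_tokens, event_tokens, total, word_counts, pair_counts):
    """PMI(Q, E): sum of PMI(x, y) over words x of the keyword and y of the event."""
    return sum(
        pmi(x, y, total, word_counts, pair_counts)
        for x in keyword_tokens
        for y in event_tokens
    )
```

With smoothing of this kind, word pairs that never co-occur receive a finite, typically negative score instead of causing a math error.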
Optionally, in the second possible implementation manner, obtaining the mutual information between the first event node and each other event node in the inference path includes:
generating a co-occurrence frequency dictionary according to the description information of each event node in the event causal relationship graph; segmenting the description information of the first event node and of the other event nodes into words; querying the co-occurrence frequency dictionary to obtain the co-occurrence frequency between each segmented word and the first event node; determining the mutual information between each segmented word and the first event node according to the co-occurrence frequency; and determining the mutual information between the corresponding event node and the first event node according to the mutual information between the first event node and the segmented words belonging to the same event node.
Assume a first event node C and another event node R in the inference path. The description information of C is segmented into words, and U is one of the resulting words; the description information of R is segmented into words, and V is one of the resulting words.
The pointwise mutual information PMI(U, V) of U and V is:
$$\mathrm{PMI}(U, V) = \log \frac{P(U, V)}{P(U)\,P(V)} = \log \frac{N \cdot N(U, V)}{N(U)\,N(V)}$$
where P(U, V) is the probability that U and V occur together, P(U) is the probability that U occurs, P(V) is the probability that V occurs, N is the total number of texts in the corpus, N(U, V) is the number of texts in which U and V co-occur, N(U) is the number of texts in which U occurs, and N(V) is the number of texts in which V occurs. Because U and V may never co-occur in the corpus, in which case N(U, V) = 0 and the logarithm is undefined, the following form is used so that the formula always holds:
Figure BDA0003127121890000073
Finally, the mutual information of the first event node C and the other event node R is:
$$\mathrm{PMI}(C, R) = \sum_{u \in C,\; v \in R} \mathrm{PMI}(u, v)$$
Optionally, in the third possible implementation manner, the mutual information between the keyword and each other event node in the inference path and the mutual information between the first event node and each other event node in the inference path are both obtained. The process is similar to that of the two implementation manners above and is not described again here.
Step S210, according to the mutual information between each event node contained in the inference path and the keyword, determining from the inference path the topic deviation nodes whose mutual information does not meet the preset condition.
The preset condition is that the mutual information obtained in step S208 is greater than or equal to a threshold. A topic deviation node is an event node whose mutual information with the keyword and/or the first event node, as obtained in step S208, is less than the threshold.
The threshold may generally be set to 0: two events whose mutual information is greater than or equal to 0 are considered strongly correlated, and two events whose mutual information is less than 0 are considered weakly correlated. Alternatively, the threshold may be determined by other feasible algorithms and models according to the application scenario and functional requirements.
Step S212, deleting from the inference path the topic deviation nodes and all event nodes that stand in a result relationship with them.
The result relationship includes direct and indirect result relationships. That is, the deleted nodes include event nodes that are direct results of a topic deviation node and event nodes that are indirect results of it, i.e., all event nodes reachable from the topic deviation node in the inference path.
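Steps S210 and S212 together amount to marking the nodes whose score falls below the threshold and then removing everything reachable from them. A minimal sketch follows, assuming the inference path is given as a set of nodes plus a dictionary from each cause node to its result nodes, and that a score has already been computed for every node; the function name `prune_topic_deviation` and these data layouts are hypothetical.

```python
def prune_topic_deviation(results, nodes, scores, threshold=0.0):
    """Return the nodes kept after removing topic deviation nodes and their results.

    `results` maps a cause node to its result nodes, `nodes` is the set of nodes
    in the inference path, and `scores` maps each node to its mutual information.
    """
    removed = set()
    # Start from every node whose score is below the threshold.
    stack = [n for n in nodes if scores[n] < threshold]
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        # Direct and indirect results of a removed node are removed as well.
        stack.extend(r for r in results.get(node, ()) if r in nodes)
    return nodes - removed
```

Note that in this sketch a node is removed even if it is also reachable through a retained node, which follows the statement above that all nodes reachable from a topic deviation node are deleted.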
Step S214, determining the event queried by the keyword according to the event nodes retained in the inference path.
With the above method, the co-occurrence frequency dictionary is generated from the event descriptions, which improves the domain and topic relevance of the corpus, improves the accuracy of the mutual information computed between the keyword or the first event node and the other event nodes, and improves computational efficiency. By screening out the nodes that do not meet the preset condition, event nodes that are only weakly related to the keyword or the first event node, together with their subsequent nodes, are deleted, so that the event nodes retained in the inference path are topically consistent with the keyword. This effectively addresses topic deviation in event causal inference, improves the inference result, and improves the efficiency of event query.
In some application scenarios, events are obtained through event extraction, and some description information may be lost during extraction, leaving the description information of some events incomplete, which affects the effect and efficiency of subsequent event causal reasoning. Therefore, the present disclosure further provides a method for completing the information of event nodes, which may be applied before the keyword is matched with each event node in the event causal relationship graph to obtain the first event node in the above embodiments. As shown in FIG. 3, the method mainly includes the following steps S302 to S306:
Step S302, for a second event node in the event causal relationship graph whose description information is partially missing, querying the adjacent event nodes having a causal relationship with it.
In the embodiment of the present disclosure, for ease of understanding, reference may be made to the schematic diagram of the event causal relationship graph shown in FIG. 4 (only some of the nodes and edges are shown). In the event causal relationship graph, each event node is represented by a triple (s, p, o), where s represents the event subject, p represents the event trigger, o represents the event object, and s, p, and o are extracted from unstructured text. The second event node, whose description information is partially missing, is an event node for which s or o was lost during extraction.
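The (s, p, o) triple and the notion of a node with partially missing description information can be sketched with a small data class. The class name `EventNode`, its field names, and the use of `None` to mark a missing element are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventNode:
    subject: Optional[str]  # s, may be missing after extraction
    trigger: str            # p
    obj: Optional[str]      # o, may be missing after extraction

    def is_incomplete(self):
        """True if the subject or the object was lost during extraction."""
        return self.subject is None or self.obj is None
```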
In the embodiment of the present disclosure, node representation learning is performed on the event causal relationship graph using the Trans idea (translation-based embedding). X_s, X_p, and X_o denote the event subject s, the event trigger p, and the event object o, respectively, where X_s, X_p, and X_o are N-dimensional numerical vectors, so that each event node can be represented by a triple <X_s, X_p, X_o>.
Step S304, inputting the description information of the adjacent event node and the description information of the second event node that is not missing into the trained prediction model, so as to obtain a representation of the second event node.
In one embodiment, the training method of the prediction model is as follows: respectively inputting a positive sample and a corresponding negative sample in a training set into the prediction model to obtain the representation of two events in the positive sample and the representation of two events in the negative sample output by the prediction model; determining a loss function according to a difference between a first distance between the representations of the two events in the positive sample and a second distance between the representations of the two events in the negative sample; and adjusting the model parameters according to the loss function.
The positive sample refers to the description information of two event nodes with causal relationship in the event causal relationship graph; the negative sample refers to the description information of two event nodes without causal relationship in the event causal relationship graph.
Optionally, suppose events H_pos and T_pos have a causal relationship. The two events constitute a positive sample, denoted <h_pos, r_pos, t_pos>, where r_pos is the vector representation of the relationship between event H_pos and event T_pos, and h_pos and t_pos are the vector representations of events H_pos and T_pos, respectively:
Figure BDA0003127121890000091
Suppose events H_neg and T_neg have no causal relationship. The two events constitute a negative sample, denoted <h_neg, r_neg, t_neg>, where r_neg is the vector representation of the relationship between event H_neg and event T_neg, and h_neg and t_neg are the vector representations of events H_neg and T_neg, respectively:
Figure BDA0003127121890000092
In the positive sample, the distance between the two events is $d_{pos} = \lVert h_{pos} + r_{pos} - t_{pos} \rVert$;
In the negative sample, the distance between the two events is $d_{neg} = \lVert h_{neg} + r_{neg} - t_{neg} \rVert$.
The loss function L is $L = \max(0,\; d_{pos} - d_{neg} + \mathrm{margin})$, where max(·) takes the larger of its arguments and margin is a constant term.
The prediction model is trained without supervision, and the model parameters are adjusted according to the loss function so as to minimize its value.
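The distance and loss described above can be evaluated for a single pair of samples as in the sketch below. The source does not state which norm is used, so the Euclidean norm is assumed here; the function names `distance` and `margin_loss` and the default margin of 1.0 are likewise illustrative, and no parameter update step is shown.

```python
import math


def distance(h, r, t):
    """Norm of h + r - t (Euclidean norm assumed)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(pos, neg, margin=1.0):
    """L = max(0, d_pos - d_neg + margin) for one positive and one negative sample.

    `pos` and `neg` are (h, r, t) tuples of equal-length vectors.
    """
    d_pos = distance(*pos)
    d_neg = distance(*neg)
    return max(0.0, d_pos - d_neg + margin)
```

The loss is zero once the negative sample is farther apart than the positive sample by at least the margin, so training pushes causally related events closer together than unrelated ones.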
In an embodiment of the present disclosure, it is assumed that the second event node
Figure BDA0003127121890000101
is missing the event object description information o, and that the adjacent event node having a causal relationship with the second event node is
Figure BDA0003127121890000102
Figure BDA0003127121890000103
The description information of the event node H and the non-missing description information of the event node T are input into the trained prediction model to obtain the vector representation of the event T
Figure BDA0003127121890000104
Step S306: predict the missing description information according to the representation of the second event node.
In one embodiment, predicting the missing description information includes:
obtaining the representation of each candidate information; determining the representation of the missing description information according to the representation of the second event node and the representation of the information which is not missing in the second event node; and determining the description information missing from the second event node from each candidate information according to the similarity between the representation of each candidate information and the representation of the missing description information.
Optionally, from the representation of the second event node and the representations of its two non-missing pieces of description information (each given as a formula image in the original publication and not reproduced here), the vector representation of the missing object information can be determined. Using an artificial neural network, the similarity between the vector representation of each candidate information and this vector is computed, and the candidate information closest to this vector is recalled as the object information to be completed for event T.
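The candidate-recall step can be sketched as follows. Two points here are assumptions, not statements of the disclosed method: the event vector is treated as the sum of its component vectors, so the missing component is obtained by subtraction, and cosine similarity stands in for the neural-network similarity scoring. All names (`missing_vector`, `recall_candidate`, the candidate labels) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def missing_vector(event_vec, known_vecs):
    """ASSUMPTION: event vector = sum of component vectors, so subtract the known ones."""
    out = list(event_vec)
    for vec in known_vecs:
        out = [o - x for o, x in zip(out, vec)]
    return out


def recall_candidate(target, candidates):
    """Return the name of the candidate whose vector is most similar to target."""
    return max(candidates, key=lambda name: cosine(candidates[name], target))


# Toy vectors, made up for illustration only.
event_t = [2.0, 3.0, 1.0]    # representation of the second event node T
subject = [1.0, 0.0, 0.0]    # representation of a non-missing component
predicate = [0.0, 1.0, 0.0]  # representation of another non-missing component
candidates = {
    "candidate_a": [1.0, 2.0, 1.0],
    "candidate_b": [-1.0, 0.5, 0.0],
    "candidate_c": [0.0, 0.0, 1.0],
}

o_hat = missing_vector(event_t, [subject, predicate])
print(o_hat)                                # [1.0, 2.0, 1.0]
print(recall_candidate(o_hat, candidates))  # candidate_a
```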
In this way, training the prediction model with positive and negative samples can improve its prediction accuracy. Event nodes that are missing part of their description information are supplemented effectively and accurately, so that they have complete description information in subsequent queries; this improves the accuracy of the mutual information calculation and gives the retained event nodes higher semantic relevance to the keyword.
Corresponding to the foregoing event query method, an embodiment of the present disclosure further provides an event query device. Referring to the structural block diagram of the event query device shown in fig. 5, the device mainly includes the following modules:
an obtaining module 510, configured to obtain a keyword;
a matching module 520, configured to match the keyword with each event node in the event causal relationship graph, so as to obtain a first event node matched with the keyword;
an inference module 530, configured to determine an inference path according to the first event node and an associated node of the first event node having a causal relationship in the event causal relationship graph;
a screening module 540, configured to screen the event nodes in the inference path according to mutual information between each event node included in the inference path and target information; wherein the target information includes at least one of the keyword and the first event node;
and the query module 550 is configured to determine the event queried by the keyword according to the event node retained in the inference path.
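The matching and inference modules listed above can be sketched on a toy graph. The substring match, the breadth-first walk along cause-to-effect edges, and every identifier and sample description below are assumptions for illustration; the disclosure does not prescribe a particular matching rule or traversal.

```python
from collections import deque

# Hypothetical toy graph: node id -> description, and cause -> list of effects.
descriptions = {
    "e1": "heavy rain hits the city",
    "e2": "river level rises",
    "e3": "roads are flooded",
    "e4": "traffic is disrupted",
    "e5": "stock index falls",
}
effects = {"e1": ["e2", "e3"], "e3": ["e4"], "e5": []}


def match_nodes(keyword, descriptions):
    """First event nodes: nodes whose description contains the keyword (assumed rule)."""
    return [n for n, text in descriptions.items() if keyword in text]


def inference_path(start, effects):
    """Breadth-first walk along cause -> effect edges starting from `start`."""
    seen, order, queue = {start}, [start], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in effects.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


first = match_nodes("rain", descriptions)
print(first)                              # ['e1']
print(inference_path(first[0], effects))  # ['e1', 'e2', 'e3', 'e4']
```

The resulting node list is the raw inference path; the screening module then removes nodes whose mutual information with the target information is too low.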
In some embodiments, the filtering, according to mutual information between each event node included in the inference path and target information, the event node in the inference path includes:
acquiring mutual information between each event node contained in the inference path and the keyword;
determining, from the inference path, topic-deviating nodes whose mutual information does not meet a set condition, according to the mutual information between each event node contained in the inference path and the keyword;
and deleting from the inference path the topic-deviating nodes and all event nodes that are in a result relationship with the topic-deviating nodes.
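A minimal sketch of this screening step follows. It assumes that the "set condition" is a simple lower threshold on mutual information and that "result relationship" means every node reachable downstream along cause-to-effect edges; both are interpretations, and the function name `prune_path` and the sample values are hypothetical.

```python
def prune_path(path_nodes, effects, mi, threshold):
    """Drop nodes whose mutual information is below `threshold`,
    together with all of their downstream effect nodes."""
    off_topic = [n for n in path_nodes if mi.get(n, 0.0) < threshold]
    removed = set()
    stack = list(off_topic)
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        stack.extend(effects.get(node, []))  # follow cause -> effect edges
    return [n for n in path_nodes if n not in removed]


# Toy inference path and mutual-information scores, made up for illustration.
path_nodes = ["e1", "e2", "e3", "e4"]
effects = {"e1": ["e2", "e3"], "e3": ["e4"]}
mi = {"e1": 2.1, "e2": 1.4, "e3": 0.2, "e4": 1.9}

print(prune_path(path_nodes, effects, mi, threshold=0.5))  # ['e1', 'e2']
```

Node e4 is removed even though its own score is high, because it is a result of the topic-deviating node e3.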
In some embodiments, the obtaining mutual information between each event node included in the inference path and the keyword includes:
generating a co-occurrence frequency dictionary according to the description information of each event node in the event causal relationship graph;
and determining mutual information between each event node contained in the inference path and the keyword according to the co-occurrence frequency dictionary.
In some embodiments, the determining mutual information between each event node included in the inference path and the keyword according to the co-occurrence frequency dictionary includes:
performing word segmentation on the description information of each event node to obtain segmented words;
querying the co-occurrence frequency dictionary to obtain the co-occurrence frequency between each segmented word and the keyword;
determining mutual information between each segmented word and the keyword according to the co-occurrence frequency;
and determining mutual information between the corresponding event node and the keyword according to the mutual information between the keyword and the segmented words belonging to the same event node.
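The steps above can be sketched with pointwise mutual information (PMI) computed from description-level co-occurrence counts. Several choices are assumptions for illustration: whitespace splitting replaces a real word segmenter, unseen word pairs are given a score of 0.0, and the node-level score is the mean of the word-level scores. The disclosure does not fix these details, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence(descriptions):
    """Count, per description, each distinct word and each unordered word pair."""
    word_count, pair_count = Counter(), Counter()
    for text in descriptions:
        words = set(text.split())  # assumed segmentation: whitespace split
        word_count.update(words)
        pair_count.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_count, pair_count, len(descriptions)


def pmi(word, keyword, word_count, pair_count, n_docs):
    """log(p(word, keyword) / (p(word) * p(keyword))); 0.0 if never seen together."""
    if word == keyword:
        joint = word_count[word]
    else:
        joint = pair_count[frozenset((word, keyword))]
    if joint == 0:
        return 0.0
    return math.log(joint * n_docs / (word_count[word] * word_count[keyword]))


def node_mi(description, keyword, word_count, pair_count, n_docs):
    """ASSUMPTION: node-level score is the mean of its word-level PMI values."""
    words = description.split()
    if not words:
        return 0.0
    scores = [pmi(w, keyword, word_count, pair_count, n_docs) for w in words]
    return sum(scores) / len(scores)


# Toy event descriptions, made up for illustration only.
corpus = ["heavy rain floods roads", "rain raises river level", "stock index falls"]
wc, pc, n = build_cooccurrence(corpus)

on_topic = node_mi(corpus[0], "rain", wc, pc, n)
off_topic = node_mi(corpus[2], "rain", wc, pc, n)
print(on_topic > off_topic)  # True: the rain-related node scores higher
```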
In some application scenarios, some events lose part of their description information during extraction, which leaves the description of those events incomplete and impairs the effect and efficiency of subsequent causal reasoning over events. To address this problem, an event query device is provided, whose structural block diagram is shown in fig. 6 and which mainly includes:
an obtaining module 610, configured to obtain a keyword;
a node query module 620, configured to query, for a second event node in the event causal relationship graph whose description information is partially missing, an adjacent event node having a causal relationship with the second event node in the event causal relationship graph;
a representation module 630, configured to input the description information of the adjacent event node and the non-missing description information of the second event node into a trained prediction model to obtain a representation of the second event node;
a predicting module 640, configured to predict the missing description information according to the representation of the second event node;
a matching module 650, configured to match the keyword with each event node in the event causal relationship graph, so as to obtain a first event node matched with the keyword;
an inference module 660, configured to determine an inference path according to the first event node and the associated nodes of the first event node having a causal relationship in the event causal relationship graph;
a screening module 670, configured to screen the event nodes in the inference path according to mutual information between each event node included in the inference path and target information; wherein the target information includes at least one of the keyword and the first event node;
and the query module 680 is configured to determine the event queried by the keyword according to the event node reserved in the inference path.
In some embodiments, the prediction model is obtained by inputting a positive sample and a corresponding negative sample in a training set into the prediction model respectively, and obtaining the representation of two events in the positive sample and the representation of two events in the negative sample output by the prediction model; determining a loss function according to a difference between a first distance between the representations of the two events in the positive sample and a second distance between the representations of the two events in the negative sample; adjusting the model parameters according to the loss function;
the positive sample is description information of two event nodes with causal relationship in the event causal relationship graph; the negative sample is description information of two event nodes without causal relationship in the event causal relationship graph.
In some embodiments, the loss function takes on the greater of the difference between the first distance and the second distance, and zero.
In some embodiments, predicting missing description information based on the characterization of the second event node includes:
obtaining the representation of each candidate information;
determining the representation of the missing description information according to the representation of the second event node and the representation of the information which is not missing in the second event node;
and determining the description information missing from the second event node from each candidate information according to the similarity between the representation of each candidate information and the representation of the missing description information.
It should be noted that the foregoing explanation of the event query method is also applicable to the event query device in the embodiment of the present disclosure, and the implementation principle and the beneficial effect thereof are similar and will not be described herein again.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
First, an embodiment of the present disclosure provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any of the foregoing event query methods.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 includes a computing unit 701, which can perform various appropriate actions and processes in accordance with a computer program stored in a ROM (Read-Only Memory) 702 or a computer program loaded from a storage unit 708 into a RAM (Random Access Memory) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An I/O (Input/Output) interface 705 is also connected to the bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, a DSP (Digital Signal Processor), and any suitable processor, controller, microcontroller, and the like. The computing unit 701 executes the respective methods and processes described above, such as the event query method. For example, in some embodiments, the event query method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the event query method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the event query method by any other suitable means (e.g., by means of firmware).
The disclosed embodiments also provide a non-transitory computer readable storage medium storing computer instructions for causing a computer to execute any one of the aforementioned event query methods.
The embodiment of the present disclosure further provides a computer program product, which includes a computer program, and the computer program realizes any one of the foregoing event query methods when being executed by a processor.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SOCs (Systems on a Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a Display device (e.g., a CRT (Cathode Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: LAN (Local Area Network), WAN (Wide Area Network), internet, and blockchain Network.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and addresses the high management difficulty and weak service scalability of traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be noted that artificial intelligence is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it includes both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. An event query method, comprising:
acquiring a keyword;
matching the keyword with each event node in an event causal relationship graph to obtain a first event node matched with the keyword;
determining an inference path according to the first event node and the associated nodes of the first event node having causal relationship in the event causal relationship graph;
screening the event nodes in the inference path according to mutual information between each event node contained in the inference path and target information; wherein the target information includes at least one of the keyword and the first event node;
and determining the event queried by the keyword according to the event nodes retained in the inference path.
2. The method according to claim 1, wherein the screening the event nodes in the inference path according to mutual information between the event nodes included in the inference path and the keyword includes:
acquiring mutual information between each event node contained in the inference path and the keyword;
determining, from the inference path, topic-deviating nodes whose mutual information does not meet a set condition, according to the mutual information between each event node contained in the inference path and the keyword;
and deleting from the inference path the topic-deviating nodes and all event nodes that are in a result relationship with the topic-deviating nodes.
3. The method according to claim 2, wherein the obtaining mutual information between each event node included in the inference path and the keyword comprises:
generating a co-occurrence frequency dictionary according to the description information of each event node in the event causal relationship graph;
and determining mutual information between each event node contained in the inference path and the keyword according to the co-occurrence frequency dictionary.
4. The method according to claim 3, wherein the determining mutual information between each event node included in the inference path and the keyword according to the co-occurrence frequency dictionary comprises:
performing word segmentation on the description information of each event node to obtain segmented words;
querying the co-occurrence frequency dictionary to obtain the co-occurrence frequency between each segmented word and the keyword;
determining mutual information between each segmented word and the keyword according to the co-occurrence frequency;
and determining mutual information between the corresponding event node and the keyword according to the mutual information between the keyword and the segmented words belonging to the same event node.
5. The method of any of claims 1-4, wherein the method further comprises:
querying, for a second event node in the event causal relationship graph whose description information is partially missing, an adjacent event node having a causal relationship with the second event node;
inputting the description information of the adjacent event node and the non-missing description information of the second event node into a trained prediction model to obtain the representation of the second event node;
and predicting the missing description information according to the characterization of the second event node.
6. The method according to claim 5, wherein the prediction model is obtained by inputting positive samples and corresponding negative samples in a training set into the prediction model respectively, and obtaining the representation of two events in the positive samples and the representation of two events in the negative samples output by the prediction model; determining a loss function according to a difference between a first distance between the representations of the two events in the positive sample and a second distance between the representations of the two events in the negative sample; adjusting the model parameters according to the loss function;
the positive sample is description information of two event nodes with causal relationship in the event causal relationship graph; the negative sample is description information of two event nodes without causal relationship in the event causal relationship graph.
7. The method of claim 6, wherein the loss function takes on a value that is the greater of the difference between the first distance and the second distance, and zero.
8. The method of claim 5, wherein predicting missing description information based on the characterization of the second event node comprises:
obtaining the representation of each candidate information;
determining the representation of the missing description information according to the representation of the second event node and the representation of the information which is not missing in the second event node;
and determining the description information missing from the second event node from each candidate information according to the similarity between the representation of each candidate information and the representation of the missing description information.
9. An event query device, comprising:
the acquisition module is used for acquiring keywords;
the matching module is used for matching the keywords with all event nodes in the event causal relationship graph to obtain first event nodes matched with the keywords;
the reasoning module is used for determining a reasoning path according to the first event node and the associated nodes of the first event node having causal relationship in the event causal relationship graph;
the screening module is used for screening the event nodes in the inference path according to mutual information between each event node contained in the inference path and target information; wherein the target information includes at least one of the keyword and the first event node;
and the query module is used for determining the event queried by the keyword according to the event node reserved in the inference path.
10. The apparatus of claim 9, wherein the screening module comprises:
the obtaining unit is used for obtaining mutual information between each event node contained in the inference path and the keyword;
the determining unit is used for determining, from the inference path, topic-deviating nodes whose mutual information does not meet a set condition, according to the mutual information between each event node contained in the inference path and the keyword;
and the deleting unit is used for deleting from the inference path the topic-deviating nodes and all event nodes that are in a result relationship with the topic-deviating nodes.
11. The apparatus of claim 10, wherein the obtaining unit is configured to:
generating a co-occurrence frequency dictionary according to the description information of each event node in the event causal relationship graph;
and determining mutual information between each event node contained in the inference path and the keyword according to the co-occurrence frequency dictionary.
12. The apparatus of claim 11, wherein the obtaining unit is configured to:
performing word segmentation on the description information of each event node to obtain segmented words;
querying the co-occurrence frequency dictionary to obtain the co-occurrence frequency between each segmented word and the keyword;
determining mutual information between each segmented word and the keyword according to the co-occurrence frequency;
and determining mutual information between the corresponding event node and the keyword according to the mutual information between the keyword and the segmented words belonging to the same event node.
13. The apparatus of any of claims 9-12, wherein the apparatus further comprises:
the node query module is used for querying, for a second event node in the event causal relationship graph whose description information is partially missing, an adjacent event node having a causal relationship with the second event node;
the representation module is used for inputting the description information of the adjacent event node and the non-missing description information of the second event node into a trained prediction model to obtain the representation of the second event node;
and the predicting module is used for predicting the missing description information according to the representation of the second event node.
14. The apparatus of claim 13, wherein the prediction module is to:
obtaining the representation of each candidate information;
determining the representation of the missing description information according to the representation of the second event node and the representation of the information which is not missing in the second event node;
and determining the description information missing from the second event node from each candidate information according to the similarity between the representation of each candidate information and the representation of the missing description information.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-8.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.
CN202110691962.6A 2021-06-22 2021-06-22 Event query method, device and storage medium Active CN113590774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110691962.6A CN113590774B (en) 2021-06-22 2021-06-22 Event query method, device and storage medium

Publications (2)

Publication Number Publication Date
CN113590774A true CN113590774A (en) 2021-11-02
CN113590774B CN113590774B (en) 2023-09-29

Family

ID=78244262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110691962.6A Active CN113590774B (en) 2021-06-22 2021-06-22 Event query method, device and storage medium

Country Status (1)

Country Link
CN (1) CN113590774B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491232A (en) * 2021-12-24 2022-05-13 北京百度网讯科技有限公司 Information query method and device, electronic equipment and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108461151A (en) * 2017-12-15 2018-08-28 北京大学深圳研究生院 A kind of the logic Enhancement Method and device of knowledge mapping
CN110390006A (en) * 2019-07-23 2019-10-29 腾讯科技(深圳)有限公司 Question and answer corpus generation method, device and computer readable storage medium
CN110489520A (en) * 2019-07-08 2019-11-22 平安科技(深圳)有限公司 Event-handling method, device, equipment and the storage medium of knowledge based map
CN110675912A (en) * 2019-09-17 2020-01-10 东北大学 Gene regulation and control network construction method based on structure prediction
CN111291265A (en) * 2020-02-10 2020-06-16 青岛聚看云科技有限公司 Recommendation information generation method and device
CN111324725A (en) * 2020-02-17 2020-06-23 昆明理工大学 Topic acquisition method, terminal and computer readable storage medium
CN111695583A (en) * 2019-07-18 2020-09-22 广东电网有限责任公司信息中心 Feature selection method based on causal network
CN111950279A (en) * 2019-05-17 2020-11-17 百度在线网络技术(北京)有限公司 Entity relationship processing method, device, equipment and computer readable storage medium
CN111949787A (en) * 2020-08-21 2020-11-17 平安国际智慧城市科技股份有限公司 Automatic question-answering method, device, equipment and storage medium based on knowledge graph
CN112035672A (en) * 2020-07-23 2020-12-04 深圳技术大学 Knowledge graph complementing method, device, equipment and storage medium
WO2021092099A1 (en) * 2019-11-05 2021-05-14 Epacca, Inc. Mechanistic causal reasoning for efficient analytics and natural language

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108461151A (en) * 2017-12-15 2018-08-28 北京大学深圳研究生院 A kind of the logic Enhancement Method and device of knowledge mapping
CN111950279A (en) * 2019-05-17 2020-11-17 百度在线网络技术(北京)有限公司 Entity relationship processing method, device, equipment and computer readable storage medium
CN110489520A (en) * 2019-07-08 2019-11-22 平安科技(深圳)有限公司 Event-handling method, device, equipment and the storage medium of knowledge based map
WO2021004333A1 (en) * 2019-07-08 2021-01-14 平安科技(深圳)有限公司 Knowledge graph-based event processing method and apparatus, device, and storage medium
CN111695583A (en) * 2019-07-18 2020-09-22 广东电网有限责任公司信息中心 Feature selection method based on causal network
CN110390006A (en) * 2019-07-23 2019-10-29 腾讯科技(深圳)有限公司 Question and answer corpus generation method, device and computer readable storage medium
CN110675912A (en) * 2019-09-17 2020-01-10 东北大学 Gene regulation and control network construction method based on structure prediction
WO2021092099A1 (en) * 2019-11-05 2021-05-14 Epacca, Inc. Mechanistic causal reasoning for efficient analytics and natural language
CN111291265A (en) * 2020-02-10 2020-06-16 青岛聚看云科技有限公司 Recommendation information generation method and device
CN111324725A (en) * 2020-02-17 2020-06-23 昆明理工大学 Topic acquisition method, terminal and computer readable storage medium
CN112035672A (en) * 2020-07-23 2020-12-04 深圳技术大学 Knowledge graph complementing method, device, equipment and storage medium
CN111949787A (en) * 2020-08-21 2020-11-17 平安国际智慧城市科技股份有限公司 Automatic question-answering method, device, equipment and storage medium based on knowledge graph

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SAURAV ACHARYA: "Enhanced Fast Causal Network Inference over Event Streams", TRANSACTIONS ON LARGE-SCALE DATA- AND KNOWLEDGE-CENTERED SYSTEMS XVII *
Wang Lianxi: "Research on Dictionary Construction and Public Opinion Event Identification for the Public Security Domain", 情报探索, no. 02 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491232A (en) * 2021-12-24 2022-05-13 北京百度网讯科技有限公司 Information query method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113590774B (en) 2023-09-29

Similar Documents

Publication Publication Date Title
CN111967262A (en) Method and device for determining entity tag
CN112989235A (en) Knowledge base-based internal link construction method, device, equipment and storage medium
CN115248890B (en) User interest portrait generation method and device, electronic equipment and storage medium
CN113392920B (en) Method, apparatus, device, medium, and program product for generating cheating prediction model
CN112560480B (en) Task community discovery method, device, equipment and storage medium
CN113033194B (en) Training method, device, equipment and storage medium for semantic representation graph model
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN113963197A (en) Image recognition method and device, electronic equipment and readable storage medium
CN117688946A (en) Intent recognition method and device based on large model, electronic equipment and storage medium
CN113590774B (en) Event query method, device and storage medium
CN117436505A (en) Training data processing method, training device, training equipment and training medium
CN117271884A (en) Method, device, electronic equipment and storage medium for determining recommended content
CN116467461A (en) Data processing method, device, equipment and medium applied to power distribution network
CN115186738B (en) Model training method, device and storage medium
CN116383382A (en) Sensitive information identification method and device, electronic equipment and storage medium
CN113704256B (en) Data identification method, device, electronic equipment and storage medium
CN112560481B (en) Statement processing method, device and storage medium
CN114117248A (en) Data processing method and device and electronic equipment
CN114817476A (en) Language model training method and device, electronic equipment and storage medium
CN114969444A (en) Data processing method and device, electronic equipment and storage medium
CN114511064A (en) Neural network model interpretation method and device, electronic equipment and storage medium
CN114119972A (en) Model acquisition and object processing method and device, electronic equipment and storage medium
CN116737520B (en) Data braiding method, device and equipment for log data and storage medium
CN116244413B (en) New intention determining method, apparatus and storage medium
CN117573817A (en) Model training method, correlation determining method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant