CN111709225B - Event causal relationship discriminating method, device and computer readable storage medium - Google Patents

Event causal relationship discriminating method, device and computer readable storage medium

Info

Publication number
CN111709225B
CN111709225B CN202010385693.6A CN202010385693A CN111709225B CN 111709225 B CN111709225 B CN 111709225B CN 202010385693 A CN202010385693 A CN 202010385693A CN 111709225 B CN111709225 B CN 111709225B
Authority
CN
China
Prior art keywords
event
causal relationship
vector representation
vector
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010385693.6A
Other languages
Chinese (zh)
Other versions
CN111709225A (en
Inventor
袁杰
于皓
张杰
陈秀坤
高古明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd filed Critical Beijing Mininglamp Software System Co ltd
Priority to CN202010385693.6A priority Critical patent/CN111709225B/en
Publication of CN111709225A publication Critical patent/CN111709225A/en
Application granted granted Critical
Publication of CN111709225B publication Critical patent/CN111709225B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the application discloses an event causal relationship judging method, an event causal relationship judging device and a computer readable storage medium, wherein the method comprises the following steps: acquiring an original event corpus, and preprocessing the event corpus; the preprocessing may include: event extraction and event labeling; acquiring event vector representation according to the event marked by the event; the event vector is used as input data and is input into a preset deep neural network model; and calculating the output result of the deep neural network model through a preset calculation function so as to obtain the prediction of the causal relationship among a plurality of events. By the scheme of the embodiment, the classification of causal events is facilitated, and the accuracy of classification results is improved.

Description

Event causal relationship discriminating method, device and computer readable storage medium
Technical Field
The present invention relates to information processing technology, and in particular, to a method, an apparatus and a computer readable storage medium for determining event causal relationship.
Background
With the rapid development of artificial intelligence technology, the field of natural language processing has advanced considerably, but using natural language processing technology to model the mutual influence between events through event causality still leaves room for improvement. Causal event relationship judgment can be applied in many fields; for example, in the financial field, event causal transfer can be used to evaluate the business state, stock fluctuation and other conditions of downstream companies, thereby providing valuable reference analysis for the future development of those companies, effectively assisting decisions and reducing financial operation risks.
At present, extracting event causal relationships from text information mainly relies on classification and recognition performed by calculating word frequencies and constructing feature vectors from word-frequency information such as TF-IDF (term frequency-inverse document frequency).
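For concreteness, a minimal sketch of this word-frequency approach: a pure-Python TF-IDF over a toy corpus. The example documents and the normalization choices (raw-count TF divided by document length, idf = log(N/df)) are illustrative assumptions, not taken from the patent.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF feature vectors over a shared vocabulary.

    TF is the raw term count normalized by document length; IDF is
    log(N / df) with N the number of documents and df the number of
    documents containing the term.
    """
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    vectors = []
    for d in docs:
        counts = Counter(d)
        vectors.append([
            (counts[w] / len(d)) * math.log(n / df[w]) for w in vocab
        ])
    return vocab, vectors

# Three toy "event" word lists (illustrative, not from the patent corpus).
docs = [["explosion", "chemical", "plant"],
        ["energy", "stocks", "rose"],
        ["explosion", "stocks", "fell"]]
vocab, vecs = tfidf_vectors(docs)
```

As the Background notes, such vectors carry only frequency information, which is the limitation the patent's neural approach targets.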
The prior-art method uses only the frequency information of the interrelations of words within events in the causal information; frequency information expresses the mutual sufficient-and-necessary relationship between words along a single dimension, so it is strongly affected by the internal word distribution of the text data. Meanwhile, some words in the sentences contribute little to the causal calculation and belong to the category of noise; using every word in the sentences introduces information redundancy and can even reduce classification accuracy.
Disclosure of Invention
The embodiment of the application provides an event causal relationship judging method, an event causal relationship judging device and a computer readable storage medium, which can be beneficial to the classification of causal events and improve the accuracy of classification results.
The embodiment of the application provides a method for distinguishing event causal relationship, which can comprise the following steps:
acquiring an original event corpus, and preprocessing the event corpus; the preprocessing comprises the following steps: event extraction and event labeling;
acquiring event vector representation according to the event marked by the event;
the event vector is used as input data and is input into a preset deep neural network model;
and calculating the output result of the deep neural network model through a preset calculation function so as to obtain the prediction of the causal relationship among a plurality of events.
The embodiment of the application also provides a device for judging the event causal relationship, which can comprise a processor and a computer readable storage medium, wherein the computer readable storage medium stores instructions, and when the instructions are executed by the processor, the method for judging the event causal relationship is realized.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for distinguishing the event causal relationship is realized.
Compared with the related art, the method comprises the steps of obtaining original event corpus and preprocessing the event corpus; the preprocessing may include: event extraction and event labeling; acquiring event vector representation according to the event marked by the event; the event vector is used as input data and is input into a preset deep neural network model; and calculating the output result of the deep neural network model through a preset calculation function so as to obtain the prediction of the causal relationship among a plurality of events. By the scheme of the embodiment, the classification of causal events is facilitated, and the accuracy of classification results is improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. Other advantages of the present application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the technical solution of the present application and are incorporated in and constitute a part of this specification; together with the embodiments of the present application, they serve to explain the technical solution and do not constitute a limitation of it.
FIG. 1 is a flowchart of a method for determining event cause and effect relationships according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a deep neural network model according to an embodiment of the present application;
fig. 3 is a block diagram of an apparatus for determining event cause and effect relationships according to an embodiment of the present application.
Detailed Description
The present application describes a number of embodiments, but the description is illustrative and not limiting and it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible combinations of features are shown in the drawings and discussed in the detailed description, many other combinations of the disclosed features are possible. Any feature or element of any embodiment may be used in combination with or in place of any other feature or element of any other embodiment unless specifically limited.
The present application includes and contemplates combinations of features and elements known to those of ordinary skill in the art. The embodiments, features and elements of the present disclosure may also be combined with any conventional features or elements to form a unique inventive arrangement as defined in the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive arrangements to form another unique inventive arrangement as defined in the claims. Thus, it should be understood that any of the features shown and/or discussed in this application may be implemented alone or in any suitable combination. Accordingly, the embodiments are not to be restricted except in light of the attached claims and their equivalents. Further, various modifications and changes may be made within the scope of the appended claims.
Furthermore, in describing representative embodiments, the specification may have presented the method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. Other sequences of steps are possible as will be appreciated by those of ordinary skill in the art. Accordingly, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. Furthermore, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the embodiments of the present application.
The embodiment of the application provides a method for discriminating event causal relationship, as shown in fig. 1, the method may include S101-S104:
S101, acquiring an original event corpus, and preprocessing the event corpus; the preprocessing comprises the following steps: event extraction and event labeling.
In an exemplary embodiment of the present application, data preparation, i.e., data acquisition, may be performed first to acquire an original event corpus.
In an exemplary embodiment of the present application, obtaining the original event corpus may include: a large amount of event data is crawled from network resources using crawler technology.
In the exemplary embodiment of the present application, a large amount of news corpus, articles narrating events, and the like may be crawled from network resources such as news websites by using existing crawler technology, and may be used as the event corpus.
In an exemplary embodiment of the present application, after the event corpus is obtained, preprocessing such as event extraction, event labeling (may also be referred to as corpus labeling) and the like may be further performed on the event corpus.
In an exemplary embodiment of the present application, when the preprocessing is event extraction, the preprocessing the event corpus may include:
extracting the core predicate of each event from the sentences of the event corpus by adopting a preset event extraction tool, and forming a binary group from the core predicate and the subject corresponding to the core predicate, each binary group representing one event, so as to realize event extraction.
In the exemplary embodiment of the present application, the obtained corpus data of the event corpus containing event relationships may be processed by a preset event extraction tool, for example an NLP (natural language processing) tool, to extract event representations. Specifically, the core representative word of an event (typically a predicate) may be extracted from a sentence, such as: eat, type, run, explode, die, etc., and an event is represented by the binary relationship of the core predicate and the subject attached to that predicate. For example, one news item, "an explosion occurred at the xx chemical plant in Suzhou", may be extracted as the binary group (explosion, chemical plant) to represent the event. Another news item, "energy stocks rose sharply on xx month xx day", may be extracted as the binary group (rose, energy stocks).
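The tuple extraction step can be sketched as follows. This is a hypothetical illustration: the tokens are assumed to arrive already tagged with dependency roles (ROOT, nsubj), standing in for whatever NLP extraction tool the patent presupposes.

```python
from typing import NamedTuple

class Event(NamedTuple):
    """An event as a (core predicate, subject) pair, per the patent's binary-group scheme."""
    predicate: str
    subject: str

def extract_event(tokens):
    """Toy extractor over tokens pre-tagged with dependency roles.

    A real system would run a parser to find the core predicate and its
    subject; here the roles are assumed to be given in the input.
    """
    predicate = next((t["text"] for t in tokens if t["role"] == "ROOT"), None)
    subject = next((t["text"] for t in tokens if t["role"] == "nsubj"), None)
    if predicate is None or subject is None:
        return None
    return Event(predicate, subject)

# "An explosion occurred at the xx chemical plant" -> (explode, chemical plant)
tokens = [
    {"text": "chemical plant", "role": "nsubj"},
    {"text": "explode", "role": "ROOT"},
]
event = extract_event(tokens)
```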
In an exemplary embodiment of the present application, when the preprocessing is event labeling, the preprocessing the event corpus may include:
respectively marking the events of each two binary groups to indicate the causal relationship between the first event and the second event corresponding to the two binary groups; the causal relationship comprises: the first event is the cause of the second event, the first event is the result of the second event, or the first event has no causal relationship with the second event.
In the exemplary embodiment of the present application, since a supervised machine learning algorithm requires labeled data, at least part of the event extraction results may be labeled on the basis of those results. For example, one labels whether event A (which may be the first event described above) is the cause of event B (which may be the second event described above), whether event B is the cause of event A, or whether the two are unrelated. The label can be represented by a one-hot code: [1, 0, 0] marks that event A is the cause of event B, [0, 1, 0] marks that event A is the result of event B, and [0, 0, 1] marks that event A has no causal relationship with event B.
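The one-hot labeling scheme above can be captured directly in code. A minimal sketch; the relation names and the record layout are illustrative assumptions.

```python
# One-hot encoding of the three causal labels, as described in the patent:
# [1,0,0] A causes B; [0,1,0] A is the result of B; [0,0,1] no causal relation.
LABELS = {
    "cause":  [1, 0, 0],
    "result": [0, 1, 0],
    "none":   [0, 0, 1],
}

def label_pair(event_a, event_b, relation):
    """Attach a one-hot causal label to a pair of extracted events."""
    return {"events": (event_a, event_b), "label": LABELS[relation]}

# Hypothetical pair: the explosion is labeled as the cause of the stock fall.
sample = label_pair(("explode", "chemical plant"), ("fell", "stocks"), "cause")
```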
S102, obtaining event vector representation according to the event marked by the event.
In an exemplary embodiment of the present application, after the tuple with the event annotation is obtained, the vector representation mapping of the event may be performed on the tuple.
In an exemplary embodiment of the present application, the obtaining the event vector representation according to the event after the event labeling may include:
obtaining vector representations p_i and s_i of the two components of each tuple by randomly initializing the vectors; i is a natural number, i = 1, 2, 3, …, m, where m represents the dimension of the vector representations p_i and s_i;
splicing the vector representations p_i and s_i to obtain the event vector representation e_i = [p_i; s_i]; wherein p_i is the sentence vector representation corresponding to the core predicate, and s_i is the sentence vector representation corresponding to the subject.
In an exemplary embodiment of the present application, the words of the obtained binary group may be mapped to the corresponding vectors p_i and s_i by randomly initializing the vectors; the vector results p_i and s_i are then spliced (that is, the entries of s_i are appended in order after the last entry of p_i) to obtain the event vector representation e_i = [p_i; s_i].
In an exemplary embodiment of the present application, the method may further include:
dividing the sentences of the event corpus into single Chinese characters, and obtaining the character vector representation w_j corresponding to each Chinese character by randomly initializing the vectors; calculating over the word vector representations w_j through a preset vector calculation formula to obtain the sentence vector representation s_i corresponding to the subject;
the preset vector calculation formula comprises: s_i = (1/n) · (w_1 + w_2 + … + w_n);
wherein n represents the dimension of the word vector representation, and j is a natural number, j = 1, 2, 3, …, n.
In the exemplary embodiment of the present application, the sentence of the event is segmented into individual Chinese characters, and vector representations are obtained by randomly initializing the vectors; wherein w_j is the word vector corresponding to a Chinese character, and [w_1, w_2, …, w_n] is a sentence consisting of one or more word vectors. The sentence vector representation s_i used for the subject is the average value of these word vectors; depending on the length of the subject, the representation may thus be a word vector or a sentence vector.
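The character-averaging formula s_i = (1/n) · (w_1 + … + w_n) can be sketched in NumPy. The example sentence and the embedding dimension are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8                    # character-embedding dimension (illustrative)
chars = list("能源股大涨")  # "energy stocks rose sharply", one entry per character

# w_j: a randomly initialized vector for each character, stacked row-wise.
W = rng.standard_normal((len(chars), dim))

# Sentence vector: the mean of the character vectors, per the averaging formula.
s_i = W.mean(axis=0)
```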
S103, taking the event vector representation as input data, and inputting a preset deep neural network model.
In an exemplary embodiment of the present application, after the event vector representation e_i is obtained, the causal relationship calculation may be performed on the basis of the obtained e_i and s_i by a pre-constructed model (such as the deep neural network model) based on a supervised machine learning algorithm.
In an exemplary embodiment of the present application, as shown in fig. 2, the deep neural network model may sequentially include: an input layer, a hidden layer and an output layer;
the input layer may include at least a first input unit for inputting an event vector representation e corresponding to a first event to be predicted and a second input end member i 1 and corresponding first subject
Figure BDA0002483724160000068
The second input unit is used for inputting an event vector representation e corresponding to a second event to be predicted i 2 and the corresponding sentence vector for the corresponding second subject +.>
Figure BDA0002483724160000069
The hidden layer may include the following models:
Figure BDA00024837241600000610
wherein ,gi For the output of the hidden layer, W g B is the preset hidden layer network parameter weight g For a preset bias parameter, f represents a nonlinear variation function,
Figure BDA00024837241600000611
representing the event vector representation e i Sentence vector representation corresponding to said subject +.>
Figure BDA00024837241600000612
A vector representation of the composition;
the output layer may include the following models: o (o) i =f(W o ·g i +b o );
Wherein the hidden layer has an output ofInput of the output layer o i O for the output of the output layer i The output dimension of (2) is at least 3; w (W) o 、b o Network parameters of a preset output layer;
output=o of the deep neural network model i 1-o i 2=[x 1 ,x 2 ,x 3 ];
wherein ,oi 1 is the first output result corresponding to the first event to be predicted, o i 2 is a second output result corresponding to the second event to be predicted; x is x 1 、x 2 、x 3 To a numerical value indicative of a causal relationship between the first event to be predicted and the second event to be predicted, respectively.
In an exemplary embodiment of the present application, the vector representations e_i and s_i may serve as the input of the input layer; the vector representations may include at least two groups, such as e_i1 and s_i1, and e_i2 and s_i2. The input layer may pass e_i1, s_i1, e_i2 and s_i2 directly to the hidden layer as its input; the output g_i of the hidden layer may serve as the input of the output layer; and the outputs o_i of the output layer may be combined by computing o_i1 - o_i2, with output = o_i1 - o_i2 taken as the output result of the deep neural network model.
In an exemplary embodiment of the present application, since the preset output dimension of o_i is at least 3, output = o_i1 - o_i2 = [x_1, x_2, x_3]; wherein x_1 may indicate that the first event to be predicted is the cause of the second event to be predicted, x_2 may indicate that the second event to be predicted is the cause of the first event to be predicted, and x_3 may indicate that neither event has a causal relationship with the other.
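The forward pass described above can be sketched in NumPy: hidden layer g_i = f(W_g · [e_i; s_i] + b_g), output layer o_i = f(W_o · g_i + b_o), and the final difference o_i1 - o_i2. The layer sizes and the choice of tanh for the nonlinear function f are illustrative assumptions (the patent does not fix them), and the weights here are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(2)
# [e_i; s_i] input size, hidden size, and 3 causal classes (sizes illustrative).
d_in, d_hid, d_out = 24, 16, 3

# Preset network parameters W_g, b_g (hidden layer) and W_o, b_o (output layer).
W_g, b_g = rng.standard_normal((d_hid, d_in)), np.zeros(d_hid)
W_o, b_o = rng.standard_normal((d_out, d_hid)), np.zeros(d_out)

f = np.tanh  # stand-in for the nonlinear variation function f

def forward(e_i, s_i):
    """g_i = f(W_g . [e_i; s_i] + b_g);  o_i = f(W_o . g_i + b_o)."""
    x = np.concatenate([e_i, s_i])
    g_i = f(W_g @ x + b_g)
    return f(W_o @ g_i + b_o)

# Two events; the model output is the difference o_i1 - o_i2 = [x1, x2, x3].
o1 = forward(rng.standard_normal(16), rng.standard_normal(8))
o2 = forward(rng.standard_normal(16), rng.standard_normal(8))
output = o1 - o2
```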
S104, calculating an output result of the deep neural network model through a preset calculation function to obtain predictions of causal relationships among a plurality of events.
In an exemplary embodiment of the present application, the calculating, by a preset calculation function, the output result of the deep neural network model to obtain a prediction of a causal relationship between a plurality of events may include:
calculating the numerical value with the maximum occurrence probability in the output result of the deep neural network model through the calculation function; and taking the calculation result of the calculation function as a prediction result of causal relation among a plurality of events.
In an exemplary embodiment of the present application, the computing function may include: softmax function.
In an exemplary embodiment of the present application, the output result may be further calculated through a softmax function, so as to obtain a final predicted value.
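The final softmax step can be sketched as follows; the example output vector is illustrative, and the class-index meanings follow the one-hot scheme above.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the 3-dimensional network output."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

# The predicted class is the entry with the highest probability:
# 0 -> A causes B, 1 -> A is the result of B, 2 -> no causal relation.
output = np.array([2.0, 0.5, -1.0])   # illustrative network output
probs = softmax(output)
prediction = int(np.argmax(probs))
```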
In an exemplary embodiment of the present application, a method for classifying and determining event causal relationships is provided, which effectively calculates the causal relationship between events and provides a degree of prediction and assistance for decision analysis in different fields. The method can remedy the shortcomings of existing methods, such as relying only on frequency information and having insufficient expressive power. The embodiment of the application not only uses the frequency information of words in causal events, but also effectively captures semantic information by constructing a deep neural network model and using a word2vec method, which facilitates the classification of causal events and improves the accuracy of the classification results.
The embodiment of the application further provides an event causal relationship discriminating apparatus 1, as shown in fig. 3, may include a processor 11 and a computer readable storage medium 12, where the computer readable storage medium 12 stores instructions, and when the instructions are executed by the processor 11, the event causal relationship discriminating method described in any one of the above is implemented.
In the exemplary embodiments of the present application, any of the embodiments of the method described above are applicable to the embodiment of the apparatus, and are not described herein in detail.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for distinguishing the event causal relationship is realized.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims (7)

1. A method for discriminating event cause and effect relationships, the method comprising:
acquiring an original event corpus, and preprocessing the event corpus; the preprocessing comprises the following steps: event extraction and event labeling;
when the preprocessing is event extraction, the preprocessing the event corpus includes: extracting a core predicate of an event from sentences of the event corpus by adopting a preset event extraction tool, and forming a binary group from the core predicate and the subject corresponding to the core predicate, wherein each binary group respectively represents an event, so as to realize event extraction;
obtaining an event vector representation according to the event after event labeling, comprising: obtaining vector representations p_i and s_i of the two components of each tuple by randomly initializing vectors; i is a natural number, i = 1, 2, 3, …, m, where m represents the dimension of the vector representations p_i and s_i; splicing the vector representations p_i and s_i to obtain the event vector representation e_i = [p_i; s_i]; wherein p_i is the vector representation corresponding to the core predicate, and s_i is the sentence vector representation corresponding to the subject;
the event vector is used as input data and is input into a preset deep neural network model;
calculating an output result of the deep neural network model through a preset calculation function to obtain predictions of causal relationships among a plurality of events;
the deep neural network model sequentially comprises the following components: an input layer, a hidden layer and an output layer;
the input layer at least comprises a first input unit and a second input unit, wherein the first input unit is used for inputting an event vector representation e1 corresponding to a first event to be predicted and a sentence vector representation s1 of the sentence corresponding to the first event; the second input unit is used for inputting an event vector representation e2 corresponding to a second event to be predicted and a sentence vector representation s2 of the sentence corresponding to the second event;

the hidden layer comprises the following model:

h = f(W_h · x + b_h)

wherein h is the output of the hidden layer, W_h is a preset hidden network parameter weight, b_h is a preset bias parameter, f represents a nonlinear activation function, and x represents the vector formed by the event vector representations e1 and e2 together with the corresponding sentence vector representations s1 and s2;

the output layer comprises the following model:

y = W_o · h + b_o

wherein the output h of the hidden layer is the input of the output layer, y is the output of the output layer, and the output dimension of y is at least 3; W_o and b_o are preset network parameters of the output layer;

the output of the deep neural network model is (y1, y2), wherein y1 is a first output result corresponding to the first event to be predicted, and y2 is a second output result corresponding to the second event to be predicted; y1 and y2 are numerical values respectively indicating the causal relationship between the first event to be predicted and the second event to be predicted.
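The layered network of claim 1 can be sketched as a single forward pass. Everything below is an illustrative assumption, not taken from the patent: the dimensions, the choice of tanh for the nonlinear function f, and all parameter names (W_h, b_h, W_o, b_o):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (the patent only requires output dimension >= 3).
EVENT_DIM, SENT_DIM, HIDDEN_DIM, OUT_DIM = 8, 8, 16, 3
IN_DIM = 2 * (EVENT_DIM + SENT_DIM)  # two events, each with a sentence vector

# "Preset" parameters, here randomly initialised for the sketch.
W_h = rng.normal(size=(HIDDEN_DIM, IN_DIM))
b_h = np.zeros(HIDDEN_DIM)
W_o = rng.normal(size=(OUT_DIM, HIDDEN_DIM))
b_o = np.zeros(OUT_DIM)

def forward(e1, s1, e2, s2):
    """Hidden layer h = f(W_h·x + b_h), then output y = W_o·h + b_o."""
    x = np.concatenate([e1, s1, e2, s2])  # input layer: both events + sentences
    h = np.tanh(W_h @ x + b_h)            # f assumed to be tanh
    return W_o @ h + b_o                  # raw scores over >= 3 relation classes

y = forward(rng.normal(size=EVENT_DIM), rng.normal(size=SENT_DIM),
            rng.normal(size=EVENT_DIM), rng.normal(size=SENT_DIM))
print(y.shape)  # (3,)
```

The raw scores y would then be passed through the calculation function of claims 4 and 5 to pick a causal-relationship class.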
2. The event causal relationship discriminating method according to claim 1, wherein, when the preprocessing is event labeling, preprocessing the event corpus comprises:

performing event labeling on each two-tuple respectively to indicate the causal relationship between the first event and the second event corresponding to the two-tuple; the causal relationship comprises: the first event is the cause of the second event, the first event is the result of the second event, or the first event has no causal relationship with the second event.
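A minimal sketch of the three-way labeling scheme in claim 2; the numeric codes, function name, and sample events are hypothetical, chosen only for illustration:

```python
# Hypothetical encoding of the three causal-relationship classes of claim 2.
LABELS = {
    "cause": 0,   # first event is the cause of the second event
    "effect": 1,  # first event is the result of the second event
    "none": 2,    # no causal relationship between the two events
}

def label_pair(first_event, second_event, relation):
    """Attach a causal-relationship label to one (first, second) event two-tuple."""
    if relation not in LABELS:
        raise ValueError(f"unknown relation: {relation}")
    return (first_event, second_event, LABELS[relation])

sample = label_pair("heavy rain", "flooding", "cause")
print(sample)  # ('heavy rain', 'flooding', 0)
```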
3. The event causal relationship discriminating method according to claim 1, further comprising:

dividing the sentences of the event corpus into single Chinese characters, and obtaining, by means of randomly initialized vectors, a word vector representation w_i corresponding to each Chinese character; calculating a sentence vector representation s from the word vector representations w_i through a preset vector calculation formula, wherein the preset vector calculation formula comprises:

s_j = (1/m) · Σ_{i=1}^{m} w_{i,j}

wherein m is the number of Chinese characters in the sentence, w_{i,j} is the j-th component of the word vector of the i-th character, j represents the dimension of the word vector representation, j is a natural number, and j = 1, 2, 3, …, n.
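The character-averaging step of claim 3 can be sketched as follows; the dimension, the sample sentence, and the function names are illustrative assumptions:

```python
import numpy as np

DIM = 4  # word-vector dimension n (illustrative)
rng = np.random.default_rng(1)

def char_vectors(sentence):
    """One randomly initialised vector per Chinese character, as in claim 3."""
    return [rng.normal(size=DIM) for _ in sentence]

def sentence_vector(vectors):
    """Dimension-wise mean of the character vectors: s_j = (1/m) Σ_i w_{i,j}."""
    return np.mean(np.stack(vectors), axis=0)

s = sentence_vector(char_vectors("大雨导致洪水"))
print(s.shape)  # (4,)
```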
4. The event causal relationship discriminating method according to any one of claims 1 to 3, wherein calculating, through a preset calculation function, the output result of the deep neural network model to obtain predictions of the causal relationships among the plurality of events comprises:

calculating, through the calculation function, the value with the maximum probability in the output result of the deep neural network model, and taking the calculation result of the calculation function as the prediction result of the causal relationships among the plurality of events.
5. The method of claim 4, wherein the computing function comprises: softmax function.
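Claims 4 and 5 together amount to a softmax followed by an argmax over the output scores. A minimal sketch (function names are my own; the stability shift is a standard implementation detail, not stated in the patent):

```python
import numpy as np

def softmax(y):
    """Convert raw output scores to a probability distribution (claim 5)."""
    z = np.exp(y - np.max(y))  # subtract the max for numerical stability
    return z / z.sum()

def predict(y):
    """Claim 4: return the class whose probability is maximal."""
    return int(np.argmax(softmax(y)))

print(predict(np.array([0.2, 1.5, -0.3])))  # 1
```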
6. An event causal relationship discriminating apparatus comprising a processor and a computer readable storage medium having instructions stored therein, wherein the instructions, when executed by the processor, implement the event causal relationship discriminating method of any of claims 1-5.
7. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the event causal relationship discriminating method according to any one of claims 1-5.
CN202010385693.6A 2020-05-09 2020-05-09 Event causal relationship discriminating method, device and computer readable storage medium Active CN111709225B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010385693.6A CN111709225B (en) 2020-05-09 2020-05-09 Event causal relationship discriminating method, device and computer readable storage medium


Publications (2)

Publication Number Publication Date
CN111709225A CN111709225A (en) 2020-09-25
CN111709225B (en) 2023-05-09

Family

ID=72536796


Country Status (1)

Country Link
CN (1) CN111709225B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11922129B2 (en) 2021-06-22 2024-03-05 International Business Machines Corporation Causal knowledge identification and extraction
CN115034379A (en) * 2022-05-13 2022-09-09 华为技术有限公司 Causal relationship determination method and related equipment
CN116227598B (en) * 2023-05-08 2023-07-11 山东财经大学 Event prediction method, device and medium based on dual-stage attention mechanism

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016206914A (en) * 2015-04-22 2016-12-08 株式会社日立製作所 Decision-making assistance system and decision-making assistance method
CN110704890A (en) * 2019-08-12 2020-01-17 上海大学 Automatic text causal relationship extraction method fusing convolutional neural network and cyclic neural network
CN110781369A (en) * 2018-07-11 2020-02-11 天津大学 Emotional cause mining method based on dependency syntax and generalized causal network




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant