CN109918646B

CN109918646B - Method, system and device for judging causal relationship of chapters

Info

Publication number: CN109918646B
Application number: CN201910089352.1A
Authority: CN
Inventors: 向露; 刘洋; 张家俊; 周玉; 宗成庆
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Institute of Automation of Chinese Academy of Science
Priority date: 2019-01-30
Filing date: 2019-01-30
Publication date: 2020-08-11
Anticipated expiration: 2039-01-30
Also published as: CN109918646A

Abstract

The invention belongs to the technical field of natural language processing, and particularly relates to a method, a system and a device for judging discourse causal relationship, aiming at solving the problem of judging discourse causal relationship in robot interaction. The method comprises the following steps: based on a language activation model, acquiring a registration event with the highest matching degree for each target text in an input target text pair; based on the registration event corresponding to each target text, calculating the correlation between the two registration events according to the stored registration event sequence in each scene; and calculating the causal relationship of the target text pair based on the correlation of the target text pair and the two registration events. The method and the device can accurately judge the cause and effect relationship of the input target text pair.

Description

Method, system and device for judging causal relationship of chapters

Technical Field

The invention belongs to the technical field of natural language processing, and particularly relates to a method, a system and a device for judging causal relationship of chapters.

Background

Discourse is almost inseparable from people's daily communication. People frequently use discourse relationships to express and convey semantic relationships between contexts (e.g., causal, progressive and adversative relationships). Meanwhile, as robots play an increasingly important role in daily life, how to make a robot understand the discourse relationships in people's everyday expressions has become an unavoidable problem. The difficulty is that no good theory or model framework exists, and existing methods are confined to the text level and have difficulty capturing the inherent semantics of the text, so they still have many shortcomings.

Recently, researchers in the field of artificial intelligence have proposed the idea of episodic language learning, which holds that "language semantics mainly come from the environment". Based on this idea, if a machine is expected to understand language, it should be able to perceive and interact, and judging discourse causal relationships plays an important role in robot interaction.

Disclosure of Invention

In order to solve the above problems in the prior art, that is, to solve the problem of chapter cause and effect relationship judgment in robot interaction, a first aspect of the present invention provides a chapter cause and effect relationship judgment method, including:

step S10, based on the language activation model, obtaining the registration event with the highest matching degree for each target text in the input target text pair; the target text pair consists of two input target texts;

step S20, based on the registration event corresponding to each target text, calculating the correlation between two registration events according to the stored registration event sequence in each scene; the registration event sequence in the scene is a registration event sequence with Boolean time sequence characteristics, which is constructed based on scene structured experience information;

step S30, calculating the causal relationship of the target text pair based on the correlation between the target text pair and the two registration events obtained in step S20;

the language activation model is obtained by training a machine translation model through experience-language activation training corpora; the experience-language activation corpus is constructed based on registered events in the robot experience.

In some preferred embodiments, step S30, "calculating the causal relationship of the target text pair" includes:

f_r＝softmax(tanh(W_c*[s₁；s₂]+W_t*fea_t+b))

wherein f_r is the causal relationship probability value, comprising a causal relationship probability and a non-causal relationship probability; s₁ and s₂ are the sentence vectors obtained by passing the two target texts of the target text pair through a text coding model; W_c is a preset parameter matrix for the text vectors; W_t is a preset parameter matrix for the Boolean time sequence feature; fea_t is the correlation calculated in step S20; b is a preset bias.

In some preferred embodiments, step S20 "calculate the correlation between two registration events" is performed by:

wherein fea_t is the calculated correlation; e₁ and e₂ are the registration events matched to the first target text and the second target text of the target text pair, respectively; P(e₁) is the probability that e₁ occurs in the structured experience information of all stored scenes; P(e₂|e₁) is the probability that e₂ occurs in the structured experience information in which e₁ occurs.
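The two probabilities defined above can be estimated by counting, as in the following minimal sketch. It assumes that the counting unit is a whole scene and that each scene is summarized as the set of registration events it contains; the function name, the scene names and the event names are hypothetical. The sketch does not reproduce how the two probabilities are combined into fea_t, and it ignores the order in which the events occur within a scene.

```python
from typing import Dict, Set, Tuple


def occurrence_probabilities(
    scenes: Dict[str, Set[str]], e1: str, e2: str
) -> Tuple[float, float]:
    """Estimate (P(e1), P(e2|e1)) by counting scenes (assumed counting unit)."""
    scenes_with_e1 = [events for events in scenes.values() if e1 in events]
    # P(e1): share of all stored scenes in which e1 occurs.
    p_e1 = len(scenes_with_e1) / len(scenes) if scenes else 0.0
    # P(e2|e1): share of the scenes containing e1 in which e2 also occurs.
    if scenes_with_e1:
        both = sum(1 for events in scenes_with_e1 if e2 in events)
        p_e2_given_e1 = both / len(scenes_with_e1)
    else:
        p_e2_given_e1 = 0.0
    return p_e1, p_e2_given_e1


# Hypothetical stored scenes, each reduced to the set of events it contains.
scenes = {
    "scene_a": {"e2"},
    "scene_b": {"e1", "e2"},
    "scene_c": {"e1", "e2"},
    "scene_d": {"e3"},
}

p_e1, p_e2_given_e1 = occurrence_probabilities(scenes, "e1", "e2")
print(p_e1, p_e2_given_e1)  # 0.5 1.0
```

Here e₁ occurs in two of the four scenes and e₂ occurs in both of them, which gives the two values printed above.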

In some preferred embodiments, the registration event,

e_i＝{obj_i}，i∈R_i

wherein e_i is the i-th registration event; obj_i is an object in the activated state in the registration event; R_i is the set of indices, among all objects in the language activation model, of the objects in the registration event.

In some preferred embodiments, the experience-language activation training corpus is expressed as

E_j＝{obj_i}：{LS_i}，i∈R_i

wherein E_j is the j-th experience-language activation training corpus entry; LS_i is the language text corresponding to object obj_i.

In some preferred embodiments, the registration event sequence in each scene in step S20 is obtained by:

step A10, in the working environment of the robot, acquiring, through the perception device of the robot, object information produced by the interaction between the robot and objects; the object information comprises object attributes and object state information obtained through the interaction between the robot and the objects;

step A20, structuring the object information acquired in step A10, organizing and storing it in chronological order, and using it as the structured experience information obtained by the robot operating in the working environment;

step A30, removing the objects in the non-activated state from the structured experience information to obtain the registration event sequence in the corresponding scene.

In some preferred embodiments, the structured empirical information is expressed as

E_0-t＝{f₀,f₁,f₂,…,f_t}

f_t＝{obj₁:a₁；…obj_i:a_i；…obj_n:a_n；}

wherein E_0-t is all experience from time 0 to time t; f_t is all the object information input at time t; obj_i is the i-th object; a_i is the activation state of object obj_i, where a_i = 1 indicates that the i-th object is activated and a_i = 0 indicates that it is not activated; i ∈ [1, n].

The second aspect of the invention provides a chapter causal relationship judging system, which comprises a registration event matching module, a correlation calculation module and a causal relationship calculation module;

the registration event matching module is configured to acquire, based on the language activation model, the registration event with the highest matching degree for each target text in the input target text pair; the target text pair consists of two input target texts;

the correlation calculation module is configured to calculate the correlation between the two registration events according to the stored registration event sequences in each scene based on the registration event corresponding to each target text; the registration event sequence in the scene is a registration event sequence with Boolean time sequence characteristics, which is constructed based on scene structured experience information;

the causal relationship calculation module is configured to calculate the causal relationship of the target text pair based on the target text pair and the correlation between the two registration events obtained by the correlation calculation module.

In a third aspect of the present invention, a storage device is provided, in which a plurality of programs are stored, the programs being adapted to be loaded and executed by a processor to implement the discourse cause and effect relationship determination method described above.

In a fourth aspect of the present invention, a processing apparatus is provided, which includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the discourse cause and effect determination method described above.

The invention has the beneficial effects that:

correlation calculation is carried out through the registration event sequence of the scene, and the causal relationship of the target text pair is calculated by further combining the registration events of the target text pair, so that an accurate causal relationship judgment result can be obtained.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is a schematic flow chart of a chapter cause and effect relationship determination method according to an embodiment of the present invention;

FIG. 2 is an example of chapter causality determination according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

The basic idea of the method is to construct a curing network for the robot from the signal input of the robot working platform and to model text semantics better with deep neural network techniques, so as to arrive at the final causal relationship judgment.

The discourse causal relationship judging method of the invention comprises steps S10, S20 and S30, as shown in FIG. 1.

In order to explain the discourse causal relationship judging method of the invention more clearly, the following three aspects are described in detail: 1. construction of the registration event sequence in each scene; 2. construction of the language activation model; 3. judgment of the discourse causal relationship.

1. Construction of registration event sequences in each scene

Step A10, in the working environment of the robot, the object information obtained by the interaction between the robot and the object is obtained by the perception device of the robot; the object information comprises object attributes and object state information obtained by interaction between the robot and the object.

The robot is placed in a working environment, and its perception device receives signals from the environment (namely, the object information produced by the interaction between the robot and objects). The acquired information is continuously updated as the robot keeps interacting with the working environment. For example, a scene contains a large number of objects that interact with the robot. Each object has attributes, and each attribute has corresponding states. In particular, the state of an object changes after the robot interacts with it. The robot needs to perceive these attributes and states to obtain information about the objects. For example, when the robot faces the object "cow", the attributes and states of that object are perceived and represented (the object is denoted obj2, with name = cow).

The object information obtained in step A10 from the interaction between the robot and objects is stored in the long-term memory of the robot. Specifically, the robot stores the collected object attribute and state information in long-term memory, where corresponding Boolean nodes are reserved for the attributes and states of the different objects.

The structuring process structures the signals and stores them. Specifically, structure points are reserved for the Boolean information nodes inside the robot to store the information, and the information is organized and stored in chronological order. That is, let the input at time t be f_t; then the structured experience information is expressed as:

E_0-t＝{f₀,f₁,f₂,…,f_t}

wherein E_0-t is all experience from time 0 to time t; f_t is all the object information (attributes and states) input at time t.

Meanwhile, the experience at each time includes activation information (expressed by boolean values) of all objects, and the formal expression is:

f_t＝{obj₁:a₁；…obj_i:a_i；…obj_n:a_n}

wherein obj_i is the i-th object; a_i is the activation state of object obj_i, where a_i = 1 indicates that the i-th object is activated and a_i = 0 indicates that it is not activated; i ∈ [1, n].

For example, at time 5, f₅＝{obj₁:0；obj₂:1；obj₃:1；obj₄:1} indicates that, among all objects at that time, obj₂, obj₃ and obj₄ are in the activated state and obj₁ is in the non-activated state.

After the non-activated objects are removed (step A30), the f₅ above corresponds to the registration sequence T₅＝{obj₂；obj₃；obj₄} in the scene.
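The structuring and filtering described above can be sketched as follows. This is a minimal illustration that assumes each frame f_t is held as a dictionary from object name to Boolean activation state; the names `Frame`, `registration_sequence` and `experience`, and the first frame in the list, are hypothetical.

```python
from typing import Dict, List

# One frame f_t: object name -> activation state (1 = activated, 0 = not).
Frame = Dict[str, int]


def registration_sequence(frame: Frame) -> List[str]:
    """Drop the non-activated objects of one frame (step A30)."""
    return [obj for obj, active in frame.items() if active == 1]


# Structured experience E_0-t as a time-ordered list of frames (step A20).
experience: List[Frame] = [
    {"obj1": 1, "obj2": 0, "obj3": 0, "obj4": 0},  # hypothetical earlier frame
    {"obj1": 0, "obj2": 1, "obj3": 1, "obj4": 1},  # f5 from the example above
]

sequences = [registration_sequence(frame) for frame in experience]
print(sequences[-1])  # ['obj2', 'obj3', 'obj4']
```

Keeping the frames in a list preserves their chronological order, which the later correlation step relies on.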

2. Construction of the language activation model

(1) Registration events in the robot experience are constructed.

According to the experience of the robot, these objects exhibit a boolean state, and if the robot receives a corresponding signal in the working environment, the corresponding object is activated. In the case where objects belonging to an event are all activated, the event is an activated state. The registration event expression is as follows:

e_i＝{obj_i}，i∈R_i

For example, the registration event obtained from the f₅ above is e₅＝{obj₂；obj₃；obj₄}, in which case R_i＝[2,3,4]. When e₅ is activated, it may be denoted as e₅:1＝{obj₂:1；obj₃:1；obj₄:1}.
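The activation rule for a registration event (the event is activated when all of its objects are activated) can be sketched as below. The representation of an event as a set of object indices and the names `is_event_active` and `e_other` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet


def is_event_active(event: FrozenSet[int], frame: Dict[int, int]) -> bool:
    """An event is activated only if every object it contains is activated."""
    return all(frame.get(index, 0) == 1 for index in event)


f5 = {1: 0, 2: 1, 3: 1, 4: 1}   # activation states of obj1..obj4 at time 5
e5 = frozenset({2, 3, 4})       # R_i = [2, 3, 4]
e_other = frozenset({1, 2})     # hypothetical event that also needs obj1

print(is_event_active(e5, f5))       # True
print(is_event_active(e_other, f5))  # False
```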

(2) Based on the constructed registration events, an experience-language activation training corpus is constructed.

This step can be viewed as a symbolic translation process, so the corpus takes the form of a parallel corpus. Specifically, the objects inside the experience are further solidified into a sequence of symbols, e.g. obj₁₅ becomes Obj_15. When a text mentions something that refers to this symbol, the symbol and its associated concepts are activated. For example, in our actual experiment, obj₁₅ (i.e., Obj_15) corresponds to the word for "cow" in Chinese and to "cow" in English. This step constructs a parallel corpus that pairs such language text with the internal experience symbols, as follows:

E_j＝{obj_i}：{LS_i}，i∈R_i

wherein E_j is the j-th experience-language activation training corpus entry; LS_i is the language text corresponding to object obj_i.

For example, if the language texts corresponding to objects obj₂, obj₅ and obj₁₅ are "Adam", "attack" and "cow", the training corpus entry can be expressed as

obj₂,obj₅,obj₁₅∶Adam attack cow
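Building one entry of such a parallel corpus can be sketched as follows. The function `make_corpus_entry`, the `lexicon` mapping and the `Obj_<n>` symbol spelling used in the code are illustrative assumptions; only the pairing of object symbols with language text comes from the description above.

```python
from typing import Dict, List, Tuple


def make_corpus_entry(object_ids: List[int], lexicon: Dict[int, str]) -> Tuple[str, str]:
    """Return (experience symbol sequence, language text) for one event."""
    symbols = " ".join(f"Obj_{index}" for index in object_ids)
    text = " ".join(lexicon[index] for index in object_ids)
    return symbols, text


# Hypothetical lexicon: object index -> language text LS_i.
lexicon = {2: "Adam", 5: "attack", 15: "cow"}

symbols, text = make_corpus_entry([2, 5, 15], lexicon)
print(f"{symbols} : {text}")  # Obj_2 Obj_5 Obj_15 : Adam attack cow
```

When such pairs are used to train a translation model, the language text would be the source side and the symbol sequence the target side, since the language activation model maps text to experience symbols.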

(3) The machine translation model is trained on the experience-language activation training corpus to obtain the language activation model.

Training is carried out on the experience-language activation training corpus with a machine translation model. In this embodiment, a sequence-to-sequence neural machine translation model is adopted; other machine translation models may of course be adopted.

3. Judgment of causal relationship of chapters

Step S10, based on the language activation model, obtaining the registration event with the highest matching degree for each target text in the input target text pair; the target text pair consists of two input target texts.

The construction of the language activation model has been described in detail above; the corresponding registration event can be obtained directly through this model. The key steps by which the model processes a text are shown below.

(1) The input text is translated by the language activation model into a sequence of symbols of the registration event, i.e. the object symbols contained in the registration event inside the robot. The formula is as follows:

y₁,y₂,y₃,…,y_j＝P(y_i|x₁,x₂,x₃,…,x_n)

wherein y₁,y₂,y₃,…,y_j is the target symbol sequence; x₁,x₂,x₃,…,x_n is the language symbol sequence; P(y_i|x₁,x₂,x₃,…,x_n) denotes decoding (i.e., translating) the object symbols according to the language symbol sequence.

In some embodiments, two word vector sets, Emb_l and Emb_e, may be constructed for the language side and the experience symbol side, respectively.

(2) The language activation model activates the corresponding experience information of the robot according to the input target text. The translated object symbol sequence is matched against all registration events stored in the robot, and the event with the highest matching degree is selected as the activated event. Here, a long short-term memory (LSTM) model may be used to generate the sequence vector of the target symbols, as shown in the following formula:

LSTM(Emb_e[obj₁],Emb_e[obj₂],…,Emb_e[obj_j])＝v_e

Then, the event with the highest matching degree is obtained according to the cosine similarity of the vectors,

wherein v_e is the vector obtained by passing the object symbol sequence translated from the target text through the LSTM network and Emb_e; v_i is the vector obtained by passing the i-th registration event in the language activation model through the LSTM network and Emb_e; sim(v_i,v_e) is the cosine similarity between the two vectors, i.e. (v_i·v_e)/(‖v_i‖‖v_e‖).

And selecting the registration event with the highest cosine similarity as an activation event, namely the registration event corresponding to the input target text.
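The selection by cosine similarity can be sketched as below. The sketch assumes the vectors v_e and v_i have already been produced by some encoder; the LSTM and the embedding set Emb_e are not reproduced, and the three-dimensional vectors and the names `cosine`, `best_event` and `event_vectors` are made up for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_event(v_e: Sequence[float], event_vectors: Dict[str, List[float]]) -> str:
    """Return the name of the stored event whose vector is closest to v_e."""
    return max(event_vectors, key=lambda name: cosine(event_vectors[name], v_e))


# Hypothetical vectors v_i of the stored registration events.
event_vectors = {
    "e1": [1.0, 0.0, 0.0],
    "e2": [0.0, 1.0, 0.0],
    "e3": [0.7, 0.7, 0.0],
}
v_e = [0.9, 0.1, 0.0]  # hypothetical vector of the translated symbol sequence

print(best_event(v_e, event_vectors))  # e1
```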

Step S20, based on the registration event corresponding to each target text, calculating the correlation between two registration events according to the stored registration event sequence in each scene; the registration event sequence in the scene is a registration event sequence with Boolean time sequence characteristics, which is constructed based on scene structured experience information.

The registration event sequence in each scene is obtained by removing the objects in the non-activated state from the structured experience information.

When the causal relationship is judged, two sentences are input each time, so two registration events need to be retrieved at a time. For the obtained events e₁ and e₂, the corresponding Boolean time sequence features of the two events are then retrieved from the registration event sequences of each scene stored by the robot, in order to calculate the correlation between the two registration events.

The correlation of the two registration events is calculated as follows:

Step S30, calculating the causal relationship of the target text pair based on the target text pair and the correlation between the two registration events obtained in step S20.

The method for calculating the causal relationship of the target text pair comprises the following steps:

f_r＝softmax(tanh(W_c*[s₁；s₂]+W_t*fea_t+b))

wherein f_r is the causal relationship probability value, comprising the causal relationship probability (Cause) and the non-causal relationship probability (Non-Cause); s₁ and s₂ are the sentence vectors obtained by passing the two target texts of the target text pair through a text coding model; W_c is a preset parameter matrix for the text vectors; W_t is a preset parameter matrix for the Boolean time sequence feature; b is a preset bias.

In this embodiment, a softmax classification model is adopted. The softmax function outputs the probability values of the two classes, and the class with the larger probability is taken as the output. For example, if softmax outputs a probability of 0.7 for Cause and 0.3 for Non-Cause, Cause is selected as the output.
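The classification formula can be sketched in plain Python as follows. The sketch assumes that fea_t is a single scalar, that the sentence vectors are two-dimensional, and that W_c, W_t and b hold made-up values; in the described method these parameters and the sentence encoder would be learned, and the names used here are hypothetical.

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(z)
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def causal_probabilities(
    s1: List[float], s2: List[float], fea_t: float,
    W_c: List[List[float]], W_t: List[float], b: List[float],
) -> List[float]:
    """f_r = softmax(tanh(W_c*[s1; s2] + W_t*fea_t + b)), scalar fea_t assumed."""
    x = s1 + s2  # list concatenation implements [s1; s2]
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + w_t * fea_t + bias)
        for row, w_t, bias in zip(W_c, W_t, b)
    ]
    return softmax(hidden)


# Made-up parameters: row 0 scores Cause, row 1 scores Non-Cause.
W_c = [[0.5, -0.2, 0.1, 0.3], [-0.3, 0.4, 0.2, -0.1]]
W_t = [1.0, -1.0]
b = [0.0, 0.0]

probs = causal_probabilities([0.2, -0.1], [0.4, 0.3], 0.8, W_c, W_t, b)
label = "Cause" if probs[0] >= probs[1] else "Non-Cause"
print(label)  # Cause
```

With these illustrative values the first row evaluates to tanh(1.05) and the second to tanh(-0.85), so the Cause class receives the larger probability.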

The text coding model may adopt LSTM (long-short time memory model), BOW (mean bag of words model based on word vectors), CNN (convolutional neural network model), and may also adopt other models.

As shown in FIG. 2, the target text pair input in this example is S1: "Adam anchors the cow with shock" and S2: "Adam gains some beef". In Part1, the registration events of the target text pair are acquired through the language activation model: e1: "Adam anchors cow chord" and e2: "Adam gains some beef". In Part2, the events are located in the perception signal sequences received by the robot in each scene (S followed by a number denotes a scene; each scene consists of many consecutive time instants, such as T9-T10-T11, and each time instant contains a large number of events). It is found that e1 appears in scenes S87 and S72 and that e2 appears in scenes S54, S87 and S72. The time instants of occurrence are located through the arrows in the figure (i.e., the events are located in the perception signal sequences in the robot memory): e1 is located at time instants T0 and T1 of scenes S87 and S72, and e2 is located at time instant T11 of scene S54 and at time instants T2 and T3 in scenes S72 and S87. The correlation between the two registration events is then calculated. In Part3, the target texts S1 and S2 are each encoded to obtain sentence vectors. In Part4, the causal judgment is made based on the correlation obtained in Part2 and the two sentence vectors obtained in Part3, and the judgment result is output.

The invention can be implemented in the Python programming language with the TensorFlow deep learning framework; the development platform is Ubuntu Linux 16.04. The invention can also rely on Malmo, the general artificial intelligence platform released by Microsoft, as the working environment of the robot. Because the written program does not use any platform-dependent code, the system implementation can also run on the Windows operating system.

A discourse causal relationship judging system according to an embodiment of the invention comprises a registration event matching module, a correlation calculation module and a causal relationship calculation module.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiments, and will not be described herein again.

It should be noted that the discourse cause and effect relationship determination system provided in the foregoing embodiment is only illustrated by the division of the functional modules, and in practical applications, the functions may be allocated by different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the foregoing embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.

The storage device of an embodiment of the present invention stores a plurality of programs, and the programs are suitable for being loaded and executed by a processor to realize the discourse cause and effect relationship judging method.

The processing device of one embodiment of the invention comprises a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the discourse cause and effect determination method described above.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

And (3) experimental verification:

in the corpus constructed in the experiment, each example is composed of two sentences, and each sentence describes a separate event. By training on this corpus, the method of the present invention is compared to traditional plain text-based methods. Table 1 shows the basic statistics of the corpus of the causal relationship of chapters constructed in this experiment.

Attached table 1

          Causal relationship   Non-causal relationship   Total   Consistency
Number    1619                  3233                      4852    98.2%

Attached table 2 shows the results of the present invention and of several existing discourse causal relationship judging methods. The upper part of the table lists the three neural network methods currently in mainstream use for processing sentence information (BOW, CNN, LSTM). The experiment takes two sentences as input and examines whether the model can judge whether the sentence pair is causal. It can be seen that the present invention (Our Model) performs significantly better on the data set than the conventional text-only methods; Model + BOW, Model + CNN and Model + LSTM in table 2 indicate that the BOW, CNN and LSTM text coding models, respectively, are used in step S30 of the method of the invention.

The obvious difference from the traditional text method is that the invention uses experience information collected from the robot platform. This process is very similar to human beings, so the key to the present invention is to propose a new approach, distinguished from the traditional thinking, which emphasizes the importance of the experience related to text.

Attached table 2

Those of skill in the art would appreciate that the various illustrative modules, method steps, and modules described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that programs corresponding to the software modules, method steps may be located in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as electronic hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.

The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims

1. A method for judging causal relationship of chapters is characterized by comprising the following steps:

2. The discourse causal relationship determination method according to claim 1, wherein the step S30 of calculating the causal relationship of the target text pair comprises:

f_r＝softmax(tanh(W_c*[s₁；s₂]+W_t*fea_t+b))

3. The discourse causal relationship determination method according to claim 1, wherein step S20 "calculating the correlation between two registered events" comprises:

4. The discourse cause and effect relationship determination method according to claim 1, wherein the registration event,

e_i＝{obj_i}，i∈R_i

5. The discourse causal relationship determination method of claim 4, wherein the empirical-language-activated corpus is expressed as

E_j＝{obj_i}：{LS_i}，i∈R_i

6. The discourse cause and effect relationship judgment method according to claim 1, wherein the registration event sequence in each scene in step S20 is obtained by:

step A20, carrying out structuring processing on the object information obtained in step A10, organizing and storing the object information according to time sequence, and using the object information as structured experience information obtained by the operation working environment of the robot;

7. The method of claim 6, wherein the structured empirical information is expressed as

E_0-t＝{f₀，f₁，f₂，...，f_t}

f_t＝{obj₁：a₁；...obj_i：a_i；...obj_n：a_n；}

wherein E_0-t is all experience from time 0 to time t; f_t is all the object information input at time t; obj_i is the i-th object; a_i is the activation state of object obj_i, where a_i = 1 indicates that the i-th object is activated and a_i = 0 indicates that it is not activated; i ∈ [1, n].

8. The system is characterized by comprising a registration event matching module, a correlation calculation module and a causal relationship calculation module;

the causal relationship calculation module is configured to calculate a causal relationship of the target text pair based on the target text pair and the correlation between the two registration events obtained by the correlation calculation module;

9. A storage device having stored therein a plurality of programs, wherein the programs are adapted to be loaded and executed by a processor to implement the discourse cause and effect judgment method of claims 1-7.

10. A processing device comprising a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; wherein the program is adapted to be loaded and executed by a processor to implement the chapter cause and effect judgment method of claims 1-7.