CN109710943A - Contradictory statement identification method and system and clause logic identification method and system - Google Patents

Contradictory statement identification method and system and clause logic identification method and system

Info

Publication number
CN109710943A
CN109710943A (Application CN201811635859.4A)
Authority
CN
China
Prior art keywords
text
hypothesis
model
vector sequence
output vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811635859.4A
Other languages
Chinese (zh)
Other versions
CN109710943B (en)
Inventor
鞠剑勋
刘晔诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Information Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Information Technology Shanghai Co Ltd filed Critical Ctrip Travel Information Technology Shanghai Co Ltd
Priority to CN201811635859.4A priority Critical patent/CN109710943B/en
Publication of CN109710943A publication Critical patent/CN109710943A/en
Application granted granted Critical
Publication of CN109710943B publication Critical patent/CN109710943B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a contradictory statement identification method and system and a clause logic identification method and system. The contradictory statement identification method comprises: constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, the model being used for predicting the probability that two texts contradict each other; converting the two compared texts into matrices respectively as the input of the model; and obtaining, through the calculation of the model, the probability that the two compared texts contradict each other. The invention solves the natural language inference problem using deep learning technology, can reduce the labor cost of feature extraction, and substantially improves the accuracy of inference.

Description

Contradictory statement identification method and system and clause logic identification method and system
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
Background
Inference is one of the core research topics in the field of artificial intelligence, and natural language inference is an important research branch in the natural language processing direction. It is the research basis of tasks such as question answering systems, information retrieval and automatic summarization, and has wide application space in many business scenarios. One core problem in natural language inference is: given two texts, a premise and a hypothesis, determine whether there is a contradiction between the two at the semantic level.
Because of the diversity of language expression and the ambiguity of semantic understanding, especially the existence of a large number of polysemous words and synonyms in Chinese texts, natural language inference is more difficult than expected, and there is still considerable room to improve its accuracy.
In addition, the logical relationships among the relevant clauses of a product are usually checked manually; intelligent means are lacking, considerable labor cost is consumed, and the accuracy of the check also needs to be improved.
Disclosure of Invention
The invention aims to overcome the defects in the prior art that natural language inference is difficult and its accuracy is insufficient, and provides a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
The invention solves the technical problems through the following technical scheme:
a contradiction sentence identification method based on deep learning comprises the following steps:
constructing a contradictory statement identification model based on a BiLSTM (bidirectional long short-term memory network)-Attention mechanism, wherein the model is used for predicting the probability of contradiction between two texts;
respectively converting the two compared texts into matrices to be used as the input of the model;
and obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
Preferably, the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
Preferably, the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
Preferably, the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification method based on deep learning, the clause logic identification method comprising:
respectively converting the two compared clause texts into matrices to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
A contradictory sentence identification system based on deep learning, the contradictory sentence identification system comprising:
the model construction module is used for constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts;
the model input module is used for respectively converting the two compared texts into matrices to be used as the input of the model;
and the model output module is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
Preferably, the model building module is configured to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
Preferably, the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
Preferably, the model building module is further configured to preprocess the original training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification system based on deep learning, the clause logic identification system comprising:
the clause input module is used for respectively converting the two compared clause texts into matrices to be used as the input of the contradictory sentence identification model constructed by the contradictory sentence identification system;
and the probability output module is used for obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The positive progress effects of the invention are as follows: the invention solves the natural language inference problem by using the deep learning technology, can reduce the labor cost of feature extraction, and greatly improves the accuracy of inference.
Drawings
Fig. 1 is a flowchart of a contradictory sentence identification method based on deep learning according to embodiment 1 of the present invention.
Fig. 2 is a schematic block diagram of a contradictory sentence recognition system based on deep learning according to embodiment 3 of the present invention.
Fig. 3 is a schematic block diagram of a clause logic identification system based on deep learning according to embodiment 4 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
Fig. 1 shows a contradiction sentence identification method based on deep learning according to the present embodiment. The contradictory statement identification method includes:
step 11: based on a BilSTM-Attention mechanism, a contradiction sentence identification model is constructed, and the model is used for predicting the probability of contradiction between two texts.
Step 12: the two texts being compared are preprocessed.
Step 13: respectively converting the two texts to be compared into matrices to be used as the input of the model.
Step 14: obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
In this embodiment, step 11 may specifically include:
(1) acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypothesis text into matrices according to the word vector of each word (a word in this embodiment can also be understood as a single character; the two are treated as equivalent), and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
The specific process of converting the premise text into a matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premise text into a word vector of fixed length n (for example, n = 100);
(2-2) setting the maximum number of words of a text participating in training to m (for example, m = 100); if the actual number of words in the premise text is less than m, the missing part is padded with <PAD> characters, and if the actual number of words in the premise text exceeds m, the words beyond the m-th are discarded;
(2-3) converting the premise text into an m × n matrix [x_1, x_2, …, x_m] as the input, according to the word vector of each word in the text.
Referring to the conversion processes (2-1)-(2-3) for the premise text, the hypothesis text can likewise be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated.
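Steps (2-1)-(2-3) amount to an embedding lookup with padding and truncation. A sketch with NumPy follows; the embedding table here is randomly initialized purely for illustration (the patent does not say how the word vectors are obtained):

```python
import numpy as np

N = 100  # word-vector length n of step (2-1)
M = 100  # maximum number of words m of step (2-2)
PAD_ID = 0


def text_to_matrix(ids, embeddings):
    """Pad with <PAD> ids (or truncate) to exactly M words, then stack
    the word vectors into the m x n input matrix [x_1, ..., x_m]."""
    ids = list(ids)[:M] + [PAD_ID] * max(0, M - len(ids))
    return embeddings[ids]  # shape (M, N)


# illustrative usage with a random (vocab_size, N) embedding table
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50000, N)).astype("float32")
premise_matrix = text_to_matrix([5, 17, 42], embeddings)
assert premise_matrix.shape == (M, N)
```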
Taking the matrix converted from the premise text as the input of a BiLSTM unit to obtain the preliminary semantic representation corresponding to the premise text, namely the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m]. The specific process is as follows:
(2-4) Let the outputs of the forget gate unit and the input gate unit at the current time be f_t and i_t respectively, and let the candidate update of the cell state at the current time be C̃_t, satisfying:
f_t = σ(W_f·[x_t, h_{t-1}] + b_f)
i_t = σ(W_i·[x_t, h_{t-1}] + b_i)
C̃_t = tanh(W_c·[x_t, h_{t-1}] + b_c)
wherein x_t is the input at the current time, h_{t-1} is the hidden-layer state at the previous time, W_f, W_i and W_c are respectively the weight matrices of the forget gate unit, the input gate unit and the cell-state update, b_f, b_i and b_c are respectively the bias vectors of the forget gate unit, the input gate unit and the cell-state update, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;
(2-5) updating the cell state C_t by the formula C_t = f_t * C_{t-1} + i_t * C̃_t;
(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form a vector sequence [h_1, h_2, …, h_m] of m steps:
o_t = σ(W_o·[x_t, h_{t-1}] + b_o)
h_t = o_t * tanh(C_t)
wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence [→h^p_1, …, →h^p_m] corresponding to the premise text, and processing the premise text in the backward direction through steps (2-4)-(2-6) to obtain the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text;
(2-8) combining the forward output vector sequence [→h^p_1, …, →h^p_m] and the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text along the last dimension, i.e. h^p_t = [→h^p_t; ←h^p_t], to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted [h^p_1, h^p_2, …, h^p_m].
Referring to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain its forward output vector sequence [→h^h_1, …, →h^h_m], and in the backward direction through steps (2-4)-(2-6) to obtain its backward output vector sequence [←h^h_1, …, ←h^h_m]; the two are combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted [h^h_1, h^h_2, …, h^h_m]. The detailed process is not repeated.
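Steps (2-4)-(2-8) describe a standard bidirectional LSTM. In a framework such as PyTorch (an assumption; the patent names no framework), the forward pass, backward pass and last-dimension concatenation are all provided by a single bidirectional layer, so a sketch can be very short. Whether the premise and the hypothesis share one BiLSTM is not stated; sharing is assumed here:

```python
import torch
import torch.nn as nn

N, M, HIDDEN = 100, 100, 128  # n, m, and an assumed hidden size

# bidirectional=True runs steps (2-4)-(2-6) once forward and once
# backward, then concatenates the two h_t along the last dimension
# as in step (2-8); every output vector therefore has size 2*HIDDEN.
bilstm = nn.LSTM(input_size=N, hidden_size=HIDDEN,
                 batch_first=True, bidirectional=True)

premise = torch.randn(1, M, N)     # one premise text as an m x n matrix
hypothesis = torch.randn(1, M, N)  # one hypothesis text as an m x n matrix

h_p, _ = bilstm(premise)     # [h^p_1, ..., h^p_m], shape (1, M, 2*HIDDEN)
h_h, _ = bilstm(hypothesis)  # [h^h_1, ..., h^h_m], shape (1, M, 2*HIDDEN)
```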
(3) Performing maximum pooling on the preliminary semantic representation [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and further computing the query vector q from h̄.
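Continuing the sketch, the maximum pooling along the step dimension is an element-wise max over the m positions; treating the pooled vector directly as the query q is an assumption here, since the exact formula for q did not survive extraction of the original document:

```python
# element-wise max over the M steps: final hypothesis semantics h_bar
h_bar = h_h.max(dim=1).values  # shape (1, 2*HIDDEN)

q = h_bar                      # assumed: query vector q = h_bar
```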
(4) Taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula
c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t is the matching score between q and the t-th premise vector,
obtaining the Attention vector a: a = tanh(W_att1·c + W_att2·q).
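Step (4) can be sketched as standard dot-product attention over the premise positions; the score function e_t = qᵀ·h^p_t is a reconstruction, and only the combination a = tanh(W_att1·c + W_att2·q) is given verbatim in the document:

```python
import torch.nn.functional as F

W_att1 = nn.Linear(2 * HIDDEN, 2 * HIDDEN, bias=False)
W_att2 = nn.Linear(2 * HIDDEN, 2 * HIDDEN, bias=False)

scores = torch.einsum("bd,btd->bt", q, h_p)  # e_t = q . h^p_t, shape (1, M)
alpha = F.softmax(scores, dim=-1)            # attention weights alpha_t
c = torch.einsum("bt,btd->bd", alpha, h_p)   # context vector c

a = torch.tanh(W_att1(c) + W_att2(q))        # Attention vector a
```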
(5) Feeding the Attention vector a into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ:
loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)]
(7) Minimizing the loss using the Adam (adaptive moment estimation) optimization algorithm, and performing multiple iterations of training to obtain the final model.
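Steps (5)-(7) correspond to a small classification head trained with binary cross-entropy and Adam. A sketch continuing from the blocks above; the layer sizes and learning rate are assumptions:

```python
classifier = nn.Sequential(              # binary-classification DNN head
    nn.Linear(2 * HIDDEN, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),                        # y_hat in (0, 1): contradiction prob.
)

params = (list(bilstm.parameters()) + list(W_att1.parameters())
          + list(W_att2.parameters()) + list(classifier.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

y = torch.ones(1, 1)                     # true label y: 1 = contradictory
y_hat = classifier(a)                    # predicted value y_hat
loss = F.binary_cross_entropy(y_hat, y)  # -[y log y_hat + (1-y) log(1-y_hat)]

optimizer.zero_grad()
loss.backward()
optimizer.step()
```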
Through the above steps, when the semantics of the hypothesis text are extracted, they are condensed into a single vector by the maximum pooling operation, and this vector is then matched against each word vector in the premise text, which guarantees inference accuracy while greatly reducing the training cost of the model.
In step 12, the preprocessing of the two texts to be compared is the same as the preprocessing in step (1). In step 13, referring to the conversion processes (2-1)-(2-3) for the premise text, the two texts to be compared can each be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated. In step 14, whether the two compared texts are contradictory can further be decided from the probability: for example, if the probability exceeds 50%, the two compared texts can generally be considered contradictory, and if it does not exceed 50%, they can generally be considered not contradictory.
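Put together, inference on a new text pair under the same assumptions reduces to one forward pass plus the 50% threshold of step 14:

```python
def contradicts(premise_mat, hypothesis_mat, threshold=0.5):
    """Return (is_contradictory, probability) for one pair of texts,
    each given as a (1, M, N) tensor of word vectors."""
    with torch.no_grad():
        h_p, _ = bilstm(premise_mat)
        h_h, _ = bilstm(hypothesis_mat)
        q = h_h.max(dim=1).values
        scores = torch.einsum("bd,btd->bt", q, h_p)
        c = torch.einsum("bt,btd->bd", F.softmax(scores, dim=-1), h_p)
        a = torch.tanh(W_att1(c) + W_att2(q))
        p = classifier(a).item()
    return p > threshold, p
```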
Example 2
This embodiment provides a clause logic identification method based on deep learning. The clause logic identification method comprises the following steps:
converting the two compared clause texts into matrices respectively, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of embodiment 1;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
Of course, the clause logic identification method of the present embodiment may pre-process the two compared clause texts with reference to step 12 of embodiment 1 before converting the two compared clause texts into matrices, respectively.
The clause logic identification method of this embodiment can be applied to intelligent identification of the logical relationships among the relevant clauses of a travel product, such as the clause contents contained in the booking restrictions, booking instructions and product description sections, with the aim of ensuring that the relevant clauses of a company's travel products are reasonable and logically consistent, thereby fully safeguarding the rights and interests of consumers and providing customers with the most satisfactory service. Of course, the clause logic identification method is not limited to this embodiment; it can also be applied to other service products or physical products, and even to scenarios such as rules and regulations, so as to reduce the labor cost of manual inspection and improve the accuracy of identification, and clauses found to be contradictory can then be specifically revised and improved.
Example 3
Fig. 2 shows a contradictory sentence recognition system based on deep learning of the present embodiment. The contradictory sentence recognition system includes:
and the model construction module 21 is used for constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts.
And the model input module 22 is used for preprocessing the two compared texts and respectively converting the two compared texts into matrices to serve as the input of the model.
And the model output module 23 is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
In this embodiment, the model construction module 21 may be specifically configured to:
(1) acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypthesis text into a matrix according to a word vector of each word (the word in the embodiment can be understood as a character and has the same meaning), and taking the matrix as the input of a BilSTM unit to obtain preliminary semantic representations corresponding to the premise text and the hypthesis text, wherein the preliminary semantic representations are respectively as follows: BiLSTM output vector sequence corresponding to premix textBilsTM output vector sequence corresponding to hypthesis text
The specific process of converting the premise text into the matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premix text into a word vector with a fixed length of n (such as n being 100);
(2-2) setting the maximum number of words in the text participating in training to m (for example, m is 100), if the actual number of words in the premix text is less than m, complementing the insufficient part with < PAD > characters, and if the actual number of words in the premix text exceeds m, deleting words except m;
(2-3) converting the premise text into an m x n matrix as input [ x ] according to the word vector of each word in the text1,x2,…,xm]。
Referring to the conversion processes (2-1) - (2-3) of the premix text, the hypthesis text can be converted into an m × n matrix according to the word vector of each word, and the detailed process is not repeated.
Taking the matrix converted from the premise text as the input of a BiLSTM unit to obtain the preliminary semantic representation corresponding to the premise text, namely the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m]. The specific process is as follows:
(2-4) Let the outputs of the forget gate unit and the input gate unit at the current time be f_t and i_t respectively, and let the candidate update of the cell state at the current time be C̃_t, satisfying:
f_t = σ(W_f·[x_t, h_{t-1}] + b_f)
i_t = σ(W_i·[x_t, h_{t-1}] + b_i)
C̃_t = tanh(W_c·[x_t, h_{t-1}] + b_c)
wherein x_t is the input at the current time, h_{t-1} is the hidden-layer state at the previous time, W_f, W_i and W_c are respectively the weight matrices of the forget gate unit, the input gate unit and the cell-state update, b_f, b_i and b_c are respectively the bias vectors of the forget gate unit, the input gate unit and the cell-state update, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;
(2-5) updating the cell state C_t by the formula C_t = f_t * C_{t-1} + i_t * C̃_t;
(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form a vector sequence [h_1, h_2, …, h_m] of m steps:
o_t = σ(W_o·[x_t, h_{t-1}] + b_o)
h_t = o_t * tanh(C_t)
wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence [→h^p_1, …, →h^p_m] corresponding to the premise text, and processing the premise text in the backward direction through steps (2-4)-(2-6) to obtain the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text;
(2-8) combining the forward output vector sequence [→h^p_1, …, →h^p_m] and the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text along the last dimension, i.e. h^p_t = [→h^p_t; ←h^p_t], to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted [h^p_1, h^p_2, …, h^p_m].
Referring to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain its forward output vector sequence [→h^h_1, …, →h^h_m], and in the backward direction through steps (2-4)-(2-6) to obtain its backward output vector sequence [←h^h_1, …, ←h^h_m]; the two are combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted [h^h_1, h^h_2, …, h^h_m]. The detailed process is not repeated.
(3) Performing maximum pooling on the preliminary semantic representation [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and further computing the query vector q from h̄.
(4) Taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula
c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t is the matching score between q and the t-th premise vector,
obtaining the Attention vector a: a = tanh(W_att1·c + W_att2·q).
(5) Feeding the Attention vector a into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ:
loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)]
(7) Minimizing the loss using the Adam (adaptive moment estimation) optimization algorithm, and performing multiple iterations of training to obtain the final model.
In this embodiment, when extracting the semantics of the hypothesis text, the model construction module 21 condenses them into a single vector by the maximum pooling operation and then matches that vector against each word vector in the premise text, which greatly reduces the training cost of the model while guaranteeing inference accuracy.
The model input module 22 preprocesses the two compared texts in the same way as the preprocessing in the model construction module 21, and, referring to the conversion processes (2-1)-(2-3) for the premise text, converts each of the two compared texts into an m × n matrix according to the word vector of each word; the detailed process is not repeated. The model output module 23 can further decide from the probability whether the two compared texts are contradictory: for example, if the probability exceeds 50%, the two compared texts can generally be considered contradictory, and if it does not exceed 50%, they can generally be considered not contradictory.
Example 4
Fig. 3 shows a deep learning-based clause logic identification system according to the present embodiment. The clause logic identification system comprises:
a clause input module 31, configured to convert the two compared clause texts into matrices, respectively, as an input of a contradictory sentence recognition model constructed by the contradictory sentence recognition system of embodiment 3;
and the probability output module 32 is used for obtaining the probability that the two compared clause texts are contradictory through the calculation of the model.
Of course, the clause input module may pre-process the two compared clause texts with reference to the model input module 22 of embodiment 3 before converting the two compared clause texts into matrices, respectively.
The clause logic identification system of this embodiment can be applied to intelligent identification of the logical relationships among the relevant clauses of a travel product, such as the clause contents contained in the booking restrictions, booking instructions and product description sections, with the aim of ensuring that the relevant clauses of a company's travel products are reasonable and logically consistent, thereby fully safeguarding the rights and interests of consumers and providing customers with the most satisfactory service. Of course, the clause logic identification system is not limited to this embodiment; it can also be applied to other service products or physical products, and even to scenarios such as rules and regulations, so as to reduce the labor cost of manual inspection and improve the accuracy of identification, and clauses found to be contradictory can then be specifically revised and improved.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (10)

1. A contradictory sentence identification method based on deep learning, characterized in that the contradictory sentence identification method comprises the following steps:
constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, wherein the model is used for predicting the probability of contradiction between two texts;
respectively converting the two compared texts into matrices to be used as the input of the model;
and obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
2. The contradictory statement identification method according to claim 1, characterized in that the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
3. The contradictory sentence identification method of claim 2, wherein the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
4. The contradictory statement identification method according to claim 2, characterized in that the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
5. A clause logic identification method based on deep learning, characterized in that the clause logic identification method comprises the following steps:
converting the two compared clause texts into matrices respectively, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of any one of claims 1 to 4;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
6. A contradictory sentence recognition system based on deep learning, the contradictory sentence recognition system comprising:
the model construction module is used for constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts;
the model input module is used for respectively converting the two compared texts into matrices to be used as the input of the model;
and the model output module is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
7. The contradictory statement identification system of claim 6, wherein the model building module is to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
8. The contradictory statement identification system of claim 7, wherein the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
9. The contradictory sentence recognition system of claim 7, wherein the model building module is further configured to pre-process the raw training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
10. A clause logic identification system based on deep learning, the clause logic identification system comprising:
a clause input module, configured to convert the two compared clause texts into matrices, respectively, as an input of a contradictory sentence recognition model constructed using the contradictory sentence recognition system according to any one of claims 6 to 9;
and the probability output module is used for obtaining the probability that the two compared clause texts are contradictory through the calculation of the model.
CN201811635859.4A 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system Active CN109710943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811635859.4A CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811635859.4A CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Publications (2)

Publication Number Publication Date
CN109710943A true CN109710943A (en) 2019-05-03
CN109710943B CN109710943B (en) 2022-12-20

Family

ID=66259511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811635859.4A Active CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Country Status (1)

Country Link
CN (1) CN109710943B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110618980A (en) * 2019-09-09 2019-12-27 上海交通大学 System and method based on legal text accurate matching and contradiction detection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070255555A1 (en) * 2006-05-01 2007-11-01 Palo Alto Research Center Incorporated Systems and methods for detecting entailment and contradiction
WO2014132456A1 (en) * 2013-02-28 2014-09-04 Nec Corporation Method and system for determining non-entailment and contradiction of text pairs
WO2015053236A1 (en) * 2013-10-08 2015-04-16 独立行政法人情報通信研究機構 Device for collecting contradictory expression and computer program for same
CN108647207A (en) * 2018-05-08 2018-10-12 上海携程国际旅行社有限公司 Natural language modification method, system, equipment and storage medium


Also Published As

Publication number Publication date
CN109710943B (en) 2022-12-20

Similar Documents

Publication Publication Date Title
WO2022022163A1 (en) Text classification model training method, device, apparatus, and storage medium
CN111444340B (en) Text classification method, device, equipment and storage medium
CN111626063B (en) Text intention identification method and system based on projection gradient descent and label smoothing
CN110263323B (en) Keyword extraction method and system based on barrier type long-time memory neural network
CN108984526B (en) Document theme vector extraction method based on deep learning
CN108009148B (en) Text emotion classification representation method based on deep learning
CN110263325B (en) Chinese word segmentation system
CN110321563B (en) Text emotion analysis method based on hybrid supervision model
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN109919175B (en) Entity multi-classification method combined with attribute information
CN109086269B (en) Semantic bilingual recognition method based on semantic resource word representation and collocation relationship
CN110532395B (en) Semantic embedding-based word vector improvement model establishing method
CN110825849A (en) Text information emotion analysis method, device, medium and electronic equipment
CN112287106A (en) Online comment emotion classification method based on dual-channel hybrid neural network
CN113282714B (en) Event detection method based on differential word vector representation
CN111368542A (en) Text language association extraction method and system based on recurrent neural network
CN114462420A (en) False news detection method based on feature fusion model
CN113821635A (en) Text abstract generation method and system for financial field
CN113488196A (en) Drug specification text named entity recognition modeling method
CN111858878A (en) Method, system and storage medium for automatically extracting answer from natural language text
CN113553510A (en) Text information recommendation method and device and readable medium
CN114417872A (en) Contract text named entity recognition method and system
CN113869054A (en) Deep learning-based electric power field project feature identification method
CN113837307A (en) Data similarity calculation method and device, readable medium and electronic equipment
Chan et al. Applying and optimizing NLP model with CARU

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant