CN109710943B - Contradictory statement identification method and system and clause logic identification method and system - Google Patents

Contradictory statement identification method and system and clause logic identification method and system

Info

Publication number
CN109710943B
CN109710943B (application number CN201811635859.4A)
Authority
CN
China
Prior art keywords
text
model
vector sequence
output vector
premise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811635859.4A
Other languages
Chinese (zh)
Other versions
CN109710943A (en)
Inventor
鞠剑勋 (Ju Jianxun)
刘晔诚 (Liu Yecheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Information Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Information Technology Shanghai Co Ltd
Priority to CN201811635859.4A
Publication of CN109710943A
Application granted
Publication of CN109710943B
Status: Active

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a contradictory sentence identification method and system and a clause logic identification method and system. The contradictory sentence identification method includes: constructing a contradictory sentence identification model based on a BiLSTM-Attention mechanism, wherein the model is used for predicting the probability that two texts contradict each other; respectively converting the two compared texts into matrices to be used as the input of the model; and obtaining, through the calculation of the model, the probability that the two compared texts contradict each other. The invention solves the natural language inference problem by using deep learning technology, can reduce the labor cost of feature extraction, and greatly improves the accuracy of inference.

Description

Contradictory statement identification method and system and clause logic identification method and system
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
Background
Inference has long been one of the core research topics in the field of artificial intelligence, and natural language inference is an important research branch in the natural language processing direction; it is the research basis of tasks such as question-answering systems, information retrieval and automatic summarization, and has wide application space in many business scenarios. One core problem in natural language inference is: given two texts, a premise and a hypothesis, determine whether there is a contradiction between the two at the semantic level.
Because of the diversity of language expression and the ambiguity of semantic understanding, and especially the existence of a large number of polysemous words and synonyms in Chinese texts, natural language inference is more difficult than expected, and there is still considerable room to improve inference accuracy.
In addition, the logical relationships among the relevant clauses of a product are usually checked manually; an intelligent means is lacking, considerable labor cost is consumed, and the identification accuracy also needs to be improved.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, in which natural language inference is difficult and inference accuracy leaves considerable room for improvement, and provides a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
The invention solves the technical problems through the following technical scheme:
a contradiction sentence identification method based on deep learning comprises the following steps:
constructing a contradictory sentence identification model based on a BiLSTM (bidirectional long short-term memory network)-Attention mechanism, wherein the model is used for predicting the probability that two texts contradict each other;
respectively converting the two compared texts into matrices to be used as the input of the model;
and obtaining, through the calculation of the model, the probability that the two compared texts contradict each other.
Preferably, the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively: the BiLSTM output vector sequence h^p = [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence h^h = [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation v^h of the hypothesis text, where v^h = max_{t=1..m} h^h_t (element-wise maximum), and computing a Query vector q from v^h (the defining formula appears only as an image in the original document);
taking q together with h^p as the input of Attention to word-match the hypothesis text with the premise text: a context vector c = Σ_{t=1..m} α_t·h^p_t is computed as the attention-weighted sum of the premise vectors, where each weight α_t is a softmax-normalized matching score between q and h^p_t (the exact scoring formula appears only as an image in the original document), and an Attention vector a is obtained: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relation to obtain its predicted value ŷ, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss from the true value y of the relation between the premise text and the hypothesis text and the predicted value ŷ: L = −[y·log(ŷ) + (1−y)·log(1−ŷ)];
and minimizing the loss by using an optimization algorithm and performing iterative training to obtain the final model.
Preferably, the BiLSTM output vector sequence h^p corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text.
The BiLSTM output vector sequence h^h corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
Preferably, the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification method based on deep learning, the clause logic identification method comprising:
respectively converting the two compared clause texts into matrices to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method described above;
and obtaining, through the calculation of the model, the probability that the two compared clause texts contradict each other.
A contradictory sentence identification system based on deep learning, the contradictory sentence identification system comprising:
the model construction module is used for constructing a contradictory sentence identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability that two texts contradict each other;
the model input module is used for respectively converting the two compared texts into matrices to be used as the input of the model;
and the model output module is used for obtaining, through the calculation of the model, the probability that the two compared texts contradict each other.
Preferably, the model building module is configured to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively: the BiLSTM output vector sequence h^p = [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence h^h = [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation v^h of the hypothesis text, where v^h = max_{t=1..m} h^h_t (element-wise maximum), and computing a Query vector q from v^h (the defining formula appears only as an image in the original document);
taking q together with h^p as the input of Attention to word-match the hypothesis text with the premise text: a context vector c = Σ_{t=1..m} α_t·h^p_t is computed as the attention-weighted sum of the premise vectors, where each weight α_t is a softmax-normalized matching score between q and h^p_t (the exact scoring formula appears only as an image in the original document), and an Attention vector a is obtained: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relation to obtain its predicted value ŷ, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss from the true value y of the relation between the premise text and the hypothesis text and the predicted value ŷ: L = −[y·log(ŷ) + (1−y)·log(1−ŷ)];
and minimizing the loss by using an optimization algorithm and performing iterative training to obtain the final model.
Preferably, the model building module is further configured to:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text.
The model building module is further configured to:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
Preferably, the model building module is further configured to pre-process the original training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification system based on deep learning, the clause logic identification system comprising:
the clause input module is used for respectively converting the two compared clause texts into matrices to be used as the input of the contradictory sentence identification model constructed by the contradictory sentence identification system;
and the probability output module is used for obtaining, through the calculation of the model, the probability that the two compared clause texts contradict each other.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The positive progress effects of the invention are as follows: the invention solves the natural language inference problem by using the deep learning technology, can reduce the labor cost of feature extraction, and greatly improves the accuracy of inference.
Drawings
Fig. 1 is a flowchart of the contradictory sentence identification method based on deep learning in embodiment 1 of the present invention.
Fig. 2 is a schematic block diagram of the contradictory sentence identification system based on deep learning according to embodiment 3 of the present invention.
Fig. 3 is a schematic block diagram of the clause logic identification system based on deep learning according to embodiment 4 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
Fig. 1 shows the contradictory sentence identification method based on deep learning according to the present embodiment. The contradictory sentence identification method comprises the following steps:
step 11: based on a BilSTM-Attention mechanism, a contradiction sentence identification model is constructed, and the model is used for predicting the probability of contradiction between two texts.
Step 12: the two texts being compared are preprocessed.
Step 13: respectively converting the two compared texts into matrices to be used as the input of the model.
Step 14: obtaining, through the calculation of the model, the probability that the two compared texts contradict each other.
In this embodiment, step 11 may specifically include:
(1) Acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypothesis text into matrices according to the word vector of each word (a word in this embodiment can be understood as a character; the two terms have the same meaning here), and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively: the BiLSTM output vector sequence h^p = [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence h^h = [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
The specific process of converting the premise text into a matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premise text into a word vector with a fixed length of n (e.g., n = 100);
(2-2) setting the maximum number of words of a text participating in training to m (e.g., m = 100); if the actual number of words in the premise text is less than m, the insufficient part is padded with <PAD> characters, and if the actual number of words in the premise text exceeds m, the words beyond the m-th are truncated;
(2-3) converting the premise text into an m × n matrix [x_1, x_2, …, x_m] as input, according to the word vector of each word in the text.
With reference to the conversion processes (2-1)-(2-3) for the premise text, the hypothesis text can likewise be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated.
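A short sketch of steps (2-1)-(2-3) under the example values n = 100 and m = 100; the embedding table here is randomly initialized toy data, whereas in practice it would hold the trained word vectors.

```python
# Pad-or-truncate to m words, then look up an n-dimensional vector per word.
import numpy as np

n, m = 100, 100
rng = np.random.default_rng(0)
embedding = rng.normal(size=(5000, n)).astype("float32")  # vocab_size x n (toy values)
PAD_ID = 0

def text_to_matrix(ids: list) -> np.ndarray:
    # Pad with <PAD> ids up to m words, or truncate the words beyond the m-th.
    ids = (ids + [PAD_ID] * m)[:m]
    return embedding[ids]            # m x n matrix [x_1, ..., x_m]
```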
Taking the matrix converted from the premise text as the input of a BiLSTM unit yields the preliminary semantic representation corresponding to the premise text: the BiLSTM output vector sequence h^p corresponding to the premise text.
The specific process is as follows:
(2-4) setting the outputs of the forget gate unit and the input gate unit at the current moment to be f_t and i_t respectively, and the candidate update value of the cell state at the current moment to be C̃_t, satisfying:
f_t = σ(W_f[x_t, h_{t-1}] + b_f)
i_t = σ(W_i[x_t, h_{t-1}] + b_i)
C̃_t = tanh(W_c[x_t, h_{t-1}] + b_c)
wherein x_t is the input at the current moment, h_{t-1} is the hidden-layer state at the previous moment, W_f, W_i and W_c are the weight matrices of the forget gate unit, the input gate unit and the cell state update respectively, b_f, b_i and b_c are the bias vectors of the forget gate unit, the input gate unit and the cell state update respectively, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;
(2-5) updating the cell state C_t by the formula C_t = f_t * C_{t-1} + i_t * C̃_t;
(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form the vector sequence [h_1, h_2, …, h_m] of m steps:
o_t = σ(W_o[x_t, h_{t-1}] + b_o)
h_t = o_t * tanh(C_t)
wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
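The formulas of steps (2-4)-(2-6) transcribe directly into code. The following NumPy sketch runs one time step; the weight shapes assume input size n and hidden size d, and the dictionary-of-weights layout is an illustration choice.

```python
# One LSTM time step, mirroring formulas (2-4)-(2-6) above.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    # W holds W_f, W_i, W_c, W_o (each d x (n+d)); b holds b_f, b_i, b_c, b_o.
    z = np.concatenate([x_t, h_prev])           # [x_t, h_{t-1}]
    f_t = sigmoid(W["f"] @ z + b["f"])          # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])          # input gate
    C_tilde = np.tanh(W["c"] @ z + b["c"])      # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde          # cell state update, step (2-5)
    o_t = sigmoid(W["o"] @ z + b["o"])          # output gate
    h_t = o_t * np.tanh(C_t)                    # hidden output, step (2-6)
    return h_t, C_t
```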
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence corresponding to the premise text, and processing the premise text in the reverse direction through steps (2-4)-(2-6) to obtain the reverse output vector sequence corresponding to the premise text;
(2-8) combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted h^p = [h^p_1, h^p_2, …, h^p_m].
With reference to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence corresponding to the hypothesis text, and in the reverse direction through steps (2-4)-(2-6) to obtain the reverse output vector sequence corresponding to the hypothesis text; the two sequences are combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted h^h = [h^h_1, h^h_2, …, h^h_m]. The detailed process is not repeated.
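In a modern framework the forward pass, reverse pass and last-dimension combination of steps (2-7)-(2-8) come built in. A sketch using PyTorch (an assumption; the patent names no framework), with an assumed hidden size d:

```python
# nn.LSTM with bidirectional=True runs the forward and reverse passes and
# concatenates their outputs along the last dimension, yielding h^p and h^h.
import torch
import torch.nn as nn

n, d = 100, 128                         # word-vector size; hidden size d is assumed
bilstm = nn.LSTM(input_size=n, hidden_size=d, batch_first=True, bidirectional=True)

premise = torch.randn(1, 100, n)        # 1 x m x n matrix from the conversion step
h_p, _ = bilstm(premise)                # 1 x m x 2d: [h^p_1, ..., h^p_m]
```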
(3) Performing maximum pooling on the preliminary semantic representation h^h corresponding to the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation v^h of the hypothesis text, where v^h = max_{t=1..m} h^h_t (element-wise maximum), and further computing a Query vector q from v^h (the defining formula appears only as an image in the original document).
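Continuing the PyTorch sketch, step (3) reduces to an element-wise max over the time dimension; since the mapping from the pooled vector to q is only an image in the original, a plain linear projection is assumed here.

```python
# Max pooling along the step dimension, then an assumed projection to q.
h_h, _ = bilstm(torch.randn(1, 100, n))     # 1 x m x 2d hypothesis representation
v_h = h_h.max(dim=1).values                 # 1 x 2d: most salient feature per dim
W_q = nn.Linear(2 * d, 2 * d)
q = W_q(v_h)                                # Query vector q (assumed form)
```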
(4) Taking q together with h^p as the input of Attention to word-match the hypothesis text with the premise text: a context vector c = Σ_{t=1..m} α_t·h^p_t is computed as the attention-weighted sum of the premise vectors, where each weight α_t is a softmax-normalized matching score between q and h^p_t (the exact scoring formula appears only as an image in the original document); the Attention vector a is then obtained: a = tanh(W_att1·c + W_att2·q).
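A sketch of step (4), continuing from the previous blocks. The per-word scoring formula is only an image in the original, so a bilinear score between q and each h^p_t is assumed; the final combination follows the stated formula a = tanh(W_att1·c + W_att2·q).

```python
# Word-by-word attention over the premise sequence h_p, queried by q.
W_s = nn.Linear(2 * d, 2 * d, bias=False)       # assumed bilinear scoring matrix
W_att1 = nn.Linear(2 * d, 2 * d, bias=False)
W_att2 = nn.Linear(2 * d, 2 * d, bias=False)

scores = torch.einsum("bd,bmd->bm", W_s(q), h_p)   # match q against each h^p_t
alpha = torch.softmax(scores, dim=1)               # attention weights alpha_t
c = torch.einsum("bm,bmd->bd", alpha, h_p)         # context vector c
a = torch.tanh(W_att1(c) + W_att2(q))              # Attention vector a
```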
(5) Feeding the Attention vector a into a two-class deep neural network model that infers the contradiction relation to obtain its predicted value ŷ; the predicted value ŷ represents the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss from the true value y of the relation between the premise text and the hypothesis text and the predicted value ŷ: L = −[y·log(ŷ) + (1−y)·log(1−ŷ)].
(7) Minimizing the loss by using the Adam (Adaptive Moment Estimation) optimization algorithm and performing iterative training to obtain the final model.
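Steps (5)-(7) can be sketched as a small sigmoid-output head trained with the binary cross-entropy L = −[y·log(ŷ) + (1−y)·log(1−ŷ)] and Adam; the depth and width of the head are assumptions, as the patent does not specify them.

```python
# Two-class head on the Attention vector, plus one Adam training iteration.
classifier = nn.Sequential(
    nn.Linear(2 * d, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),     # y_hat: probability of contradiction
)
params = list(bilstm.parameters()) + list(classifier.parameters())
optimizer = torch.optim.Adam(params)
loss_fn = nn.BCELoss()                  # the cross-entropy loss of step (6)

y_hat = classifier(a)                   # predicted value for this text pair
y = torch.ones(1, 1)                    # true label (toy value: contradictory)
loss = loss_fn(y_hat, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()                        # one iteration; repeat over the dataset
```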
Through the above steps, when the semantics of the hypothesis text are extracted, they are integrated into a single vector by the maximum pooling operation and then matched against each word vector in the premise text, which greatly reduces the training cost of the model while preserving inference accuracy.
In step 12, the preprocessing of the two texts to be compared is the same as the preprocessing in step (1). In step 13, with reference to the conversion processes (2-1)-(2-3) for the premise text, the two texts to be compared can each be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated. In step 14, whether the two texts contradict each other can further be decided from the probability: for example, if the probability exceeds 50%, the two texts can generally be considered contradictory; if not, they can generally be considered non-contradictory.
Example 2
The embodiment provides a clause logic identification method based on deep learning. The clause logic identification method comprises the following steps:
converting the two compared clause texts respectively into matrices, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of embodiment 1;
and obtaining, through the calculation of the model, the probability that the two compared clause texts contradict each other.
Of course, the clause logic identification method of the present embodiment may pre-process the two compared clause texts with reference to step 12 of embodiment 1 before converting the two compared clause texts into matrices, respectively.
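Stitching the earlier sketches together gives a hedged end-to-end picture of this embodiment: two clause texts are preprocessed, converted to matrices, and scored by the trained model. All helper and layer names reuse the assumed sketches from embodiment 1 and are illustrative, not the patent's own identifiers.

```python
# End-to-end scoring of a clause pair with the previously sketched components.
def contradiction_probability(clause_a: str, clause_b: str, vocab: dict) -> float:
    xa = torch.from_numpy(text_to_matrix(encode(clause_a, vocab))).unsqueeze(0)
    xb = torch.from_numpy(text_to_matrix(encode(clause_b, vocab))).unsqueeze(0)
    h_p, _ = bilstm(xa)                      # premise-side representation
    h_h, _ = bilstm(xb)                      # hypothesis-side representation
    q = W_q(h_h.max(dim=1).values)           # pooled hypothesis -> Query vector
    scores = torch.einsum("bd,bmd->bm", W_s(q), h_p)
    c = torch.einsum("bm,bmd->bd", torch.softmax(scores, dim=1), h_p)
    a = torch.tanh(W_att1(c) + W_att2(q))
    return classifier(a).item()              # e.g. > 0.5 -> treated as contradictory
```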
The clause logic identification method of this embodiment can be applied to the intelligent identification of logical relationships among the relevant clauses of travel products, such as the clauses and contents contained in the reservation restriction, reservation description and product description sections, so as to ensure the reasonableness and logical consistency of a company's travel product clauses, thereby fully protecting the rights and interests of consumers and providing customers with satisfactory service. The clause logic identification method is of course not limited to this embodiment; it can also be applied to other service products or physical products, and even to scenarios such as regulations and rule clauses, so as to reduce the labor cost of manual checking and improve identification accuracy, and contradictory clauses can then be revised and improved in a targeted manner.
Example 3
Fig. 2 shows the contradictory sentence identification system based on deep learning of the present embodiment. The contradictory sentence identification system includes:
and the model construction module 21 is used for constructing a contradiction statement identification model based on a BilSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts.
The model input module 22 is configured to preprocess the two compared texts and convert them respectively into matrices as the input of the model.
The model output module 23 is used for obtaining, through the calculation of the model, the probability that the two compared texts contradict each other.
In this embodiment, the model construction module 21 may be specifically configured to:
(1) Acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary encoding.
(2) Converting the premise text and the hypothesis text into matrices according to the word vector of each word (a word in this embodiment can be understood as a character; the two terms have the same meaning here), and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively: the BiLSTM output vector sequence h^p = [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence h^h = [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
The specific process of converting the premise text into a matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premise text into a word vector with a fixed length of n (e.g., n = 100);
(2-2) setting the maximum number of words of a text participating in training to m (e.g., m = 100); if the actual number of words in the premise text is less than m, the insufficient part is padded with <PAD> characters, and if the actual number of words in the premise text exceeds m, the words beyond the m-th are truncated;
(2-3) converting the premise text into an m × n matrix [x_1, x_2, …, x_m] as input, according to the word vector of each word in the text.
With reference to the conversion processes (2-1)-(2-3) for the premise text, the hypothesis text can likewise be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated.
Taking the matrix converted from the premise text as the input of a BiLSTM unit yields the preliminary semantic representation corresponding to the premise text: the BiLSTM output vector sequence h^p corresponding to the premise text.
The specific process is as follows:
(2-4) setting the outputs of the forget gate unit and the input gate unit at the current moment to be f_t and i_t respectively, and the candidate update value of the cell state at the current moment to be C̃_t, satisfying:
f_t = σ(W_f[x_t, h_{t-1}] + b_f)
i_t = σ(W_i[x_t, h_{t-1}] + b_i)
C̃_t = tanh(W_c[x_t, h_{t-1}] + b_c)
wherein x_t is the input at the current moment, h_{t-1} is the hidden-layer state at the previous moment, W_f, W_i and W_c are the weight matrices of the forget gate unit, the input gate unit and the cell state update respectively, b_f, b_i and b_c are the bias vectors of the forget gate unit, the input gate unit and the cell state update respectively, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;
(2-5) updating the cell state C_t by the formula C_t = f_t * C_{t-1} + i_t * C̃_t;
(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form the vector sequence [h_1, h_2, …, h_m] of m steps:
o_t = σ(W_o[x_t, h_{t-1}] + b_o)
h_t = o_t * tanh(C_t)
wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence corresponding to the premise text, and processing the premise text in the reverse direction through steps (2-4)-(2-6) to obtain the reverse output vector sequence corresponding to the premise text;
(2-8) combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted h^p = [h^p_1, h^p_2, …, h^p_m].
With reference to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence corresponding to the hypothesis text, and in the reverse direction through steps (2-4)-(2-6) to obtain the reverse output vector sequence corresponding to the hypothesis text; the two sequences are combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted h^h = [h^h_1, h^h_2, …, h^h_m]. The detailed process is not repeated.
(3) Performing maximum pooling on the preliminary semantic representation h^h corresponding to the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation v^h of the hypothesis text, where v^h = max_{t=1..m} h^h_t (element-wise maximum), and further computing a Query vector q from v^h (the defining formula appears only as an image in the original document).
(4) Taking q together with h^p as the input of Attention to word-match the hypothesis text with the premise text: a context vector c = Σ_{t=1..m} α_t·h^p_t is computed as the attention-weighted sum of the premise vectors, where each weight α_t is a softmax-normalized matching score between q and h^p_t (the exact scoring formula appears only as an image in the original document); the Attention vector a is then obtained: a = tanh(W_att1·c + W_att2·q).
(5) Feeding the Attention vector a into a two-class deep neural network model that infers the contradiction relation to obtain its predicted value ŷ; the predicted value ŷ represents the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss from the true value y of the relation between the premise text and the hypothesis text and the predicted value ŷ: L = −[y·log(ŷ) + (1−y)·log(1−ŷ)].
(7) Minimizing the loss by using the Adam (Adaptive Moment Estimation) optimization algorithm and performing iterative training to obtain the final model.
In this embodiment, when extracting the semantics of the hypothesis text, the model construction module 21 integrates them into a single vector by the maximum pooling operation and then matches that vector against each word vector in the premise text, which greatly reduces the training cost of the model while ensuring inference accuracy.
The preprocessing performed by the model input module 22 on the two compared texts is the same as the preprocessing in the model construction module 21; with reference to the conversion processes (2-1)-(2-3) for the premise text, the two compared texts can each be converted into an m × n matrix according to the word vector of each word, and the detailed process is not repeated. The model output module 23 may further determine from the probability whether the two compared texts contradict each other: for example, if the probability exceeds 50%, the two texts can generally be considered contradictory; if not, they can generally be considered non-contradictory.
Example 4
Fig. 3 shows the deep learning-based clause logic identification system of this embodiment. The clause logic identification system comprises:
a clause input module 31, configured to convert the two compared clause texts into matrices, respectively, and input the matrix as a contradiction sentence identification model constructed by using the contradiction sentence identification system in embodiment 3;
and a probability output module 32, used for obtaining, through the calculation of the model, the probability that the two compared clause texts contradict each other.
Of course, the clause input module may pre-process the two compared clause texts with reference to the model input module 22 of embodiment 3 before converting the two compared clause texts into matrices, respectively.
The clause logic identification system of this embodiment can be applied to the intelligent identification of logical relationships among the relevant clauses of travel products, such as the clauses and contents contained in the reservation restriction, reservation description and product description sections, so as to ensure the reasonableness and logical consistency of a company's travel product clauses, thereby fully protecting the rights and interests of consumers and providing customers with satisfactory service. The clause logic identification system is of course not limited to this embodiment; it can also be applied to other service products or physical products, and even to scenarios such as regulations and rule clauses, so as to reduce the labor cost of manual checking and improve identification accuracy, and contradictory clauses can then be revised and improved in a targeted manner.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes or modifications to these embodiments may be made by those skilled in the art without departing from the principle and spirit of this invention, and these changes and modifications are within the scope of this invention.

Claims (8)

1. A contradictory sentence identification method based on deep learning, characterized in that the contradictory sentence identification method comprises the following steps:
constructing a contradictory sentence identification model based on a BiLSTM-Attention mechanism, wherein the model is used for predicting the probability that two texts contradict each other;
respectively converting the two compared texts into matrices to be used as the input of the model;
obtaining, through the calculation of the model, the probability that the two compared texts contradict each other;
the steps of constructing include:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively: the BiLSTM output vector sequence h^p = [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence h^h = [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation v^h of the hypothesis text, where v^h = max_{t=1..m} h^h_t (element-wise maximum), and computing a Query vector q from v^h (the defining formula appears only as an image in the original document);
taking q together with h^p as the input of Attention to word-match the hypothesis text with the premise text: a context vector c = Σ_{t=1..m} α_t·h^p_t is computed as the attention-weighted sum of the premise vectors, where each weight α_t is a softmax-normalized matching score between q and h^p_t (the exact scoring formula appears only as an image in the original document), and an Attention vector a is obtained: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relation to obtain its predicted value ŷ, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss from the true value y of the relation between the premise text and the hypothesis text and the predicted value ŷ: L = −[y·log(ŷ) + (1−y)·log(1−ŷ)];
and minimizing the loss by using an optimization algorithm and performing iterative training to obtain the final model.
2. The contradictory sentence identification method of claim 1, wherein the BiLSTM output vector sequence h^p corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text;
and the BiLSTM output vector sequence h^h corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
3. The contradictory sentence identification method according to claim 1, wherein the step of constructing further comprises: preprocessing the original training text data;
the contradictory sentence identification method further comprises: preprocessing the two compared texts;
and the preprocessing at least comprises denoising, word segmentation and dictionary encoding.
4. A clause logic identification method based on deep learning, characterized in that the clause logic identification method comprises the following steps:
converting the two compared clause texts respectively into matrices to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of any one of claims 1 to 3;
and obtaining, through the calculation of the model, the probability that the two compared clause texts contradict each other.
5. A contradictory sentence recognition system based on deep learning, the contradictory sentence recognition system comprising:
the model construction module is used for constructing a contradictory sentence identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability that two texts contradict each other;
the model input module is used for respectively converting the two compared texts into matrices to be used as the input of the model;
the model output module is used for obtaining, through the calculation of the model, the probability that the two compared texts contradict each other;
the model building module is configured to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively: the BiLSTM output vector sequence h^p = [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence h^h = [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation v^h of the hypothesis text, where v^h = max_{t=1..m} h^h_t (element-wise maximum), and computing a Query vector q from v^h (the defining formula appears only as an image in the original document);
taking q together with h^p as the input of Attention to word-match the hypothesis text with the premise text: a context vector c = Σ_{t=1..m} α_t·h^p_t is computed as the attention-weighted sum of the premise vectors, where each weight α_t is a softmax-normalized matching score between q and h^p_t (the exact scoring formula appears only as an image in the original document), and an Attention vector a is obtained: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relation to obtain its predicted value ŷ, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss from the true value y of the relation between the premise text and the hypothesis text and the predicted value ŷ: L = −[y·log(ŷ) + (1−y)·log(1−ŷ)];
and minimizing the loss by using an optimization algorithm and performing iterative training to obtain the final model.
6. The contradictory sentence identification system of claim 5, wherein the model building module is further configured to:
calculate a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combine the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text;
calculate a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
and combine the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
7. The contradictory sentence identification system of claim 5, wherein the model construction module is further configured to preprocess the original training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary encoding.
8. A clause logic identification system based on deep learning, the clause logic identification system comprising:
a clause input module, configured to convert the two compared clause texts respectively into matrices as the input of a contradictory sentence identification model constructed using the contradictory sentence identification system according to any one of claims 5 to 7;
and a probability output module, used for obtaining, through the calculation of the model, the probability that the two compared clause texts contradict each other.
CN201811635859.4A 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system Active CN109710943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811635859.4A CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811635859.4A CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Publications (2)

Publication Number Publication Date
CN109710943A CN109710943A (en) 2019-05-03
CN109710943B true CN109710943B (en) 2022-12-20

Family

ID=66259511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811635859.4A Active CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Country Status (1)

Country Link
CN (1) CN109710943B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110618980A (en) * 2019-09-09 2019-12-27 上海交通大学 System and method based on legal text accurate matching and contradiction detection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014132456A1 (en) * 2013-02-28 2014-09-04 Nec Corporation Method and system for determining non-entailment and contradiction of text pairs
WO2015053236A1 (en) * 2013-10-08 2015-04-16 独立行政法人情報通信研究機構 Device for collecting contradictory expression and computer program for same
CN108647207A (en) * 2018-05-08 2018-10-12 上海携程国际旅行社有限公司 Natural language modification method, system, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313515B2 (en) * 2006-05-01 2007-12-25 Palo Alto Research Center Incorporated Systems and methods for detecting entailment and contradiction


Also Published As

Publication number Publication date
CN109710943A (en) 2019-05-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant