CN109710943A - Contradictory statement identification method and system and clause logic identification method and system - Google Patents

Contradictory statement identification method and system and clause logic identification method and system

Info

Publication number
CN109710943A
CN109710943A (Application CN201811635859.4A)
Authority
CN
China
Prior art keywords
text
hypothesis
model
vector sequence
output vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811635859.4A
Other languages
Chinese (zh)
Other versions
CN109710943B (en)
Inventor
鞠剑勋
刘晔诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Information Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Information Technology Shanghai Co Ltd filed Critical Ctrip Travel Information Technology Shanghai Co Ltd
Priority to CN201811635859.4A priority Critical patent/CN109710943B/en
Publication of CN109710943A publication Critical patent/CN109710943A/en
Application granted granted Critical
Publication of CN109710943B publication Critical patent/CN109710943B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a contradictory statement identification method and system and a clause logic identification method and system. The contradictory statement identification method comprises: constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, the model being used for predicting the probability that two texts contradict each other; converting the two compared texts into matrices respectively as the input of the model; and obtaining, through the calculation of the model, the probability that the two compared texts contradict each other. The invention solves the natural language inference problem using deep learning technology, can reduce the labor cost of feature extraction, and substantially improves the accuracy of inference.

Description

Contradictory statement identification method and system and clause logic identification method and system
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
Background
Inference is one of the core research topics in the field of artificial intelligence, and natural language inference is an important research branch in the natural language processing direction. It is the research basis of tasks such as question answering systems, information retrieval and automatic summarization, and has wide application space in many business scenarios. One core problem in natural language inference is: given two texts, a premise and a hypothesis, determine whether there is a contradiction between the two at the semantic level.
Because of the diversity of language expression and the ambiguity of semantic understanding, especially the existence of a large number of polysemous words and synonyms in Chinese texts, natural language inference is more difficult than expected, and there is still considerable room to improve its accuracy.
In addition, the logical relationships among the relevant clauses of a product are usually checked manually; intelligent means are lacking, considerable labor cost is consumed, and the accuracy of the check also needs to be improved.
Disclosure of Invention
The invention aims to overcome the defects in the prior art that natural language inference is difficult and its accuracy is insufficient, and provides a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
The invention solves the technical problems through the following technical scheme:
a contradiction sentence identification method based on deep learning comprises the following steps:
constructing a contradictory statement identification model based on a BiLSTM (bidirectional long short-term memory network)-Attention mechanism, wherein the model is used for predicting the probability of contradiction between two texts;
respectively converting the two compared texts into matrices to be used as the input of the model;
and obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
Preferably, the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
Preferably, the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
Preferably, the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification method based on deep learning, the clause logic identification method comprising:
respectively converting the two compared clause texts into matrices to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
A contradictory sentence identification system based on deep learning, the contradictory sentence identification system comprising:
the model construction module is used for constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts;
the model input module is used for respectively converting the two compared texts into matrices to be used as the input of the model;
and the model output module is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
Preferably, the model building module is configured to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
Preferably, the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
Preferably, the model building module is further configured to preprocess the original training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification system based on deep learning, the clause logic identification system comprising:
the clause input module is used for respectively converting the two compared clause texts into matrices to be used as the input of the contradictory sentence identification model constructed by the contradictory sentence identification system;
and the probability output module is used for obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The positive progress effects of the invention are as follows: the invention solves the natural language inference problem by using the deep learning technology, can reduce the labor cost of feature extraction, and greatly improves the accuracy of inference.
Drawings
Fig. 1 is a flowchart of a contradictory sentence identification method based on deep learning according to embodiment 1 of the present invention.
Fig. 2 is a schematic block diagram of a contradictory sentence recognition system based on deep learning according to embodiment 3 of the present invention.
Fig. 3 is a schematic block diagram of a clause logic identification system based on deep learning according to embodiment 4 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
Fig. 1 shows a contradiction sentence identification method based on deep learning according to the present embodiment. The contradictory statement identification method includes:
step 11: based on a BilSTM-Attention mechanism, a contradiction sentence identification model is constructed, and the model is used for predicting the probability of contradiction between two texts.
Step 12: the two texts being compared are preprocessed.
Step 13: respectively converting the two texts to be compared into matrices to be used as the input of the model.
Step 14: obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
In this embodiment, step 11 may specifically include:
(1) acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypothesis text into matrices according to the word vector of each word (a word in this embodiment can also be understood as a single character; the two are treated as equivalent), and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
The specific process of converting the premise text into a matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premise text into a word vector of fixed length n (for example, n = 100);
(2-2) setting the maximum number of words of a text participating in training to m (for example, m = 100); if the actual number of words in the premise text is less than m, the missing part is padded with <PAD> characters, and if the actual number of words in the premise text exceeds m, the words beyond the m-th are discarded;
(2-3) converting the premise text into an m × n matrix [x_1, x_2, …, x_m] as the input, according to the word vector of each word in the text.
Referring to the conversion processes (2-1)-(2-3) for the premise text, the hypothesis text can likewise be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated.
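Steps (2-1)-(2-3) amount to an embedding lookup with padding and truncation. A sketch with NumPy follows; the embedding table here is randomly initialized purely for illustration (the patent does not say how the word vectors are obtained):

```python
import numpy as np

N = 100  # word-vector length n of step (2-1)
M = 100  # maximum number of words m of step (2-2)
PAD_ID = 0


def text_to_matrix(ids, embeddings):
    """Pad with <PAD> ids (or truncate) to exactly M words, then stack
    the word vectors into the m x n input matrix [x_1, ..., x_m]."""
    ids = list(ids)[:M] + [PAD_ID] * max(0, M - len(ids))
    return embeddings[ids]  # shape (M, N)


# illustrative usage with a random (vocab_size, N) embedding table
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50000, N)).astype("float32")
premise_matrix = text_to_matrix([5, 17, 42], embeddings)
assert premise_matrix.shape == (M, N)
```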
Taking the matrix converted from the premise text as the input of a BiLSTM unit to obtain the preliminary semantic representation corresponding to the premise text, namely the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m]. The specific process is as follows:
(2-4) Let the outputs of the forget gate unit and the input gate unit at the current time be f_t and i_t respectively, and let the candidate update of the cell state at the current time be C̃_t, satisfying:
f_t = σ(W_f·[x_t, h_{t-1}] + b_f)
i_t = σ(W_i·[x_t, h_{t-1}] + b_i)
C̃_t = tanh(W_c·[x_t, h_{t-1}] + b_c)
wherein x_t is the input at the current time, h_{t-1} is the hidden-layer state at the previous time, W_f, W_i and W_c are respectively the weight matrices of the forget gate unit, the input gate unit and the cell-state update, b_f, b_i and b_c are respectively the bias vectors of the forget gate unit, the input gate unit and the cell-state update, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;
(2-5) updating the cell state C_t by the formula C_t = f_t * C_{t-1} + i_t * C̃_t;
(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form a vector sequence [h_1, h_2, …, h_m] of m steps:
o_t = σ(W_o·[x_t, h_{t-1}] + b_o)
h_t = o_t * tanh(C_t)
wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence [→h^p_1, …, →h^p_m] corresponding to the premise text, and processing the premise text in the backward direction through steps (2-4)-(2-6) to obtain the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text;
(2-8) combining the forward output vector sequence [→h^p_1, …, →h^p_m] and the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text along the last dimension, i.e. h^p_t = [→h^p_t; ←h^p_t], to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted [h^p_1, h^p_2, …, h^p_m].
Referring to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain its forward output vector sequence [→h^h_1, …, →h^h_m], and in the backward direction through steps (2-4)-(2-6) to obtain its backward output vector sequence [←h^h_1, …, ←h^h_m]; the two are combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted [h^h_1, h^h_2, …, h^h_m]. The detailed process is not repeated.
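Steps (2-4)-(2-8) describe a standard bidirectional LSTM. In a framework such as PyTorch (an assumption; the patent names no framework), the forward pass, backward pass and last-dimension concatenation are all provided by a single bidirectional layer, so a sketch can be very short. Whether the premise and the hypothesis share one BiLSTM is not stated; sharing is assumed here:

```python
import torch
import torch.nn as nn

N, M, HIDDEN = 100, 100, 128  # n, m, and an assumed hidden size

# bidirectional=True runs steps (2-4)-(2-6) once forward and once
# backward, then concatenates the two h_t along the last dimension
# as in step (2-8); every output vector therefore has size 2*HIDDEN.
bilstm = nn.LSTM(input_size=N, hidden_size=HIDDEN,
                 batch_first=True, bidirectional=True)

premise = torch.randn(1, M, N)     # one premise text as an m x n matrix
hypothesis = torch.randn(1, M, N)  # one hypothesis text as an m x n matrix

h_p, _ = bilstm(premise)     # [h^p_1, ..., h^p_m], shape (1, M, 2*HIDDEN)
h_h, _ = bilstm(hypothesis)  # [h^h_1, ..., h^h_m], shape (1, M, 2*HIDDEN)
```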
(3) Performing maximum pooling on the preliminary semantic representation [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and further computing the query vector q from h̄.
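Continuing the sketch, the maximum pooling along the step dimension is an element-wise max over the m positions; treating the pooled vector directly as the query q is an assumption here, since the exact formula for q did not survive extraction of the original document:

```python
# element-wise max over the M steps: final hypothesis semantics h_bar
h_bar = h_h.max(dim=1).values  # shape (1, 2*HIDDEN)

q = h_bar                      # assumed: query vector q = h_bar
```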
(4) Taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula
c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t is the matching score between q and the t-th premise vector,
obtaining the Attention vector a: a = tanh(W_att1·c + W_att2·q).
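Step (4) can be sketched as standard dot-product attention over the premise positions; the score function e_t = qᵀ·h^p_t is a reconstruction, and only the combination a = tanh(W_att1·c + W_att2·q) is given verbatim in the document:

```python
import torch.nn.functional as F

W_att1 = nn.Linear(2 * HIDDEN, 2 * HIDDEN, bias=False)
W_att2 = nn.Linear(2 * HIDDEN, 2 * HIDDEN, bias=False)

scores = torch.einsum("bd,btd->bt", q, h_p)  # e_t = q . h^p_t, shape (1, M)
alpha = F.softmax(scores, dim=-1)            # attention weights alpha_t
c = torch.einsum("bt,btd->bd", alpha, h_p)   # context vector c

a = torch.tanh(W_att1(c) + W_att2(q))        # Attention vector a
```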
(5) Feeding the Attention vector a into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ:
loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)]
(7) Minimizing the loss using the Adam (adaptive moment estimation) optimization algorithm, and performing multiple iterations of training to obtain the final model.
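Steps (5)-(7) correspond to a small classification head trained with binary cross-entropy and Adam. A sketch continuing from the blocks above; the layer sizes and learning rate are assumptions:

```python
classifier = nn.Sequential(              # binary-classification DNN head
    nn.Linear(2 * HIDDEN, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),                        # y_hat in (0, 1): contradiction prob.
)

params = (list(bilstm.parameters()) + list(W_att1.parameters())
          + list(W_att2.parameters()) + list(classifier.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

y = torch.ones(1, 1)                     # true label y: 1 = contradictory
y_hat = classifier(a)                    # predicted value y_hat
loss = F.binary_cross_entropy(y_hat, y)  # -[y log y_hat + (1-y) log(1-y_hat)]

optimizer.zero_grad()
loss.backward()
optimizer.step()
```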
Through the above steps, when the semantics of the hypothesis text are extracted, they are condensed into a single vector by the maximum pooling operation, and this vector is then matched against each word vector in the premise text, which guarantees inference accuracy while greatly reducing the training cost of the model.
In step 12, the preprocessing of the two texts to be compared is the same as the preprocessing in step (1). In step 13, referring to the conversion processes (2-1)-(2-3) for the premise text, the two texts to be compared can each be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated. In step 14, whether the two compared texts are contradictory can further be decided from the probability: for example, if the probability exceeds 50%, the two compared texts can generally be considered contradictory, and if it does not exceed 50%, they can generally be considered not contradictory.
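Put together, inference on a new text pair under the same assumptions reduces to one forward pass plus the 50% threshold of step 14:

```python
def contradicts(premise_mat, hypothesis_mat, threshold=0.5):
    """Return (is_contradictory, probability) for one pair of texts,
    each given as a (1, M, N) tensor of word vectors."""
    with torch.no_grad():
        h_p, _ = bilstm(premise_mat)
        h_h, _ = bilstm(hypothesis_mat)
        q = h_h.max(dim=1).values
        scores = torch.einsum("bd,btd->bt", q, h_p)
        c = torch.einsum("bt,btd->bd", F.softmax(scores, dim=-1), h_p)
        a = torch.tanh(W_att1(c) + W_att2(q))
        p = classifier(a).item()
    return p > threshold, p
```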
Example 2
This embodiment provides a clause logic identification method based on deep learning. The clause logic identification method comprises the following steps:
converting the two compared clause texts into matrices respectively, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of embodiment 1;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
Of course, the clause logic identification method of the present embodiment may pre-process the two compared clause texts with reference to step 12 of embodiment 1 before converting the two compared clause texts into matrices, respectively.
The clause logic identification method of this embodiment can be applied to intelligent identification of the logical relationships among the relevant clauses of a travel product, such as the clause contents contained in the booking restrictions, booking instructions and product description sections, with the aim of ensuring that the relevant clauses of a company's travel products are reasonable and logically consistent, thereby fully safeguarding the rights and interests of consumers and providing customers with the most satisfactory service. Of course, the clause logic identification method is not limited to this embodiment; it can also be applied to other service products or physical products, and even to scenarios such as rules and regulations, so as to reduce the labor cost of manual inspection and improve the accuracy of identification, and clauses found to be contradictory can then be specifically revised and improved.
Example 3
Fig. 2 shows a contradictory sentence recognition system based on deep learning of the present embodiment. The contradictory sentence recognition system includes:
and the model construction module 21 is used for constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts.
And the model input module 22 is used for preprocessing the two compared texts and respectively converting the two compared texts into matrices to serve as the input of the model.
And the model output module 23 is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
In this embodiment, the model construction module 21 may be specifically configured to:
(1) acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypthesis text into a matrix according to a word vector of each word (the word in the embodiment can be understood as a character and has the same meaning), and taking the matrix as the input of a BilSTM unit to obtain preliminary semantic representations corresponding to the premise text and the hypthesis text, wherein the preliminary semantic representations are respectively as follows: BiLSTM output vector sequence corresponding to premix textBilsTM output vector sequence corresponding to hypthesis text
The specific process of converting the premise text into the matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premix text into a word vector with a fixed length of n (such as n being 100);
(2-2) setting the maximum number of words in the text participating in training to m (for example, m is 100), if the actual number of words in the premix text is less than m, complementing the insufficient part with < PAD > characters, and if the actual number of words in the premix text exceeds m, deleting words except m;
(2-3) converting the premise text into an m x n matrix as input [ x ] according to the word vector of each word in the text1,x2,…,xm]。
Referring to the conversion processes (2-1) - (2-3) of the premix text, the hypthesis text can be converted into an m × n matrix according to the word vector of each word, and the detailed process is not repeated.
Taking the matrix converted from the premise text as the input of a BiLSTM unit to obtain the preliminary semantic representation corresponding to the premise text, namely the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m]. The specific process is as follows:
(2-4) Let the outputs of the forget gate unit and the input gate unit at the current time be f_t and i_t respectively, and let the candidate update of the cell state at the current time be C̃_t, satisfying:
f_t = σ(W_f·[x_t, h_{t-1}] + b_f)
i_t = σ(W_i·[x_t, h_{t-1}] + b_i)
C̃_t = tanh(W_c·[x_t, h_{t-1}] + b_c)
wherein x_t is the input at the current time, h_{t-1} is the hidden-layer state at the previous time, W_f, W_i and W_c are respectively the weight matrices of the forget gate unit, the input gate unit and the cell-state update, b_f, b_i and b_c are respectively the bias vectors of the forget gate unit, the input gate unit and the cell-state update, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;
(2-5) updating the cell state C_t by the formula C_t = f_t * C_{t-1} + i_t * C̃_t;
(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form a vector sequence [h_1, h_2, …, h_m] of m steps:
o_t = σ(W_o·[x_t, h_{t-1}] + b_o)
h_t = o_t * tanh(C_t)
wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence [→h^p_1, …, →h^p_m] corresponding to the premise text, and processing the premise text in the backward direction through steps (2-4)-(2-6) to obtain the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text;
(2-8) combining the forward output vector sequence [→h^p_1, …, →h^p_m] and the backward output vector sequence [←h^p_1, …, ←h^p_m] corresponding to the premise text along the last dimension, i.e. h^p_t = [→h^p_t; ←h^p_t], to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted [h^p_1, h^p_2, …, h^p_m].
Referring to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain its forward output vector sequence [→h^h_1, …, →h^h_m], and in the backward direction through steps (2-4)-(2-6) to obtain its backward output vector sequence [←h^h_1, …, ←h^h_m]; the two are combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted [h^h_1, h^h_2, …, h^h_m]. The detailed process is not repeated.
(3) Performing maximum pooling on the preliminary semantic representation [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and further computing the query vector q from h̄.
(4) Taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula
c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t is the matching score between q and the t-th premise vector,
obtaining the Attention vector a: a = tanh(W_att1·c + W_att2·q).
(5) Feeding the Attention vector a into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ:
loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)]
(7) Minimizing the loss using the Adam (adaptive moment estimation) optimization algorithm, and performing multiple iterations of training to obtain the final model.
In this embodiment, when extracting the semantics of the hypothesis text, the model construction module 21 condenses them into a single vector by the maximum pooling operation and then matches that vector against each word vector in the premise text, which greatly reduces the training cost of the model while guaranteeing inference accuracy.
The model input module 22 preprocesses the two compared texts in the same way as the preprocessing in the model construction module 21, and, referring to the conversion processes (2-1)-(2-3) for the premise text, converts each of the two compared texts into an m × n matrix according to the word vector of each word; the detailed process is not repeated. The model output module 23 can further decide from the probability whether the two compared texts are contradictory: for example, if the probability exceeds 50%, the two compared texts can generally be considered contradictory, and if it does not exceed 50%, they can generally be considered not contradictory.
Example 4
Fig. 3 shows a deep learning-based clause logic identification system according to the present embodiment. The clause logic identification system comprises:
a clause input module 31, configured to convert the two compared clause texts into matrices, respectively, as an input of a contradictory sentence recognition model constructed by the contradictory sentence recognition system of embodiment 3;
and the probability output module 32 is used for obtaining the probability that the two compared clause texts are contradictory through the calculation of the model.
Of course, the clause input module may pre-process the two compared clause texts with reference to the model input module 22 of embodiment 3 before converting the two compared clause texts into matrices, respectively.
The clause logic identification system of this embodiment can be applied to intelligent identification of the logical relationships among the relevant clauses of a travel product, such as the clause contents contained in the booking restrictions, booking instructions and product description sections, with the aim of ensuring that the relevant clauses of a company's travel products are reasonable and logically consistent, thereby fully safeguarding the rights and interests of consumers and providing customers with the most satisfactory service. Of course, the clause logic identification system is not limited to this embodiment; it can also be applied to other service products or physical products, and even to scenarios such as rules and regulations, so as to reduce the labor cost of manual inspection and improve the accuracy of identification, and clauses found to be contradictory can then be specifically revised and improved.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (10)

1. A contradictory sentence identification method based on deep learning, characterized in that the contradictory sentence identification method comprises the following steps:
constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, wherein the model is used for predicting the probability of contradiction between two texts;
respectively converting the two compared texts into matrices to be used as the input of the model;
and obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
2. The contradictory statement identification method according to claim 1, characterized in that the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
3. The contradictory sentence identification method of claim 2, wherein the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
4. The contradictory statement identification method according to claim 2, characterized in that the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
5. A clause logic identification method based on deep learning, characterized in that the clause logic identification method comprises the following steps:
converting the two compared clause texts into matrices respectively, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of any one of claims 1 to 4;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
6. A contradictory sentence recognition system based on deep learning, the contradictory sentence recognition system comprising:
the model construction module is used for constructing a contradictory statement identification model based on a BiLSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts;
the model input module is used for respectively converting the two compared texts into matrices to be used as the input of the model;
and the model output module is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
7. The contradictory statement identification system of claim 6, wherein the model building module is to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, which are respectively the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text and the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation corresponding to the hypothesis text along the step dimension to extract the most important semantic representation as the final semantic representation h̄ of the hypothesis text, and computing a query vector q from h̄;
taking q together with [h^p_1, h^p_2, …, h^p_m] as the input of Attention to word-match the hypothesis text with the premise text, and computing a context vector c by the formula c = Σ_t α_t·h^p_t,
wherein α_t = exp(e_t) / Σ_j exp(e_j) and e_t = qᵀ·h^p_t,
obtaining an Attention vector a: a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a binary-classification deep neural network model for inferring the contradiction relationship to obtain the predicted value ŷ of the binary-classification deep neural network model, the predicted value ŷ representing the probability that the premise text and the hypothesis text contradict each other;
calculating the cross-entropy loss according to the true value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss using an optimization algorithm, and performing repeated iterative training to obtain the final model.
8. The contradictory statement identification system of claim 7, wherein the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence [h^p_1, h^p_2, …, h^p_m] corresponding to the premise text;
the model building module is further configured to:
calculating a forward output vector sequence and a backward output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the backward output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence [h^h_1, h^h_2, …, h^h_m] corresponding to the hypothesis text.
9. The contradictory sentence recognition system of claim 7, wherein the model building module is further configured to pre-process the raw training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
10. A clause logic identification system based on deep learning, the clause logic identification system comprising:
a clause input module, configured to convert the two compared clause texts into matrices, respectively, as an input of a contradictory sentence recognition model constructed using the contradictory sentence recognition system according to any one of claims 6 to 9;
and the probability output module is used for obtaining the probability that the two compared clause texts are contradictory through the calculation of the model.
CN201811635859.4A 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system Active CN109710943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811635859.4A CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811635859.4A CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Publications (2)

Publication Number Publication Date
CN109710943A true CN109710943A (en) 2019-05-03
CN109710943B CN109710943B (en) 2022-12-20

Family

ID=66259511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811635859.4A Active CN109710943B (en) 2018-12-29 2018-12-29 Contradictory statement identification method and system and clause logic identification method and system

Country Status (1)

Country Link
CN (1) CN109710943B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110618980A (en) * 2019-09-09 2019-12-27 上海交通大学 System and method based on legal text accurate matching and contradiction detection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070255555A1 (en) * 2006-05-01 2007-11-01 Palo Alto Research Center Incorporated Systems and methods for detecting entailment and contradiction
WO2014132456A1 (en) * 2013-02-28 2014-09-04 Nec Corporation Method and system for determining non-entailment and contradiction of text pairs
WO2015053236A1 (en) * 2013-10-08 2015-04-16 独立行政法人情報通信研究機構 Device for collecting contradictory expression and computer program for same
CN108647207A (en) * 2018-05-08 2018-10-12 上海携程国际旅行社有限公司 Natural language modification method, system, equipment and storage medium


Also Published As

Publication number Publication date
CN109710943B (en) 2022-12-20

Similar Documents

Publication Publication Date Title
WO2022022163A1 (en) Text classification model training method, device, apparatus, and storage medium
CN111444340B (en) Text classification method, device, equipment and storage medium
CN111626063B (en) Text intention identification method and system based on projection gradient descent and label smoothing
CN110263323B (en) Keyword extraction method and system based on barrier type long-time memory neural network
CN108984526B (en) Document theme vector extraction method based on deep learning
CN108009148B (en) Text emotion classification representation method based on deep learning
CN110263325B (en) Chinese word segmentation system
CN110321563B (en) Text emotion analysis method based on hybrid supervision model
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN109919175B (en) Entity multi-classification method combined with attribute information
CN109086269B (en) Semantic bilingual recognition method based on semantic resource word representation and collocation relationship
CN110532395B (en) Semantic embedding-based word vector improvement model establishing method
CN110825849A (en) Text information emotion analysis method, device, medium and electronic equipment
CN112287106A (en) Online comment emotion classification method based on dual-channel hybrid neural network
CN113282714B (en) Event detection method based on differential word vector representation
CN111368542A (en) Text language association extraction method and system based on recurrent neural network
CN114462420A (en) False news detection method based on feature fusion model
CN113821635A (en) Text abstract generation method and system for financial field
CN113488196A (en) Drug specification text named entity recognition modeling method
CN111858878A (en) Method, system and storage medium for automatically extracting answer from natural language text
CN113553510A (en) Text information recommendation method and device and readable medium
CN114417872A (en) Contract text named entity recognition method and system
CN113869054A (en) Deep learning-based electric power field project feature identification method
CN113837307A (en) Data similarity calculation method and device, readable medium and electronic equipment
Chan et al. Applying and optimizing NLP model with CARU

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant