CN109710943A - Inconsistent statement recognition methods and system and clause logic discrimination method and system - Google Patents
- Publication number
- CN109710943A (application CN201811635859.4A)
- Authority
- CN
- China
- Prior art keywords
- text
- hypothesis
- model
- vector sequence
- output vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a contradictory sentence identification method and system and a clause logic identification method and system. The contradictory sentence identification method comprises: constructing a contradictory sentence identification model based on a BiLSTM-Attention mechanism, the model being used for predicting the probability that two texts contradict each other; respectively converting the two compared texts into matrices as the input of the model; and obtaining, through the calculation of the model, the probability that the two compared texts contradict each other. The invention solves the natural language inference problem using deep learning technology, which can reduce the labor cost of feature extraction and substantially improve inference accuracy.
Description
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
Background
Inference is one of the core research topics in the field of artificial intelligence, and natural language inference is an important research branch of natural language processing; it is the research basis of tasks such as question-answering systems, information retrieval and automatic summarization, and has wide application in many business scenarios. One core problem in natural language inference is: given two pieces of text, a premise and a hypothesis, determine whether there is a contradiction between the two at the semantic level.
Because of the diversity of language expression and the ambiguity of semantic understanding, and especially the large number of polysemous words and synonyms in Chinese text, natural language inference is more difficult than expected, and its accuracy still has much room for improvement.
In addition, the logical relationships among the relevant clauses of a product are usually checked manually; an intelligent means is lacking, considerable labor cost is consumed, and the accuracy of the check also needs to be improved.
Disclosure of Invention
The invention aims to overcome the defects of the prior art that natural language inference is difficult and inference accuracy is low, and provides a method and a system for identifying contradictory sentences and a method and a system for logically identifying clauses.
The invention solves the technical problems through the following technical scheme:
a contradiction sentence identification method based on deep learning comprises the following steps:
constructing a contradiction statement identification model based on a BilSTM (long-short term memory network) -Attention mechanism, wherein the model is used for predicting the probability of contradiction between two texts;
respectively converting the two compared texts into matrixes to be used as the input of the model;
and calculating the model to obtain the probability that the two compared texts are contradictory.
Preferably, the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, namely the BiLSTM output vector sequence h^p corresponding to the premise text and the BiLSTM output vector sequence h^h corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation h^h of the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation of the hypothesis text, and computing a Query vector q from it;
taking q together with h^p as the input of Attention to match the hypothesis text word by word against the premise text, computing attention weights α_t, normalized over the steps, for the premise output vectors and the context vector c = Σ_t α_t·h_t^p, and obtaining an Attention vector a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relationship to obtain its predicted value ŷ, the predicted value ŷ representing the probability of contradiction between the premise text and the hypothesis text;
calculating the cross-entropy loss from the real value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss with an optimization algorithm and performing repeated iterative training to obtain the final model.
Preferably, the BiLSTM output vector sequence h^p corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text.
The BiLSTM output vector sequence h^h corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
Preferably, the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification method based on deep learning, the clause logic identification method comprising:
respectively converting the two compared clause texts into matrices to be used as the input of a contradictory sentence identification model constructed by the above contradictory sentence identification method;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
A contradictory sentence identification system based on deep learning, the contradictory sentence identification system comprising:
the model construction module is used for constructing a contradiction statement identification model based on a BilSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts;
the model input module is used for respectively converting the two compared texts into matrixes to be used as the input of the model;
and the model output module is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
Preferably, the model building module is configured to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, namely the BiLSTM output vector sequence h^p corresponding to the premise text and the BiLSTM output vector sequence h^h corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation h^h of the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation of the hypothesis text, and computing a Query vector q from it;
taking q together with h^p as the input of Attention to match the hypothesis text word by word against the premise text, computing attention weights α_t, normalized over the steps, for the premise output vectors and the context vector c = Σ_t α_t·h_t^p, and obtaining an Attention vector a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relationship to obtain its predicted value ŷ, the predicted value ŷ representing the probability of contradiction between the premise text and the hypothesis text;
calculating the cross-entropy loss from the real value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss with an optimization algorithm and performing repeated iterative training to obtain the final model.
Preferably, the model building module is further configured to:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text.
The model construction module is further configured to:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
Preferably, the model building module is further configured to preprocess the original training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
A clause logic identification system based on deep learning, the clause logic identification system comprising:
the clause input module is used for respectively converting the two compared clause texts into matrices to be used as the input of the contradictory sentence identification model constructed by the contradictory sentence identification system;
and the probability output module is used for obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The positive progress effects of the invention are as follows: the invention solves the natural language inference problem by using deep learning technology, which reduces the labor cost of feature extraction and greatly improves the accuracy of inference.
Drawings
Fig. 1 is a flowchart of a contradictory sentence identification method based on deep learning according to embodiment 1 of the present invention.
Fig. 2 is a schematic block diagram of a contradictory sentence recognition system based on deep learning according to embodiment 3 of the present invention.
Fig. 3 is a schematic block diagram of a clause logic identification system based on deep learning according to embodiment 4 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
Fig. 1 shows the contradictory sentence identification method based on deep learning according to the present embodiment. The contradictory sentence identification method includes:
step 11: based on a BilSTM-Attention mechanism, a contradiction sentence identification model is constructed, and the model is used for predicting the probability of contradiction between two texts.
Step 12: the two texts being compared are preprocessed.
Step 13: respectively converting the two texts to be compared into matrices to be used as the input of the model.
Step 14: obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
In this embodiment, step 11 may specifically include:
(1) acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypothesis text into matrices according to the word vector of each word (a "word" in this embodiment may be understood as a character; the two are used interchangeably), and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, namely the BiLSTM output vector sequence h^p corresponding to the premise text and the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
The specific process of converting the premise text into the matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premise text into a word vector with a fixed length n (e.g., n = 100);
(2-2) setting the maximum number of words participating in training to m (e.g., m = 100); if the actual number of words in the premise text is less than m, the insufficient part is padded with <PAD> characters, and if it exceeds m, the words beyond m are truncated;
(2-3) converting the premise text into an m × n matrix [x_1, x_2, …, x_m] as input according to the word vector of each word in the text.
Referring to the conversion steps (2-1)-(2-3) for the premise text, the hypothesis text can likewise be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated.
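A sketch of this conversion, assuming a NumPy embedding table indexed by the dictionary codes above (the table itself, its size, and the <PAD> id 0 are illustrative):

```python
import numpy as np

def to_matrix(token_ids, embedding, m=100):
    """Pad with the <PAD> id (0) or truncate to m tokens, then look up the
    n-dimensional word vector of each token -> an m x n matrix."""
    ids = list(token_ids[:m]) + [0] * max(0, m - len(token_ids))
    return embedding[ids]                                      # shape (m, n)

embedding = np.random.randn(5000, 100).astype(np.float32)     # illustrative table, n = 100
premise_matrix = to_matrix([12, 7, 304], embedding)           # -> (100, 100)
```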
Taking the matrix converted from the premise text as the input of a BiLSTM unit to obtain the preliminary semantic representation corresponding to the premise text, i.e. the BiLSTM output vector sequence h^p. The specific process is as follows:
(2-4) setting the outputs of the forgetting gate unit and the input gate unit at the current moment to be f_t and i_t respectively, and the candidate update value of the cell state at the current moment to be C̃_t, satisfying:

f_t = σ(W_f·[x_t, h_{t−1}] + b_f)

i_t = σ(W_i·[x_t, h_{t−1}] + b_i)

C̃_t = tanh(W_c·[x_t, h_{t−1}] + b_c)

wherein x_t is the input at the current moment, h_{t−1} is the hidden-layer state at the previous moment, W_f, W_i and W_c are the weight matrices of the forgetting gate unit, the input gate unit and the cell-state update respectively, b_f, b_i and b_c are the bias vectors of the forgetting gate unit, the input gate unit and the cell-state update respectively, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;

(2-5) updating the cell state C_t by the formula C_t = f_t ∗ C_{t−1} + i_t ∗ C̃_t;

(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form the vector sequence [h_1, h_2, …, h_m]:

o_t = σ(W_o·[x_t, h_{t−1}] + b_o)

h_t = o_t ∗ tanh(C_t)

wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence corresponding to the premise text, and processing the premise text in the reverse direction through steps (2-4)-(2-6) to obtain the reverse output vector sequence corresponding to the premise text;
(2-8) combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted h^p.
Referring to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain its forward output vector sequence, and processed in the reverse direction through steps (2-4)-(2-6) to obtain its reverse output vector sequence; the forward and reverse output vector sequences corresponding to the hypothesis text are then combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted h^h. The detailed process is not repeated.
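The per-step gate formulas (2-4)-(2-6) and the bidirectional combination (2-7)-(2-8) can be sketched as follows. This is a plain NumPy illustration with per-gate weight dictionaries, not the patent's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of formulas (2-4)-(2-6); W and b are dicts of per-gate parameters."""
    z = np.concatenate([x_t, h_prev])          # [x_t, h_{t-1}]
    f = sigmoid(W["f"] @ z + b["f"])           # forget gate f_t
    i = sigmoid(W["i"] @ z + b["i"])           # input gate i_t
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate cell state
    c = f * c_prev + i * c_tilde               # cell-state update (2-5)
    o = sigmoid(W["o"] @ z + b["o"])           # output gate o_t
    h = o * np.tanh(c)                         # hidden output h_t (2-6)
    return h, c

def bilstm(xs, fw_params, bw_params, d):
    """Forward pass, reverse pass, and concatenation along the last dimension (2-7)-(2-8)."""
    def run(seq, params):
        W, b = params
        h, c, outs = np.zeros(d), np.zeros(d), []
        for x_t in seq:
            h, c = lstm_step(x_t, h, c, W, b)
            outs.append(h)
        return outs
    fw = run(xs, fw_params)
    bw = run(xs[::-1], bw_params)[::-1]
    return np.stack([np.concatenate([f_t, b_t]) for f_t, b_t in zip(fw, bw)])  # (m, 2d)
```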
(3) Performing maximum pooling on the preliminary semantic representation h^h of the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation of the hypothesis text, and further computing the Query vector q from it.
(4) Taking q together with h^p as the input of Attention to match the hypothesis text word by word against the premise text: attention weights α_t, normalized over the steps, are computed for the premise output vectors, their weighted sum c = Σ_t α_t·h_t^p is taken as the context vector, and the Attention vector a is obtained as a = tanh(W_att1·c + W_att2·q).
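A sketch of steps (3) and (4), under the assumption that the attention weights are a softmax over dot-product scores between q and each premise output vector; the patent's exact score formula is not reproduced in the text, so the scoring rule here is an assumption:

```python
import numpy as np

def attention_vector(h_p, h_h, W_att1, W_att2):
    """h_p, h_h: (m, 2d) BiLSTM output sequences of the premise and hypothesis."""
    q = h_h.max(axis=0)                        # max pooling along the step dimension -> query q
    scores = h_p @ q                           # assumed dot-product score per premise step
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                       # softmax attention weights
    c = alpha @ h_p                            # context vector over the premise
    return np.tanh(W_att1 @ c + W_att2 @ q)    # a = tanh(W_att1·c + W_att2·q)
```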
(5) Feeding the Attention vector a into a two-class deep neural network model that infers the contradiction relationship to obtain its predicted value ŷ; the predicted value ŷ represents the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss from the real value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)].
(7) Minimizing the loss with the Adam (adaptive moment estimation) optimization algorithm and performing multiple rounds of iterative training to obtain the final model.
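Putting the pieces together, here is a compact trainable sketch. PyTorch, the layer sizes, the ReLU classifier head, and the sharing of one BiLSTM between the two texts are all assumptions, since the patent names no framework or head architecture:

```python
import torch
import torch.nn as nn

class ContradictionModel(nn.Module):
    def __init__(self, vocab_size=5000, n=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, n, padding_idx=0)
        self.bilstm = nn.LSTM(n, hidden, bidirectional=True, batch_first=True)
        self.W_att1 = nn.Linear(2 * hidden, 2 * hidden, bias=False)
        self.W_att2 = nn.Linear(2 * hidden, 2 * hidden, bias=False)
        self.head = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, premise, hypothesis):
        h_p, _ = self.bilstm(self.emb(premise))            # (B, m, 2*hidden)
        h_h, _ = self.bilstm(self.emb(hypothesis))
        q = h_h.max(dim=1).values                          # max pool -> query (B, 2*hidden)
        scores = torch.bmm(h_p, q.unsqueeze(2)).squeeze(2)
        alpha = torch.softmax(scores, dim=1)               # attention over premise steps
        c = torch.bmm(alpha.unsqueeze(1), h_p).squeeze(1)  # context vector
        a = torch.tanh(self.W_att1(c) + self.W_att2(q))    # Attention vector
        return self.head(a).squeeze(1)                     # P(contradiction)

model = ContradictionModel()
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.BCELoss()  # binary cross-entropy, matching the loss in step (6)
# one training step: y_hat = model(p, h); loss = loss_fn(y_hat, y); loss.backward(); opt.step()
```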
Through the above steps, when the semantics of the hypothesis text are extracted, they are condensed into a single vector by the maximum pooling operation and then matched against each word vector in the premise text, which guarantees inference accuracy while greatly reducing the training cost of the model.
In step 12, the preprocessing of the two texts to be compared is the same as the preprocessing in step (1). In step 13, referring to the conversion steps (2-1)-(2-3) for the premise text, the two texts to be compared can each be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated. In step 14, whether the two compared texts are contradictory can further be judged from the probability: for example, if the probability exceeds 50%, the two texts can generally be considered contradictory, and if it does not exceed 50%, they can generally be considered not contradictory.
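Hypothetical usage of the sketch above, applying the 50% decision threshold; the random ids stand in for two dictionary-encoded texts padded to m = 100:

```python
model.eval()
with torch.no_grad():
    premise = torch.randint(1, 5000, (1, 100))     # stand-in for an encoded premise text
    hypothesis = torch.randint(1, 5000, (1, 100))  # stand-in for an encoded hypothesis text
    prob = float(model(premise, hypothesis)[0])
print("contradictory" if prob > 0.5 else "not contradictory", round(prob, 3))
```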
Example 2
The embodiment provides a clause logic identification method based on deep learning. The clause logic identification method comprises the following steps:
converting the two compared clause texts into matrices respectively, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of embodiment 1;
and calculating the model to obtain the probability that the two compared clause texts are contradictory.
Of course, before converting the two compared clause texts into matrices, the clause logic identification method of this embodiment may preprocess them with reference to step 12 of embodiment 1.
The clause logic identification method of this embodiment can be applied to intelligent identification of the logical relationships among the relevant clauses of a travel product, such as the clause contents contained in the booking restrictions, booking instructions and product description sections, so as to ensure that the relevant clauses of a company's travel products are reasonable and logically consistent, thereby fully protecting consumers' rights and interests and providing satisfactory service to customers. Of course, the application of the clause logic identification method is not limited to this embodiment: it can also be applied to other service or physical products, and even to scenarios such as regulations and rule clauses, to reduce the labor cost of manual inspection and improve identification accuracy; clauses found to be contradictory can then be specifically revised and improved.
Example 3
Fig. 2 shows a contradictory sentence recognition system based on deep learning of the present embodiment. The contradictory sentence recognition system includes:
and the model construction module 21 is used for constructing a contradiction statement identification model based on a BilSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts.
And the model input module 22 is used for preprocessing the two compared texts and respectively converting the two compared texts into matrixes to serve as the input of the model.
And the model output module 23 is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
In this embodiment, the model construction module 21 may be specifically configured to:
(1) acquiring original training text data, and preprocessing the original training text data;
the original training text data comprises a premix text and a hypothesis text, and the preprocessing at least comprises denoising, word segmentation and dictionary coding.
(2) Converting the premise text and the hypthesis text into a matrix according to a word vector of each word (the word in the embodiment can be understood as a character and has the same meaning), and taking the matrix as the input of a BilSTM unit to obtain preliminary semantic representations corresponding to the premise text and the hypthesis text, wherein the preliminary semantic representations are respectively as follows: BiLSTM output vector sequence corresponding to premix textBilsTM output vector sequence corresponding to hypthesis text
The specific process of converting the premise text into the matrix according to the word vector of each word is as follows:
(2-1) converting each word in the premise text into a word vector with a fixed length n (e.g., n = 100);
(2-2) setting the maximum number of words participating in training to m (e.g., m = 100); if the actual number of words in the premise text is less than m, the insufficient part is padded with <PAD> characters, and if it exceeds m, the words beyond m are truncated;
(2-3) converting the premise text into an m × n matrix [x_1, x_2, …, x_m] as input according to the word vector of each word in the text.
Referring to the conversion steps (2-1)-(2-3) for the premise text, the hypothesis text can likewise be converted into an m × n matrix according to the word vector of each word; the detailed process is not repeated.
Taking the matrix converted from the premise text as the input of a BiLSTM unit to obtain the preliminary semantic representation corresponding to the premise text, i.e. the BiLSTM output vector sequence h^p. The specific process is as follows:
(2-4) setting the outputs of the forgetting gate unit and the input gate unit at the current moment to be f_t and i_t respectively, and the candidate update value of the cell state at the current moment to be C̃_t, satisfying:

f_t = σ(W_f·[x_t, h_{t−1}] + b_f)

i_t = σ(W_i·[x_t, h_{t−1}] + b_i)

C̃_t = tanh(W_c·[x_t, h_{t−1}] + b_c)

wherein x_t is the input at the current moment, h_{t−1} is the hidden-layer state at the previous moment, W_f, W_i and W_c are the weight matrices of the forgetting gate unit, the input gate unit and the cell-state update respectively, b_f, b_i and b_c are the bias vectors of the forgetting gate unit, the input gate unit and the cell-state update respectively, σ is the sigmoid activation function, and tanh is the hyperbolic tangent function;

(2-5) updating the cell state C_t by the formula C_t = f_t ∗ C_{t−1} + i_t ∗ C̃_t;

(2-6) obtaining the output h_t of each hidden node according to the following formulas, and connecting the h_t in sequence to form the vector sequence [h_1, h_2, …, h_m]:

o_t = σ(W_o·[x_t, h_{t−1}] + b_o)

h_t = o_t ∗ tanh(C_t)

wherein W_o is the weight matrix of the output gate unit, b_o is the bias vector of the output gate unit, and o_t is the output of the output gate unit;
(2-7) processing the premise text in the forward direction through steps (2-4)-(2-6) to obtain the forward output vector sequence corresponding to the premise text, and processing the premise text in the reverse direction through steps (2-4)-(2-6) to obtain the reverse output vector sequence corresponding to the premise text;
(2-8) combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence corresponding to the premise text, denoted h^p.
Referring to steps (2-4)-(2-8), the hypothesis text is processed in the forward direction through steps (2-4)-(2-6) to obtain its forward output vector sequence, and processed in the reverse direction through steps (2-4)-(2-6) to obtain its reverse output vector sequence; the forward and reverse output vector sequences corresponding to the hypothesis text are then combined along the last dimension to obtain the BiLSTM output vector sequence corresponding to the hypothesis text, denoted h^h. The detailed process is not repeated.
(3) Performing maximum pooling on the preliminary semantic representation h^h of the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation of the hypothesis text, and further computing the Query vector q from it.
(4) Taking q together with h^p as the input of Attention to match the hypothesis text word by word against the premise text: attention weights α_t, normalized over the steps, are computed for the premise output vectors, their weighted sum c = Σ_t α_t·h_t^p is taken as the context vector, and the Attention vector a is obtained as a = tanh(W_att1·c + W_att2·q).
(5) Feeding the Attention vector a into a two-class deep neural network model that infers the contradiction relationship to obtain its predicted value ŷ; the predicted value ŷ represents the probability that the premise text and the hypothesis text contradict each other.
(6) Calculating the cross-entropy loss from the real value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)].
(7) Minimizing the loss with the Adam (adaptive moment estimation) optimization algorithm and performing multiple rounds of iterative training to obtain the final model.
In this embodiment, when extracting the semantics of the hypothesis text, the model construction module 21 condenses them into a single vector by the maximum pooling operation and then matches that vector against each word vector in the premise text, which greatly reduces the training cost of the model while guaranteeing inference accuracy.
The model input module 22 preprocesses the two compared texts in the same way as the preprocessing in the model construction module 21; referring to the conversion steps (2-1)-(2-3) for the premise text, the two compared texts can each be converted into an m × n matrix according to the word vector of each word, and the specific process is not repeated. The model output module 23 can further judge from the probability whether the two compared texts are contradictory: for example, if the probability exceeds 50%, the two texts can generally be considered contradictory, and if it does not exceed 50%, they can generally be considered not contradictory.
Example 4
Fig. 3 shows the deep learning-based clause logic identification system according to the present embodiment. The clause logic identification system comprises:
a clause input module 31, configured to convert the two compared clause texts into matrices, respectively, as an input of a contradictory sentence recognition model constructed by the contradictory sentence recognition system of embodiment 3;
and the probability output module 32 is used for obtaining the probability that the two compared clause texts are contradictory through the calculation of the model.
Of course, the clause input module may pre-process the two compared clause texts with reference to the model input module 22 of embodiment 3 before converting the two compared clause texts into matrices, respectively.
The clause logic identification system of this embodiment can be applied to intelligent identification of the logical relationships among the relevant clauses of a travel product, such as the clause contents contained in the booking restrictions, booking instructions and product description sections, so as to ensure that the relevant clauses of a company's travel products are reasonable and logically consistent, thereby fully protecting consumers' rights and interests and providing satisfactory service to customers. Of course, the application of the clause logic identification system is not limited to this embodiment: it can also be applied to other service or physical products, and even to scenarios such as regulations and rule clauses, to reduce the labor cost of manual inspection and improve identification accuracy; clauses found to be contradictory can then be specifically revised and improved.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.
Claims (10)
1. A contradictory sentence identification method based on deep learning, characterized in that the contradictory sentence identification method comprises the following steps:
constructing a contradictory sentence identification model based on a BiLSTM-Attention mechanism, wherein the model is used for predicting the probability of contradiction between two texts;
respectively converting the two compared texts into matrices to be used as the input of the model;
and obtaining, through the calculation of the model, the probability that the two compared texts are contradictory.
2. The contradictory sentence identification method according to claim 1, characterized in that the step of constructing comprises:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, namely the BiLSTM output vector sequence h^p corresponding to the premise text and the BiLSTM output vector sequence h^h corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation h^h of the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation of the hypothesis text, and computing a Query vector q from it;
taking q together with h^p as the input of Attention to match the hypothesis text word by word against the premise text, computing attention weights α_t, normalized over the steps, for the premise output vectors and the context vector c = Σ_t α_t·h_t^p, and obtaining an Attention vector a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relationship to obtain its predicted value ŷ, the predicted value ŷ representing the probability of contradiction between the premise text and the hypothesis text;
calculating the cross-entropy loss from the real value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss with an optimization algorithm and performing repeated iterative training to obtain the final model.
3. The contradictory sentence identification method of claim 2, wherein the BiLSTM output vector sequence h^p corresponding to the premise text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text;
and the BiLSTM output vector sequence h^h corresponding to the hypothesis text is calculated by the following steps:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
4. The contradictory statement identification method according to claim 2, characterized in that the step of constructing further comprises: preprocessing the original training text data;
the contradictory statement identification method further includes: preprocessing the two texts to be compared;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
5. A clause logic identification method based on deep learning, characterized in that the clause logic identification method comprises the following steps:
converting the two compared clause texts into matrices respectively, to be used as the input of a contradictory sentence identification model constructed by the contradictory sentence identification method of any one of claims 1 to 4;
and obtaining, through the calculation of the model, the probability that the two compared clause texts are contradictory.
6. A contradictory sentence recognition system based on deep learning, the contradictory sentence recognition system comprising:
the model construction module is used for constructing a contradiction statement identification model based on a BilSTM-Attention mechanism, and the model is used for predicting the probability of contradiction between two texts;
the model input module is used for respectively converting the two compared texts into matrixes to be used as the input of the model;
and the model output module is used for obtaining the probability that the two compared texts are contradictory through the calculation of the model.
7. The contradictory sentence identification system of claim 6, wherein the model construction module is configured to:
acquiring original training text data, wherein the original training text data comprises a premise text and a hypothesis text;
converting the premise text and the hypothesis text into matrices according to the word vector of each word, and taking the matrices as the input of a BiLSTM unit to obtain the preliminary semantic representations corresponding to the premise text and the hypothesis text, namely the BiLSTM output vector sequence h^p corresponding to the premise text and the BiLSTM output vector sequence h^h corresponding to the hypothesis text;
performing maximum pooling on the preliminary semantic representation h^h of the hypothesis text along the step dimension to extract the most important semantics as the final semantic representation of the hypothesis text, and computing a Query vector q from it;
taking q together with h^p as the input of Attention to match the hypothesis text word by word against the premise text, computing attention weights α_t, normalized over the steps, for the premise output vectors and the context vector c = Σ_t α_t·h_t^p, and obtaining an Attention vector a = tanh(W_att1·c + W_att2·q);
feeding the Attention vector into a two-class deep neural network model that infers the contradiction relationship to obtain its predicted value ŷ, the predicted value ŷ representing the probability of contradiction between the premise text and the hypothesis text;
calculating the cross-entropy loss from the real value y of the relationship between the premise text and the hypothesis text and the predicted value ŷ: loss = −[y·log ŷ + (1 − y)·log(1 − ŷ)];
and minimizing the loss with an optimization algorithm and performing repeated iterative training to obtain the final model.
8. The contradictory sentence identification system of claim 7, wherein the model construction module is further configured to:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the premise text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the premise text along the last dimension to obtain the BiLSTM output vector sequence h^p corresponding to the premise text;
and the model construction module is further configured to:
calculating a forward output vector sequence and a reverse output vector sequence corresponding to the hypothesis text;
combining the forward output vector sequence and the reverse output vector sequence corresponding to the hypothesis text along the last dimension to obtain the BiLSTM output vector sequence h^h corresponding to the hypothesis text.
9. The contradictory sentence recognition system of claim 7, wherein the model building module is further configured to pre-process the raw training text data;
the model input module is also used for preprocessing the two compared texts;
the preprocessing at least comprises denoising, word segmentation and dictionary coding.
10. A clause logic identification system based on deep learning, characterized in that the clause logic identification system comprises:
a clause input module, configured to convert the two compared clause texts into matrices, respectively, as the input of a contradictory sentence identification model constructed using the contradictory sentence identification system according to any one of claims 6 to 9;
and the probability output module is used for obtaining the probability that the two compared clause texts are contradictory through the calculation of the model.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201811635859.4A (CN109710943B) | 2018-12-29 | 2018-12-29 | Contradictory statement identification method and system and clause logic identification method and system
Publications (2)

Publication Number | Publication Date
---|---
CN109710943A | 2019-05-03
CN109710943B | 2022-12-20

Family ID: 66259511
Patent Citations (4)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US20070255555A1 | 2006-05-01 | 2007-11-01 | Palo Alto Research Center Incorporated | Systems and methods for detecting entailment and contradiction
WO2014132456A1 | 2013-02-28 | 2014-09-04 | NEC Corporation | Method and system for determining non-entailment and contradiction of text pairs
WO2015053236A1 | 2013-10-08 | 2015-04-16 | 独立行政法人情報通信研究機構 | Device for collecting contradictory expression and computer program for same
CN108647207A | 2018-05-08 | 2018-10-12 | 上海携程国际旅行社有限公司 | Natural language modification method, system, equipment and storage medium

Cited By (1)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN110618980A | 2019-09-09 | 2019-12-27 | 上海交通大学 | System and method based on legal text accurate matching and contradiction detection
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant