CN111914560B - Text inclusion relation recognition method, device, equipment and storage medium - Google Patents

Text inclusion relation recognition method, device, equipment and storage medium

Info

Publication number
CN111914560B
CN111914560B
Authority
CN
China
Prior art keywords
text
network
feature
training
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010762885.4A
Other languages
Chinese (zh)
Other versions
CN111914560A (en)
Inventor
王烨
王燕蒙
郝正鸿
王少军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010762885.4A
Publication of CN111914560A
Application granted
Publication of CN111914560B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of big data analysis and discloses a method, device, equipment, and storage medium for recognizing text implication relationships. The method comprises: acquiring a training text pair labeled with an implication relation; extracting a text feature vector and an implication feature vector of the training text pair and splicing them to obtain a spliced feature vector; randomly generating a feature fusion network with a preset neural network search model and connecting the feature fusion network into the pre-training model; inputting the spliced feature vector into the feature fusion network to recognize the implication relation of the corresponding training text pair; and training the feature fusion network with the neural network search model according to the recognized relation to obtain a text implication relation recognition model, which is then used to recognize the implication relation of a text pair to be recognized. The invention also relates to blockchain technology: the training text pairs and the text pairs to be recognized may be stored in a blockchain. The invention improves the accuracy of implication relation recognition for text pairs composed of proper nouns and expands the recognition range.

Description

Text inclusion relation recognition method, device, equipment and storage medium
Technical Field
The invention relates to the field of big data analysis, and in particular to a method, device, equipment, and storage medium for recognizing text implication relations.
Background
Recognition of text implication relations is a natural language inference task. Given two pieces of text, a premise P and a hypothesis H: if the content of H can be inferred from P, the pair is in an implication relation; if content contradicting H can be inferred from P, it is a contradiction relation; and if P is unrelated to H, it is a neutral relation. Text implication is a foundational technology for machine reading comprehension, text similarity matching, text summarization, dialogue question answering, and the like. With the introduction of pre-training models such as BERT (Bidirectional Encoder Representations from Transformers), RoBERTa (Robustly Optimized BERT Pretraining Approach), and ALBERT (A Lite BERT), the accuracy of text implication recognition has also improved qualitatively.
However, experiments show that although such pre-training models achieve high accuracy, they often misjudge sentences composed of proper nouns such as numbers, times, person names, and place names, because the models do not fully learn this part-of-speech information. Moreover, current pre-training models are insensitive to word order, which further increases the difficulty of distinguishing sentences formed from proper nouns.
Disclosure of Invention
The invention mainly aims to solve the technical problem that existing text implication relation recognition technology has insufficient accuracy when recognizing texts composed of proper nouns.
The invention provides a text implication relation identification method in a first aspect, which comprises the following steps:
acquiring a training text pair containing an implication relation, and extracting a text feature vector of the training text pair;
extracting the implication feature vector of the training text pair by using a preset pre-training model, and splicing the implication feature vector and the text feature vector to obtain a corresponding spliced feature vector;
randomly generating a feature fusion network by adopting a preset neural network search model, and accessing the feature fusion network into the pre-training model;
inputting the spliced feature vectors into the feature fusion network to identify implication relations of corresponding training text pairs, and training the feature fusion network by adopting the neural network search model according to the implication relations to obtain a text implication relation identification model;
and acquiring a text pair to be recognized, and inputting the text pair to be recognized into the text implication relation recognition model for recognition to obtain the implication relation of the text pair to be recognized.
Optionally, in a first implementation manner of the first aspect of the present invention, the extracting text feature vectors of the training text pairs includes:
decomposing the training text pair to obtain a plurality of sub-words, and determining the text attribute of each sub-word;
and generating a text feature vector of the training text pair according to the text attribute, wherein the text feature vector comprises a part-of-speech feature vector, a named entity feature vector and a matching feature vector.
Optionally, in a second implementation manner of the first aspect of the present invention, the training text pair includes a first training text and a second training text, and the generating a text feature vector of the training text pair according to the text attribute includes:
determining the same subword and different subwords in the first training text and the second training text according to the text attributes;
converting the first training text and the second training text into corresponding binary feature vectors according to the same subword and the different subwords;
and obtaining a text feature vector of the training text pair according to the binary feature vector, wherein the text feature vector is a matching feature vector.
Optionally, in a third implementation manner of the first aspect of the present invention, the feature fusion network includes multiple basic networks, and the inputting the spliced feature vector into the feature fusion network to identify the implication relationship of the corresponding training text pair includes:
inputting the spliced feature vector into the feature fusion network;
determining a network connection list of each basic network according to the feature fusion network;
determining each basic network fusion path corresponding to the splicing feature vector according to the network connection list;
according to the fusion paths of each basic network, sequentially adopting the corresponding basic network to perform feature fusion on the spliced feature vector to obtain corresponding fusion feature vectors;
and identifying the implication relation of the corresponding training text pair according to the fusion feature vector.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the sequentially performing feature fusion on the spliced feature vector by using corresponding basic networks according to the fusion paths of the basic networks to obtain corresponding fusion feature vectors includes:
according to different basic network fusion paths, respectively adopting corresponding basic networks to sequentially perform feature fusion on the spliced feature vectors to obtain corresponding single-chain fusion feature vectors;
and splicing the single-chain fusion feature vectors to obtain corresponding fusion feature vectors.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the training the feature fusion network by using the neural network search model according to the implication relationship to obtain a text implication relationship recognition model includes:
judging whether the feature fusion network is converged or not according to the implication relation;
if so, taking the pre-training model accessed to the current feature fusion network as a text inclusion relation recognition model;
if not, updating the feature fusion network by adopting the neural network search model according to the implication relationship until the feature fusion network is converged, and taking the pre-training model accessed to the current feature fusion network as a text implication relationship recognition model.
A second aspect of the present invention provides a text implication relation recognition apparatus, including:
the extraction module is used for acquiring a training text pair containing the implication relationship and extracting a text feature vector of the training text pair;
the splicing module is used for extracting the implication feature vector of the training text pair by using a preset pre-training model and splicing the implication feature vector and the text feature vector to obtain a corresponding spliced feature vector;
the access module is used for randomly generating a feature fusion network by adopting a preset neural network search model and accessing the feature fusion network into the pre-training model;
the training module is used for inputting the spliced feature vectors into the feature fusion network so as to identify the implication relation of the corresponding training text pair, and training the feature fusion network by adopting the neural network search model according to the implication relation to obtain a text implication relation identification model;
and the recognition module is used for acquiring the text pair to be recognized and inputting it into the text implication relation recognition model for recognition to obtain the implication relation of the text pair to be recognized.
Optionally, in a first implementation manner of the second aspect of the present invention, the extracting module includes:
the text decomposition unit is used for decomposing the training text pair to obtain a plurality of sub-words and determining the text attribute of each sub-word;
and the feature generation unit is used for generating a text feature vector of the training text pair according to the text attribute, wherein the text feature vector comprises a part-of-speech feature vector, a named entity feature vector and a matching feature vector.
Optionally, in a second implementation manner of the second aspect of the present invention, the feature generation unit is further configured to:
determining the same subword and different subwords in the first training text and the second training text according to the text attributes;
converting the first training text and the second training text into corresponding binary feature vectors according to the same subword and the different subwords;
and obtaining a text feature vector of the training text pair according to the binary feature vector, wherein the text feature vector is a matching feature vector.
Optionally, in a third implementation manner of the second aspect of the present invention, the feature fusion network includes a plurality of basic networks, and the training module includes:
the characteristic input unit is used for inputting the splicing characteristic vector into the characteristic fusion network;
a path confirmation unit, configured to determine a network connection list of each basic network according to the feature fusion network; determining each basic network fusion path corresponding to the splicing feature vector according to the network connection list;
the characteristic fusion unit is used for sequentially adopting corresponding basic networks to perform characteristic fusion on the spliced characteristic vectors according to the basic network fusion paths to obtain corresponding fusion characteristic vectors;
and the recognition unit is used for recognizing the implication relation of the corresponding training text pair according to the fusion feature vector.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the feature fusion unit is further configured to:
according to different basic network fusion paths, respectively adopting corresponding basic networks to sequentially perform feature fusion on the spliced feature vectors to obtain corresponding single-chain fusion feature vectors;
and splicing the single-chain fusion feature vectors to obtain corresponding fusion feature vectors.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the training module further includes:
the judging unit is used for judging whether the feature fusion network is converged or not according to the implication relation;
the model generation unit is used for, if the feature fusion network has converged, taking the pre-training model connected to the current feature fusion network as the text implication relation recognition model;
and the network updating unit is used for, if the feature fusion network has not converged, updating the feature fusion network with the neural network search model according to the implication relation, stopping updating when the feature fusion network converges, and taking the pre-training model connected to the current feature fusion network as the text implication relation recognition model.
A third aspect of the present invention provides a text implication relationship recognition apparatus, including: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the text implication relationship recognition device to execute the text implication relationship recognition method.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to execute the above-described text implication relationship recognition method.
According to the technical scheme provided by the invention, text feature vectors are extracted from a training text pair and spliced onto the implication feature vector that the pre-training model outputs for that pair, yielding the corresponding spliced feature vector and enriching the pair's part-of-speech, named entity, and matching features. A feature fusion network is then generated by the neural network search model and connected into the pre-training model, extending it, and the implication relation of the training text pair is recognized through fusion of the spliced feature vectors. This improves the accuracy of the model's relation recognition for text pairs composed of proper nouns and expands the model's recognition range.
Drawings
FIG. 1 is a diagram of a first embodiment of a text implication relationship recognition method in an embodiment of the invention;
FIG. 2 is a diagram of a second embodiment of the text implication relationship recognition method in an embodiment of the invention;
FIG. 3 is a diagram of a third embodiment of the text implication relationship recognition method in an embodiment of the invention;
FIG. 4 is a diagram of a fourth embodiment of the text implication relationship recognition method in an embodiment of the invention;
FIG. 5 is a schematic diagram of an embodiment of a text implication relationship recognition apparatus in an embodiment of the invention;
FIG. 6 is a schematic diagram of another embodiment of the text implication relationship recognition apparatus in an embodiment of the invention;
FIG. 7 is a schematic diagram of an embodiment of a text implication relationship recognition device in an embodiment of the invention.
Detailed Description
The embodiment of the invention provides a method, device, equipment, and storage medium for recognizing text implication relations. The method comprises: acquiring a training text pair labeled with an implication relation; extracting the text feature vector and the implication feature vector of the training text pair and splicing them to obtain a spliced feature vector; randomly generating a feature fusion network using a preset neural network search model and connecting the feature fusion network into the pre-training model; inputting the spliced feature vector into the feature fusion network to recognize the implication relation of the corresponding training text pair; and training the feature fusion network with the neural network search model according to the recognized relation to obtain a text implication relation recognition model, which recognizes the implication relation of a text pair to be recognized. The method and device improve the accuracy of implication relation recognition for text pairs composed of proper nouns and expand the recognition range.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow of the embodiment of the present invention is described below, and referring to fig. 1, a first embodiment of a text implication relationship identification method in the embodiment of the present invention includes:
101. acquiring a training text pair containing implication relations, and extracting text feature vectors of the training text pair;
it is to be understood that the execution subject of the present invention may be a text implication relationship recognition apparatus, and may also be a terminal or a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject. It should be emphasized that, in order to further ensure the privacy and security of the training text pair and the text pair to be recognized, the training text pair and the text pair to be recognized may also be stored in a node of a blockchain.
In this embodiment, each acquired training sample pair includes a premise sentence and a hypothesis sentence. The implication relation of the pair can be labeled manually or with an automatic labeling tool: if the hypothesis can be inferred from the premise, the labeled true relation is implication; if the opposite of the hypothesis can be inferred from the premise, the label is contradiction; and if the premise is unrelated to the hypothesis, the label is neutral.
The text feature vectors extracted from the training sample pair include a part-of-speech feature vector, a named entity feature vector, and a matching feature vector between the premise sentence and the hypothesis sentence. The part-of-speech feature vector is obtained from the parts of speech of the training sample pair, covering twelve parts of speech in two classes. The content (real) words are nouns, verbs, adjectives, numerals, quantifiers, and pronouns; the function (empty) words are adverbs, prepositions, conjunctions, auxiliaries, particles, and interjections. The named entity feature vector is obtained from the named entities of the training sample pair, including numbers, times, person names, place names, and the like. The matching feature vector is obtained from words appearing in both the premise sentence and the hypothesis sentence and can be represented in binary coding, where 1 marks a shared word and 0 a non-shared word.
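One of the three text features above, the part-of-speech feature, can be sketched with a simple content-vs-function encoding. The two-class split is from the text; the 1/0 convention and the English tag names are illustrative assumptions:

```python
# Content (real) vs. function (empty) word classes from the text;
# the English tag names here are illustrative stand-ins.
CONTENT_WORDS = {"noun", "verb", "adjective", "numeral", "quantifier", "pronoun"}
FUNCTION_WORDS = {"adverb", "preposition", "conjunction", "auxiliary",
                  "particle", "interjection"}

def pos_feature(pos_tags):
    """Encode each subword's part of speech: 1 = content word, 0 = function word."""
    return [1 if tag in CONTENT_WORDS else 0 for tag in pos_tags]

# e.g. a tagged four-subword sentence
encoded = pos_feature(["pronoun", "adverb", "verb", "noun"])
```

In a full implementation each binary flag would be refined with the specific part-of-speech tag, as the embodiments below describe.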
102. Extracting the implication feature vector of the training text pair by using a preset pre-training model, and splicing the implication feature vector and the text feature vector to obtain a corresponding spliced feature vector;
In this embodiment, the pre-training model may be a BERT, RoBERTa, or ALBERT model, and the implication feature vector obtained preliminarily from the training sample pair is extracted through this model. The text feature vectors (the part-of-speech, named entity, and matching feature vectors) are then spliced onto the implication feature vector to obtain the corresponding spliced feature vector. This essentially enhances the implication feature vector, so that the subsequently recognized implication relation of the training text pair is sensitive to part of speech, named entities, and word matching.
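The splicing itself is plain vector concatenation. A minimal NumPy sketch, where all dimensions are illustrative assumptions (e.g. a 768-dimensional output as in BERT-base):

```python
import numpy as np

entail_vec = np.zeros(768)          # stand-in for the pre-training model's output
pos_vec = np.array([1, 0, 1, 1])    # part-of-speech feature vector
ner_vec = np.array([0, 0, 0, 1])    # named entity feature vector
match_vec = np.array([1, 0, 1, 0])  # matching feature vector

# Splice the text features onto the implication feature vector.
spliced = np.concatenate([entail_vec, pos_vec, ner_vec, match_vec])
```

The pre-training model's own output is left untouched; the extra features are simply appended to it.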
103. Randomly generating a feature fusion network by adopting a preset neural network search model, and accessing the feature fusion network into the pre-training model;
In this embodiment, the preset neural network search model draws on several basic networks: highway networks, multi-head attention, and LSTM (Long Short-Term Memory) networks. Connecting the feature fusion network into the pre-training model serves two purposes. First, it extends the model so that useful vector information can be attended to: the highway and multi-head attention networks focus on useful features, while the LSTM network compensates for the insensitivity of pre-training models such as BERT, RoBERTa, and ALBERT to word order, so that feature attributes of the training text pairs can be attended to in more dimensions. Second, the subsequently spliced text feature vectors (part-of-speech, named entity, and matching feature vectors) are fused with the implication feature vector without disturbing the implication feature vector of the pre-training model itself.
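As one concrete example of the basic networks named above, a highway layer mixes a transformed signal with the unchanged input through a learned gate. A NumPy sketch with random stand-in weights, not the patent's trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """y = g * H(x) + (1 - g) * x, with transform gate g = sigmoid(x W_t + b_t)."""
    h = np.tanh(x @ W_h + b_h)    # candidate transform H(x)
    g = sigmoid(x @ W_t + b_t)    # transform gate
    return g * h + (1.0 - g) * x  # gated mix of transform and carry

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)
y = highway_layer(x, rng.standard_normal((d, d)), np.zeros(d),
                  rng.standard_normal((d, d)), np.zeros(d))
```

With a strongly negative gate bias the layer simply carries the input through unchanged, which is what lets such layers be stacked without losing the original spliced features.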
The preset neural network search model may be an RNN (Recurrent Neural Network) model: a feature fusion network is randomly generated by the RNN model and then connected into the pre-training model to fuse the spliced feature vectors. The RNN model outputs, for each basic network in the feature fusion network, its network type and its input network node list; the corresponding feature fusion network can be reconstructed from the network type and input node list of each network node.
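The controller output described above, a network type plus an input-node list per node, is enough to reconstruct the fusion network and enumerate its fusion paths. A hypothetical encoding and traversal (node numbering and types are assumptions, not the patent's exact format):

```python
# Node 0 is the spliced input; every other node is a basic network with a
# declared type and the list of nodes feeding into it.
fusion_spec = {
    1: ("highway", [0]),
    2: ("lstm", [0]),
    3: ("multi_head_attention", [1, 2]),
}

def fusion_paths(spec, output_node):
    """Enumerate every input-to-output chain of basic networks."""
    paths = []
    def walk(node, suffix):
        if node == 0:                      # reached the spliced input
            paths.append(suffix)
            return
        kind, inputs = spec[node]
        for parent in inputs:
            walk(parent, [kind] + suffix)  # prepend this node's type
    walk(output_node, [])
    return paths
```

Each enumerated chain corresponds to one of the "basic network fusion paths" along which the spliced feature vector is transformed.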
104. Inputting the spliced feature vectors into the feature fusion network to identify implication relations of corresponding training text pairs, and training the feature fusion network by adopting the neural network search model according to the implication relations to obtain a text implication relation identification model;
In this embodiment, the spliced feature vector is represented as a matrix, and each basic network in the feature fusion network transforms the matrix with convolution kernels. Passing the spliced features through the highway and multi-head attention networks increases the attention paid to the text feature vector within the spliced feature vector; passing them through the LSTM network increases sensitivity to word order; finally, the corresponding fused feature vector is output. The fused feature vector is input into a classifier to recognize the training text pair as an implication, contradiction, or neutral relation, giving the predicted implication relation, and the generated feature fusion network is then verified to judge whether the current fusion network meets the training requirement.
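The final classifier over the fused feature vector can be any three-way classifier; a softmax sketch in NumPy, with toy weights and dimensions that are illustrative only:

```python
import numpy as np

RELATIONS = ["implication", "contradiction", "neutral"]

def classify(fused_vec, W, b):
    """Softmax over the three implication relations; returns the argmax label."""
    logits = fused_vec @ W + b
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return RELATIONS[int(np.argmax(probs))], probs

# Toy parameters: zero weights make the bias alone decide the prediction.
label, probs = classify(np.ones(4), np.zeros((4, 3)), np.array([0.1, 2.0, 0.5]))
```

During training the same probabilities would feed a cross-entropy loss against the labeled relation.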
The loss value of the feature fusion network is calculated from the predicted implication relation and the labeled implication relation. If the loss exceeds a preset threshold, the neural network search model adjusts the current feature fusion network to produce an updated one and outputs the network type and input node list of each basic network in the new feature fusion network. The spliced feature vectors are then fused through the new feature fusion network into a new fused feature vector, the training text pair is recognized again to obtain a new implication relation, and the process repeats until the loss falls below the preset threshold, at which point updating stops and the corresponding text implication relation recognition model is obtained. Alternatively, updating can stop after the feature fusion network has been updated a preset number of times, likewise yielding the recognition model. The feature fusion network can also be updated with a policy-gradient method, which improves the efficiency of network training.
105. Acquiring a text pair to be recognized, and inputting the text pair to be recognized into the text implication relation recognition model for recognition to obtain the implication relation of the text pair to be recognized.
In this embodiment, the text pair to be recognized is recognized by the trained text implication relation recognition model, which attends to the parts of speech of the words in the text, the named entities, and the matching relation between the premise and hypothesis sentences. For texts containing proper nouns such as numbers, times, person names, and place names, the recognized implication relation is therefore more accurate.
In the embodiment of the invention, text feature vectors are extracted from the training text pair and spliced onto the implication feature vector output by the pre-training model for that pair, yielding the corresponding spliced feature vector and enriching the pair's part-of-speech, named entity, and matching features. A feature fusion network is then generated by the neural network search model and connected into the pre-training model, extending it, and the implication relation of the training text pair is recognized through fusion of the spliced feature vectors. This improves the accuracy of the model's relation recognition for text pairs composed of proper nouns and expands the model's recognition range.
Referring to fig. 2, a second embodiment of the method for recognizing a text implication relationship according to the embodiment of the present invention includes:
201. acquiring a training text pair containing implication relations;
202. decomposing the training text pair to obtain a plurality of subwords, and determining the text attribute of each subword;
In this embodiment, the premise sentence and the hypothesis sentence in a training text pair are decomposed into several subwords; the part of speech and named entity of each subword, together with the matching relation between the premise and hypothesis sentences, are then taken as the text attributes of each subword. Text segmentation is a common technique in the art and is not described again here.
203. Generating a text feature vector of the training text pair according to the text attribute, wherein the text feature vector comprises a part-of-speech feature vector, a named entity feature vector and a matching feature vector;
in this embodiment, the text feature vector of the training text pair generated by the text attribute includes a part-of-speech feature vector, a named entity feature vector, and a matching feature vector.
For the part-of-speech feature vector, 1 can be used to indicate that a subword is a content (notional) word and 0 that it is a function word, with the specific part of speech then marked correspondingly;
for the named-entity feature vector, 1 can be used to indicate that a subword is an entity subword, i.e., a special word such as a number, time, person name, or place name, and 0 that it is a non-entity subword, i.e., ordinary vocabulary, with the category of the special word then marked correspondingly. For example, the original sentence of the following training sample pair and the corresponding named-entity feature vector are:
[CLS] Xiao Ming's favorite place is A [SEP];
0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0
wherein the named entities corresponding to 1 are marked with the "LOCATION" label, the premise sentence lies between [CLS] and the first [SEP], the hypothesis sentence lies between the two [SEP] markers, and the [CLS] and [SEP] positions themselves are represented by 0.
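A minimal sketch of how such a named-entity feature vector could be derived from per-subword entity labels — the tokens and label names are hypothetical, and this is not the claimed implementation:

```python
def ner_feature_vector(tokens, entity_labels):
    """Mark 1 for subwords tagged as named entities (e.g. LOCATION),
    0 for ordinary subwords and the [CLS]/[SEP] markers."""
    vec = []
    for tok, label in zip(tokens, entity_labels):
        if tok in ("[CLS]", "[SEP]") or label == "O":
            vec.append(0)
        else:
            vec.append(1)
    return vec

tokens = ["[CLS]", "I", "like", "place", "A", "[SEP]"]
labels = ["O", "O", "O", "LOCATION", "LOCATION", "O"]
print(ner_feature_vector(tokens, labels))  # [0, 0, 0, 1, 1, 0]
```

The same 0/1-plus-tag pattern applies to the part-of-speech feature vector, with content words marked 1 and function words marked 0.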
For the matching feature vector, the specific generation process is as follows:
determining the same subword and different subwords in the first training text and the second training text according to the text attributes;
converting the first training text and the second training text into corresponding binary feature vectors according to the same subword and the different subwords;
and obtaining a text feature vector of the training text pair according to the binary feature vector, wherein the text feature vector is a matching feature vector.
For example, the original sentence of the following training sample pair and the corresponding matching feature vector are:
[CLS] Xiao Ming loves place A [SEP];
0 1 1 0 0 1 1 1 0 1 1 0 1 1 1 0
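The three-step generation process above can be sketched as follows; this is an illustrative reading of the binary matching feature (subwords shared between premise and hypothesis marked 1), with hypothetical tokens, not the patented implementation:

```python
def matching_feature_vector(premise_tokens, hypothesis_tokens):
    """1 marks a subword that also appears in the other sentence; 0 marks
    a subword unique to its sentence or a [CLS]/[SEP] marker."""
    special = {"[CLS]", "[SEP]"}
    p_words = set(premise_tokens) - special
    h_words = set(hypothesis_tokens) - special

    def mark(tokens, other):
        return [0 if t in special else int(t in other) for t in tokens]

    # Concatenate the premise and hypothesis binary vectors.
    return mark(premise_tokens, h_words) + mark(hypothesis_tokens, p_words)

p = ["[CLS]", "Xiao", "Ming", "likes", "place", "A", "[SEP]"]
h = ["Xiao", "Ming", "loves", "place", "A", "[SEP]"]
print(matching_feature_vector(p, h))
# [0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0]
```

Here "likes"/"loves" differ between the two sentences and so receive 0, while the shared subwords receive 1.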
204. extracting the inclusion characteristic vector of the training text pair by adopting a preset pre-training model, and splicing the inclusion characteristic vector and the text characteristic vector to obtain a corresponding spliced characteristic vector;
205. randomly generating a feature fusion network by adopting a preset neural network search model, and accessing the feature fusion network into the pre-training model;
206. inputting the spliced feature vectors into the feature fusion network to identify implication relations of corresponding training text pairs, and training the feature fusion network by adopting the neural network search model according to the implication relations to obtain a text implication relation identification model;
207. and acquiring a text pair to be recognized, and recognizing the text pair to be recognized input by the text inclusion relation recognition model to obtain the inclusion relation of the text pair to be recognized.
This embodiment of the invention describes the extraction of the text feature vector in detail: the feature attributes are expanded on the basis of the implication feature vector, so that when the spliced feature vector is subsequently used to identify the implication relationship of a training text pair, more content is attended to and the predicted implication relationship is more accurate.
Referring to fig. 3, a third embodiment of the method for recognizing a text implication relationship according to the embodiment of the present invention includes:
301. acquiring a training text pair containing implication relations, and extracting text feature vectors of the training text pair;
302. extracting the implication characteristic vectors of the training text pairs by adopting a preset pre-training model, and splicing the implication characteristic vectors and the text characteristic vectors to obtain corresponding spliced characteristic vectors;
303. randomly generating a feature fusion network by adopting a preset neural network search model, and accessing the feature fusion network into the pre-training model;
304. inputting the spliced feature vector into the feature fusion network;
305. determining a network connection list of each basic network according to the feature fusion network;
in this embodiment, the neural network search model represents each basic network in the feature fusion network by its network type and its input network node list, forming the corresponding network connection list; the network connection list can therefore be obtained from the feature fusion network. Note that there are one or more network connection lists, one for each basic network in the feature fusion network.
For example, a feature fusion network generated by an RNN includes four basic networks N1, N2, N3, and N4, where the input network node list of N1 is empty, that of N2 is [N1], that of N3 is [N1, N2], and that of N4 is [N1, N2, N3]; these four basic networks thus define a feature fusion network, and from it the corresponding network connection list, that is, the type of each basic network and its input network node list, can be obtained.
306. Determining each basic network fusion path corresponding to the splicing feature vector according to the network connection list;
in this embodiment, the network connection list reveals each basic network fusion path that the spliced feature vector traverses after being input into the corresponding feature fusion network. For example, for the feature fusion network with the four basic networks N1, N2, N3, and N4 generated by the RNN in the above step, the corresponding network connection list yields three basic network fusion paths: N1-N4, N1-N2-N4, and N1-N2-N3-N4.
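One way to recover candidate fusion paths from such a network connection list is a simple recursive enumeration, sketched below. This is an assumption about the mechanics, not the patented search procedure: a naive enumeration yields every chain from the source network to the sink network, which can include additional chains (such as N1-N3-N4 for this list) beyond the three the embodiment highlights, depending on how the search model prunes candidates.

```python
def fusion_paths(connection_list, sink):
    """Enumerate input chains ending at `sink` by walking the
    input-network-node lists backwards to the source network."""
    paths = []

    def walk(node, suffix):
        inputs = connection_list[node]
        if not inputs:                    # source network: chain is complete
            paths.append([node] + suffix)
            return
        for prev in inputs:
            walk(prev, [node] + suffix)

    walk(sink, [])
    return paths

conn = {"N1": [], "N2": ["N1"], "N3": ["N1", "N2"], "N4": ["N1", "N2", "N3"]}
for p in fusion_paths(conn, "N4"):
    print("-".join(p))
```

For the example connection list this prints N1-N4, N1-N2-N4, N1-N3-N4, and N1-N2-N3-N4.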
307. According to different basic network fusion paths, respectively adopting corresponding basic networks to sequentially perform feature fusion on the spliced feature vectors to obtain corresponding single-chain fusion feature vectors;
in this embodiment, each basic network fusion path includes one or more basic networks. The spliced feature vectors are input into the feature fusion network and fused by the basic networks along each basic network fusion path, with each path attending to a different fusion degree or different feature attributes, to obtain the corresponding single-chain fusion feature vectors.
For example, in the feature fusion network with the four basic networks N1, N2, N3, and N4 in the above step, suppose N1 focuses on the word order of the training text pair, N2 on its parts of speech, N3 on its named entities, and N4 on the matching relationship between the premise sentence and the hypothesis sentence. Then the single-chain fusion feature vector obtained after the spliced feature vector passes along path N1-N4 attends to the word order and the matching relationship of the training text pair; the one obtained along path N1-N2-N4 attends to the word order, parts of speech, and matching relationship; and the one obtained along path N1-N2-N3-N4 attends to the word order, parts of speech, named entities, and matching relationship.
308. Splicing the single-chain fusion feature vectors to obtain corresponding fusion feature vectors;
in this embodiment, the spliced feature vectors pass through the different basic network fusion paths in the feature fusion network, and the resulting single-chain fusion feature vectors attend to different fusion degrees and different feature attributes. The multiple single-chain fusion feature vectors are then spliced into the corresponding fusion feature vector, which on the one hand ensures that the fusion feature vector attends comprehensively to the feature attributes of the training text pair, and on the other hand allows the fusion feature vector to be analyzed from multiple dimensions when identifying the implication relationship of the training text pair.
In this embodiment, each single-chain fusion feature vector is represented as a matrix, and the fusion feature vector obtained here is the splice of the matrices of all the single-chain fusion feature vectors.
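The splicing step can be sketched as a concatenation along the feature axis; the shapes and the random placeholder outputs below are hypothetical, standing in for the real single-chain fusion outputs:

```python
import numpy as np

# Hypothetical single-chain fusion outputs, one per fusion path, all sharing
# the same feature dimension (here 64).
rng = np.random.default_rng(0)
single_chain = [rng.standard_normal((1, 64)) for _ in range(3)]  # 3 paths

# Splice the single-chain matrices along the feature axis.
fused = np.concatenate(single_chain, axis=-1)
print(fused.shape)  # (1, 192)
```

The downstream classifier then consumes this spliced matrix as the fusion feature vector of the training text pair.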
309. Identifying an implication relation of the corresponding training text pair according to the fusion feature vector;
in this embodiment, after the fusion feature vector of a training text pair is obtained, it is input into a trained classifier to obtain the implication relationship of the training text pair: the probabilities corresponding to the implication relationship, the contradiction relationship, and the neutral relationship are predicted first, and the relationship with the highest probability is then selected as the implication relationship of the training text pair.
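The "predict the three probabilities first, then select the highest" step amounts to a softmax over three candidate relations followed by an argmax. A minimal sketch, with hypothetical logits standing in for the classifier's output:

```python
import numpy as np

def classify(logits, labels=("implication", "contradiction", "neutral")):
    """Softmax over the three candidate relations, then pick the most
    probable one."""
    z = np.asarray(logits, dtype=float)
    probs = np.exp(z - z.max())   # subtract max for numerical stability
    probs /= probs.sum()
    return labels[int(probs.argmax())], probs

label, probs = classify([2.1, 0.3, -0.5])
print(label)  # implication
```

The probability of the selected relation is also what feeds the loss calculation during training.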
310. Training the feature fusion network by adopting the neural network search model according to the implication relation to obtain a text implication relation recognition model;
311. and acquiring a text pair to be recognized, and recognizing the text pair to be recognized input by the text inclusion relation recognition model to obtain the inclusion relation of the text pair to be recognized.
This embodiment of the invention describes in detail the process by which the neural network search model generates the feature fusion network. Splicing the feature fusion network into the pre-training model extends the pre-training model, so that the generated text implication relationship recognition model attends to the order of the subwords of a text pair; this increases the attention paid to the text feature vector and helps improve the accuracy with which the text implication relationship recognition model recognizes proper nouns in text pairs.
Referring to fig. 4, a fourth embodiment of the method for recognizing a text implication relationship according to the embodiment of the present invention includes:
401. acquiring a training text pair containing implication relations, and extracting text feature vectors of the training text pair;
402. extracting the inclusion characteristic vector of the training text pair by adopting a preset pre-training model, and splicing the inclusion characteristic vector and the text characteristic vector to obtain a corresponding spliced characteristic vector;
403. randomly generating a feature fusion network by adopting a preset neural network search model, and accessing the feature fusion network into the pre-training model;
404. inputting the spliced feature vectors into the feature fusion network to identify implication relations of corresponding training text pairs;
405. judging whether the feature fusion network is converged or not according to the implication relation;
in this embodiment, after the implication relationship between the premise sentence and the hypothesis sentence in a training text pair is predicted as the implication, contradiction, or neutral relationship, it is compared with the pre-labeled implication relationship of that training text pair, and the loss value of the feature fusion network is calculated from the difference between the predicted and labeled implication relationships. If the loss value is smaller than a preset loss threshold, the network has converged; if it is greater, it has not. The loss function applied here is a cross-entropy loss function, as follows:
Loss = -(1/N) * Σᵢ [ y(i) · log p(i) + (1 - y(i)) · log(1 - p(i)) ]

where N is the number of training sample pairs, y(i) represents the prediction result of the i-th training sample pair (1 for a correct prediction, 0 for an incorrect prediction), p(i) represents the probability corresponding to the predicted implication relationship of the i-th training sample pair, and Loss represents the loss value.
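A direct computation of a binary cross-entropy consistent with the definitions above (y(i) as 0/1 correctness, p(i) as the probability of the predicted relation); the 1/N averaging is an assumption where the original formula is not fully legible:

```python
import math

def cross_entropy_loss(y, p):
    """Binary cross-entropy over N training pairs: y[i] is 1 for a correct
    prediction and 0 otherwise; p[i] is the probability assigned to the
    predicted implication relation of pair i."""
    n = len(y)
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for yi, pi in zip(y, p)) / n

loss = cross_entropy_loss([1, 0], [0.9, 0.1])
print(round(loss, 5))  # 0.10536
```

A confident correct prediction (y=1, p=0.9) and a confident rejection (y=0, p=0.1) both contribute small loss terms, as expected.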
406. If so, taking the pre-training model accessed to the current feature fusion network as a text inclusion relation recognition model;
407. if not, updating the feature fusion network by adopting the neural network search model according to the implication relationship until the feature fusion network is converged, and taking the pre-training model accessed to the current feature fusion network as a text implication relationship recognition model;
in this embodiment, if the loss calculated from the implication relationship is smaller than the preset loss threshold, the feature fusion network is judged to have converged, and the pre-training model into which the current feature fusion network is spliced is used as the text implication relationship recognition model; otherwise, the basic networks in the feature fusion network are recombined by the neural network search model, including adding, deleting, and reordering basic networks. The spliced feature vectors are then fused again through the new feature fusion network, and the implication relationship of the training text pair is predicted by the classifier, until the loss value corresponding to the new implication relationship is smaller than the preset loss threshold, at which point the pre-training model with the latest feature fusion network is used as the text implication relationship recognition model.
In this embodiment, the feature fusion network is trained by reinforcement learning with a policy gradient update method, as follows: first, each round of training of the feature fusion network runs a fixed number of steps, carrying over the learning-rate decay of the previous round; second, after each round, the score of each training text pair on the model is recorded, the training text pairs are sorted by score, and the pairs at the front of the ranking are trained additional times.
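The per-pair scheduling heuristic described above can be sketched as follows. The function name and the meaning of the scores are assumptions for illustration; it is only assumed that the ranking places the pairs to be emphasized at the front:

```python
def schedule_extra_passes(pair_scores, extra_budget):
    """Rank training pairs by recorded score (highest first, assumed to be
    the front of the ranking) and give the top `extra_budget` pairs one
    extra training pass in the next round."""
    ranked = sorted(pair_scores, key=pair_scores.get, reverse=True)
    passes = {pid: 1 for pid in ranked}      # every pair gets one pass
    for pid in ranked[:extra_budget]:        # front of the ranking gets more
        passes[pid] += 1
    return passes

print(schedule_extra_passes({"a": 0.9, "b": 0.2, "c": 0.7}, extra_budget=2))
```

Here pairs "a" and "c" lead the ranking and receive two passes, while "b" receives one.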
408. And acquiring a text pair to be recognized, and recognizing the text pair to be recognized input by the text inclusion relation recognition model to obtain the inclusion relation of the text pair to be recognized.
In this embodiment of the invention, the quality of the current feature fusion network is assessed through the implication relationships that the pre-training model with the feature fusion network predicts for the training text pairs; the quality that the feature fusion network must reach is set via a loss threshold chosen according to the actual situation, and the feature fusion network is updated by the policy gradient update method, which improves model-training efficiency while ensuring model quality.
In the above description of the method for recognizing a text implication relationship in the embodiment of the present invention, a text implication relationship recognition apparatus in the embodiment of the present invention is described below with reference to fig. 5, where an embodiment of the text implication relationship recognition apparatus in the embodiment of the present invention includes:
an extraction module 501, configured to obtain a training text pair including an implication relationship, and extract a text feature vector of the training text pair;
a splicing module 502, configured to extract an implied feature vector of the training text pair by using a preset pre-training model, and splice the implied feature vector and the text feature vector to obtain a corresponding spliced feature vector;
an access module 503, configured to randomly generate a feature fusion network by using a preset neural network search model, and access the feature fusion network to the pre-training model;
a training module 504, configured to input the concatenated feature vectors into the feature fusion network to identify an implication relationship of a corresponding training text pair, and train the feature fusion network by using the neural network search model according to the implication relationship to obtain a text implication relationship identification model;
and the identification module 505 is used for acquiring the text pair to be identified, identifying the text pair to be identified which is input into the text implication relation identification model, and obtaining the implication relation of the text pair to be identified.
In the embodiment of the invention, text feature vectors are extracted from the training text pairs and spliced to the implication feature vectors that the pre-training model outputs for those pairs, yielding the corresponding spliced feature vectors and thereby adding the part-of-speech features, named-entity features, and matching features of the training text pairs; a feature fusion network is then generated by the neural network search model and spliced into the pre-training model to extend it, so that the implication relationship of a training text pair is identified through fusion of the spliced feature vectors. This improves the accuracy of the model's relationship recognition for text pairs containing proper nouns and broadens the model's recognition range.
Referring to fig. 6, another embodiment of the device for recognizing a text implication relationship according to the embodiment of the present invention includes:
an extraction module 501, configured to obtain a training text pair including an implication relationship, and extract a text feature vector of the training text pair;
a splicing module 502, configured to extract an implied feature vector of the training text pair by using a preset pre-training model, and splice the implied feature vector and the text feature vector to obtain a corresponding spliced feature vector;
an access module 503, configured to randomly generate a feature fusion network by using a preset neural network search model, and access the feature fusion network to the pre-training model;
a training module 504, configured to input the spliced feature vector into the feature fusion network to identify an implication relationship of a corresponding training text pair, and train the feature fusion network by using the neural network search model according to the implication relationship to obtain a text implication relationship identification model;
and the identification module 505 is used for acquiring the text pair to be identified, identifying the text pair to be identified which is input into the text implication relation identification model, and obtaining the implication relation of the text pair to be identified.
Specifically, the extraction module 501 includes:
the text decomposition unit 5011 is configured to decompose the training text pair to obtain a plurality of subwords, and determine text attributes of the subwords;
the feature generating unit 5012 is configured to generate a text feature vector of the training text pair according to the text attribute, where the text feature vector includes a part-of-speech feature vector, a named entity feature vector, and a matching feature vector.
Specifically, the feature generation unit 5012 is further configured to:
determining the same subword and different subwords in the first training text and the second training text according to the text attributes;
converting the first training text and the second training text into corresponding binary feature vectors according to the same subword and the different subwords;
and obtaining a text feature vector of the training text pair according to the binary feature vector, wherein the text feature vector is a matching feature vector.
Specifically, the training module 504 includes:
a feature input unit 5041, configured to input the stitched feature vector into the feature fusion network;
a path confirmation unit 5042, configured to determine a network connection list of each basic network according to the feature fusion network; determining each basic network fusion path corresponding to the splicing feature vector according to the network connection list;
a feature fusion unit 5043, configured to perform feature fusion on the spliced feature vectors sequentially by using corresponding basic networks according to the basic network fusion paths to obtain corresponding fusion feature vectors;
and the identifying unit 5044 is configured to identify an implication relationship of the corresponding training text pair according to the fusion feature vector.
Specifically, the feature fusion unit 5044 is further configured to:
according to different basic network fusion paths, respectively adopting corresponding basic networks to sequentially perform feature fusion on the spliced feature vectors to obtain corresponding single-chain fusion feature vectors;
and splicing the single-chain fusion feature vectors to obtain corresponding fusion feature vectors.
Specifically, the training module 504 further includes:
a judging unit 5045, configured to judge whether the feature fusion network converges according to the implication relationship;
the model generation unit 5046 is configured to, if yes, use the pre-training model accessed to the current feature fusion network as a text implication relationship recognition model;
and the network updating unit 5047 is configured to, if not, update the feature fusion network by using the neural network search model according to the implication relationship until the feature fusion network converges, and use the pre-training model accessed to the current feature fusion network as a text implication relationship recognition model.
In the embodiment of the invention, text feature vectors are extracted from the training text pairs and spliced to the implication feature vectors that the pre-training model outputs for those pairs, yielding the corresponding spliced feature vectors and thereby adding the part-of-speech features, named-entity features, and matching features of the training text pairs; a feature fusion network is then generated by the neural network search model and spliced into the pre-training model to extend it, so that the implication relationship of a training text pair is identified through fusion of the spliced feature vectors. This improves the accuracy of the model's relationship recognition for text pairs containing proper nouns and broadens the model's recognition range.

The extraction of the text feature vector is also described in detail: the feature attributes are expanded on the basis of the implication feature vector, so that subsequent identification from the spliced feature vector attends to more content and predicts the implication relationship of the training text pair more accurately. The process by which the neural network search model generates the feature fusion network is likewise described in detail: splicing the feature fusion network into the pre-training model extends it, so that the generated text implication relationship recognition model attends to the order of the subwords of a text pair, increasing the attention paid to the text feature vector and improving the accuracy with which proper nouns in text pairs are recognized. Finally, the quality of the current feature fusion network is assessed through the implication relationships that the pre-training model with the feature fusion network predicts for the training text pairs; a loss threshold set according to the actual situation determines the quality the feature fusion network must reach, and the feature fusion network is updated by the policy gradient update method, improving model-training efficiency while ensuring model quality.
Fig. 5 and fig. 6 describe the text implication relationship recognition apparatus in the embodiment of the present invention in detail from the perspective of the modular functional entity, and the text implication relationship recognition apparatus in the embodiment of the present invention is described in detail from the perspective of hardware processing.
Fig. 7 is a schematic structural diagram of a text implication relationship recognition device 700 according to an embodiment of the present invention, where the text implication relationship recognition device 700 may have a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 710 (e.g., one or more processors) and a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) for storing applications 733 or data 732. Memory 720 and storage medium 730 may be, among other things, transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations on the text implication relationship recognition device 700. Still further, the processor 710 may be configured to communicate with the storage medium 730 to execute a series of instruction operations in the storage medium 730 on the text implication relationship recognition device 700.
Text implication relationship identification device 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input-output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the configuration of the text implication relationship recognition device shown in fig. 7 does not constitute a limitation of the text implication relationship recognition device, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
The present invention further provides a device for recognizing a text implication relationship, where the device includes a memory and a processor, where the memory stores computer readable instructions, and when the computer readable instructions are executed by the processor, the processor is caused to execute the steps of the method for recognizing a text implication relationship in the foregoing embodiments.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to execute the steps of the text implication relationship identification method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for recognizing a text implication relationship is characterized by comprising the following steps:
acquiring a training text pair containing implication relations, and extracting text feature vectors of the training text pair;
extracting the inclusion characteristic vector of the training text pair by adopting a preset pre-training model, and splicing the inclusion characteristic vector and the text characteristic vector to obtain a corresponding spliced characteristic vector;
randomly screening a plurality of basic networks from among the basic networks included in a preset neural network search model itself by adopting the neural network search model, to obtain the network types and the input network node lists of the screened basic networks, generating a feature fusion network from the network types and the input network node lists, and accessing the feature fusion network into the pre-training model;
inputting the spliced feature vectors into the feature fusion network to identify the implication relation of the corresponding training text pairs, and training the feature fusion network by adopting the neural network search model according to the implication relation to obtain a text implication relation identification model, wherein each basic network in the feature fusion network pays attention to the feature attribute of the corresponding network type of the training text pairs to identify the implication relation of the corresponding training text pairs;
and acquiring a text pair to be recognized, and recognizing the text pair to be recognized input by the text inclusion relation recognition model to obtain the inclusion relation of the text pair to be recognized.
2. The method for recognizing text implication relationship according to claim 1, wherein the extracting the text feature vector of the training text pair includes:
decomposing the training text pair to obtain a plurality of sub-words, and determining the text attribute of each sub-word;
and generating a text feature vector of the training text pair according to the text attribute, wherein the text feature vector comprises a part-of-speech feature vector, a named entity feature vector and a matching feature vector.
3. The method for recognizing text implication relationship according to claim 2, wherein the training text pair includes a first training text and a second training text, and the generating the text feature vector of the training text pair according to the text attribute includes:
determining the same subwords and different subwords in the first training text and the second training text according to the text attributes;
converting the first training text and the second training text into corresponding binary feature vectors according to the same subwords and the different subwords;
and obtaining a text feature vector of the training text pair from the binary feature vectors, wherein the text feature vector is a matching feature vector.
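The matching feature of claim 3 can be sketched as a binary vector per text, assuming (as an illustration, since the patent does not fix the rule) that a position is 1 exactly when that subword also occurs in the other text:

```python
def matching_feature_vectors(subwords_a, subwords_b):
    """Binary matching features for a text pair: position i of the first
    vector is 1 when the i-th subword of text A also occurs in text B,
    and symmetrically for the second vector."""
    set_a, set_b = set(subwords_a), set(subwords_b)
    match_a = [1 if sw in set_b else 0 for sw in subwords_a]
    match_b = [1 if sw in set_a else 0 for sw in subwords_b]
    return match_a, match_b

# "the" and "sleeps" are shared, "cat"/"dog" are not.
pair = matching_feature_vectors(["the", "cat", "sleeps"],
                                ["the", "dog", "sleeps"])
# pair == ([1, 0, 1], [1, 0, 1])
```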
4. The method for recognizing text implication relationship according to claim 1, wherein the inputting the spliced feature vectors into the feature fusion network to recognize implication relationship of corresponding training text pairs comprises:
inputting the spliced feature vector into the feature fusion network;
determining a network connection list of each basic network according to the feature fusion network;
determining each basic network fusion path corresponding to the spliced feature vector according to the network connection list;
performing feature fusion on the spliced feature vector by sequentially applying the corresponding basic networks along each basic network fusion path, to obtain a corresponding fused feature vector;
and identifying the implication relation of the corresponding training text pair according to the fusion feature vector.
5. The method for recognizing the text implication relationship according to claim 4, wherein the performing feature fusion on the spliced feature vector by sequentially applying the corresponding basic networks along each basic network fusion path to obtain the corresponding fused feature vector comprises:
performing feature fusion on the spliced feature vector by sequentially applying the corresponding basic networks along each distinct basic network fusion path, respectively, to obtain corresponding single-chain fusion feature vectors;
and splicing the single-chain fusion feature vectors to obtain corresponding fusion feature vectors.
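A minimal sketch of claims 4-5, with toy callables standing in for trained basic networks: the spliced vector is passed through the basic networks of each fusion path in order (one single-chain fusion vector per path), and the chains are then spliced by concatenation. The path representation and the stand-in networks are assumptions for the example:

```python
def fuse_along_paths(spliced_vector, paths, basic_networks):
    """Pass the spliced feature vector through the basic networks of each
    fusion path in order (one single-chain fusion vector per path), then
    splice the chains together by concatenation."""
    chains = []
    for path in paths:
        vec = spliced_vector
        for name in path:
            vec = basic_networks[name](vec)   # each network: vector -> vector
        chains.append(vec)
    return [x for chain in chains for x in chain]

# Toy stand-ins for trained basic networks (real ones would be CNN/LSTM/etc.).
nets = {"double": lambda v: [2 * x for x in v],
        "shift":  lambda v: [x + 1 for x in v]}
fused = fuse_along_paths([1, 2], [["double"], ["double", "shift"]], nets)
# fused == [2, 4, 3, 5]: chain 1 doubles, chain 2 doubles then shifts.
```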
6. The method for recognizing the text implication relationship according to any one of claims 1-5, wherein the training of the feature fusion network by using the neural network search model according to the implication relationship to obtain the text implication relationship recognition model comprises:
judging, according to the implication relation, whether the feature fusion network has converged;
if so, taking the pre-training model connected to the current feature fusion network as the text implication relation recognition model;
if not, updating the feature fusion network with the neural network search model according to the implication relation until the feature fusion network converges, and taking the pre-training model connected to the current feature fusion network as the text implication relation recognition model.
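The convergence loop of claim 6 can be sketched as follows; `evaluate` and `update` are hypothetical placeholders for the real loss computation and the search model's architecture update, and the loss-difference convergence test is an assumption for the sketch:

```python
def train_fusion_network(evaluate, update, max_rounds=100, tol=1e-4):
    """Claim-6 style loop: evaluate the feature fusion network on the
    implication labels; if the loss has stopped improving (converged),
    stop; otherwise let the search model update the network and repeat."""
    prev_loss = float("inf")
    for round_index in range(max_rounds):
        loss = evaluate()
        if abs(prev_loss - loss) < tol:       # convergence test (assumed)
            return round_index, loss
        prev_loss = loss
        update()
    return max_rounds, prev_loss

# Toy run: the loss plateaus at 0.5, so the loop stops on the third round.
losses = iter([1.0, 0.5, 0.5])
rounds, final_loss = train_fusion_network(lambda: next(losses), lambda: None)
```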
7. A text implication relation recognition apparatus, characterized in that the text implication relation recognition apparatus comprises:
the extraction module is used for acquiring a training text pair having an implication relation and extracting a text feature vector of the training text pair;
the splicing module is used for extracting an implication feature vector of the training text pair with a preset pre-training model and splicing the implication feature vector and the text feature vector to obtain a corresponding spliced feature vector;
the access module is used for randomly screening a plurality of basic networks from the basic networks contained in a preset neural network search model to obtain the network types and input network node lists of the screened basic networks, generating a feature fusion network from the network types and the input network node lists, and connecting the feature fusion network to the pre-training model;
the training module is used for inputting the spliced feature vector into the feature fusion network to identify the implication relation of the corresponding training text pair, and training the feature fusion network with the neural network search model according to the implication relation to obtain a text implication relation recognition model, wherein each basic network in the feature fusion network attends to the feature attribute, corresponding to its network type, of the training text pair so as to identify the implication relation of the corresponding training text pair;
and the recognition module is used for acquiring a text pair to be recognized and recognizing the text pair to be recognized with the text implication relation recognition model to obtain the implication relation of the text pair to be recognized.
8. The apparatus according to claim 7, wherein the training module includes:
the feature input unit is used for inputting the spliced feature vector into the feature fusion network;
the path confirmation unit is used for determining a network connection list of each basic network according to the feature fusion network, and determining each basic network fusion path corresponding to the spliced feature vector according to the network connection list;
the feature fusion unit is used for performing feature fusion on the spliced feature vector by sequentially applying the corresponding basic networks along each basic network fusion path, to obtain a corresponding fused feature vector;
and the recognition unit is used for recognizing the implication relation of the corresponding training text pair according to the fusion feature vector.
9. A text implication relation recognition device, characterized by comprising: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invokes the instructions in the memory to cause the text implication relation recognition device to perform the text implication relation recognition method according to any one of claims 1-6.
10. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the text implication relation recognition method according to any one of claims 1-6.
CN202010762885.4A 2020-07-31 2020-07-31 Text inclusion relation recognition method, device, equipment and storage medium Active CN111914560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010762885.4A CN111914560B (en) 2020-07-31 2020-07-31 Text inclusion relation recognition method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010762885.4A CN111914560B (en) 2020-07-31 2020-07-31 Text inclusion relation recognition method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111914560A CN111914560A (en) 2020-11-10
CN111914560B true CN111914560B (en) 2023-01-31

Family

ID=73287443

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010762885.4A Active CN111914560B (en) 2020-07-31 2020-07-31 Text inclusion relation recognition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111914560B (en)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG11201406913VA (en) * 2012-04-26 2014-12-30 Nec Corp Text mining system, text mining method, and program
CN107133212B (en) * 2017-05-05 2020-06-26 北京大学 Text implication recognition method based on integrated learning and word and sentence comprehensive information
CN108922560B (en) * 2018-05-02 2022-12-02 杭州电子科技大学 Urban noise identification method based on hybrid deep neural network model
CN108985231B (en) * 2018-07-12 2021-08-13 广州麦仑信息科技有限公司 Palm vein feature extraction method based on multi-scale convolution kernel
CN109190120B (en) * 2018-08-31 2020-01-21 第四范式(北京)技术有限公司 Neural network training method and device and named entity identification method and device
CN109902301B (en) * 2019-02-26 2023-02-10 广东工业大学 Deep neural network-based relationship reasoning method, device and equipment
CN110334210A (en) * 2019-05-30 2019-10-15 哈尔滨理工大学 A kind of Chinese sentiment analysis method merged based on BERT with LSTM, CNN
CN110390397B (en) * 2019-06-13 2020-07-10 成都信息工程大学 Text inclusion recognition method and device
CN110516065A (en) * 2019-07-12 2019-11-29 杭州电子科技大学 A method of the attention neural network based on multichannel dynamic mask

Also Published As

Publication number Publication date
CN111914560A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
US10115056B2 (en) Method and apparatus for responding to an inquiry
US6501833B2 (en) Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system
Martin et al. Learning phonemes with a proto‐lexicon
US20080059190A1 (en) Speech unit selection using HMM acoustic models
CN110347787B (en) Interview method and device based on AI auxiliary interview scene and terminal equipment
US7966177B2 (en) Method and device for recognising a phonetic sound sequence or character sequence
EP1657650A2 (en) System and method for compiling rules created by machine learning program
US6763331B2 (en) Sentence recognition apparatus, sentence recognition method, program, and medium
CN114580382A (en) Text error correction method and device
US20220139248A1 (en) Knowledge-grounded dialogue system and method for language learning
CN111930914A (en) Question generation method and device, electronic equipment and computer-readable storage medium
KR20200105057A (en) Apparatus and method for extracting inquiry features for alalysis of inquery sentence
CN114360504A (en) Audio processing method, device, equipment, program product and storage medium
CN115116428A (en) Prosodic boundary labeling method, apparatus, device, medium, and program product
US11615787B2 (en) Dialogue system and method of controlling the same
CN111914560B (en) Text inclusion relation recognition method, device, equipment and storage medium
CN111858894A (en) Semantic missing recognition method and device, electronic equipment and storage medium
CN115906818A (en) Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium
CN113742445B (en) Text recognition sample obtaining method and device and text recognition method and device
Lucassen Discovering phonemic base forms automatically: an information theoretic approach
Baranwal et al. Improved Mispronunciation detection system using a hybrid CTC-ATT based approach for L2 English speakers
CN113012685B (en) Audio recognition method and device, electronic equipment and storage medium
Le et al. Automatic quality estimation for speech translation using joint ASR and MT features
CN114154497A (en) Language disease identification method and device, electronic equipment and storage medium
KR20040018008A (en) Apparatus for tagging part of speech and method therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant