CN111914067B - Chinese text matching method and system - Google Patents
Chinese text matching method and system
- Publication number
- CN111914067B (application CN202010837271.8A)
- Authority
- CN
- China
- Prior art keywords
- semantic
- word vector
- word
- representation
- chinese
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The embodiment of the invention provides a Chinese text matching method. The method comprises the following steps: carrying out character-level coding on a Chinese sentence pair by using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pair; inputting the initial character vectors of the Chinese sentence pair into an input layer, and determining semantic representations of the word vectors based on the HowNet external knowledge base; respectively carrying out iterative updating on the semantic representations and on the word lattice of the word vectors through a multidimensional graph attention network, and outputting semantic word vectors with semantic representations; inputting the semantic word vectors into a sentence matching layer, and determining the final feature-representation semantic word vectors of the Chinese sentence pair; and determining a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced with the plurality of word segmentation tools. The embodiment of the invention also provides a Chinese text matching system. According to the embodiment of the invention, semantic information from the HowNet external knowledge base is integrated into the model, so that the semantic information in sentences is better utilized and the matching effect is obviously improved.
Description
Technical Field
The invention relates to the field of text matching, in particular to a Chinese text matching method and a Chinese text matching system.
Background
Text matching is an important basic problem in natural language processing and underlies a large number of NLP (Natural Language Processing) tasks, such as information retrieval, question answering, dialogue systems and machine translation, which can to a great extent be abstracted as text matching problems. Word lattice convolutional neural networks and bidirectional multi-perspective matching of natural language sentences are commonly used for text matching.
Word lattice convolutional neural network: the word lattice is used as input, multiple CNN (Convolutional Neural Network) convolution kernels are applied to different n-gram texts to extract features, and the features are fused through a pooling mechanism for text matching.
Bidirectional multi-perspective matching of natural language sentences: this method uses words as input, each sentence is encoded by a BiLSTM (Bidirectional Long Short-Term Memory network), the features of the two sentences are interacted through several methods, and several kinds of interaction information are combined for classification.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the related art:
When the word lattice convolutional neural network is used, the features are derived from local information and global information cannot be fused, so the model may lose long-range information when extracting the features of a certain position in a sentence. In addition, this technique uses only representations of words and does not utilize semantic information.
When bidirectional multi-perspective matching of natural language sentences is used, although interaction information between sentences can be obtained, the input is a single word segmentation result, so errors caused by inaccurate segmentation can be introduced. In addition, this technique also does not utilize the semantic information of the words.
Disclosure of Invention
In order to solve at least the following problems in the prior art: in the word lattice convolutional neural network, the convolution over n-gram texts can obtain only local information, word vector representations are used, and no explicit semantic information is included; and in bidirectional multi-perspective matching, the input unit relies on a word segmentation tool, and the word segmentation tool cannot guarantee completely accurate segmentation.
In a first aspect, an embodiment of the present invention provides a method for matching a chinese text, including:
using a plurality of word segmentation tools to encode a Chinese sentence pair at the character level to obtain initial character vectors of the Chinese sentence pair;
inputting the initial character vectors of the Chinese sentence pair to an input layer, determining word vectors of the Chinese sentence pair, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
inputting the word vectors and semantic representations of the Chinese sentence pair to a graph transformation layer capable of sensing semantics, respectively carrying out iterative updating on the semantic representations and on the word lattice of the word vectors through a multidimensional graph attention network, and outputting semantic word vectors with semantic representations;
inputting the semantic word vectors to a sentence matching layer, connecting the obtained semantic word vectors of the Chinese sentence pair with the interactive semantic word vectors, and determining the final feature-representation semantic word vectors of the Chinese sentence pair;
determining a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced with the plurality of word segmentation tools.
In a second aspect, an embodiment of the present invention provides a chinese text matching system, including:
an encoding program module, used for encoding a Chinese sentence pair at the character level using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pair;
a semantic representation determining program module, configured to input the initial character vectors of the Chinese sentence pair to an input layer, determine word vectors of the Chinese sentence pair, obtain the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determine semantic representations of the word vectors;
an updating iteration program module, configured to input the word vectors and semantic representations of the Chinese sentence pair to a graph transformation layer capable of sensing semantics, respectively carry out iterative updating on the semantic representations and on the word lattice of the word vectors through a multidimensional graph attention network, and output semantic word vectors with semantic representations;
a matching program module, configured to input the semantic word vectors to a sentence matching layer, connect the obtained semantic word vectors of the Chinese sentence pair with the interactive semantic word vectors, and determine the final feature-representation semantic word vectors of the Chinese sentence pair;
a probability determination program module, configured to determine a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced with the plurality of word segmentation tools.
In a third aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the chinese text matching method of any of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the chinese text matching method according to any embodiment of the present invention.
The embodiment of the invention has the following beneficial effects: the segmentation results of multiple word segmentation tools are combined to construct a word lattice graph as the model input. Semantic information from the HowNet external knowledge base is merged into the model, so that the model can better utilize the semantic information in sentences. A semantic-knowledge-enhanced graph transformation model is used, and experiments prove that the model achieves an obvious performance improvement over baselines such as the word lattice convolutional neural network and bidirectional multi-perspective sentence matching.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method for matching Chinese text according to an embodiment of the present invention;
FIG. 2 is an overall structure diagram of a product of a method for matching Chinese texts according to an embodiment of the present invention;
FIG. 3 is a diagram of a semantic information structure of a Chinese text matching method according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating updated semantic representations and word vectors of a Chinese text matching method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of word segmentation and potential word ambiguity of a Chinese text matching method according to an embodiment of the present invention;
FIG. 6 is a graph of performance data of different models of a Chinese text matching method on LCQMC and BQ test data sets according to an embodiment of the present invention;
FIG. 7 is a graph of performance data using different segments on the LCQMC test data set for a Chinese text matching method according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a chinese text matching system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a method for matching a chinese text according to an embodiment of the present invention, which includes the following steps:
S11: using a plurality of word segmentation tools to encode a Chinese sentence pair at the character level to obtain initial character vectors of the Chinese sentence pair;
S12: inputting the initial character vectors of the Chinese sentence pair to an input layer, determining word vectors of the Chinese sentence pair, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
S13: inputting the word vectors and semantic representations of the Chinese sentence pair to a graph transformation layer capable of sensing semantics, respectively carrying out iterative updating on the semantic representations and on the word lattice of the word vectors through a multidimensional graph attention network, and outputting semantic word vectors with semantic representations;
S14: inputting the semantic word vectors to a sentence matching layer, connecting the obtained semantic word vectors of the Chinese sentence pair with the interactive semantic word vectors, and determining the final feature-representation semantic word vectors of the Chinese sentence pair;
S15: determining a matching probability based on the final feature-representation semantic word vectors and the feature representations of the Chinese sentence pair.
In this embodiment, the overall model for Chinese text matching comprises an input layer and the graph transformation layer of a graph transformation network, so that the model can attend to the graph information formed by the whole sentence rather than only local information; meanwhile, the HowNet external knowledge base is introduced, so that semantic information can be merged into the model and the matching performance is further improved, as shown in FIG. 2.
For step S11, a plurality of word segmentation tools, such as jieba, pkuseg and thulac, are prepared in advance. Since text matching is involved, typically one sentence of the sentence pair is a question entered by the user and the other is a question in the question text library, and the task is to determine whether the two sentences match. The prepared word segmentation tools are used to segment the sentence pair several times to obtain all segmentation results. For encoding, the pre-trained model BERT (Bidirectional Encoder Representations from Transformers) can be used to encode the two sentences at the character level to obtain the initial character vector representations.
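For illustration, the following is a minimal sketch (not the patented implementation) of how lattice nodes can be built by merging the outputs of several segmenters. Only jieba's real API is shown; pkuseg/thulac wrappers would be passed in the same way, and the span-union logic is an assumption for illustration.

```python
# Illustrative sketch: merge the outputs of several segmenters into one
# set of word-lattice nodes, each node being a (start, end, word) span.
import jieba

def spans(tokens):
    """Convert a token list into (start, end, word) character spans."""
    out, pos = [], 0
    for tok in tokens:
        out.append((pos, pos + len(tok), tok))
        pos += len(tok)
    return out

def lattice_nodes(sentence, segmenters):
    """Union of all tools' segmentation paths = the lattice node set."""
    nodes = set()
    for segment in segmenters:
        nodes.update(spans(segment(sentence)))
    return sorted(nodes)

# jieba.lcut is a real call; pkuseg.pkuseg().cut or a thulac wrapper
# could be appended to the segmenter list.
print(lattice_nodes("南京市长江大桥", [jieba.lcut]))
```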
For step S12, word vectors then need to be obtained. Each word contains several characters; the weight of each character is obtained through a feed-forward neural network, and the character vectors are weighted accordingly to obtain the word vector.
Semantic information in the HowNet external knowledge base is then introduced. As shown in FIG. 3, each word may have multiple senses, each sense being described by some sememes. HowNet takes the sememe as the smallest semantic unit, and there are 1985 sememes in total in the knowledge base. The vector representation of each sense is likewise obtained by pooling-weighting the vector representations of its sememes, and is written below as the semantic representation. Each word may have several semantic representations, because some Chinese words are polysemous.
For step S13, in the graph transformation layer module capable of sensing semantics, the semantic representations and the word vector representations are updated iteratively in sequence. The update uses MD-GAT (the multidimensional graph attention network), which is used many times in this method and is applied as a function below. Both the word lattice and the sequence of sememes are treated as graphs.
To obtain the representation after the l-th update, the representations of the nodes connected to node x_j, together with x_j's own representation from the previous round, are used. The update is a weighted sum of the representations of the connected nodes.
As an embodiment, the method further comprises: iteratively updating the semantic representation through the reachable nodes of the word node corresponding to the semantic representation, iteratively updating the word lattice of the word vector through the sense nodes corresponding to the word node, and outputting the semantic word vector with the semantic representation.
In the present embodiment, as shown in FIG. 4, when updating a semantic representation, the representations of the reachable nodes of the word node corresponding to the semantic representation are used as information, and a Gated Recurrent Unit (GRU) is used as the control unit for history information and new information. When updating a word vector, the representations of the sense nodes corresponding to the word node are used as information, and the GRU again serves as the control unit for history information and updated new information. The GRU can thus retain a portion of the useful history information. This module can be iterated several times in the model; experiments prove that two iterations give the best effect.
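The following PyTorch sketch illustrates one possible shape of this two-sub-step iteration. It is illustrative only: a plain scaled dot-product attention stands in for the multidimensional MD-GAT, and all module names and data layouts are assumptions rather than the patent's implementation.

```python
import torch, torch.nn as nn

class SaGTLayer(nn.Module):
    """One sense/word update iteration (sketch; two iterations worked best
    per the experiments). `attend` is a dot-product stand-in for MD-GAT."""
    def __init__(self, d):
        super().__init__()
        self.sense_gru = nn.GRUCell(2 * d, d)   # input: [q_fw; q_bw]
        self.word_gru = nn.GRUCell(d, d)

    @staticmethod
    def attend(query, keys):                    # query: (d,), keys: (n, d)
        w = torch.softmax(keys @ query / keys.shape[-1] ** 0.5, dim=0)
        return w @ keys                         # weighted average, (d,)

    def forward(self, h, s, fw, bw, senses_of):
        # h: dict node -> word rep (d,); s: dict (node, k) -> sense rep (d,)
        for (i, k), rep in s.items():           # sub-step 1: update senses
            q_fw = self.attend(rep, torch.stack([h[j] for j in fw[i]]))
            q_bw = self.attend(rep, torch.stack([h[j] for j in bw[i]]))
            s[i, k] = self.sense_gru(torch.cat([q_fw, q_bw]).unsqueeze(0),
                                     rep.unsqueeze(0)).squeeze(0)
        for i in h:                             # sub-step 2: update words
            m = self.attend(h[i], torch.stack([s[i, k] for k in senses_of[i]]))
            h[i] = self.word_gru(m.unsqueeze(0), h[i].unsqueeze(0)).squeeze(0)
        return h, s
```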
For step S14, inputting the semantic word vector determined in step S13 to the sentence matching layer includes:
performing pooling weighting on the semantic word vector to obtain a weighted semantic word vector;
normalizing the weighted semantic word vector and the initial word vector to determine a semantic word vector;
interacting the semantic word vectors of the Chinese sentence pairs through a multidimensional graph attention network to obtain interactive semantic word vectors;
and inputting the semantic word vector and the interactive semantic word vector of the Chinese sentence pair into a feed-forward neural network to generate a final feature expression semantic word vector.
In the present embodiment, a character-level vector representation is first obtained by pooling-weighting the updated word vectors, the pooled input being all the words that contain the character. This vector and the initial character vector representation obtained by BERT encoding are then normalized to obtain a new character vector. In this module, not only is multidimensional graph attention applied within each sentence, but interaction between the sentences is also performed, and the interaction likewise uses multidimensional graph attention. The sentence information and the interaction information are concatenated, and the final vector representation is obtained through a two-layer feed-forward neural network. Sentence vectors are then obtained through pooling-weighting.
For step S15, the vectors of the two sentences, their element-wise product and the absolute value of their difference are concatenated, the BERT-encoded feature representation is then concatenated as well, and a two-layer feed-forward neural network with an activation function is used to obtain the classification probability.
$$p = \mathrm{FFN}\big([c_{CLS};\ r_a;\ r_b;\ r_a \odot r_b;\ |r_a - r_b|]\big)$$
where $r_a$ and $r_b$ are the vectors of the two sentences respectively, and $c_{CLS}$ is the feature representation after BERT encoding.
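A hedged PyTorch sketch of this classifier head is given below; the hidden-layer sizes and the assumption that $c_{CLS}$ shares the dimension of the sentence vectors are illustrative choices, not fixed by the patent.

```python
import torch, torch.nn as nn

class RelationClassifier(nn.Module):
    """Sketch of p = FFN([c_CLS; r_a; r_b; r_a*r_b; |r_a - r_b|])."""
    def __init__(self, d):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(5 * d, d), nn.ReLU(),   # hidden layer 1
            nn.Linear(d, d), nn.ReLU(),       # hidden layer 2
            nn.Linear(d, 1), nn.Sigmoid())    # sigmoid output

    def forward(self, c_cls, r_a, r_b):
        feats = torch.cat([c_cls, r_a, r_b, r_a * r_b, (r_a - r_b).abs()],
                          dim=-1)
        return self.ffn(feats).squeeze(-1)    # match probability in (0, 1)
```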
It can be seen from this embodiment that the segmentation results of multiple segmentation tools are combined to construct a word lattice graph as the model input. Semantic information from the HowNet external knowledge base is fused into the model, so that the model can better utilize the semantic information in sentences. Using the semantic-knowledge-enhanced graph transformation model, experiments prove that the model achieves a remarkable performance improvement over baselines such as the word lattice convolutional neural network and bidirectional multi-perspective sentence matching.
The method is now described in detail. Pre-trained language models such as BERT have shown powerful performance on various natural language processing tasks, including text matching. For Chinese text matching, BERT takes a pair of sentences as input, with each Chinese character being a separate input token; it ignores word information. To address this problem, some Chinese variants of the original BERT have been proposed, such as BERT-wwm and ERNIE (two prior-art models). However, pre-training a BERT that considers words requires a great deal of time and resources. Therefore, the model of this method adopts a pre-trained language model as initialization and fine-tunes it using word information.
HowNet is an external knowledge base that manually annotates each Chinese word sense with one or more related sememes. HowNet treats the sememe as an atomic semantic unit. Unlike WordNet, it emphasizes that the various parts and attributes of a concept can be well represented by its sememes. HowNet has found wide application in many natural language processing tasks, such as word similarity calculation, sentiment analysis, word representation learning and language modeling. However, there has been less research on its effectiveness in short text matching tasks, especially in combination with pre-trained language models.
GAT (graph attention network) is a special type of network that handles graph-structured data through an attention mechanism. Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ and $\mathcal{E}$ are the set of nodes $x_i$ and the set of edges respectively, $N(x_i)$ is the set comprising node $x_i$ itself and the nodes directly connected to $x_i$.
Each node $x_i$ in the graph has an initial feature vector $h_i^{(0)} \in \mathbb{R}^d$, where $d$ is the feature dimension. The representation of each node is iteratively updated by graph attention operations. In the $l$-th step, each node $x_i$ aggregates context information from its neighbors and itself. The updated representation $h_i^{(l)}$ is calculated as a weighted average of the connected nodes:
$$h_i^{(l)} = \sigma\Big(\sum_{x_j \in N(x_i)} \alpha_{ij}^{(l)}\, W^{(l)} h_j^{(l-1)}\Big)$$
where $W^{(l)}$ is a learnable parameter and $\sigma(\cdot)$ is a nonlinear activation function, such as ReLU. The attention coefficient $\alpha_{ij}^{(l)}$ is the normalized similarity between the embeddings of the two nodes $x_i$ and $x_j$ in a unified space.
Note that in the above formula $\alpha_{ij}^{(l)}$ is a scalar, which means that all feature dimensions of $W^{(l)} h_j^{(l-1)}$ are treated equally. This may limit the ability to model complex dependencies. Multidimensional attention, used instead of ordinary attention, has proven useful for dealing with context variation and polysemy in many NLP tasks. For each embedding $h_j^{(l-1)}$, it does not compute a single scalar score; instead, a feature-wise score vector is first computed and then normalized with a feature-wise multidimensional softmax:
$$\hat{\alpha}_{ij}^{(l)} = \text{MD-softmax}_j\big(s(h_i^{(l-1)}, h_j^{(l-1)})\,\mathbf{1} + f(h_j^{(l-1)})\big)$$
where $s(\cdot,\cdot)$ is a scalar computed by the similarity function and $f(\cdot)$ is a vector; the addition in the equation means that the scalar is added to each element of the vector. $s(\cdot,\cdot)$ models the pairwise dependency of the two nodes, and $f(\cdot)$ estimates the contribution of each feature dimension:
$$s(h_i, h_j) = \frac{(W_q h_i)^\top (W_k h_j)}{\sqrt{d}}, \qquad f(h_j) = W_2\,\sigma(W_1 h_j + b_1) + b_2$$
where $W_q$, $W_k$, $W_1$, $W_2$, $b_1$ and $b_2$ are learnable parameters. According to the score vector $\hat{\alpha}_{ij}^{(l)}$, the update is modified as follows:
$$h_i^{(l)} = \sigma\Big(\sum_{x_j \in N(x_i)} \hat{\alpha}_{ij}^{(l)} \odot \big(W^{(l)} h_j^{(l-1)}\big)\Big)$$
where $\odot$ indicates the element-wise product of two vectors. For simplicity, the update process using the multidimensional attention mechanism is represented using MD-GAT(·):
$$h_i^{(l)} = \text{MD-GAT}\big(h_i^{(l-1)}, \{h_j^{(l-1)} \mid x_j \in N(x_i)\}\big)$$
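A possible PyTorch rendering of one MD-GAT update, following the formulas above, might look as follows; the exact parameterization of the patented model may differ.

```python
import torch, torch.nn as nn

class MDGAT(nn.Module):
    """Sketch of one multidimensional graph-attention update: a scalar
    pairwise score s(h_i, h_j) plus a feature-wise bias f(h_j), normalized
    per feature dimension over the neighborhood."""
    def __init__(self, d):
        super().__init__()
        self.wq, self.wk, self.wv = (nn.Linear(d, d, bias=False)
                                     for _ in range(3))
        self.f = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
        self.d = d

    def forward(self, h_i, neighbors):          # h_i: (d,), neighbors: (n, d)
        s = (self.wk(neighbors) @ self.wq(h_i)) / self.d ** 0.5       # (n,)
        scores = s.unsqueeze(-1) + self.f(neighbors)                  # (n, d)
        alpha = torch.softmax(scores, dim=0)    # feature-wise over neighbors
        return torch.relu((alpha * self.wv(neighbors)).sum(dim=0))   # (d,)
```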
after L steps of updating, each node finally has a context-aware representationTo obtain a stable training process, a residual join is also used, followed by layer normalization between the two graph attention layers.
In the specific implementation, the Chinese short text matching task is defined as follows: given two Chinese sentences $C^a = \{c^a_1, c^a_2, \ldots, c^a_{T_a}\}$ and $C^b = \{c^b_1, c^b_2, \ldots, c^b_{T_b}\}$, the goal of a text matching model $f(C^a, C^b)$ is to predict whether the semantics of $C^a$ and $C^b$ are equivalent. Here $c^a_t$ and $c^b_{t'}$ denote the $t$-th and $t'$-th Chinese characters of the two sentences, and $T_a$ and $T_b$ denote the numbers of characters in the sentences.
In the method, a matching model enhanced with linguistic knowledge is provided. Rather than dividing each sentence into one sequence of words, all possible segmentation paths are merged to form a word lattice graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges. Each node $x_i \in \mathcal{V}$ corresponds to a word $w_i$, which is the character subsequence of the sentence running from the $t_1$-th character to the $t_2$-th character. As shown in FIG. 5, all the senses of $w_i$ can be obtained by retrieving HowNet.
For two nodes $x_i$ and $x_j$, there is an edge between them if $x_i$ is adjacent to $x_j$ in the original sentence. $N_{fw}(x_i)$ consists of $x_i$ itself and the set of all nodes reachable in its forward direction, and $N_{bw}(x_i)$ consists of $x_i$ itself and all nodes reachable backwards from it.
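As a worked illustration of these reachability sets, the sketch below computes $N_{fw}$ and $N_{bw}$ over (start, end, word) spans; representing lattice nodes as such spans is an assumption made for illustration.

```python
# Sketch: forward/backward reachable sets over the lattice. Two word nodes
# are adjacent when one starts where the other ends in the original sentence.
def reachable_sets(nodes):
    """nodes: list of (start, end, word) spans; returns N_fw and N_bw."""
    n_fw = {i: {i} for i in range(len(nodes))}
    n_bw = {i: {i} for i in range(len(nodes))}
    # Process nodes right-to-left so each node can extend the reachability
    # of its forward neighbors transitively.
    order = sorted(range(len(nodes)), key=lambda i: nodes[i][0], reverse=True)
    for i in order:
        for j in range(len(nodes)):
            if nodes[j][0] == nodes[i][1]:       # j starts where i ends
                n_fw[i] |= n_fw[j]
    for i in n_fw:
        for j in n_fw[i]:
            n_bw[j].add(i)                       # i reaches j, so j back-reaches i
    return n_fw, n_bw
```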
Given the two graphs $\mathcal{G}^a$ and $\mathcal{G}^b$ for the two sentences $C^a$ and $C^b$, the graph matching model predicts the similarity between them to judge whether the original sentences $C^a$ and $C^b$ have the same meaning. As shown in FIG. 2, the model consists of four parts: an input module, a semantic-aware graph transformer (SaGT), a sentence matching layer and a relation classifier. The input module outputs the initial representation of each word $w_i$ and the initial sense representation of each sense. The semantic-aware graph transformer iteratively updates the word representations and the sense representations, fusing the useful information of each into the other. The sentence matching layer first merges the word representations back into the characters and then matches the two character sequences with a bilateral multi-perspective matching mechanism. The relation classifier takes the sentence vectors as input and predicts the relationship between the two sentences.
Context embedding in the input module: for each node $x_i$ in the graph, the embedding of word $w_i$ is an attentive pooling of contextual character representations. Specifically, the original character-level sentences are first concatenated to form a new sequence $\{[CLS], C^a, [SEP], C^b, [SEP]\}$, which is provided to the BERT model to obtain a contextual representation $c_k$ of each character. Suppose the word $w_i$ consists of the consecutive characters $c_{t_1}, \ldots, c_{t_2}$. For each character $c_k$ ($t_1 \le k \le t_2$), a two-layer feed-forward network (FFN) computes a feature-wise score vector, which is then normalized with the feature-wise multidimensional softmax (MD-softmax):
$$u_k = \text{MD-softmax}_k\big(\text{FFN}(c_k)\big)$$
The corresponding character embedding $c_k$ is weighted by the normalized score $u_k$ to give the contextual word embedding:
$$v_i = \sum_{k=t_1}^{t_2} u_k \odot c_k$$
For simplicity, the above formula is abbreviated using Att-Pooling(·):
$$v_i = \text{Att-Pooling}(\{c_k \mid t_1 \le k \le t_2\})$$
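A minimal PyTorch sketch of Att-Pooling as defined above (a feature-wise softmax over the characters of a word, then a weighted sum); the module layout is illustrative.

```python
import torch, torch.nn as nn

class AttPooling(nn.Module):
    """Sketch of Att-Pooling: a two-layer FFN scores each character, the
    scores are softmax-normalized per feature dimension across positions,
    and the character embeddings are summed with those weights."""
    def __init__(self, d):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, chars):                       # chars: (n, d) for c_{t1..t2}
        u = torch.softmax(self.ffn(chars), dim=0)   # MD-softmax over positions
        return (u * chars).sum(dim=0)               # context word embedding v_i
```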
Sememe embedding: the method uses HowNet as the external knowledge base to express the semantic information of words. In view of polysemy, HowNet distinguishes the different senses of each ambiguous word. An example is given in FIG. 3: the word "Apple" has two senses, "brand (Apple brand)" and "fruit (apple)", and the sense "brand (Apple brand)" has five sememes, "computer", "PatternValue", "able", "bring" and "SpecificBrand".
For each word $w_i$, let $s_{i,k}$ denote the $k$-th sense of $w_i$, and let its corresponding sememes be the set $O_{i,k} = \{o_{i,k,1}, o_{i,k,2}, \ldots\}$. To obtain the embedding $s^{(0)}_{i,k}$ of each sense $s_{i,k}$, the representation of each sememe is first obtained with the multidimensional attention function:
$$\hat{e}_{i,k,m} = \text{MD-GAT}\big(e_{i,k,m}, \{e_{i,k,n} \mid o_{i,k,n} \in O_{i,k}\}\big)$$
where $e_{i,k,m}$ is the embedding vector of the sememe $o_{i,k,m}$ generated by the SAT model. The embedding of each sense $s_{i,k}$ is then obtained by attentive aggregation of all its sememe representations:
$$s^{(0)}_{i,k} = \text{Att-Pooling}(\{\hat{e}_{i,k,m} \mid o_{i,k,m} \in O_{i,k}\})$$
semantically aware graph converter, for each node x in the graphiInsert word viContaining only context information, without explicit linguistic knowledge, but embedding si,kOnly containing linguistic knowledge and no contextual information. In order to obtain useful information therefrom, a semantic perception map converter (SaGT) is proposed. The algorithm first finds vi and si,kRespectively as word wiInitial word representation ofAnd initial sense representation of word senseThe word representation and the meaning representation are then iteratively updated in two sub-steps.
Updating the sense representations: in the $l$-th iteration, the first sub-step updates the sense representation from $s^{(l-1)}_{i,k}$ to $s^{(l)}_{i,k}$. For a word with multiple senses, which sense should be used is usually determined by the context in the sentence. Thus, when updating its representation, each sense first aggregates useful information from the words reachable from $x_i$ in the forward and backward directions:
$$q^{fw}_{i,k} = \text{MD-GAT}\big(s^{(l-1)}_{i,k}, \{h^{(l-1)}_j \mid x_j \in N_{fw}(x_i)\}\big), \qquad q^{bw}_{i,k} = \text{MD-GAT}\big(s^{(l-1)}_{i,k}, \{h^{(l-1)}_j \mid x_j \in N_{bw}(x_i)\}\big)$$
where the two multidimensional attention functions MD-GAT(·) have different parameters. Based on $q_{i,k} = [q^{fw}_{i,k}; q^{bw}_{i,k}]$, each sense updates its representation with a Gated Recurrent Unit (GRU):
$$s^{(l)}_{i,k} = \text{GRU}\big(s^{(l-1)}_{i,k}, q_{i,k}\big)$$
The detailed update function of the GRU is as follows:
$$z = \sigma\big(W_z [s^{(l-1)}_{i,k}; q_{i,k}] + b_z\big)$$
$$r = \sigma\big(W_r [s^{(l-1)}_{i,k}; q_{i,k}] + b_r\big)$$
$$\tilde{s} = \tanh\big(W_g [r \odot s^{(l-1)}_{i,k}; q_{i,k}] + b_g\big)$$
$$s^{(l)}_{i,k} = (1 - z) \odot s^{(l-1)}_{i,k} + z \odot \tilde{s}$$
where $W_z$, $W_r$, $W_g$, $b_z$, $b_r$ and $b_g$ are learnable parameters.
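Written as code, the gate equations above could look like the following PyTorch sketch; the module name FusionGRU and the concatenation layout are illustrative.

```python
import torch, torch.nn as nn

class FusionGRU(nn.Module):
    """The gate equations above, written out explicitly (sketch).
    q is the aggregated context; s is the previous sense representation."""
    def __init__(self, d):
        super().__init__()
        self.wz = nn.Linear(2 * d, d)            # update gate   (W_z, b_z)
        self.wr = nn.Linear(2 * d, d)            # reset gate    (W_r, b_r)
        self.wg = nn.Linear(2 * d, d)            # candidate     (W_g, b_g)

    def forward(self, s, q):
        z = torch.sigmoid(self.wz(torch.cat([s, q], -1)))
        r = torch.sigmoid(self.wr(torch.cat([s, q], -1)))
        g = torch.tanh(self.wg(torch.cat([r * s, q], -1)))
        return (1 - z) * s + z * g               # fused new representation
```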
It is worth noting that $q_{i,k}$ is not used directly as the new sense representation $s^{(l)}_{i,k}$; the reason is that $q_{i,k}$ contains only contextual information, and a gate such as a GRU is needed to control the fusion of contextual information and semantic information.
Updating the word representations: the second sub-step updates the word representation from $h^{(l-1)}_i$ to $h^{(l)}_i$ based on the updated sense representations $s^{(l)}_{i,k}$. The word $w_i$ first obtains semantic information from its sense representations:
$$m_i = \text{MD-GAT}\big(h^{(l-1)}_i, \{s^{(l)}_{i,k} \mid k = 1, 2, \ldots\}\big)$$
Its representation is then updated with a GRU:
$$h^{(l)}_i = \text{GRU}\big(h^{(l-1)}_i, m_i\big)$$
After multiple iterations, the final word representation $h^{(L)}_i$ contains not only contextual word information but also semantic knowledge. For each sentence, $\{h^a_i\}$ and $\{h^b_i\}$ denote the final representations.
Sentence matching layer: having obtained the semantic-knowledge-enhanced word representations $\{h^a_i\}$ and $\{h^b_i\}$ of each sentence, this word information is then incorporated back into the characters. Without loss of generality, the process is described with the characters of sentence $C^a$. For each character $c^a_t$, useful word information $\hat{y}^a_t$ can be obtained by combining the representations of all words that include the character. Thus, the semantic-knowledge-enhanced character representation $y^a_t$ can be obtained by:
$$y^a_t = \text{LayerNorm}\big(c^a_t + \hat{y}^a_t\big)$$
where LayerNorm(·) indicates layer normalization, and $c^a_t$ is the contextual character representation obtained using BERT as described above.
For each character representation $y^a_t$, multidimensional attention is used to aggregate information from within $C^a$ and from $C^b$ respectively:
$$y^{self}_t = \text{MD-GAT}\big(y^a_t, \{y^a_{t'}\}\big), \qquad y^{cross}_t = \text{MD-GAT}\big(y^a_t, \{y^b_{t'}\}\big)$$
The two multidimensional attention functions MD-GAT(·) share the same parameters. Through this sharing mechanism the model has a very good property: when two sentences match exactly, $y^{self}_t = y^{cross}_t$.
The two aggregated representations are compared from multiple perspectives:
$$d_k = \text{cosine}\big(w_k \odot y^{self}_t,\; w_k \odot y^{cross}_t\big)$$
where $k \in \{1, 2, \ldots, P\}$ ($P$ is the number of perspectives) and $w_k$ is a parameter vector that assigns different weights to different dimensions of the messages. Using the $P$ distances $d_1, d_2, \ldots, d_P$ together with the aggregated representations, the final character representation is obtained.
Similarly, the final character representation of each character in sentence $C^b$ can be obtained. Note that the final character representation contains three types of information: contextual information, knowledge of words and word senses, and character-level similarity. For each sentence $C^a$ or $C^b$, a sentence representation vector $r_a$ or $r_b$ is obtained by attention-weighted pooling over all final character representations of the sentence.
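The multi-perspective comparison can be sketched as follows; the random initialization of the perspective weight vectors is an assumption made so the snippet runs standalone.

```python
import torch, torch.nn as nn

class MultiPerspective(nn.Module):
    """Sketch of the P-perspective comparison: each perspective re-weights
    the feature dimensions before taking a cosine similarity."""
    def __init__(self, d, P):
        super().__init__()
        self.w = nn.Parameter(torch.randn(P, d))  # one weight vector per view

    def forward(self, y_self, y_cross):           # both: (d,)
        a = self.w * y_self                       # (P, d)
        b = self.w * y_cross
        return nn.functional.cosine_similarity(a, b, dim=-1)  # (P,) distances
```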
Relation classifier: using the two sentence vectors $r_a$ and $r_b$ and the vector $c_{CLS}$ obtained from BERT, the model predicts the similarity of the two sentences:
$$p = \text{FFN}\big([c_{CLS};\ r_a;\ r_b;\ r_a \odot r_b;\ |r_a - r_b|]\big)$$
where FFN(·) is a feed-forward network with two hidden layers and a sigmoid activation function after the output layer. With $N$ training samples $\{(C^a_i, C^b_i, y_i)\}_{i=1}^{N}$, the training goal is to minimize the binary cross-entropy loss:
$$\mathcal{L} = -\sum_{i=1}^{N}\big(y_i \log p_i + (1 - y_i)\log(1 - p_i)\big)$$
where $y_i \in \{0, 1\}$ is the label of the $i$-th training sample, and $p_i \in [0, 1]$ is the prediction of the model with the sentence pair as input.
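This objective is standard binary cross-entropy; for instance:

```python
import torch

def matching_loss(p, y):
    """Binary cross-entropy over predicted match probabilities.
    p, y: tensors of shape (N,), with p in (0, 1) and y in {0, 1}."""
    return torch.nn.functional.binary_cross_entropy(p, y.float())

# e.g. matching_loss(torch.tensor([0.9, 0.2]), torch.tensor([1, 0]))
```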
The above method was tested. We performed semantic text similarity experiments on two Chinese data sets: LCQMC and BQ.
LCQMC is a large-scale open-domain question matching corpus. It consists of 260,068 Chinese sentence pairs, including 238,766 training samples, 8,802 validation samples and 12,500 test samples. Each pair is associated with a binary label indicating whether the two sentences have the same meaning or intent. There are 30% more positive samples than negative samples.
BQ is a large-scale domain-specific corpus for bank question matching. It consists of 120,000 Chinese sentence pairs, including 100,000 training samples, 10,000 validation samples and 10,000 test samples. Each pair is also associated with a binary label indicating whether the two sentences have the same meaning. The numbers of positive and negative samples are the same.
Evaluation indexes: the accuracy (ACC) and F1 score on each data set were used as evaluation indexes. Accuracy is the percentage of examples correctly classified. The F1 score of matching is the harmonic mean of precision and recall.
Hyper-parameters: the input word graph is constructed with three word segmentation tools (jieba, pkuseg and thulac). We use the 200-dimensional pre-trained sememe embeddings provided by OpenHowNet. The number of graph update steps/layers L is 2 on both data sets. The dimension of the word and word sense representations is 128. The dropout rate for all hidden layers is 0.2. Each model was trained by RMSProp with an initial learning rate of 0.0005, with the learning rate of the BERT layer multiplied by an additional factor of 0.1. As for the batch size, 32 is used for LCQMC and 64 for BQ.
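A sketch of the described optimizer setup (RMSProp at 5e-4, with the BERT layer at a 0.1x factor); the `model.bert` attribute is an assumed module layout and would be adapted to the actual model.

```python
import torch

def build_optimizer(model):
    """RMSProp with two parameter groups, per the hyper-parameters above."""
    bert_params = list(model.bert.parameters())   # assumed attribute name
    bert_ids = {id(p) for p in bert_params}
    rest = [p for p in model.parameters() if id(p) not in bert_ids]
    return torch.optim.RMSprop(
        [{"params": rest, "lr": 5e-4},
         {"params": bert_params, "lr": 5e-4 * 0.1}])
```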
The model of the method was compared with three types of baselines: representation-based models, interaction-based models and BERT-based models. The results are summarized in FIG. 6. In order to ensure the reliability of the experimental results, the same experiment was performed five times and the average score is given. The baseline experiments were run by ourselves using their published parameters.
The representation-based models include Text-CNN, BiLSTM, Lattice-CNN and the model LET-1. Text-CNN is a Siamese architecture in which convolutional neural networks (CNNs) are used to encode each sentence. BiLSTM is another Siamese architecture, with a bidirectional long short-term memory network (Bi-LSTM) encoding each sentence. Lattice-CNN was also proposed to deal with the underlying Chinese word segmentation problem: it takes a word lattice as input and uses a pooling mechanism to merge the feature vectors generated by multiple CNN kernels over the different n-gram contexts of each node in the lattice graph. LET-1 is a variant of the model we propose; here BERT is replaced with a conventional character-level Transformer encoder and the interaction between sentences in the sentence matching layer is removed.
From FIG. 6 we can find that our LET-1 outperforms all baseline models on both data sets. More specifically, LET-1 performs better than Lattice-CNN: although both utilize word lattices, Lattice-CNN focuses only on local information, while our model can utilize global information. In addition, adding semantic messages between sentences greatly improves the performance of the model. Furthermore, both lattice-based models (Lattice-CNN and LET-1) perform better than the other two baselines (Text-CNN and BiLSTM), indicating the usefulness of word lattices on this task.
The interaction-based models include two baselines, BiMPM and ESIM, and our model LET-2. BiMPM is a bilateral multi-perspective matching model: it encodes each sentence with a BiLSTM and matches the two sentences from multiple perspectives. BiMPM performs well on several natural language inference (NLI) tasks. ESIM contains two BiLSTMs: the first encodes the sentences, and the other fuses the word alignment information between the two sentences. ESIM achieves state-of-the-art results on various matching tasks. LET-2 is also a variant of LET; BERT is replaced with a conventional character-level Transformer encoder, similar to LET-1, but here we introduce the interaction mechanism.
The results of the above three models are shown in the second part of FIG. 6. Our LET-2 performs better than the other models. In particular, LET-2 performs better than BiMPM, although both use a multi-perspective matching mechanism; this suggests that our graph network makes more powerful use of word lattices. Furthermore, LET-2 performs better than LET-1, indicating the usefulness of character-level comparison between the two sentences.
The BERT-based models include four baselines: BERT, BERT-wwm, BERT-wwm-ext and ERNIE. We compare them with our proposed model LET-3. BERT is the official Chinese BERT model released by Google. BERT-wwm is a Chinese BERT that uses a whole-word masking mechanism during pre-training. BERT-wwm-ext is a variant of BERT-wwm that uses more training data and more training steps. ERNIE is designed to learn language representations enhanced by knowledge masking strategies, including entity-level masking and phrase-level masking. LET-3 is the LET model we propose, with BERT as the character-level encoder.
The results are shown in the third part of FIG. 6. We can find that all three variants of BERT (BERT-wwm, BERT-wwm-ext, ERNIE) exceed the original BERT, indicating that using word-level information during pre-training is very important for the Chinese matching task. The performance of our LET-3 model is superior to all of these BERT-based models. The result shows that incorporating sememe information when fine-tuning with LET is an effective way to improve BERT on Chinese semantic matching.
To verify the effectiveness of incorporating lexical semantic information from HowNet, we performed experiments on the LCQMC test set with LET-3. In the comparison model without HowNet knowledge, the word sense updating module in the SaGT is removed and the word representations are updated by multidimensional self-attention. FIG. 7 lists the experimental results of the two models for the three word segmentation methods and their combination (lattice). For all types of segmentation, integrating semantic information performs better than simple word representations: LET can acquire semantic information from HowNet and improve the performance of the model. More specifically, when semantic information is used, the accuracy and F1 score of the lattice-based model improve on average by 0.71% and 0.43% respectively. The semantic information thus performs better on the lattice-based model; a possible reason is that the lattice-based model contains more candidate words, so the model can perceive more senses.
Furthermore, we designed an experiment to explore the impact of using different segmentation inputs. As shown by the performance data of different segmentations on the LCQMC test data set in FIG. 7, a significant improvement can be seen between the lattice-based model and the word-based models such as pkuseg and thulac. We believe this is because the lattice-based model can reduce the impact of word segmentation errors, making the prediction more accurate.
We also investigated the role of the GRU in the SaGT. With the GRU removed, the accuracy of the model averages 87.82%, indicating that the GRU can control and integrate historical messages with current information. Through experimentation, we found that the model with two layers of SaGT achieves the best results. This indicates that multiple rounds of information fusion refine the messages and make the model more robust.
The method provides a new linguistic-knowledge-enhanced graph transformer for matching Chinese short texts. The model takes two word lattice graphs as input and integrates the sememe information of HowNet, solving the word ambiguity problem to a certain extent. The method was evaluated on two benchmark datasets and achieves the best performance. Ablation analysis also shows that both semantic information and multi-granularity information are important for text matching modeling.
Fig. 8 is a schematic structural diagram of a chinese text matching system according to an embodiment of the present invention, which can execute the chinese text matching method according to any of the embodiments and is configured in a terminal.
The system for matching Chinese texts provided by the embodiment comprises: an encoding program module 11, a semantic representation determining program module 12, an update iterator module 13, a matching program module 14 and a probability determining program module 15.
The encoding program module 11 is configured to perform character-level encoding on a Chinese sentence pair using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pair; the semantic representation determining program module 12 is configured to input the initial character vectors of the Chinese sentence pair to an input layer, determine word vectors of the Chinese sentence pair, obtain the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determine semantic representations of the word vectors; the updating iteration program module 13 is configured to input the word vectors and semantic representations of the Chinese sentence pair to a graph transformation layer capable of sensing semantics, perform iterative updating on the semantic representations and on the word lattice of the word vectors respectively through a multidimensional graph attention network, and output semantic word vectors with semantic representations; the matching program module 14 is configured to input the semantic word vectors to a sentence matching layer, connect the obtained semantic word vectors of the Chinese sentence pair with the interactive semantic word vectors, and determine the final feature-representation semantic word vectors of the Chinese sentence pair; the probability determination program module 15 is configured to determine the matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced with the plurality of word segmentation tools.
Further, the matching program module is configured to:
performing pooling weighting on the semantic word vector to obtain a weighted semantic word vector;
normalizing the weighted semantic word vector and the initial word vector to determine a semantic word vector;
interacting the semantic word vectors of the Chinese sentence pairs through a multidimensional graph attention network to obtain interactive semantic word vectors;
and inputting the semantic word vector and the interactive semantic word vector of the Chinese sentence pair into a feed-forward neural network to generate a final feature expression semantic word vector.
Further, the update iterator module is configured to:
and iteratively updating the semantic representation through an reachable node of a word node corresponding to the semantic representation, iteratively updating a word lattice of the word vector through a semantic node corresponding to the word node, and outputting the semantic word vector with the semantic representation.
Further, the semantic representation determiner module is configured to:
determining the weight of each character in the initial character vectors through a feed-forward neural network, and weighting the initial character vectors based on the weights to obtain the word vectors.
Further, the system is also configured to: and performing iterative updating on the semantic representation and the word vector through a gating circulation unit.
The embodiment of the invention also provides a nonvolatile computer storage medium, wherein the computer storage medium stores computer executable instructions which can execute the Chinese text matching method in any method embodiment;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
using a plurality of word segmentation tools to encode Chinese sentence pairs at a character level to obtain initial character vectors of the Chinese sentence pairs;
inputting the initial character vectors of the Chinese sentence pair to an input layer, determining word vectors of the Chinese sentence pair, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
inputting the word vectors and semantic representations of the Chinese sentence pairs into a graph transformation layer capable of sensing semantics, respectively carrying out iterative updating on the semantic representations and word lattices of the word vectors through a multidimensional graph attention network, and outputting semantic word vectors with semantic representations;
inputting the semantic word vector into a sentence matching layer, connecting the semantic word vector of the obtained Chinese sentence pair with an interactive semantic word vector, and determining a final feature representation semantic word vector of the Chinese sentence pair;
determining a match probability based on the final feature representation semantic word vector and feature representations of the Chinese sentence pair.
The non-volatile computer-readable storage medium may be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present invention. One or more program instructions are stored in the non-volatile computer-readable storage medium and, when executed by a processor, perform the Chinese text matching method in any of the method embodiments described above.
The non-volatile computer readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory located remotely from the processor, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the chinese text matching method of any of the embodiments of the present invention.
The client of the embodiment of the present application exists in various forms, including but not limited to:
(1) mobile communication devices, which are characterized by mobile communication capabilities and are primarily targeted at providing voice and data communications. Such terminals include smart phones, multimedia phones, functional phones, and low-end phones, among others.
(2) The ultra-mobile personal computer equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as tablet computers.
(3) Portable entertainment devices such devices may display and play multimedia content. The devices comprise audio and video players, handheld game consoles, electronic books, intelligent toys and portable vehicle-mounted navigation devices.
(4) Other electronic devices with data processing capabilities.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A Chinese text matching method comprises the following steps:
using a plurality of word segmentation tools to encode Chinese sentence pairs at a character level to obtain initial character vectors of the Chinese sentence pairs;
inputting the initial character vectors of the Chinese sentence pair to an input layer, determining word vectors of the Chinese sentence pair, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
inputting the word vectors and semantic representations of the Chinese sentence pairs into a graph transformation layer capable of sensing semantics, respectively carrying out iterative updating on the semantic representations and word lattices of the word vectors through a multidimensional graph attention network, and outputting semantic word vectors with semantic representations;
inputting the semantic word vector into a sentence matching layer, connecting the semantic word vector of the obtained Chinese sentence pair with an interactive semantic word vector, and determining a final feature representation semantic word vector of the Chinese sentence pair;
determining a match probability based on the final feature representation semantic word vector of the Chinese sentence pair and the feature representations of the Chinese sentence pair by the plurality of word segmentation tools.
2. The method of claim 1, wherein inputting the semantic word vector into the sentence matching layer and concatenating the obtained semantic word vector of the Chinese sentence pair with the interactive semantic word vector comprises:
performing pooling weighting on the semantic word vector to obtain a weighted semantic word vector;
normalizing the weighted semantic word vector together with the initial word vector to determine the semantic word vector;
letting the semantic word vectors of the Chinese sentence pair interact through a multidimensional graph attention network to obtain the interactive semantic word vectors; and
inputting the semantic word vector and the interactive semantic word vector of the Chinese sentence pair into a feed-forward neural network to generate the final feature representation semantic word vector.
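A minimal sketch of these four steps, assuming one sentence at a time with tensors of shape (seq_len, dim). The function name, the per-position scoring used for the pooling weights, and the plain scaled dot-product attention standing in for the multidimensional graph attention of the interaction step are all assumptions; the learned feed-forward network of the last step is only indicated by a comment.

```python
import torch
import torch.nn.functional as F

def matching_layer_features(sem, init, sem_other):
    """Hypothetical sketch of claim 2 for one sentence of the pair.

    sem:       (seq, dim)  semantic word vectors from the graph layer
    init:      (seq, dim)  initial word vectors from the input layer
    sem_other: (seq2, dim) semantic word vectors of the other sentence
    """
    # 1) pooling weighting: score each position, softmax over the sequence
    weights = F.softmax(sem.sum(dim=-1), dim=0)            # (seq,)
    weighted = weights.unsqueeze(-1) * sem                 # weighted semantic word vectors
    # 2) normalize the weighted vectors together with the initial vectors
    fused = F.layer_norm(weighted + init, init.shape[-1:])
    # 3) interaction with the other sentence (scaled dot-product stand-in
    #    for the claimed multidimensional graph attention)
    attn = F.softmax(fused @ sem_other.T / fused.shape[-1] ** 0.5, dim=-1)
    interactive = attn @ sem_other                         # interactive semantic word vectors
    # 4) concatenate; a learned feed-forward network would map this to the
    #    final feature representation semantic word vector
    return torch.cat([fused, interactive], dim=-1)         # (seq, 2 * dim)

feats = matching_layer_features(torch.randn(5, 64), torch.randn(5, 64), torch.randn(7, 64))
print(feats.shape)  # torch.Size([5, 128])
```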
3. The method of claim 1, wherein iteratively updating the semantic representation and the word lattice of the word vector, respectively, through a multidimensional graph attention network comprises:
iteratively updating the semantic representation through the reachable nodes of the word node corresponding to the semantic representation, iteratively updating the word lattice of the word vector through the sememe nodes corresponding to the word node, and outputting the semantic word vector with the semantic representation.
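The claim does not fix what "multidimensional" means here; a common reading, sketched below purely as an assumption, is that the attention weight is a vector rather than a scalar, so each feature dimension weighs the reachable nodes independently. The function and tensor names are hypothetical.

```python
import torch
import torch.nn.functional as F

def multi_dim_attention_update(node, neighbors):
    """One hypothetical multidimensional attention update for a graph node.

    node:      (dim,)   current word or sememe node representation
    neighbors: (k, dim) representations of its reachable nodes
    """
    scores = node.unsqueeze(0) * neighbors   # (k, dim): one score per neighbor AND per dimension
    alpha = F.softmax(scores, dim=0)         # normalize over neighbors, separately per dimension
    return (alpha * neighbors).sum(dim=0)    # (dim,) updated node representation

updated = multi_dim_attention_update(torch.randn(64), torch.randn(4, 64))
```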
4. The method of claim 1, wherein inputting the initial character vector of the Chinese sentence pair into the input layer and determining the word vector of the Chinese sentence pair comprises:
determining the weight of each character in the initial character vector through a feed-forward neural network, and weighting the initial character vector based on the weights to obtain the word vector.
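A minimal sketch of this input layer, assuming (hypothetically) a two-layer feed-forward scorer and a softmax over the characters of one word; neither the scorer shape nor the softmax normalization is fixed by the claim.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharToWord(nn.Module):
    """Hypothetical input layer of claim 4: score each character with a
    feed-forward network, then form the word vector as the weighted sum."""
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, char_vecs: torch.Tensor) -> torch.Tensor:
        # char_vecs: (n_chars, dim) initial character vectors of one word
        weights = F.softmax(self.scorer(char_vecs), dim=0)   # (n_chars, 1) per-character weights
        return (weights * char_vecs).sum(dim=0)              # (dim,) weighted word vector

word_vec = CharToWord(64)(torch.randn(3, 64))  # e.g. a three-character word
```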
5. The method of claim 1, further comprising: performing the iterative updating of the semantic representation and the word vector through a gated recurrent unit.
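A short sketch of such a gated update, under the assumption that each graph-attention round produces an aggregated message which a GRU cell merges into the node state rather than overwriting it; the shapes and the three-round loop are illustrative.

```python
import torch
import torch.nn as nn

dim = 64
gru = nn.GRUCell(input_size=dim, hidden_size=dim)

state = torch.zeros(1, dim)        # current word or semantic-representation state
for _ in range(3):                 # a few iterative update rounds
    message = torch.randn(1, dim)  # stand-in for the graph-attention aggregate
    state = gru(message, state)    # the GRU gates decide how much of the message enters the state
```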
6. A Chinese text matching system, comprising:
an encoding program module, configured to encode the Chinese sentence pairs at the character level using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pairs;
a semantic representation determining program module, configured to input the initial character vectors of the Chinese sentence pair into an input layer, determine the word vectors of the Chinese sentence pair, obtain the sememes corresponding to each word vector from the HowNet external knowledge base, and determine the semantic representation of each word vector;
an update iteration program module, configured to input the word vectors and the semantic representations of the Chinese sentence pair into a semantics-aware graph transformation layer, iteratively update the semantic representations and the word lattice of the word vectors, respectively, through a multidimensional graph attention network, and output semantic word vectors with semantic representations;
a matching program module, configured to input the semantic word vectors into a sentence matching layer, concatenate the semantic word vectors of the Chinese sentence pair with the interactive semantic word vectors, and determine the final feature representation semantic word vectors of the Chinese sentence pair;
a probability determination program module, configured to determine a matching probability based on the final feature representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced by the plurality of word segmentation tools.
7. The system of claim 6, wherein the matching program module is configured to:
perform pooling weighting on the semantic word vector to obtain a weighted semantic word vector;
normalize the weighted semantic word vector together with the initial word vector to determine the semantic word vector;
let the semantic word vectors of the Chinese sentence pair interact through a multidimensional graph attention network to obtain the interactive semantic word vectors; and
input the semantic word vector and the interactive semantic word vector of the Chinese sentence pair into a feed-forward neural network to generate the final feature representation semantic word vector.
8. The system of claim 6, wherein the update iteration program module is configured to:
iteratively update the semantic representation through the reachable nodes of the word node corresponding to the semantic representation, iteratively update the word lattice of the word vector through the sememe nodes corresponding to the word node, and output the semantic word vector with the semantic representation.
9. The system of claim 6, wherein the semantic representation determining program module is configured to:
determine the weight of each character in the initial character vector through a feed-forward neural network, and weight the initial character vector based on the weights to obtain the word vector.
10. The system of claim 6, wherein the system is further configured to: perform the iterative updating of the semantic representation and the word vector through a gated recurrent unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010837271.8A CN111914067B (en) | 2020-08-19 | 2020-08-19 | Chinese text matching method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010837271.8A CN111914067B (en) | 2020-08-19 | 2020-08-19 | Chinese text matching method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111914067A CN111914067A (en) | 2020-11-10 |
CN111914067B true CN111914067B (en) | 2022-07-08 |
Family
ID=73279383
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010837271.8A Active CN111914067B (en) | 2020-08-19 | 2020-08-19 | Chinese text matching method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111914067B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112329478A (en) * | 2020-11-30 | 2021-02-05 | 北京明略昭辉科技有限公司 | Method, device and equipment for constructing causal relationship determination model |
CN112528654A (en) * | 2020-12-15 | 2021-03-19 | 作业帮教育科技(北京)有限公司 | Natural language processing method and device and electronic equipment |
CN112926322A (en) * | 2021-04-28 | 2021-06-08 | 河南大学 | Text classification method and system combining self-attention mechanism and deep learning |
CN113094473A (en) * | 2021-04-30 | 2021-07-09 | 平安国际智慧城市科技股份有限公司 | Keyword weight calculation method and device, computer equipment and storage medium |
CN113486659B (en) * | 2021-05-25 | 2024-03-15 | 平安科技(深圳)有限公司 | Text matching method, device, computer equipment and storage medium |
CN113468884B (en) * | 2021-06-10 | 2023-06-16 | 北京信息科技大学 | Chinese event trigger word extraction method and device |
CN114048286B (en) * | 2021-10-29 | 2024-06-07 | 南开大学 | Automatic fact verification method integrating graph converter and common attention network |
CN114238563A (en) * | 2021-12-08 | 2022-03-25 | 齐鲁工业大学 | Multi-angle interaction-based intelligent matching method and device for Chinese sentences to semantic meanings |
CN114429129A (en) * | 2021-12-22 | 2022-05-03 | 南京信息工程大学 | Literature mining and material property prediction method |
CN114881040B (en) * | 2022-05-12 | 2022-12-06 | 桂林电子科技大学 | Method and device for processing semantic information of paragraphs and storage medium |
CN115238708B (en) * | 2022-08-17 | 2024-02-27 | 腾讯科技(深圳)有限公司 | Text semantic recognition method, device, equipment, storage medium and program product |
CN115422362B (en) * | 2022-10-09 | 2023-10-31 | 郑州数智技术研究院有限公司 | Text matching method based on artificial intelligence |
CN116796197A (en) * | 2022-12-22 | 2023-09-22 | 华信咨询设计研究院有限公司 | Medical short text similarity matching method |
CN116226357B (en) * | 2023-05-09 | 2023-07-14 | 武汉纺织大学 | Document retrieval method under input containing error information |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108509411B (en) * | 2017-10-10 | 2021-05-11 | 腾讯科技(深圳)有限公司 | Semantic analysis method and device |
CN111046671A (en) * | 2019-12-12 | 2020-04-21 | 中国科学院自动化研究所 | Chinese named entity recognition method based on graph network and merged into dictionary |
- 2020-08-19: CN application CN202010837271.8A granted as patent CN111914067B (en), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN111914067A (en) | 2020-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111914067B (en) | Chinese text matching method and system | |
CN109840287B (en) | Cross-modal information retrieval method and device based on neural network | |
CN112528672B (en) | Aspect-level emotion analysis method and device based on graph convolution neural network | |
CN109033068B (en) | Method and device for reading and understanding based on attention mechanism and electronic equipment | |
US9830315B1 (en) | Sequence-based structured prediction for semantic parsing | |
Yao et al. | Bi-directional LSTM recurrent neural network for Chinese word segmentation | |
CN111259127B (en) | Long text answer selection method based on transfer learning sentence vector | |
CN108846077B (en) | Semantic matching method, device, medium and electronic equipment for question and answer text | |
CN106202010B (en) | Method and apparatus based on deep neural network building Law Text syntax tree | |
CN104834747B (en) | Short text classification method based on convolutional neural networks | |
CN106502985B (en) | neural network modeling method and device for generating titles | |
Sutskever et al. | Generating text with recurrent neural networks | |
CN111159416A (en) | Language task model training method and device, electronic equipment and storage medium | |
CN109376222B (en) | Question-answer matching degree calculation method, question-answer automatic matching method and device | |
CN110083705A (en) | A kind of multi-hop attention depth model, method, storage medium and terminal for target emotional semantic classification | |
CN110083693B (en) | Robot dialogue reply method and device | |
CN110008327B (en) | Legal answer generation method and device | |
CN109214006B (en) | Natural language reasoning method for image enhanced hierarchical semantic representation | |
WO2021051518A1 (en) | Text data classification method and apparatus based on neural network model, and storage medium | |
CN111191002A (en) | Neural code searching method and device based on hierarchical embedding | |
CN114818717B (en) | Chinese named entity recognition method and system integrating vocabulary and syntax information | |
CN113204611A (en) | Method for establishing reading understanding model, reading understanding method and corresponding device | |
CN111241828A (en) | Intelligent emotion recognition method and device and computer readable storage medium | |
CN115329766B (en) | Named entity identification method based on dynamic word information fusion | |
CN118093834A (en) | AIGC large model-based language processing question-answering system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
Address after: 215123 Building 14, Tengfei Innovation Park, 388 Xinping Street, Suzhou Industrial Park, Suzhou City, Jiangsu Province
Applicant after: Sipic Technology Co.,Ltd.
Address before: 215123 Building 14, Tengfei Innovation Park, 388 Xinping Street, Suzhou Industrial Park, Suzhou City, Jiangsu Province
Applicant before: AI SPEECH Co.,Ltd.
GR01 | Patent grant | ||