CN111914067B - Chinese text matching method and system

Chinese text matching method and system

Info

Publication number
CN111914067B
Authority
CN
China
Prior art keywords
semantic
word vector
word
representation
chinese
Prior art date
Legal status
Active
Application number
CN202010837271.8A
Other languages
Chinese (zh)
Other versions
CN111914067A (en)
Inventor
俞凯
吕波尔
陈露
朱苏
Current Assignee
Sipic Technology Co Ltd
Original Assignee
Sipic Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Sipic Technology Co Ltd
Priority to CN202010837271.8A
Publication of CN111914067A
Application granted
Publication of CN111914067B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a Chinese text matching method. The method comprises the following steps: performing character-level encoding on Chinese sentence pairs using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pairs; inputting the initial character vectors of the Chinese sentence pairs into an input layer and determining semantic representations of the word vectors based on the HowNet external knowledge base; iteratively updating the semantic representations and the word lattice of the word vectors through a multi-dimensional graph attention network, and outputting semantic word vectors with semantic representations; inputting the semantic word vectors into a sentence matching layer and determining the final feature-representation semantic word vectors of the Chinese sentence pairs; and determining a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pairs and the feature representations produced by the plurality of word segmentation tools. The embodiment of the invention also provides a Chinese text matching system. According to the embodiment of the invention, semantic information from the HowNet external knowledge base is integrated into the model, so that the semantic information in sentences is better utilized and the matching effect is obviously improved.

Description

Chinese text matching method and system
Technical Field
The invention relates to the field of text matching, in particular to a Chinese text matching method and a Chinese text matching system.
Background
Text matching is an important basic problem in natural language processing and underlies a large number of NLP (Natural Language Processing) tasks: information retrieval, question answering, dialogue systems, machine translation and the like can to a great extent be abstracted as text matching problems. Word lattice convolutional neural networks and bidirectional multi-perspective matching of natural language sentences are commonly used for text matching.
Word lattice convolutional neural network: a word lattice is used as input, multiple CNN (Convolutional Neural Network) convolution kernels extract features over different n-gram texts, and the features are fused through a pooling mechanism for text matching.
Bidirectional multi-perspective matching of natural language sentences: the method takes words as input, encodes each sentence with a BiLSTM (Bi-directional Long Short-Term Memory network), lets the features of the two sentences interact through several methods, and combines the multiple kinds of interaction information for classification.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the related art:
when the word lattice convolutional neural network is used, the features are derived from local information and global information cannot be fused, so the model may lose long-range information when extracting features at a certain position in a sentence. In addition, this technique uses only word representations and does not utilize semantic information.
When bidirectional multi-perspective matching of natural language sentences is used, although interaction information between sentences can be obtained, the input is a single word segmentation, which introduces the influence of inaccurate segmentation. In addition, this technique likewise does not utilize the semantic information of the words.
Disclosure of Invention
The embodiments at least solve the problems that, in the prior art, the word lattice convolutional neural network obtains only local information by convolving n-gram texts and uses word vector representations without explicit semantic information, and that bidirectional multi-perspective matching takes as input the output of a word segmentation tool, which cannot guarantee completely accurate segmentation.
In a first aspect, an embodiment of the present invention provides a method for matching a chinese text, including:
using a plurality of word segmentation tools to encode Chinese sentence pairs at a character level to obtain initial character vectors of the Chinese sentence pairs;
inputting the initial character vectors of the Chinese sentence pairs to an input layer, determining word vectors of the Chinese sentence pairs, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
inputting the word vectors and semantic representations of the Chinese sentence pairs into a semantics-aware graph transformation layer, respectively iteratively updating the semantic representations and the word lattice of the word vectors through a multi-dimensional graph attention network, and outputting semantic word vectors with semantic representations;
inputting the semantic word vectors into a sentence matching layer, concatenating the obtained semantic word vectors of the Chinese sentence pairs with interactive semantic word vectors, and determining the final feature-representation semantic word vectors of the Chinese sentence pairs;
determining a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pairs and the feature representations of the Chinese sentence pairs by the plurality of word segmentation tools.
In a second aspect, an embodiment of the present invention provides a chinese text matching system, including:
the coding program module is used for coding Chinese sentence pairs at a character level by using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pairs;
a semantic representation determining program module, configured to input the initial character vectors of the Chinese sentence pairs to an input layer, determine word vectors of the Chinese sentence pairs, obtain the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determine semantic representations of the word vectors;
an update iteration program module, configured to input the word vectors and semantic representations of the Chinese sentence pairs to a semantics-aware graph transformation layer, respectively iteratively update the semantic representations and the word lattice of the word vectors through a multi-dimensional graph attention network, and output semantic word vectors with semantic representations;
a matching program module, configured to input the semantic word vectors to a sentence matching layer, concatenate the obtained semantic word vectors of the Chinese sentence pairs with interactive semantic word vectors, and determine the final feature-representation semantic word vectors of the Chinese sentence pairs;
a probability determination program module for determining a matching probability based on the final feature representation semantic word vector of the Chinese sentence pair and the feature representations of the Chinese sentence pair by the plurality of word segmentation tools.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the Chinese text matching method of any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the Chinese text matching method according to any embodiment of the present invention.
The embodiments of the invention have the following beneficial effects: the word segmentation results of multiple word segmentation tools are combined to construct a word lattice graph as the model input; semantic information from the HowNet external knowledge base is merged into the model so that the model can better utilize the semantic information in sentences; and a semantic-knowledge-enhanced graph transformation model is used. Experiments prove that, compared with baselines such as the word lattice convolutional neural network and bidirectional multi-perspective sentence matching, the model achieves an obvious performance improvement.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method for matching Chinese text according to an embodiment of the present invention;
FIG. 2 is an overall structure diagram of a product of a method for matching Chinese texts according to an embodiment of the present invention;
FIG. 3 is a diagram of a semantic information structure of a Chinese text matching method according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating updated semantic representations and word vectors of a Chinese text matching method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of word segmentation and potential word ambiguity of a Chinese text matching method according to an embodiment of the present invention;
FIG. 6 is a graph of performance data of different models of a Chinese text matching method on LCQMC and BQ test data sets according to an embodiment of the present invention;
FIG. 7 is a graph of performance data using different segments on the LCQMC test data set for a Chinese text matching method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a Chinese text matching system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a flowchart of a Chinese text matching method according to an embodiment of the present invention, which includes the following steps:
S11: using a plurality of word segmentation tools to encode a Chinese sentence pair at the character level to obtain initial character vectors of the Chinese sentence pair;
S12: inputting the initial character vectors of the Chinese sentence pair to an input layer, determining word vectors of the Chinese sentence pair, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
S13: inputting the word vectors and semantic representations of the Chinese sentence pair into a semantics-aware graph transformation layer, respectively iteratively updating the semantic representations and the word lattice of the word vectors through a multi-dimensional graph attention network, and outputting semantic word vectors with semantic representations;
S14: inputting the semantic word vectors into a sentence matching layer, concatenating the obtained semantic word vectors of the Chinese sentence pair with interactive semantic word vectors, and determining the final feature-representation semantic word vectors of the Chinese sentence pair;
S15: determining a matching probability based on the final feature-representation semantic word vectors and the feature representations of the Chinese sentence pair.
In this embodiment, the overall model for Chinese text matching comprises the input layer and the graph transformation layer of a graph transformation network, so that the model can attend to the graph information formed by the whole sentence rather than only local information; meanwhile, the HowNet external knowledge base is introduced, so that semantic information can be merged into the model and the matching performance of the model is further improved, as shown in FIG. 2.
For step S11, a plurality of word segmentation tools, such as jieba, pkuseg and thulac, are prepared in advance. Since text matching is involved, typically one sentence of the sentence pair is a question entered by the user and the other is a question in the question text library, and the task is to determine whether the two sentences match. The prepared word segmentation tools are used to segment the sentence pair multiple times to obtain all segmentation results. For encoding, the pre-trained model BERT (Bidirectional Encoder Representations from Transformers) can be used to encode the two sentences at the character level to obtain the initial character vector representations.
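As an illustrative aid, the following is a minimal sketch of how a word lattice can be assembled from the union of several segmenters' outputs. jieba's `lcut` is a real API (assuming the package is installed); the `segmenters` list is a placeholder for any callables, e.g. wrappers around pkuseg or thulac, that return a word list.

```python
# A minimal sketch of lattice construction from multiple segmenters.
from typing import Callable, List, Set, Tuple

import jieba

def word_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Convert one segmentation into (start, end) character spans."""
    spans, pos = set(), 0
    for w in words:
        spans.add((pos, pos + len(w)))  # end index is exclusive
        pos += len(w)
    return spans

def build_lattice(sentence: str,
                  segmenters: List[Callable[[str], List[str]]]) -> Set[Tuple[int, int]]:
    """Union of the word spans proposed by all segmentation tools.

    Each span becomes one node of the word lattice; two nodes are later
    connected by an edge if their spans are adjacent in the sentence.
    """
    lattice: Set[Tuple[int, int]] = set()
    for seg in segmenters:
        lattice |= word_spans(seg(sentence))
    return lattice

if __name__ == "__main__":
    nodes = build_lattice("南京市长江大桥", [jieba.lcut])
    print(sorted(nodes))
```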
For step S12, the word vectors then need to be obtained. Each word contains several characters; the weight of each character is obtained through a feed-forward neural network, and the character vectors are weighted accordingly to obtain the word vector.
Semantic information from the HowNet external knowledge base is then introduced. As shown in FIG. 3, each word may have multiple senses, and each sense is described by some sememes. HowNet takes the sememe as the smallest unit of meaning and contains 1985 sememes in total. The vector representation of each sense is likewise obtained by pooled weighting of the vector representations of its sememes, and is written as the semantic representation below. Each word may have multiple semantic representations, because some Chinese words are polysemous.
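The word-sense-sememe hierarchy can be pictured with a small sketch. The `HOWNET_TOY` table below is a hand-made stand-in for the HowNet lookup (OpenHowNet provides the real data), and random embeddings with mean pooling stand in for the pre-trained sememe embeddings and attentive pooling.

```python
# Illustrative only: a toy stand-in for the HowNet lookup.
import torch

# word -> senses -> sememes (the "apple" example from FIG. 3)
HOWNET_TOY = {
    "苹果": {
        "apple_brand": ["computer", "pattern_value", "able", "bring", "specific_brand"],
        "apple_fruit": ["fruit"],
    }
}

EMB_DIM = 200  # the method uses 200-dim pre-trained sememe embeddings
sememe_vocab = sorted({o for senses in HOWNET_TOY.values()
                         for sememes in senses.values() for o in sememes})
sememe_emb = torch.nn.Embedding(len(sememe_vocab), EMB_DIM)

def sense_representations(word: str) -> dict:
    """One vector per sense, pooled from its sememe embeddings."""
    reps = {}
    for sense, sememes in HOWNET_TOY[word].items():
        idx = torch.tensor([sememe_vocab.index(o) for o in sememes])
        reps[sense] = sememe_emb(idx).mean(dim=0)  # simple stand-in for Att-Pooling
    return reps

print({k: v.shape for k, v in sense_representations("苹果").items()})
```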
For step S13, in the semantics-aware graph transformation layer module, the semantic representations and the word vector representations are updated iteratively in sequence. The update uses MD-GAT (multi-dimensional graph attention network), which is used repeatedly in the method and is applied as a function below. Both the word lattice and the sememe sequence are treated as graphs.
$$\mathbf{h}_j^{l} = \mathrm{MD\text{-}GAT}\big(\mathbf{h}_j^{l-1}, \{\mathbf{h}_k^{l-1} \mid x_k \in N(x_j)\}\big)$$
To obtain the representation of node $x_j$ after the $l$-th update, the representations of the nodes connected to $x_j$ and the previous-round representation of $x_j$ itself are used; the update is a weighted aggregation of the connected nodes and their representations.
As an embodiment, the method further comprises: iteratively updating the semantic representations through the reachable nodes of the word nodes corresponding to the semantic representations, iteratively updating the word lattice of the word vectors through the sense nodes corresponding to the word nodes, and outputting the semantic word vectors with semantic representations.
In this embodiment, as shown in FIG. 4, when the semantic representations are updated, the representations of the reachable nodes of the word node corresponding to each semantic representation are used as information, and a gated recurrent unit (GRU) is used as the control unit for history information and new information. When the word vectors are updated, the representations of the sense nodes corresponding to each word node are used as information, and a GRU is likewise used as the control unit for history information and updated new information. The GRU can thus retain a portion of the useful history information. The module can be iterated several times in the model, and experiments prove that two iterations work best.
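A minimal sketch of one such iteration follows, under simplifying assumptions: `simple_md_gat` is a plain dot-product stand-in for the multi-dimensional attention described later (it attends over all nodes rather than the true lattice neighbourhoods, and the forward/backward calls share parameters), and the GRU cells play the role of the gated control units.

```python
import torch
import torch.nn as nn

def simple_md_gat(query: torch.Tensor, nodes: torch.Tensor) -> torch.Tensor:
    """Stand-in for MD-GAT: scaled dot-product attention over all nodes."""
    att = torch.softmax(query @ nodes.t() / nodes.shape[-1] ** 0.5, dim=-1)
    return att @ nodes

class SaGTLayer(nn.Module):
    """One iteration: senses updated from word context, then words from senses."""

    def __init__(self, dim: int):
        super().__init__()
        self.sense_gru = nn.GRUCell(2 * dim, dim)  # input: [forward; backward] context
        self.word_gru = nn.GRUCell(dim, dim)

    def forward(self, word_h, sense_g):
        # 1) each sense gathers word information (forward and backward directions)
        a_fw = simple_md_gat(sense_g, word_h)
        a_bw = simple_md_gat(sense_g, word_h)
        sense_g = self.sense_gru(torch.cat([a_fw, a_bw], dim=-1), sense_g)
        # 2) each word gathers semantic information from its (updated) senses
        q = simple_md_gat(word_h, sense_g)
        word_h = self.word_gru(q, word_h)
        return word_h, sense_g

layer = SaGTLayer(dim=128)
h, g = torch.randn(6, 128), torch.randn(9, 128)  # 6 word nodes, 9 sense nodes
for _ in range(2):                               # two iterations work best per the text
    h, g = layer(h, g)
```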
For step S14, inputting the semantic word vector determined in step S13 to the sentence matching layer includes:
performing pooling weighting on the semantic word vector to obtain a weighted semantic word vector;
normalizing the weighted semantic word vector and the initial word vector to determine a semantic word vector;
interacting the semantic word vectors of the Chinese sentence pairs through a multidimensional graph attention network to obtain interactive semantic word vectors;
and inputting the semantic word vector and the interactive semantic word vector of the Chinese sentence pair into a feed-forward neural network to generate a final feature expression semantic word vector.
In this embodiment, a word-level vector representation is first obtained by pooled weighting of the updated word vectors, where the pooling input is all the characters contained in the word. The word vectors and the initial character vector representations obtained by BERT encoding are then layer-normalized together to obtain new character vectors. In this module, the multi-dimensional graph attention operation is not only applied within each sentence but is also used for interaction between the sentences. The sentence information and the interaction information are concatenated, and the final character vector representation is obtained through a two-layer feed-forward neural network; sentence vectors are then obtained through pooled weighting.
For step S15, the vectors of the two sentences, their element-wise product and the absolute value of their difference are concatenated, the BERT-encoded feature representation is further concatenated, and a two-layer feed-forward neural network with an activation function is used to obtain the classification probability.
$$p = \mathrm{FFN}\big([\mathbf{c}_{CLS}; \mathbf{r}_a; \mathbf{r}_b; \mathbf{r}_a \odot \mathbf{r}_b; |\mathbf{r}_a - \mathbf{r}_b|]\big)$$
where $\mathbf{r}_a$ and $\mathbf{r}_b$ are the vectors of the two sentences, respectively, and $\mathbf{c}_{CLS}$ is the feature representation after BERT encoding.
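A sketch of this classification step follows; `RelationClassifier` is an assumed module name, and the dimensions (768 for the BERT feature, 128 for the sentence vectors) follow the hyper-parameters reported later.

```python
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    """p = FFN([c_CLS; r_a; r_b; r_a * r_b; |r_a - r_b|]), sigmoid output."""

    def __init__(self, cls_dim: int, sent_dim: int, hidden: int = 256):
        super().__init__()
        self.ffn = nn.Sequential(                      # two hidden layers
            nn.Linear(cls_dim + 4 * sent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, c_cls, r_a, r_b):
        feats = torch.cat([c_cls, r_a, r_b, r_a * r_b, (r_a - r_b).abs()], dim=-1)
        return self.ffn(feats).squeeze(-1)             # matching probability

clf = RelationClassifier(cls_dim=768, sent_dim=128)
p = clf(torch.randn(2, 768), torch.randn(2, 128), torch.randn(2, 128))
```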
It can be seen from this embodiment that the segmentation results of multiple segmentation tools are combined to construct a word lattice diagram as the model input. Semantic information in an external knowledge base of HowNet is fused into the model, so that the model can better utilize the semantic information in sentences. By using a graph transformation model with enhanced semantic knowledge, experiments prove that the model has remarkable performance improvement compared with baselines such as a word lattice convolution neural network, sentence bidirectional multi-angle matching and the like.
The method is described in detail below. Pre-trained language models such as BERT have shown powerful performance on various natural language processing tasks, including text matching. For Chinese text matching, BERT takes a pair of sentences as input, with each Chinese character being a separate input token, and thus ignores word information. To address this problem, some Chinese variants of the original BERT have been proposed, such as BERT-wwm and ERNIE (two prior-art models). However, a pre-training process of BERT that considers words requires much time and many resources. Therefore, the model of the method adopts a pre-trained language model as initialization and fine-tunes it with word information.
HowNet is an external knowledge base that manually annotates each Chinese word sense with one or more related sememes. HowNet treats the sememe as an atomic semantic unit, and unlike WordNet, it emphasizes that the various parts and attributes of a concept can be well represented by sememes. HowNet has found wide application in many natural language processing tasks, such as word similarity computation, sentiment analysis, word representation learning and language modeling. However, its effectiveness on short text matching tasks, especially in combination with pre-trained language models, has been little studied.
GAT (Graph Attention Network) is a special type of network that handles graph-structured data through an attention mechanism. Given a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ and $\mathcal{E}$ are the set of nodes $x_i$ and the set of edges respectively, $N(x_i)$ is the set comprising node $x_i$ itself and the nodes directly connected to $x_i$.
Each node $x_i$ in the graph has an initial feature vector $\mathbf{h}_i^{0} \in \mathbb{R}^d$, where $d$ is the feature dimension. The representation of each node is iteratively updated by graph attention operations. In the $l$-th step, each node $x_i$ aggregates context information from its neighbours and itself. The updated representation $\mathbf{h}_i^{l}$ is calculated as the weighted average of the connected nodes,
$$\mathbf{h}_i^{l} = \sigma\Big(\sum_{x_j \in N(x_i)} \alpha_{ij}^{l}\, \mathbf{W}^{l} \mathbf{h}_j^{l-1}\Big)$$
where $\mathbf{W}^{l}$ is a learnable parameter and $\sigma(\cdot)$ is a nonlinear activation function, such as ReLU. The attention coefficient $\alpha_{ij}^{l}$ is the normalized similarity between the embeddings of the two nodes $x_i$ and $x_j$ in a unified space,
$$\alpha_{ij}^{l} = \mathrm{softmax}_j\Big(\big(\mathbf{W}_q^{l} \mathbf{h}_i^{l-1}\big)^{\top}\big(\mathbf{W}_k^{l} \mathbf{h}_j^{l-1}\big)\Big)$$
where $\mathbf{W}_q^{l}$ and $\mathbf{W}_k^{l}$ are learnable projection parameters.
Note that in the above formula the attention coefficient $\alpha_{ij}^{l}$ is a scalar, which means that all feature dimensions of $\mathbf{h}_j^{l-1}$ are treated equally. This may limit the ability to model complex dependencies. Instead of such general attention, multi-dimensional attention has proven useful in dealing with context variation and polysemy problems in many NLP tasks. For each embedding $\mathbf{h}_j^{l-1}$, it does not compute a single scalar score $s_{ij}^{l}$; instead, a feature-wise score vector is first computed and then normalized with a feature-wise multi-dimensional softmax (MD-softmax),
$$\hat{\boldsymbol{\alpha}}_{ij}^{l} = \mathrm{MD\text{-}softmax}_j\big(s_{ij}^{l}\mathbf{1} + \mathbf{m}_j^{l}\big)$$
where $s_{ij}^{l}$ is the scalar computed by the similarity function in the equation above and $\mathbf{m}_j^{l}$ is a vector; the addition in the above equation means that the scalar is added to each element of the vector. $s_{ij}^{l}$ models the pairwise dependency of the two nodes, while $\mathbf{m}_j^{l}$ estimates the contribution of each feature dimension of $\mathbf{h}_j^{l-1}$,
$$\mathbf{m}_j^{l} = \mathbf{W}_2^{l}\,\sigma\big(\mathbf{W}_1^{l} \mathbf{h}_j^{l-1} + \mathbf{b}_1^{l}\big) + \mathbf{b}_2^{l}$$
where $\mathbf{W}_1^{l}$, $\mathbf{W}_2^{l}$, $\mathbf{b}_1^{l}$ and $\mathbf{b}_2^{l}$ are learnable parameters. With the score vector $\hat{\boldsymbol{\alpha}}_{ij}^{l}$, the update is modified correspondingly as follows:
$$\mathbf{h}_i^{l} = \sigma\Big(\sum_{x_j \in N(x_i)} \hat{\boldsymbol{\alpha}}_{ij}^{l} \odot \big(\mathbf{W}^{l} \mathbf{h}_j^{l-1}\big)\Big)$$
where $\odot$ indicates the element-wise product of two vectors. For simplicity, the update process using the multi-dimensional attention mechanism is denoted MD-GAT(·), as follows,
$$\mathbf{h}_i^{l} = \mathrm{MD\text{-}GAT}\big(\mathbf{h}_i^{l-1}, \{\mathbf{h}_j^{l-1} \mid x_j \in N(x_i)\}\big)$$
After $L$ steps of updating, each node finally has a context-aware representation $\mathbf{h}_i^{L}$. To obtain a stable training process, a residual connection followed by layer normalization is also used between every two graph attention layers.
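The following is a minimal PyTorch sketch of one MD-GAT layer under the equations above, assuming a boolean adjacency mask that already contains self-loops; the residual connection and layer normalization are omitted for brevity.

```python
import torch
import torch.nn as nn

class MDGAT(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w_q = nn.Linear(dim, dim, bias=False)
        self.w_k = nn.Linear(dim, dim, bias=False)
        self.w_v = nn.Linear(dim, dim, bias=False)
        # two-layer network producing the feature-wise contribution m_j
        self.feat = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (n, d) node features; adj: (n, n) bool, adj[i, j] = x_j in N(x_i)
        s = self.w_q(h) @ self.w_k(h).t()                     # scalar scores s_ij
        scores = s.unsqueeze(-1) + self.feat(h).unsqueeze(0)  # s_ij * 1 + m_j, (n, n, d)
        scores = scores.masked_fill(~adj.unsqueeze(-1), float("-inf"))
        alpha = torch.softmax(scores, dim=1)                  # MD-softmax over neighbours
        return torch.relu((alpha * self.w_v(h).unsqueeze(0)).sum(dim=1))

n, d = 4, 16
adj = torch.eye(n, dtype=torch.bool) | (torch.rand(n, n) > 0.5)  # self-loops kept
out = MDGAT(d)(torch.randn(n, d), adj)                           # (n, d)
```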
In the specific implementation, the Chinese short text matching task is defined as follows: given two Chinese sentences $C^a = \{c_1^a, c_2^a, \ldots, c_{T_a}^a\}$ and $C^b = \{c_1^b, c_2^b, \ldots, c_{T_b}^b\}$, the text matching model $f(C^a, C^b)$ predicts whether the semantics of $C^a$ and $C^b$ are equivalent. Here, $c_t^a$ and $c_{t'}^b$ represent the $t$-th and $t'$-th Chinese characters in the two sentences, and $T_a$ and $T_b$ denote the numbers of characters in the sentences.
In the method, a linguistic-knowledge-enhanced matching model is provided. Rather than segmenting each sentence into one word sequence, all possible segmentation paths are kept to form a word lattice graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges. Each node $x_i \in \mathcal{V}$ corresponds to a word $w_i$, which is a subsequence of the sentence starting at its $t_1$-th character and ending at its $t_2$-th character. As shown in FIG. 5, all the senses of $w_i$ can be obtained by retrieving HowNet.
For two nodes $x_i$ and $x_j$, there is an edge between them if $x_i$ is adjacent to $x_j$ in the original sentence. $N_{fw}(x_i)$ is composed of $x_i$ itself and all of its forward reachable nodes, and $N_{bw}(x_i)$ is composed of $x_i$ itself and all of its backward reachable nodes.
Two graphs $\mathcal{G}^{a}$ and $\mathcal{G}^{b}$ are used for the two sentences $C^a$ and $C^b$, and the graph matching model predicts the similarity between them to judge whether the original sentences $C^a$ and $C^b$ have the same meaning. As shown in FIG. 2, the model consists of four parts: an input module, a semantic-aware graph transformer (SaGT), a sentence matching layer and a relation classifier. The input module outputs the initial representation of each word $w_i$ and the initial semantic representation of each sense. The semantic-aware graph transformer iteratively updates the word representations and the sense representations and fuses useful information from one into the other. The sentence matching layer first incorporates the word representations into the characters and then matches the two character sequences with a bilateral multi-perspective matching mechanism. The relation classifier takes the sentence vectors as input and predicts the relation between the two sentences.
Context embedding in the input module: for each node $x_i$ in the graph, the embedding of word $w_i$ is an attentive pooling of contextual character representations. Specifically, the original character-level sentences are first concatenated to form a new sequence $\{[CLS], C^a, [SEP], C^b, [SEP]\}$, which is then provided to the BERT model to obtain the contextual representation $\mathbf{c}_k$ of each character. Suppose word $w_i$ consists of the consecutive characters $c_{t_1}, \ldots, c_{t_2}$. For each character $c_k$ ($t_1 \le k \le t_2$), a feed-forward network (FFN) with two layers computes a feature-wise score vector, which is then normalized with a feature-wise multi-dimensional softmax (MD-softmax),
$$\mathbf{u}_k = \mathrm{MD\text{-}softmax}_k\big(\mathrm{FFN}(\mathbf{c}_k)\big)$$
The corresponding character embedding $\mathbf{c}_k$ is weighted by the normalized score $\mathbf{u}_k$ to obtain the contextual word embedding,
$$\mathbf{v}_i = \sum_{k=t_1}^{t_2} \mathbf{u}_k \odot \mathbf{c}_k$$
For simplicity, the above formula is abbreviated with Att-Pooling(·), i.e.:
$$\mathbf{v}_i = \mathrm{Att\text{-}Pooling}\big(\{\mathbf{c}_k \mid t_1 \le k \le t_2\}\big)$$
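A minimal sketch of Att-Pooling as just defined: the two-layer FFN scores each character embedding feature-wise, and a softmax over the word's characters plays the role of MD-softmax.

```python
import torch
import torch.nn as nn

class AttPooling(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # two-layer FFN producing a feature-wise score for each character
        self.ffn = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, chars: torch.Tensor) -> torch.Tensor:
        # chars: (k, d) contextual embeddings of characters c_{t1}..c_{t2}
        u = torch.softmax(self.ffn(chars), dim=0)  # normalize over the k characters
        return (u * chars).sum(dim=0)              # (d,) word embedding v_i

pool = AttPooling(dim=768)        # BERT-sized character vectors
v_i = pool(torch.randn(3, 768))   # e.g. a three-character word
```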
Sememe embedding: the method uses HowNet as the external knowledge base to represent the semantic information of words. In view of polysemy, HowNet distinguishes different senses of each ambiguous word. An example is given in FIG. 3: the word "苹果 (apple)" has two senses, "Apple brand" and "apple (fruit)", and the sense "Apple brand" has five sememes, including "computer", "pattern value", "able", "bring" and "specific brand".
For each word $w_i$, its $k$-th sense is denoted $s_{i,k}$, and its corresponding sememes are represented as a set $O_{i,k} = \{o_{i,k}^{1}, o_{i,k}^{2}, \ldots\}$. To obtain the embedding $\mathbf{s}_{i,k}$ of each sense $s_{i,k}$, the representation $\hat{\mathbf{o}}_{i,k}^{m}$ of each sememe $o_{i,k}^{m}$ is first obtained with the multi-dimensional attention function:
$$\hat{\mathbf{o}}_{i,k}^{m} = \mathrm{MD\text{-}GAT}\big(\mathbf{o}_{i,k}^{m}, \{\mathbf{o}_{i,k}^{m'} \mid o_{i,k}^{m'} \in O_{i,k}\}\big)$$
where $\mathbf{o}_{i,k}^{m}$ is the embedding vector of sememe $o_{i,k}^{m}$ generated by the SAT model. For each sense $s_{i,k}$, its embedding $\mathbf{s}_{i,k}$ is obtained by an attentive aggregation of all its sememe representations,
$$\mathbf{s}_{i,k} = \mathrm{Att\text{-}Pooling}\big(\{\hat{\mathbf{o}}_{i,k}^{m} \mid o_{i,k}^{m} \in O_{i,k}\}\big)$$
Semantic-aware graph transformer: for each node $x_i$ in the graph, the word embedding $\mathbf{v}_i$ contains only context information without explicit linguistic knowledge, while the sense embedding $\mathbf{s}_{i,k}$ contains only linguistic knowledge without contextual information. To obtain useful information from both, the semantic-aware graph transformer (SaGT) is proposed. It first takes $\mathbf{v}_i$ and $\mathbf{s}_{i,k}$ as the initial word representation $\mathbf{h}_i^{0}$ of word $w_i$ and the initial sense representation $\mathbf{g}_{i,k}^{0}$ of the word sense, respectively, and then iteratively updates the word representations and the sense representations in two sub-steps.
Updating the sense representations: in the $l$-th iteration, the first sub-step updates the sense representation from $\mathbf{g}_{i,k}^{l-1}$ to $\mathbf{g}_{i,k}^{l}$. For a word with multiple senses, which sense should be used is usually determined by the context in the sentence. Therefore, when updating its representation, each sense first aggregates useful information from the words in the forward and backward directions of $x_i$,
$$\mathbf{a}_{i,k}^{fw} = \mathrm{MD\text{-}GAT}\big(\mathbf{g}_{i,k}^{l-1}, \{\mathbf{h}_j^{l-1} \mid x_j \in N_{fw}(x_i)\}\big)$$
$$\mathbf{a}_{i,k}^{bw} = \mathrm{MD\text{-}GAT}\big(\mathbf{g}_{i,k}^{l-1}, \{\mathbf{h}_j^{l-1} \mid x_j \in N_{bw}(x_i)\}\big)$$
where the two multi-dimensional attention functions MD-GAT(·) have different parameters. Based on $\mathbf{a}_{i,k} = [\mathbf{a}_{i,k}^{fw}; \mathbf{a}_{i,k}^{bw}]$, each sense updates its representation with a gated recurrent unit (GRU),
$$\mathbf{g}_{i,k}^{l} = \mathrm{GRU}\big(\mathbf{g}_{i,k}^{l-1}, \mathbf{a}_{i,k}\big)$$
The detailed update function of the GRU is as follows:
$$\mathbf{z} = \sigma\big(\mathbf{W}_z [\mathbf{g}_{i,k}^{l-1}; \mathbf{a}_{i,k}] + \mathbf{b}_z\big)$$
$$\mathbf{r} = \sigma\big(\mathbf{W}_r [\mathbf{g}_{i,k}^{l-1}; \mathbf{a}_{i,k}] + \mathbf{b}_r\big)$$
$$\tilde{\mathbf{g}} = \tanh\big(\mathbf{W}_g [\mathbf{r} \odot \mathbf{g}_{i,k}^{l-1}; \mathbf{a}_{i,k}] + \mathbf{b}_g\big)$$
$$\mathbf{g}_{i,k}^{l} = (1-\mathbf{z}) \odot \mathbf{g}_{i,k}^{l-1} + \mathbf{z} \odot \tilde{\mathbf{g}}$$
where $\mathbf{W}_z$, $\mathbf{W}_r$, $\mathbf{W}_g$, $\mathbf{b}_z$, $\mathbf{b}_r$ and $\mathbf{b}_g$ are learnable parameters.
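Written out as code, the gated update can be sketched as follows; `FusionGRU` is an assumed name, and the concatenation `[g; a]` reflects the learnable parameters W and b listed above.

```python
import torch
import torch.nn as nn

class FusionGRU(nn.Module):
    """Explicit GRU gate fusing aggregated context a into the previous state g."""

    def __init__(self, dim: int):
        super().__init__()
        self.w_z = nn.Linear(2 * dim, dim)  # update gate parameters W_z, b_z
        self.w_r = nn.Linear(2 * dim, dim)  # reset gate parameters W_r, b_r
        self.w_g = nn.Linear(2 * dim, dim)  # candidate parameters W_g, b_g

    def forward(self, g_prev: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        z = torch.sigmoid(self.w_z(torch.cat([g_prev, a], dim=-1)))
        r = torch.sigmoid(self.w_r(torch.cat([g_prev, a], dim=-1)))
        g_tilde = torch.tanh(self.w_g(torch.cat([r * g_prev, a], dim=-1)))
        return (1 - z) * g_prev + z * g_tilde

fuse = FusionGRU(dim=128)
g_new = fuse(torch.randn(5, 128), torch.randn(5, 128))  # five senses updated at once
```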
It is worth noting that $\mathbf{a}_{i,k}$ is not directly used as the new sense representation $\mathbf{g}_{i,k}^{l}$. The reason is that $\mathbf{a}_{i,k}$ contains only context information, and a gate such as the GRU is needed to control the fusion of the context information and the semantic information.
Updating the word representations: the second sub-step updates the word representation from $\mathbf{h}_i^{l-1}$ to $\mathbf{h}_i^{l}$ based on the updated sense representations $\mathbf{g}_{i,k}^{l}$. The word $w_i$ first obtains semantic information from its sense representations,
$$\mathbf{q}_i = \mathrm{MD\text{-}GAT}\big(\mathbf{h}_i^{l-1}, \{\mathbf{g}_{i,k}^{l} \mid s_{i,k} \in S_i\}\big)$$
and then updates its representation with a GRU:
$$\mathbf{h}_i^{l} = \mathrm{GRU}\big(\mathbf{h}_i^{l-1}, \mathbf{q}_i\big)$$
This GRU function and the GRU function for updating the sense representations have different parameters.
After multiple iterations, the final word representation $\mathbf{h}_i^{L}$ contains not only contextual word information but also semantic knowledge. For the two sentences, $\{\mathbf{h}_i^{a,L}\}$ and $\{\mathbf{h}_i^{b,L}\}$ are used to denote the final word representations.
Sentence matching layer: after obtaining the semantic-knowledge-enhanced word representations $\{\mathbf{h}_i^{a,L}\}$ and $\{\mathbf{h}_i^{b,L}\}$ of each sentence, the word information is incorporated into the characters. Without loss of generality, the process is described with the characters in sentence $C^a$. For each character $c_t^{a}$, useful word information $\mathbf{m}_t^{a}$ can be obtained by combining the representations of all the words containing the character $c_t^{a}$,
$$\mathbf{m}_t^{a} = \mathrm{Att\text{-}Pooling}\big(\{\mathbf{h}_i^{a,L} \mid c_t^{a} \in w_i^{a}\}\big)$$
Thus, the semantic-knowledge-enhanced character representation $\mathbf{y}_t^{a}$ can be obtained by:
$$\mathbf{y}_t^{a} = \mathrm{LayerNorm}\big(\mathbf{c}_t^{a} + \mathbf{m}_t^{a}\big)$$
where LayerNorm(·) denotes layer normalization and $\mathbf{c}_t^{a}$ is the contextual character representation obtained using BERT as described above.
For each character $c_t^{a}$, multi-dimensional attention is used to aggregate information from sentences $C^a$ and $C^b$ respectively,
$$\mathbf{y}_t^{aa} = \mathrm{MD\text{-}GAT}\big(\mathbf{y}_t^{a}, \{\mathbf{y}_{t'}^{a} \mid c_{t'}^{a} \in C^a\}\big)$$
$$\mathbf{y}_t^{ab} = \mathrm{MD\text{-}GAT}\big(\mathbf{y}_t^{a}, \{\mathbf{y}_{t'}^{b} \mid c_{t'}^{b} \in C^b\}\big)$$
The two multi-dimensional attention functions MD-GAT(·) above share the same parameters. Through this sharing mechanism, the model has the very good property that, when two sentences match exactly, $\mathbf{y}_t^{aa} = \mathbf{y}_t^{ab}$. The two aggregated representations are therefore compared with a multi-perspective cosine distance,
$$d_k = \cos\big(\mathbf{w}_k \odot \mathbf{y}_t^{aa},\; \mathbf{w}_k \odot \mathbf{y}_t^{ab}\big)$$
where $k \in \{1, 2, \ldots, P\}$ ($P$ is the number of perspectives) and $\mathbf{w}_k$ is a parameter vector that assigns different weights to different dimensions of the messages. With the $P$ distances $d_1, d_2, \ldots, d_P$, the final character representation is obtained,
$$\bar{\mathbf{y}}_t^{a} = \mathrm{FFN}\big([\mathbf{y}_t^{a}; d_1; d_2; \ldots; d_P]\big)$$
where $[\cdot\,;\cdot]$ denotes concatenation and FFN(·) is a feed-forward network with two layers.
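A sketch of the multi-perspective cosine distance follows; the number of perspectives P is a free hyper-parameter (8 here is an arbitrary choice for illustration).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiPerspectiveCosine(nn.Module):
    def __init__(self, dim: int, num_perspectives: int = 8):
        super().__init__()
        self.w = nn.Parameter(torch.randn(num_perspectives, dim))  # one w_k per view

    def forward(self, y_aa: torch.Tensor, y_ab: torch.Tensor) -> torch.Tensor:
        # y_aa, y_ab: (T, d) per-character aggregates -> distances: (T, P)
        a = self.w.unsqueeze(0) * y_aa.unsqueeze(1)  # (T, P, d), w_k re-weights dims
        b = self.w.unsqueeze(0) * y_ab.unsqueeze(1)
        return F.cosine_similarity(a, b, dim=-1)

mpc = MultiPerspectiveCosine(dim=128)
d_k = mpc(torch.randn(5, 128), torch.randn(5, 128))  # later concatenated with y_t
```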
Similarly, the final character representation $\bar{\mathbf{y}}_{t'}^{b}$ of each character $c_{t'}^{b}$ in sentence $C^b$ can be obtained. Note that the final character representations contain three types of information: contextual information, knowledge of words and word senses, and character-level similarity. For each sentence $C^a$ or $C^b$, a sentence representation vector $\mathbf{r}_a$ or $\mathbf{r}_b$ is obtained by attention weighting over all final character representations of the sentence.
Relation classifier: using the two sentence vectors $\mathbf{r}_a$, $\mathbf{r}_b$ and the vector $\mathbf{c}_{CLS}$ obtained with BERT, the model predicts the similarity of the two sentences,
$$p = \mathrm{FFN}\big([\mathbf{c}_{CLS}; \mathbf{r}_a; \mathbf{r}_b; \mathbf{r}_a \odot \mathbf{r}_b; |\mathbf{r}_a - \mathbf{r}_b|]\big)$$
where FFN(·) is a feed-forward network with two hidden layers and a sigmoid activation function after the output layer. With $N$ training samples $\{(C_i^a, C_i^b, y_i)\}_{i=1}^{N}$, the training goal is to minimize the binary cross-entropy loss,
$$\mathcal{L} = -\sum_{i=1}^{N}\big(y_i \log p_i + (1-y_i)\log(1-p_i)\big)$$
where $y_i \in \{0, 1\}$ is the label of the $i$-th training sample and $p_i \in [0, 1]$ is the prediction of the model with the sentence pair $(C_i^a, C_i^b)$ as input.
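As a sketch, the objective corresponds directly to the standard binary cross-entropy, e.g.:

```python
import torch
import torch.nn.functional as F

p = torch.tensor([0.9, 0.2, 0.7])   # predictions p_i for N = 3 sentence pairs
y = torch.tensor([1.0, 0.0, 1.0])   # gold labels y_i
loss = F.binary_cross_entropy(p, y, reduction="sum")  # the summed objective above
print(loss.item())
```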
The above method was tested. Datasets: semantic text similarity experiments were performed on two Chinese datasets, LCQMC and BQ.
LCQMC is a large-scale open-domain question matching corpus. It consists of 260068 Chinese sentence pairs, including 238766 training samples, 8802 validation samples and 12500 test samples. Each pair is associated with a binary label indicating whether the two sentences have the same meaning or intent. There are roughly 30% more positive samples than negative samples.
BQ is a domain-specific large-scale corpus for bank question matching. It consists of 120,000 Chinese sentence pairs, including 100,000 training samples, 10,000 validation samples and 10,000 test samples. Each pair is also associated with a binary label indicating whether the two sentences have the same meaning. The numbers of positive and negative samples are the same.
Evaluation metrics: the accuracy (ACC) and F1 score on each dataset were used as evaluation metrics. Accuracy is the percentage of correctly classified examples; the F1 score of the matched class is the harmonic mean of precision and recall.
Hyper-parameters: the input word graph is composed from three word segmentation tools (jieba, pkuseg and thulac). 200-dimensional pre-trained sememe embeddings provided by OpenHowNet are used. The number of graph update steps/layers L is 2 on both datasets. The dimension of the word and word sense representations is 128. The dropout rate for all hidden layers is 0.2. Each model was trained with RMSProp with an initial learning rate of 0.0005, the learning rate of the BERT layers being multiplied by an additional factor of 0.1. As for the batch size, 32 was used for LCQMC and 64 for BQ.
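For reference, the stated hyper-parameters can be collected in one place (the field names below are our own labels, not identifiers from the original implementation):

```python
CONFIG = {
    "segmenters": ["jieba", "pkuseg", "thulac"],
    "sememe_embedding_dim": 200,           # pre-trained, provided by OpenHowNet
    "graph_update_steps_L": 2,             # on both datasets
    "word_and_sense_dim": 128,
    "dropout": 0.2,                        # all hidden layers
    "optimizer": "RMSProp",
    "initial_learning_rate": 5e-4,
    "bert_lr_factor": 0.1,                 # BERT layers train at 0.1 x the base rate
    "batch_size": {"LCQMC": 32, "BQ": 64},
}
```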
The model of the method was compared with three types of baselines: representation-based models, interaction-based models, and BERT-based models. The results are summarized in FIG. 6. To ensure the reliability of the experimental results, each experiment was run five times and the average score is reported. The baselines were re-run with their own parameters.
The representation-based models include Text-CNN, BiLSTM, Lattice-CNN and the model LET-1. Text-CNN is a Siamese architecture in which convolutional neural networks (CNNs) encode each sentence. BiLSTM is another Siamese architecture, with a bidirectional long short-term memory network (Bi-LSTM) encoding each sentence. Lattice-CNN was proposed to deal with the underlying Chinese word segmentation problem: it takes a word lattice as input and merges the feature vectors generated by multiple CNN kernels over the different n-gram contexts of each node in the lattice graph with a pooling mechanism. LET-1 is a variant of the proposed model in which BERT is replaced with a conventional character-level transformer encoder and the interaction between sentences is removed from the sentence matching layer.
From FIG. 6 it can be found that LET-1 outperforms all baseline models on both datasets. More specifically, LET-1 performs better than Lattice-CNN: although both utilize word lattices, Lattice-CNN attends only to local information, while the proposed model can utilize global information. In addition, the semantic information incorporated in the model greatly improves its performance. Furthermore, both lattice-based models (Lattice-CNN and LET-1) perform better than the other two baselines (Text-CNN and BiLSTM), indicating the importance of utilizing word lattices on this task.
The interaction-based models include two baselines, BiMPM and ESIM, and the model LET-2. BiMPM is a bilateral multi-perspective matching model: it encodes each sentence with a BiLSTM and matches the two sentences from multiple perspectives; BiMPM performs well on some natural language inference (NLI) tasks. ESIM contains two BiLSTMs: the first encodes the sentences, and the other fuses the word alignment information between the two sentences; ESIM achieves state-of-the-art results on various matching tasks. LET-2 is also a variant of LET in which BERT is replaced by a conventional character-level transformer encoder, similar to LET-1, but here the interaction mechanism is introduced.
The results of the above three models are shown in the second part of FIG. 6. LET-2 performs better than the other models. In particular, LET-2 performs better than BiMPM although both use a multi-perspective matching mechanism, which suggests that the graph neural network over word lattices is powerful. Furthermore, LET-2 performs better than LET-1, indicating the usefulness of character-level comparison between the two sentences.
The BERT-based models include four baselines: BERT, BERT-wwm, BERT-wwm-ext and ERNIE, which are compared with the proposed model LET-3. BERT is the official Chinese BERT model released by Google. BERT-wwm is a Chinese BERT that uses a whole-word masking mechanism during pre-training. BERT-wwm-ext is a variant of BERT-wwm using more training data and more training steps. ERNIE is designed to learn language representations enhanced by knowledge masking strategies, including entity-level masking and phrase-level masking. LET-3 is the proposed LET model with BERT as the character-level encoder.
The results are shown in the third part of FIG. 6. It can be found that all three variants of BERT (BERT-wwm, BERT-wwm-ext, ERNIE) exceed the original BERT, indicating that using word-level information during pre-training is very important for the Chinese matching task. The proposed LET-3 model outperforms all of these BERT-based models, which shows that incorporating semantic information when fine-tuning with LET is an effective way to improve the Chinese semantic matching of BERT.
To verify the effectiveness of incorporating lexical semantic information from HowNet, experiments were performed on the LCQMC test set with LET-3. In the comparison model without HowNet knowledge, the sense updating module in the SaGT is removed and the word representations are updated through multi-dimensional self-attention. FIG. 7 lists the results of the two models for the three word segmentation methods and their combination (lattice). For every type of segmentation, integrating semantic information outperforms simple word representations: LET can acquire semantic information from HowNet and improve the performance of the model. More specifically, when semantic information is used, the accuracy and F1 score of the lattice-based model improve on average by 0.71% and 0.43%, respectively. Thus, semantic information performs better on the lattice-based model, possibly because the lattice-based model contains more candidate words, so the model can perceive more senses.
Furthermore, an experiment was designed to explore the impact of different segmentation inputs. From the performance data of different segmentations on the LCQMC test set shown in FIG. 7, a significant improvement can be seen for the lattice-based model (lattice) over the word-based models such as pkuseg and thulac. This is presumably because the lattice-based model can reduce the impact of word segmentation errors, making the prediction more accurate.
The role of the GRU in the SaGT was also investigated. With the GRU removed, the accuracy of the model averages 87.82%, indicating that the GRU can control and integrate historical messages with the current information. Through experiments it was found that the model with two SaGT layers achieves the best results, which indicates that multiple rounds of information fusion refine the messages and make the model more robust.
The method provides a new linguistic-knowledge-enhanced graph transformer for Chinese short text matching. The model takes two word lattice graphs as input and integrates the sememe information of HowNet to alleviate word ambiguity to a certain extent. The method was evaluated on two benchmark datasets and achieves the best performance. Ablation analysis also shows that both semantic information and multi-granularity information are important for text matching modeling.
FIG. 8 is a schematic structural diagram of a Chinese text matching system according to an embodiment of the present invention, which can execute the Chinese text matching method of any of the above embodiments and is configured in a terminal.
The Chinese text matching system provided by this embodiment comprises: an encoding program module 11, a semantic representation determining program module 12, an update iteration program module 13, a matching program module 14 and a probability determination program module 15.
The encoding program module 11 is configured to perform character-level encoding on Chinese sentence pairs using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pairs; the semantic representation determining program module 12 is configured to input the initial character vectors of the Chinese sentence pairs to an input layer, determine word vectors of the Chinese sentence pairs, obtain the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determine semantic representations of the word vectors; the update iteration program module 13 is configured to input the word vectors and semantic representations of the Chinese sentence pairs to a semantics-aware graph transformation layer, respectively iteratively update the semantic representations and the word lattice of the word vectors through a multi-dimensional graph attention network, and output semantic word vectors with semantic representations; the matching program module 14 is configured to input the semantic word vectors to a sentence matching layer, concatenate the obtained semantic word vectors of the Chinese sentence pairs with interactive semantic word vectors, and determine the final feature-representation semantic word vectors of the Chinese sentence pairs; the probability determination program module 15 is configured to determine a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pairs and the feature representations of the Chinese sentence pairs by the plurality of word segmentation tools.
Further, the matching program module is configured to:
performing pooling weighting on the semantic word vector to obtain a weighted semantic word vector;
normalizing the weighted semantic word vector and the initial word vector to determine a semantic word vector;
interacting the semantic word vectors of the Chinese sentence pairs through a multidimensional graph attention network to obtain interactive semantic word vectors;
and inputting the semantic word vector and the interactive semantic word vector of the Chinese sentence pair into a feed-forward neural network to generate a final feature expression semantic word vector.
Further, the update iteration program module is configured to:
iteratively update the semantic representations through the reachable nodes of the word nodes corresponding to the semantic representations, iteratively update the word lattice of the word vectors through the sense nodes corresponding to the word nodes, and output the semantic word vectors with semantic representations.
Further, the semantic representation determining program module is configured to:
determine the weight of each character in the initial character vectors through a feed-forward neural network, and weight the initial character vectors based on the weights to obtain the word vectors.
Further, the system is also configured to: iteratively update the semantic representations and the word vectors through a gated recurrent unit.
The embodiment of the invention also provides a nonvolatile computer storage medium, wherein the computer storage medium stores computer executable instructions which can execute the Chinese text matching method in any method embodiment;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
using a plurality of word segmentation tools to encode Chinese sentence pairs at a character level to obtain initial character vectors of the Chinese sentence pairs;
inputting the initial character vectors of the Chinese sentence pairs to an input layer, determining word vectors of the Chinese sentence pairs, obtaining the sememes corresponding to the word vectors based on the HowNet external knowledge base, and determining semantic representations of the word vectors;
inputting the word vectors and semantic representations of the Chinese sentence pairs into a semantics-aware graph transformation layer, respectively iteratively updating the semantic representations and the word lattice of the word vectors through a multi-dimensional graph attention network, and outputting semantic word vectors with semantic representations;
inputting the semantic word vectors into a sentence matching layer, concatenating the obtained semantic word vectors of the Chinese sentence pairs with interactive semantic word vectors, and determining the final feature-representation semantic word vectors of the Chinese sentence pairs;
determining a matching probability based on the final feature-representation semantic word vectors and the feature representations of the Chinese sentence pairs.
The non-volatile computer-readable storage medium may be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present invention. One or more program instructions are stored in the non-volatile computer-readable storage medium and, when executed by a processor, perform the Chinese text matching method of any of the above method embodiments.
The non-volatile computer readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory located remotely from the processor, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the chinese text matching method of any of the embodiments of the present invention.
The client of the embodiment of the present application exists in various forms, including but not limited to:
(1) mobile communication devices, which are characterized by mobile communication capabilities and are primarily targeted at providing voice and data communications. Such terminals include smart phones, multimedia phones, functional phones, and low-end phones, among others.
(2) The ultra-mobile personal computer equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as tablet computers.
(3) Portable entertainment devices such devices may display and play multimedia content. The devices comprise audio and video players, handheld game consoles, electronic books, intelligent toys and portable vehicle-mounted navigation devices.
(4) Other electronic devices with data processing capabilities.
In this document, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A Chinese text matching method, comprising the following steps:
encoding a pair of Chinese sentences at the character level using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pair;
inputting the initial character vectors of the Chinese sentence pair to an input layer, determining word vectors of the Chinese sentence pair, obtaining sememes corresponding to the word vectors from the HowNet external knowledge base, and determining semantic representations of the word vectors;
inputting the word vectors and semantic representations of the Chinese sentence pair to a semantics-aware graph transformer layer, iteratively updating, through a multi-dimensional graph attention network, the semantic representations and the word lattice of the word vectors respectively, and outputting semantic word vectors carrying semantic representations;
inputting the semantic word vectors to a sentence matching layer, concatenating the semantic word vectors of the Chinese sentence pair with interactive semantic word vectors, and determining final feature-representation semantic word vectors of the Chinese sentence pair;
determining a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced by the plurality of word segmentation tools.
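For illustration only, the following sketch shows one plausible realization of the lattice-construction step recited in claim 1: several word segmentation tools are run over the same sentence, and every distinct word span they propose is merged into a single word lattice. The hard-coded segmentations are stand-ins for real tools such as jieba, pkuseg, or THULAC; none of the names or data structures are taken from the patented implementation.

```python
# A minimal word-lattice construction sketch (illustrative assumptions
# throughout; not the patented implementation).

def build_word_lattice(sentence, segmentations):
    """Merge several segmentations of one sentence into a word lattice.

    Each lattice node is a (start, end, word) span over the character
    sequence; overlapping spans from different segmenters coexist, which
    is what preserves every word-boundary hypothesis for later layers.
    """
    lattice = set()
    for seg in segmentations:
        pos = 0
        for word in seg:
            lattice.add((pos, pos + len(word), word))
            pos += len(word)
    return sorted(lattice)

sentence = "南京市长江大桥"
segmentations = [            # hard-coded stand-ins for segmenter outputs
    ["南京市", "长江大桥"],
    ["南京", "市长", "江大桥"],
    ["南京市", "长江", "大桥"],
]
for start, end, word in build_word_lattice(sentence, segmentations):
    assert sentence[start:end] == word   # spans index the character sequence
    print(f"[{start},{end}) {word}")
```

Because every lattice node is anchored to character positions, the character-level initial vectors of claim 1 can later be aggregated per node.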
2. The method according to claim 1, wherein inputting the semantic word vectors to a sentence matching layer and concatenating the semantic word vectors of the Chinese sentence pair with interactive semantic word vectors comprises:
performing pooled weighting on the semantic word vectors to obtain weighted semantic word vectors;
normalizing the weighted semantic word vectors together with the initial character vectors to determine the semantic word vectors;
interacting the semantic word vectors of the Chinese sentence pair through the multi-dimensional graph attention network to obtain the interactive semantic word vectors;
and inputting the semantic word vectors and the interactive semantic word vectors of the Chinese sentence pair to a feed-forward neural network to generate the final feature-representation semantic word vectors.
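For illustration only: the sketch below arranges the four operations of claim 2 (pooled weighting, normalization with the initial vectors, cross-sentence interaction, and a final feed-forward network) into a toy PyTorch module, and finishes with the matching probability of claim 1. The layer names, dimensions, and the use of plain scaled dot-product attention in place of the multi-dimensional graph attention are simplifying assumptions, not the patented implementation.

```python
# A simplified sentence-matching-layer sketch (assumed shapes and names).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceMatchingLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.pool = nn.Linear(dim, 1)        # scores for pooled weighting
        self.norm = nn.LayerNorm(dim)        # normalization step
        self.ffn = nn.Sequential(            # fuses own + interactive views
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.cls = nn.Linear(2 * dim, 2)     # match / no-match logits

    def refine(self, sem, init):
        # Pooled weighting of the semantic word vectors, then
        # normalization together with the initial vectors (residual).
        w = torch.sigmoid(self.pool(sem))            # (L, 1)
        return self.norm(w * sem + init)             # (L, d)

    def interact(self, a, b):
        # Cross-sentence attention stands in for the multi-dimensional
        # graph attention interaction between the two sentences.
        att = F.softmax(a @ b.t() / a.size(-1) ** 0.5, dim=-1)
        return att @ b                               # (La, d)

    def forward(self, sem_a, init_a, sem_b, init_b):
        a, b = self.refine(sem_a, init_a), self.refine(sem_b, init_b)
        fa = self.ffn(torch.cat([a, self.interact(a, b)], dim=-1))
        fb = self.ffn(torch.cat([b, self.interact(b, a)], dim=-1))
        pair = torch.cat([fa.max(dim=0).values, fb.max(dim=0).values])
        return F.softmax(self.cls(pair), dim=-1)     # matching probability

# Toy usage: sentences of 5 and 7 words, 32-dimensional vectors.
d = 32
layer = SentenceMatchingLayer(d)
prob = layer(torch.randn(5, d), torch.randn(5, d),
             torch.randn(7, d), torch.randn(7, d))
print(prob)   # two probabilities summing to 1
```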
3. The method according to claim 1, wherein iteratively updating the semantic representations and the word lattice of the word vectors respectively through the multi-dimensional graph attention network comprises:
iteratively updating each semantic representation through the accessible nodes of the word node corresponding to that semantic representation, iteratively updating the word lattice of the word vectors through the semantic nodes corresponding to each word node, and outputting the semantic word vectors carrying semantic representations.
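For illustration only: "multi-dimensional" attention is read here as feature-wise attention, i.e. each neighbour receives one weight per feature dimension rather than a single scalar, so different dimensions can attend to different neighbours. The sketch below applies this reading to one word node and its sememe nodes; the scoring function and shapes are assumptions.

```python
# Feature-wise ("multi-dimensional") graph attention sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDimGraphAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(2 * dim, dim)   # one logit per dimension

    def forward(self, query, neighbours):
        # query: (d,)   neighbours: (N, d)
        q = query.expand(neighbours.size(0), -1)             # (N, d)
        logits = self.score(torch.cat([q, neighbours], -1))  # (N, d)
        alpha = F.softmax(logits, dim=0)   # per-dimension weights over N
        return (alpha * neighbours).sum(0) # aggregated message, (d,)

dim = 16
word_from_sememes = MultiDimGraphAttention(dim)  # word node reads its sememes
sememe_from_words = MultiDimGraphAttention(dim)  # sememe node reads its words

word = torch.randn(dim)         # one word node in the lattice
sememes = torch.randn(3, dim)   # its sememe nodes from HowNet

word_update = word_from_sememes(word, sememes)
sememe_update = sememe_from_words(sememes[0], word.unsqueeze(0))
print(word_update.shape, sememe_update.shape)   # torch.Size([16]) each
```

Running both directions repeatedly gives the mutual iterative update described in the claim.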
4. The method according to claim 1, wherein inputting the initial character vectors of the Chinese sentence pair to an input layer and determining the word vectors of the Chinese sentence pair comprises:
determining a weight for each character in the initial character vectors through a feed-forward neural network, and weighting the initial character vectors by these weights to obtain the word vectors.
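For illustration only, the step of claim 4 can be pictured as attention pooling over character vectors: a small feed-forward scorer assigns each character in a word a weight, and the word vector is the weighted sum of its character vectors. The two-layer scorer and the dimensions are assumptions.

```python
# Character-to-word weighting sketch (illustrative, not the patented form).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharToWord(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Sequential(          # feed-forward weight scorer
            nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, chars):
        # chars: (n_chars, d) -> weights: (n_chars, 1) -> word: (d,)
        weights = F.softmax(self.scorer(chars), dim=0)
        return (weights * chars).sum(dim=0)

word_vec = CharToWord(64)(torch.randn(3, 64))  # e.g. a three-character word
print(word_vec.shape)   # torch.Size([64])
```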
5. The method according to claim 1, wherein the semantic representations and the word vectors are iteratively updated through a gated recurrent unit.
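For illustration only: one common way to realize the gated update of claim 5 is to feed each round's aggregated attention message into a gated recurrent unit, so the node can keep or overwrite parts of its previous state. Using torch.nn.GRUCell for this is an assumption about the realization, not taken from the patent.

```python
# Gated iterative update sketch with a GRU cell.
import torch
import torch.nn as nn

dim = 16
gru = nn.GRUCell(input_size=dim, hidden_size=dim)

state = torch.randn(1, dim)          # current node representation
for step in range(3):                # three update iterations
    message = torch.randn(1, dim)    # stand-in for the attention message
    state = gru(message, state)      # gated fusion of message and state
print(state.shape)   # torch.Size([1, 16])
```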
6. A Chinese text matching system, comprising:
an encoding program module, configured to encode a pair of Chinese sentences at the character level using a plurality of word segmentation tools to obtain initial character vectors of the Chinese sentence pair;
a semantic representation determining program module, configured to input the initial character vectors of the Chinese sentence pair to an input layer, determine word vectors of the Chinese sentence pair, obtain sememes corresponding to the word vectors from the HowNet external knowledge base, and determine semantic representations of the word vectors;
an update iteration program module, configured to input the word vectors and semantic representations of the Chinese sentence pair to a semantics-aware graph transformer layer, iteratively update the semantic representations and the word lattice of the word vectors respectively through a multi-dimensional graph attention network, and output semantic word vectors carrying semantic representations;
a matching program module, configured to input the semantic word vectors to a sentence matching layer, concatenate the semantic word vectors of the Chinese sentence pair with interactive semantic word vectors, and determine final feature-representation semantic word vectors of the Chinese sentence pair;
a probability determining program module, configured to determine a matching probability based on the final feature-representation semantic word vectors of the Chinese sentence pair and the feature representations of the Chinese sentence pair produced by the plurality of word segmentation tools.
7. The system according to claim 6, wherein the matching program module is configured to:
perform pooled weighting on the semantic word vectors to obtain weighted semantic word vectors;
normalize the weighted semantic word vectors together with the initial character vectors to determine the semantic word vectors;
interact the semantic word vectors of the Chinese sentence pair through the multi-dimensional graph attention network to obtain the interactive semantic word vectors;
and input the semantic word vectors and the interactive semantic word vectors of the Chinese sentence pair to a feed-forward neural network to generate the final feature-representation semantic word vectors.
8. The system according to claim 6, wherein the update iteration program module is configured to:
iteratively update each semantic representation through the accessible nodes of the word node corresponding to that semantic representation, iteratively update the word lattice of the word vectors through the semantic nodes corresponding to each word node, and output the semantic word vectors carrying semantic representations.
9. The system according to claim 6, wherein the semantic representation determining program module is configured to:
determine a weight for each character in the initial character vectors through a feed-forward neural network, and weight the initial character vectors by these weights to obtain the word vectors.
10. The system according to claim 6, wherein the system is further configured to iteratively update the semantic representations and the word vectors through a gated recurrent unit.
CN202010837271.8A 2020-08-19 2020-08-19 Chinese text matching method and system Active CN111914067B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010837271.8A CN111914067B (en) 2020-08-19 2020-08-19 Chinese text matching method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010837271.8A CN111914067B (en) 2020-08-19 2020-08-19 Chinese text matching method and system

Publications (2)

Publication Number Publication Date
CN111914067A CN111914067A (en) 2020-11-10
CN111914067B (en) 2022-07-08

Family

ID=73279383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010837271.8A Active CN111914067B (en) 2020-08-19 2020-08-19 Chinese text matching method and system

Country Status (1)

Country Link
CN (1) CN111914067B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329478A (en) * 2020-11-30 2021-02-05 北京明略昭辉科技有限公司 Method, device and equipment for constructing causal relationship determination model
CN112528654A (en) * 2020-12-15 2021-03-19 作业帮教育科技(北京)有限公司 Natural language processing method and device and electronic equipment
CN112926322A (en) * 2021-04-28 2021-06-08 河南大学 Text classification method and system combining self-attention mechanism and deep learning
CN113094473A (en) * 2021-04-30 2021-07-09 平安国际智慧城市科技股份有限公司 Keyword weight calculation method and device, computer equipment and storage medium
CN113486659B (en) * 2021-05-25 2024-03-15 平安科技(深圳)有限公司 Text matching method, device, computer equipment and storage medium
CN113468884B (en) * 2021-06-10 2023-06-16 北京信息科技大学 Chinese event trigger word extraction method and device
CN114048286B (en) * 2021-10-29 2024-06-07 南开大学 Automatic fact verification method integrating graph converter and common attention network
CN114238563A (en) * 2021-12-08 2022-03-25 齐鲁工业大学 Multi-angle interaction-based intelligent matching method and device for Chinese sentences to semantic meanings
CN114429129A (en) * 2021-12-22 2022-05-03 南京信息工程大学 Literature mining and material property prediction method
CN114881040B (en) * 2022-05-12 2022-12-06 桂林电子科技大学 Method and device for processing semantic information of paragraphs and storage medium
CN115238708B (en) * 2022-08-17 2024-02-27 腾讯科技(深圳)有限公司 Text semantic recognition method, device, equipment, storage medium and program product
CN115422362B (en) * 2022-10-09 2023-10-31 郑州数智技术研究院有限公司 Text matching method based on artificial intelligence
CN116796197A (en) * 2022-12-22 2023-09-22 华信咨询设计研究院有限公司 Medical short text similarity matching method
CN116226357B (en) * 2023-05-09 2023-07-14 武汉纺织大学 Document retrieval method under input containing error information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509411B (en) * 2017-10-10 2021-05-11 腾讯科技(深圳)有限公司 Semantic analysis method and device
CN111046671A (en) * 2019-12-12 2020-04-21 中国科学院自动化研究所 Chinese named entity recognition method based on graph network and merged into dictionary

Also Published As

Publication number Publication date
CN111914067A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111914067B (en) Chinese text matching method and system
CN109840287B (en) Cross-modal information retrieval method and device based on neural network
CN112528672B (en) Aspect-level emotion analysis method and device based on graph convolution neural network
CN109033068B (en) Method and device for reading and understanding based on attention mechanism and electronic equipment
US9830315B1 (en) Sequence-based structured prediction for semantic parsing
Yao et al. Bi-directional LSTM recurrent neural network for Chinese word segmentation
CN111259127B (en) Long text answer selection method based on transfer learning sentence vector
CN108846077B (en) Semantic matching method, device, medium and electronic equipment for question and answer text
CN106202010B (en) Method and apparatus based on deep neural network building Law Text syntax tree
CN104834747B (en) Short text classification method based on convolutional neural networks
CN106502985B (en) neural network modeling method and device for generating titles
Sutskever et al. Generating text with recurrent neural networks
CN111159416A (en) Language task model training method and device, electronic equipment and storage medium
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN110083705A (en) A kind of multi-hop attention depth model, method, storage medium and terminal for target emotional semantic classification
CN110083693B (en) Robot dialogue reply method and device
CN110008327B (en) Legal answer generation method and device
CN109214006B (en) Natural language reasoning method for image enhanced hierarchical semantic representation
WO2021051518A1 (en) Text data classification method and apparatus based on neural network model, and storage medium
CN111191002A (en) Neural code searching method and device based on hierarchical embedding
CN114818717B (en) Chinese named entity recognition method and system integrating vocabulary and syntax information
CN113204611A (en) Method for establishing reading understanding model, reading understanding method and corresponding device
CN111241828A (en) Intelligent emotion recognition method and device and computer readable storage medium
CN115329766B (en) Named entity identification method based on dynamic word information fusion
CN118093834A (en) AIGC large model-based language processing question-answering system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

GR01 Patent grant