CN112966073B - Short text matching method based on semantics and shallow features


Info

Publication number: CN112966073B
Authority: CN (China)
Prior art keywords: vector, text, feature, representing, feature vector
Legal status: Active (granted)
Application number: CN202110373418.7A
Other languages: Chinese (zh)
Other versions: CN112966073A
Inventors: 杨洁, 余卫宇
Current assignee: South China University of Technology (SCUT)
Original assignee: South China University of Technology (SCUT)
Application filed by: South China University of Technology (SCUT)
Priority and filing date: 2021-04-07
Publication of CN112966073A: 2021-06-15
Publication (grant) of CN112966073B: 2023-01-06

Classifications

    • G06F16/3347 Query execution using vector based model
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F16/3344 Query execution using natural language analysis
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/048 Activation functions


Abstract

The invention discloses a short text matching method based on semantics and shallow features, and relates to the technical field of text matching. The method comprises the following steps: reading and preprocessing the first text and the second text to obtain word information; mapping the word information into word feature vectors by using a word2vec model; extracting sentence-coding features and normalizing them to obtain statistical feature vectors; inputting the character feature vectors and the statistical feature vectors into an interactive feature learner and a statistical feature learner respectively to obtain the decoding vectors u_s and r_s; and splicing the output of the interactive feature learner with the output of the statistical feature learner and inputting the spliced result into an MLP layer for prediction; if the output result is 1, the first text and the second text are matched successfully. The invention further refines the vector information with a multilayer perceptron and achieves excellent text matching performance.

Description

Short text matching method based on semantics and shallow features
Technical Field
The invention relates to the technical field of text matching, in particular to a short text matching method based on semantics and shallow features.
Background
For retrieval tasks, it is important to retrieve content with high semantic relevance. Short text matching judges similarity by matching the contents of two short texts and has important application value in every retrieval task. Traditional short text matching models are limited by the sparse semantics, scarce feature information and small training corpora of short texts, which restricts the industrial application of traditional short text matching methods. Meanwhile, the two short texts may differ greatly in length, and synonyms, aliases and the like may fail to align, further limiting short text matching accuracy. Obtaining richer semantic feature representations, reducing the negative influence of large length differences on matching, and solving the alignment of synonyms, aliases and the like in short texts are therefore the key technical points.
Disclosure of Invention
In view of this, the present invention designs a feature extractor, an interactive feature learner and a statistical feature learner. These modules respectively perform deep coding on the short texts and on their statistical features, learn the feature representations generated by the deep coding to obtain the corresponding short text deep representation vectors, splice the corresponding representation vectors, and finally refine the spliced representation with a multilayer perceptron, thereby obtaining excellent performance. On this basis, the invention provides a short text matching method based on semantics and shallow features.
In order to achieve the purpose, the invention adopts the following technical scheme:
a short text matching method based on semantics and shallow features comprises the following steps:
reading and preprocessing the first text and the second text to obtain word information;
mapping the word information into word feature vectors by using a word2vec model;
extracting sentence-coding features and normalizing them to obtain statistical feature vectors;
obtaining the decoding vector u_s corresponding to the character feature vectors by using BiLSTM and attention; updating the statistical feature vectors with a multi-head attention structure to obtain the decoding vector r_s;
splicing the decoding vector u_s and the decoding vector r_s and predicting on the spliced result; if the output result is 1, the first text and the second text are matched successfully. A minimal end-to-end sketch of these steps is given after this list.
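As an illustration only, the following is a minimal end-to-end sketch of how these steps compose, assuming PyTorch. The BiLSTM encoder and the linear statistical projection below are simple stand-ins for the interactive and statistical feature learners detailed later, and every class name, size and pooling choice here is ours rather than the patent's:

```python
import torch
import torch.nn as nn

class ShortTextMatcher(nn.Module):
    # Stand-in pipeline: word vectors -> BiLSTM (for u_s), statistical
    # vectors -> linear + ReLU (for r_s), then splice and predict with an MLP.
    def __init__(self, emb_dim=300, stat_dim=100, hid=128):
        super().__init__()
        self.encoder = nn.LSTM(emb_dim, hid, bidirectional=True, batch_first=True)
        self.stat_proj = nn.Linear(2 * stat_dim, 2 * hid)
        self.mlp = nn.Sequential(nn.Linear(6 * hid, hid), nn.ReLU(), nn.Linear(hid, 2))

    def forward(self, w1, w2, s1, s2):
        u1, _ = self.encoder(w1)                        # encode text 1 word vectors
        u2, _ = self.encoder(w2)                        # encode text 2 word vectors
        u_s = torch.cat([u1.mean(1), u2.mean(1)], -1)   # stand-in for decoding vector u_s
        r_s = torch.relu(self.stat_proj(torch.cat([s1, s2], -1)))  # stand-in for r_s
        o = self.mlp(torch.cat([u_s, r_s], -1))         # splice and predict
        return o.softmax(-1).argmax(-1)                 # output 1 means a successful match

# toy batch: two pairs, texts of length 7 and 9, 100-dimensional statistical vectors
m = ShortTextMatcher()
print(m(torch.randn(2, 7, 300), torch.randn(2, 9, 300),
        torch.randn(2, 100), torch.randn(2, 100)))
```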
Preferably, the word information includes a character sequence and a word sequence.
Preferably, the sentence-coding features include distance features, text features and co-occurrence features.
Preferably, the decoding vector u_s is obtained as follows:

the character feature vectors are input into a BiLSTM layer for independent encoder coding, a special vector is added behind each encoded vector, and the special vectors are set automatically according to actual conditions, giving:

ĥ_b = [LSTM(f_b); v_b];
ĥ_d = [LSTM(f_d); v_d];

wherein LSTM(f_b) is the result of inputting the character feature vector of the first text into the BiLSTM layer for independent encoder coding; LSTM(f_d) is the result of inputting the character feature vector of the second text into the BiLSTM layer for independent encoder coding; v_b is the special vector corresponding to the first text; v_d is the special vector corresponding to the second text; f_b is the word feature vector of the first text and f_d is the word feature vector of the second text. ĥ_b is input into a nonlinear activation network to obtain the hidden vector matrix h_b, and ĥ_d is input into the nonlinear activation network to obtain the hidden vector matrix h_d:

h_b = σ(W ĥ_b + b);
h_d = σ(W ĥ_d + b);

where σ denotes the nonlinear activation. The correlation matrix s of the hidden vector matrix h_d and the hidden vector matrix h_b is computed as:

s = (h_b)^T h_d ∈ R^{(b+1)×(d+1)};

the mutual attention scores A_b and A_d are computed from the correlation matrix s with the softmax function, and the feature vectors c_b and c_d are computed as:

c_b = h_d A_b ∈ R^{l×(n+1)};
c_d = [h_b; c_b] A_d ∈ R^{2l×(m+1)};

the hidden vector matrix h_b is spliced with the feature vector c_b, the hidden vector matrix h_d is spliced with the feature vector matrix c_d, and the two resulting vectors are spliced to obtain w_t; w_t, together with u_{t-1} and u_{t+1}, is fed into a BiLSTM layer to obtain the current final hidden vector u_t, where u_{t-1} and u_{t+1} are the final hidden vectors at the previous and the next time step; the hidden vector u is input into a BiLSTM layer to obtain the corresponding decoding vector u_s.
Preferably, the decoding vector r_s is obtained as follows:

a multi-head attention mechanism is adopted, in which each attention head is computed as:

e_b, e_d = Linear(f_b, f_d);
â = e_b · e_d;
a = softmax(â);

wherein f_b denotes the statistical feature vector representation of the first text, f_d denotes the statistical feature vector representation of the second text, e_b and e_d denote the projection vectors obtained by projecting f_b and f_d onto different planes, â denotes the value obtained after dot-multiplying the projection vectors, and a denotes the computed weight scores corresponding to the first text and the second text;

the corresponding statistical feature representation vectors are updated with the computed feature weights, the calculation formulas being:

r_i^h = a^h e^h;
r_i = ReLU(W (r_i^1 ⊕ r_i^2 ⊕ … ⊕ r_i^H));

wherein r_i^h denotes the projection vector obtained by projecting onto the h-th plane; r_i denotes the new statistical feature representation vector obtained after feature i is updated; ⊕ denotes the join operation; ReLU denotes the nonlinear activation function;

finally, the difference and the product of the updated statistical feature representation vectors are computed and spliced, the calculation process being:

r^- = |r_i − r_j|;
r^∘ = r_i ⊙ r_j;
r_s = [r^-; r^∘].
preferably, the output vectors of the two modules are spliced, and then the MLP layer is used for prediction, and the calculation process is as follows:
o = softmax(MLP([u_s, r_s])).
compared with the prior art, the invention discloses a short text matching method based on semantics and shallow features, and has the following beneficial effects:
1. Feature extractor: statistical feature vector representations are constructed, and three types of features (sentence-coding distance features, text features and co-occurrence features) are extracted from the text and normalized, enriching the features available for short text matching.
2. Feature extractor: character sequence features are extracted from the first text and the second text while complete digit and English-word features are extracted at the same time, which avoids the semantic loss caused by splitting digits and English words and makes short text matching prediction more accurate.
3. Feature learning: the co-attention model of the text feature learner better learns the interactive features between short texts, yields better hidden-state representation vectors, and learns more abstract and more robust short text features.
4. Feature learning: the multi-head attention mechanism of the statistical feature learner captures rich shallow semantic information, so the learned features are more robust.
5. Feature learning: the statistical feature learner computes the difference and the product separately, which effectively learns both the differences and the commonalities between texts and enables more accurate prediction of the relationship between the short texts.
6. Prediction: the features learned by the text feature learner and the statistical feature learner are spliced and predicted with a multilayer perceptron and an activation function, outputting different labels and improving matching accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present invention, and that those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a short text matching method based on semantics and shallow features which, as shown in FIG. 1, comprises the following steps:
reading and preprocessing the first text and the second text to obtain word information;
mapping the word information into word feature vectors by using a word2vec model;
extracting sentence-coding features and normalizing them to obtain statistical feature vectors;
obtaining the decoding vector u_s corresponding to the character feature vectors by using BiLSTM and attention; updating the statistical feature vectors with a multi-head attention structure to obtain the decoding vector r_s;
splicing the decoding vector u_s and the decoding vector r_s and predicting on the spliced result; if the output result is 1, the first text and the second text are matched successfully.
The present embodiment includes the following steps:
feature extractor
Step 1: the short text 1 to be matched and the corresponding short text 2 to be matched are input and preprocessed; the preprocessing includes special-symbol handling, conversion between upper- and lower-case English letters, and unification of simplified and traditional Chinese characters. Each text is divided into a character sequence and a word sequence, and the word information is mapped into corresponding word vectors (300-dimensional) with a word2vec model. Three types of features are then extracted: sentence-coding distance features, text features and co-occurrence features. The sentence-coding distance features include cosine similarity, Euclidean similarity, Hamming distance, TF/IDF, word2vec and edit distance; the text features include the number of characters in the text, the number of words after segmentation, the number of characters after stop-word removal, the number of words after segmentation and stop-word removal, the number of distinct words, the ratio of distinct words, the number of punctuation marks, the ratio of punctuation marks, the number of POS parts of speech and the ratio of POS parts of speech; the co-occurrence features include 1-gram, 2-gram and 3-gram co-occurrence and the corresponding stop-word-removed 1-gram, 2-gram and 3-gram co-occurrence. The extracted features are normalized, yielding the corresponding 300-dimensional word-oriented feature vector representation and 100-dimensional statistical feature vector representation. A sketch of this feature extraction is given below.
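As an illustration, here is a minimal sketch of this statistical feature extraction in plain Python. It covers a representative subset of the distance, text and co-occurrence features listed above; the helper names and the exact feature subset are our own choices, not the patent's, and the final normalization is only indicated by a comment:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # set of n-grams of a token sequence (used for the co-occurrence features)
    return set(zip(*(tokens[i:] for i in range(n))))

def cosine(a, b):
    # cosine similarity between bag-of-words vectors (a distance feature)
    ca, cb = Counter(a), Counter(b)
    num = sum(ca[t] * cb[t] for t in ca)
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

def edit_distance(s, t):
    # Levenshtein distance with a rolling one-row table (a distance feature)
    d = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        prev, d[0] = d[0], i
        for j, ct in enumerate(t, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (cs != ct))
    return d[len(t)]

def statistical_features(tok1, tok2):
    feats = [
        cosine(tok1, tok2),                       # distance feature
        float(edit_distance(tok1, tok2)),         # distance feature
        float(len(tok1)), float(len(tok2)),       # text features: lengths
        len(set(tok1)) / max(len(tok1), 1),       # text feature: distinct-token ratio
        len(set(tok2)) / max(len(tok2), 1),
    ]
    for n in (1, 2, 3):                           # co-occurrence: n-gram Jaccard overlap
        g1, g2 = ngrams(tok1, n), ngrams(tok2, n)
        feats.append(len(g1 & g2) / max(len(g1 | g2), 1))
    # a min-max or z-score normalization over the corpus would follow here
    return feats

print(statistical_features(list("如何申请退货"), list("我想退货怎么办")))
```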
Step 2: from step 1, the word feature vectors of the text 1 to be matched (denoted w1, w2, w3, ..., wn) and its statistical feature vector (denoted s1, s2, s3, ..., sn), and the word feature vectors of the text 2 to be matched (denoted q1, q2, q3, ..., qm) and its statistical feature vector (denoted x1, x2, x3, ..., xn) are obtained. The interactive feature learner takes as input the feature vector matrices of the text 1 to be matched and the text 2 to be matched, of dimensions n × 300 and m × 300 respectively, and the statistical feature learner takes as input the 100-dimensional statistical feature vectors of the text 1 to be matched and the text 2 to be matched.
Interactive feature learner and statistical feature learner
Step 1: the interactive feature learner takes attention and BiLSTM as its basis and improves on them. First, the word feature vector matrices corresponding to the text 1 to be matched and the text 2 to be matched are input into a BiLSTM layer for independent encoder coding, the calculation being:
h_b = LSTM(f_b);
h_d = LSTM(f_d);

and a special vector is added after each encoded vector as an identification symbol, the special vector being set according to the actual situation, giving:

ĥ_b = [LSTM(f_b); v_b];
ĥ_d = [LSTM(f_d); v_d];

wherein LSTM(f_b) is the result of inputting the word feature vector of the first text into the BiLSTM layer for independent encoder coding; LSTM(f_d) is the result of inputting the word feature vector of the second text into the BiLSTM layer for independent encoder coding; v_b is the special vector corresponding to the first text; v_d is the special vector corresponding to the second text; f_b is the word feature vector of the first text and f_d is the word feature vector of the second text. ĥ_b is input into a nonlinear activation network to obtain the hidden vector matrix h_b, and ĥ_d is input into the nonlinear activation network to obtain the hidden vector matrix h_d:

h_b = σ(W ĥ_b + b);
h_d = σ(W ĥ_d + b);

where σ denotes the nonlinear activation.
The correlation matrix s of the hidden vector matrix h_d and the hidden vector matrix h_b is computed as:

s = (h_b)^T h_d ∈ R^{(b+1)×(d+1)};

the mutual attention scores A_b and A_d are computed from the correlation matrix s with the softmax function, the calculation process being:

A_b = softmax(s);
A_d = softmax(s^T);

and the feature vectors c_b and c_d are computed as:

c_b = h_d A_b ∈ R^{l×(n+1)};
c_d = [h_b; c_b] A_d ∈ R^{2l×(m+1)}.

The hidden vector matrix h_b is spliced with the feature vector c_b, the hidden vector matrix h_d is spliced with the feature vector matrix c_d, and the two resulting vectors are spliced to obtain w_t; w_t, together with u_{t-1} and u_{t+1}, is fed into a BiLSTM layer to obtain the current final hidden vector u_t, where u_{t-1} and u_{t+1} are the final hidden vectors at the previous and the next time step. The hidden vector u is then input into a BiLSTM layer to obtain the corresponding decoding vector u_s. The specific formulas are as follows:

w_t = [[h_b; c_b]_t; [h_d; c_d]_t];
u_t = BiLSTM(u_{t-1}, w_t, u_{t+1});
u = [u_1, u_2, u_3, ..., u_n] ∈ R^{2l×n}.

A sketch of this learner is given below.
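The following is a minimal PyTorch sketch of this interactive feature learner. The sentinel (special) vector, the two width-matching projections used to reconcile the spliced widths of [h_b; c_b] and [h_d; c_d] before the fusion BiLSTM, and the mean-pooling of u into u_s are our assumptions where the text leaves the details open; all sizes and names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InteractiveFeatureLearner(nn.Module):
    def __init__(self, emb_dim=300, l=128):
        super().__init__()
        self.enc = nn.LSTM(emb_dim, l, bidirectional=True, batch_first=True)
        self.act = nn.Sequential(nn.Linear(2 * l, 2 * l), nn.Tanh())  # nonlinear activation network
        self.sentinel = nn.Parameter(torch.randn(1, 1, 2 * l))        # special vector (assumption)
        self.proj_b = nn.Linear(4 * l, 2 * l)   # reconcile the width of [h_b; c_b] (assumption)
        self.proj_d = nn.Linear(6 * l, 2 * l)   # reconcile the width of [h_d; c_d] (assumption)
        self.fuse = nn.LSTM(2 * l, l, bidirectional=True, batch_first=True)

    def encode(self, x):
        h, _ = self.enc(x)                                  # independent encoder coding
        sent = self.sentinel.expand(x.size(0), -1, -1)
        return self.act(torch.cat([h, sent], dim=1))        # append special vector, then activate

    def forward(self, w_b, w_d):
        h_b, h_d = self.encode(w_b), self.encode(w_d)       # (B, n+1, 2l), (B, m+1, 2l)
        s = torch.bmm(h_b, h_d.transpose(1, 2))             # correlation matrix s = (h_b)^T h_d
        A_b, A_d = F.softmax(s, 2), F.softmax(s.transpose(1, 2), 2)
        c_b = torch.bmm(A_b, h_d)                           # c_b = h_d A_b
        c_d = torch.bmm(A_d, torch.cat([h_b, c_b], 2))      # c_d = [h_b; c_b] A_d
        bc = self.proj_b(torch.cat([h_b, c_b], 2))          # splice h_b with c_b
        dc = self.proj_d(torch.cat([h_d, c_d], 2))          # splice h_d with c_d
        u, _ = self.fuse(torch.cat([bc, dc], 1))            # u_t = BiLSTM(u_{t-1}, w_t, u_{t+1})
        return u.mean(dim=1)                                # pooled decoding vector u_s

u_s = InteractiveFeatureLearner()(torch.randn(2, 7, 300), torch.randn(2, 9, 300))
print(u_s.shape)  # torch.Size([2, 256])
```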
the second step: for the statistical feature learner, a structure of a multi-head attention mechanism is adopted, and improvement is made on the basis of the multi-head attention mechanism. The calculation formula of our attention head is as follows:
e_b, e_d = Linear(f_b, f_d);
â = e_b · e_d;
a = softmax(â);

wherein f_b denotes the statistical feature vector representation of the short text 1 to be matched, f_d denotes the statistical feature vector representation of the short text 2 to be matched, e_b and e_d denote the projection vectors obtained by projecting f_b and f_d onto different planes, â denotes the value obtained after dot-multiplying the projection vectors, and a denotes the computed weight scores corresponding to the short text 1 to be matched and the short text 2 to be matched.

Next, the corresponding statistical feature representation vectors are updated with the computed feature weights, the calculation formulas being:

r_i^h = a^h e^h;
r_i = ReLU(W (r_i^1 ⊕ r_i^2 ⊕ … ⊕ r_i^H));

wherein r_i^h denotes the projection vector obtained by projecting onto the h-th plane; r_i denotes the new statistical feature representation vector obtained after feature i is updated; ⊕ denotes the join operation; ReLU denotes the nonlinear activation function.

Finally, the difference and the product of the updated statistical feature representation vectors are computed and spliced, the calculation process being:

r^- = |r_i − r_j|;
r^∘ = r_i ⊙ r_j;
r_s = [r^-; r^∘].

A sketch of this learner follows.
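A minimal PyTorch sketch of this statistical feature learner follows. The head count and projection size are ours, and because a softmax here would act over only the two texts, the sketch substitutes a complementary sigmoid weighting for the pair; treat it as one plausible reading rather than the patent's exact formula:

```python
import torch
import torch.nn as nn

class StatisticalFeatureLearner(nn.Module):
    def __init__(self, stat_dim=100, heads=4, d=64):
        super().__init__()
        self.heads, self.d = heads, d
        self.line = nn.Linear(stat_dim, heads * d)       # per-head linear projection ("Linear")
        self.out = nn.Linear(heads * d, heads * d)       # join heads, then ReLU

    def project(self, f):
        return self.line(f).view(-1, self.heads, self.d)   # e: (B, H, d)

    def forward(self, f_b, f_d):
        e_b, e_d = self.project(f_b), self.project(f_d)    # projection vectors e_b, e_d
        score = (e_b * e_d).sum(-1) / self.d ** 0.5        # per-head dot product
        a = torch.sigmoid(score).unsqueeze(-1)             # weight scores (our assumption)
        r_i = torch.relu(self.out((a * e_b).flatten(1)))         # updated vector, text 1
        r_j = torch.relu(self.out(((1 - a) * e_d).flatten(1)))   # updated vector, text 2
        diff = (r_i - r_j).abs()                           # r^- = |r_i - r_j|
        prod = r_i * r_j                                   # elementwise product r^o
        return torch.cat([diff, prod], dim=-1)             # spliced decoding vector r_s

r_s = StatisticalFeatureLearner()(torch.randn(2, 100), torch.randn(2, 100))
print(r_s.shape)  # torch.Size([2, 512])
```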
and (3) splicing prediction:
splicing the output vectors of the two modules, and then predicting by using an MLP layer, wherein the calculation process is as follows:
o = softmax(MLP([u_s, r_s])).

A sketch of this step follows.
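Finally, a minimal sketch of the splice prediction, with u_s and r_s stubbed by random tensors of the sizes the two sketches above produce; the MLP width is an assumption:

```python
import torch
import torch.nn as nn

# o = softmax(MLP([u_s, r_s])): splice the two decoding vectors and classify
mlp = nn.Sequential(nn.Linear(256 + 512, 128), nn.ReLU(), nn.Linear(128, 2))
u_s, r_s = torch.randn(2, 256), torch.randn(2, 512)    # stand-ins for the learner outputs
o = torch.softmax(mlp(torch.cat([u_s, r_s], dim=-1)), dim=-1)
print(o.argmax(dim=-1))  # label 1 means the two texts match successfully
```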
the embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed in the embodiment corresponds to the method disclosed in the embodiment, so that the description is simple, and the relevant points can be referred to the description of the method part.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. A short text matching method based on semantics and shallow features is characterized by comprising the following steps:
reading and preprocessing the first text and the second text to obtain word information;

mapping the word information into word feature vectors by using a word2vec model;

extracting sentence-coding features and normalizing them to obtain statistical feature vectors;

obtaining a decoding vector u_s corresponding to the character feature vectors by using BiLSTM and attention; updating the statistical feature vectors with a multi-head attention structure to obtain a decoding vector r_s;

splicing the decoding vector u_s and the decoding vector r_s and predicting on the spliced result, the first text and the second text being matched successfully if the output result is 1;

wherein the decoding vector u_s is obtained as follows:

the character feature vectors are input into a BiLSTM layer for independent encoder coding, and a special vector is added behind each encoded vector, giving:

ĥ_b = [LSTM(f_b); v_b];
ĥ_d = [LSTM(f_d); v_d];

wherein LSTM(f_b) is the result of inputting the character feature vector of the first text into the BiLSTM layer for independent encoder coding; LSTM(f_d) is the result of inputting the character feature vector of the second text into the BiLSTM layer for independent encoder coding; v_b is the special vector corresponding to the first text; v_d is the special vector corresponding to the second text; f_b is the word feature vector of the first text and f_d is the word feature vector of the second text; ĥ_b is input into a nonlinear activation network to obtain a hidden vector matrix h_b, and ĥ_d is input into the nonlinear activation network to obtain a hidden vector matrix h_d:

h_b = σ(W ĥ_b + b);
h_d = σ(W ĥ_d + b);

where σ denotes the nonlinear activation; the correlation matrix s of the hidden vector matrix h_d and the hidden vector matrix h_b is computed as:

s = (h_b)^T h_d ∈ R^{(b+1)×(d+1)};

the mutual attention scores A_b and A_d are computed from the correlation matrix s with the softmax function; the feature vectors c_b and c_d are computed as:

c_b = h_d A_b ∈ R^{l×(n+1)};
c_d = [h_b; c_b] A_d ∈ R^{2l×(m+1)};

the hidden vector matrix h_b is spliced with the feature vector c_b, the hidden vector matrix h_d is spliced with the feature vector matrix c_d, and the two resulting vectors are spliced to obtain w_t; w_t, together with u_{t-1} and u_{t+1}, is fed into a BiLSTM layer to obtain the current final hidden vector u_t, where u_{t-1} and u_{t+1} are the final hidden vectors at the previous and the next time step; the hidden vector u is input into a BiLSTM layer to obtain the corresponding decoding vector u_s.
2. The method of claim 1, wherein the word information comprises a character sequence and a word sequence.
3. The method of claim 1, wherein the sentence-coding features comprise distance features, text features, and co-occurrence features.
4. The short text matching method based on semantics and shallow features of claim 1, wherein the decoding vector r_s is obtained as follows:

a multi-head attention mechanism is adopted, in which each attention head is computed as:

e_b, e_d = Linear(f_b, f_d);
â = e_b · e_d;
a = softmax(â);

wherein f_b denotes the statistical feature vector representation of the first text, f_d denotes the statistical feature vector representation of the second text, e_b and e_d denote the projection vectors obtained by projecting f_b and f_d onto different planes, â denotes the value obtained after dot-multiplying the projection vectors, and a denotes the computed weight scores corresponding to the first text and the second text;

the corresponding statistical feature representation vectors are updated with the computed feature weights, the calculation formulas being:

r_i^h = a^h e^h;
r_i = ReLU(W (r_i^1 ⊕ r_i^2 ⊕ … ⊕ r_i^H));

wherein r_i^h denotes the projection vector obtained by projecting onto the h-th plane; r_i denotes the new statistical feature representation vector obtained after feature i is updated; ⊕ denotes the join operation; ReLU denotes the nonlinear activation function;

the difference and the product of the updated statistical feature representation vectors are computed and then spliced, the calculation process being:

r^- = |r_i − r_j|;
r^∘ = r_i ⊙ r_j;
r_s = [r^-; r^∘].
5. The short text matching method based on semantics and shallow features as claimed in claim 1, wherein the calculation process of the concatenation prediction is as follows:

o = softmax(MLP([u_s, r_s])).
CN202110373418.7A 2021-04-07 2021-04-07 Short text matching method based on semantics and shallow features Active CN112966073B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110373418.7A | 2021-04-07 | 2021-04-07 | Short text matching method based on semantics and shallow features

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110373418.7A | 2021-04-07 | 2021-04-07 | Short text matching method based on semantics and shallow features

Publications (2)

Publication Number | Publication Date
CN112966073A (en) | 2021-06-15
CN112966073B (en) | 2023-01-06

Family

Family ID: 76281435

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110373418.7A (Active, granted as CN112966073B) | Short text matching method based on semantics and shallow features | 2021-04-07 | 2021-04-07

Country Status (1)

Country Link
CN (1) CN112966073B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656547B (en) * 2021-08-17 2023-06-30 平安科技(深圳)有限公司 Text matching method, device, equipment and storage medium
CN114282646B (en) * 2021-11-29 2023-08-25 淮阴工学院 Optical power prediction method and system based on two-stage feature extraction and BiLSTM improvement
CN115600580B (en) * 2022-11-29 2023-04-07 深圳智能思创科技有限公司 Text matching method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019220142A (en) * 2018-06-18 2019-12-26 日本電信電話株式会社 Answer learning device, answer learning method, answer generating device, answer generating method, and program
WO2021021330A1 (en) * 2019-07-30 2021-02-04 Intuit Inc. Neural network system for text classification

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7275661B2 (en) * 2019-03-01 2023-05-18 日本電信電話株式会社 Sentence generation device, sentence generation method, sentence generation learning device, sentence generation learning method and program
CN110348016B (en) * 2019-07-15 2022-06-14 昆明理工大学 Text abstract generation method based on sentence correlation attention mechanism
CN110781680B (en) * 2019-10-17 2023-04-18 江南大学 Semantic similarity matching method based on twin network and multi-head attention mechanism
CN112084336A (en) * 2020-09-09 2020-12-15 浙江综合交通大数据中心有限公司 Entity extraction and event classification method and device for expressway emergency


Also Published As

Publication number | Publication date
CN112966073A (en) | 2021-06-15

Similar Documents

Publication Publication Date Title
CN112966073B (en) Short text matching method based on semantics and shallow features
Zhang et al. Deep Neural Networks in Machine Translation: An Overview.
CN112733541A (en) Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism
CN109408812A (en) A method of the sequence labelling joint based on attention mechanism extracts entity relationship
CN111611810B (en) Multi-tone word pronunciation disambiguation device and method
CN110990555B (en) End-to-end retrieval type dialogue method and system and computer equipment
CN114943230A (en) Chinese specific field entity linking method fusing common knowledge
CN113987169A (en) Text abstract generation method, device and equipment based on semantic block and storage medium
CN111767718A (en) Chinese grammar error correction method based on weakened grammar error feature representation
CN111274804A (en) Case information extraction method based on named entity recognition
CN115759119B (en) Financial text emotion analysis method, system, medium and equipment
CN112183083A (en) Abstract automatic generation method and device, electronic equipment and storage medium
CN112905736A (en) Unsupervised text emotion analysis method based on quantum theory
CN113051922A (en) Triple extraction method and system based on deep learning
CN113326702A (en) Semantic recognition method and device, electronic equipment and storage medium
CN114443813A (en) Intelligent online teaching resource knowledge point concept entity linking method
CN110619119B (en) Intelligent text editing method and device and computer readable storage medium
CN111444720A (en) Named entity recognition method for English text
CN111428518B (en) Low-frequency word translation method and device
CN113326367A (en) Task type dialogue method and system based on end-to-end text generation
CN116484852A (en) Chinese patent entity relationship joint extraction method based on relationship diagram attention network
CN116069924A (en) Text abstract generation method and system integrating global and local semantic features
CN115759102A (en) Chinese poetry wine culture named entity recognition method
CN113488196B (en) Drug specification text named entity recognition modeling method
CN113204679B (en) Code query model generation method and computer equipment

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant