CN111241254A - Statement similarity calculation method - Google Patents

Statement similarity calculation method

Info

Publication number
CN111241254A
CN111241254A
Authority
CN
China
Prior art keywords
word
sentence
calculating
statement
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911357252.9A
Other languages
Chinese (zh)
Inventor
陈旋
王冲
崇传兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Aijia Household Products Co Ltd
Original Assignee
Jiangsu Aijia Household Products Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Aijia Household Products Co Ltd
Priority to CN201911357252.9A priority Critical patent/CN111241254A/en
Publication of CN111241254A publication Critical patent/CN111241254A/en
Withdrawn legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 — Information retrieval of unstructured textual data
    • G06F16/33 — Querying
    • G06F16/332 — Query formulation
    • G06F16/3329 — Natural language query formulation or dialogue systems
    • G06F16/3331 — Query processing
    • G06F16/334 — Query execution
    • G06F16/3344 — Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for calculating sentence similarity, relating to the technical field of sentence analysis. The method specifically comprises the following steps: preparing a data set and preprocessing each sentence in it; training word vectors on the preprocessed sentence set; calculating the similarity of the word vectors of two sentences; performing hierarchical clustering on the sentence vectors to obtain a knowledge entry tree; and performing knowledge recommendation for a target sentence. Through clustering, the invention avoids linear similarity comparison, narrows the comparison range, and improves performance; it preserves the semantic features of sentences while improving performance, avoiding simple character-level retrieval.

Description

Statement similarity calculation method
Technical Field
The invention relates to the technical field of sentence analysis, and in particular to a sentence similarity calculation method.
Background
A question answering system (QA) is a high-level form of information retrieval system that can answer questions posed by users in natural language with accurate and concise natural language. In a simple implementation, a question answering system is built on a knowledge base: an answer is recommended by matching the most similar stored question. How can the most similar question be found among a massive number of knowledge entries? Traditionally this is done either by one-by-one linear comparison, which grows slower as the number of knowledge entries increases, or by first matching a subset of candidates with a search engine and then comparing linearly; but a search engine can only match sentences that share the same words, and often misses differently worded sentences with the same semantics.
Disclosure of Invention
The invention provides a method for calculating sentence similarity that narrows the range of knowledge entries considered for similarity calculation through hierarchical clustering; and because the clustering operates on sentence features, sentences with the same semantics but different wordings are not overlooked.
The invention adopts the following technical scheme for solving the technical problems:
A method for calculating sentence similarity, specifically comprising the following steps:
Step 1, prepare a data set Q = {Q_1, Q_2, Q_3, ..., Q_i}, where each Q_i is a sentence;
step2, preprocessing each statement of the data set;
step3, training the preprocessed sentence set to obtain word vectors;
step4, calculating the similarity of the word vectors of the two sentences;
Step 5, perform hierarchical clustering using the sentence vectors H_i of the sentences Q_i to obtain a knowledge entry tree;
Step 6, perform knowledge recommendation for the target sentence.
As a further preferred scheme of the sentence similarity calculation method, step 2, preprocessing each sentence of the data set, specifically comprises the following steps:
Step 2.1, word segmentation: perform word segmentation on each sentence Q_i;
Step 2.2, stop words: remove words that carry little meaning;
Step 2.3, remove digits, Chinese and English punctuation marks, and other meaningless non-Chinese symbols, such as %, ¥, and !;
Step 2.4, word filtering: filter the word segmentation result of each sentence Q_i to obtain a result set S = {S_1, S_2, S_3, ..., S_i}, where each S_i is a word;
For a word's occurrence count over all sentences Q_i, define a minimum occurrence count min and a maximum occurrence count max, and eliminate words occurring fewer than min or more than max times. Words occurring more than max times are too common to be representative, and words occurring fewer than min times are so rare that their overly strong features would distort decisions; only words whose occurrence count lies in (min, max) are retained;
Step 2.5, the filtering result may contain repeated words; removing the duplicates yields the de-duplicated word result set T of Q_i:
T = {T_1, T_2, T_3, ..., T_i}
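A minimal sketch of steps 2.1–2.5 follows. It assumes the jieba library for Chinese word segmentation; the stop-word list, symbol set, and min/max thresholds are illustrative placeholders, not values fixed by the patent.

```python
from collections import Counter

import jieba  # assumed segmentation library; any Chinese tokenizer would do

STOP_WORDS = {"的", "了", "吗", "呢"}   # hypothetical stop-word list (step 2.2)
SYMBOLS = set("%¥!，。！？、.,?")        # symbols to strip (step 2.3)

def preprocess(sentences, min_count=0, max_count=1000):
    # Step 2.1: segment each sentence Q_i into words.
    segmented = [jieba.lcut(q) for q in sentences]
    # Steps 2.2-2.3: drop stop words, digits, punctuation, and whitespace.
    segmented = [[w for w in words
                  if w not in STOP_WORDS and w not in SYMBOLS
                  and not w.isdigit() and not w.isspace()]
                 for words in segmented]
    # Step 2.4: keep only words whose corpus-wide count lies in (min, max).
    counts = Counter(w for words in segmented for w in words)
    filtered = [[w for w in words if min_count < counts[w] < max_count]
                for words in segmented]
    # Step 2.5: de-duplicate words within each sentence (result set T).
    return [list(dict.fromkeys(words)) for words in filtered]
```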
As a further preferred scheme of the sentence similarity calculation method, step 3 is specifically as follows:
Compute TF-IDF word vectors for all words T_i:
Step 3.1, the basic formula for the term frequency (TF) of word T_i in a single sentence Q_i is:
TF(T_i) = count(T_i) / count(S_i)
where count(T_i) is the number of occurrences of word T_i in the word segmentation result of sentence Q_i, and count(S_i) is the total number of words in the word segmentation result of sentence Q_i; calculate the term frequency of each word T_i in the word segmentation result of sentence Q_i;
Step 3.2, the basic formula for the inverse document frequency (IDF) of word T_i in a single sentence Q_i is:
IDF(T_i) = log(N / N(T_i))
where N is the total number of sentences Q_i and N(T_i) is the number of sentences Q_i containing word T_i; calculate the IDF value of each word T_i in the word segmentation result of sentence Q_i;
Step 3.3, calculate the TF-IDF value of a word by the formula:
TF-IDF(T_i) = TF(T_i) × IDF(T_i)
Step 3.4, calculate the TF-IDF value of each word T_i in the word segmentation result set of each sentence Q_i, and assemble these TF-IDF values into a one-dimensional vector, namely the sentence vector H_i of sentence Q_i.
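A sketch of steps 3.1–3.4, assuming the sentence vectors are laid out over a shared global vocabulary so that different H_i are comparable (the vocabulary ordering is an implementation choice the patent leaves open):

```python
import math

def tfidf_vectors(token_lists):
    # Shared vocabulary; its ordering fixes the vector dimensions.
    vocab = sorted({w for words in token_lists for w in words})
    n = len(token_lists)                                  # N: number of sentences
    # N(T_i): number of sentences containing each word.
    doc_freq = {w: sum(w in words for words in token_lists) for w in vocab}
    vectors = []
    for words in token_lists:
        total = len(words) or 1                           # count(S_i)
        vec = []
        for w in vocab:
            tf = words.count(w) / total                   # step 3.1
            idf = math.log(n / doc_freq[w])               # step 3.2
            vec.append(tf * idf)                          # step 3.3
        vectors.append(vec)                               # sentence vector H_i
    return vocab, vectors
```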
As a further preferred scheme of the sentence similarity calculation method, in step 4 the cosine similarity formula is used to calculate the similarity of the word vectors of two sentences; for two sentence vectors H_a and H_b:
cos(H_a, H_b) = (H_a · H_b) / (‖H_a‖ × ‖H_b‖)
The larger the cosine similarity value, the more similar the two sentences.
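A direct transcription of the cosine similarity formula above (the guard against zero-length vectors is a defensive addition, not part of the patent):

```python
import math

def cosine(a, b):
    # cos(H_a, H_b) = (H_a . H_b) / (|H_a| * |H_b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0
```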
As a further preferred scheme of the sentence similarity calculation method, step 5 is specifically as follows:
Step 5.1, treat the sentence vector H_i of each sentence Q_i as its own class;
Step 5.2, find the two classes with the greatest inter-class similarity and merge them into one class, reducing the total number of classes by one;
If a class contains only a single knowledge entry, calculate the cosine similarity directly, and take the average vector of the two merged entries as the sentence vector of the class node;
If a class contains subclasses, compute average sentence vectors upward from the smallest subclass in turn until the class is represented by a single average sentence vector, and then calculate the cosine similarity;
Step 5.3, repeat step 5.2 until only one class remains, yielding the hierarchical clustering tree.
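A sketch of step 5 as bottom-up (agglomerative) clustering. Each internal node stores the average of its two children's sentence vectors, matching the worked example later in the description; the dict-based node layout is an illustrative choice, not something the patent mandates.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_tree(vectors):
    # Step 5.1: every sentence vector starts as its own class (a leaf).
    nodes = [{"vector": v, "children": [], "index": i}
             for i, v in enumerate(vectors)]
    # Steps 5.2-5.3: repeatedly merge the two most similar classes until
    # one class (the root of the knowledge entry tree) remains.
    while len(nodes) > 1:
        _, i, j = max((cosine(a["vector"], b["vector"]), i, j)
                      for i, a in enumerate(nodes)
                      for j, b in enumerate(nodes) if i < j)
        merged = {"vector": [(x + y) / 2 for x, y in
                             zip(nodes[i]["vector"], nodes[j]["vector"])],
                  "children": [nodes[i], nodes[j]],
                  "index": None}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(merged)
    return nodes[0]
```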
As a further preferred scheme of the sentence similarity calculation method, step 6 is specifically as follows:
Step 6.1, process the target sentence through steps 1 to 3 and assemble its word vectors into a sentence vector D;
Step 6.2, calculate the cosine similarity between D and each child node of depth 1 in the tree, and select the subtree with the maximum similarity;
Step 6.3, within that subtree, calculate the cosine similarity between D and each child node of depth 2, and again select the subtree with the maximum similarity;
Step 6.4, continue in this way until the most similar sentence is found.
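A sketch of the greedy top-down search of step 6, assuming the node layout and the cosine() helper from the clustering sketch above:

```python
def find_most_similar(root, d):
    # Steps 6.2-6.4: at each level, descend into the child subtree whose
    # average sentence vector is most similar to the query vector D.
    node = root
    while node["children"]:
        node = max(node["children"], key=lambda c: cosine(c["vector"], d))
    return node["index"]  # index of the most similar knowledge entry
```

Because each step discards all but one subtree, the number of comparisons grows with the tree depth rather than with the total number of knowledge entries, which is the source of the performance gain claimed below.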
Compared with the prior art, the invention has the following technical effects:
1. Clustering avoids linear similarity comparison, narrows the comparison range, and improves performance;
2. The semantic features of sentences are preserved while performance improves, avoiding simple character-level retrieval;
3. The amount of computation is greatly reduced, saving server resources and lowering costs;
4. The sentence-vector hierarchical clustering tree is constructed in advance, so the time-consuming work is done ahead of time and real-time retrieval performance improves.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of the hierarchical clustering operation of the present invention;
FIG. 3 is a schematic diagram of a hierarchical clustering tree in accordance with the present invention.
Detailed Description
The technical scheme of the invention is further explained in detail below with reference to the attached drawings:
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of them; all other embodiments obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention fall within the scope of the present invention.
As shown in FIG. 1, a method for calculating sentence similarity specifically comprises the following steps:
Step 1, prepare a data set Q = {Q_1, Q_2, Q_3, ..., Q_i}, where each Q_i is a sentence;
Step 2, preprocess each sentence of the data set;
Step 3, train the preprocessed sentence set to obtain word vectors;
Step 4, calculate the similarity of the word vectors of two sentences;
Step 5, perform hierarchical clustering using the sentence vectors H_i of the sentences Q_i to obtain a knowledge entry tree;
Step 6, perform knowledge recommendation for the target sentence.
As a further preferred scheme of the sentence similarity calculation method, step 2, preprocessing each sentence of the data set, specifically comprises the following steps:
Step 2.1, word segmentation: perform word segmentation on each sentence Q_i;
Step 2.2, stop words: remove words that carry little meaning;
Step 2.3, remove digits, Chinese and English punctuation marks, and other meaningless non-Chinese symbols, such as %, ¥, and !;
Step 2.4, word filtering: filter the word segmentation result of each sentence Q_i to obtain a result set S = {S_1, S_2, S_3, ..., S_i}, where each S_i is a word;
For a word's occurrence count over all sentences Q_i, define a minimum occurrence count min and a maximum occurrence count max, and eliminate words occurring fewer than min or more than max times. Words occurring more than max times are too common to be representative, and words occurring fewer than min times are so rare that their overly strong features would distort decisions; only words whose occurrence count lies in (min, max) are retained;
Step 2.5, the filtering result may contain repeated words; removing the duplicates yields the de-duplicated word result set T of Q_i:
T = {T_1, T_2, T_3, ..., T_i}
As a further preferred scheme of the sentence similarity calculation method, step 3 is specifically as follows:
Compute TF-IDF word vectors for all words T_i:
Step 3.1, the basic formula for the term frequency (TF) of word T_i in a single sentence Q_i is:
TF(T_i) = count(T_i) / count(S_i)
where count(T_i) is the number of occurrences of word T_i in the word segmentation result of sentence Q_i, and count(S_i) is the total number of words in the word segmentation result of sentence Q_i; calculate the term frequency of each word T_i in the word segmentation result of sentence Q_i;
Step 3.2, the basic formula for the inverse document frequency (IDF) of word T_i in a single sentence Q_i is:
IDF(T_i) = log(N / N(T_i))
where N is the total number of sentences Q_i and N(T_i) is the number of sentences Q_i containing word T_i; calculate the IDF value of each word T_i in the word segmentation result of sentence Q_i;
Step 3.3, calculate the TF-IDF value of a word by the formula:
TF-IDF(T_i) = TF(T_i) × IDF(T_i)
Step 3.4, calculate the TF-IDF value of each word T_i in the word segmentation result set of each sentence Q_i, and assemble these TF-IDF values into a one-dimensional vector, namely the sentence vector H_i of sentence Q_i.
As a further preferred scheme of the sentence similarity calculation method, in step 4 the cosine similarity formula is used to calculate the similarity of the word vectors of two sentences; for two sentence vectors H_a and H_b:
cos(H_a, H_b) = (H_a · H_b) / (‖H_a‖ × ‖H_b‖)
The larger the cosine similarity value, the more similar the two sentences.
As a further preferred scheme of the sentence similarity calculation method, step 5 is specifically as follows:
Step 5.1, treat the sentence vector H_i of each sentence Q_i as its own class;
Step 5.2, find the two classes with the greatest inter-class similarity and merge them into one class, reducing the total number of classes by one;
If a class contains only a single knowledge entry, calculate the cosine similarity directly, and take the average vector of the two merged entries as the sentence vector of the class node;
If a class contains subclasses, compute average sentence vectors upward from the smallest subclass in turn until the class is represented by a single average sentence vector, and then calculate the cosine similarity;
Step 5.3, repeat step 5.2 until only one class remains, yielding the hierarchical clustering tree.
As a further preferred scheme of the sentence similarity calculation method, step 6 is specifically as follows:
Step 6.1, process the target sentence through steps 1 to 3 and assemble its word vectors into a sentence vector D;
Step 6.2, calculate the cosine similarity between D and each child node of depth 1 in the tree, and select the subtree with the maximum similarity;
Step 6.3, within that subtree, calculate the cosine similarity between D and each child node of depth 2, and again select the subtree with the maximum similarity;
Step 6.4, continue in this way until the most similar sentence is found.
as shown in fig. 2, a simple hierarchical clustering operation assumes A, B, C, D, E are 5 knowledge entries, corresponding to 5 sentence vectors.
Step 1: each knowledge entry is an independent class, 5 classes in total;
Step 2: the two most similar classes, A and B, are found and merged; 4 classes remain: {(A, B), C, D, E};
Step 3: the two most similar classes, D and E, are found and merged; 3 classes remain: {(A, B), C, (D, E)};
Step 4: the average sentence vector of (A, B) and of (D, E) are computed, then the pairwise similarities with C are calculated; finding that sim(C, (A, B)) > sim(C, (D, E)) > sim((A, B), (D, E)), the classes (A, B) and C are merged into a new class; 2 classes remain: {((A, B), C), (D, E)};
Step 5: only two classes remain, so they are merged directly to obtain the final class {(((A, B), C), (D, E))}.
The final hierarchical clustering tree is shown in FIG. 3:
Each cluster node (the black nodes C1–C4 in the figure, the root included) stores the average of its children's sentence vectors: C1 is the average of the sentence vectors of A and B; C2 is the average of the sentence vectors of C1 and C; and so on.
1) Perform knowledge recommendation for the target sentence:
a) process the target sentence through steps 1 to 3 to obtain a sentence vector D;
b) in the tree, calculate the cosine similarity between D and each child node of depth 1, and select the subtree with the maximum similarity;
c) within that subtree, calculate the cosine similarity between D and each child node of depth 2, and select the subtree with the maximum similarity;
d) and so on, until the most similar sentence is found.
The detailed implementation of the invention comprises the following steps:
1. Prepare the data set:
A. Where are you;
B. May I ask what the question is;
C. Beijing is very beautiful;
D. This leather boot's size is large. That size is appropriate;
E. The signal on your side is not very good; I can't hear clearly;
2. preprocessing a data set:
a) word segmentation;
a. [ where you are ]
B. [ asking questions, what, question, etc. ]
C. [ Beijing, very beautiful ]
D. [ this, only, leather boot, number, large, ., that, only, number, proper ]
E. [ you, that side, signal, not too much, good, listen, not too much, clear ]
b) Stop words:
a. [ where you are ]
B. [ asking questions, what, question, etc. ]
C. [ Beijing, beauty ]
D. [ leather boot, number, large, number, proper ]
E. [ signal, less, good, listen, less, clear ]
c) And (3) word filtering:
a. [ where you are ]
B. [ asking questions, what, question, etc. ]
C. [ Beijing, beauty ]
D. [ leather boot, number, large, number, proper ]
E. [ signal, less, good, listen, less, clear ]
3. Word vector training:
Train the data set from step 2 into sentence vectors according to the TF-IDF word vector training method described above, and construct the hierarchical clustering tree.
4. Find similar sentences for the target sentence:
For example, we need to find sentences similar to "This leather boot's size is not small; that one is more suitable";
Word segmentation result: [ this, only, leather boot, number, not small, that, only, more suitable ];
After stop-word removal: [ leather boot, number, small, suitable ];
Calculate its sentence vector QV;
Searching the hierarchical clustering tree then finds the most similar sentence to be "This leather boot's size is large. That size is appropriate".
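The worked example can be composed end to end from the sketches above (tfidf_vectors, build_tree, and find_most_similar as defined earlier). Segmentation is skipped here: the token lists below stand in for the filtered, de-duplicated word lists A–E, glossed in English.

```python
import math

corpus = [
    ["you", "where"],                               # A
    ["ask", "what", "question"],                    # B
    ["Beijing", "beautiful"],                       # C
    ["leather boot", "number", "large", "proper"],  # D
    ["signal", "good", "hear", "clear"],            # E
]
vocab, vectors = tfidf_vectors(corpus)   # step 3: sentence vectors H_i
root = build_tree(vectors)               # step 5: knowledge entry tree

# Step 6: build the query vector QV over the same vocabulary and search.
query = ["leather boot", "number", "small", "proper"]
n = len(corpus)
qv = [(query.count(w) / len(query)) *
      math.log(n / sum(w in doc for doc in corpus))
      for w in vocab]
print("Most similar entry:", find_most_similar(root, qv))  # expected: 3 (D)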
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modifications made on the basis of the technical scheme according to the technical idea of the present invention fall within the protection scope of the present invention. While the embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (6)

1. A method of calculating sentence similarity, characterized in that the method specifically comprises the following steps:
step 1, preparing a data set Q = {Q_1, Q_2, Q_3, ..., Q_i}, where each Q_i is a sentence;
step 2, preprocessing each sentence of the data set;
step 3, training the preprocessed sentence set to obtain word vectors;
step 4, calculating the similarity of the word vectors of two sentences;
step 5, performing hierarchical clustering using the sentence vectors H_i of the sentences Q_i to obtain a knowledge entry tree;
step 6, performing knowledge recommendation for the target sentence.
2. The method of claim 1, characterized in that step 2, preprocessing each sentence of the data set, specifically comprises the following steps:
step 2.1, word segmentation: performing word segmentation on each sentence Q_i;
step 2.2, stop words: removing words that carry little meaning;
step 2.3, removing digits, Chinese and English punctuation marks, and other meaningless non-Chinese symbols, such as %, ¥, and !;
step 2.4, word filtering: filtering the word segmentation result of each sentence Q_i to obtain a result set S = {S_1, S_2, S_3, ..., S_i}, where each S_i is a word;
for a word's occurrence count over all sentences Q_i, defining a minimum occurrence count min and a maximum occurrence count max, and eliminating words occurring fewer than min or more than max times, words occurring more than max times being too common to be representative; only words whose occurrence count lies in (min, max) are retained;
step 2.5, removing the repeated words in the filtering result to obtain the de-duplicated word result set T of Q_i:
T = {T_1, T_2, T_3, ..., T_i}
3. The method of claim 1, characterized in that step 3 is specifically as follows:
computing TF-IDF word vectors for all words T_i:
step 3.1, the basic formula for the term frequency (TF) of word T_i in a single sentence Q_i is:
TF(T_i) = count(T_i) / count(S_i)
where count(T_i) is the number of occurrences of word T_i in the word segmentation result of sentence Q_i, and count(S_i) is the total number of words in the word segmentation result of sentence Q_i; calculating the term frequency of each word T_i in the word segmentation result of sentence Q_i;
step 3.2, the basic formula for the inverse document frequency (IDF) of word T_i in a single sentence Q_i is:
IDF(T_i) = log(N / N(T_i))
where N is the total number of sentences Q_i and N(T_i) is the number of sentences Q_i containing word T_i; calculating the IDF value of each word T_i in the word segmentation result of sentence Q_i;
step 3.3, calculating the TF-IDF value of a word by the formula:
TF-IDF(T_i) = TF(T_i) × IDF(T_i)
step 3.4, calculating the TF-IDF value of each word T_i in the word segmentation result set of each sentence Q_i, and assembling these TF-IDF values into a one-dimensional vector, namely the sentence vector H_i of sentence Q_i.
4. The method of claim 1, characterized in that in step 4 the cosine similarity formula is used to calculate the similarity of the word vectors of two sentences; for two sentence vectors H_a and H_b:
cos(H_a, H_b) = (H_a · H_b) / (‖H_a‖ × ‖H_b‖)
the larger the cosine similarity value, the more similar the two sentences.
5. The method of claim 1, characterized in that step 5 is specifically as follows:
step 5.1, treating the sentence vector H_i of each sentence Q_i as its own class;
step 5.2, finding the two classes with the greatest inter-class similarity and merging them into one class, reducing the total number of classes by one;
if a class contains only a single knowledge entry, calculating the cosine similarity directly, and taking the average vector of the two merged entries as the sentence vector of the class node;
if a class contains subclasses, computing average sentence vectors upward from the smallest subclass in turn until the class is represented by a single average sentence vector, and then calculating the cosine similarity;
step 5.3, repeating step 5.2 until only one class remains, yielding the hierarchical clustering tree.
6. The method of claim 1, characterized in that step 6 is specifically as follows:
step 6.1, processing the target sentence through steps 1 to 3 and assembling its word vectors into a sentence vector D;
step 6.2, calculating the cosine similarity between D and each child node of depth 1 in the tree, and selecting the subtree with the maximum similarity;
step 6.3, within that subtree, calculating the cosine similarity between D and each child node of depth 2, and again selecting the subtree with the maximum similarity;
step 6.4, continuing until the most similar sentence is found.
CN201911357252.9A 2019-12-25 2019-12-25 Statement similarity calculation method Withdrawn CN111241254A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911357252.9A CN111241254A (en) 2019-12-25 2019-12-25 Statement similarity calculation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911357252.9A CN111241254A (en) 2019-12-25 2019-12-25 Statement similarity calculation method

Publications (1)

Publication Number Publication Date
CN111241254A (en) 2020-06-05

Family

ID=70877560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911357252.9A Withdrawn CN111241254A (en) 2019-12-25 2019-12-25 Statement similarity calculation method

Country Status (1)

Country Link
CN (1) CN111241254A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673252A (en) * 2021-08-12 2021-11-19 之江实验室 Automatic join recommendation method for data table based on field semantics


Similar Documents

Publication Publication Date Title
CN110162593B (en) Search result processing and similarity model training method and device
CN107436864B (en) Chinese question-answer semantic similarity calculation method based on Word2Vec
CN111259653B (en) Knowledge graph question-answering method, system and terminal based on entity relationship disambiguation
CN109101479B (en) Clustering method and device for Chinese sentences
CN109960763B (en) Photography community personalized friend recommendation method based on user fine-grained photography preference
CN105183833B (en) Microblog text recommendation method and device based on user model
WO2017107566A1 (en) Retrieval method and system based on word vector similarity
CN106446148A (en) Cluster-based text duplicate checking method
CN110674252A (en) High-precision semantic search system for judicial domain
CN105528437A (en) Question-answering system construction method based on structured text knowledge extraction
CN104408115B (en) The heterogeneous resource based on semantic interlink recommends method and apparatus on a kind of TV platform
CN111680488A (en) Cross-language entity alignment method based on knowledge graph multi-view information
CN110347796A (en) Short text similarity calculating method under vector semantic tensor space
CN112632250A (en) Question and answer method and system under multi-document scene
CN115470344A (en) Video barrage and comment theme fusion method based on text clustering
CN105956158A (en) Automatic extraction method of network neologism on the basis of mass microblog texts and use information
Mason et al. Domain-specific image captioning
CN115858750A (en) Power grid technical standard intelligent question-answering method and system based on natural language processing
CN111506726A (en) Short text clustering method and device based on part-of-speech coding and computer equipment
CN111241254A (en) Statement similarity calculation method
CN117743526A (en) Table question-answering method based on large language model and natural language processing
CN117112727A (en) Large language model fine tuning instruction set construction method suitable for cloud computing service
CN110019714A (en) More intent query method, apparatus, equipment and storage medium based on historical results
Shuai et al. Question answering system based on knowledge graph of film culture
KR20200047272A (en) Indexing system and method using variational recurrent autoencoding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
Application publication date: 20200605