WO2021093871A1 - Text query method, text query device, and computer storage medium - Google Patents
Text query method, text query device, and computer storage medium
- Publication number: WO2021093871A1 (PCT/CN2020/128801)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sentence
- query
- document
- word
- vector
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3349—Reuse of stored results of previous queries
Definitions
- This application relates to the technical field of text query, and in particular to a text query method, a text query device, and a computer storage medium.
- in literature search, a user poses a question in a professional field, and the retrieval system must find the most relevant documents in the database and return them to the user. The user can then quickly obtain the needed literature, saving considerable time.
- this application provides a text query method, text query device, and computer storage medium, which can improve the accuracy and efficiency of text query.
- a technical solution adopted in this application is to provide a text query method.
- the method includes: based on a first word-level relevance between the query sentence and the document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining a first query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced; obtaining a phrase-level relevance between the query sentence and the document sentence from the first word-level relevance, and obtaining a second query result according to the phrase-level relevance; based on a second word-level relevance between the professional-domain vocabulary in the query sentence and the professional-domain vocabulary in the document sentence, introducing the attention mechanism into the query sentence and the document sentence, and obtaining a third query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced; and determining a final query result for the query sentence according to the first query result, the second query result, and the third query result.
- an attention mechanism is introduced into the query sentence and the document sentence, and the first query result is obtained according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced.
- determining the vector expressions of the query sentence and the document sentence includes: performing word segmentation and word embedding on the query sentence and the document sentence to obtain the vector expression Q (an n×k matrix) of the query sentence and the vector expression D (an m×k matrix) of the document sentence, where k is the dimension of the word embedding vector, n is the number of segmented words in the query sentence, and m is the number of segmented words in the document sentence; q_i denotes the vector expression of the i-th word in the query sentence, and d_j denotes the vector expression of the j-th word in the document sentence.
- calculating the word-level correlation matrix of the query sentence and the document sentence includes: calculating the n×m word-level correlation matrix M, whose element M_ij in row i and column j is the inner product M_ij = q_i · d_j, where q_i is the vector corresponding to the i-th word in the query sentence and d_j is the vector corresponding to the j-th word in the document sentence.
- introducing the attention mechanism into the vector expressions of the query sentence and the document sentence includes: calculating, on the basis of the correlation matrix, the attention-weighted vector expressions, where q̃_i denotes the vector of the i-th word in the query sentence after the attention mechanism is introduced, and d̃_j denotes the vector of the j-th word in the document sentence after the attention mechanism is introduced.
- obtaining the first query result includes: calculating, for each word in the query sentence and the document sentence, the Hadamard product of its two vectors before and after the attention mechanism is introduced; splicing, for each word, the two vectors and their Hadamard product into a spliced vector; calculating the correlation matrix between the spliced vectors of the query sentence and the spliced vectors of the document sentence; and performing pooling on this correlation matrix to obtain the first query result.
- performing the pooling operation on the correlation matrix of the spliced vectors of the query sentence and the document sentence to obtain the first query result includes: pooling the correlation matrix to obtain a first intermediate vector, and computing an idf-weighted sum of its entries, where idf_i is the inverse document frequency of the i-th word in the query sentence, determined from the total number of documents in the corpus and df_i, the number of documents in the corpus containing the i-th word.
- obtaining the phrase-level relevance of the query sentence and the document sentence, and obtaining the second query result according to the phrase-level relevance, includes: performing an average pooling operation with a sliding window of size 2×2 on the first word-level correlation matrix to obtain a first matrix; performing a row-wise maximum pooling operation on the first matrix to obtain a second intermediate vector; and computing an idf-weighted sum of its entries, where idf_i is the inverse document frequency of the i-th word in the query sentence, determined from the total number of documents in the corpus and df_i, the number of documents containing the i-th word.
- introducing the attention mechanism into the query sentence and the document sentence, and obtaining the third query result according to their relevance after the attention mechanism is introduced, includes: determining the vector expressions of the professional-field vocabulary; extracting the professional-field vocabulary contained in the query sentence and the document sentence to form new vector expressions; calculating the word-level correlation matrix of the professional-field vocabulary in the query sentence and the document sentence; introducing the attention mechanism into these vector expressions on the basis of the correlation matrix; and obtaining the third query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced.
- a technical solution adopted in the present application is to provide a text query device, which includes a processor and a memory, where the memory stores program data and the processor executes the program data to implement the above method.
- a technical solution adopted in this application is to provide a computer storage medium storing program data which, when executed by a processor, implements the above method.
- the text query method provided in this application includes: based on the first word-level relevance between the query sentence and the document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining the first query result according to their relevance after the attention mechanism is introduced; obtaining the phrase-level relevance of the query sentence and the document sentence from the first word-level relevance, and obtaining the second query result according to the phrase-level relevance; based on the second word-level relevance between the professional-domain vocabulary in the query sentence and that in the document sentence, introducing the attention mechanism into the query sentence and the document sentence, and obtaining the third query result according to their relevance after the attention mechanism is introduced.
- first, comparing relevance at both the word and phrase levels gives better recognition of documents in professional fields.
- second, adding professional vocabulary to the matching effectively addresses the lack of professional-knowledge background in existing retrieval networks.
- FIG. 1 is a schematic flowchart of an embodiment of a text query method provided by this application
- FIG. 2 is a schematic diagram of the flow of step 11 in FIG. 1;
- FIG. 3 is a schematic flowchart of step 114 in FIG. 2;
- FIG. 4 is a schematic diagram of the flow of step 12 in FIG. 1;
- FIG. 5 is a schematic diagram of the flow of step 13 in FIG. 1;
- FIG. 6 is a schematic structural diagram of an embodiment of a text query device provided by the present application.
- FIG. 7 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
- Fig. 1 is a schematic flowchart of an embodiment of a text query method provided by the present application, and the method includes:
- Step 11 Based on the first word-level relevance between the query sentence and the document sentence, introduce an attention mechanism into the query sentence and the document sentence, and obtain the first query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced.
- the word-level correlation matrix is first obtained through the vector inner product, and the attention mechanism is applied on the basis of this matrix to obtain a new vector expression for each word. A vector expression for each word in the query sentence is then obtained through a maximum pooling operation. Finally, an inverse-document-frequency-weighted sum yields the final score.
- using the attention mechanism makes each word more sensitive to its related words, which helps improve document retrieval results.
- step 11 may specifically include the following steps:
- Step 111 Determine the vector expression of the query sentence and the document sentence.
- k represents the dimension of the word embedding vector
- n represents the number of segmented words in the query sentence sequence
- m represents the number of segmented words in the document sentence.
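Step 111 can be sketched as a simple embedding lookup over the segmented words. This is a minimal illustration; the zero-vector fallback for out-of-vocabulary tokens is an assumption, not something the patent specifies:

```python
import numpy as np

def embed_sentence(tokens, emb, k):
    """Stack one k-dimensional embedding vector per segmented token.
    Unknown tokens fall back to a zero vector (illustrative assumption)."""
    return np.stack([emb.get(t, np.zeros(k)) for t in tokens])
```

Applying this to the query sentence yields the n×k matrix Q, and to the document sentence the m×k matrix D.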
- Step 112 Calculate the word-level correlation matrix of the query sentence and the document sentence.
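Since the text states that the word-level correlation matrix is obtained through the vector inner product, step 112 reduces to one matrix product; a minimal sketch:

```python
import numpy as np

def word_level_correlation(Q, D):
    """M[i, j] is the inner product of the i-th query word vector
    (row of Q, n x k) and the j-th document word vector (row of D, m x k)."""
    return Q @ D.T  # n x m word-level correlation matrix
```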
- Step 113 Based on the word-level correlation matrix of the query sentence and the document sentence, an attention mechanism is introduced to the vector expression of the query sentence and the document sentence.
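The patent's exact attention formulas are given as images and are not reproduced here, so the sketch below uses a standard cross-attention form as an assumption: each word is re-expressed as a softmax-weighted sum of the other sentence's word vectors, with weights taken from the correlation matrix.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def introduce_attention(Q, D):
    """Cross-attention over the word-level correlation matrix (assumed form).
    Q: n x k query word vectors, D: m x k document word vectors."""
    M = Q @ D.T                       # n x m word-level correlation
    Q_att = softmax(M, axis=1) @ D    # each query word attends over document words
    D_att = softmax(M, axis=0).T @ Q  # each document word attends over query words
    return Q_att, D_att
```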
- Step 114 Obtain the first query result according to the correlation between the query sentence and the document sentence after the attention mechanism is introduced.
- step 114 may specifically include the following steps:
- Step 1141 Calculate, for each word in the query sentence and the document sentence, the Hadamard product of its two vectors before and after the attention mechanism is introduced.
- here, the Hadamard product means multiplying the two vectors element by element.
- Step 1142 For each word in the query sentence and the document sentence, splice the vector before attention, the vector after attention, and their Hadamard product into a spliced vector.
- Step 1143 Calculate the correlation matrix between the spliced vectors of the query sentence and the spliced vectors of the document sentence.
- Step 1144 Perform a pooling operation on this correlation matrix to obtain the first query result.
- a pooling operation is performed on the correlation matrix of the spliced vectors of the query sentence and the document sentence to obtain a first intermediate vector, whose i-th entry is the maximum value of the i-th row of the matrix. The first score is then the idf-weighted sum of this vector, where idf_i is the inverse document frequency of the i-th word in the query sentence, determined from the total number of documents in the corpus and df_i, the number of documents containing the i-th word.
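Steps 1141 to 1144 can be sketched as below. The splicing layout [original ; attended ; Hadamard product] and the log form of the inverse document frequency are assumptions consistent with the description, not formulas confirmed by the patent (whose formula images are not available):

```python
import numpy as np

def idf_values(num_docs, df):
    """idf_i = log(N / df_i): a standard inverse-document-frequency form
    matching the quantities described in the text (assumed)."""
    return np.log(num_docs / df)

def first_score(Q, D, Q_att, D_att, idf):
    """Splice each word's vectors before/after attention plus their Hadamard
    product, correlate the spliced vectors, row-max pool, then idf-weight."""
    Qs = np.concatenate([Q, Q_att, Q * Q_att], axis=1)  # n x 3k spliced vectors
    Ds = np.concatenate([D, D_att, D * D_att], axis=1)  # m x 3k spliced vectors
    M = Qs @ Ds.T                  # correlation of spliced vectors
    v = M.max(axis=1)              # first intermediate vector (row-wise max)
    return float(idf @ v)          # idf-weighted sum -> first score
```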
- Step 12 According to the first word-level relevance, obtain the phrase-level relevance of the query sentence and the document sentence, and obtain the second query result according to the phrase-level relevance.
- the word-level correlation matrix obtained by the vector inner product is first subjected to an average pooling operation with a 2×2 sliding window, and then to a maximum pooling operation to obtain a phrase-level vector expression; finally, an inverse-document-frequency-weighted sum again yields the final phrase-level score.
- step 12 may specifically include:
- Step 121 Perform an average pooling operation with a sliding window of size 2×2 on the first word-level correlation matrix to obtain the first matrix.
- denote the previously computed first word-level correlation matrix by M; the entry of the first matrix in row wi and column wj is the average of the 2×2 block of M whose top-left element is M_{wi,wj}, so wi ranges from 1 to n-1 and wj from 1 to m-1.
- Step 122 Perform a row-wise maximum pooling operation on the first matrix to obtain a second intermediate vector, whose i-th entry is the maximum value of the i-th row of the first matrix.
- Step 123 Calculate the second score as the idf-weighted sum of the second intermediate vector, where idf_i is the inverse document frequency of the i-th word in the query sentence, determined from the total number of documents in the corpus and df_i, the number of documents containing the i-th word.
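Steps 121 to 123 can be sketched as follows. Stride 1 for the 2×2 sliding window and the truncation of the idf weights to the pooled rows are assumptions for illustration:

```python
import numpy as np

def phrase_score(M, idf):
    """2x2 sliding-window average pooling over the word-level correlation
    matrix M (stride 1 assumed), row-wise max pooling, idf-weighted sum."""
    A = (M[:-1, :-1] + M[1:, :-1] + M[:-1, 1:] + M[1:, 1:]) / 4.0  # first matrix
    v = A.max(axis=1)                     # second intermediate vector
    return float(idf[: v.shape[0]] @ v)   # idf weights aligned to pooled rows
```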
- Step 13 Based on the second word-level relevance between the professional-domain vocabulary in the query sentence and that in the document sentence, introduce the attention mechanism into the query sentence and the document sentence, and obtain the third query result according to their relevance after the attention mechanism is introduced.
- the words in the knowledge dictionary are converted into vector representations using the TransE algorithm. The dictionary words contained in the query sentence and in the document to be retrieved are extracted to form vector expressions; the correlation matrix is again obtained through the vector inner product, and the attention mechanism is applied on its basis to obtain the corresponding vector expressions. Finally, the final score is obtained through average pooling and maximum pooling.
- step 13 may specifically include:
- Step 131 Determine the vector expression of the vocabulary in the professional field.
- the legal professional vocabulary is taken as an example.
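For context on step 131: TransE, the knowledge-graph embedding method named in the text, learns vectors so that head + relation ≈ tail for true triples; its plausibility score is the distance below. This is general background on the algorithm, not the patent's own formula:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE dissimilarity: a smaller ||h + r - t|| means the triple
    (head, relation, tail) is more plausible."""
    return float(np.linalg.norm(h + r - t))
```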
- Step 132 Extract the professional domain vocabulary from the query sentence and the document sentence to form a new vector expression.
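Step 132 amounts to filtering each sentence down to the tokens found in the knowledge dictionary and stacking their TransE vectors; a minimal sketch (the empty-matrix fallback for sentences with no dictionary hits is an assumption):

```python
import numpy as np

def extract_domain_vectors(tokens, transe_emb, k):
    """Keep only tokens present in the domain knowledge dictionary and
    stack their TransE vectors into a new (hits x k) vector expression."""
    hits = [transe_emb[t] for t in tokens if t in transe_emb]
    return np.stack(hits) if hits else np.zeros((0, k))
```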
- k represents the dimension of the vectors obtained by TransE embedding of the entries in the professional vocabulary
- n represents the number of segmented words of the query sentence that appear in the professional-field vocabulary
- m represents the number of segmented words of the document sentence that appear in the professional-field vocabulary
- q_i represents the vector expression of the i-th professional-vocabulary term in the query sentence, and d_j represents that of the j-th term in the document sentence
- Step 133 Calculate the word-level correlation matrix of the professional-field vocabulary in the query sentence and in the document sentence.
- Step 134 Based on the word-level correlation matrix of the professional-field vocabulary in the query sentence and the document sentence, introduce the attention mechanism into their vector expressions.
- Step 135 Obtain the third query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced.
- the subsequent steps 133 to 135 can follow the same procedure as step 11 above: introduce the attention mechanism into the vector expressions on the basis of the correlation matrix to obtain new vectors, and calculate the correlation again to obtain the third score.
- Step 14 Determine the final query result based on the query statement according to the first query result, the second query result, and the third query result.
- the first score, the second score, and the third score can be averaged to obtain the final score for judging whether the query sentence is related to the document sentence, or they can be summed with certain weights to obtain the final score; no restriction is imposed here.
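Step 14, with both options the text allows (plain average or weighted sum), can be sketched as:

```python
def combine_scores(s1, s2, s3, weights=None):
    """Final score from the three branch scores: a simple average by
    default, or a weighted sum when weights are supplied."""
    if weights is None:
        return (s1 + s2 + s3) / 3.0
    w1, w2, w3 = weights
    return w1 * s1 + w2 * s2 + w3 * s3
```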
- the text query method provided in this embodiment includes: based on the first word-level relevance between the query sentence and the document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining the first query result according to their relevance after the attention mechanism is introduced; obtaining the phrase-level relevance of the query sentence and the document sentence from the first word-level relevance, and obtaining the second query result according to the phrase-level relevance; based on the second word-level relevance between the professional-domain vocabulary in the query sentence and that in the document sentence, introducing the attention mechanism into the query sentence and the document sentence, and obtaining the third query result according to their relevance after the attention mechanism is introduced; and determining the final query result for the query sentence according to the first, second, and third query results.
- first, comparing relevance at both the word and phrase levels gives better recognition of documents in professional fields.
- second, adding professional vocabulary to the matching effectively addresses the lack of professional-knowledge background in existing retrieval networks.
- the text query device 60 includes a processor 61 and a memory 62, where the memory 62 stores program data and the processor 61 executes the program data to implement the following method steps:
- based on the first word-level relevance between the query sentence and the document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining the first query result according to their relevance after the attention mechanism is introduced;
- obtaining the phrase-level relevance of the query sentence and the document sentence from the first word-level relevance, and obtaining the second query result according to the phrase-level relevance;
- based on the second word-level relevance between the professional-domain vocabulary in the query sentence and that in the document sentence, introducing the attention mechanism into the query sentence and the document sentence, and obtaining the third query result according to their relevance after the attention mechanism is introduced; and determining the final query result for the query sentence according to the first, second, and third query results.
- FIG. 7 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
- the computer storage medium 70 stores program data 71, and the program data 71, when executed by a processor, implements the following method steps:
- based on the first word-level relevance between the query sentence and the document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining the first query result according to their relevance after the attention mechanism is introduced;
- obtaining the phrase-level relevance of the query sentence and the document sentence from the first word-level relevance, and obtaining the second query result according to the phrase-level relevance;
- based on the second word-level relevance between the professional-domain vocabulary in the query sentence and that in the document sentence, introducing the attention mechanism into the query sentence and the document sentence, and obtaining the third query result according to their relevance after the attention mechanism is introduced; and determining the final query result for the query sentence according to the first, second, and third query results.
- when executed, the program data is also used to: determine the vector expressions of the query sentence and the document sentence; calculate the word-level correlation matrix of the query sentence and the document sentence; introduce an attention mechanism into the vector expressions of the query sentence and the document sentence based on the word-level correlation matrix; and obtain the first query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced.
- determining the vector expressions of the query sentence and the document sentence includes: performing word segmentation and word embedding on the query sentence and the document sentence to obtain the vector expression Q (an n×k matrix) of the query sentence and the vector expression D (an m×k matrix) of the document sentence, where k is the dimension of the word embedding vector, n is the number of segmented words in the query sentence, and m is the number of segmented words in the document sentence; q_i denotes the vector expression of the i-th word in the query sentence, and d_j denotes the vector expression of the j-th word in the document sentence.
- calculating the word-level correlation matrix of the query sentence and the document sentence includes: calculating the n×m word-level correlation matrix M, whose element M_ij in row i and column j is the inner product M_ij = q_i · d_j, where q_i is the vector corresponding to the i-th word in the query sentence and d_j is the vector corresponding to the j-th word in the document sentence.
- introducing the attention mechanism into the vector expressions of the query sentence and the document sentence includes: calculating, on the basis of the correlation matrix, the attention-weighted vector expressions, where q̃_i denotes the vector of the i-th word in the query sentence after the attention mechanism is introduced, and d̃_j denotes the vector of the j-th word in the document sentence after the attention mechanism is introduced.
- obtaining the first query result includes: calculating, for each word in the query sentence and the document sentence, the Hadamard product of its two vectors before and after the attention mechanism is introduced; splicing, for each word, the two vectors and their Hadamard product into a spliced vector; calculating the correlation matrix between the spliced vectors of the query sentence and the spliced vectors of the document sentence; and performing pooling on this correlation matrix to obtain the first query result.
- performing the pooling operation on the correlation matrix of the spliced vectors of the query sentence and the document sentence to obtain the first query result includes: pooling the correlation matrix to obtain the first intermediate vector, and computing an idf-weighted sum of its entries, where idf_i is the inverse document frequency of the i-th word in the query sentence, determined from the total number of documents in the corpus and df_i, the number of documents containing the i-th word.
- when executed, the program data is also used to: perform an average pooling operation with a sliding window of size 2×2 on the first word-level correlation matrix to obtain the first matrix; perform a row-wise maximum pooling operation on the first matrix to obtain the second intermediate vector; and compute an idf-weighted sum of its entries, where idf_i is the inverse document frequency of the i-th word in the query sentence, determined from the total number of documents in the corpus and df_i, the number of documents containing the i-th word.
- when executed, the program data is also used to: determine the vector expressions of the professional-field vocabulary; extract the professional-field vocabulary from the query sentence and the document sentence to form new vector expressions; calculate the word-level correlation matrix of the professional-field vocabulary in the query sentence and the document sentence; introduce an attention mechanism into the vector expressions on the basis of this correlation matrix; and obtain the third query result according to the relevance between the query sentence and the document sentence after the attention mechanism is introduced.
- the disclosed method and device may be implemented in other ways.
- the device implementation described above is merely illustrative.
- the division of the modules or units is only a logical function division.
- there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention concerns a text query method, a text query device, and a computer storage medium. The text query method comprises: based on a first word-level relevance between a query sentence and a document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining a first query result according to the relevance between the query sentence and the document sentence (11); according to the first word-level relevance, obtaining a phrase-level relevance between the query sentence and the document sentence, and obtaining a second query result according to the phrase-level relevance (12); based on a second word-level relevance between a professional-field term in the query sentence and a professional-field term in the document sentence, introducing an attention mechanism into the query sentence and the document sentence, and obtaining a third query result according to the relevance between the query sentence and the document sentence (13); and determining a final query result based on the query sentence according to the first, second, and third query results (14). The described method can improve the accuracy and efficiency of text query.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911114274.2 | 2019-11-14 | ||
CN201911114274.2A CN111159331B (zh) | 2019-11-14 | 2019-11-14 | Text query method, text query device, and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021093871A1 true WO2021093871A1 (fr) | 2021-05-20 |
Family
ID=70555994
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/128801 WO2021093871A1 (fr) | 2019-11-14 | 2020-11-13 | Text query method, text query device, and computer storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111159331B (fr) |
WO (1) | WO2021093871A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159331B (zh) * | 2019-11-14 | 2021-11-23 | 中国科学院深圳先进技术研究院 | Text query method, text query device, and computer storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160004784A1 (en) * | 2014-07-04 | 2016-01-07 | Samsung Electronics Co., Ltd. | Method of providing relevant information and electronic device adapted to the same |
CN107844469A (zh) * | 2017-10-26 | 2018-03-27 | 北京大学 | Text simplification method based on a word-vector query model |
CN108491433A (zh) * | 2018-02-09 | 2018-09-04 | 平安科技(深圳)有限公司 | Chat response method, electronic device, and storage medium |
CN109063174A (zh) * | 2018-08-21 | 2018-12-21 | 腾讯科技(深圳)有限公司 | Query answer generation method and device, computer storage medium, and electronic device |
CN111159331A (zh) * | 2019-11-14 | 2020-05-15 | 中国科学院深圳先进技术研究院 | Text query method, text query device, and computer storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6026388A (en) * | 1995-08-16 | 2000-02-15 | Textwise, Llc | User interface and other enhancements for natural language information retrieval system and method |
US10031913B2 (en) * | 2014-03-29 | 2018-07-24 | Camelot Uk Bidco Limited | Method, system and software for searching, identifying, retrieving and presenting electronic documents |
CN109472024B (zh) * | 2018-10-25 | 2022-10-11 | 安徽工业大学 | A text classification method based on a bidirectional recurrent attention neural network |
CN110347790B (zh) * | 2019-06-18 | 2021-08-10 | 广州杰赛科技股份有限公司 | Attention-mechanism-based text duplicate checking method, device, equipment, and storage medium |
- 2019-11-14: CN application CN201911114274.2 filed; granted as patent CN111159331B (active)
- 2020-11-13: PCT application PCT/CN2020/128801 (WO) filed
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160004784A1 (en) * | 2014-07-04 | 2016-01-07 | Samsung Electronics Co., Ltd. | Method of providing relevant information and electronic device adapted to the same |
CN107844469A (zh) * | 2017-10-26 | 2018-03-27 | 北京大学 | Text simplification method based on a word-vector query model |
CN108491433A (zh) * | 2018-02-09 | 2018-09-04 | 平安科技(深圳)有限公司 | Chat response method, electronic device, and storage medium |
CN109063174A (zh) * | 2018-08-21 | 2018-12-21 | 腾讯科技(深圳)有限公司 | Query answer generation method and device, computer storage medium, and electronic device |
CN111159331A (zh) * | 2019-11-14 | 2020-05-15 | 中国科学院深圳先进技术研究院 | Text query method, text query device, and computer storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111159331B (zh) | 2021-11-23 |
CN111159331A (zh) | 2020-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020062770A1 (fr) | Method and apparatus for constructing a domain dictionary, and device and storage medium | |
Zheng et al. | Learning to reweight terms with distributed representations | |
Thakkar et al. | Graph-based algorithms for text summarization | |
Wang et al. | Using word embeddings to enhance keyword identification for scientific publications | |
KR101923650B1 (ko) | Apparatus and method for sentence embedding and similar question retrieval | |
JP5216063B2 (ja) | Method and apparatus for determining the category of an unregistered word | |
Jabbar et al. | Empirical evaluation and study of text stemming algorithms | |
US20180260381A1 (en) | Prepositional phrase attachment over word embedding products | |
Anupriya et al. | LDA based topic modeling of journal abstracts | |
JP2002510076A (ja) | Information retrieval and speech recognition based on language models | |
CN109783806B (zh) | A text matching method using semantic parsing structures | |
CN107992477A (zh) | Text topic determination method, apparatus, and electronic device | |
El Mahdaouy et al. | Word-embedding-based pseudo-relevance feedback for Arabic information retrieval | |
CN103646112A (zh) | Domain adaptation method for dependency parsing using web search | |
KR102059743B1 (ko) | Method and system for retrieving passages from medical literature using a deep-learning-based knowledge structure generation method | |
CN111737997A (zh) | Text similarity determination method, device, and storage medium | |
CN110929498A (zh) | Short-text similarity calculation method and apparatus, and readable storage medium | |
CN110019474B (zh) | Method, apparatus, and electronic device for automatically associating synonymous data in heterogeneous databases | |
Sharma et al. | BioAMA: towards an end to end biomedical question answering system | |
JP5427694B2 (ja) | Related content presentation device and program | |
WO2021093871A1 (fr) | Text query method, text query device, and computer storage medium | |
Wang et al. | A joint chinese named entity recognition and disambiguation system | |
CN110442674B (zh) | Label propagation clustering method, terminal device, storage medium, and apparatus | |
TWI636370B (zh) | Method and computer program product for establishing chart indexes from text information | |
CN113868387A (zh) | A word2vec medical similar-question retrieval method based on improved TF-IDF weighting | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20887870; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: PCT application non-entry into the European phase | Ref document number: 20887870; Country of ref document: EP; Kind code of ref document: A1 |