WO2019091026A1 - Method for quick retrieval of knowledge base documents, application server, and computer-readable storage medium - Google Patents

Method for quick retrieval of knowledge base documents, application server, and computer-readable storage medium

Info

Publication number
WO2019091026A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
search
query
words
keyword
Prior art date
Application number
PCT/CN2018/077675
Other languages
English (en)
Chinese (zh)
Inventor
张师琲
侯丽
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019091026A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/237: Lexical tools
    • G06F40/247: Thesauruses; Synonyms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Definitions

  • The present application relates to the field of data analysis technologies, and in particular to a knowledge base document quick retrieval method, an application server, and a computer-readable storage medium.
  • The present application proposes a knowledge base document quick retrieval method and an application server, to solve the problems of how to retrieve files in a knowledge base quickly and accurately and how to quickly understand the main content of the retrieved files.
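  • The present application provides a knowledge base document quick retrieval method, the method comprising the steps of:
  • receiving search information input by a user;
  • analyzing and processing the search information to obtain query words;
  • searching the documents in the knowledge base according to the query words, and sorting the search results by search matching degree;
  • obtaining a summary and keywords of each document through a summary generation model and a keyword generation model; and
  • outputting the sorted search results together with the summary and keywords of the corresponding target document.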
  • The present application further provides an application server, including a memory, a processor, and a knowledge base document quick retrieval system stored on the memory and executable on the processor; when executed by the processor, the knowledge base document quick retrieval system implements the steps of the knowledge base document quick retrieval method described above.
  • The present application further provides a computer-readable storage medium storing a knowledge base document quick retrieval system, the system being executable by at least one processor to cause the at least one processor to perform the steps of the knowledge base document quick retrieval method described above.
  • In the knowledge base document quick retrieval method, application server, and computer-readable storage medium proposed by the present application, search information input by a user is first received; the search information is then analyzed and processed to obtain query words; the documents in the knowledge base are searched according to the query words, and the search results are sorted by search matching degree; a summary and keywords of each document are obtained through the summary generation model and the keyword generation model; and finally the sorted search results are output together with the summary and keywords of the corresponding target document.
  • With the knowledge base document quick retrieval method, application server, and computer-readable storage medium proposed in the present application, files in the knowledge base can be retrieved quickly and accurately, and the main content of the retrieved files can be understood quickly.
  • FIG. 1 is a schematic diagram of an optional hardware architecture of the application server of the present application
  • FIG. 2 is a schematic diagram of a program module of an implementation manner of a quick retrieval system of the knowledge base document of the present application
  • FIG. 3 is a schematic flowchart of a first embodiment of a method for quickly searching a knowledge base of the present application
  • FIG. 4 is a schematic flowchart of a second embodiment of a method for quickly searching a knowledge base of the present application
  • FIG. 5 is a schematic flowchart of a third embodiment of a method for quickly searching a knowledge base of the present application
  • FIG. 6 is a schematic flowchart of a fourth embodiment of a method for quickly searching a knowledge base of the present application
  • FIG. 7 is a schematic flowchart of a fifth embodiment of a method for quickly searching a knowledge base of the present application.
  • FIG. 8 is a schematic flowchart of a sixth embodiment of a method for quickly searching a knowledge base of the present application
  • FIG. 9 is a schematic flowchart diagram of a seventh embodiment of a quick retrieval method of the knowledge base document of the present application.
  • Referring to FIG. 1, a schematic diagram of an optional hardware architecture of the application server 1 of the present application is shown.
  • The application server 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13 that are communicably connected to each other through a system bus. Note that FIG. 1 shows the application server 1 with only the components 11-13; not all of the illustrated components are required, and more or fewer components may be implemented instead.
  • The application server 1 may be a computing device such as a rack server, a blade server, or a tower server.
  • The application server 1 may be an independent server or a server cluster composed of multiple servers.
  • The memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read only memory (ROM), an electrically erasable programmable read only memory (EEPROM), a programmable read only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like.
  • the memory 11 may be an internal storage unit of the application server 1, such as a hard disk or memory of the application server 1.
  • The memory 11 may also be an external storage device of the application server 1, such as a plug-in hard disk equipped on the application server 1, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.
  • the memory 11 can also include both the internal storage unit of the application server 1 and its external storage device.
  • the memory 11 is generally used to store an operating system installed in the application server 1 and various types of application software, such as program code of the knowledge base document quick retrieval system 200. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.
  • the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments.
  • the processor 12 is typically used to control the overall operation of the application server 1.
  • The processor 12 is configured to run program code or process data stored in the memory 11, for example to run the knowledge base document quick retrieval system 200.
  • the network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the application server 1 and other electronic devices.
  • The present application further proposes a knowledge base document quick retrieval system 200.
  • Referring to FIG. 2, a program module diagram of the first embodiment of the knowledge base document quick retrieval system 200 of the present application is shown.
  • The knowledge base document quick retrieval system 200 includes a series of computer program instructions stored on the memory 11; when the computer program instructions are executed by the processor 12, the knowledge base document quick retrieval operations of the embodiments of the present application are implemented.
  • The knowledge base document quick retrieval system 200 can be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in FIG. 2, the system is divided into an acquisition module 21, an analysis processing module 22, a retrieval module 23, a sorting module 24, an establishing module 25, a calling module 26, and an output module 27, where:
  • The acquisition module 21 is configured to receive search information input by a user.
  • The search information may take different forms depending on the situation.
  • The search information falls into three cases: in the first case, it is a sentence; in the second case, it is a word; in the third case, it includes both a sentence and a word.
  • The analysis processing module 22 is configured to analyze and process the search information to obtain query words. Three ways are available.
  • In the first way, for the case where the search information is a sentence, the input sentence is segmented into words through a combination of syntactic analysis and semantic analysis, meaningless words and symbols are removed, and several query words are extracted and passed to the retrieval module for searching. For example, if the user enters "How is the economic form of China this year?", analysis yields the key query words "China" and "Economy", while unimportant elements such as auxiliary words, interrogative words, and symbols are excluded.
  • In the second way, for the case where the search information is a word, the query word is conceptually expanded, according to a preset rule, into corresponding synonyms, near-synonyms, hypernyms, and hyponyms; some of the extended words are then extracted according to a synonym similarity algorithm, or the extended words selected by the user are received, and these are used as query words. Which extended words are chosen can depend on the priority level of each word. For example, if the user enters "college students", it can be extended to "undergraduate students", "graduate students", "specialist students", "college students", "secondary students", and so on.
  • The third way combines the first two. Specifically, the input is first segmented into words through a combination of semantic analysis and syntactic analysis; the resulting query words are then conceptually expanded into corresponding synonyms, near-synonyms, or hypernyms and hyponyms; some of the extended words are extracted according to the similarity priority algorithm, or the extended words selected by the user are received; and finally the query words, together with the chosen extended words, are passed to the retrieval module as the query conditions.
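
The third way (segment, filter, then expand) can be sketched roughly as follows. This is a minimal illustration only: the stop-word list, the `THESAURUS` dictionary, the function names, and the English example query are all assumptions introduced here, and the plain regular-expression tokenizer stands in for the combined syntactic and semantic analysis that the application describes but does not specify.

```python
import re

# Hypothetical resources: a real system would load a full stop-word list
# and a thesaurus of synonyms, hypernyms, and hyponyms.
STOP_WORDS = {"how", "is", "the", "of", "this", "year", "a", "an"}
THESAURUS = {
    "economy": ["gdp", "trade", "finance"],
    "china": ["domestic"],
}

def extract_query_words(text):
    """Split the input into lowercase tokens and drop stop words and symbols."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    seen, words = set(), []
    for tok in tokens:
        if tok not in STOP_WORDS and tok not in seen:
            seen.add(tok)
            words.append(tok)
    return words

def expand_query_words(words, max_per_word=2):
    """Append up to max_per_word extension words per query word, keeping order."""
    expanded = list(words)
    for word in words:
        for ext in THESAURUS.get(word, [])[:max_per_word]:
            if ext not in expanded:
                expanded.append(ext)
    return expanded

def analyze(text):
    """Segment the input, filter it, then expand the remaining query words."""
    return expand_query_words(extract_query_words(text))

print(analyze("How is the economy of China this year?"))
```

Truncating each thesaurus entry to `max_per_word` is a simple stand-in for the priority-based selection of extended words; a similarity score per extension word could replace it.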
  • The retrieval module 23 is configured to search the documents in the knowledge base according to the query words.
  • The documents in the knowledge base come in various types, for example files in pdf, doc, docx, ppt, excel, txt, html, xml, zip, and tar formats.
  • A full-text search can be performed: an index library is built with a database as the source, the search matching degree is obtained from weights calculated with TF-IDF, the search results are sorted according to the matching degree, and the search words are highlighted.
  • Auxiliary search functions include cross-language information retrieval, spell checking, regular-expression search (for professionals), real-time search results, and entry records. During the search process, search results can be completed automatically based on historical records and hot searches.
  • the sorting module 24 is configured to sort the search results according to the search matching degree.
  • the establishing module 25 is configured to establish a summary generation model and a keyword generation model.
  • The calling module 26 is configured to invoke the summary generation model and the keyword generation model to obtain the summary and keywords of each document.
  • Obtaining the summary and keywords of each document includes the following steps:
  • The target document is split into sentences and segmented into words.
  • Through the summary generation model, the sentences whose weight value is greater than a preset value are selected to generate the summary; through the keyword generation model, the words whose word frequency is greater than a preset value are selected as the keywords.
  • The output module 27 is configured to output the sorted search results together with the summary and keywords of the corresponding target document.
  • The user usually clicks on a top-ranked document to view it.
  • The display module displays the content, summary, and keywords of the document.
  • the present application also proposes a quick retrieval method of the knowledge base document.
  • Referring to FIG. 3, a schematic flowchart of the first embodiment of the knowledge base document quick retrieval method of the present application is shown.
  • the order of execution of the steps in the flowchart shown in FIG. 3 may be changed according to different requirements, and some steps may be omitted.
  • Step S110: receiving search information input by the user.
  • The search information may take different forms depending on the situation.
  • The search information falls into three cases: in the first case, it is a sentence; in the second case, it is a word; in the third case, it includes both a sentence and a word.
  • Step S120: analyzing and processing the search information to obtain query words.
  • Step S130: searching the documents in the knowledge base according to the query words, and sorting the search results by search matching degree.
  • The documents in the knowledge base come in various types, for example files in pdf, doc, docx, ppt, excel, txt, html, xml, zip, and tar formats.
  • A full-text search can be performed: an index library is built with a database as the source, the search matching degree is obtained from weights calculated with TF-IDF, the search results are sorted according to the matching degree, and the search words are highlighted.
  • Auxiliary search functions include cross-language information retrieval, spell checking, regular-expression search (for professionals), real-time search results, and entry records. During the search process, search results can be completed automatically based on historical records and hot searches.
  • Step S140: obtaining a summary and keywords of each document through the summary generation model and the keyword generation model.
  • Step S150: outputting the sorted search results together with the summary and keywords of the corresponding target document.
  • In one embodiment, step S120 of the first embodiment, "analyzing and processing the search information to obtain a query word", specifically includes the following steps:
  • When the search information is a sentence, the input sentence is segmented into words through a combination of syntactic analysis and semantic analysis, meaningless words and symbols are removed, and a plurality of query words are extracted.
  • When the search information is a word, the word is conceptually expanded, according to a preset rule, into corresponding synonyms, near-synonyms, hypernyms, and hyponyms; some of the extended words are extracted according to a synonym similarity algorithm, or the extended words selected by the user are received, and these are used as the query words.
  • In another embodiment, step S120 of the first embodiment, "analyzing and processing the search information to obtain a query word", specifically includes the following steps:
  • The system first segments the paragraph and sentences into words, obtains the more important words through analysis, and expands the meaning of those words; the extended words include hypernyms, hyponyms, synonyms, near-synonyms, and so on. For example, if the user enters "How is the economic form of China this year?", the system obtains the two query words "China" and "Economy"; it can then obtain extension words for "China", such as "venue" and "domestic", and extension words for "Economy", such as "GDP", "trade", "commercial", and "financial".
  • In a further embodiment, step S130 of the first embodiment, "searching for documents in the knowledge base according to the query words, and sorting the search results according to the search matching degree", specifically includes:
  • The auxiliary search functions include cross-language information retrieval, spell checking, regular-expression search (for professionals), real-time search results, and entry records.
  • TF-IDF is a statistical method for assessing how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, and decreases with the frequency at which it appears across the corpus.
  • The main idea of TF-IDF is that if a word or phrase appears with high frequency (TF) in one article and rarely appears in other articles, it is considered to have good class-distinguishing ability and is suitable for classification. TF-IDF is the product TF × IDF, where TF is the term frequency and IDF is the inverse document frequency. TF indicates how often the term appears in the document.
  • A matching degree threshold may be set so that only documents whose matching degree exceeds the threshold are displayed.
  • The user can also set the number of documents displayed on one interface as needed, for example 20, 30, or 50.
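
The TF-IDF matching degree, the threshold, and the per-page limit described above can be sketched as follows. This is a hedged illustration, not the implementation of the application: the function names, whitespace tokenization, and the common variant `tf = count / document length`, `idf = log(N / df)` are assumptions made here, since the application does not give the exact formula.

```python
import math
from collections import Counter

def tf_idf_scores(query_words, docs):
    """Return one matching score per document: the sum of TF*IDF over the query words."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each word.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for word in query_words:
            if counts[word] == 0:
                continue  # word absent from this document
            tf = counts[word] / len(tokens)
            idf = math.log(n_docs / df[word])
            score += tf * idf
        scores.append(score)
    return scores

def search(query_words, docs, threshold=0.0, page_size=20):
    """Rank documents by score, keep those above the threshold, return one page."""
    scores = tf_idf_scores(query_words, docs)
    ranked = sorted(
        ((score, index) for index, score in enumerate(scores) if score > threshold),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [(index, score) for score, index in ranked[:page_size]]

docs = ["china economy grows", "economy news today economy", "weather report"]
for index, score in search(["china", "economy"], docs):
    print(index, round(score, 4))
```

With this variant a word that occurs in every document gets an IDF of zero and contributes nothing to the score; smoothed IDF formulas avoid that and are an equally valid choice.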
  • FIG. 7 is a schematic flowchart of the fifth embodiment of the knowledge base document quick retrieval method of the present application.
  • In this embodiment, the step "searching for documents in the knowledge base according to the query words, and sorting the search results according to the search matching degree" further includes:
  • The search results are automatically completed according to historical records and hot searches.
  • Historical records and hot searches can be used to supplement and optimize the search results, making them more complete and accurate.
  • The historical search records are stored in a database or on a server, and the hot search results can likewise be obtained from statistics over the search records in the database or on the server.
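
One possible reading of completion from historical records and hot searches is query auto-completion by prefix. The sketch below follows that reading only as an assumption: the `suggest` function, its parameters, and the rule "own history first, then hot searches by popularity" are hypothetical and are not taken from the application.

```python
from collections import Counter

def suggest(prefix, history, hot_searches, limit=5):
    """Return completions for prefix: the user's history first, then hot searches.

    history is assumed to be a chronological list of the user's past queries;
    hot_searches is assumed to map each query to how often it was searched.
    """
    prefix = prefix.lower()
    suggestions = []
    # Most recent history entries come first.
    for query in reversed(history):
        candidate = query.lower()
        if candidate.startswith(prefix) and candidate not in suggestions:
            suggestions.append(candidate)
    # Then hot searches, most popular first.
    for query, _count in Counter(hot_searches).most_common():
        candidate = query.lower()
        if candidate.startswith(prefix) and candidate not in suggestions:
            suggestions.append(candidate)
    return suggestions[:limit]

history = ["economy of china", "weather", "economic policy"]
hot = {"economy news": 50, "economic policy": 80, "sports": 100}
print(suggest("econ", history, hot))
```

In a deployment the hot-search counts would come from statistics over the stored search records, as the description notes.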
  • FIG. 8 is a schematic flowchart of the sixth embodiment of the knowledge base document quick retrieval method of the present application.
  • In this embodiment, step S140 of the first embodiment, "obtaining a summary and a keyword of each document by using a digest generation model and a keyword generation model", specifically includes:
  • S610: splitting the target document into sentences and segmenting it into words;
  • S620: selecting, through the summary generation model, the sentences whose weight value is greater than a preset value to generate the summary, and selecting, through the keyword generation model, the words whose word frequency is greater than a preset value to generate the keywords.
  • In a further embodiment, step S140 of the first embodiment, "obtaining a summary and a keyword of each document by using a digest generation model and a keyword generation model", further includes:
  • the weight value of each sentence used by the summary generation model is calculated as $W_i = a \cdot WP_i + b \cdot WS_i$;
  • the keyword generation model is established based on word frequency statistics;
  • where $W_i$ is the weight value of the i-th sentence,
  • $WP_i$ is the position weight value,
  • $WS_i$ is the semantic weight value,
  • a and b are the weight coefficients,
  • wp(ij) is the j-th keyword,
  • sp(j) is the number of sentences containing the j-th keyword,
  • m is the total number of sentences, and
  • n is the total number of keywords.
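
The sentence-weight formula and the word-frequency keyword model can be sketched as below. Only the combination $W_i = a \cdot WP_i + b \cdot WS_i$ and the two thresholds come from the description; the concrete definitions of the position weight (earlier sentences score higher) and the semantic weight (share of the keywords found in the sentence), the coefficient values, and all function names are assumptions made for illustration, because the text does not define WP or WS.

```python
import re
from collections import Counter

def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def extract_keywords(text, freq_threshold=1, stop_words=frozenset()):
    """Keyword model: words whose frequency is greater than freq_threshold."""
    counts = Counter(t for t in tokenize(text) if t not in stop_words)
    return [word for word, count in counts.most_common() if count > freq_threshold]

def sentence_weights(sentences, keywords, a=0.3, b=0.7):
    """Compute W_i = a * WP_i + b * WS_i for every sentence."""
    m = len(sentences)  # total number of sentences
    n = len(keywords)   # total number of keywords
    weights = []
    for i, sentence in enumerate(sentences):
        # Assumed position weight: earlier sentences score higher, in (0, 1].
        wp = (m - i) / m
        # Assumed semantic weight: share of the keywords present in the sentence.
        tokens = set(tokenize(sentence))
        ws = sum(1 for k in keywords if k in tokens) / n if n else 0.0
        weights.append(a * wp + b * ws)
    return weights

def summarize(text, weight_threshold=0.5, freq_threshold=1):
    """Return (summary sentences, keywords) for one document."""
    sentences = split_sentences(text)
    keywords = extract_keywords(text, freq_threshold)
    weights = sentence_weights(sentences, keywords)
    summary = [s for s, w in zip(sentences, weights) if w > weight_threshold]
    return summary, keywords

text = ("Search engines rank documents. Weather was nice. "
        "Documents are ranked by search score.")
print(summarize(text))
```

Raising `b` relative to `a` favors sentences that are dense in keywords over sentences that merely appear early; the right balance would have to be tuned on real documents.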
  • The technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for quick retrieval of knowledge base documents, an application server, and a computer-readable storage medium, the method comprising: receiving search information input by a user (S110); analyzing and processing the search information to obtain a query word (S120); searching documents in a knowledge base on the basis of the query word, and sorting the search results on the basis of search matching degree (S130); obtaining a summary and keywords of each document by means of a summary generation model and a keyword generation model (S140); and outputting the sorted search results and correspondingly outputting the summary and keywords of a target document (S150). The method, the application server, and the computer-readable storage medium make it possible to retrieve documents in a knowledge base quickly and accurately, and to quickly understand the main content of the retrieved documents.
PCT/CN2018/077675 2017-11-10 2018-02-28 Method for quick retrieval of knowledge base documents, application server, and computer-readable storage medium WO2019091026A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711106767.2 2017-11-10
CN201711106767.2A CN108038096A (zh) 2017-11-10 2017-11-10 知识库文档快速检索方法、应用服务器计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2019091026A1 (fr) 2019-05-16

Family

ID=62092842

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/077675 WO2019091026A1 (fr) 2017-11-10 2018-02-28 Method for quick retrieval of knowledge base documents, application server, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN108038096A (fr)
WO (1) WO2019091026A1 (fr)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674306B (zh) * 2018-06-15 2023-06-20 株式会社日立制作所 知识图谱的构建方法、装置、电子设备
CN109189916B (zh) * 2018-08-17 2022-04-22 杜林蔚 英文摘要关键信息提取方法、装置及电子设备
CN109101495A (zh) * 2018-08-27 2018-12-28 上海宝尊电子商务有限公司 一种基于图像识别和知识图谱的时尚领域文本生成方法
CN109359178A (zh) * 2018-09-14 2019-02-19 华南师范大学 一种检索方法、装置、存储介质及设备
CN109299235B (zh) * 2018-09-19 2023-04-25 平安科技(深圳)有限公司 知识库搜索方法、装置及计算机可读存储介质
CN109408690B (zh) * 2018-09-19 2021-10-26 合肥泓泉档案信息科技有限公司 一种具有地域分析功能的档案信息智能化调控方法
CN109522389B (zh) * 2018-11-07 2020-09-01 中国联合网络通信集团有限公司 文档推送方法、装置和存储介质
CN109918661B (zh) * 2019-03-04 2023-05-30 腾讯科技(深圳)有限公司 同义词获取方法及装置
CN109933724B (zh) * 2019-03-07 2022-01-14 上海智臻智能网络科技股份有限公司 知识搜索方法、系统、问答装置、电子设备及存储介质
CN109933702B (zh) * 2019-03-11 2022-12-16 智慧芽信息科技(苏州)有限公司 一种检索展示方法、装置、设备及存储介质
CN111767365A (zh) * 2019-03-12 2020-10-13 株式会社理光 文档检索设备及方法
CN110069610B (zh) * 2019-03-16 2024-03-19 平安科技(深圳)有限公司 基于Solr的检索方法、装置、设备和存储介质
CN110727786A (zh) * 2019-09-12 2020-01-24 武汉儒松科技有限公司 自学习的知识库管理方法、装置、终端设备及存储介质
CN111008265B (zh) * 2019-12-03 2023-03-28 腾讯云计算(北京)有限责任公司 企业信息搜索方法及装置
CN111241247A (zh) * 2020-01-19 2020-06-05 国网湖南省电力有限公司 一种电力系统异常状态历史记录搜索方法、系统及介质
CN111930880A (zh) * 2020-08-14 2020-11-13 易联众信息技术股份有限公司 一种文本编码检索的方法、装置及介质
CN112035512B (zh) * 2020-09-02 2023-08-18 中国银行股份有限公司 知识库的检索方法、装置、电子设备及计算机存储介质
CN113761142A (zh) * 2020-09-25 2021-12-07 北京沃东天骏信息技术有限公司 一种生成答案摘要的方法和装置
CN112925900B (zh) * 2021-02-26 2023-10-03 北京百度网讯科技有限公司 搜索信息处理方法、装置、设备及存储介质
CN113204621B (zh) * 2021-05-12 2024-05-07 北京百度网讯科技有限公司 文档入库、文档检索方法,装置,设备以及存储介质
CN113254623B (zh) * 2021-06-23 2024-02-20 中国银行股份有限公司 数据处理方法、装置、服务器、介质及产品
CN115687580B (zh) * 2022-09-22 2023-08-01 广州视嵘信息技术有限公司 搜索提醒补全的生成和重排序方法、装置、设备及介质
CN115905489B (zh) * 2022-11-21 2023-11-17 广西建设职业技术学院 一种提供招投标信息搜索服务的方法
CN116010560B (zh) * 2023-03-28 2023-06-09 青岛阿斯顿工程技术转移有限公司 一种国际技术转移数据服务系统
CN116450769A (zh) * 2023-06-09 2023-07-18 北京量子伟业信息技术股份有限公司 智慧档案的管理方法、装置、设备及介质
CN118094019A (zh) * 2024-04-29 2024-05-28 中国铁道科学研究院集团有限公司电子计算技术研究所 一种文本关联内容推荐方法、装置及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101408876A (zh) * 2007-10-09 2009-04-15 中兴通讯股份有限公司 一种电子文档全文检索的方法及系统
CN102163229A (zh) * 2011-04-13 2011-08-24 北京百度网讯科技有限公司 一种用于生成搜索结果的摘要的方法与设备
CN103150388A (zh) * 2013-03-21 2013-06-12 天脉聚源(北京)传媒科技有限公司 一种提取关键词的方法及装置
CN103530344A (zh) * 2013-10-09 2014-01-22 上海大学 一种基于改进的tf-idf方法的检索词实时修正方法
CN104035955A (zh) * 2014-03-18 2014-09-10 北京百度网讯科技有限公司 搜索方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102023989B (zh) * 2009-09-23 2012-10-10 阿里巴巴集团控股有限公司 一种信息检索方法及其系统
CN103678576B (zh) * 2013-12-11 2016-08-17 华中师范大学 基于动态语义分析的全文检索系统
CN103699525B (zh) * 2014-01-03 2016-08-31 江苏金智教育信息股份有限公司 一种基于文本多维度特征自动生成摘要的方法和装置
CN103838833B (zh) * 2014-02-24 2017-03-15 华中师范大学 基于相关词语语义分析的全文检索系统
KR101656245B1 (ko) * 2015-09-09 2016-09-09 주식회사 위버플 문장 추출 방법 및 시스템

Also Published As

Publication number Publication date
CN108038096A (zh) 2018-05-15

Similar Documents

Publication Publication Date Title
WO2019091026A1 (fr) Method for quick retrieval of knowledge base documents, application server, and computer-readable storage medium
US9558264B2 (en) Identifying and displaying relationships between candidate answers
WO2019174132A1 (fr) Data processing method, server, and computer storage medium
US9318027B2 (en) Caching natural language questions and results in a question and answer system
US8819047B2 (en) Fact verification engine
US9507867B2 (en) Discovery engine
CN111797214A (zh) 基于faq数据库的问题筛选方法、装置、计算机设备及介质
US10360229B2 (en) Systems and methods for enterprise data search and analysis
US11321336B2 (en) Systems and methods for enterprise data search and analysis
CN112231494B (zh) 信息抽取方法、装置、电子设备及存储介质
CN111160007B (zh) 基于bert语言模型的搜索方法、装置、计算机设备及存储介质
WO2015084757A1 (fr) Systems and methods for processing data stored in a database
US20190205470A1 (en) Hypotheses generation using searchable unstructured data corpus
Sukumar et al. Semantic based sentence ordering approach for multi-document summarization
US9183600B2 (en) Technology prediction
Juan An effective similarity measurement for FAQ question answering system
CN112926297B (zh) 处理信息的方法、装置、设备和存储介质
US10380195B1 (en) Grouping documents by content similarity
Orăsan Comparative evaluation of term-weighting methods for automatic summarization
Gayen et al. Automatic identification of Bengali noun-noun compounds using random forest
Chen et al. Chinese named entity abbreviation generation using first-order logic
WO2013150633A1 (fr) Document processing system and method
Gondaliya et al. Journey of Information Retrieval to Information Retrieval Tools-IR&IRT A Review
Shannaq Adapt clustering methods for arabic documents
Sheth et al. IMPACT SCORE ESTIMATION WITH PRIVACY PRESERVATION IN INFORMATION RETRIEVAL.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18875887

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24/09/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18875887

Country of ref document: EP

Kind code of ref document: A1