CN109948044A - Document query based on vector nearest neighbor search - Google Patents

Document query based on vector nearest neighbor search

Info

Publication number
CN109948044A
CN109948044A CN201711343103.8A CN201711343103A CN109948044A CN 109948044 A CN109948044 A CN 109948044A CN 201711343103 A CN201711343103 A CN 201711343103A CN 109948044 A CN109948044 A CN 109948044A
Authority
CN
China
Prior art keywords
vector
document
query
retrieval
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711343103.8A
Other languages
English (en)
Chinese (zh)
Inventor
李明琴
陈琪
任刚
王井东
韩殿飞
华杰锋
张东擎
罗威
李增中
谭锋
张十
朱素艳
沈徽
张霖涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Priority to CN201711343103.8A
Priority to PCT/US2018/064146 (WO2019118253A1)
Publication of CN109948044A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)
CN201711343103.8A 2017-12-14 2017-12-14 Document query based on vector nearest neighbor search Pending CN109948044A (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201711343103.8A CN109948044A (zh) 2017-12-14 2017-12-14 Document query based on vector nearest neighbor search
PCT/US2018/064146 WO2019118253A1 (fr) 2017-12-14 2018-12-06 Document recall based on vector nearest neighbor search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711343103.8A CN109948044A (zh) 2017-12-14 2017-12-14 Document query based on vector nearest neighbor search

Publications (1)

Publication Number Publication Date
CN109948044A (zh) 2019-06-28

Family

ID=65199569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711343103.8A Pending CN109948044A (zh) 2017-12-14 2017-12-14 Document query based on vector nearest neighbor search

Country Status (2)

Country Link
CN (1) CN109948044A (fr)
WO (1) WO2019118253A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
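CN111339241A (zh) * 2020-02-18 2020-06-26 北京百度网讯科技有限公司 Question duplicate-checking method, apparatus, and electronic device
CN111339261A (zh) * 2020-03-17 2020-06-26 北京香侬慧语科技有限责任公司 Document extraction method and system based on a pre-trained model
CN111930880A (zh) * 2020-08-14 2020-11-13 易联众信息技术股份有限公司 Method, apparatus, and medium for text encoding retrieval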
US11354293B2 (en) 2020-01-28 2022-06-07 Here Global B.V. Method and apparatus for indexing multi-dimensional records based upon similarity of the records
CN115545853A (zh) * 2022-12-02 2022-12-30 云筑信息科技(成都)有限公司 Search method for finding suppliers

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11893030B2 (en) * 2020-09-29 2024-02-06 Cerner Innovation, Inc. System and method for improved state identification and prediction in computerized queries

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7475071B1 (en) * 2005-11-12 2009-01-06 Google Inc. Performing a parallel nearest-neighbor matching operation using a parallel hybrid spill tree
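CN101639831A (zh) * 2008-07-29 2010-02-03 华为技术有限公司 Search method, apparatus, and system
CN103136352A (zh) * 2013-02-27 2013-06-05 华中师范大学 Full-text retrieval system based on two-layer semantic analysis
CN103838833A (zh) * 2014-02-24 2014-06-04 华中师范大学 Full-text retrieval system based on semantic analysis of related words
CN103838735A (zh) * 2012-11-21 2014-06-04 大连灵动科技发展有限公司 Data retrieval method for improving retrieval efficiency and quality
CN106909628A (zh) * 2017-01-24 2017-06-30 南京大学 Interval-based text similarity search method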

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7475071B1 (en) * 2005-11-12 2009-01-06 Google Inc. Performing a parallel nearest-neighbor matching operation using a parallel hybrid spill tree
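CN101639831A (zh) * 2008-07-29 2010-02-03 华为技术有限公司 Search method, apparatus, and system
CN103838735A (zh) * 2012-11-21 2014-06-04 大连灵动科技发展有限公司 Data retrieval method for improving retrieval efficiency and quality
CN103136352A (zh) * 2013-02-27 2013-06-05 华中师范大学 Full-text retrieval system based on two-layer semantic analysis
CN103838833A (zh) * 2014-02-24 2014-06-04 华中师范大学 Full-text retrieval system based on semantic analysis of related words
CN106909628A (zh) * 2017-01-24 2017-06-30 南京大学 Interval-based text similarity search method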

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MUJA MARIUS ET AL: "Scalable Nearest Neighbor Algorithms for High Dimensional Data", 《IEEE COMPUTER SOCIETY》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11354293B2 (en) 2020-01-28 2022-06-07 Here Global B.V. Method and apparatus for indexing multi-dimensional records based upon similarity of the records
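CN111339241A (zh) * 2020-02-18 2020-06-26 北京百度网讯科技有限公司 Question duplicate-checking method, apparatus, and electronic device
CN111339241B (zh) * 2020-02-18 2024-02-13 北京百度网讯科技有限公司 Question duplicate-checking method, apparatus, and electronic device
CN111339261A (zh) * 2020-03-17 2020-06-26 北京香侬慧语科技有限责任公司 Document extraction method and system based on a pre-trained model
CN111930880A (zh) * 2020-08-14 2020-11-13 易联众信息技术股份有限公司 Method, apparatus, and medium for text encoding retrieval
CN115545853A (zh) * 2022-12-02 2022-12-30 云筑信息科技(成都)有限公司 Search method for finding suppliers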

Also Published As

Publication number Publication date
WO2019118253A1 (fr) 2019-06-20

Similar Documents

Publication Publication Date Title
CN109948044A (zh) Document query based on vector nearest neighbor search
US11030445B2 (en) Sorting and displaying digital notes on a digital whiteboard
US9495345B2 (en) Methods and systems for modeling complex taxonomies with natural language understanding
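CN103339623B (zh) Method and apparatus relating to Internet search
CN102436513B (zh) Distributed retrieval method and system
CN112131295B (zh) Elasticsearch-based data processing method and device
CN107145496A (zh) Method for matching images with content items based on keywords
JP6346218B2 (ja) Search method, apparatus, and server for an online trading platform
CN107103016A (zh) Method for matching images with content based on keyword representations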
US20010044800A1 (en) Internet organizer
US20110282861A1 (en) Extracting higher-order knowledge from structured data
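CN107885873A (zh) Method and apparatus for outputting information
CN103412903B (zh) Real-time Internet of Things search method and system based on object-of-interest prediction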
US20190347068A1 (en) Personal history recall
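CN107145497A (zh) Method for selecting images matching content based on metadata of the images and content
KR102682244B1 (ko) Method for training a machine learning model with structured ESG data using an ESG assistance tool, and service server for generating ESG documents auto-completed by the machine learning model
CN109918594A (zh) Information display method and apparatus
CN107491465A (zh) Method and apparatus for searching content, and data processing system
KR101446154B1 (ko) Semantic content search system and method using a user query expansion technique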
US20180293299A1 (en) Query processing
Antunes et al. Context storage for m2m scenarios
US20220027419A1 (en) Smart search and recommendation method for content, storage medium, and terminal
KR101592670B1 (ko) Data search apparatus using an index and method of using the same
US9195940B2 (en) Jabba-type override for correcting or improving output of a model
CN110110199B (zh) Information output method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination