CN118103830A - Generating similarity scores between different document schemas - Google Patents

Generating similarity scores between different document schemas

Info

Publication number
CN118103830A
Authority
CN
China
Prior art keywords
document
queries
documents
configuration
readable medium
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280068598.0A
Other languages
English (en)
Chinese (zh)
Inventor
L·S·马泰
F·特洛简
M·M·布罗恩
A·K·海德
周英曌
M-M·彼得里卡
R·A·沙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-09-01
Filing date
2022-08-31
Publication date
2024-05-28
Application filed by Oracle International Corp
Publication of CN118103830A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/256 Integrating or interfacing systems involving database management systems in federated or virtual databases
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN202280068598.0A 2021-09-01 2022-08-31 Generating similarity scores between different document schemas Pending CN118103830A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/464,534 2021-09-01
US17/464,534 US20230066143A1 (en) 2021-09-01 2021-09-01 Generating similarity scores between different document schemas
PCT/US2022/042177 WO2023034397A1 (en) 2021-09-01 2022-08-31 Generating similarity scores between different document schemas

Publications (1)

Publication Number Publication Date
CN118103830A (zh) 2024-05-28

Family

ID=83508834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280068598.0A Pending CN118103830A (zh) 2022-08-31 Generating similarity scores between different document schemas

Country Status (5)

Country Link
US (1) US20230066143A1
EP (1) EP4396694A1
JP (1) JP2024535733A
CN (1) CN118103830A
WO (1) WO2023034397A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120994760A (zh) * 2025-10-16 2025-11-21 深圳市蓝凌软件股份有限公司 Document retrieval method based on multi-field information and outlier detection

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12248504B2 (en) 2023-05-31 2025-03-11 Docusign, Inc. Document container with candidate documents

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3842577B2 (ja) * 2001-03-30 2006-11-08 株式会社東芝 Structured document retrieval method, structured document retrieval apparatus, and program
US7882122B2 (en) * 2005-03-18 2011-02-01 Capital Source Far East Limited Remote access of heterogeneous data
US20060218158A1 (en) * 2005-03-23 2006-09-28 Gunther Stuhec Translation of information between schemas
US20080114740A1 (en) * 2006-11-14 2008-05-15 Xcential Group Llc System and method for maintaining conformance of electronic document structure with multiple, variant document structure models
WO2008083504A1 (en) * 2007-01-10 2008-07-17 Nick Koudas Method and system for information discovery and text analysis
US8954469B2 (en) * 2007-03-14 2015-02-10 Vcvciii Llc Query templates and labeled search tip system, methods, and techniques
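JP2009223781A (ja) * 2008-03-18 2009-10-01 Nec Corp Information recommendation device, information recommendation system, information recommendation method, program, and recording medium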
US11068657B2 (en) * 2010-06-28 2021-07-20 Skyscanner Limited Natural language question answering system and method based on deep semantics
US8346792B1 (en) * 2010-11-09 2013-01-01 Google Inc. Query generation using structural similarity between documents
US20140200879A1 (en) * 2013-01-11 2014-07-17 Brian Sakhai Method and System for Rating Food Items
US20140208779A1 (en) * 2013-01-30 2014-07-31 Fresh Food Solutions Llc Systems and methods for extending the fresh life of perishables in the retail and vending setting
US10956415B2 (en) * 2016-09-26 2021-03-23 Splunk Inc. Generating a subquery for an external data system using a configuration file
US10489466B1 (en) * 2016-09-29 2019-11-26 EMC IP Holding Company LLC Method and system for document similarity analysis based on weak transitive relation of similarity
US11182437B2 (en) * 2017-10-26 2021-11-23 International Business Machines Corporation Hybrid processing of disjunctive and conjunctive conditions of a search query for a similarity search
US11416448B1 (en) * 2019-08-14 2022-08-16 Amazon Technologies, Inc. Asynchronous searching of protected areas of a provider network
US11651156B2 (en) * 2020-05-07 2023-05-16 Optum Technology, Inc. Contextual document summarization with semantic intelligence
US20220245155A1 (en) * 2021-02-04 2022-08-04 Yext, Inc. Distributed multi-source data processing and publishing platform
US11620319B2 (en) * 2021-05-13 2023-04-04 Capital One Services, Llc Search platform for unstructured interaction summaries

Also Published As

Publication number Publication date
JP2024535733A (ja) 2024-10-02
US20230066143A1 (en) 2023-03-02
WO2023034397A1 (en) 2023-03-09
EP4396694A1 (en) 2024-07-10

Similar Documents

Publication Publication Date Title
JP6439043B2 (ja) Automatic generation of contextual search string synonyms
US10614048B2 (en) Techniques for correlating data in a repository system
US20160019281A1 (en) Interfacing with a Relational Database for Multi-Dimensional Analysis via a Spreadsheet Application
US9665560B2 (en) Information retrieval system based on a unified language model
US10380124B2 (en) Searching data sets
US20170124181A1 (en) Automatic fuzzy matching of entities in context
US20250285137A1 (en) Tracking performance of recommended content across multiple content outlets
US20260012512A1 (en) ENHANCED PROCESSING OF USER PROFILES USING DATA STRUCTURES SPECIALIZED FOR GRAPHICAL PROCESSING UNITS (GPUs)
US20250284715A1 (en) Machine learning for similarity scores between different document schemas
US11449773B2 (en) Enhanced similarity detection between data sets with unknown prior features using machine-learning
CN118103830A (zh) Generating similarity scores between different document schemas
US11366796B2 (en) Systems and methods for compressing keys in hierarchical data structures
US20240061829A1 (en) System and methods for enhancing data from disjunctive sources
US10372488B2 (en) Parallel processing using memory mapping
US12189706B2 (en) Hybrid approach for generating recommendations
US20240378489A1 (en) Enhancing nearest neighbor algorithm using a set of parallel models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination