CN108496190B - Annotation system for extracting attributes from electronic data structures - Google Patents

Annotation system for extracting attributes from electronic data structures

Info

Publication number
CN108496190B
Authority
CN
China
Prior art keywords
string
annotation
description
strings
attributes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201780005536.4A
Other languages
English (en)
Chinese (zh)
Other versions
CN108496190A (zh)
Inventor
吴思明
S·伯尔简·布罗简尼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corp
Publication of CN108496190A
Application granted
Publication of CN108496190B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Strategic Management (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Security & Cryptography (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN201780005536.4A 2016-01-27 2017-01-26 Annotation system for extracting attributes from electronic data structures Active CN108496190B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US15/007,381 2016-01-27
US15/007,381 US10628403B2 (en) 2016-01-27 2016-01-27 Annotation system for extracting attributes from electronic data structures
PCT/US2017/015002 WO2017132296A1 (en) 2016-01-27 2017-01-26 Annotation system for extracting attributes from electronic data structures

Publications (2)

Publication Number Publication Date
CN108496190A (zh) 2018-09-04
CN108496190B (zh) 2022-06-24

Family

ID=57963504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780005536.4A Active CN108496190B (zh) 2016-01-27 2017-01-26 Annotation system for extracting attributes from electronic data structures

Country Status (5)

Country Link
US (1) US10628403B2
EP (1) EP3408802A1
JP (1) JP6850806B2
CN (1) CN108496190B
WO (1) WO2017132296A1

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11010768B2 (en) * 2015-04-30 2021-05-18 Oracle International Corporation Character-based attribute value extraction system
US10997507B2 (en) * 2017-06-01 2021-05-04 Accenture Global Solutions Limited Data reconciliation
US12182307B1 (en) * 2017-09-13 2024-12-31 Privacy Analytics Inc. System and method for active learning to detect personally identifying information
US11509540B2 (en) * 2017-12-14 2022-11-22 Extreme Networks, Inc. Systems and methods for zero-footprint large-scale user-entity behavior modeling
US10642869B2 (en) * 2018-05-29 2020-05-05 Accenture Global Solutions Limited Centralized data reconciliation using artificial intelligence mechanisms
KR102129843B1 (ko) * 2018-12-17 2020-07-03 주식회사 크라우드웍스 Method and apparatus for verifying a production annotation task using a verification annotation task
WO2021223873A1 (en) * 2020-05-08 2021-11-11 Ecole Polytechnique Federale De Lausanne (Epfl) System and method for privacy-preserving distributed training of machine learning models on distributed datasets
US12028455B2 (en) 2020-07-14 2024-07-02 Visa International Service Association Privacy-preserving identity attribute verification using policy tokens
US11645318B2 (en) * 2020-08-20 2023-05-09 Walmart Apollo, Llc Systems and methods for unified extraction of attributes
US11016980B1 (en) 2020-11-20 2021-05-25 Coupang Corp. Systems and method for generating search terms
EP4060556B1 (en) * 2021-03-19 2025-09-24 Aptiv Technologies AG Method and device for validating annotations of objects
CN113377775B (zh) * 2021-06-21 2024-02-02 特赞(上海)信息科技有限公司 Information processing method and device
US20250138814A1 (en) * 2023-10-26 2025-05-01 Cisco Technology, Inc. Selective deployment of software based on versioning schema

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7082426B2 (en) * 1993-06-18 2006-07-25 Cnet Networks, Inc. Content aggregation method and apparatus for an on-line product catalog
EP1425646A4 (en) * 2001-08-16 2006-02-01 Trans World New York Llc USER-SPECIFIED MEDIA SAMPLING, RECOMMENDATION AND PURCHASE SYSTEM WITH REAL-TIME INVENTORY DATABASE
US7139752B2 (en) * 2003-05-30 2006-11-21 International Business Machines Corporation System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations
US8977953B1 (en) * 2006-01-27 2015-03-10 Linguastat, Inc. Customizing information by combining pair of annotations from at least two different documents
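JP2009026195A (ja) * 2007-07-23 2009-02-05 Yokohama National Univ Product classification device, product classification method, and program
JP2010134709A (ja) * 2008-12-04 2010-06-17 Toshiba Corp Vocabulary error detection device and vocabulary error detection method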
US8352473B2 (en) 2010-04-21 2013-01-08 Microsoft Corporation Product synthesis from multiple sources
EP2469421A1 (en) * 2010-12-23 2012-06-27 British Telecommunications Public Limited Company Method and apparatus for processing electronic data
US20120330971A1 (en) * 2011-06-26 2012-12-27 Itemize Llc Itemized receipt extraction using machine learning
US8706758B2 (en) 2011-10-04 2014-04-22 Galisteo Consulting Group, Inc. Flexible account reconciliation
CN103309961B (zh) * 2013-05-30 2015-07-15 北京智海创讯信息技术有限公司 Web page body text extraction method based on Markov random fields
US9348815B1 (en) * 2013-06-28 2016-05-24 Digital Reasoning Systems, Inc. Systems and methods for construction, maintenance, and improvement of knowledge representations
CN103678665B (zh) * 2013-12-24 2016-09-07 焦点科技股份有限公司 Heterogeneous big data integration method and system based on a data warehouse
US20150331936A1 (en) * 2014-05-14 2015-11-19 Faris ALQADAH Method and system for extracting a product and classifying text-based electronic documents
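CN104008186B (zh) * 2014-06-11 2018-10-16 北京京东尚科信息技术有限公司 Method and device for determining keywords from target text
CN105243162B (zh) * 2015-10-30 2018-10-30 方正国际软件有限公司 Object-oriented data model query method and device based on relational database storage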

Also Published As

Publication number Publication date
US20170212921A1 (en) 2017-07-27
US10628403B2 (en) 2020-04-21
WO2017132296A1 (en) 2017-08-03
EP3408802A1 (en) 2018-12-05
CN108496190A (zh) 2018-09-04
JP6850806B2 (ja) 2021-03-31
JP2019503541A (ja) 2019-02-07

Similar Documents

Publication Publication Date Title
CN108496190B (zh) Annotation system for extracting attributes from electronic data structures
JP6629678B2 (ja) Machine learning device
US10565498B1 (en) Deep neural network-based relationship analysis with multi-feature token model
CN112650923A (zh) Public opinion processing method and device for news events, storage medium, and computer equipment
JP2019503541A5
US12045209B2 (en) Method and apparatus for smart and extensible schema matching framework
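WO2022105115A1 (zh) Question-answer pair matching method and device, electronic equipment, and storage medium
CN110096434A (zh) Interface testing method and device
TWI682287B (zh) Knowledge graph generating apparatus, method, and computer program product thereof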
EP3349131A1 (en) Method and system for extracting user-specific content
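CN107004141A (zh) Efficient annotation of large sample groups
CN115063784B (zh) Information extraction method and device for bill images, storage medium, and electronic equipment
TW202329015A (zh) Method and system for performing product matching on an e-commerce platform
CN113591881A (zh) Intention recognition method and device based on model fusion, electronic equipment, and medium
CN110738050A (zh) Text recombination method, device, and medium based on word segmentation and named entity recognition
CN113837836A (zh) Model recommendation method, device, equipment, and storage medium
CN117093556A (zh) Log classification method and device, computer equipment, and computer-readable storage medium
TW202139054A (zh) Form data detection method, computer device, and storage medium
KR20210023453A (ko) Review advertisement matching apparatus and method
CN108921213B (zh) Entity classification model training method and device
CN115952309B (zh) Structured multimodal retrieval method and system for multiple multimedia retrieval tasks
CN111309851A (zh) Entity word storage method and device, and electronic equipment
CN118606438A (zh) Data analysis method and device, computer equipment, readable storage medium, and program product
CN117648401A (zh) Knowledge base construction method, knowledge retrieval method, and related apparatus and equipment
CN104462360B (zh) Method and device for generating semantic identifiers for a text collection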

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant