CN102792298B - Matching metadata sources using rules for characterizing matches - Google Patents

Matching metadata sources using rules for characterizing matches

Info

Publication number
CN102792298B
Authority
CN
China
Prior art keywords
source
data element
data
description
lexical item
Prior art date
Legal status
Active
Application number
CN201180013068.8A
Other languages
English (en)
Chinese (zh)
Other versions
CN102792298A (zh)
Inventor
A.肖恩
Current Assignee
Ab Initio Technology LLC
Original Assignee
Ab Initio Technology LLC
Priority date
Filing date
Publication date
Application filed by Ab Initio Technology LLC
Publication of CN102792298A
Application granted
Publication of CN102792298B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
CN201180013068.8A 2010-01-13 2011-01-13 Matching metadata sources using rules for characterizing matches Active CN102792298B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US29466310P 2010-01-13 2010-01-13
US61/294,663 2010-01-13
PCT/US2011/021108 WO2011088195A1 (en) 2010-01-13 2011-01-13 Matching metadata sources using rules for characterizing matches

Publications (2)

Publication Number Publication Date
CN102792298A (zh) 2012-11-21
CN102792298B (zh) 2017-03-29

Family

ID=43755121

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201180013068.8A Active CN102792298B (zh) Matching metadata sources using rules for characterizing matches 2010-01-13 2011-01-13

Country Status (8)

Country Link
US (1) US9031895B2
EP (1) EP2524327B1
JP (1) JP5768063B2
KR (1) KR101758669B1
CN (1) CN102792298B
AU (1) AU2011205296B2
CA (1) CA2786445C
WO (1) WO2011088195A1

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101639292B1 (ko) 2008-12-02 2016-07-13 아브 이니티오 테크놀로지 엘엘시 Method for visualizing relationships between data elements
US8370407B1 (en) * 2011-06-28 2013-02-05 Go Daddy Operating Company, LLC Systems providing a network resource address reputation service
US9065860B2 (en) 2011-08-02 2015-06-23 Cavium, Inc. Method and apparatus for multiple access of plural memory banks
US9081801B2 (en) * 2012-07-25 2015-07-14 Hewlett-Packard Development Company, L.P. Metadata supersets for matching images
US9892026B2 (en) * 2013-02-01 2018-02-13 Ab Initio Technology Llc Data records selection
US9872634B2 (en) * 2013-02-08 2018-01-23 Vital Connect, Inc. Respiratory rate measurement using a combination of respiration signals
US9864755B2 (en) 2013-03-08 2018-01-09 Go Daddy Operating Company, LLC Systems for associating an online file folder with a uniform resource locator
US9521138B2 (en) 2013-06-14 2016-12-13 Go Daddy Operating Company, LLC System for domain control validation
US9178888B2 (en) 2013-06-14 2015-11-03 Go Daddy Operating Company, LLC Method for domain control validation
HK1224398A1 (zh) * 2013-12-18 2017-08-18 Ab Initio Technology Llc Data generation
US9275336B2 (en) 2013-12-31 2016-03-01 Cavium, Inc. Method and system for skipping over group(s) of rules based on skip group rule
US9544402B2 (en) * 2013-12-31 2017-01-10 Cavium, Inc. Multi-rule approach to encoding a group of rules
US9667446B2 (en) 2014-01-08 2017-05-30 Cavium, Inc. Condition code approach for comparing rule and packet data that are provided in portions
US10891272B2 (en) 2014-09-26 2021-01-12 Oracle International Corporation Declarative language and visualization system for recommended data transformations and repairs
US10210246B2 (en) * 2014-09-26 2019-02-19 Oracle International Corporation Techniques for similarity analysis and data enrichment using knowledge sources
US10915233B2 (en) 2014-09-26 2021-02-09 Oracle International Corporation Automated entity correlation and classification across heterogeneous datasets
US10684998B2 (en) 2014-11-21 2020-06-16 Microsoft Technology Licensing, Llc Automatic schema mismatch detection
CN104504021A (zh) * 2014-12-11 2015-04-08 北京国双科技有限公司 Data matching method and device
US10891258B2 (en) * 2016-03-22 2021-01-12 Tata Consultancy Services Limited Systems and methods for de-normalized data structure files based generation of intelligence reports
JP6665678B2 (ja) * 2016-05-17 2020-03-13 富士通株式会社 Metadata registration method, metadata registration program, and metadata registration device
US11106643B1 (en) * 2017-08-02 2021-08-31 Synchrony Bank System and method for integrating systems to implement data quality processing
US11016936B1 (en) * 2017-09-05 2021-05-25 Palantir Technologies Inc. Validating data for integration
US10885056B2 (en) * 2017-09-29 2021-01-05 Oracle International Corporation Data standardization techniques
US10936599B2 (en) 2017-09-29 2021-03-02 Oracle International Corporation Adaptive recommendations
US11093639B2 (en) * 2018-02-23 2021-08-17 International Business Machines Corporation Coordinated de-identification of a dataset across a network
GB2574905A (en) * 2018-06-18 2019-12-25 Arm Ip Ltd Pipeline template configuration in a data processing system
US11113324B2 (en) * 2018-07-26 2021-09-07 JANZZ Ltd Classifier system and method
US11074230B2 (en) 2018-09-04 2021-07-27 International Business Machines Corporation Data matching accuracy based on context features
US11163750B2 (en) 2018-09-27 2021-11-02 International Business Machines Corporation Dynamic, transparent manipulation of content and/or namespaces within data storage systems
US11755754B2 (en) * 2018-10-19 2023-09-12 Oracle International Corporation Systems and methods for securing data based on discovered relationships
CN110210222B (zh) * 2018-10-24 2023-01-31 腾讯科技(深圳)有限公司 Data processing method, data processing device, and computer-readable storage medium
KR102774097B1 (ko) * 2019-03-22 2025-03-04 삼성전자주식회사 Electronic device and control method thereof
US11269905B2 (en) 2019-06-20 2022-03-08 International Business Machines Corporation Interaction between visualizations and other data controls in an information system by matching attributes in different datasets
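CN110414579A (zh) * 2019-07-18 2019-11-05 北京信远通科技有限公司 Method and device for checking standards compliance of a metadata model, and storage medium
CN111639077B (zh) * 2020-05-15 2024-03-22 杭州数梦工场科技有限公司 Data governance method, apparatus, electronic device, and storage medium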
US11734511B1 (en) * 2020-07-08 2023-08-22 Mineral Earth Sciences Llc Mapping data set(s) to canonical phrases using natural language processing model(s)
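CN112181949A (zh) * 2020-10-10 2021-01-05 浪潮云信息技术股份公司 Method and device for online data modeling
CN112199433A (zh) * 2020-10-28 2021-01-08 云赛智联股份有限公司 Data governance system for a city-level data middle platform
CN112751938B (zh) * 2020-12-30 2023-04-07 上海赋算通云计算科技有限公司 Real-time data synchronization system based on multi-cluster jobs, implementation method, and storage medium
CN113362174B (zh) * 2021-06-17 2023-01-24 富途网络科技(深圳)有限公司 Data comparison method, apparatus, device, and storage medium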
US12050575B2 (en) 2021-07-26 2024-07-30 International Business Machines Corporation Mapping of heterogeneous data as matching fields
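CN113792057A (zh) * 2021-08-02 2021-12-14 浪潮软件股份有限公司 Method for matching business data against a standard dictionary
CN117332284B (zh) * 2023-12-01 2024-02-09 湖南空间折叠互联网科技有限公司 Offline medical data matching algorithm and system
CN119202755B (zh) * 2024-11-27 2025-03-14 深圳市安仕新能源科技股份有限公司 MES-based method, system, and medium for automatic matching of specification ranges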

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1744080A (zh) * 2005-09-27 2006-03-08 南方医科大学 Gene information retrieval system related to a specific function, and method for constructing a search-term database for the system
WO2006116286A2 (en) * 2005-04-25 2006-11-02 Leon Falic Internet-based duty-free goods electronic commerce system and method
CN101650746A (zh) * 2009-09-27 2010-02-17 中国电信股份有限公司 Method and system for verifying ranking results

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000040085A (ja) * 1998-07-22 2000-02-08 Hitachi Ltd Post-processing method and device for Japanese morphological analysis
US6826568B2 (en) * 2001-12-20 2004-11-30 Microsoft Corporation Methods and system for model matching
US7730063B2 (en) * 2002-12-10 2010-06-01 Asset Trust, Inc. Personalized medicine service
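JP2003271656A (ja) * 2002-03-19 2003-09-26 Fujitsu Ltd Association candidate generation device, association candidate generation method, association system, association candidate generation program, and computer-readable recording medium recording the program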
US7542958B1 (en) * 2002-09-13 2009-06-02 Xsb, Inc. Methods for determining the similarity of content and structuring unstructured content from heterogeneous sources
US20040158567A1 (en) * 2003-02-12 2004-08-12 International Business Machines Corporation Constraint driven schema association
US20040249682A1 (en) * 2003-06-06 2004-12-09 Demarcken Carl G. Filling a query cache for travel planning
US7552110B2 (en) * 2003-09-22 2009-06-23 International Business Machines Corporation Method for performing a query in a computer system to retrieve data from a database
JP4511892B2 (ja) * 2004-07-26 2010-07-28 ヤフー株式会社 Synonym search device, method and program therefor, and information retrieval device
US20060075013A1 (en) * 2004-09-03 2006-04-06 Hite Thomas D System and method for relating computing systems
JP4687089B2 (ja) * 2004-12-08 2011-05-25 日本電気株式会社 Duplicate record detection system and duplicate record detection program
US20070005621A1 (en) * 2005-06-01 2007-01-04 Lesh Kathryn A Information system using healthcare ontology
US7716630B2 (en) 2005-06-27 2010-05-11 Ab Initio Technology Llc Managing parameters for graph-based computations
US20080021912A1 (en) * 2006-07-24 2008-01-24 The Mitre Corporation Tools and methods for semi-automatic schema matching
US8027948B2 (en) * 2008-01-31 2011-09-27 International Business Machines Corporation Method and system for generating an ontology
JP5187308B2 (ja) * 2007-08-01 2013-04-24 日本電気株式会社 Conversion program search system and conversion program search method
US8775441B2 (en) 2008-01-16 2014-07-08 Ab Initio Technology Llc Managing an archive for approximate string matching

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006116286A2 (en) * 2005-04-25 2006-11-02 Leon Falic Internet-based duty-free goods electronic commerce system and method
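CN1744080A (zh) * 2005-09-27 2006-03-08 南方医科大学 Gene information retrieval system related to a specific function, and method for constructing a search-term database for the system
CN101650746A (zh) * 2009-09-27 2010-02-17 中国电信股份有限公司 Method and system for verifying ranking results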

Also Published As

Publication number Publication date
US9031895B2 (en) 2015-05-12
AU2011205296B2 (en) 2016-07-28
AU2011205296A1 (en) 2012-07-12
US20110173149A1 (en) 2011-07-14
KR101758669B1 (ko) 2017-07-18
EP2524327B1 (en) 2017-11-29
CA2786445A1 (en) 2011-07-21
WO2011088195A1 (en) 2011-07-21
JP5768063B2 (ja) 2015-08-26
JP2013517569A (ja) 2013-05-16
CN102792298A (zh) 2012-11-21
CA2786445C (en) 2018-02-13
KR20120135218A (ko) 2012-12-12
EP2524327A1 (en) 2012-11-21

Similar Documents

Publication Publication Date Title
CN102792298B (zh) Matching metadata sources using rules for characterizing matches
Torvik et al. Author name disambiguation in MEDLINE
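JP5817531B2 (ja) Document clustering system, document clustering method, and program
CN103348598B (zh) Generating data pattern information
CN114253939B (zh) Data model construction method, apparatus, electronic device, and storage medium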
US20150006528A1 (en) Hierarchical data structure of documents
CN114880483A (zh) Metadata knowledge graph construction method, storage medium, and system
US20130332454A1 (en) Dictionary entry name generator
CN114741276A (zh) Method and device for reusing test cases for domestic operating systems
CN111831684A (zh) Data query method, device, and computer-readable storage medium
Ciszak Application of clustering and association methods in data cleaning
Pamungkas et al. B-BabelNet: business-specific lexical database for improving semantic analysis of business process models
Meusel et al. Towards more accurate statistical profiling of deployed schema.org microdata
CN116340617A (zh) Search recommendation method and device
Fize et al. Matching heterogeneous textual data using spatial features
Li et al. Context-based entity description rule for entity resolution
CN106294517A (zh) Information processing device and method
Tkeshelashvili et al. Spreadsheet data extraction using semantic network
El Abassi et al. Deduplication Over Big Data Integration
HK1173248B (en) Matching metadata sources using rules for characterizing matches
HK1173248A (en) Matching metadata sources using rules for characterizing matches
Jumde et al. Supporting uncertain predicates in DBMS using approximate string matching and probabilistic databases
Wang et al. DRAV: Detection and repair of data availability violations in Internet of Things
Willför Design and implementation of a compatibility algorithm for a computer configurator
Lawrence et al. Integrating data sources using a standardized global dictionary

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant