CN111712809B - Learning ETL rules by example - Google Patents

Learning ETL rules by example

Info

Publication number
CN111712809B
Authority
CN
China
Prior art keywords
etl
source
architecture
target
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201980013060.8A
Other languages
English (en)
Chinese (zh)
Other versions
CN111712809A (zh)
Inventor
M·莎森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corp
Publication of CN111712809A
Application granted
Publication of CN111712809B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G06N5/025 Extracting rules from data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)
CN201980013060.8A 2018-04-16 2019-04-11 Learning ETL rules by example Active CN111712809B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US15/953,873 US11494688B2 (en) 2018-04-16 2018-04-16 Learning ETL rules by example
US15/953,873 2018-04-16
PCT/US2019/026891 WO2019204106A1 (en) 2018-04-16 2019-04-11 Learning ETL rules by example

Publications (2)

Publication Number Publication Date
CN111712809A (zh) 2020-09-25
CN111712809B (zh) 2024-12-24

Family

ID=66248868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980013060.8A Active CN111712809B (zh) 2018-04-16 2019-04-11 Learning ETL rules by example

Country Status (5)

Country Link
US (1) US11494688B2
EP (1) EP3782044A1
JP (1) JP7419244B2
CN (1) CN111712809B
WO (1) WO2019204106A1

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11704370B2 (en) 2018-04-20 2023-07-18 Microsoft Technology Licensing, Llc Framework for managing features across environments
WO2020142524A1 (en) * 2018-12-31 2020-07-09 Kobai, Inc. Decision intelligence system and method
US11487721B2 (en) 2019-04-30 2022-11-01 Sap Se Matching metastructure for data modeling
US20210012219A1 (en) * 2019-07-10 2021-01-14 Sap Se Dynamic generation of rule and logic statements
CN111338966B (zh) * 2020-03-05 2023-09-19 中国银行股份有限公司 Big data processing detection method and device for data source tables
US11379478B2 (en) * 2020-04-02 2022-07-05 International Business Machines Corporation Optimizing a join operation
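CN112035468B (zh) * 2020-08-24 2024-06-14 杭州览众数据科技有限公司 Multi-data-source ETL tool based on in-memory computing and web visual configuration
CN111966394B (zh) * 2020-08-28 2024-05-31 珠海格力电器股份有限公司 ETL-based data analysis method, apparatus, device, and storage medium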
US11372826B2 (en) 2020-10-19 2022-06-28 Oracle International Corporation Dynamic inclusion of custom columns into a logical model
US11630677B2 (en) * 2020-11-30 2023-04-18 Whp Workflow Solutions, Inc. Data aggregation with self-configuring drivers
US12411814B2 (en) * 2020-12-03 2025-09-09 International Business Machines Corporation Metadata based mapping assist
GB2602479A (en) * 2020-12-31 2022-07-06 Smart Photonics Holding B V Waveguide structure and method of manufacture
US20240265029A1 (en) * 2021-06-07 2024-08-08 Nec Corporation Information processing apparatus, information processing method, and storage medium
CN117897710A (zh) 2021-07-12 2024-04-16 施耐德电子系统美国股份有限公司 Artificial intelligence method for solving industrial data conversion problems
US11836120B2 (en) * 2021-07-23 2023-12-05 Oracle International Corporation Machine learning techniques for schema mapping
US11886396B2 (en) * 2021-11-13 2024-01-30 Tata Consultancy Services Limited System and method for learning-based synthesis of data transformation rules
WO2023096870A1 (en) 2021-11-23 2023-06-01 Innovaccer Inc. Method and system for unifying de-identified data from multiple sources
EP4432117A4 (en) * 2021-11-30 2025-09-03 Siemens Ag METHOD AND APPARATUS FOR GENERATING DATA MODEL
US20230205746A1 (en) * 2021-12-23 2023-06-29 Microsoft Technology Licensing, Llc Determination of recommended column types for columns in tabular data
US12386794B2 (en) 2022-01-18 2025-08-12 Optum, Inc. Predictive recommendations for schema mapping
WO2023215484A1 (en) 2022-05-06 2023-11-09 Innovaccer Inc. Method and system for providing faas based feature library using dag
WO2023250038A1 (en) 2022-06-21 2023-12-28 Innovaccer Inc. System and method for automatic display of contextually related data on multiple devices
US12061600B2 (en) * 2022-07-14 2024-08-13 International Business Machines Corporation API management for batch processing
US12298946B2 (en) * 2022-10-14 2025-05-13 Oracle International Corporation Natively supporting JSON duality view in a database management system
US12287777B2 (en) 2022-10-14 2025-04-29 Oracle International Corporation Natively supporting JSON duality view in a database management system
EP4357929A1 (en) * 2022-10-21 2024-04-24 Atos France Data quality assurance for heterogenous data migration in clouds
US20240386027A1 (en) * 2023-05-19 2024-11-21 Thermo Electron North America LLC Flexible extract, transform, and load (etl) process
US12072904B1 (en) * 2023-05-30 2024-08-27 Microsoft Technology Licensing, Llc Data transformation toolkit
US20250077480A1 (en) * 2023-08-31 2025-03-06 Honeywell International Inc. Systems, apparatuses, methods, and computer program products for robust data integration
US12248493B1 (en) * 2023-12-18 2025-03-11 Sap Se Automatically evolving tenant model schema in data lake in response to source system changes
US12298995B1 (en) * 2024-02-21 2025-05-13 Nom Nom Ai Inc. Systems, methods, and computer-readable media for managing an extract, transform, and load process

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
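JP3648051B2 (ja) * 1998-02-02 2005-05-18 富士通株式会社 Related information retrieval device and program recording medium
JP2004086782A (ja) * 2002-08-29 2004-03-18 Hitachi Ltd Heterogeneous database integration support device
JP4855080B2 (ja) * 2006-01-13 2012-01-18 三菱電機株式会社 Schema integration support device, schema integration support method of schema integration support device, and schema integration support program
CN101777073A (zh) * 2010-02-01 2010-07-14 浪潮集团山东通用软件有限公司 Data conversion method based on XML format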
US9298787B2 (en) * 2011-11-09 2016-03-29 International Business Machines Corporation Star and snowflake schemas in extract, transform, load processes
US9542412B2 (en) 2014-03-28 2017-01-10 Tamr, Inc. Method and system for large scale data curation
US10169378B2 (en) 2014-09-11 2019-01-01 Oracle International Corporation Automatic generation of logical database schemas from physical database tables and metadata
US10374905B2 (en) * 2015-06-05 2019-08-06 Oracle International Corporation System and method for intelligently mapping a source element to a target element in an integration cloud service design time
US20170061500A1 (en) * 2015-09-02 2017-03-02 Borodin Research Inc. Systems and methods for data service platform
US10095766B2 (en) * 2015-10-23 2018-10-09 Numerify, Inc. Automated refinement and validation of data warehouse star schemas
US20170220654A1 (en) * 2016-02-03 2017-08-03 Wipro Limited Method for automatically generating extract transform load (etl) codes using a code generation device
JP6723893B2 (ja) * 2016-10-07 2020-07-15 株式会社日立製作所 Data integration device and data integration method
CN106682235A (zh) * 2017-01-18 2017-05-17 济南浪潮高新科技投资发展有限公司 Heterogeneous data mapping system and method
CN107798069A (zh) * 2017-09-26 2018-03-13 恒生电子股份有限公司 Method, device, and computer-readable medium for data loading

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AnHai Doan et al. Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach. SIGMOD '01: Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data. 2001, pp. 509–520. *

Also Published As

Publication number Publication date
JP7419244B2 (ja) 2024-01-22
WO2019204106A1 (en) 2019-10-24
US20190318272A1 (en) 2019-10-17
EP3782044A1 (en) 2021-02-24
CN111712809A (zh) 2020-09-25
US11494688B2 (en) 2022-11-08
JP2021519964A (ja) 2021-08-12

Similar Documents

Publication Publication Date Title
CN111712809B (zh) Learning ETL rules by example
US12248862B1 (en) System for deep learning using knowledge graphs
Karnitis et al. Migration of relational database to document-oriented database: structure denormalization and data transformation
US11455306B2 (en) Query classification and processing using neural network based machine learning
US9535902B1 (en) Systems and methods for entity resolution using attributes from structured and unstructured data
US8108367B2 (en) Constraints with hidden rows in a database
JP5833406B2 (ja) Data management architecture associated with generic data items using references
US12380151B1 (en) Semantic translation of data sets
CN112396108A (zh) Business data evaluation method, apparatus, device, and computer-readable storage medium
US10268645B2 (en) In-database provisioning of data
US11921750B2 (en) Database systems and applications for assigning records to chunks of a partition in a non-relational database system with auto-balancing
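CN114218218A (zh) Data warehouse-based data processing method, apparatus, device, and storage medium
CN115062023B (zh) Wide table optimization method, apparatus, electronic device, and computer-readable storage medium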
US20230075655A1 (en) Systems and methods for context-independent database search paths
US20230059184A1 (en) Selective database data rollback
CN107103448A (zh) Workflow-based data integration system
CN107870949A (zh) Method and system for generating data analysis job dependencies
US20240370442A1 (en) Visualization Data Reuse In A Data Analysis System
US20240193176A1 (en) Cleaning and organizing schemaless semi-structured data for extract, transform, and load processing
US10489419B1 (en) Data modeling translation system
CN118302753A (zh) Autonomous spreadsheet creation
US20240220876A1 (en) Artificial intelligence (ai) based data product provisioning
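CN108228762B (zh) Method and system for configuring a general template for a master database
CN109408704B (zh) Fund data association method, system, computer device, and storage medium
CN114428777A (zh) Database extension method and system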

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment