CN111712809B - Learning ETL rules by example - Google Patents
Learning ETL rules by example
- Publication number
- CN111712809B (application CN201980013060.8A)
- Authority
- CN
- China
- Prior art keywords
- etl
- source
- architecture
- target
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/211—Schema design and management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/953,873 US11494688B2 (en) | 2018-04-16 | 2018-04-16 | Learning ETL rules by example |
| US15/953,873 | 2018-04-16 | | |
| PCT/US2019/026891 WO2019204106A1 (en) | 2018-04-16 | 2019-04-11 | Learning etl rules by example |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111712809A (zh) | 2020-09-25 |
| CN111712809B (zh) | 2024-12-24 |
Family
ID=66248868
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201980013060.8A (Active, CN111712809B) | 2018-04-16 | 2019-04-11 | Learning ETL rules by example |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11494688B2 |
| EP (1) | EP3782044A1 |
| JP (1) | JP7419244B2 |
| CN (1) | CN111712809B |
| WO (1) | WO2019204106A1 |
Families Citing this family (31)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11704370B2 (en) | 2018-04-20 | 2023-07-18 | Microsoft Technology Licensing, Llc | Framework for managing features across environments |
| WO2020142524A1 (en) * | 2018-12-31 | 2020-07-09 | Kobai, Inc. | Decision intelligence system and method |
| US11487721B2 (en) | 2019-04-30 | 2022-11-01 | Sap Se | Matching metastructure for data modeling |
| US20210012219A1 (en) * | 2019-07-10 | 2021-01-14 | Sap Se | Dynamic generation of rule and logic statements |
| CN111338966B (zh) * | 2020-03-05 | 2023-09-19 | 中国银行股份有限公司 | Big data processing and detection method and apparatus for data source tables |
| US11379478B2 (en) * | 2020-04-02 | 2022-07-05 | International Business Machines Corporation | Optimizing a join operation |
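| CN112035468B (zh) * | 2020-08-24 | 2024-06-14 | 杭州览众数据科技有限公司 | Multi-data-source ETL tool based on in-memory computing and web visual configuration |
| CN111966394B (zh) * | 2020-08-28 | 2024-05-31 | 珠海格力电器股份有限公司 | ETL-based data analysis method, apparatus, device, and storage medium |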
| US11372826B2 (en) | 2020-10-19 | 2022-06-28 | Oracle International Corporation | Dynamic inclusion of custom columns into a logical model |
| US11630677B2 (en) * | 2020-11-30 | 2023-04-18 | Whp Workflow Solutions, Inc. | Data aggregation with self-configuring drivers |
| US12411814B2 (en) * | 2020-12-03 | 2025-09-09 | International Business Machines Corporation | Metadata based mapping assist |
| GB2602479A (en) * | 2020-12-31 | 2022-07-06 | Smart Photonics Holding B V | Waveguide structure and method of manufacture |
| US20240265029A1 (en) * | 2021-06-07 | 2024-08-08 | Nec Corporation | Information processing apparatus, information processing method, and storage medium |
| CN117897710A (zh) | 2021-07-12 | 2024-04-16 | 施耐德电子系统美国股份有限公司 | Artificial intelligence approach to solving industrial data transformation problems |
| US11836120B2 (en) * | 2021-07-23 | 2023-12-05 | Oracle International Corporation | Machine learning techniques for schema mapping |
| US11886396B2 (en) * | 2021-11-13 | 2024-01-30 | Tata Consultancy Services Limited | System and method for learning-based synthesis of data transformation rules |
| WO2023096870A1 (en) | 2021-11-23 | 2023-06-01 | Innovaccer Inc. | Method and system for unifying de-identified data from multiple sources |
| EP4432117A4 (en) * | 2021-11-30 | 2025-09-03 | Siemens Ag | METHOD AND APPARATUS FOR GENERATING DATA MODEL |
| US20230205746A1 (en) * | 2021-12-23 | 2023-06-29 | Microsoft Technology Licensing, Llc | Determination of recommended column types for columns in tabular data |
| US12386794B2 (en) | 2022-01-18 | 2025-08-12 | Optum, Inc. | Predictive recommendations for schema mapping |
| WO2023215484A1 (en) | 2022-05-06 | 2023-11-09 | Innovaccer Inc. | Method and system for providing faas based feature library using dag |
| WO2023250038A1 (en) | 2022-06-21 | 2023-12-28 | Innovaccer Inc. | System and method for automatic display of contextually related data on multiple devices |
| US12061600B2 (en) * | 2022-07-14 | 2024-08-13 | International Business Machines Corporation | API management for batch processing |
| US12298946B2 (en) * | 2022-10-14 | 2025-05-13 | Oracle International Corporation | Natively supporting JSON duality view in a database management system |
| US12287777B2 (en) | 2022-10-14 | 2025-04-29 | Oracle International Corporation | Natively supporting JSON duality view in a database management system |
| EP4357929A1 (en) * | 2022-10-21 | 2024-04-24 | Atos France | Data quality assurance for heterogenous data migration in clouds |
| US20240386027A1 (en) * | 2023-05-19 | 2024-11-21 | Thermo Electron North America LLC | Flexible extract, transform, and load (etl) process |
| US12072904B1 (en) * | 2023-05-30 | 2024-08-27 | Microsoft Technology Licensing, Llc | Data transformation toolkit |
| US20250077480A1 (en) * | 2023-08-31 | 2025-03-06 | Honeywell International Inc. | Systems, apparatuses, methods, and computer program products for robust data integration |
| US12248493B1 (en) * | 2023-12-18 | 2025-03-11 | Sap Se | Automatically evolving tenant model schema in data lake in response to source system changes |
| US12298995B1 (en) * | 2024-02-21 | 2025-05-13 | Nom Nom Ai Inc. | Systems, methods, and computer-readable media for managing an extract, transform, and load process |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
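| JP3648051B2 (ja) * | 1998-02-02 | 2005-05-18 | 富士通株式会社 | Related information retrieval device and program recording medium |
| JP2004086782A (ja) * | 2002-08-29 | 2004-03-18 | Hitachi Ltd | Heterogeneous database integration support device |
| JP4855080B2 (ja) * | 2006-01-13 | 2012-01-18 | 三菱電機株式会社 | Schema integration support device, schema integration support method of the schema integration support device, and schema integration support program |
| CN101777073A (zh) * | 2010-02-01 | 2010-07-14 | 浪潮集团山东通用软件有限公司 | Data conversion method based on XML format |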
| US9298787B2 (en) * | 2011-11-09 | 2016-03-29 | International Business Machines Corporation | Star and snowflake schemas in extract, transform, load processes |
| US9542412B2 (en) | 2014-03-28 | 2017-01-10 | Tamr, Inc. | Method and system for large scale data curation |
| US10169378B2 (en) | 2014-09-11 | 2019-01-01 | Oracle International Corporation | Automatic generation of logical database schemas from physical database tables and metadata |
| US10374905B2 (en) * | 2015-06-05 | 2019-08-06 | Oracle International Corporation | System and method for intelligently mapping a source element to a target element in an integration cloud service design time |
| US20170061500A1 (en) * | 2015-09-02 | 2017-03-02 | Borodin Research Inc. | Systems and methods for data service platform |
| US10095766B2 (en) * | 2015-10-23 | 2018-10-09 | Numerify, Inc. | Automated refinement and validation of data warehouse star schemas |
| US20170220654A1 (en) * | 2016-02-03 | 2017-08-03 | Wipro Limited | Method for automatically generating extract transform load (etl) codes using a code generation device |
| JP6723893B2 (ja) * | 2016-10-07 | 2020-07-15 | 株式会社日立製作所 | Data integration device and data integration method |
| CN106682235A (zh) * | 2017-01-18 | 2017-05-17 | 济南浪潮高新科技投资发展有限公司 | Heterogeneous data mapping system and method |
| CN107798069A (zh) * | 2017-09-26 | 2018-03-13 | 恒生电子股份有限公司 | Method, apparatus, and computer-readable medium for data loading |
- 2018
  - 2018-04-16: US US15/953,873 (US11494688B2), Active
- 2019
  - 2019-04-11: EP EP19719137.2A (EP3782044A1), Withdrawn
  - 2019-04-11: JP JP2020545789A (JP7419244B2), Active
  - 2019-04-11: WO PCT/US2019/026891 (WO2019204106A1), Ceased
  - 2019-04-11: CN CN201980013060.8A (CN111712809B), Active
Non-Patent Citations (1)
| Title |
|---|
| AnHai Doan et al. Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach. SIGMOD '01: Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data. 2001, pp. 509–520. * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP7419244B2 (ja) | 2024-01-22 |
| WO2019204106A1 (en) | 2019-10-24 |
| US20190318272A1 (en) | 2019-10-17 |
| EP3782044A1 (en) | 2021-02-24 |
| CN111712809A (zh) | 2020-09-25 |
| US11494688B2 (en) | 2022-11-08 |
| JP2021519964A (ja) | 2021-08-12 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN111712809B (zh) | Learning ETL rules by example | |
| US12248862B1 (en) | System for deep learning using knowledge graphs | |
| Karnitis et al. | Migration of relational database to document-oriented database: structure denormalization and data transformation | |
| US11455306B2 (en) | Query classification and processing using neural network based machine learning | |
| US9535902B1 (en) | Systems and methods for entity resolution using attributes from structured and unstructured data | |
| US8108367B2 (en) | Constraints with hidden rows in a database | |
| JP5833406B2 (ja) | Data management architecture relating to generic data items using references | |
| US12380151B1 (en) | Semantic translation of data sets | |
| CN112396108A (zh) | Business data evaluation method, apparatus, device, and computer-readable storage medium | |
| US10268645B2 (en) | In-database provisioning of data | |
| US11921750B2 (en) | Database systems and applications for assigning records to chunks of a partition in a non-relational database system with auto-balancing | |
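| CN114218218A (zh) | Data warehouse-based data processing method, apparatus, device, and storage medium | |
| CN115062023B (zh) | Wide table optimization method, apparatus, electronic device, and computer-readable storage medium | |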
| US20230075655A1 (en) | Systems and methods for context-independent database search paths | |
| US20230059184A1 (en) | Selective database data rollback | |
| CN107103448A (zh) | Workflow-based data integration system | |
| CN107870949A (zh) | Method and system for generating data analysis job dependencies | |
| US20240370442A1 (en) | Visualization Data Reuse In A Data Analysis System | |
| US20240193176A1 (en) | Cleaning and organizing schemaless semi-structured data for extract, transform, and load processing | |
| US10489419B1 (en) | Data modeling translation system | |
| CN118302753A (zh) | Autonomous spreadsheet creation | |
| US20240220876A1 (en) | Artificial intelligence (ai) based data product provisioning | |
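| CN108228762B (zh) | Method and system for configuring a universal template of a master database | |
| CN109408704B (zh) | Fund data association method, system, computer device, and storage medium | |
| CN114428777A (zh) | Database expansion method and system | |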
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | TG01 | Patent term adjustment | |