WO2023082681A1 - Data processing method and apparatus based on batch-stream integration, computer device and medium - Google Patents
- Publication number: WO2023082681A1 (PCT/CN2022/105078)
- Authority: WO, WIPO (PCT)
- Prior art keywords: data, processing, layer, processed, module
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Definitions
- the data to be processed is processed layer by layer to obtain the first data; wherein, in each of the processing layers, the data input to the processing layer is processed to obtain processed data, the processed data being real-time data; based on the Flink stream, the processed real-time data is stored in the Hive module, and the processed data is input to the next processing layer; the first data is the processed data obtained by the last processing layer in the processing link;
- the first processing module is configured to process the data to be processed layer by layer according to the processing link to obtain first data
- the offline data is obtained from the Hive module, and the offline data is used to correct the wrong data.
- Real-time data is corrected. Since the data is passed and processed layer by layer, a change in the result of one processing layer changes the results of all subsequent processing layers, so the corrected data needs to be input to the next processing layer, which then re-processes it.
- the data processing device can connect to visual display components (such as Tableau) and run custom queries over the full data set from a web client, thereby supporting the visual display of front-end data.
- FIG. 3 is a schematic diagram of the first structure of the data processing device provided by the embodiment of the present disclosure.
- the data processing device includes an acquisition module 101, a first processing module 102 and a second processing module 103. The second processing module 103 forms a data application layer. The first processing module 102 includes a plurality of processing layers, which together form a processing link, and each processing layer includes a first processing unit 1021 and a second processing unit 1022.
- the acquiring module 101 is configured to acquire data to be processed, and the data to be processed includes real-time data.
- the ODS layer, DWD layer and DWS layer are connected through the Kafka module, so that data is exchanged and passed on layer by layer.
- the second processing module 203 is located at the ADS layer and may be an OLAP module.
- the query module 204 is respectively connected to the Hive module of each processing layer and the OLAP module of the ADS layer, so as to realize cross-source query.
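The layer-by-layer processing described in the definitions above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the patented implementation: plain in-memory lists stand in for the Hive module of each layer, a direct function call stands in for the Kafka hand-off between layers, and the `Layer` class, `run_chain` function and the three transforms are hypothetical names and logic chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Layer:
    """One processing layer (e.g. ODS, DWD, DWS) with its own store."""
    name: str
    transform: Callable[[Record], Record]
    # In-memory stand-in for the layer's Hive table (assumption).
    store: List[Record] = field(default_factory=list)

    def process(self, record: Record) -> Record:
        out = self.transform(record)
        self.store.append(dict(out))  # persist a copy before passing it on
        return out


def run_chain(layers: List[Layer], record: Record) -> Record:
    """Pass one record through every layer in order; return the last output."""
    for layer in layers:
        record = layer.process(record)
    return record


# Hypothetical transforms: the source does not say what each layer computes.
layers = [
    Layer("ODS", lambda r: dict(r)),                                  # raw copy
    Layer("DWD", lambda r: {**r, "speed_kmh": r["speed_ms"] * 3.6}),  # detail data
    Layer("DWS", lambda r: {"speed_kmh": round(r["speed_kmh"], 1)}),  # summary data
]

# "First data" is the output of the last processing layer in the link.
first_data = run_chain(layers, {"speed_ms": 10.0})
print(first_data)
```

Each layer keeps its own copy of what it produced, which is what later makes it possible to look up a single layer's output without re-running the layers before it.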
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Debugging And Monitoring (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure relates to a data processing method based on batch-stream integration, the method comprising: acquiring data to be processed; processing the data to be processed layer by layer according to a processing link to obtain first data, wherein, in each processing layer, the data input to the current processing layer is processed to obtain processed real-time data, the processed real-time data is stored in a Hive module based on a Flink stream, and the processed data is input to the next processing layer; processing the first data in a data application layer to obtain second data; and, in response to detecting an error in the second data, correcting the erroneous data in the processing layer where the data error occurred according to the offline data of that processing layer, and inputting the corrected data to the next processing layer so that the next processing layer processes the input data. The present disclosure also relates to a data processing apparatus, a computer device and a medium.
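The correction step in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the index of the faulty layer is simply given (the abstract does not say how it is located), the offline data is a plain list with one record per layer, and `reprocess_from` and the four transforms are hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Transform = Callable[[Record], Record]


def reprocess_from(transforms: List[Transform],
                   offline: List[Record],
                   failed: int) -> Record:
    """Replace the output of layer `failed` with its offline copy, then
    re-run every later layer on the corrected data."""
    corrected = dict(offline[failed])
    for transform in transforms[failed + 1:]:
        corrected = transform(corrected)
    return corrected


# Hypothetical chain: three processing layers followed by the application layer.
transforms: List[Transform] = [
    lambda r: dict(r),                                    # ODS: raw copy
    lambda r: {"speed_kmh": r["speed_ms"] * 3.6},         # DWD: detail data
    lambda r: {"speed_kmh": round(r["speed_kmh"], 1)},    # DWS: summary data
    lambda r: {"overspeed": r["speed_kmh"] > 30.0},       # ADS: second data
]

# Offline (trusted) output of each processing layer; stand-in for Hive data.
offline: List[Record] = [
    {"speed_ms": 10.0},
    {"speed_kmh": 36.0},
    {"speed_kmh": 36.0},
]

# Assume the real-time DWD output (index 1) was found to be wrong.
second_data = reprocess_from(transforms, offline, failed=1)
print(second_data)
```

Only the layers after the faulty one are re-run, which reflects the point that a change in one layer's result propagates to every later layer.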
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111318823.5 | 2021-11-09 | | |
CN202111318823.5A CN113779094B (zh) | 2021-11-09 | 2021-11-09 | Data processing method and apparatus based on batch-stream integration, computer device and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023082681A1 (fr) | 2023-05-19 |
Family
ID=78956925
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/105078 WO2023082681A1 (fr) | 2021-11-09 | 2022-07-12 | Data processing method and apparatus based on batch-stream integration, computer device and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113779094B (fr) |
WO (1) | WO2023082681A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
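CN117724706A (zh) * | 2024-02-06 | 2024-03-19 | 湖南盛鼎科技发展有限责任公司 | Method and system for batch-stream integrated, process-based real-time processing of massive data from heterogeneous platforms |
CN118051554A (zh) * | 2024-03-05 | 2024-05-17 | 合肥喆塔科技有限公司 | Method, device and medium for building a real-time data warehouse based on FlinkSQL and Kudu |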
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
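CN113779094B (zh) * | 2021-11-09 | 2022-03-22 | 通号通信信息集团有限公司 | Data processing method and apparatus based on batch-stream integration, computer device and medium |
CN114416845A (zh) * | 2022-01-19 | 2022-04-29 | 平安好医投资管理有限公司 | Big data testing method and apparatus, electronic device and storage medium |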
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
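CN103473480A (zh) * | 2013-10-08 | 2013-12-25 | 武汉大学 | Online monitoring data correction method based on an improved universal-gravitation support vector machine |
US20150341231A1 (en) * | 2014-05-21 | 2015-11-26 | Asif Khan | Distributed system architecture using event stream processing |
CN112000636A (zh) * | 2020-08-31 | 2020-11-27 | 民生科技有限责任公司 | User behavior statistical analysis method based on Flink stream processing |
CN113515363A (zh) * | 2021-08-10 | 2021-10-19 | 中国人民解放军61646部队 | Dynamic scheduling platform for a multi-level data processing system oriented to high concurrency of heterogeneous tasks |
CN113779094A (zh) * | 2021-11-09 | 2021-12-10 | 通号通信信息集团有限公司 | Data processing method and apparatus based on batch-stream integration, computer device and medium |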
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10936585B1 (en) * | 2018-10-31 | 2021-03-02 | Splunk Inc. | Unified data processing across streaming and indexed data sets |
US11526539B2 (en) * | 2019-01-31 | 2022-12-13 | Salesforce, Inc. | Temporary reservations in non-relational datastores |
CN112507029B (zh) * | 2020-12-18 | 2022-11-04 | 上海哔哩哔哩科技有限公司 | Data processing system and real-time data processing method |
CN113220521A (zh) * | 2021-02-04 | 2021-08-06 | 北京易车互联信息技术有限公司 | Real-time monitoring system |
CN112905595A (zh) * | 2021-03-05 | 2021-06-04 | 腾讯科技(深圳)有限公司 | Data query method and apparatus, and computer-readable storage medium |
- 2021-11-09: CN application CN202111318823.5A, patent CN113779094B (zh), status: active
- 2022-07-12: WO application PCT/CN2022/105078, patent WO2023082681A1 (fr), status: unknown
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
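CN117724706A (zh) * | 2024-02-06 | 2024-03-19 | 湖南盛鼎科技发展有限责任公司 | Method and system for batch-stream integrated, process-based real-time processing of massive data from heterogeneous platforms |
CN117724706B (zh) * | 2024-02-06 | 2024-05-03 | 湖南盛鼎科技发展有限责任公司 | Method and system for batch-stream integrated, process-based real-time processing of massive data from heterogeneous platforms |
CN118051554A (zh) * | 2024-03-05 | 2024-05-17 | 合肥喆塔科技有限公司 | Method, device and medium for building a real-time data warehouse based on FlinkSQL and Kudu |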
Also Published As
Publication number | Publication date |
---|---|
CN113779094B (zh) | 2022-03-22 |
CN113779094A (zh) | 2021-12-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2023082681A1 (fr) | Data processing method and apparatus based on batch-stream integration, computer device and medium | |
US11422982B2 (en) | Scaling stateful clusters while maintaining access | |
US11354314B2 (en) | Method for connecting a relational data store's meta data with hadoop | |
US11836533B2 (en) | Automated reconfiguration of real time data stream processing | |
US9418113B2 (en) | Value based windows on relations in continuous data streams | |
US8321450B2 (en) | Standardized database connectivity support for an event processing server in an embedded context | |
US8387076B2 (en) | Standardized database connectivity support for an event processing server | |
CN112507029B (zh) | Data processing system and real-time data processing method | |
CN109656963B (zh) | Metadata acquisition method, apparatus, device and computer-readable storage medium | |
CN106649630A (zh) | Data query method and apparatus | |
US20230144100A1 (en) | Method and apparatus for managing and controlling resource, device and storage medium | |
CN106687955B (zh) | Simplifying invocation of import procedures to transfer data from data sources to data targets | |
EP2883172A1 (fr) | High-performance real-time relational database system and method for using same | |
US10394805B2 (en) | Database management for mobile devices | |
CN110019267A (zh) | Metadata update method, apparatus and system, electronic device and storage medium | |
US11645179B2 (en) | Method and apparatus of monitoring interface performance of distributed application, device and storage medium | |
CN107346270B (zh) | Method and system for cardinality estimation based on real-time computation | |
CN108629016B (zh) | Big-data-oriented database control system supporting real-time stream computing, and computer program | |
US10489179B1 (en) | Virtual machine instance data aggregation based on work definition metadata | |
WO2017157111A1 (fr) | Method, device and system for preventing loss of memory data | |
US8510426B2 (en) | Communication and coordination between web services in a cloud-based computing environment | |
CN111125161A (zh) | Real-time data processing method, apparatus, device and storage medium | |
US20220277009A1 (en) | Processing database queries based on external tables | |
US11757959B2 (en) | Dynamic data stream processing for Apache Kafka using GraphQL | |
CN113612832A (zh) | Streaming data distribution method and system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22891490; Country of ref document: EP; Kind code of ref document: A1 |