CN106202346B - Data loading and cleaning engine, scheduling and storage system - Google Patents

Data loading and cleaning engine, scheduling and storage system

Info

Publication number
CN106202346B
Authority
CN
China
Prior art keywords
data
module
etl
scheduling
storage system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610524292.8A
Other languages
Chinese (zh)
Other versions
CN106202346A (en)
Inventor
孙永剑
郑书礼
裘鑫芳
董磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Information Network Co., Ltd.
Original Assignee
Guangdong Information Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Information Network Co Ltd
Priority to CN201610524292.8A
Publication of CN106202346A
Application granted
Publication of CN106202346B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/254: Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Abstract

The invention discloses a data loading and cleaning engine, scheduling and storage system comprising a data source, a data warehouse, and a user display module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module, and an ETL task module. The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module, and a metadata store MDR. The invention is practical, convenient for data management, highly flexible, easy to deploy widely, and efficient at data processing with high throughput; it can accommodate additional data sources and support additional analysis requirements.

Description

Data loading and cleaning engine, scheduling and storage system
Technical field
The invention belongs to the field of computer technology, and in particular relates to a data loading and cleaning engine, scheduling and storage system.
Background
With the rapid development of big data technology and the advance of informatization, the volume of data accumulated by human society already exceeds the total of the previous 5,000 years, and the amount of data being collected, stored, processed, and disseminated grows daily. By sharing data, an enterprise lets more people make fuller use of existing data resources and reduces duplicated work such as data gathering and data collection, along with the associated costs. In practice, however, data supplied by different users may come from different channels, so data content, data formats, and data quality vary widely. Sometimes a data format cannot be converted at all, or information is lost after conversion. Such problems seriously hinder the flow and sharing of data across departments and software systems. Effective integrated management of massive data has therefore become an inevitable choice for strengthening the competitiveness of commercial banks.
In recent years, with the development of big data processing technologies such as Hadoop and Spark, data has drawn wide attention as a strategic resource as important as water and petroleum. At present, massive data is stored mainly in traditional SQL databases, which differ considerably from the NoSQL databases used by big data technology. Because data is also highly diverse, it must be imported into the big data platform's own storage system before the platform can process it, and the import usually requires ETL processing to extract, clean, and load the various kinds of data. Traditional ETL systems mainly run on a single machine; distributed ETL processing also exists, but it mainly targets multitask scenarios. These traditional ETL systems are functionally mature, but with large data volumes their processing speed struggles to meet demand and their functionality deviates from what is needed in many respects, leaving the traditional ETL approach overburdened.
Summary of the invention
The purpose of the present invention is to solve the above technical problems in the prior art by providing a data loading and cleaning engine, scheduling and storage system that is practical, convenient for data management, highly flexible, easy to deploy widely, and efficient at data processing with high throughput, and that can accommodate additional data sources and support additional analysis requirements.
To solve the above technical problems, the present invention adopts the following technical solution:
A kind of data load cleaning engine, scheduling and storage system, it is characterised in that: including data source, data warehouse and User's display module, data warehouse are connected with ETL management module, and ETL management module includes ETL scheduler module, ETL monitoring mould Block, quality of data module and ETL task module, ETL scheduler module are used to control the operation of all ETL tasks, ETL monitoring module For the operation of tracing and monitoring ETL task, quality of data module is used for the quality of data in tracking data warehouse, ETL task module For completing specific ETL process work;Data warehouse includes interface document area, detail data working area SSA, detail data SOR, Data Mart, Data Summary module, feedback module and metadata store MDR, detail data SOR connection Data Summary mould Block, Data Summary module connect feedback module, and for storing and processing interface document, file interface area is connected in file interface area Authority setting module, authority setting module are specific according to its to each catalogue for organizing according to specific bibliographic structure Purposes setting to the access authority of different user, ETL management module is interacted and is cooperated centered on metadata, from data Data are extracted in source, then carry out biography conversion, cleaning and load, according to the data warehouse model defined, are loaded data into In data warehouse, meet renewing for data integration well, realizes summarizing and distributing for the data between each business;
The detail data staging area SSA is connected to a verification module; the verification module is connected to a lookup module, and the lookup module is connected to the detail data SOR. The verification module is also connected to a processing module, which is connected to the detail data SOR. The detail data SOR is connected to a partition exchange module. The metadata store MDR holds information about the processes and data in the data warehouse and is connected to a metadata management module. The data mart is connected to a multidimensional cube module, which stores multidimensional data. The data warehouse and the data mart are stored in one TDH data group, in which the different kinds of data are distinguished by different home regions: the data mart is stored in the 3D vision region and is used to analyze multidimensional data, and the multidimensional cube module is stored in the integrated region.
The partition exchange module uses two partitioning schemes, "partition elimination" and "divide and conquer". They reduce the impact of data import operations on users' real-time data access; the mode of operation resembles using a hot-swappable hard disk and is easy to use. In terms of performance, because the system stores massive data, partition elimination effectively improves query performance. In terms of manageability and availability, divide and conquer makes operations such as data deletion and data backup more complete and efficient, confines a failure produced by a task to its partition, and effectively shortens recovery time.
Because every tool and system generates its own metadata, the metadata management module stores this metadata centrally in the metadata store MDR as far as possible. The metadata store MDR is a shared place where users can access metadata centrally; the metadata itself is still maintained in the systems and tools that generate it.
The user display module is connected to a query module, which displays business content according to user requirements. The system is practical, convenient for data management, highly flexible, easy to deploy widely, and efficient at data processing with high throughput; it can accommodate additional data sources and support additional analysis requirements.
Further, the ETL scheduling module is connected to a time setting module, which sets when each task is executed so that every task runs automatically at its specified time. Task execution periods vary greatly: some are defined by a time interval and others by a fixed time. Through the time setting module, the system builds a scheduling linked list in which each node holds the task's scheduling information and its next execution time. The list is always kept sorted by next execution time in ascending order, which improves scheduling efficiency and lets the system handle a large number of tasks.
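The sorted scheduling list described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the names ScheduleNode and ScheduleList are hypothetical, a Python list kept in order with bisect.insort stands in for the linked list, and every task is assumed to repeat at a fixed interval.

```python
import bisect
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(order=True)
class ScheduleNode:
    # Nodes compare by next_run only, so the list stays ordered by time.
    next_run: datetime
    task_name: str = field(compare=False)
    interval: timedelta = field(compare=False)


class ScheduleList:
    """Keeps nodes sorted by next execution time, earliest first."""

    def __init__(self):
        self._nodes = []

    def add(self, node):
        bisect.insort(self._nodes, node)

    def pop_due(self, now):
        """Remove and return every node whose next_run is at or before now."""
        due = []
        while self._nodes and self._nodes[0].next_run <= now:
            due.append(self._nodes.pop(0))
        # Re-insert each task at its following execution time.
        for node in due:
            self.add(ScheduleNode(node.next_run + node.interval,
                                  node.task_name, node.interval))
        return due


start = datetime(2016, 6, 29, 21, 0)
schedule = ScheduleList()
schedule.add(ScheduleNode(start + timedelta(minutes=3), "load_orders", timedelta(minutes=3)))
schedule.add(ScheduleNode(start + timedelta(minutes=1), "clean_customers", timedelta(minutes=5)))
due = schedule.pop_due(start + timedelta(minutes=2))
print([node.task_name for node in due])  # ['clean_customers']
```

Because the earliest task is always at the head, the scheduler only has to inspect the first node to decide whether anything is due, which is the efficiency gain the text attributes to keeping the list sorted.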
Further, the ETL monitoring module is connected to a fault handling module, which is connected to the ETL scheduling module. When a task runs into an error or failure, the fault handling module can reassign the task so that the system keeps running.
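One way to picture task reassignment on failure is the sketch below. The patent does not say how reassignment is carried out, so the strategy shown (try the next available worker in turn) and all names here are assumptions made for illustration.

```python
def run_with_reassignment(task, workers):
    """Try the task on each worker in turn; return (worker name, result)
    from the first worker that succeeds."""
    errors = []
    for name, worker in workers:
        try:
            return name, worker(task)
        except Exception as exc:
            # Record the failed run, then hand the task to the next worker.
            errors.append((name, str(exc)))
    raise RuntimeError(f"task {task!r} failed on all workers: {errors}")


def broken_worker(task):
    # Hypothetical worker that always fails.
    raise IOError("connection lost")


def healthy_worker(task):
    # Hypothetical worker that always succeeds.
    return f"{task} done"


result = run_with_reassignment(
    "load_orders", [("node-a", broken_worker), ("node-b", healthy_worker)]
)
print(result)  # ('node-b', 'load_orders done')
```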
Further, the ETL task module is connected to a graphics module, which converts the running status of tasks into visual charts that are intuitive and clear.
Further, the data processing tool in the interface file area is mainly Kettle. The interface file area is organized under a Unix system according to a specific directory structure, and the permission setting module sets, for each directory, the access permissions of different users according to that directory's purpose, so that the directories are independent of each other and clearly separated.
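A directory layout with per-directory permissions on a Unix system might look like the following sketch. The directory names and permission bits are hypothetical; the patent only states that each directory receives access permissions according to its purpose.

```python
import os
import stat
import tempfile

# Hypothetical layout: directory name -> Unix permission bits.
LAYOUT = {
    "incoming": 0o770,    # source systems (group) may drop new interface files
    "processing": 0o700,  # only the ETL account works here
    "archive": 0o750,     # group may read processed files but not change them
}


def create_interface_area(root):
    """Create each directory and apply its permission bits; return the bits read back."""
    for name, mode in LAYOUT.items():
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)
        os.chmod(path, mode)  # explicit chmod, so the process umask does not interfere
    return {
        name: stat.S_IMODE(os.stat(os.path.join(root, name)).st_mode)
        for name in LAYOUT
    }


root = tempfile.mkdtemp()
modes = create_interface_area(root)
print({name: oct(mode) for name, mode in modes.items()})
```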
Further, the detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF). It stores the most detailed level of data in the data warehouse, organized by subject area through the partition exchange module. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model; it is flexible enough to accommodate additional data sources and support additional analysis requirements, which widens the system's scope of application.
Further, the detail data SOR is connected to a BDW upgrade and update module, which supports further upgrades and updates of BDW.
Further, the ETL management module uses Microsoft's DTS component. The data source connections of the ETL process are defined through the standard interfaces OLE DB or ODBC; the extraction, cleaning, and transformation methods are defined with DTS's built-in extraction rules or with T-SQL scripts; and all ETL operations of the data warehouse are designed and completed with the DTS tool of Microsoft SQL Server.
Further, the data mart has a star or snowflake structure. The data mart is a subset of the data warehouse and can be called a "small data warehouse"; data mart applications supplement data warehouse applications. The data mart holds analysis-oriented multidimensional data and stores precomputed data for specific users to meet their particular needs. It is independent, fast and convenient to access, and unaffected by ongoing updates to the system.
By adopting the above technical solution, the present invention has the following beneficial effects:
The present invention quickly achieves automatic, reliable data acquisition, transmission, transformation, and loading. ETL processing is fast and can handle large data volumes, ETL tasks become easier to carry out, and multiple tasks can run at once, independently and without interfering with each other. The invention lowers the cost and improves the performance of ETL data processing, and improves the manageability and availability of data. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model; it is flexible enough to accommodate additional data sources and support additional analysis requirements, greatly widening the system's scope of application. The invention is practical, convenient for data management, highly flexible, easy to deploy widely, and efficient at data processing with high throughput; it can accommodate additional data sources and support additional analysis requirements.
Brief description of the drawings
The present invention is further described below with reference to the drawings:
Fig. 1 is a schematic flow diagram of the data loading and cleaning engine, scheduling and storage system of the present invention;
Fig. 2 is a schematic flow diagram of the data warehouse in the present invention.
Detailed description of the embodiments
As shown in Fig. 1 and Fig. 2, the data loading and cleaning engine, scheduling and storage system of the present invention comprises a data source, a data warehouse, and a user display module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module, and an ETL task module. The ETL scheduling module controls the running of all ETL tasks. It is connected to a time setting module, which sets when each task is executed so that every task runs automatically at its specified time. Task execution periods vary greatly: some are defined by a time interval (for example, once every 3 minutes) and others by a fixed time (for example, starting at 21:00 every Friday night); fixed times can in turn be specified by year, month, week, day, and so on. Through the time setting module, the system builds a scheduling linked list in which each node holds the task's scheduling information and its next execution time. The list is always kept sorted by next execution time in ascending order, which improves scheduling efficiency and lets the system handle a large number of tasks. The ETL monitoring module tracks and monitors the running of ETL tasks. It is connected to a fault handling module, which is connected to the ETL scheduling module; when a task runs into an error or failure, the fault handling module can reassign the task so that the system keeps running. The data quality module tracks data quality in the data warehouse. The ETL task module carries out the specific ETL processing work and is connected to a graphics module, which converts the running status of tasks into visual charts that are intuitive and clear.
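The two kinds of schedule mentioned above, a repeating interval and a fixed weekly time, each reduce to computing a "next execution time". A minimal sketch follows, using the text's own examples (every 3 minutes; Fridays at 21:00). The function names are hypothetical, and only the weekly case of fixed-time scheduling is shown.

```python
from datetime import datetime, timedelta


def next_interval_run(last_run, every):
    """Interval schedule: the next run is simply one interval after the last."""
    return last_run + every


def next_weekly_run(now, weekday, hour, minute=0):
    """Fixed weekly schedule. weekday follows datetime.weekday():
    Monday is 0 and Friday is 4."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # This week's slot has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate


now = datetime(2016, 6, 29, 10, 30)  # a Wednesday
print(next_interval_run(now, timedelta(minutes=3)))  # 2016-06-29 10:33:00
print(next_weekly_run(now, weekday=4, hour=21))      # 2016-07-01 21:00:00
```

Either function yields the value a scheduling node would store as its next execution time, so both schedule types can share one time-ordered list.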
The ETL management module uses Microsoft's DTS component. The data source connections of the ETL process are defined through the standard interfaces OLE DB or ODBC; the extraction, cleaning, and transformation methods are defined with DTS's built-in extraction rules or with T-SQL scripts; and all ETL operations of the data warehouse are designed and completed with the DTS tool of Microsoft SQL Server. Once a DTS package has been designed with the DTS component, it can be executed once or set up for automatic scheduling, and its execution needs no manual intervention. For the convenience of system administrators, the execution and scheduling of the back-end DTS packages are exposed through ASP technology as a B/S (browser/server) mode user interface. System administrators therefore do not have to manage and maintain the data warehouse's ETL on the server itself; they can perform management and maintenance from any other location, which eases management and improves working efficiency. The ETL management module interacts and cooperates around metadata: it extracts data from the data source, then transmits, transforms, cleans, and loads the data into the data warehouse according to the defined data warehouse model, which serves data integration well and realizes the aggregation and distribution of data among the business lines.
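The extract, clean, and load sequence described above can be illustrated with a small self-contained sketch. It does not use DTS, OLE DB, ODBC, or T-SQL: Python's built-in sqlite3 module stands in for both the source system and the warehouse, and the table names, column names, and cleaning rules are assumptions chosen only for illustration.

```python
import sqlite3


def run_etl(source, warehouse):
    """Extract raw rows, clean them, and load them into the warehouse table."""
    # Extract: read raw rows from the source system.
    rows = source.execute("SELECT id, name, amount FROM raw_orders").fetchall()
    # Clean/transform: trim and normalise names, drop rows with no amount,
    # and convert amounts from text to float.
    cleaned = [
        (row_id, name.strip().title(), float(amount))
        for row_id, name, amount in rows
        if amount is not None
    ]
    # Load: write the cleaned rows into the warehouse.
    warehouse.executemany(
        "INSERT INTO orders (id, name, amount) VALUES (?, ?, ?)", cleaned
    )
    warehouse.commit()
    return len(cleaned)


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE raw_orders (id INTEGER, name TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "  alice ", "10.5"), (2, "BOB", None), (3, "carol", "7")],
)
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")

loaded = run_etl(source, warehouse)
print(loaded, warehouse.execute("SELECT * FROM orders ORDER BY id").fetchall())
# 2 [(1, 'Alice', 10.5), (3, 'Carol', 7.0)]
```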
The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module, and a metadata store MDR. The detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module. The interface file area stores and processes interface files and is connected to a permission setting module. The interface file area is organized under a Unix system according to a specific directory structure, and the permission setting module sets, for each directory, the access permissions of different users according to that directory's purpose. The data processing tool in the interface file area is mainly Kettle. The directories are independent of each other, do not interfere with each other, and are clearly separated, which ensures that access is valid. The detail data staging area SSA is connected to a verification module; the verification module is connected to a lookup module, and the lookup module is connected to the detail data SOR. The verification module is also connected to a processing module, which is connected to the detail data SOR. The detail data staging area SSA stores data temporarily: supported interface files are loaded into the database there. The verification module uses the lookup module to compare the data already in the detail data SOR with the newly loaded data; once verification passes, the processing module integrates the newly loaded data into the detail data SOR.
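The staging, verification, and integration flow can be sketched as below. The patent does not specify what the comparison checks, so this sketch assumes rows are keyed by an identifier and that verification simply classifies each staged row as new, changed, or unchanged relative to the SOR; the function names and sample data are hypothetical.

```python
def verify(staged, sor):
    """Classify staged rows by looking each key up in the SOR."""
    verdict = {"new": [], "changed": [], "unchanged": []}
    for key, row in staged.items():
        existing = sor.get(key)  # the lookup step
        if existing is None:
            verdict["new"].append(key)
        elif existing != row:
            verdict["changed"].append(key)
        else:
            verdict["unchanged"].append(key)
    return verdict


def integrate(staged, sor, verdict):
    """Processing step: merge new and changed rows into the SOR."""
    for key in verdict["new"] + verdict["changed"]:
        sor[key] = staged[key]
    return sor


sor = {1: ("Alice", "Guangzhou"), 2: ("Bob", "Hangzhou")}
staged = {2: ("Bob", "Shenzhen"), 3: ("Carol", "Guangzhou"), 1: ("Alice", "Guangzhou")}
verdict = verify(staged, sor)
integrate(staged, sor, verdict)
print(verdict)  # {'new': [3], 'changed': [2], 'unchanged': [1]}
```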
The detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF). It stores the most detailed level of data in the data warehouse. The detail data SOR is connected to a partition exchange module, through which the data is organized by subject area. The partition exchange module uses two partitioning mechanisms, "partition elimination" and "divide and conquer". They reduce the impact of data import operations on users' real-time data access; the mode of operation resembles using a hot-swappable hard disk and is easy to use. In terms of performance, because the system stores massive data, partition elimination effectively improves query performance. In terms of manageability and availability, divide and conquer makes operations such as data deletion and data backup more complete and efficient, confines a failure produced by a task to its partition, and effectively shortens recovery time. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model; it is flexible enough to accommodate additional data sources and support additional analysis requirements, which widens the system's scope of application. The detail data SOR is connected to a BDW upgrade and update module, which supports further upgrades and updates of BDW.
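The hot-swap behaviour of partition exchange, and the idea of skipping partitions a query does not need, can be illustrated with an in-memory toy model. This is an interpretation rather than the patent's mechanism: the PartitionedTable class is hypothetical, partitions are assumed to be keyed by month, and real databases implement both features inside the storage engine.

```python
class PartitionedTable:
    """Toy table whose rows are grouped into independently managed partitions."""

    def __init__(self):
        self.partitions = {}  # partition key (e.g. "2016-06") -> list of rows

    def exchange(self, key, staged_rows):
        """Swap a fully prepared partition in and return the one it replaced.
        The swap is one assignment, so readers never see a half-loaded partition."""
        old = self.partitions.get(key, [])
        self.partitions[key] = staged_rows
        return old

    def query(self, keys, predicate):
        """Scan only the named partitions and skip all the others."""
        return [
            row
            for key in keys
            for row in self.partitions.get(key, [])
            if predicate(row)
        ]

    def drop(self, key):
        """Delete one partition without touching the rest."""
        return self.partitions.pop(key, [])


table = PartitionedTable()
table.exchange("2016-05", [{"id": 1, "amount": 100}])
table.exchange("2016-06", [{"id": 2, "amount": 50}])
# Prepare a corrected June load offline, then swap it in with one call.
replaced = table.exchange("2016-06", [{"id": 2, "amount": 55}, {"id": 3, "amount": 20}])
hits = table.query(["2016-06"], lambda row: row["amount"] > 30)
print(replaced, hits)  # [{'id': 2, 'amount': 50}] [{'id': 2, 'amount': 55}]
```

Because each partition is replaced or dropped as a unit, a failed load affects only the partition being prepared, which matches the text's point about confining failures and shortening recovery.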
The metadata store MDR holds information about the processes and data in the data warehouse; the information about data includes logs, data dictionaries, configuration information, and so on. The metadata store MDR is connected to a metadata management module. Because every tool and system generates its own metadata, the metadata management module stores this metadata centrally in the metadata store MDR as far as possible. The metadata store MDR is a shared place where users can access metadata centrally; the metadata itself is still maintained in the systems and tools that generate it.
The data mart is connected to a multidimensional cube module. The data warehouse and the data mart are stored in one TDH data group, in which the different kinds of data are distinguished by different home regions: the data mart is stored in the 3D vision region and is used to analyze multidimensional data, and the multidimensional cube module is stored in the integrated region and is used to store multidimensional data. The data mart has a star or snowflake structure. It is a subset of the data warehouse and can be called a "small data warehouse"; data mart applications supplement data warehouse applications. The data mart holds analysis-oriented multidimensional data and stores precomputed data for specific users to meet their particular needs. It is independent, fast and convenient to access, and unaffected by ongoing updates to the system. The data summary module is designed in denormalized form and is used to update the multidimensional data; the feedback module is based on data mining results.
The user display module is connected to a query module, which displays the corresponding business content according to requirements set by the user, including the business handling time, the business completion time, and detailed business content parameters. A given user can thus quickly look up the detailed business content that he or she needs.
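A star-structured data mart with a precomputed summary can be sketched with Python's built-in sqlite3 module. The fact and dimension tables, their columns, and the sample figures are all hypothetical; the sketch only shows the shape of a star schema and of a denormalized, precomputed aggregate that users query instead of the detail data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Dimension tables surround one central fact table (star structure).
CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE dim_date   (date_id   INTEGER PRIMARY KEY, month  TEXT);
CREATE TABLE fact_txn   (branch_id INTEGER REFERENCES dim_branch,
                         date_id   INTEGER REFERENCES dim_date,
                         amount    REAL);
INSERT INTO dim_branch VALUES (1, 'North'), (2, 'South');
INSERT INTO dim_date   VALUES (10, '2016-05'), (11, '2016-06');
INSERT INTO fact_txn   VALUES (1, 10, 100.0), (1, 11, 50.0),
                              (2, 11, 25.0),  (2, 11, 75.0);
-- Precomputed, denormalized summary that the mart serves to users.
CREATE TABLE mart_region_month AS
  SELECT b.region AS region, d.month AS month, SUM(f.amount) AS total
  FROM fact_txn f
  JOIN dim_branch b ON b.branch_id = f.branch_id
  JOIN dim_date   d ON d.date_id   = f.date_id
  GROUP BY b.region, d.month;
""")

rows = db.execute(
    "SELECT region, month, total FROM mart_region_month ORDER BY region, month"
).fetchall()
print(rows)
# [('North', '2016-05', 100.0), ('North', '2016-06', 50.0), ('South', '2016-06', 100.0)]
```

Queries against mart_region_month read a handful of summary rows rather than joining the fact table each time, which is the sense in which the mart offers fast access to precomputed data.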
The present invention quickly achieves automatic, reliable data acquisition, transmission, transformation, and loading. ETL processing is fast and can handle large data volumes, ETL tasks become easier to carry out, and multiple tasks can run at once, independently and without interfering with each other. The invention lowers the cost and improves the performance of ETL data processing, and improves the manageability and availability of data. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model; it is flexible enough to accommodate additional data sources and support additional analysis requirements, greatly widening the system's scope of application. The invention is practical, convenient for data management, highly flexible, easy to deploy widely, and efficient at data processing with high throughput; it can accommodate additional data sources and support additional analysis requirements.
The above are only specific embodiments of the present invention, but the technical features of the invention are not limited to them. Any simple change, equivalent replacement, or modification made on the basis of the present invention to solve essentially the same technical problem and achieve essentially the same technical effect falls within the scope of protection of the present invention.

Claims (8)

1. A data loading and cleaning engine, scheduling and storage system, characterized in that it comprises a data source, a data warehouse, and a user display module; the data warehouse is connected to an ETL management module; the ETL management module includes an ETL scheduling module, an ETL monitoring module, a data quality module, and an ETL task module; the ETL scheduling module is used to control the running of all ETL tasks; the ETL monitoring module is used to track and monitor the running of ETL tasks; the data quality module is used to track data quality in the data warehouse; the ETL task module is used to complete the specific ETL processing work; the ETL monitoring module is connected to a fault handling module, and the fault handling module is connected to the ETL scheduling module;
the data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module, and a metadata store MDR; the detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module; the interface file area is used to store and process interface files and is connected to a permission setting module; the permission setting module is used to organize the area according to a specific directory structure and to set, for each directory, the access permissions of different users according to that directory's purpose;
the detail data staging area SSA is connected to a verification module; the verification module is connected to a lookup module, and the lookup module is connected to the detail data SOR; the verification module is connected to a processing module, and the processing module is connected to the detail data SOR; the detail data SOR is connected to a partition exchange module; the metadata store MDR is used to hold information about the processes and data in the data warehouse and is connected to a metadata management module; the data mart is connected to a multidimensional cube module, and the multidimensional cube module is used to store multidimensional data;
the user display module is connected to a query module, and the query module is used to display business content according to user requirements.
2. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL scheduling module is connected to a time setting module.
3. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL task module is connected to a graphics module.
4. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the data processing tool in the interface file area is mainly Kettle.
5. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF).
6. The data loading and cleaning engine, scheduling and storage system according to claim 5, characterized in that the detail data SOR is connected to a BDW upgrade and update module.
7. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL management module uses Microsoft's DTS component.
8. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the data mart has a star or snowflake structure.
CN201610524292.8A 2016-06-29 2016-06-29 Data loading and cleaning engine, scheduling and storage system Active CN106202346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610524292.8A CN106202346B (en) 2016-06-29 2016-06-29 A kind of data load cleaning engine, scheduling and storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610524292.8A CN106202346B (en) 2016-06-29 2016-06-29 Data loading and cleaning engine, scheduling and storage system

Publications (2)

Publication Number Publication Date
CN106202346A CN106202346A (en) 2016-12-07
CN106202346B true CN106202346B (en) 2019-11-01

Family

ID=57465396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610524292.8A Active CN106202346B (en) 2016-06-29 2016-06-29 Data loading and cleaning engine, scheduling and storage system

Country Status (1)

Country Link
CN (1) CN106202346B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280084A (en) * 2017-01-06 2018-07-13 上海前隆信息科技有限公司 A kind of construction method of data warehouse, system and server
CN107688592B (en) * 2017-04-06 2020-03-17 平安科技(深圳)有限公司 Data cleaning method and terminal
CN107679160A (en) * 2017-09-28 2018-02-09 深圳市华傲数据技术有限公司 Data processing method and device based on chart database
CN107832451A (en) * 2017-11-23 2018-03-23 安徽科创智慧知识产权服务有限公司 A kind of big data cleaning way of simplification
CN107895032A (en) * 2017-11-23 2018-04-10 安徽科创智慧知识产权服务有限公司 Carry out the network data acquisition method that data are tentatively cleaned
CN107992552A (en) * 2017-11-28 2018-05-04 南京莱斯信息技术股份有限公司 A kind of data interchange platform and method for interchanging data
CN108196912B (en) * 2018-01-03 2021-04-23 新疆熙菱信息技术股份有限公司 Data integration method based on hot plug assembly
CN109033291A (en) * 2018-07-13 2018-12-18 深圳市小牛在线互联网信息咨询有限公司 A kind of job scheduling method, device, computer equipment and storage medium
CN109269557A (en) * 2018-09-19 2019-01-25 中国南方电网有限责任公司超高压输电公司广州局 A kind of change of current station equipment operating parameter and running environment intelligent monitor system and method
CN109669975B (en) * 2018-11-09 2020-12-18 成都数之联科技有限公司 Industrial big data processing system and method
CN109918437A (en) * 2019-03-08 2019-06-21 北京中油瑞飞信息技术有限责任公司 Distributed data processing method, apparatus and data assets management system
CN112667615B (en) * 2020-12-25 2022-02-15 广东电网有限责任公司电力科学研究院 Data cleaning system and method
CN112667472B (en) * 2020-12-28 2022-04-08 武汉达梦数据库股份有限公司 Data source connection state monitoring device and method
CN113177039B (en) * 2021-04-27 2024-02-27 中通服咨询设计研究院有限公司 Data center data cleaning system based on data fusion
CN114817393B (en) * 2022-06-24 2022-09-16 深圳市信联征信有限公司 Data extraction and cleaning method and device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101452485A (en) * 2008-12-31 2009-06-10 中国建设银行股份有限公司 Method and device for generating multidimensional cubic based on relational database
CN201600693U (en) * 2009-11-26 2010-10-06 中国移动通信集团河北有限公司 Data warehouse system
CN103577605A (en) * 2013-11-20 2014-02-12 贵州电网公司电力调度控制中心 Data warehouse based on data fusion and data mining and application method of data warehouse
CN104933160A (en) * 2015-06-26 2015-09-23 河海大学 ETL (Extract Transform and Load) framework design method for safety monitoring business analysis
CN105095327A (en) * 2014-05-23 2015-11-25 深圳市珍爱网信息技术有限公司 Distributed ELT system and scheduling method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《基于数据仓库的高校数据统计服务平台研究》;龙新征,等;《通信学报》;20130930;全文 *
IBM数据仓库解决方案简述;石油论文资料库;《豆丁》;20140413;第1-24页 *

Also Published As

Publication number Publication date
CN106202346A (en) 2016-12-07

Similar Documents

Publication Publication Date Title
CN106202346B (en) Data loading and cleaning engine, scheduling and storage system
CN105005570B (en) Magnanimity intelligent power data digging method and device based on cloud computing
US6901405B1 (en) Method for persisting a schedule and database schema
CN104050042B (en) The resource allocation methods and device of ETL operations
CN109272155A (en) A kind of corporate behavior analysis system based on big data
CN108694195B (en) Management method and system of distributed data warehouse
CN106446153A (en) Distributed newSQL database system and method
CN104111996A (en) Health insurance outpatient clinic big data extraction system and method based on hadoop platform
CN107766402A (en) A kind of building dictionary cloud source of houses big data platform
CN102609446B (en) Distributed Bloom filter system and application method thereof
WO2016025924A1 (en) Systems and methods for auto-scaling a big data system
CN101566981A (en) Method for establishing dynamic virtual data base in analyzing and processing system
CN107463595A (en) A kind of data processing method and system based on Spark
CN106599197A (en) Data acquisition and exchange engine
CN106528341B (en) Automation disaster tolerance system based on Greenplum database
CN102917006B (en) A kind of unified control and management method and device realizing computational resource and object permission
CN102917025A (en) Method for business migration based on cloud computing platform
CN103246549B (en) A kind of method and system of data conversion storage
CN108009258A (en) It is a kind of can Configuration Online data collection and analysis platform
CN113721892A (en) Domain modeling method, domain modeling device, computer equipment and storage medium
CN102279891A (en) Retrieval method, device and system for concurrently searching information technology (IT) logs
US7020656B1 (en) Partition exchange loading technique for fast addition of data to a data warehousing system
CN108287889B (en) A kind of multi-source heterogeneous date storage method and system based on elastic table model
CN112287275A (en) City-class data middle platform
JP6262505B2 (en) Distributed data virtualization system, query processing method, and query processing program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191008

Address after: 510030 floor 2 and 5, building 9, No. 305, Dongfeng Middle Road, Yuexiu District, Guangzhou City, Guangdong Province

Applicant after: Guangdong Information Network Co., Ltd.

Address before: 310018, No. 2, No. 928, Xiasha Higher Education Park, Hangzhou, Zhejiang, Jianggan District

Applicant before: Zhejiang University of Technology

GR01 Patent grant