CN106202346B - A data loading and cleaning engine, scheduling and storage system
- Publication number
- CN106202346B (application number CN201610524292.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
Abstract
The invention discloses a data loading and cleaning engine, scheduling and storage system comprising a data source, a data warehouse and a user display module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module. The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata store MDR. The invention is practical, makes data management convenient, is highly flexible and easy to deploy widely, processes data efficiently with high throughput, can accommodate additional data sources, and supports additional analysis requirements.
Description
Technical field
The invention belongs to the field of computer technology, and in particular relates to a data loading and cleaning engine, scheduling and storage system.
Background technique
With the rapid development of big data technology and the advance of informatization, the volume of data accumulated by human society already exceeds the total of the past 5,000 years, and the quantity of data being collected, stored, processed and disseminated grows by the day. When an enterprise shares its data, more people can make fuller use of existing data resources, which reduces duplicated work such as data collection and data acquisition and the associated costs. In practice, however, data supplied by different users may come from different channels, so data content, data formats and data quality vary widely. Sometimes a data format cannot be converted at all, or information is lost after conversion. Such problems seriously hinder the flow and sharing of data among departments and software systems. Effective integrated management of massive data has therefore become an inevitable choice for enhancing the competitiveness of commercial banks.
In recent years, with the development of big data processing technologies such as Hadoop and Spark, data has attracted wide attention and has become a strategic resource as important as water and petroleum. At present, massive data is mainly stored in traditional SQL databases, which differ considerably from the NoSQL databases used by big data technology. Because data is also highly diverse, it must be imported into the big data platform's own storage system before the platform can process it, and the import generally requires ETL processing to extract, clean and load the various kinds of data. Traditional ETL systems mainly run on a single machine. Distributed ETL processing also exists, but it is mainly aimed at multi-task scenarios. The functions of these traditional ETL systems are fairly mature, but with large data volumes their processing speed struggles to meet demand and their functions deviate in many ways from what is required, so the traditional ETL approach is overburdened.
Summary of the invention
The purpose of the present invention is to solve the above technical problems in the prior art by providing a data loading and cleaning engine, scheduling and storage system that is practical, makes data management convenient, is highly flexible and easy to deploy widely, processes data efficiently with high throughput, can accommodate additional data sources, and supports additional analysis requirements.
In order to solve the above-mentioned technical problem, the present invention adopts the following technical scheme:
A data loading and cleaning engine, scheduling and storage system, characterized in that it comprises a data source, a data warehouse and a user display module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module. The ETL scheduling module controls the running of all ETL tasks, the ETL monitoring module tracks and monitors the running of ETL tasks, the data quality module tracks the quality of the data in the data warehouse, and the ETL task module carries out the specific ETL processing work. The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata store MDR. The detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module. The interface file area stores and processes interface files and is connected to a permission setting module. The interface file area is organized according to a specific directory structure, and the permission setting module sets the access rights of different users to each directory according to that directory's specific purpose. The ETL management module interacts and cooperates with the other components around the metadata: it extracts data from the data source, then transfers, converts, cleans and loads the data into the data warehouse according to the defined data warehouse model. This meets the ongoing needs of data integration well and realizes the aggregation and distribution of data among the business lines.
The detail data staging area SSA is connected to a verification module. The verification module is connected to a lookup module, which is connected to the detail data SOR, and to a processing module, which is also connected to the detail data SOR. The detail data SOR is connected to a partition exchange module. The metadata store MDR saves information about the processes and data in the data warehouse and is connected to a metadata management module. The data mart is connected to a multidimensional cube module, which stores multidimensional data. The data warehouse and the data mart are stored in one TDH data group, in which different data are distinguished by different home zones: the data mart is stored in the 3D vision region and is used for analyzing multidimensional data, and the multidimensional cube module is stored in the integrated region. The partition exchange module uses two partitioning schemes, "partition ignoring" and "divide and conquer", which reduce the impact of data import operations on users' real-time access to data. Operating it is like using a hot-swappable hard disk, which makes it easy to use. In terms of performance, because the system stores massive data, "partition ignoring" effectively improves query performance. The manageability and availability of data also improve: operations such as data deletion and data backup are handled by "divide and conquer" for more complete and efficient management, a failure caused by a task is confined to its partition, and recovery time is effectively shortened. Because every tool and system generates its own metadata, the metadata management module stores these metadata centrally in the metadata store MDR as far as possible. The metadata store MDR is a shared place where users can access metadata centrally, while the metadata are still actually maintained in the systems and tools that generate them. The user display module is connected to a query module, which displays business content according to user requirements. The system is practical, makes data management convenient, is highly flexible and easy to deploy widely, processes data efficiently with high throughput, can accommodate additional data sources, and supports additional analysis requirements.
Further, the ETL scheduling module is connected to a time setting module, which sets when each task is to be executed so that each task runs automatically at its specified time. Task execution periods vary greatly: some tasks define a time interval, while others define a fixed time. Through the time setting module, a scheduling linked list is established in the system. Each node in the list contains the task's scheduling information and its next execution time, and the list is always kept sorted by next execution time in ascending order, which improves scheduling efficiency and allows the system to handle a large number of tasks.
Further, the ETL monitoring module is connected to a fault handling module, which is connected to the ETL scheduling module. When a task runs into an error or fails, the fault handling module can reassign the task so that the system keeps running.
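The reassignment behaviour described above can be illustrated with a minimal Python sketch. The patent does not say how tasks are reassigned, so everything here is an assumption made for illustration: the function name `run_with_reassignment`, the idea of trying the next worker in a fixed list, and the `"FAILED"` marker for tasks that fail everywhere.

```python
from collections import deque
from typing import Callable, Dict, List


def run_with_reassignment(
    tasks: Dict[str, Callable[[str], None]],
    workers: List[str],
) -> Dict[str, str]:
    """Run each task, moving it to the next worker whenever it fails.

    Returns a mapping of task name to the worker that completed it.
    Tasks that fail on every worker are reported as "FAILED".
    """
    completed: Dict[str, str] = {}
    # Each queue entry is (task name, index of the worker to try next).
    queue = deque((name, 0) for name in tasks)
    while queue:
        name, idx = queue.popleft()
        if idx >= len(workers):
            completed[name] = "FAILED"  # every worker has been tried
            continue
        worker = workers[idx]
        try:
            tasks[name](worker)
            completed[name] = worker
        except Exception:
            # Fault handling (hypothetical policy): put the task back at the
            # end of the queue so the other tasks keep running meanwhile.
            queue.append((name, idx + 1))
    return completed
```

A failed task goes to the back of the queue rather than being retried immediately, so one faulty task does not hold up the rest of the run.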
Further, the ETL task module is connected to a graphics module, which converts the running status of tasks into visual graphics that are intuitive and clear.
Further, the main data processing tool of the interface file area is Kettle. The interface file area is organized under a Unix system according to a specific directory structure, and the permission setting module sets the access rights of different users to each directory according to that directory's specific purpose, so that the directories are independent of one another and clearly partitioned.
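As a rough illustration of a directory structure with per-directory access rights under Unix, the following Python sketch creates three directories and sets their permission bits. The directory names (`incoming`, `processing`, `archive`) and the chosen modes are hypothetical; the patent does not list the actual layout.

```python
import os
import stat
import tempfile

# Hypothetical layout: directory name -> Unix permission bits.
LAYOUT = {
    "incoming": 0o770,    # owner and group may drop and read interface files
    "processing": 0o700,  # only the ETL account touches files in flight
    "archive": 0o750,     # group may read processed files but not change them
}


def create_interface_area(root: str) -> dict:
    """Create the directories under `root` and return the modes actually set."""
    modes = {}
    for name, mode in LAYOUT.items():
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)
        # chmod explicitly: the mode passed to makedirs is subject to the umask.
        os.chmod(path, mode)
        modes[name] = stat.S_IMODE(os.stat(path).st_mode)
    return modes


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()
    for name, mode in create_interface_area(demo_root).items():
        print(name, oct(mode))
```

Giving each directory a single purpose and its own mode is one simple way to keep the areas independent of one another.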
Further, the detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF). The detail data SOR stores the most detailed level of data in the data warehouse, organized by subject area through the partition exchange module. As the enterprise data model, the detail data SOR is the core of the entire data warehouse data model. It is flexible enough to accommodate additional data sources and support additional analysis requirements, which broadens the scope of application of the system.
Further, the detail data SOR is connected to a BDW upgrade and update module, through which further upgrades and updates of BDW can be supported.
Further, the ETL management module uses Microsoft's DTS component. The data source connections of the ETL process are defined through the standard interfaces OLE DB or ODBC; data extraction, cleaning and conversion methods are defined with the extraction rules built into DTS or with T-SQL scripts; and all ETL operations of the data warehouse are designed and completed with the Microsoft SQL Server DTS tool.
Further, the data mart has a star or snowflake schema. The data mart is a subset of the data warehouse and can be called a "small data warehouse"; its application supplements the application of the data warehouse. The data mart holds analysis-oriented multidimensional data and stores precomputed data for specific users to meet their particular needs. It is independent, quick and convenient to access, and unaffected by ongoing updates to the system.
By adopting the above technical solution, the present invention has the following advantages:
The present invention quickly realizes automatic and reliable data acquisition, transmission, conversion and loading. ETL processing is fast and can handle large data volumes, ETL tasks become easier to execute, and multiple tasks can run independently without affecting one another. The cost of ETL data processing is reduced, its performance is improved, and the manageability and availability of data are improved. As the enterprise data model, the detail data SOR is the core of the entire data warehouse data model; it is flexible enough to accommodate additional data sources and support additional analysis requirements, which greatly broadens the scope of application of the system. The invention is practical, makes data management convenient, is highly flexible and easy to deploy widely, processes data efficiently with high throughput, can accommodate additional data sources, and supports additional analysis requirements.
Brief description of the drawings
The present invention will be further explained below with reference to the attached drawings:
Fig. 1 is a schematic flow diagram of the data loading and cleaning engine, scheduling and storage system of the present invention;
Fig. 2 is a schematic flow diagram of the data warehouse in the present invention.
Specific embodiment
As shown in Fig. 1 and Fig. 2, the data loading and cleaning engine, scheduling and storage system of the present invention comprises a data source, a data warehouse and a user display module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module. The ETL scheduling module controls the running of all ETL tasks. It is connected to a time setting module, which sets when each task is to be executed so that each task runs automatically at its specified time. Task execution periods vary greatly: some tasks define a time interval (for example, run once every 3 minutes), while others define a fixed time (for example, start at 21:00 every Friday night). Fixed times can in turn be specified by year, month, week, day and so on. Through the time setting module, a scheduling linked list is established in the system. Each node in the list contains the task's scheduling information and its next execution time, and the list is always kept sorted by next execution time in ascending order, which improves scheduling efficiency and allows the system to handle a large number of tasks. The ETL monitoring module tracks and monitors the running of ETL tasks. It is connected to a fault handling module, which is connected to the ETL scheduling module; when a task runs into an error or fails, the fault handling module can reassign the task so that the system keeps running. The data quality module tracks the quality of the data in the data warehouse. The ETL task module carries out the specific ETL processing work and is connected to a graphics module, which converts the running status of tasks into visual graphics that are intuitive and clear.
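The scheduling list described above, whose nodes hold a task's scheduling information and its next execution time and which stays sorted by that time, can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a sorted Python list with `bisect.insort` in place of a hand-written linked list, and the names `ScheduleNode`, `ScheduleList`, `add`, `peek` and `pop_due` are hypothetical.

```python
import bisect
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(order=True)
class ScheduleNode:
    """One entry of the scheduling list; ordering uses next_run only."""
    next_run: datetime
    name: str = field(compare=False)
    # None means a one-off task; otherwise a positive repeat interval.
    interval: Optional[timedelta] = field(default=None, compare=False)


class ScheduleList:
    """Keeps nodes sorted by next execution time, earliest first."""

    def __init__(self) -> None:
        self._nodes: List[ScheduleNode] = []

    def add(self, node: ScheduleNode) -> None:
        # insort keeps the list ordered, so the head is always the next task.
        bisect.insort(self._nodes, node)

    def peek(self) -> Optional[ScheduleNode]:
        return self._nodes[0] if self._nodes else None

    def pop_due(self, now: datetime) -> List[ScheduleNode]:
        """Remove and return all nodes due at `now`; reschedule repeating ones."""
        due: List[ScheduleNode] = []
        while self._nodes and self._nodes[0].next_run <= now:
            due.append(self._nodes.pop(0))
        for node in due:
            if node.interval is not None:
                next_run = node.next_run + node.interval
                while next_run <= now:  # skip runs that were missed entirely
                    next_run += node.interval
                self.add(ScheduleNode(next_run, node.name, node.interval))
        return due
```

Because the list is ordered, the scheduler only has to look at the head to decide whether anything is due, however many tasks are registered. In this sketch the "every 3 minutes" case corresponds to `interval=timedelta(minutes=3)` and the weekly case to `interval=timedelta(weeks=1)`; calendar rules such as "monthly" would need more than a fixed interval.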
The ETL management module uses Microsoft's DTS component. The data source connections of the ETL process are defined through the standard interfaces OLE DB or ODBC; data extraction, cleaning and conversion methods are defined with the extraction rules built into DTS or with T-SQL scripts; and all ETL operations of the data warehouse are designed and completed with the Microsoft SQL Server DTS tool. Once a DTS package has been designed with the DTS component, the package can be executed once or set to be scheduled automatically, and its execution requires no manual intervention. For the convenience of system administrators, the execution and scheduling of the background DTS packages are exposed through ASP technology as a B/S (browser/server) mode user interface. System administrators therefore need not manage and maintain the data warehouse ETL on the server itself; they can complete management and maintenance operations from any other location, which simplifies management and improves working efficiency. The ETL management module interacts and cooperates with the other components around the metadata: it extracts data from the data source, then transfers, converts, cleans and loads the data into the data warehouse according to the defined data warehouse model. This meets the ongoing needs of data integration well and realizes the aggregation and distribution of data among the business lines.
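The extract, clean and load sequence can be illustrated with a short Python sketch. It does not use DTS or T-SQL: as a stand-in it uses Python's built-in `sqlite3` module for both the source and the warehouse, and the table names (`orders`, `fact_orders`), the columns and the cleaning rules are all hypothetical.

```python
import sqlite3


def extract(source: sqlite3.Connection) -> list:
    """Pull the raw rows from the (hypothetical) source table."""
    return source.execute("SELECT id, name, amount FROM orders").fetchall()


def clean(rows: list) -> list:
    """Drop incomplete records and normalise the remaining values."""
    cleaned = []
    for row_id, name, amount in rows:
        if name is None or amount is None:
            continue  # assumed rule: incomplete records are discarded
        cleaned.append((row_id, name.strip().upper(), round(float(amount), 2)))
    return cleaned


def load(warehouse: sqlite3.Connection, rows: list) -> int:
    """Write cleaned rows into the warehouse table and return the row count."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    # INSERT OR REPLACE makes a re-run of the same batch idempotent.
    warehouse.executemany("INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def run_etl(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    return load(warehouse, clean(extract(source)))
```

Keeping the three stages as separate functions mirrors the way a package separates connection definitions, transformation rules and the load target, so each stage can be changed or tested on its own.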
The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata store MDR. The detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module. The interface file area stores and processes interface files and is connected to a permission setting module. The interface file area is organized under a Unix system according to a specific directory structure, and the permission setting module sets the access rights of different users to each directory according to that directory's specific purpose. The main data processing tool of the interface file area is Kettle. The directories are independent of one another, do not affect one another and are clearly partitioned, which ensures valid access. The detail data staging area SSA is connected to a verification module. The verification module is connected to a lookup module, which is connected to the detail data SOR, and to a processing module, which is also connected to the detail data SOR. The detail data staging area SSA loads the supported interface files into the database for temporary storage. Using the lookup module, the verification module compares the data already in the detail data SOR with the newly loaded data; once verification passes, the processing module integrates the newly loaded data into the detail data SOR.
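The comparison of newly loaded staging data against existing SOR data, followed by integration, can be sketched as below. The patent does not define what passing the check means, so this Python sketch assumes a simple rule: a staged record passes if it is new or differs from the stored record. The function names and the dictionary-based SOR are hypothetical.

```python
from typing import Dict, List, Optional


def lookup(sor: Dict[int, dict], key: int) -> Optional[dict]:
    """Lookup step: find the existing SOR record for a key, if any."""
    return sor.get(key)


def verify(existing: Optional[dict], incoming: dict) -> bool:
    """Assumed rule: a record passes if it is new or has changed."""
    return existing is None or existing != incoming


def integrate_staged(staged: List[dict], sor: Dict[int, dict], key: str = "id") -> Dict[str, int]:
    """Processing step: merge verified staged records into the SOR."""
    stats = {"inserted": 0, "updated": 0, "unchanged": 0}
    for record in staged:
        existing = lookup(sor, record[key])
        if not verify(existing, record):
            stats["unchanged"] += 1  # identical record, nothing to integrate
            continue
        stats["inserted" if existing is None else "updated"] += 1
        sor[record[key]] = dict(record)  # copy so staging rows stay untouched
    return stats
```

Comparing before writing means that re-loading the same interface file leaves the SOR unchanged, which keeps repeated loads harmless.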
The detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF), and it stores the most detailed level of data in the data warehouse. The detail data SOR is connected to a partition exchange module, through which the data are organized by subject area. The partition exchange module uses two partitioning mechanisms, "partition ignoring" and "divide and conquer", which reduce the impact of data import operations on users' real-time access to data. Operating it is like using a hot-swappable hard disk, which makes it easy to use. In terms of performance, because the system stores massive data, "partition ignoring" effectively improves query performance. The manageability and availability of data also improve: operations such as data deletion and data backup are handled by "divide and conquer" for more complete and efficient management, a failure caused by a task is confined to its partition, and recovery time is effectively shortened. As the enterprise data model, the detail data SOR is the core of the entire data warehouse data model. It is flexible enough to accommodate additional data sources and support additional analysis requirements, which broadens the scope of application of the system. The detail data SOR is connected to a BDW upgrade and update module, through which further upgrades and updates of BDW can be supported.
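The two partitioning ideas described above can be illustrated with a toy in-memory table in Python. This is a sketch under assumptions, not the patented mechanism: `PartitionedTable` and its methods are hypothetical names, partitions are keyed by an arbitrary string such as a month, and a real database would perform the swap at the storage level.

```python
from typing import Callable, Dict, List, Tuple


class PartitionedTable:
    """Toy table whose rows are grouped into independently managed partitions."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[dict]] = {}

    def exchange(self, key: str, staged_rows: List[dict]) -> List[dict]:
        """Swap a fully prepared batch in as partition `key`; return the old rows.

        The batch is built elsewhere and attached in one step, so readers of
        other partitions are not affected while the import is prepared.
        """
        old = self._partitions.get(key, [])
        self._partitions[key] = list(staged_rows)
        return old

    def query(
        self,
        keys: List[str],
        predicate: Callable[[dict], bool] = lambda row: True,
    ) -> Tuple[List[dict], List[str]]:
        """Scan only the named partitions; all others are ignored."""
        scanned = [k for k in keys if k in self._partitions]
        rows = [r for k in scanned for r in self._partitions[k] if predicate(r)]
        return rows, scanned

    def drop(self, key: str) -> bool:
        """Delete one partition without touching the rest of the table."""
        return self._partitions.pop(key, None) is not None
```

Here `exchange` plays the role of the hot-swap, `query` shows how ignoring irrelevant partitions limits the work of a query, and `drop` shows a maintenance action confined to one partition.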
The metadata store MDR saves information about the processes and data in the data warehouse; the information about data includes logs, the data dictionary, configuration information and so on. The metadata store MDR is connected to a metadata management module. Because every tool and system generates its own metadata, the metadata management module stores these metadata centrally in the metadata store MDR as far as possible. The metadata store MDR is a shared place where users can access metadata centrally, while the metadata are still actually maintained in the systems and tools that generate them. The data mart is connected to a multidimensional cube module. The data warehouse and the data mart are stored in one TDH data group, in which different data are distinguished by different home zones: the data mart is stored in the 3D vision region and is used for analyzing multidimensional data, and the multidimensional cube module is stored in the integrated region and is used for storing multidimensional data. The data mart has a star or snowflake schema. It is a subset of the data warehouse and can be called a "small data warehouse"; its application supplements the application of the data warehouse. The data mart holds analysis-oriented multidimensional data and stores precomputed data for specific users to meet their particular needs. It is independent, quick and convenient to access, and unaffected by ongoing updates to the system. The data summary module is designed to be denormalized and is used to update the multidimensional data, and the feedback module is based on data mining results. The user display module is connected to a query module, which displays the corresponding business content according to the requirements set by the user, including the business handling time, the business completion time, detailed business content parameters and so on. A specific user can thus quickly look up the details of the business they need.
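The idea of storing precomputed multidimensional data for quick access can be sketched as a small cube builder in Python. This is an illustration only: the function `build_cube`, the choice of a sum as the aggregate, and the example dimensions are assumptions, and the patent does not describe how its cube is computed.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Tuple


def build_cube(facts: List[dict], dimensions: List[str], measure: str) -> Dict[Tuple, float]:
    """Precompute the sum of `measure` for every combination of dimensions.

    Keys are tuples of (dimension, value) pairs in the order given by
    `dimensions`; the empty tuple holds the grand total.
    """
    cube: Dict[Tuple, float] = defaultdict(float)
    for row in facts:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)
```

For example, with facts carrying hypothetical `region` and `product` dimensions and an `amount` measure, `cube[(("region", "east"),)]` returns the precomputed total for that region without rescanning the detail rows. Precomputing every combination trades storage for lookup speed, which suits read-heavy analysis.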
The present invention quickly realizes automatic and reliable data acquisition, transmission, conversion and loading. ETL processing is fast and can handle large data volumes, ETL tasks become easier to execute, and multiple tasks can run independently without affecting one another. The cost of ETL data processing is reduced, its performance is improved, and the manageability and availability of data are improved. As the enterprise data model, the detail data SOR is the core of the entire data warehouse data model; it is flexible enough to accommodate additional data sources and support additional analysis requirements, which greatly broadens the scope of application of the system. The invention is practical, makes data management convenient, is highly flexible and easy to deploy widely, processes data efficiently with high throughput, can accommodate additional data sources, and supports additional analysis requirements.
The above are only specific embodiments of the present invention, but the technical features of the invention are not limited thereto. Any simple change, equivalent replacement or modification made on the basis of the present invention to solve essentially the same technical problem and achieve essentially the same technical effect falls within the protection scope of the present invention.
Claims (8)
1. A data loading and cleaning engine, scheduling and storage system, characterized in that it comprises a data source, a data warehouse and a user display module; the data warehouse is connected to an ETL management module; the ETL management module includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module; the ETL scheduling module is used to control the running of all ETL tasks; the ETL monitoring module is used to track and monitor the running of ETL tasks; the data quality module is used to track the quality of the data in the data warehouse; the ETL task module is used to complete the specific ETL processing work; the ETL monitoring module is connected to a fault handling module, and the fault handling module is connected to the ETL scheduling module;
the data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata store MDR; the detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module; the interface file area is used to store and process interface files and is connected to a permission setting module; the interface file area is organized according to a specific directory structure, and the permission setting module is used to set the access rights of different users to each directory according to that directory's specific purpose;
the detail data staging area SSA is connected to a verification module; the verification module is connected to a lookup module, and the lookup module is connected to the detail data SOR; the verification module is connected to a processing module, and the processing module is connected to the detail data SOR; the detail data SOR is connected to a partition exchange module; the metadata store MDR is used to save information about the processes and data in the data warehouse and is connected to a metadata management module; the data mart is connected to a multidimensional cube module, and the multidimensional cube module is used to store multidimensional data;
the user display module is connected to a query module, and the query module is used to display business content according to user requirements.
2. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL scheduling module is connected to a time setting module.
3. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL task module is connected to a graphics module.
4. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the main data processing tool of the interface file area is Kettle.
5. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF).
6. The data loading and cleaning engine, scheduling and storage system according to claim 5, characterized in that the detail data SOR is connected to a BDW upgrade and update module.
7. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL management module uses Microsoft's DTS component.
8. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the data mart has a star or snowflake schema.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610524292.8A CN106202346B (en) | 2016-06-29 | 2016-06-29 | A kind of data load cleaning engine, scheduling and storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106202346A (en) | 2016-12-07 |
CN106202346B (en) | 2019-11-01 |
Family
ID=57465396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610524292.8A Active CN106202346B (en) | 2016-06-29 | 2016-06-29 | A kind of data load cleaning engine, scheduling and storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106202346B (en) |
Legal Events
Code | Title | Description
---|---|---
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
TA01 | Transfer of patent application right | Effective date of registration: 20191008; Address after: 510030 floor 2 and 5, building 9, No. 305, Dongfeng Middle Road, Yuexiu District, Guangzhou City, Guangdong Province; Applicant after: Guangdong Information Network Co., Ltd.; Address before: 310018, No. 2, No. 928, Xiasha Higher Education Park, Hangzhou, Zhejiang, Jianggan District; Applicant before: Zhejiang University of Technology
GR01 | Patent grant |