CN106202346A - A kind of data load and clean engine, dispatch and storage system - Google Patents
A kind of data load and clean engine, dispatch and storage system
- Publication number
- CN106202346A
- Authority
- CN
- China
- Prior art keywords
- data
- module
- etl
- connects
- dispatch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data loading and cleaning engine, scheduling and storage system comprising a data source, a data warehouse and a user presentation module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module. The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata storage MDR. The invention is practical, makes data management convenient, is highly flexible and easy to deploy, processes data efficiently with high throughput, and can accommodate additional data sources and support additional analysis requirements.
Description
Technical field
The invention belongs to the field of computer technology, and in particular relates to a data loading and cleaning engine, scheduling and storage system.
Background technology
The rapid development of big data technology and the advance of informatization mean that the volume of data accumulated by human society already exceeds the total of the past 5000 years, and the amount of data collected, stored, processed and transmitted grows by the day. When an enterprise shares its data, more people can make fuller use of existing data resources, which reduces duplicated work such as data collection and acquisition, along with the corresponding costs. In practice, however, data supplied by different users may come from different channels, so its content, format and quality vary widely. Sometimes a data format cannot be converted at all, or information is lost after conversion. Such problems seriously hinder the flow and sharing of data among departments and software systems. How to integrate and manage massive data effectively has therefore become an unavoidable choice for strengthening the competitiveness of commercial banks.
In recent years, with the development of big data processing technologies such as Hadoop and Spark, data has attracted wide attention and become a strategic resource as important as water and oil. At present, massive data is stored mainly in traditional SQL databases, which differ greatly from the NoSQL databases used by big data technology. Because data is also highly diverse, it must be imported into the big data platform's own storage system before the platform can process it, and the import usually requires ETL processing to extract, clean and load each type of data. Traditional ETL systems run mainly on a single machine. Distributed ETL processing also exists, but it is aimed mainly at multi-task scenarios. These traditional ETL systems are functionally mature, but when faced with large data volumes their processing speed cannot keep up with demand and their functionality falls well short of what is needed, leaving traditional ETL processing overwhelmed.
Summary of the invention
The object of the present invention is to solve the above technical problems in the prior art by providing a data loading and cleaning engine, scheduling and storage system that is practical, makes data management convenient, is highly flexible and easy to deploy, processes data efficiently with high throughput, and can accommodate additional data sources and support additional analysis requirements.
To solve the above technical problems, the present invention adopts the following technical solution:
A data loading and cleaning engine, scheduling and storage system, characterized in that it comprises a data source, a data warehouse and a user presentation module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module. The ETL scheduling module controls the running of all ETL tasks, the ETL monitoring module tracks and monitors the running of ETL tasks, the data quality module tracks the quality of the data in the data warehouse, and the ETL task module carries out the actual ETL processing. The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata storage MDR. The detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module. The interface file area stores and processes interface files and is connected to a permission setting module. The permission setting module organizes the area according to a specific directory structure and sets the access rights of different users to each directory according to that directory's purpose. The ETL management module interacts and cooperates with the other components around the metadata: it extracts data from the data source, then transforms, cleans and loads it, placing the data into the data warehouse according to the predefined data warehouse model. This meets the continuing needs of data integration and enables data to be gathered from and distributed to the individual lines of business;
The detail data staging area SSA is connected to a verification module. The verification module is connected to a lookup module, which is connected to the detail data SOR, and to a processing module, which is also connected to the detail data SOR. The detail data SOR is connected to a partition exchange module. The metadata storage MDR holds information about the processes and data in the data warehouse and is connected to a metadata management module. The data mart is connected to a multidimensional cube module, which stores multidimensional data. The data warehouse and the data mart are stored in one TDH data group, within which the different kinds of data are separated into different home regions: the data mart is stored in the 3D vision region and is used to analyze multidimensional data, and the multidimensional cube module is stored in the integrated region. The partition exchange module uses two partitioning schemes, "partition ignoring" and "divide and conquer", which reduce the impact of data import operations on users who are accessing data in real time. The mode of operation resembles a hot-swappable hard disk and is convenient to use. In terms of performance, because the system stores massive data, "partition ignoring" can effectively improve query performance as well as the manageability and availability of the data, for example for data deletion and data backup. "Divide and conquer" allows more complete and efficient management: a fault produced by a task is confined to a single partition, and recovery time is effectively shortened. Because every tool and system generates its own metadata, the metadata management module stores that metadata centrally in the metadata storage MDR as far as possible. The metadata storage MDR is a shared place where users can access metadata centrally, while the metadata itself is still maintained in the systems and tools that generate it. The user presentation module is connected to a query module, which presents business content according to user requirements. The system is practical, makes data management convenient, is highly flexible and easy to deploy, processes data efficiently with high throughput, and can accommodate additional data sources and support additional analysis requirements.
Further, the ETL scheduling module is connected to a time setting module, which sets when each task is to be executed so that each task runs automatically at the specified moment. Execution cycles vary widely: some tasks are defined by a time interval and others by a fixed time. Through the time setting module, the system builds a scheduling linked list in which each node contains the task's "scheduling information" and its "next execution time". The list is always kept sorted by "next execution time" in ascending order, which improves scheduling efficiency and allows a large number of tasks to be handled.
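The scheduling list described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and field names (`ScheduleNode`, `Scheduler`, `pop_due`) are hypothetical, the "scheduling information" is reduced to a task name and a fixed interval, and a sorted Python list maintained with `bisect.insort` stands in for the linked list.

```python
import bisect
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(order=True)
class ScheduleNode:
    next_run: datetime                          # "next execution time", the only sort key
    name: str = field(compare=False)            # simplified "scheduling information"
    interval: timedelta = field(compare=False)  # must be positive


class Scheduler:
    def __init__(self):
        self._nodes = []  # always sorted by next_run, earliest first

    def add(self, name, first_run, interval):
        bisect.insort(self._nodes, ScheduleNode(first_run, name, interval))

    def pop_due(self, now):
        """Return the names of tasks due at `now` and reschedule each one."""
        due = []
        while self._nodes and self._nodes[0].next_run <= now:
            node = self._nodes.pop(0)
            due.append(node.name)
            # Advance to the first slot strictly after `now`, then re-insert in order.
            while node.next_run <= now:
                node.next_run += node.interval
            bisect.insort(self._nodes, node)
        return due
```

Because the list stays sorted, the scheduler only ever inspects the head to decide whether anything is due. For very large task counts a binary heap (`heapq`) would be a common substitute for the sorted list.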
Further, the ETL monitoring module is connected to a fault handling module, and the fault handling module is connected to the ETL scheduling module. When a task fails with a runtime error or fault, the fault handling module can reassign the task so that the system keeps running.
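The patent does not say how a failed task is reassigned. One simple reading, shown here purely as an assumption, is to try the task on the next available worker until one succeeds. The function name `run_with_reassignment` and the representation of workers as `(name, callable)` pairs are hypothetical.

```python
def run_with_reassignment(task, workers):
    """Try `task` on each worker in turn; return (worker_name, result).

    `workers` is a list of (name, callable) pairs. A worker that raises is
    treated as faulty and the task is handed to the next one.
    """
    errors = []
    for name, worker in workers:
        try:
            return name, worker(task)
        except Exception as exc:  # any failure triggers reassignment
            errors.append((name, exc))
    raise RuntimeError(f"task {task!r} failed on all workers: {errors}")
```

A production scheduler would typically also log each failure and cap the number of retries; those details are outside what the text specifies.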
Further, the ETL task module is connected to a graphics module, which converts the running status of tasks into visual charts that are clear and easy to read.
Further, the main data processing tool in the interface file area is Kettle. Under a Unix system, the interface file area is organized according to a specific directory structure, and the permission setting module sets the access rights of different users to each directory according to its purpose, so that the directories are independent of one another and clearly separated.
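As an illustration of per-directory access rights on a Unix system, the sketch below creates a small directory tree and applies a different permission mode to each directory. The directory names and modes in `LAYOUT` are hypothetical; the patent does not list the actual structure.

```python
import os
import stat

# Hypothetical layout: directory name -> Unix permission bits.
LAYOUT = {
    "incoming": 0o770,    # owner and group (source systems) may drop files
    "processing": 0o750,  # only the ETL owner writes, the group may read
    "archive": 0o550,     # read-only once files have been loaded
}


def build_interface_area(root):
    """Create the directories under `root` and return the modes actually set."""
    for name, mode in LAYOUT.items():
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)
        os.chmod(path, mode)  # chmod is not affected by the umask
    return {
        name: stat.S_IMODE(os.stat(os.path.join(root, name)).st_mode)
        for name in LAYOUT
    }
```

Assigning different Unix groups to the directories (for example with `os.chown`) would be the natural next step for separating users, but it needs elevated privileges and is omitted here.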
Further, the detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF). The detail data SOR stores the detail-level data in the data warehouse, which the partition exchange module organizes by subject area. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model. It is flexible enough to accommodate additional data sources and support additional analysis requirements, which widens the system's scope of application.
Further, the detail data SOR is connected to a BDW upgrade and update module, which supports further upgrades and updates of BDW.
Further, the ETL management module uses Microsoft's DTS component. Data source connections for the ETL process are defined through the standard OLE DB or ODBC interfaces; extraction, cleaning and transformation rules are defined with DTS's built-in extraction rules or with T-SQL scripts; and all ETL jobs in the data warehouse are designed and carried out with the DTS tools of Microsoft SQL Server.
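The extract, clean and load steps can be sketched independently of DTS. The example below is an assumption-laden stand-in: it uses Python's built-in `sqlite3` module in place of an OLE DB or ODBC connection, and the table names (`raw_orders`, `orders`) and cleaning rules are invented for illustration.

```python
import sqlite3


def run_etl(source, warehouse):
    """Copy cleaned rows from source.raw_orders into warehouse.orders."""
    # Extract: pull the raw rows from the source connection.
    rows = source.execute("SELECT id, customer, amount FROM raw_orders").fetchall()

    # Transform and clean: drop incomplete rows, normalize names, cast amounts.
    cleaned = [
        (row_id, customer.strip().upper(), float(amount))
        for row_id, customer, amount in rows
        if customer and amount is not None
    ]

    # Load: write into the warehouse table defined by the target model.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    warehouse.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", cleaned)
    warehouse.commit()
    return len(cleaned)
```

In the setup the patent describes, the equivalent rules would live in a DTS package or a T-SQL script rather than in application code.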
Further, the data mart has a star or snowflake schema. The data mart is a subset of the data warehouse and can be called a "small data warehouse"; data mart applications supplement data warehouse applications. The data mart holds multidimensional data oriented toward analysis and stores precomputed data for specific users, thereby meeting their particular requirements. It is independent, quick and convenient to access, and unaffected by updates in progress elsewhere in the system.
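A star schema with a precomputed summary can be sketched as follows. The tables (`dim_branch`, `dim_date`, `fact_txn`, `agg_region_month`) are hypothetical examples, not taken from the patent, and `sqlite3` is used only because it ships with Python.

```python
import sqlite3


def build_mart(conn):
    """Create one fact table surrounded by two dimension tables (a star)."""
    conn.executescript("""
        CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE dim_date   (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_txn (
            branch_id INTEGER REFERENCES dim_branch(branch_id),
            date_id   INTEGER REFERENCES dim_date(date_id),
            amount    REAL
        );
    """)


def precompute_monthly(conn):
    """Materialize totals per region and month so user queries avoid the join."""
    conn.executescript("""
        DROP TABLE IF EXISTS agg_region_month;
        CREATE TABLE agg_region_month AS
        SELECT b.region, d.year, d.month, SUM(f.amount) AS total
        FROM fact_txn f
        JOIN dim_branch b ON b.branch_id = f.branch_id
        JOIN dim_date   d ON d.date_id   = f.date_id
        GROUP BY b.region, d.year, d.month;
    """)
```

Queries from a specific user group then read `agg_region_month` directly, which is one way to obtain the fast, independent access the text attributes to the data mart. A snowflake schema would further normalize the dimension tables, for example by moving `region` into its own table.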
By adopting the above technical solution, the present invention has the following beneficial effects:
The present invention acquires, transmits, transforms and loads data automatically, reliably and quickly. ETL processing is fast and can handle large data volumes, ETL tasks become easier to implement, and multiple tasks can run side by side, independently and without affecting one another. This lowers the cost and raises the performance of ETL data processing and improves the manageability and availability of the data. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model. It is flexible enough to accommodate additional data sources and support additional analysis requirements, which greatly widens the system's scope of application. The invention is practical, makes data management convenient, is highly flexible and easy to deploy, processes data efficiently with high throughput, and can accommodate additional data sources and support additional analysis requirements.
Brief description of the drawings
The invention is further described below with reference to the drawings:
Fig. 1 is a schematic flow diagram of the data loading and cleaning engine, scheduling and storage system of the present invention;
Fig. 2 is a schematic flow diagram of the data warehouse in the present invention.
Detailed description of the invention
As shown in Fig. 1 and Fig. 2, the data loading and cleaning engine, scheduling and storage system of the present invention comprises a data source, a data warehouse and a user presentation module. The data warehouse is connected to an ETL management module, which includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module. The ETL scheduling module controls the running of all ETL tasks. It is connected to a time setting module, which sets when each task is to be executed so that each task runs automatically at the specified moment. Execution cycles vary widely: some tasks are defined by a time interval (for example, once every 3 minutes) and others by a fixed time (for example, starting at 21:00 every Friday evening). Fixed times can in turn be specified per year, month, week, day and so on. Through the time setting module, the system builds a scheduling linked list in which each node contains the task's "scheduling information" and its "next execution time". The list is always kept sorted by "next execution time" in ascending order, which improves scheduling efficiency and allows a large number of tasks to be handled. The ETL monitoring module tracks and monitors the running of ETL tasks. It is connected to a fault handling module, which is connected to the ETL scheduling module; when a task fails with a runtime error or fault, the fault handling module can reassign the task so that the system keeps running. The data quality module tracks the quality of the data in the data warehouse. The ETL task module carries out the actual ETL processing and is connected to a graphics module, which converts the running status of tasks into visual charts that are clear and easy to read.
The ETL management module uses Microsoft's DTS component. Data source connections for the ETL process are defined through the standard OLE DB or ODBC interfaces; extraction, cleaning and transformation rules are defined with DTS's built-in extraction rules or with T-SQL scripts; and all ETL jobs in the data warehouse are designed and carried out with the DTS tools of Microsoft SQL Server. Once a DTS package has been designed with the DTS component, it can be executed once, or it can be scheduled automatically so that it runs without manual intervention. For the convenience of system administrators, the execution and scheduling of the back-end DTS packages are exposed through a B/S (browser/server) user interface built with ASP. Administrators therefore do not need to manage and maintain the data warehouse's ETL on the server itself; they can carry out management and maintenance from any other location, which simplifies administration and improves efficiency. The ETL management module interacts and cooperates with the other components around the metadata: it extracts data from the data source, then transforms, cleans and loads it, placing the data into the data warehouse according to the predefined data warehouse model. This meets the continuing needs of data integration and enables data to be gathered from and distributed to the individual lines of business.
The data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata storage MDR. The detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module. The interface file area stores and processes interface files and is connected to a permission setting module. Under a Unix system, the interface file area is organized according to a specific directory structure, and the permission setting module sets the access rights of different users to each directory according to its purpose. The main data processing tool in the interface file area is Kettle. The directories are independent of one another and clearly separated, which keeps access effective. The detail data staging area SSA is connected to a verification module. The verification module is connected to a lookup module, which is connected to the detail data SOR, and to a processing module, which is also connected to the detail data SOR. The detail data staging area SSA holds data temporarily: the supplied interface files are loaded into the database there. Using the lookup module, the verification module compares the data already in the detail data SOR with the newly loaded data, and once verification passes, the processing module integrates the newly loaded data into the detail data SOR.
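The lookup, verification and integration flow from the staging area into the detail data store can be sketched as below. The verification rule used here (reject rows with no key or rows identical to what is already stored) is an assumption made for the example; the patent does not define the actual checks. The detail store is modeled as a plain dictionary.

```python
def integrate(staged_rows, sor, key="id"):
    """Merge staged rows into `sor` (a dict keyed by `key`).

    Returns (loaded, rejected) lists of rows.
    """
    loaded, rejected = [], []
    for row in staged_rows:
        row_key = row.get(key)
        existing = sor.get(row_key)  # lookup step: fetch the current record

        # Verification step (hypothetical rule): a row needs a key and must
        # differ from the stored record to be worth integrating.
        if row_key is None or existing == row:
            rejected.append(row)
            continue

        sor[row_key] = row           # processing step: integrate into the store
        loaded.append(row)
    return loaded, rejected
```

Keeping the three steps separate mirrors the module split in the text: the lookup can be swapped for an indexed database query and the verification rule tightened without touching the integration step.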
The detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF), and it stores the detail-level data in the data warehouse. The detail data SOR is connected to a partition exchange module, which organizes the data by subject area. The partition exchange module uses two partitioning mechanisms, "partition ignoring" and "divide and conquer", which reduce the impact of data import operations on users who are accessing data in real time. The mode of operation resembles a hot-swappable hard disk and is convenient to use. In terms of performance, because the system stores massive data, "partition ignoring" can effectively improve query performance as well as the manageability and availability of the data, for example for data deletion and data backup. "Divide and conquer" allows more complete and efficient management: a fault produced by a task is confined to a single partition, and recovery time is effectively shortened. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model. It is flexible enough to accommodate additional data sources and support additional analysis requirements, which widens the system's scope of application. The detail data SOR is connected to a BDW upgrade and update module, which supports further upgrades and updates of BDW.
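The patent does not define the two partitioning mechanisms precisely. The sketch below assumes that "partition ignoring" means skipping partitions a query does not need (often called partition pruning) and that the hot-swap behavior corresponds to exchanging a fully prepared staging set for a live partition in one step. The class `PartitionedTable` and its methods are hypothetical.

```python
class PartitionedTable:
    def __init__(self):
        self._partitions = {}  # partition key -> list of rows

    def exchange(self, key, staged_rows):
        """Swap a fully prepared row list in as partition `key`; return the old rows.

        Readers see either the old or the new partition, never a half-loaded one.
        """
        old = self._partitions.get(key, [])
        self._partitions[key] = staged_rows
        return old

    def query(self, keys, predicate=lambda row: True):
        """Scan only the named partitions and filter their rows."""
        return [
            row
            for key in keys
            for row in self._partitions.get(key, [])
            if predicate(row)
        ]

    def drop(self, key):
        """Delete one partition without touching the others."""
        return self._partitions.pop(key, None) is not None
```

Because each partition is an independent unit, a failed load affects only the partition being exchanged, and deleting or backing up old data is a per-partition operation, which matches the fault isolation and manageability benefits the text describes.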
The metadata storage MDR holds information about the processes and data in the data warehouse; the information about data includes logs, data dictionaries, configuration information and the like. The metadata storage MDR is connected to a metadata management module. Because every tool and system generates its own metadata, the metadata management module stores that metadata centrally in the metadata storage MDR as far as possible. The metadata storage MDR is a shared place where users can access metadata centrally, while the metadata itself is still maintained in the systems and tools that generate it. The data mart is connected to a multidimensional cube module. The data warehouse and the data mart are stored in one TDH data group, within which the different kinds of data are separated into different home regions: the data mart is stored in the 3D vision region and is used to analyze multidimensional data, and the multidimensional cube module is stored in the integrated region and is used to store multidimensional data. The data mart has a star or snowflake schema. It is a subset of the data warehouse and can be called a "small data warehouse"; data mart applications supplement data warehouse applications. The data mart holds multidimensional data oriented toward analysis and stores precomputed data for specific users, thereby meeting their particular requirements. It is independent, quick and convenient to access, and unaffected by updates in progress elsewhere in the system. The data summary module has a denormalized design and is used to update the multidimensional data, and the feedback module is based on data mining results. The user presentation module is connected to a query module, which presents the relevant business content according to the requirements set by the user, including the time a business transaction was handled, its completion time and its detailed parameters. A specific user can quickly look up the details of the business they are interested in.
The present invention acquires, transmits, transforms and loads data automatically, reliably and quickly. ETL processing is fast and can handle large data volumes, ETL tasks become easier to implement, and multiple tasks can run side by side, independently and without affecting one another. This lowers the cost and raises the performance of ETL data processing and improves the manageability and availability of the data. As the enterprise data model, the detail data SOR is the core of the whole data warehouse data model. It is flexible enough to accommodate additional data sources and support additional analysis requirements, which greatly widens the system's scope of application. The invention is practical, makes data management convenient, is highly flexible and easy to deploy, processes data efficiently with high throughput, and can accommodate additional data sources and support additional analysis requirements.
The above is only a specific embodiment of the present invention, but the technical features of the invention are not limited to it. Any simple change, equivalent replacement or modification made on the basis of the present invention in order to solve essentially the same technical problem and achieve essentially the same technical effect falls within the scope of protection of the present invention.
Claims (9)
1. A data loading and cleaning engine, scheduling and storage system, characterized in that it comprises a data source, a data warehouse and a user presentation module; the data warehouse is connected to an ETL management module; the ETL management module includes an ETL scheduling module, an ETL monitoring module, a data quality module and an ETL task module; the ETL scheduling module is used to control the running of all ETL tasks, the ETL monitoring module is used to track and monitor the running of ETL tasks, the data quality module is used to track the quality of the data in the data warehouse, and the ETL task module is used to carry out the actual ETL processing;
the data warehouse includes an interface file area, a detail data staging area SSA, a detail data SOR, a data mart, a data summary module, a feedback module and a metadata storage MDR; the detail data SOR is connected to the data summary module, and the data summary module is connected to the feedback module; the interface file area is used to store and process interface files and is connected to a permission setting module; the permission setting module is used to organize the area according to a specific directory structure and to set the access rights of different users to each directory according to that directory's purpose;
the detail data staging area SSA is connected to a verification module; the verification module is connected to a lookup module, and the lookup module is connected to the detail data SOR; the verification module is connected to a processing module, and the processing module is connected to the detail data SOR; the detail data SOR is connected to a partition exchange module; the metadata storage MDR is used to hold information about the processes and data in the data warehouse and is connected to a metadata management module; the data mart is connected to a multidimensional cube module, and the multidimensional cube module is used to store multidimensional data;
the user presentation module is connected to a query module, and the query module is used to present business content according to user requirements.
2. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL scheduling module is connected to a time setting module.
3. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL monitoring module is connected to a fault handling module, and the fault handling module is connected to the ETL scheduling module.
4. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL task module is connected to a graphics module.
5. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the main data processing tool in the interface file area is Kettle.
6. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the detail data SOR is a set of table structures developed on the basis of BDW that conform to third normal form (3NF).
7. The data loading and cleaning engine, scheduling and storage system according to claim 6, characterized in that the detail data SOR is connected to a BDW upgrade and update module.
8. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the ETL management module uses Microsoft's DTS component.
9. The data loading and cleaning engine, scheduling and storage system according to claim 1, characterized in that the data mart has a star or snowflake schema.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610524292.8A CN106202346B (en) | 2016-06-29 | 2016-06-29 | A kind of data load cleaning engine, scheduling and storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610524292.8A CN106202346B (en) | 2016-06-29 | 2016-06-29 | A kind of data load cleaning engine, scheduling and storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106202346A | 2016-12-07 |
CN106202346B | 2019-11-01 |
Family
ID=57465396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610524292.8A Active CN106202346B (en) | 2016-06-29 | 2016-06-29 | A kind of data load cleaning engine, scheduling and storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106202346B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107679160A (en) * | 2017-09-28 | 2018-02-09 | 深圳市华傲数据技术有限公司 | Data processing method and device based on chart database |
CN107688592A (en) * | 2017-04-06 | 2018-02-13 | 平安科技(深圳)有限公司 | The method and terminal of data cleansing |
CN107832451A (en) * | 2017-11-23 | 2018-03-23 | 安徽科创智慧知识产权服务有限公司 | A kind of big data cleaning way of simplification |
CN107895032A (en) * | 2017-11-23 | 2018-04-10 | 安徽科创智慧知识产权服务有限公司 | Carry out the network data acquisition method that data are tentatively cleaned |
CN107992552A (en) * | 2017-11-28 | 2018-05-04 | 南京莱斯信息技术股份有限公司 | A kind of data interchange platform and method for interchanging data |
CN108196912A (en) * | 2018-01-03 | 2018-06-22 | 新疆熙菱信息技术股份有限公司 | One kind is based on hot-plug component formula data integrating method |
CN108280084A (en) * | 2017-01-06 | 2018-07-13 | 上海前隆信息科技有限公司 | A kind of construction method of data warehouse, system and server |
CN109033291A (en) * | 2018-07-13 | 2018-12-18 | 深圳市小牛在线互联网信息咨询有限公司 | A kind of job scheduling method, device, computer equipment and storage medium |
CN109269557A (en) * | 2018-09-19 | 2019-01-25 | 中国南方电网有限责任公司超高压输电公司广州局 | A kind of change of current station equipment operating parameter and running environment intelligent monitor system and method |
CN109669975A (en) * | 2018-11-09 | 2019-04-23 | 成都数之联科技有限公司 | A kind of industry big data processing system and method |
CN109918437A (en) * | 2019-03-08 | 2019-06-21 | 北京中油瑞飞信息技术有限责任公司 | Distributed data processing method, apparatus and data assets management system |
CN112667615A (en) * | 2020-12-25 | 2021-04-16 | 广东电网有限责任公司电力科学研究院 | Data cleaning system and method |
CN112667472A (en) * | 2020-12-28 | 2021-04-16 | 武汉达梦数据库股份有限公司 | Data source connection state monitoring device and method |
CN113177039A (en) * | 2021-04-27 | 2021-07-27 | 中通服咨询设计研究院有限公司 | Data center data cleaning system based on data fusion |
CN114817393A (en) * | 2022-06-24 | 2022-07-29 | 深圳市信联征信有限公司 | Data extraction and cleaning method and device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101452485A (en) * | 2008-12-31 | 2009-06-10 | 中国建设银行股份有限公司 | Method and device for generating multidimensional cubic based on relational database |
CN201600693U (en) * | 2009-11-26 | 2010-10-06 | 中国移动通信集团河北有限公司 | Data warehouse system |
CN103577605A (en) * | 2013-11-20 | 2014-02-12 | 贵州电网公司电力调度控制中心 | Data warehouse based on data fusion and data mining and application method of data warehouse |
CN104933160A (en) * | 2015-06-26 | 2015-09-23 | 河海大学 | ETL (Extract Transform and Load) framework design method for safety monitoring business analysis |
CN105095327A (en) * | 2014-05-23 | 2015-11-25 | 深圳市珍爱网信息技术有限公司 | Distributed ELT system and scheduling method |
-
2016
- 2016-06-29 CN CN201610524292.8A patent/CN106202346B/en active Active
Non-Patent Citations (2)
Title |
---|
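Petroleum Paper Database (石油论文资料库): "A Brief Overview of IBM Data Warehouse Solutions" (IBM数据仓库解决方案简述), Docin (豆丁) * |
Long Xinzheng et al. (龙新征,等): "Research on a Data Warehouse-Based University Data Statistics Service Platform" (基于数据仓库的高校数据统计服务平台研究), Journal on Communications (通信学报) * |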
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108280084A (en) * | 2017-01-06 | 2018-07-13 | 上海前隆信息科技有限公司 | A kind of construction method of data warehouse, system and server |
CN107688592B (en) * | 2017-04-06 | 2020-03-17 | 平安科技(深圳)有限公司 | Data cleaning method and terminal |
CN107688592A (en) * | 2017-04-06 | 2018-02-13 | 平安科技(深圳)有限公司 | The method and terminal of data cleansing |
CN107679160A (en) * | 2017-09-28 | 2018-02-09 | 深圳市华傲数据技术有限公司 | Data processing method and device based on chart database |
CN107832451A (en) * | 2017-11-23 | 2018-03-23 | 安徽科创智慧知识产权服务有限公司 | A kind of big data cleaning way of simplification |
CN107895032A (en) * | 2017-11-23 | 2018-04-10 | 安徽科创智慧知识产权服务有限公司 | Carry out the network data acquisition method that data are tentatively cleaned |
CN107992552A (en) * | 2017-11-28 | 2018-05-04 | 南京莱斯信息技术股份有限公司 | A kind of data interchange platform and method for interchanging data |
CN108196912B (en) * | 2018-01-03 | 2021-04-23 | 新疆熙菱信息技术股份有限公司 | Data integration method based on hot plug assembly |
CN108196912A (en) * | 2018-01-03 | 2018-06-22 | 新疆熙菱信息技术股份有限公司 | One kind is based on hot-plug component formula data integrating method |
CN109033291A (en) * | 2018-07-13 | 2018-12-18 | 深圳市小牛在线互联网信息咨询有限公司 | A kind of job scheduling method, device, computer equipment and storage medium |
CN109269557A (en) * | 2018-09-19 | 2019-01-25 | 中国南方电网有限责任公司超高压输电公司广州局 | A kind of change of current station equipment operating parameter and running environment intelligent monitor system and method |
CN109669975A (en) * | 2018-11-09 | 2019-04-23 | 成都数之联科技有限公司 | A kind of industry big data processing system and method |
CN109669975B (en) * | 2018-11-09 | 2020-12-18 | 成都数之联科技有限公司 | Industrial big data processing system and method |
CN109918437A (en) * | 2019-03-08 | 2019-06-21 | 北京中油瑞飞信息技术有限责任公司 | Distributed data processing method, apparatus and data assets management system |
CN112667615B (en) * | 2020-12-25 | 2022-02-15 | 广东电网有限责任公司电力科学研究院 | Data cleaning system and method |
CN112667615A (en) * | 2020-12-25 | 2021-04-16 | 广东电网有限责任公司电力科学研究院 | Data cleaning system and method |
CN112667472A (en) * | 2020-12-28 | 2021-04-16 | 武汉达梦数据库股份有限公司 | Data source connection state monitoring device and method |
CN112667472B (en) * | 2020-12-28 | 2022-04-08 | 武汉达梦数据库股份有限公司 | Data source connection state monitoring device and method |
CN113177039A (en) * | 2021-04-27 | 2021-07-27 | 中通服咨询设计研究院有限公司 | Data center data cleaning system based on data fusion |
CN113177039B (en) * | 2021-04-27 | 2024-02-27 | 中通服咨询设计研究院有限公司 | Data center data cleaning system based on data fusion |
CN114817393A (en) * | 2022-06-24 | 2022-07-29 | 深圳市信联征信有限公司 | Data extraction and cleaning method and device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106202346B (en) | 2019-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106202346A (en) | | A kind of data load and clean engine, dispatch and storage system |
CN105005570B (en) | | Magnanimity intelligent power data digging method and device based on cloud computing |
CN107256443B (en) | | Line loss real-time computing technique based on business and data integration |
CN101208692B (en) | | Automatically moving multidimensional data between live datacubes of enterprise software systems |
CN104468778B (en) | | A kind of cloud manufacturing execution system and its manufacture execution method based on cloud service |
CN108520316A (en) | | A kind of data-optimized processing method of overload alarm |
US20140358977A1 (en) | | Management of Intermediate Data Spills during the Shuffle Phase of a Map-Reduce Job |
CN109271382A (en) | | A kind of data lake system towards full data shape opening and shares |
CN104111996A (en) | | Health insurance outpatient clinic big data extraction system and method based on hadoop platform |
CN101566981A (en) | | Method for establishing dynamic virtual data base in analyzing and processing system |
CN106599197A (en) | | Data acquisition and exchange engine |
CN108694195A (en) | | A kind of management method and system of Distributed Data Warehouse |
CN102722355A (en) | | Workflow mechanism-based concurrent ETL (Extract, Transform and Load) conversion method |
CN103902537A (en) | | Multi-service log data storage processing and inquiring system and method thereof |
CN112883001A (en) | | Data processing method, device and medium based on marketing and distribution through data visualization platform |
CN102763083A (en) | | Computer system and upgrade method for same |
CN109508957A (en) | | A kind of enterprise management system |
CN102495916A (en) | | Multi-application-system panoramic modeling method based on object matching |
CN106780157B (en) | | Ceph-based power grid multi-temporal model storage and management system and method |
CN109359107A (en) | | Database method for cleaning, system, device and storage medium |
CN112825169B (en) | | Intelligent enterprise management system and management method |
CN110007905A (en) | | A kind of generation method and system of the software development scheme based on big data |
CN104009906A (en) | | Structuralized theme instant communication system and method |
Lee et al. | | A big data management system for energy consumption prediction models |
CN103810258A (en) | | Data aggregation scheduling method based on data warehouse |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| TA01 | Transfer of patent application right | Effective date of registration: 20191008. Address after: 510030 floor 2 and 5, building 9, No. 305, Dongfeng Middle Road, Yuexiu District, Guangzhou City, Guangdong Province. Applicant after: Guangdong Information Network Co., Ltd. Address before: 310018, No. 2, No. 928, Xiasha Higher Education Park, Hangzhou, Zhejiang, Jianggan District. Applicant before: Zhejiang University of Technology |
| GR01 | Patent grant | |