CN109299180A - Data warehouse ETL operating system - Google Patents

Data warehouse ETL operating system

Info

Publication number
CN109299180A
Authority
CN
China
Prior art keywords
data
component
initial
work flow
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811283414.4A
Other languages
Chinese (zh)
Other versions
CN109299180B (en)
Inventor
鲁大军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Comb Big Data Technology Co.,Ltd.
Original Assignee
Wuhan Guanggu Lianzhongda Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Guanggu Lianzhongda Data Technology Co Ltd
Priority to CN201811283414.4A
Publication of CN109299180A
Application granted
Publication of CN109299180B
Legal status: Active
Anticipated expiration

Abstract

The invention belongs to the technical field of data processing methods or systems, and in particular relates to a data warehouse ETL operating system. The data warehouse ETL operating system of the invention extracts data in real time, which greatly shortens the time needed to acquire and process raw data. Through an extract-convert-publish-acquire sequence it distributes the processing of data conversion and loading, replaces the traditional periodic job execution steps, reduces data latency, and improves the real-time responsiveness of the data.

Description

Data warehouse ETL operating system
Technical field
The invention belongs to the technical field of data processing methods or systems, and in particular relates to a data warehouse ETL operating system.
Background art
With the rapid development of computer and information technology, electronic data has become increasingly important in the daily operation of enterprises, which urgently need to analyze data efficiently, accurately, and promptly. The data loading cycle of a traditional data warehouse is long, so it can often only support analysis and queries over historical data and cannot respond to changes in data in real time. Information is a vital resource of the modern enterprise and the basis for scientific management and decision analysis. At present, most enterprises spend large amounts of money and time building online transaction processing (OLTP) business systems and office automation systems to record the data associated with transaction processing. According to statistics, data volume doubles every 2 to 3 years. These data contain enormous commercial value, yet the data an enterprise actually pays attention to usually accounts for only about 2% to 4% of the total. Enterprises therefore still do not make the most of the data resources they already hold, which wastes time and money and costs them the best opportunity to make key business decisions. How to convert data into information and knowledge through technical means has thus become the main bottleneck in improving an enterprise's core competitiveness. As enterprise demand for real-time data processing grows, automated processing techniques and schemes that offer better real-time performance and faster, more convenient data acquisition are an important direction for future enterprise data processing.
Summary of the invention
The purpose of the invention is to provide a data warehouse ETL operating system that improves the efficiency with which the data system handles tasks when many workflows run concurrently, as well as the real-time performance of data acquisition and processing. It also improves the independence of data between system levels, reduces the workload of processing associated data, and keeps each workflow relatively independent in multi-task situations.
To achieve the above object, the invention adopts the following technical scheme.
A data warehouse ETL operating system, comprising:
i. A user interaction module, comprising:
1a. an interactive maintenance component for managing description data during data integration and data flow;
1b. a data component for defining the mapping between raw data and target data, the data conversion relationships, the data processing flow, and the interaction process with ETL;
1c. a visualization component for displaying the mapping between raw data and target data, the data conversion relationships, the data processing flow, and the ETL interaction process;
The description data includes at least process description data, which describes the specific ETL operating process; it comprises information description data for the components within a process and configuration-relationship description data between components;
ii. A job management module, comprising:
2a. a control component for controlling and changing the start of a workflow, the job type, and the workflow control mode, and for pushing the workflow to the job execution component;
2b. a decomposition component for parsing the description data of each workflow, building the membership relationships between workflows, and pushing them to the job execution component;
2c. an execution component for switching between workflows and for starting the workflow control mode. The execution component executes the workflows pushed by the control component and/or the decomposition component directly, at a set time, or periodically. Multiple workflows pushed at the same time or in the same period are sorted by the real-time strength of the raw data they require and executed in that order, strongest first;
2d. a conversion component for preprocessing raw data during job execution. After receiving data to be processed, the conversion component uses a data-row objectification tool to turn each data row into an object format it can handle, retrieves the data, matches the current attribute values of the data against the rules in the conversion rules, and converts and forwards the data that matches a conversion rule;
2e. a response component that acts as the data server and as a distributed cache of raw data. The response component works asynchronously: raw data obtained by the raw data component and the data extraction component is published through the response component, and each workflow acquires the data it needs through the response component;
iii. A data component module, comprising:
3a. a database component for storing data from the devices or platforms that hold the raw data, including a raw data component and a data extraction component. The data extraction component is an extraction tool independent of the device or platform holding the raw data. For real-time workflows, it imports data by direct trickle loading and stores it in the data cache region; for the data warehouse and non-real-time workflows, it imports data by micro-batch loading and stores it in the data warehouse;
3b. a file operation component;
3c. a record component.
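The ordering rule in item 2c can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Workflow` class and the integer `realtime_strength` score are assumptions introduced here, since the text does not say how real-time strength is measured.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    # Hypothetical score: a larger value means a stricter real-time demand.
    realtime_strength: int


def execution_order(pushed):
    """Order workflows pushed in the same period, strongest real-time demand first."""
    # sorted() is stable, so workflows with equal strength keep their push order.
    return sorted(pushed, key=lambda w: w.realtime_strength, reverse=True)


pushed = [
    Workflow("nightly_report", 1),
    Workflow("fraud_alert", 9),
    Workflow("hourly_sync", 4),
]
ordered_names = [w.name for w in execution_order(pushed)]
```

A stable sort is a deliberate choice in this sketch: workflows with the same strength run in the order in which they were pushed.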
In the above data warehouse ETL operating system, the response component assigns the conversion task of each workflow one of five intermediate states: submitted, waiting, ready, executing, and stopped. Each conversion task carries eight flags: number, period, state, priority, sequence number, record, wait, and repeat. Among the states, submitted means the conversion has not yet reached its start time and is held for execution; waiting means the task has entered the conversion queue and is waiting for raw data; ready means the raw data is in place and the task is ready to run; executing means the conversion is in progress; and stopped means the conversion has stopped. Among the flags, number is the unique code of each conversion task; period is the time the task waits before executing (0 for a real-time task); state is the real-time status of the task (initial value 1); priority is the rank assigned according to the task's real-time demand (the higher the priority, the larger the value); sequence number is the queue order of the task's raw data in the data table; record is the record made when the conversion task ends (initial value 0); wait is the time the task has spent waiting in the ready queue (initial value 0); and repeat indicates whether the task needs to be executed repeatedly;
During conversion, the response component takes the priority level of any conversion task to be the sum of the task's priority flag value and its wait flag value; each time a task scheduling round is executed, the wait flag is incremented by one.
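The priority rule above behaves like priority scheduling with aging: a low-priority task gains one point per scheduling round, so it cannot be starved indefinitely. The sketch below assumes that only tasks left in the ready queue are aged and that ties go to the task that entered the queue first; the class and function names are illustrative and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class ConversionTask:
    number: int      # unique code of the conversion task
    priority: int    # priority flag: larger means a stricter real-time demand
    wait: int = 0    # wait flag: scheduling rounds spent in the ready queue

    @property
    def priority_level(self):
        # Priority level = priority flag value + wait flag value.
        return self.priority + self.wait


def schedule_once(ready):
    """Run one scheduling round: pick the top task and age the ones left behind."""
    chosen = max(ready, key=lambda t: t.priority_level)  # first maximum wins ties
    ready.remove(chosen)
    for task in ready:
        task.wait += 1  # the wait flag grows by one per scheduling round
    return chosen


# A low-priority task competes with a stream of newly arriving priority-3 tasks.
low = ConversionTask(number=99, priority=1)
ready = [low]
executed = []
for n in range(4):
    ready.append(ConversionTask(number=n, priority=3))
    executed.append(schedule_once(ready).number)
```

In this run the low-priority task is selected in the third round, once its accumulated wait has raised its priority level to match the newcomers.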
In the above data warehouse ETL operating system, the response component builds data tables according to the category of the raw data, and the database component places each piece of raw data into the corresponding table in order. The execution component tracks the data in the table for the raw data category its workflow needs and processes the raw data according to the workflow's content. Once raw data has accumulated beyond a specific quantity or time, the response component deletes the old data.
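A per-category table with quantity and age limits might look like the sketch below. The limits, the in-memory `deque` storage, and the `CategoryCache` name are assumptions made for illustration; the patent does not specify thresholds or a storage engine.

```python
import time
from collections import defaultdict, deque


class CategoryCache:
    """Keeps one ordered table per raw-data category and prunes old rows."""

    def __init__(self, max_rows=1000, max_age_seconds=3600.0):
        self.max_rows = max_rows
        self.max_age = max_age_seconds
        self.tables = defaultdict(deque)  # category -> deque of (timestamp, row)

    def put(self, category, row, now=None):
        now = time.time() if now is None else now
        table = self.tables[category]
        table.append((now, row))          # rows are kept in arrival order
        self._prune(table, now)

    def _prune(self, table, now):
        # Drop the oldest rows while the table is too long or its head is too old.
        while table and (len(table) > self.max_rows or now - table[0][0] > self.max_age):
            table.popleft()

    def rows(self, category):
        return [row for _, row in self.tables.get(category, ())]


cache = CategoryCache(max_rows=2, max_age_seconds=10)
cache.put("orders", {"id": 1}, now=0)
cache.put("orders", {"id": 2}, now=1)
cache.put("orders", {"id": 3}, now=2)    # exceeds max_rows, so id 1 is dropped
after_count_limit = cache.rows("orders")
cache.put("orders", {"id": 4}, now=20)   # ids 2 and 3 are dropped (count, then age)
after_age_limit = cache.rows("orders")
```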
In the above data warehouse ETL operating system, conversion rules are defined with a definition tool from the detection rules and processing rules for problem data in the different workflows, and are stored in a rule base. When a workflow runs, the corresponding conversion rules are loaded and the corresponding raw data is preprocessed. The response component uses dynamic loading, so new rules are loaded and applied to data conversion without affecting data preprocessing already in progress. The response component detects and judges combinations of logical relations in the data using each rule's pattern matching algorithm.
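One way to picture a rule base that pairs a detection rule with a processing rule and accepts new rules while running is sketched below. The `RuleBase` class, the rule names, and the copy-on-load approach are assumptions made for this illustration; the patent does not describe how dynamic loading is implemented.

```python
class RuleBase:
    """Holds (name, detect, process) rules; detect and process are plain callables."""

    def __init__(self):
        self._rules = []

    def load(self, name, detect, process):
        # Build a new list instead of mutating the old one, so a conversion that
        # is already iterating over the previous list is not disturbed.
        self._rules = self._rules + [(name, detect, process)]

    def convert(self, record):
        rules = self._rules  # snapshot of the rules in force when conversion starts
        for _name, detect, process in rules:
            if detect(record):        # match on the record's current attribute values
                record = process(record)
        return record


rules = RuleBase()
rules.load(
    "trim_name",
    lambda r: r.get("name", "") != r.get("name", "").strip(),
    lambda r: {**r, "name": r["name"].strip()},
)
first_pass = rules.convert({"name": " Ada ", "age": -1})

# A rule added later applies to subsequent conversions only.
rules.load(
    "negative_age",
    lambda r: r.get("age", 0) < 0,
    lambda r: {**r, "age": None},
)
second_pass = rules.convert({"name": " Ada ", "age": -1})
```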
In the above data warehouse ETL operating system, the visualization component also outputs the control window of the interactive maintenance component, job monitoring data, and process log information.
In the above data warehouse ETL operating system, the data extraction component uses a trigger-based real-time extraction method. When raw data changes, the trigger pushes the changed data to the response component, which notifies all workflows that interact with that raw data.
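The trigger-and-notify path can be sketched with a simple publish/subscribe pair. Real deployments would typically rely on database triggers or change data capture; the in-memory `TriggeredSource` and `ResponseComponent` classes here are hypothetical stand-ins used only to show the control flow.

```python
from collections import defaultdict


class ResponseComponent:
    """Receives changed rows and notifies every workflow subscribed to the source."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # source name -> list of callbacks

    def subscribe(self, source, callback):
        self._subscribers[source].append(callback)

    def publish(self, source, changed_row):
        for callback in self._subscribers[source]:
            callback(changed_row)


class TriggeredSource:
    """A raw data store whose 'trigger' fires only when a row actually changes."""

    def __init__(self, name, response):
        self.name = name
        self.response = response
        self.rows = {}

    def upsert(self, key, row):
        if self.rows.get(key) != row:
            self.rows[key] = row
            self.response.publish(self.name, row)  # push the changed data


response = ResponseComponent()
received = []
response.subscribe("orders", received.append)  # a workflow interested in "orders"

source = TriggeredSource("orders", response)
source.upsert(1, {"total": 10})
source.upsert(1, {"total": 10})  # unchanged, so nothing is pushed
source.upsert(1, {"total": 12})
```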
In the above data warehouse ETL operating system, the interaction process with ETL includes running ETL tasks, monitoring ETL jobs, and managing record data. The system further includes a description data repository, in which process description data is stored in XML format.
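The patent states only that process description data is stored as XML; it does not give a schema. The sketch below therefore invents a small, hypothetical layout (`workflow`, `component`, and `link` elements) to show how component information and the configuration relationships between components could be read back with Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical process description; element and attribute names are assumptions.
DESCRIPTION_XML = """
<workflow name="orders_to_warehouse" mode="realtime">
  <component id="extract" type="extraction"/>
  <component id="clean" type="conversion"/>
  <component id="load" type="loading"/>
  <link from="extract" to="clean"/>
  <link from="clean" to="load"/>
</workflow>
"""


def parse_description(xml_text):
    """Return component info and inter-component links from a process description."""
    root = ET.fromstring(xml_text.strip())
    components = {c.get("id"): c.get("type") for c in root.findall("component")}
    links = [(link.get("from"), link.get("to")) for link in root.findall("link")]
    return {
        "name": root.get("name"),
        "mode": root.get("mode"),
        "components": components,
        "links": links,
    }


description = parse_description(DESCRIPTION_XML)
```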
The beneficial effects of the above data warehouse ETL operating system are:
1. The data warehouse ETL operating system of the invention extracts data in real time, greatly shortening the time needed to acquire and process raw data. Through an extract-convert-publish-acquire sequence it distributes the processing of data conversion and loading, replaces the traditional periodic job execution steps, reduces data latency, and improves the real-time responsiveness of the data;
2. Data conversion and targeted extraction optimize the data processing flow, make the data more focused, reduce processing time, and improve the accuracy of the data;
3. Priority-based processing and branch processing of associated data avoid idle server time and ensure that a suitable workflow is always executing. At the same time the real-time requirements of each workflow are met, data delay is reduced, and the effectiveness of system data processing is improved;
4. With the above system structure, the service area of each workflow and the acquisition of raw data are relatively independent and do not affect each other. The raw data acquisition process always runs efficiently and stably, which keeps the input stable and accurate and provides an important guarantee for the efficiency of system data processing. In scenarios with many tasks and strict real-time requirements, the management and execution efficiency of job tasks is effectively guaranteed;
5. The priority-based arrangement and management structure improves the effectiveness of real-time extraction and data aggregation when the system processes data, prevents the length of any one workflow's processing time from affecting the stability of the whole system, eliminates the adverse influence of a single task, and improves the stability of the system as a whole.
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the data warehouse ETL operating system;
Fig. 2 is a schematic diagram of data transfer in the data warehouse ETL operating system;
Fig. 3 is a schematic diagram of data updating in the data warehouse ETL operating system;
Fig. 4 is a schematic diagram of the separation of jobs and data in the data warehouse ETL operating system.
Detailed description of the embodiments
The invention is described in detail below with reference to specific embodiments.
Fig. 1 is a schematic diagram of the structure of the data warehouse ETL operating system of the invention, which comprises:
i. A user interaction module, comprising:
1a. an interactive maintenance component for managing description data during data integration and data flow;
1b. a data component for defining the mapping between raw data and target data, the data conversion relationships, the data processing flow, and the interaction process with ETL;
1c. a visualization component for displaying the mapping between raw data and target data, the data conversion relationships, the data processing flow, and the ETL interaction process. The visualization component also outputs the control window of the interactive maintenance component, job monitoring data, and process log information. The interaction process with ETL includes running ETL tasks, monitoring ETL jobs, and managing record data. The system further includes a description data repository, in which process description data is stored in XML format.
The description data includes at least process description data, which describes the specific ETL operating process; it comprises information description data for the components within a process and configuration-relationship description data between components;
The user interaction module gives the system graphical and textual means of interaction, helps users control data and services, and provides a means of access control. Visualization techniques such as a UI make workflows and data easier to recognize and the system easier to use. Centralized control of description data allows a more accurate analysis of the overall working state and faster, more accurate access to the content and state of individual operations, which improves the effectiveness of user management and the response speed of the system;
ii. A job management module, comprising:
2a. a control component for controlling and changing the start of a workflow, the job type, and the workflow control mode, and for pushing the workflow to the job execution component;
2b. a decomposition component for parsing the description data of each workflow, building the membership relationships between workflows, and pushing them to the job execution component;
2c. an execution component for switching between workflows and for starting the workflow control mode. The execution component executes the workflows pushed by the control component and/or the decomposition component directly, at a set time, or periodically. Multiple workflows pushed at the same time or in the same period are sorted by the real-time strength of the raw data they require and executed in that order, strongest first;
2d. a conversion component for preprocessing raw data during job execution. After receiving data to be processed, the conversion component uses a data-row objectification tool to turn each data row into an object format it can handle, retrieves the data, matches the current attribute values of the data against the rules in the conversion rules, and converts and forwards the data that matches a conversion rule. Conversion rules are defined with a definition tool from the detection rules and processing rules for problem data in the different workflows, and are stored in a rule base. When a workflow runs, the corresponding conversion rules are loaded and the corresponding raw data is preprocessed. The response component uses dynamic loading, so new rules are loaded and applied to data conversion without affecting data preprocessing already in progress. The response component detects and judges combinations of logical relations in the data using each rule's pattern matching algorithm. The response component assigns the conversion task of each workflow one of five intermediate states: submitted, waiting, ready, executing, and stopped. Each conversion task carries eight flags: number, period, state, priority, sequence number, record, wait, and repeat. Among the states, submitted means the conversion has not yet reached its start time and is held for execution; waiting means the task has entered the conversion queue and is waiting for raw data; ready means the raw data is in place and the task is ready to run; executing means the conversion is in progress; and stopped means the conversion has stopped. Among the flags, number is the unique code of each conversion task; period is the time the task waits before executing (0 for a real-time task); state is the real-time status of the task (initial value 1); priority is the rank assigned according to the task's real-time demand (the higher the priority, the larger the value); sequence number is the queue order of the task's raw data in the data table; record is the record made when the conversion task ends (initial value 0); wait is the time the task has spent waiting in the ready queue (initial value 0); and repeat indicates whether the task needs to be executed repeatedly. During conversion, the response component takes the priority level of any conversion task to be the sum of the task's priority flag value and its wait flag value; each time a task scheduling round is executed, the wait flag is incremented by one.
As shown in Fig. 2, with the execution component and the conversion component the invention achieves optimized control of concurrent multitasking, avoids the situation in which tasks with strict real-time requirements cannot be executed promptly, and allocates resources more reasonably and effectively. Source data triggers the real-time data update process through its own changes, which improves the real-time performance of the data while reducing the overall resource consumption of the system. The data server is not affected by data changes, and the extraction and acquisition of data are independent of each other, so each can deliver its full performance without interference. Rule management and generation govern the data preprocessing function, which reduces the resources the data system consumes during data conversion, shortens data processing time, further secures the real-time performance of the data, and improves job response speed, so that the data requirements of each workflow are met and the system as a whole runs efficiently. Combined with the task switching and management capabilities of the execution component and the decomposition component, jobs can be scheduled in a more reasonable and efficient order, tasks can be processed in real time and their priorities adjusted, and the system becomes more flexible.
2e. a response component that acts as the data server and as a distributed cache of raw data. The response component works asynchronously: raw data obtained by the raw data component and the data extraction component is published through the response component, and each workflow acquires the data it needs through the response component. The response component builds data tables according to the category of the raw data, and the database component places each piece of raw data into the corresponding table in order. The execution component tracks the data in the table for the raw data category its workflow needs and processes the raw data according to the workflow's content. Once raw data has accumulated beyond a specific quantity or time, the response component deletes the old data.
To handle and acquire data of different structures from multiple data sources in actual use, the conversion component and the response component of the invention form an intermediate system that lets the data source layer and the application layer each run as an independent whole. The intermediate system coordinates the relationship between data sources and data applications, which preserves task independence while improving the efficiency of data retrieval and processing and the stability of the whole system;
iii. A data component module, comprising:
3a. a database component for storing data from the devices or platforms that hold the raw data, including a raw data component and a data extraction component. The data extraction component is an extraction tool independent of the device or platform holding the raw data. For real-time workflows, it imports data by direct trickle loading and stores it in the data cache region; for the data warehouse and non-real-time workflows, it imports data by micro-batch loading and stores it in the data warehouse. The data extraction component uses a trigger-based real-time extraction method: when raw data changes, the trigger pushes the changed data to the response component, which notifies all workflows that interact with that raw data.
3b. a file operation component;
3c. a record component.
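The two loading paths described in item 3a can be contrasted in a few lines. The sketch assumes a fixed batch size as the micro-batch trigger and uses plain lists to stand in for the data cache region and the data warehouse; none of these details come from the patent.

```python
class ExtractionLoader:
    """Routes extracted rows: trickle loading for real-time, micro-batches otherwise."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.cache = []      # stands in for the data cache region
        self.warehouse = []  # stands in for the data warehouse
        self._pending = []   # rows waiting for the next micro-batch

    def load(self, row, realtime):
        if realtime:
            # Trickle loading: each row is written as soon as it arrives.
            self.cache.append(row)
        else:
            # Micro-batch loading: rows accumulate and are written together.
            self._pending.append(row)
            if len(self._pending) >= self.batch_size:
                self.flush()

    def flush(self):
        self.warehouse.extend(self._pending)
        self._pending.clear()


loader = ExtractionLoader(batch_size=2)
loader.load("a", realtime=True)
loader.load("b", realtime=False)
state_before_batch = (list(loader.cache), list(loader.warehouse))
loader.load("c", realtime=False)   # the second pending row completes the batch
state_after_batch = (list(loader.cache), list(loader.warehouse))
```

A production loader would usually also flush on a timer so that a partially filled batch does not wait indefinitely; that is omitted here for brevity.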
Fig. 2, Fig. 3, and Fig. 4 show the basic flow of data extraction and processing in the invention. With the above system structure, each data source sends its changed data to its own message queue, and workflows extract data as needed, forming a multi-source ETL structure in which data acquisition is separated from conversion and loading. This improves processing efficiency when large volumes of real-time data arrive concurrently. The asynchronous mechanism provides a more flexible and effective way of processing data: a task with strict real-time requirements can take priority when it appears, while workflows with looser real-time requirements enter a waiting state and start when needed. With this structure, each workflow is also independent of the raw data system and the persistence system, so a job task is not significantly affected by the working state of the data side; it only needs to learn in real time whether its message queue has been updated. This greatly reduces the amount of control and data processing needed while a workflow runs and lowers the overall resource consumption of the system. Because the systems are mutually independent, errors in each part are isolated and do not stack or cascade, which lowers the error rate, makes faults quick to locate and handle, and keeps the system highly stable.
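The decoupling described above, where a source writes changes to its own message queue and a workflow reads from that queue at its own pace, can be sketched with Python's standard `queue` and `threading` modules. The `None` sentinel, the function names, and the doubling step that stands in for conversion are assumptions made for the example.

```python
import queue
import threading


def extractor(source_rows, out_queue):
    """Data side: push each changed row to the source's message queue."""
    for row in source_rows:
        out_queue.put(row)
    out_queue.put(None)  # sentinel meaning "no more changes" (illustrative only)


def workflow(in_queue, results):
    """Job side: only watches its queue, never the source system itself."""
    while True:
        row = in_queue.get()  # blocks until the queue has been updated
        if row is None:
            break
        results.append(row * 2)  # placeholder for conversion and loading


orders_queue = queue.Queue()
results = []

consumer = threading.Thread(target=workflow, args=(orders_queue, results))
producer = threading.Thread(target=extractor, args=([1, 2, 3], orders_queue))
consumer.start()
producer.start()
producer.join()
consumer.join()
```

Because the two sides share only the queue, either one can be slowed, restarted, or replaced without the other needing to know, which is the isolation property the paragraph above describes.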
The above embodiments only illustrate the technical solution of the invention and do not limit its scope of protection. Modifying the technical solution of the invention or replacing parts of it with equivalents does not depart from the essence and scope of that technical solution.

Claims (7)

1. A data warehouse ETL operating system, characterized in that it comprises:
i. a user interaction module, comprising:
1a. an interactive maintenance component for managing description data during data integration and data flow;
1b. a data component for defining the mapping between raw data and target data, the data conversion relationships, the data processing flow, and the interaction process with ETL;
1c. a visualization component for displaying the mapping between raw data and target data, the data conversion relationships, the data processing flow, and the ETL interaction process;
the description data includes at least process description data, which describes the specific ETL operating process; it comprises information description data for the components within a process and configuration-relationship description data between components;
ii. a job management module, comprising:
2a. a control component for controlling and changing the start of a workflow, the job type, and the workflow control mode, and for pushing the workflow to the job execution component;
2b. a decomposition component for parsing the description data of each workflow, building the membership relationships between workflows, and pushing them to the job execution component;
2c. an execution component for switching between workflows and for starting the workflow control mode; the execution component executes the workflows pushed by the control component and/or the decomposition component directly, at a set time, or periodically; multiple workflows pushed at the same time or in the same period are sorted by the real-time strength of the raw data they require and executed in that order, strongest first;
2d. a conversion component for preprocessing raw data during job execution; after receiving data to be processed, the conversion component uses a data-row objectification tool to turn each data row into an object format it can handle, retrieves the data, matches the current attribute values of the data against the rules in the conversion rules, and converts and forwards the data that matches a conversion rule;
2e. a response component that acts as the data server and as a distributed cache of raw data; the response component works asynchronously, raw data obtained by the raw data component and the data extraction component is published through the response component, and each workflow acquires the data it needs through the response component;
iii. a data component module, comprising:
3a. a database component for storing data from the devices or platforms that hold the raw data, including a raw data component and a data extraction component; the data extraction component is an extraction tool independent of the device or platform holding the raw data;
for real-time workflows, the data extraction component imports data by direct trickle loading and stores it in the data cache region; for the data warehouse and non-real-time workflows, it imports data by micro-batch loading and stores it in the data warehouse;
3b. a file operation component;
3c. a record component.
2. The data warehouse ETL operating system according to claim 1, characterized in that the response component assigns the conversion task of each workflow one of five intermediate states: submitted, waiting, ready, executing, and stopped; each conversion task carries eight flags: number, period, state, priority, sequence number, record, wait, and repeat; among the states, submitted means the conversion has not yet reached its start time and is held for execution, waiting means the task has entered the conversion queue and is waiting for raw data, ready means the raw data is in place and the task is ready to run, executing means the conversion is in progress, and stopped means the conversion has stopped; among the flags, number is the unique code of each conversion task, period is the time the task waits before executing (0 for a real-time task), state is the real-time status of the task (initial value 1), priority is the rank assigned according to the task's real-time demand (the higher the priority, the larger the value), sequence number is the queue order of the task's raw data in the data table, record is the record made when the conversion task ends (initial value 0), wait is the time the task has spent waiting in the ready queue (initial value 0), and repeat indicates whether the task needs to be executed repeatedly;
during conversion, the response component takes the priority level of any conversion task to be the sum of the task's priority flag value and its wait flag value; each time a task scheduling round is executed, the wait flag is incremented by one.
3. The data warehouse ETL operating system according to claim 1, characterized in that the response component builds data tables according to the category of the raw data, and the database component places each piece of raw data into the corresponding table in order; the execution component tracks the data in the table for the raw data category its workflow needs and processes the raw data according to the workflow's content; once raw data has accumulated beyond a specific quantity or time, the response component deletes the old data.
4. The data warehouse ETL operating system according to claim 1, characterized in that the conversion rules are defined with a definition tool from the detection rules and processing rules for problem data in the different workflows and are stored in a rule base; when a workflow runs, the corresponding conversion rules are loaded and the corresponding raw data is preprocessed; the response component uses dynamic loading, so new rules are loaded and applied to data conversion without affecting data preprocessing already in progress; the response component detects and judges combinations of logical relations in the data using each rule's pattern matching algorithm.
5. The data warehouse ETL operating system according to claim 1, characterized in that the visualization component is also used to output the operation window of the interactive maintenance component, to output operation monitoring data, and to output process log information.
6. The data warehouse ETL operating system according to claim 1, characterized in that the data extraction component uses a trigger-based real-time extraction method; when the raw data changes, the trigger pushes the changed data to the response component, and the response component notifies all work flows that interact with that raw data.
7. The data warehouse ETL operating system according to claim 1, characterized in that the interaction with the ETL process includes running ETL tasks, monitoring ETL operation, and managing record data; the system further includes a description data repository, in which the process description data is stored in XML format.
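The priority rule in claim 2 (effective priority equals the priority flag plus the wait flag, with the wait flag incremented at every scheduling pass) can be illustrated with a minimal Python sketch. The class and function names are hypothetical, and two points are assumptions not stated in the claim: that the initial state value 1 corresponds to "submitted", and that only tasks left in the ready queue have their wait flag incremented.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Five intermediate states named in claim 2; the numeric values are assumed.
    SUBMITTED = 1
    WAITING = 2
    READY = 3
    EXECUTING = 4
    STOPPED = 5


@dataclass
class ConversionTask:
    number: int                      # unique code of the task
    period: int = 0                  # wait before execution; 0 for a real-time task
    state: State = State.SUBMITTED   # claim gives initial value 1 (mapping assumed)
    priority: int = 0                # larger value means higher priority
    serial: int = 0                  # queue position of the task's raw data
    record: int = 0                  # initial value 0
    wait: int = 0                    # time spent in the ready queue, initial value 0
    repeat: bool = False             # whether the task must be executed repeatedly

    def effective_priority(self) -> int:
        # Claim 2: priority level = priority flag + wait flag.
        return self.priority + self.wait


def schedule_once(ready_queue):
    """Run one scheduling pass: pick the highest effective priority, age the rest."""
    if not ready_queue:
        return None
    chosen = max(ready_queue, key=lambda t: t.effective_priority())
    ready_queue.remove(chosen)
    chosen.state = State.EXECUTING
    for task in ready_queue:
        task.wait += 1               # assumed: only tasks still queued are aged
    return chosen
```

Under this reading, a low-priority task cannot be postponed indefinitely, because its wait flag grows with every pass until its sum exceeds the priority of newly arrived tasks.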
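The per-category data tables and the deletion of old data described in claim 3 might look like the following Python sketch. The class name, the in-memory deques, and the two limits (`max_rows`, `max_age_seconds`) are assumptions for illustration; the claim only says that old data is deleted once a specific quantity or time is exceeded.

```python
import time
from collections import defaultdict, deque


class RawDataTables:
    """One FIFO table per raw-data category, pruned by row count or age."""

    def __init__(self, max_rows, max_age_seconds):
        self.max_rows = max_rows
        self.max_age = max_age_seconds
        self._tables = defaultdict(deque)   # category -> deque of (timestamp, row)

    def put(self, category, row, now=None):
        # Rows are appended in arrival order, then the table is pruned.
        now = time.time() if now is None else now
        self._tables[category].append((now, row))
        self.prune(category, now)

    def prune(self, category, now):
        # Drop the oldest rows while the table is too long or its head is too old.
        table = self._tables[category]
        while table and (len(table) > self.max_rows
                         or now - table[0][0] > self.max_age):
            table.popleft()

    def rows(self, category):
        # What an execution component would read for a work flow's category.
        return [row for _, row in self._tables[category]]
```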
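Claim 4's rule base with dynamic loading can be sketched as follows in Python. This is only one possible reading: regular expressions stand in for the unspecified pattern matching algorithm, and the names `Rule`, `RuleBase`, and `preprocess` are hypothetical. A running preprocessing pass takes an immutable snapshot of its rules, so loading a new rule does not disturb it.

```python
import re
import threading
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern          # detection rule (regex is an assumed stand-in)
    fix: Callable[[str], str]    # processing rule applied to matching data


class RuleBase:
    def __init__(self):
        self._lock = threading.Lock()
        self._rules = {}         # work flow name -> tuple of rules

    def load(self, workflow, rule):
        # Build a new tuple instead of mutating, so existing snapshots stay valid.
        with self._lock:
            current = self._rules.get(workflow, ())
            self._rules[workflow] = current + (rule,)

    def rules_for(self, workflow):
        with self._lock:
            return self._rules.get(workflow, ())


def preprocess(rule_base, workflow, records):
    rules = rule_base.rules_for(workflow)   # snapshot taken once per run
    cleaned = []
    for value in records:
        for rule in rules:
            if rule.pattern.search(value):  # detect problem data
                value = rule.fix(value)     # apply the processing rule
        cleaned.append(value)
    return cleaned
```

Rules loaded while a pass is in progress take effect on the next call to `preprocess`, which is the sense in which running preprocessing is left unaffected.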
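The trigger-based extraction in claim 6 can be tried out with SQLite, whose triggers may call a Python function registered through `sqlite3.Connection.create_function`. The patent does not name a database, so SQLite, the table `raw_orders`, the SQL function `push_change`, and the `ResponseComponent` class are all assumptions made for this sketch.

```python
import sqlite3


class ResponseComponent:
    def __init__(self):
        self._subscribers = {}   # table name -> list of work flow callbacks

    def subscribe(self, table, callback):
        self._subscribers.setdefault(table, []).append(callback)

    def on_change(self, table, row_id, value):
        # Notify every work flow that interacts with this raw data.
        for callback in self._subscribers.get(table, []):
            callback(row_id, value)


def install_trigger(conn, response):
    # Expose the response component to SQL, then fire it from triggers.
    conn.create_function("push_change", 3, response.on_change)
    conn.executescript("""
        CREATE TABLE raw_orders (id INTEGER PRIMARY KEY, amount REAL);
        CREATE TRIGGER raw_orders_ai AFTER INSERT ON raw_orders
        BEGIN
            SELECT push_change('raw_orders', NEW.id, NEW.amount);
        END;
        CREATE TRIGGER raw_orders_au AFTER UPDATE ON raw_orders
        BEGIN
            SELECT push_change('raw_orders', NEW.id, NEW.amount);
        END;
    """)
```

A work flow registers with `subscribe("raw_orders", callback)`; every later insert or update on the table then reaches the callback immediately, without a periodic polling job.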
CN201811283414.4A 2018-10-31 2018-10-31 ETL operating system of data warehouse Active CN109299180B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811283414.4A CN109299180B (en) 2018-10-31 2018-10-31 ETL operating system of data warehouse

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811283414.4A CN109299180B (en) 2018-10-31 2018-10-31 ETL operating system of data warehouse

Publications (2)

Publication Number Publication Date
CN109299180A true CN109299180A (en) 2019-02-01
CN109299180B CN109299180B (en) 2021-08-27

Family

ID=65145695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811283414.4A Active CN109299180B (en) 2018-10-31 2018-10-31 ETL operating system of data warehouse

Country Status (1)

Country Link
CN (1) CN109299180B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080306984A1 (en) * 2007-06-08 2008-12-11 Friedlander Robert R System and method for semantic normalization of source for metadata integration with etl processing layer of complex data across multiple data sources particularly for clinical research and applicable to other domains
CN102142039A (en) * 2004-12-17 2011-08-03 亚马逊科技公司 Apparatus and method for data warehousing
CN105359141A (en) * 2013-05-17 2016-02-24 甲骨文国际公司 Supporting combination of flow based ETL and entity relationship based ETL
US9922072B1 (en) * 2010-12-31 2018-03-20 United Services Automobile Association (Usaa) Extract, transform, and load application complexity management framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
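Yang et al.: "A Preliminary Study on Building a Hadoop-Based Distributed Data Warehouse for Large Commercial Banks", Computer Applications and Software *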

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021099903A1 (en) * 2019-11-18 2021-05-27 International Business Machines Corporation Multi-tenant extract transform load resource sharing
GB2603098A (en) * 2019-11-18 2022-07-27 Ibm Multi-tenant extract transform load resource sharing
GB2603098B (en) * 2019-11-18 2022-12-14 Ibm Multi-tenant extract transform load resource sharing
CN111190972A (en) * 2019-12-31 2020-05-22 武汉俊楚信息科技有限公司 Experiment data management system
CN111897827A (en) * 2020-07-06 2020-11-06 苏宁金融科技(南京)有限公司 Data updating method and system for data warehouse and electronic equipment
US11841871B2 (en) 2021-06-29 2023-12-12 International Business Machines Corporation Managing extract, transform and load systems
CN113434497A (en) * 2021-08-26 2021-09-24 中国电子信息产业集团有限公司 Data element vault composed of data warehouse and data element warehouse
CN114860349A (en) * 2022-07-06 2022-08-05 深圳华锐分布式技术股份有限公司 Data loading method, device, equipment and medium

Also Published As

Publication number Publication date
CN109299180B (en) 2021-08-27

Similar Documents

Publication Publication Date Title
CN109299180A (en) A kind of data warehouse ETL operating system
CN100444121C (en) Batch task scheduling engine and dispatching method
CN107864174B (en) Rule-based Internet of things equipment linkage method
CN109933306A (en) Mix Computational frame generation, data processing method, device and mixing Computational frame
CN108920261A (en) A kind of two-stage self-adapting dispatching method suitable for large-scale parallel data processing task
CN101017546A (en) Method and device for categorical data batch processing
CN106548324A (en) A kind of IT system O&M service management system
CN102467532A (en) Task processing method and task processing device
CN106126601A (en) A kind of social security distributed preprocess method of big data and system
CN108037919A (en) A kind of visualization big data workflow configuration method and system based on WEB
CN108664635B (en) Method, device, equipment and storage medium for acquiring database statistical information
CN109753596B (en) Information source management and configuration method and system for large-scale network data acquisition
CN112162980A (en) Data quality control method and system, storage medium and electronic equipment
US20200334314A1 (en) Emergency disposal support system
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN107544844A (en) A kind of method and device of lifting Spark operating effectiveness
CN115757603A (en) Visual data modeling system and method
CN114416849A (en) Data processing method and device, electronic equipment and storage medium
CN107451211B (en) A kind of download system based on RabbitMQ and MongoDB
CN116974994A (en) High-efficiency file collaboration system based on clusters
CN111046059B (en) Low-efficiency SQL statement analysis method and system based on distributed database cluster
CN112667873A (en) Crawler system and method suitable for general data acquisition of most websites
CN114756629B (en) Multi-source heterogeneous data interaction analysis engine and method based on SQL
CN105630997A (en) Data parallel processing method, device and equipment
CN115599524A (en) Data lake system based on cooperative scheduling processing of streaming data and batch data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240104

Address after: 430074 East Lake New Technology Development Zone, Wuhan City, Hubei Province, China (Free Trade Zone, Wuhan Area)

Patentee after: Wuhan Comb Big Data Technology Co.,Ltd.

Address before: 430000 No. 04, room 01, floor 1-2, zone 3, 3S geospatial information industry base, wudayuan Road, Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee before: WUHAN OPTICS VALLEY DATA TECHNOLOGIES Co.,Ltd.