CN109299180A - A kind of data warehouse ETL operating system - Google Patents
- Publication number
- CN109299180A (application CN201811283414.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- component
- initial
- work flow
- task
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention belongs to the technical field of data processing methods or systems, and in particular relates to a data warehouse ETL operating system. The data warehouse ETL operating system of the invention extracts data in real time, which substantially reduces the time needed to acquire and process initial data. Through an extract-convert-publish-acquire sequence it distributes the processing of data conversion and loading, replaces the traditional periodic job execution steps, reduces data latency, and improves the real-time responsiveness of the data.
Description
Technical field
The invention belongs to the technical field of data processing methods or systems, and in particular relates to a data warehouse ETL operating system.
Background art
With the rapid development of computer and information technology, electronic data has become increasingly important in the daily operation of enterprises, which urgently need to analyze data efficiently, accurately, and promptly. A traditional data warehouse has a long data-loading cycle and can often only support analysis and queries over historical data; it cannot respond to changes in the data in real time. Information is a valuable resource for the modern enterprise and the basis for scientific management and decision analysis. At present, most enterprises spend large amounts of money and time building online transaction processing (OLTP) business systems and office automation systems to record the data associated with transaction processing. According to statistics, data volume doubles every 2 to 3 years. These data contain enormous commercial value, yet the data an enterprise actually pays attention to usually accounts for only about 2% to 4% of the total. Enterprises therefore still fail to make full use of the data resources they already hold, wasting time and money and missing the best opportunity to make key business decisions. How to turn data into information and knowledge by technical means has thus become the main bottleneck in improving an enterprise's core competitiveness. As enterprise demand for real-time data processing grows, automated processing techniques and schemes that offer better real-time performance and faster, more convenient data acquisition are an important direction for future enterprise data processing.
Summary of the invention
The purpose of the invention is to provide a data warehouse ETL operating system that improves the efficiency with which the data system handles tasks when many work flows run concurrently, and the real-time performance of data acquisition and processing, while also improving the independence of data between system layers, reducing the workload of processing associated data, and keeping each work flow relatively independent under multitasking.
To achieve the above purpose, the invention adopts the following technical scheme.
A data warehouse ETL operating system, comprising:
i. A user interaction module, comprising:
1a. an interactive maintenance component for managing description data during data integration and data flow;
1b. a data component for defining the mapping relations between initial data and target data, the data conversion relations, the data processing flow, and the interaction process with ETL;
1c. a visualization component for displaying the mapping relations between initial data and target data, the data conversion relations, the data processing flow, and the interaction process with ETL;
The description data includes at least process description data describing the specific ETL operation process, which comprises description data for the components within a process and description data for the configuration relations between components;
ii. A job management module, comprising:
2a. a control component for controlling and changing the start of a work flow, its job type, and its control mode, and for pushing the work flow to the job execution component;
2b. a decomposition component for parsing the description data of each work flow, constructing the subordination relations between work flows, and pushing them to the job execution component;
2c. an execution component for switching between work flows and starting the work flow control mode; the execution component executes the work flows pushed by the control component and/or the decomposition component immediately, at a scheduled time, or periodically; multiple work flows pushed at the same time or in the same period are sorted by how strongly real-time the initial data they require is, and are then executed in order from strongest to weakest;
2d. a conversion component for preprocessing initial data while a job runs; after receiving the data to be processed, the conversion component uses a data-row objectification tool to turn each data row into an object format it can handle, retrieves the data, matches the current attribute values of the data against the rules in the conversion rules, and converts and forwards the data that satisfy a conversion rule;
2e. a response component that serves as the data server and as the distributed cache for initial data; the response component works asynchronously: initial data obtained by the initial data component and the data extraction component is published through the response component, and each work flow acquires the data it needs through the response component;
iii. A data component module, comprising:
3a. a database component for storing data from the equipment or platform corresponding to the initial data, including an initial data component and a data extraction component; the data extraction component is an extraction tool independent of the equipment or platform the initial data comes from; for real-time work flows, the data extraction component imports data by direct trickle loading and stores it in the data cache area; for the data warehouse and non-real-time work flows, it imports data by micro-batch loading and stores it in the data warehouse;
3b. a file operation component;
3c. a record component;
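The two loading paths described for the data extraction component in 3a can be illustrated with a minimal sketch. The class name `Loader`, its `batch_size` parameter, and the in-memory lists standing in for the data cache area and the data warehouse are all assumptions made for illustration; the patent does not specify an implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Loader:
    """Hypothetical loader: trickle path for real-time work flows,
    micro-batch path for the warehouse and non-real-time work flows."""
    batch_size: int = 3                             # assumed micro-batch size
    cache: list = field(default_factory=list)       # stands in for the data cache area
    warehouse: list = field(default_factory=list)   # stands in for the data warehouse
    _pending: list = field(default_factory=list)    # rows waiting for the next batch

    def load(self, row, realtime):
        if realtime:
            # Trickle loading: every row is written through immediately.
            self.cache.append(row)
        else:
            # Micro-batch loading: rows accumulate and are written in small groups.
            self._pending.append(row)
            if len(self._pending) >= self.batch_size:
                self.flush()

    def flush(self):
        """Write the accumulated micro-batch to the warehouse."""
        self.warehouse.extend(self._pending)
        self._pending.clear()
```

In this sketch a real-time row is visible in `cache` as soon as `load` returns, while a non-real-time row only reaches `warehouse` once a full batch has accumulated or `flush` is called.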
In the above data warehouse ETL operating system, the response component assigns the conversion task of each work flow one of five intermediate states: submitted, waiting, ready, executing, and stopped. Each conversion task carries eight flags: number, period, state, priority, sequence number, record, wait, and repeat. The states mean the following: submitted, the conversion has not yet reached its start time and is held pending execution; waiting, the conversion has entered the conversion queue and is waiting for initial data; ready, the initial data is in place and the conversion is ready to run; executing, the conversion is in progress; stopped, the conversion has stopped. The flags mean the following: number is the unique code of each conversion task; period is the waiting time before the conversion task executes (0 for a real-time task); state is the real-time state of the conversion task (initial value 1); priority is the priority ranked by the task's real-time demand (the higher the priority, the larger the value); sequence number is the queue order of the task's initial data in the data table; record is the record made when the conversion task last ended (initial value 0); wait is the length of time the conversion task has waited in the ready queue (initial value 0); repeat indicates whether the conversion task needs to be executed repeatedly;
While the response component executes conversions, the priority level of any conversion task is the sum of its priority flag value and its wait flag value; each time a task scheduling pass is executed, the wait flag is incremented by one.
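The priority rule above (priority level equals priority flag plus wait flag, with the wait flag incremented on every scheduling pass) is a form of priority aging. The following is a minimal sketch under stated assumptions: the names `State`, `ConversionTask`, and `schedule` are hypothetical, only some of the eight flags are modeled, and the choice to increment the wait flag only for tasks left in the ready queue is an assumption, since the text does not say which tasks are incremented.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    SUBMITTED = "submitted"
    WAITING = "waiting"
    READY = "ready"
    EXECUTING = "executing"
    STOPPED = "stopped"


@dataclass
class ConversionTask:
    number: int                  # unique code of the task
    priority: int                # larger value means stronger real-time demand
    period: int = 0              # 0 for a real-time task
    wait: int = 0                # scheduling passes spent in the ready queue
    state: State = State.READY

    @property
    def level(self):
        # Priority level = priority flag value + wait flag value.
        return self.priority + self.wait


def schedule(tasks):
    """Run one scheduling pass: start the ready task with the highest level."""
    ready = [t for t in tasks if t.state is State.READY]
    if not ready:
        return None
    chosen = max(ready, key=lambda t: t.level)
    chosen.state = State.EXECUTING
    for t in ready:
        if t is not chosen:
            t.wait += 1          # assumption: only tasks left waiting are aged
    return chosen
```

Because the wait flag grows on each pass, a low-priority task that keeps being passed over eventually outranks newly arrived tasks of the same or slightly higher priority, which is one way to keep a single long-running stream of urgent tasks from starving the rest.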
In the above data warehouse ETL operating system, the response component builds data tables according to the category of the initial data, and the database component places each item of initial data into the corresponding table in order; the execution component tracks the corresponding data in the data table according to the initial data category of the work flow and processes the initial data according to the content of the work flow; once the initial data has accumulated beyond a specific quantity or age, the response component deletes the old data.
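The per-category tables and the deletion of old data can be sketched as follows. The class name `CategoryTables` and the parameters `max_rows` and `max_age` are assumptions chosen for illustration; the text only states that old data is deleted after a specific quantity or time is exceeded.

```python
import time
from collections import defaultdict, deque


class CategoryTables:
    """Hypothetical cache: one ordered table per initial-data category."""

    def __init__(self, max_rows=1000, max_age=60.0):
        self.max_rows = max_rows          # assumed quantity threshold
        self.max_age = max_age            # assumed age threshold, in seconds
        self.tables = defaultdict(deque)  # category -> deque of (timestamp, row)

    def put(self, category, row, now=None):
        now = time.time() if now is None else now
        table = self.tables[category]
        table.append((now, row))          # rows are kept in arrival order
        self._evict(table, now)

    def _evict(self, table, now):
        # Drop the oldest rows while the table is too long or its head is too old.
        while table and (len(table) > self.max_rows
                         or now - table[0][0] > self.max_age):
            table.popleft()

    def rows(self, category):
        return [row for _, row in self.tables[category]]
```

Passing `now` explicitly makes the eviction behaviour easy to check without waiting for real time to pass.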
In the above data warehouse ETL operating system, the conversion rules are defined with a definition tool according to the detection rules and handling rules for problem data in the different work flows, and are stored in a rule base; when a work flow runs, the corresponding conversion rules are loaded and the corresponding initial data is preprocessed; using dynamic loading, the response component loads new rules and applies them to data conversion without affecting data preprocessing already in progress; the response component checks logical combinations of the data according to the pattern-matching algorithm of each rule.
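One way to load new rules without disturbing preprocessing that is already running is to keep the rule set immutable and swap in a new copy on each load. The sketch below takes that approach; `RuleBase`, `snapshot`, and `preprocess` are hypothetical names, and representing a rule as a (match, convert) pair of functions is an assumption, not the patent's rule format.

```python
import threading


class RuleBase:
    """Hypothetical rule base holding (match, convert) function pairs."""

    def __init__(self):
        self._rules = ()                 # immutable tuple, replaced on every load
        self._lock = threading.Lock()

    def load(self, match, convert):
        # Build a new tuple instead of mutating, so snapshots already
        # handed out to running jobs are never changed underneath them.
        with self._lock:
            self._rules = self._rules + ((match, convert),)

    def snapshot(self):
        return self._rules


def preprocess(rows, rules):
    """Apply every matching rule to each row and collect the results."""
    out = []
    for row in rows:
        for match, convert in rules:
            if match(row):
                row = convert(row)
        out.append(row)
    return out
```

A running job calls `snapshot()` once and keeps using that tuple; rules loaded afterwards only affect jobs that take a later snapshot.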
In the above data warehouse ETL operating system, the visualization component is also used to output the operation window of the interactive maintenance component, job monitoring data, and process log information.
In the above data warehouse ETL operating system, the data extraction component uses a trigger-based real-time extraction method; when initial data changes, the trigger pushes the changed data to the response component, and the response component notifies every work flow that interacts with that initial data.
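The trigger-and-notify path can be sketched in memory as follows. `ResponseComponent`, `SourceTable`, and the callback signature are hypothetical; in a real deployment the trigger would more likely be a database trigger or change-data-capture hook, which is not modeled here.

```python
from collections import defaultdict


class ResponseComponent:
    """Hypothetical response component: publishes changes to subscribed work flows."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # source name -> callbacks
        self.latest = {}                         # source name -> last changed row

    def subscribe(self, source, callback):
        self._subscribers[source].append(callback)

    def publish(self, source, row):
        self.latest[source] = row
        for callback in self._subscribers[source]:
            callback(source, row)                # notify each interested work flow


class SourceTable:
    """Hypothetical source table whose upsert acts as the trigger."""

    def __init__(self, name, response):
        self.name = name
        self.rows = {}
        self._response = response

    def upsert(self, key, row):
        changed = self.rows.get(key) != row
        self.rows[key] = row
        if changed:
            # The "trigger": push only data that actually changed.
            self._response.publish(self.name, row)
```

Writing an identical row twice produces a single notification in this sketch, so work flows are only woken when the initial data really changes.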
In the above data warehouse ETL operating system, the interaction process with ETL includes running ETL tasks, inspecting ETL jobs, and managing record data; the system further includes a description data repository, in which process description data is stored in XML format.
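As an illustration of process description data stored as XML, the sketch below parses a small document into its components and the configuration relations between them. The element and attribute names (`process`, `component`, `link`, `id`, `type`, `from`, `to`) are invented for this example; the patent does not define a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical process description: components plus the links between them.
PROCESS_XML = """
<process name="daily_orders">
  <component id="extract" type="extraction"/>
  <component id="convert" type="conversion"/>
  <component id="load" type="loading"/>
  <link from="extract" to="convert"/>
  <link from="convert" to="load"/>
</process>
"""


def parse_process(xml_text):
    """Return (process name, {component id: type}, [(from, to), ...])."""
    root = ET.fromstring(xml_text.strip())
    components = {c.get("id"): c.get("type") for c in root.findall("component")}
    links = [(l.get("from"), l.get("to")) for l in root.findall("link")]
    return root.get("name"), components, links
```

A decomposition component along the lines of 2b could use output of this shape to work out which steps of a work flow depend on which.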
The beneficial effects of the above data warehouse ETL operating system are:
1. The data warehouse ETL operating system of the invention extracts data in real time, substantially reducing the time needed to acquire and process initial data; through extract-convert-publish-acquire it distributes the processing of data conversion and loading, replaces the traditional periodic job execution steps, reduces data latency, and improves the real-time responsiveness of the data;
2. Data conversion and targeted extraction optimize the data processing flow, improve data concentration, reduce the time consumed by the process, and improve data accuracy;
3. The branch processing mode based on priority handling and associated data avoids the possibility of the server sitting idle, ensures that a suitable work flow is always running, satisfies the real-time requirements of the work flows, reduces data delay, and improves the effectiveness of system data processing;
4. Based on the above system structure, the service area of each work flow and the acquisition of initial data are relatively independent and do not affect each other; the acquisition of initial data keeps running efficiently and stably, which keeps the input stable and accurate and provides an important guarantee for the efficiency of the system's data processing; the management and execution efficiency of job tasks is effectively ensured under multitasking and high real-time requirements;
5. The priority-based arrangement and management structure improves the effectiveness of real-time extraction and data concentration when the system processes data, prevents the length of any one work flow's processing time from affecting the stability of the whole system, eliminates the adverse influence of a single task, and improves overall system stability.
Brief description of the drawings
Fig. 1 is a schematic diagram of the principle and structure of the data warehouse ETL operating system;
Fig. 2 is a schematic diagram of data transfer in the data warehouse ETL operating system;
Fig. 3 is a schematic diagram of data updating in the data warehouse ETL operating system;
Fig. 4 is a schematic diagram of the separation of jobs and data in the data warehouse ETL operating system.
Specific embodiment
The invention is described in detail below with reference to a specific embodiment.
Fig. 1 shows the principle and structure of a data warehouse ETL operating system of the invention, which contains:
i. A user interaction module, comprising:
1a. an interactive maintenance component for managing description data during data integration and data flow;
1b. a data component for defining the mapping relations between initial data and target data, the data conversion relations, the data processing flow, and the interaction process with ETL;
1c. a visualization component for displaying the mapping relations between initial data and target data, the data conversion relations, the data processing flow, and the interaction process with ETL; the visualization component is also used to output the operation window of the interactive maintenance component, job monitoring data, and process log information; the interaction process with ETL includes running ETL tasks, inspecting ETL jobs, and managing record data; the system further includes a description data repository, in which process description data is stored in XML format.
The description data includes at least process description data describing the specific ETL operation process, which comprises description data for the components within a process and description data for the configuration relations between components;
The user interaction module gives the system graphical and textual means of interaction, helps the user control data and services, and provides a route for access control; visualization techniques such as a UI make work flow processing and data easier to recognize and the system easier to use. Centralized control of the description data allows the overall working state to be analyzed more accurately and partial operation content and state to be obtained more quickly and accurately, improving the effectiveness of user management and the system response time;
ii. A job management module, comprising:
2a. a control component for controlling and changing the start of a work flow, its job type, and its control mode, and for pushing the work flow to the job execution component;
2b. a decomposition component for parsing the description data of each work flow, constructing the subordination relations between work flows, and pushing them to the job execution component;
2c. an execution component for switching between work flows and starting the work flow control mode; the execution component executes the work flows pushed by the control component and/or the decomposition component immediately, at a scheduled time, or periodically; multiple work flows pushed at the same time or in the same period are sorted by how strongly real-time the initial data they require is, and are then executed in order from strongest to weakest;
2d. a conversion component for preprocessing initial data while a job runs; after receiving the data to be processed, the conversion component uses a data-row objectification tool to turn each data row into an object format it can handle, retrieves the data, matches the current attribute values of the data against the rules in the conversion rules, and converts and forwards the data that satisfy a conversion rule; the conversion rules are defined with a definition tool according to the detection rules and handling rules for problem data in the different work flows, and are stored in a rule base; when a work flow runs, the corresponding conversion rules are loaded and the corresponding initial data is preprocessed; using dynamic loading, the response component loads new rules and applies them to data conversion without affecting data preprocessing already in progress; the response component checks logical combinations of the data according to the pattern-matching algorithm of each rule; the response component assigns the conversion task of each work flow one of five intermediate states: submitted, waiting, ready, executing, and stopped; each conversion task carries eight flags: number, period, state, priority, sequence number, record, wait, and repeat; submitted means the conversion has not yet reached its start time and is held pending execution, waiting means the conversion has entered the conversion queue and is waiting for initial data, ready means the initial data is in place and the conversion is ready to run, executing means the conversion is in progress, and stopped means the conversion has stopped; number is the unique code of each conversion task, period is the waiting time before the conversion task executes (0 for a real-time task), state is the real-time state of the conversion task (initial value 1), priority is the priority ranked by the task's real-time demand (the higher the priority, the larger the value), sequence number is the queue order of the task's initial data in the data table, record is the record made when the conversion task last ended (initial value 0), wait is the length of time the conversion task has waited in the ready queue (initial value 0), and repeat indicates whether the conversion task needs to be executed repeatedly; while the response component executes conversions, the priority level of any conversion task is the sum of its priority flag value and its wait flag value; each time a task scheduling pass is executed, the wait flag is incremented by one.
As shown in Fig. 2, the execution component and the conversion component allow the invention to optimize control of concurrent multitasking, avoiding the possibility that a task with high real-time requirements cannot be executed promptly and allocating resources more rationally and effectively. Source data triggers the real-time data update process through its own changes, which improves the real-time performance of the data while reducing overall system resource consumption. The data server is not affected by data changes, and the data extraction process and the data acquisition process are independent of each other, so each can deliver its own performance without interference. Rule management and generation control the data preprocessing function, effectively reducing the resources the data system consumes during data conversion, shortening data processing time, further ensuring the real-time performance of the data, and improving job response speed, so as to meet the data requirements of each work flow and keep the whole system running efficiently. Combined with the task switching and management capability of the execution component and the decomposition component, job order can be scheduled more rationally and efficiently, meeting the need for real-time task processing and priority adjustment and improving the flexibility of the system.
2e. a response component that serves as the data server and as the distributed cache for initial data; the response component works asynchronously: initial data obtained by the initial data component and the data extraction component is published through the response component, and each work flow acquires the data it needs through the response component; the response component builds data tables according to the category of the initial data, and the database component places each item of initial data into the corresponding table in order; the execution component tracks the corresponding data in the data table according to the initial data category of the work flow and processes the initial data according to the content of the work flow; once the initial data has accumulated beyond a specific quantity or age, the response component deletes the old data.
To handle and acquire data of many structure types from multiple data sources in actual use, the conversion component and the response component in the invention form an intermediate system, so that the data source layer and the application layer each run as an independent whole; this intermediate system coordinates the relations between data sources and data applications, guaranteeing task independence while improving data retrieval, processing efficiency, and the stability of the whole system;
iii. A data component module, comprising:
3a. a database component for storing data from the equipment or platform corresponding to the initial data, including an initial data component and a data extraction component; the data extraction component is an extraction tool independent of the equipment or platform the initial data comes from; for real-time work flows, the data extraction component imports data by direct trickle loading and stores it in the data cache area; for the data warehouse and non-real-time work flows, it imports data by micro-batch loading and stores it in the data warehouse; the data extraction component uses a trigger-based real-time extraction method; when initial data changes, the trigger pushes the changed data to the response component, and the response component notifies every work flow that interacts with that initial data.
3b. a file operation component;
3c. a record component;
Fig. 2, Fig. 3, and Fig. 4 show the basic data extraction and processing flow of the invention. Based on the above system structure, each source data source sends its changed data to its corresponding message queue, and each work flow extracts data as it needs, forming a multi-source ETL structure in which data acquisition is separated from and independent of conversion and loading, which improves processing efficiency when large volumes of real-time data arrive concurrently. The asynchronous mechanism allows more flexible and effective data processing: when a task with high real-time requirements appears it can be handled first, while work flows with low real-time requirements enter a waiting state and start as needed. Based on the above structure, each work flow is likewise independent of the initial data system and of the persistent storage system, so job tasks are not greatly affected by the working state of the data end and only need to learn in real time whether the corresponding message queue has been updated. This greatly reduces the amount of control and data processing while work flows run and reduces overall system resource consumption. Keeping the systems independent of one another also isolates errors in each part, so they do not compound or cascade, which effectively reduces the error rate, makes errors quick to locate and handle once they occur, and ensures high system stability.
The above embodiment only illustrates the technical scheme of the invention and does not limit its scope of protection; modifications or equivalent replacements of the technical scheme do not depart from its essence and scope.
Claims (7)
1. A data warehouse ETL operating system, characterized in that it contains:
i. A user interaction module, comprising:
1a. an interactive maintenance component for managing description data during data integration and data flow;
1b. a data component for defining the mapping relations between initial data and target data, the data conversion relations, the data processing flow, and the interaction process with ETL;
1c. a visualization component for displaying the mapping relations between initial data and target data, the data conversion relations, the data processing flow, and the interaction process with ETL;
The description data includes at least process description data describing the specific ETL operation process, which comprises description data for the components within a process and description data for the configuration relations between components;
ii. A job management module, comprising:
2a. a control component for controlling and changing the start of a work flow, its job type, and its control mode, and for pushing the work flow to the job execution component;
2b. a decomposition component for parsing the description data of each work flow, constructing the subordination relations between work flows, and pushing them to the job execution component;
2c. an execution component for switching between work flows and starting the work flow control mode; the execution component executes the work flows pushed by the control component and/or the decomposition component immediately, at a scheduled time, or periodically; multiple work flows pushed at the same time or in the same period are sorted by how strongly real-time the initial data they require is, and are then executed in order from strongest to weakest;
2d. a conversion component for preprocessing initial data while a job runs; after receiving the data to be processed, the conversion component uses a data-row objectification tool to turn each data row into an object format it can handle, retrieves the data, matches the current attribute values of the data against the rules in the conversion rules, and converts and forwards the data that satisfy a conversion rule;
2e. a response component that serves as the data server and as the distributed cache for initial data; the response component works asynchronously: initial data obtained by the initial data component and the data extraction component is published through the response component, and each work flow acquires the data it needs through the response component;
iii. A data component module, comprising:
3a. a database component for storing data from the equipment or platform corresponding to the initial data, including an initial data component and a data extraction component; the data extraction component is an extraction tool independent of the equipment or platform the initial data comes from;
for real-time work flows, the data extraction component imports data by direct trickle loading and stores it in the data cache area; for the data warehouse and non-real-time work flows, it imports data by micro-batch loading and stores it in the data warehouse;
3b. a file operation component;
3c. a record component.
2. The data warehouse ETL operating system according to claim 1, characterized in that the response component assigns the conversion task of each job flow one of five intermediate states: submitted, waiting, ready, executing, and stopped; each conversion task is given eight flags: number, period, state, priority, sequence number, record, wait, and repeat; wherein submitted means the conversion is not yet ready to execute and is awaiting its start time; waiting means the task has entered the conversion queue and is waiting for raw data; ready means the raw data is in place and the task is ready to execute; executing means the conversion is in progress; stopped means the conversion has stopped; number is the unique code of each conversion task; period is the time the conversion task waits before executing (0 for real-time tasks); state is the real-time state of the conversion task (initial value 1); priority is the priority ordered by the real-time requirements of the conversion tasks (the higher the priority, the larger the value); sequence number is the hidden queue order, within the data table, of the raw data corresponding to the conversion task; record is the sequential record written when the conversion task ends (initial value 0); wait is the time the conversion task has waited in the ready queue (initial value 0); repeat indicates whether the conversion task needs to be repeated;
during conversion, the response component takes the priority level of any conversion task to be the sum of its task flag and wait flag values; each time a task scheduling pass is executed, the wait flag is incremented by one.
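The scheduling rule in claim 2 resembles priority scheduling with aging. The following Python sketch illustrates it under two assumptions that the claim does not spell out: the "task flag" is read as the priority flag, and the wait flag is incremented only for tasks still in the ready queue after a scheduling pass. The names `ConversionTask` and `schedule_once` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConversionTask:
    number: int          # unique code of the task
    priority: int        # larger value means higher priority
    wait: int = 0        # initial value 0, as stated in the claim
    state: str = "ready"

    @property
    def level(self):
        # Effective priority level: priority flag plus wait flag (assumed reading).
        return self.priority + self.wait


def schedule_once(ready_queue):
    """Pick the ready task with the highest level, then age the remaining tasks by one."""
    if not ready_queue:
        return None
    chosen = max(ready_queue, key=lambda t: t.level)
    ready_queue.remove(chosen)
    chosen.state = "executing"
    for task in ready_queue:
        task.wait += 1   # every scheduling pass increments the wait flag
    return chosen


queue = [ConversionTask(1, priority=5), ConversionTask(2, priority=3)]
first = schedule_once(queue)
queue.append(ConversionTask(3, priority=3))
second = schedule_once(queue)
```

Task 2 and task 3 share the same priority flag, but task 2 has already waited through one scheduling pass, so it is selected ahead of the newer task 3. This kind of aging keeps low-priority tasks from waiting indefinitely.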
3. The data warehouse ETL operating system according to claim 1, characterized in that the response component constructs data tables according to the categories of the raw data, and the database component places each raw data item into the tables in order; the execution component tracks the corresponding data in the data tables according to the raw data category associated with the job flow, and processes the raw data according to the content of the job flow; the response component deletes old data once the raw data accumulates to or exceeds a specified quantity or age.
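A minimal Python sketch of per-category tables with quantity-based and age-based cleanup follows. The class name `CategoryTables` and the thresholds `max_rows` and `max_age_s` are hypothetical; the claim only says that old data is deleted after a specified quantity or time is reached.

```python
import time
from collections import defaultdict, deque


class CategoryTables:
    """Hypothetical per-category tables with count- and age-based cleanup."""

    def __init__(self, max_rows=1000, max_age_s=3600.0):
        self.max_rows = max_rows      # assumed quantity threshold
        self.max_age_s = max_age_s    # assumed age threshold in seconds
        self.tables = defaultdict(deque)

    def put(self, category, record, now=None):
        now = time.time() if now is None else now
        table = self.tables[category]
        table.append((now, record))   # records are kept in arrival order
        self._purge(table, now)

    def _purge(self, table, now):
        # Drop the oldest rows while the table is too long or its oldest row is too old.
        while table and (len(table) > self.max_rows
                         or now - table[0][0] > self.max_age_s):
            table.popleft()

    def read(self, category):
        """Return the records a job flow would track for its category."""
        return [record for _, record in self.tables[category]]


tables = CategoryTables(max_rows=2, max_age_s=10.0)
tables.put("sensor", "a", now=0.0)
tables.put("sensor", "b", now=1.0)
tables.put("sensor", "c", now=2.0)    # exceeds max_rows, so "a" is dropped
tables.put("sensor", "d", now=20.0)   # "b" and "c" are now older than 10 s
```

Passing `now` explicitly keeps the example deterministic; in normal use the timestamp defaults to the current time.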
4. The data warehouse ETL operating system according to claim 1, characterized in that the conversion rules are defined with a definition tool according to the detection rules and processing rules for problem data in the different job flows, and are stored in a rule base; when each job flow runs, it loads the corresponding conversion rules and preprocesses the corresponding raw data; the response component uses dynamic loading to load new rules and apply them to data conversion without affecting the data preprocessing that is already running; the response component detects and evaluates logical-relation combinations of the data according to the pattern matching algorithm of each rule.
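One way to load new rules without disturbing preprocessing that is already running is to swap an immutable rule set atomically, so that a running job keeps the snapshot it started with. The Python sketch below illustrates this idea. It is an assumption-laden example, not the patent's implementation: `RuleBase`, `snapshot`, and `preprocess` are hypothetical names, and regular expressions stand in for the unspecified pattern matching algorithm.

```python
import re
import threading


class RuleBase:
    """Hypothetical rule base whose rule set can be swapped while jobs run."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rules = ()   # immutable tuple of (compiled pattern, replacement)

    def load(self, rules):
        # Compile first, then swap the reference in one step so that readers
        # never observe a half-loaded rule set.
        compiled = tuple((re.compile(p), r) for p, r in rules)
        with self._lock:
            self._rules = compiled

    def snapshot(self):
        with self._lock:
            return self._rules


def preprocess(records, rules):
    """Apply each detection pattern and its processing rule to every record."""
    cleaned = []
    for value in records:
        for pattern, replacement in rules:
            value = pattern.sub(replacement, value)
        cleaned.append(value)
    return cleaned


base = RuleBase()
base.load([(r"\s+", " ")])                    # collapse whitespace runs
running = base.snapshot()                     # rule set held by a job in progress
base.load([(r"\s+", " "), (r"^N/A$", "")])    # new rule loaded dynamically
old_result = preprocess(["a   b", "N/A"], running)
new_result = preprocess(["a   b", "N/A"], base.snapshot())
```

The job holding the earlier snapshot still sees only the whitespace rule, while any job that takes a snapshot after the second `load` also blanks out `N/A` values.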
5. The data warehouse ETL operating system according to claim 1, characterized in that the visualization component is also used to output the operation window of the interactive maintenance component, operation monitoring data, and process log information.
6. The data warehouse ETL operating system according to claim 1, characterized in that the data extraction component uses a trigger-based real-time extraction method; when the raw data changes, the trigger pushes the changed data to the response component, and the response component notifies all job flows that interact with that raw data.
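The push-and-notify behaviour in claim 6 can be sketched as a small publish/subscribe arrangement in Python. The classes `ResponseComponent` and `SourceTable` and the method `upsert` are hypothetical stand-ins; a real deployment would more likely rely on database triggers or change data capture, which this in-memory example does not model.

```python
from collections import defaultdict


class ResponseComponent:
    """Hypothetical hub that forwards changed data to subscribed job flows."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, source, callback):
        self._subscribers[source].append(callback)

    def publish(self, source, row):
        # Notify every job flow that interacts with this raw data source.
        for callback in self._subscribers[source]:
            callback(row)


class SourceTable:
    """Stand-in for a raw data table with an on-change trigger."""

    def __init__(self, name, hub):
        self.name, self.hub, self.rows = name, hub, {}

    def upsert(self, key, value):
        changed = self.rows.get(key) != value
        self.rows[key] = value
        if changed:
            # The "trigger": push only data that actually changed.
            self.hub.publish(self.name, {"key": key, "value": value})


hub = ResponseComponent()
received = []
hub.subscribe("orders", received.append)
orders = SourceTable("orders", hub)
orders.upsert(1, "new")
orders.upsert(1, "new")      # no change, so nothing is pushed
orders.upsert(1, "shipped")
```

Only the two writes that changed the stored value reach the subscriber, which is the property that makes trigger-based extraction cheaper than polling the source.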
7. The data warehouse ETL operating system according to claim 1, characterized in that the interaction with the ETL process includes running ETL tasks, monitoring ETL operation, and managing record data; the system further comprises a description data repository, in which the process description data is stored in XML format.
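The claim does not define the XML schema of the process description data. The Python sketch below therefore uses invented element and attribute names (`process`, `step`, `order`, `type`) purely to show how such a description could be written and read back with the standard library.

```python
import xml.etree.ElementTree as ET


def describe_process(name, steps):
    """Serialize a process description to an XML string (element names are assumed)."""
    root = ET.Element("process", {"name": name})
    for order, (kind, target) in enumerate(steps, start=1):
        step = ET.SubElement(root, "step", {"order": str(order), "type": kind})
        step.text = target
    return ET.tostring(root, encoding="unicode")


def load_process(xml_text):
    """Parse the XML back into (name, steps)."""
    root = ET.fromstring(xml_text)
    steps = [(s.get("type"), s.text) for s in root.findall("step")]
    return root.get("name"), steps


xml_text = describe_process("daily_sales", [("extract", "orders"), ("load", "warehouse")])
name, steps = load_process(xml_text)
```

Round-tripping through `load_process` returns the same name and step list that were passed to `describe_process`.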
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811283414.4A CN109299180B (en) | 2018-10-31 | 2018-10-31 | ETL operating system of data warehouse |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811283414.4A CN109299180B (en) | 2018-10-31 | 2018-10-31 | ETL operating system of data warehouse |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109299180A | 2019-02-01 |
CN109299180B | 2021-08-27 |
Family
ID=65145695
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811283414.4A Active CN109299180B (en) | 2018-10-31 | 2018-10-31 | ETL operating system of data warehouse |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299180B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080306984A1 (en) * | 2007-06-08 | 2008-12-11 | Friedlander Robert R | System and method for semantic normalization of source for metadata integration with etl processing layer of complex data across multiple data sources particularly for clinical research and applicable to other domains |
CN102142039A (en) * | 2004-12-17 | 2011-08-03 | 亚马逊科技公司 | Apparatus and method for data warehousing |
CN105359141A (en) * | 2013-05-17 | 2016-02-24 | 甲骨文国际公司 | Supporting combination of flow based ETL and entity relationship based ETL |
US9922072B1 (en) * | 2010-12-31 | 2018-03-20 | United Services Automobile Association (Usaa) | Extract, transform, and load application complexity management framework |
Non-Patent Citations (1)
Title |
---|
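Yang et al.: "A Preliminary Study on Building a Hadoop-Based Distributed Data Warehouse for Large Commercial Banks", Computer Applications and Software (《计算机应用与软件》) * |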
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021099903A1 (en) * | 2019-11-18 | 2021-05-27 | International Business Machines Corporation | Multi-tenant extract transform load resource sharing |
GB2603098A (en) * | 2019-11-18 | 2022-07-27 | Ibm | Multi-tenant extract transform load resource sharing |
GB2603098B (en) * | 2019-11-18 | 2022-12-14 | Ibm | Multi-tenant extract transform load resource sharing |
CN111190972A (en) * | 2019-12-31 | 2020-05-22 | 武汉俊楚信息科技有限公司 | Experiment data management system |
CN111897827A (en) * | 2020-07-06 | 2020-11-06 | 苏宁金融科技(南京)有限公司 | Data updating method and system for data warehouse and electronic equipment |
US11841871B2 (en) | 2021-06-29 | 2023-12-12 | International Business Machines Corporation | Managing extract, transform and load systems |
CN113434497A (en) * | 2021-08-26 | 2021-09-24 | 中国电子信息产业集团有限公司 | Data element vault composed of data warehouse and data element warehouse |
CN114860349A (en) * | 2022-07-06 | 2022-08-05 | 深圳华锐分布式技术股份有限公司 | Data loading method, device, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN109299180B (en) | 2021-08-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109299180A (en) | A kind of data warehouse ETL operating system | |
CN100444121C (en) | Batch task scheduling engine and dispatching method | |
CN107864174B (en) | Rule-based Internet of things equipment linkage method | |
CN109933306A (en) | Mix Computational frame generation, data processing method, device and mixing Computational frame | |
CN108920261A (en) | A kind of two-stage self-adapting dispatching method suitable for large-scale parallel data processing task | |
CN101017546A (en) | Method and device for categorical data batch processing | |
CN106548324A (en) | A kind of IT system O&M service management system | |
CN102467532A (en) | Task processing method and task processing device | |
CN106126601A (en) | A kind of social security distributed preprocess method of big data and system | |
CN108037919A (en) | A kind of visualization big data workflow configuration method and system based on WEB | |
CN108664635B (en) | Method, device, equipment and storage medium for acquiring database statistical information | |
CN109753596B (en) | Information source management and configuration method and system for large-scale network data acquisition | |
CN112162980A (en) | Data quality control method and system, storage medium and electronic equipment | |
US20200334314A1 (en) | Emergency disposal support system | |
CN114218218A (en) | Data processing method, device and equipment based on data warehouse and storage medium | |
CN107544844A (en) | A kind of method and device of lifting Spark Operating ettectiveness | |
CN115757603A (en) | Visual data modeling system and method | |
CN114416849A (en) | Data processing method and device, electronic equipment and storage medium | |
CN107451211B (en) | A kind of download system based on RabbitMQ and MongoDB | |
CN116974994A (en) | High-efficiency file collaboration system based on clusters | |
CN111046059B (en) | Low-efficiency SQL statement analysis method and system based on distributed database cluster | |
CN112667873A (en) | Crawler system and method suitable for general data acquisition of most websites | |
CN114756629B (en) | Multi-source heterogeneous data interaction analysis engine and method based on SQL | |
CN105630997A (en) | Data parallel processing method, device and equipment | |
CN115599524A (en) | Data lake system based on cooperative scheduling processing of streaming data and batch data |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20240104. Address after: 430074 East Lake New Technology Development Zone, Wuhan City, Hubei Province, China (Free Trade Zone, Wuhan Area). Patentee after: Wuhan Comb Big Data Technology Co.,Ltd. Address before: 430000 No. 04, room 01, floor 1-2, zone 3, 3S geospatial information industry base, wudayuan Road, Donghu New Technology Development Zone, Wuhan City, Hubei Province. Patentee before: WUHAN OPTICS VALLEY DATA TECHNOLOGIES Co.,Ltd. |