CN102541942B - Data bulk transfer system and method thereof - Google Patents

Publication number
CN102541942B
Authority
CN
China
Prior art keywords
data
source
target
target data
task descriptor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010619384.7A
Other languages
Chinese (zh)
Other versions
CN102541942A (en)
Inventor
杨萌藜
吴金坛
周继恩
冯兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd
Priority to CN201010619384.7A
Publication of CN102541942A
Application granted
Publication of CN102541942B

Abstract

The invention provides a bulk data transfer system and a corresponding method. The system comprises a trigger module, a conversion module, and a writing module. The trigger module judges whether a trigger condition is satisfied; if it is, the module generates a task descriptor and sends it to the conversion module, and otherwise it continues to read indicators in the source database. The conversion module reads source data from the source database according to the task descriptor and processes it to generate target data; it also determines target data parameters according to the task descriptor and sends the target data and the target data parameters to the writing module. The writing module writes the target data into a target database according to a predetermined data format. The system and method enable loosely coupled bulk data transfer, run quickly, and support multithreaded concurrent processing.

Description

Batch data transfer system and method thereof
Technical field
The present invention relates to a data transfer system and method, and in particular to a batch data transfer system and method.
Background art
At present, the common approach to ETL (Extraction-Transform-Load) data extraction is driven by the target data requirement: the available data resources that satisfy the transformation rules are extracted from the source database, the relationship between the available data resources in the existing source system and the target data requirement is analyzed, an overall extraction process is formulated, and ETL code is written against that fixed extraction process to complete the extraction.
The advantage of this approach is that it is simple to implement and quick to develop. The drawback is that the code is special-purpose: it is developed for a single data requirement, and once that requirement changes, the special-purpose code can no longer supply the changed target data, so developers must rework and recompile the whole program. In phases where data extraction requirements change more often than the system itself, the development workload is very large; because the pattern of each data transfer is essentially the same, developers repeat the same work and efficiency is low.
Take the bulk transfer of common parameters at a financial institution as an example. Every day the common parameter library distributes roughly 60 common parameter tables to 16 subsystems, and in almost every case the process copies data from the data source to the target database under some business rule. These processes differ, but they overlap heavily; their core differences lie mainly in: 1. the trigger condition; 2. the business definition (business caliber); 3. post-extraction processing. Deploying and scheduling so many data interfaces is also difficult and very complex.
Summary of the invention
In view of this, the object of the invention is to provide a system and method for batch data transfer. The seven basic elements of data extraction (trigger condition, preprocessing, input, output, transformation rules, processing, and follow-up processing requests) are separated out of the subsystems into modules, so that data extraction and data transfer apply the idea of batch processing and are data-driven rather than process-driven. A loosely coupled, reusable process configuration mechanism is established to ease the management and control of task scheduling and to cope with frequent changes in business definitions and data structures. This effectively solves the existing problems of complex data interfaces, inconvenient deployment and scheduling, and frequently changing data transfer requirements.
To achieve the above object, the invention provides a batch data transfer system, comprising:
a trigger module for judging whether a trigger condition is met and, if so, generating a task descriptor and sending it to the conversion module, and otherwise continuing to read indicators in the source database;
a conversion module for reading source data from the source database according to the task descriptor and processing it to generate target data, determining target data parameters according to the task descriptor, and sending the target data and the target data parameters to the writing module;
a writing module for writing the target data into the target database according to a predetermined data format.
Preferably, in the system of the invention, the conversion module comprises:
a preprocessing module for detecting whether the system meets the conditions for executing a batch data transfer and, if so, carrying out data processing, and otherwise sending an alarm signal;
a source data interface module that determines source data parameters according to the task descriptor and sends them to the processor;
a target data interface module that determines target data parameters according to the task descriptor and sends them to the writing module;
a transformation rule module that determines the transformation rules according to the task descriptor and sends them to the processor;
a processor that reads source data from the source database according to the source data parameters, processes the source data according to the transformation rules to generate target data, and sends the target data to the writing module.
Preferably, in the system of the invention, the processor is further configured to respond to a data request from a subsystem and send the target data to that subsystem.
Preferably, in the system of the invention, the source data parameters comprise the source table owner, the source table name, and the source table set number.
Preferably, in the system of the invention, the target data parameters comprise the target table owner, the target table name, and the target table set number.
Preferably, in the system of the invention, the transformation rule is selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
The invention also provides a batch data transfer method, comprising the following steps:
A. judging whether a trigger condition is met and, if so, generating and sending a task descriptor, and otherwise continuing to read indicators in the source database;
B. reading source data from the source database according to the task descriptor and processing it to generate target data, and determining target data parameters according to the task descriptor;
C. writing the target data into the target database in a predetermined data format according to the target data parameters.
Preferably, in the method of the invention, step B further comprises the following steps:
detecting whether the system meets the initialization conditions for executing a batch data transfer and, if so, carrying out the subsequent steps, and otherwise sending an alarm signal;
determining source data parameters according to the task descriptor;
determining target data parameters according to the task descriptor;
determining the transformation rules according to the task descriptor;
reading source data from the source database according to the source data parameters;
processing the source data according to the transformation rules to generate target data.
Preferably, in the method of the invention, the method further comprises the following step:
responding to a data request sent by a subsystem and sending the target data to that subsystem.
Preferably, in the method of the invention, the source data parameters comprise the source table owner, the source table name, and the source table set number.
Preferably, in the method of the invention, the target data parameters comprise the target table owner, the target table name, and the target table set number.
Preferably, in the method of the invention, the transformation rule is selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
The technical effect of the invention is that the coding workload is effectively reduced, changes to individual subsystems have less impact on the batch transfer itself, and the performance of the subsystems is not significantly affected. In addition to achieving loosely coupled batch data transfer, the design is fast and supports multithreaded concurrent processing.
Brief description of the drawings
Fig. 1 is a schematic diagram of a batch data transfer system according to an embodiment of the invention;
Fig. 2 is a schematic diagram of a batch data transfer method according to an embodiment of the invention.
Embodiments
Preferred embodiments of the invention are described in detail below with reference to the accompanying drawings, in which identical reference numerals denote identical elements.
Fig. 1 is a schematic diagram of a batch data transfer system according to an embodiment of the invention. As shown in the figure, the system comprises a trigger module 1, a conversion module 2, and a writing module 3.
Trigger module 1 reads indicators in the source database and judges whether the trigger condition is met; if so, it generates a task descriptor and sends it to conversion module 2, and otherwise it continues to read the indicators in the source database. The indicator is, for example and without limitation, the data volume of the source system or a flag bit in the source system database. Correspondingly, the trigger condition is, for example and without limitation, whether the data volume of the source system has reached a predetermined value or whether the flag bit is valid. Those skilled in the art can define the indicators to be read and the trigger condition according to actual needs.
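One polling cycle of the trigger module can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Indicators` and `TaskDescriptor` structures, their fields, the table names, and the choice to require both a volume threshold and a valid flag are hypothetical, since the description leaves the indicators and the trigger condition to the implementer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Indicators:
    """Indicators read from the source database (hypothetical structure)."""
    row_count: int     # data volume of the source system
    flag_valid: bool   # state of a flag bit in the source system database


@dataclass(frozen=True)
class TaskDescriptor:
    """Hypothetical task descriptor; the description does not fix its fields."""
    source_table: str
    target_table: str
    rule: str


def trigger_met(ind: Indicators, threshold: int) -> bool:
    # Assumed condition: enough data has accumulated AND the flag is valid.
    return ind.row_count >= threshold and ind.flag_valid


def poll_once(read_indicators: Callable[[], Indicators],
              threshold: int) -> Optional[TaskDescriptor]:
    """One polling cycle: return a descriptor if triggered, else None."""
    if trigger_met(read_indicators(), threshold):
        return TaskDescriptor("SRC_PARAM", "TGT_PARAM", "normalize")
    return None  # not triggered; the caller keeps reading the indicators
```

A scheduler would call `poll_once` repeatedly and hand any returned descriptor to the conversion step; returning `None` corresponds to "continue to read the indicators".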
Conversion module 2 reads source data from source database 4 according to the task descriptor and processes it to generate target data; it also determines target data parameters according to the task descriptor and sends the target data and the target data parameters to writing module 3.
Further, conversion module 2 comprises a preprocessing module 20, a source data interface module 21, a transformation rule module 22, a target data interface module 23, and a processor 24.
Preprocessing module 20, on receiving the task descriptor, detects whether the system meets the conditions for executing a batch data transfer; if so, data processing proceeds, and otherwise an alarm signal is sent. Preprocessing module 20 performs this detection, for example, through the initialization settings of the extraction task, including but not limited to a self-state check and monitoring and cleanup of the running environment. Those skilled in the art can configure this according to actual needs.
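The preprocessing check can be sketched as a set of named predicates that must all pass before the transfer proceeds. The check names and the alarm callback below are hypothetical; the description only says that the checks may include a self-state check and monitoring or cleanup of the running environment.

```python
from typing import Callable, Dict, List


def failed_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every named check and return the names of those that failed."""
    return [name for name, check in checks.items() if not check()]


def preprocess(checks: Dict[str, Callable[[], bool]],
               alarm: Callable[[str], None]) -> bool:
    """Return True if the transfer may proceed; otherwise send an alarm."""
    failures = failed_checks(checks)
    if failures:
        alarm("batch transfer blocked: " + ", ".join(failures))
        return False
    return True
```

Passing the alarm channel in as a callback keeps the check logic independent of how alarms are actually delivered (log, message queue, pager).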
Source data interface module 21 receives the task descriptor from trigger module 1, parses it to determine the source data parameters, and sends the source data parameters to the processor. The source data parameters comprise the source table owner, the source table name, and the source table set number.
Target data interface module 23 receives the task descriptor from trigger module 1, parses it to determine the target data parameters, and sends the target data parameters to the writing module. The target data parameters comprise the target table owner, the target table name, and the target table set number.
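As a rough sketch of what the two interface modules do, the following Python parses one descriptor into source and target parameters. The dictionary layout of the descriptor and the key names `owner`, `name`, and `set_no` are assumptions made for illustration; the patent does not specify a descriptor format.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class TableParams:
    owner: str    # table owner
    name: str     # table name
    set_no: int   # table set number


def _params(section: Mapping[str, Any]) -> TableParams:
    # Coerce the raw descriptor values into a typed parameter record.
    return TableParams(str(section["owner"]), str(section["name"]),
                       int(section["set_no"]))


def parse_descriptor(
    descriptor: Mapping[str, Mapping[str, Any]]
) -> Tuple[TableParams, TableParams]:
    """Split one descriptor into (source parameters, target parameters)."""
    return _params(descriptor["source"]), _params(descriptor["target"])
```

In this sketch the source parameters would go to the processor and the target parameters to the writing module, mirroring the routing described above.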
Transformation rule module 22 receives the task descriptor from trigger module 1, parses it to determine the transformation rules, and sends them to processor 24. Data transformation rules generally follow this basic form:
Target data = Transformation(source data input field 1, source data input field 2, ...). The transformation rule is, for example and without limitation, data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, or function operations. The meaning of each transformation rule is as follows:
Data normalization (Expression): data definitions that have the same essential meaning in the source system but are described differently are converted to a general, unified form that downstream subsystems can recognize.
Filling in preset default values: when a definition does not exist in the source system, new data is produced according to a convention, for example by generating a sequence (Sequence Generator).
Data aggregation (Aggregator): summarizes data. When the source system is based on transaction records, data aggregation can produce coarser-grained summary statistics.
Data grouping (Router): splits the data flow. Source data is read once and distributed to multiple subsystems, which relieves pressure on the source system and saves transmission cost.
Data association (Joiner): merges data.
Sorting (Sorter): sorts data in a specified order.
Function operations: special function operations, such as taking a numeric modulus, extracting a specific part of a time value, or extracting a substring of a character string.
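The rule types listed above can be illustrated with small pure functions over lists of row dictionaries. This is a sketch only: the function names, the row representation, and the field names are hypothetical, and a production system would more likely express these rules in an ETL engine or in SQL.

```python
import itertools
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def normalize(rows: List[Row], field: str, mapping: Dict[Any, Any]) -> List[Row]:
    """Data normalization: map differing source codes to one unified value."""
    return [{**r, field: mapping.get(r[field], r[field])} for r in rows]


def fill_sequence(rows: List[Row], field: str, start: int = 1) -> List[Row]:
    """Default values: generate a sequence number where the field is missing."""
    counter = itertools.count(start)
    return [r if field in r else {**r, field: next(counter)} for r in rows]


def aggregate(rows: List[Row], key: str, value: str) -> List[Row]:
    """Data aggregation: sum `value` per `key` (transaction rows to totals)."""
    totals: Dict[Any, Any] = defaultdict(int)
    for r in rows:
        totals[r[key]] += r[value]
    return [{key: k, value: v} for k, v in totals.items()]


def route(rows: List[Row], key: str) -> Dict[Any, List[Row]]:
    """Data grouping: read once, split rows into one list per destination."""
    groups: Dict[Any, List[Row]] = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return dict(groups)


def join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Data association: inner join of two row sets on a shared key."""
    index = {r[key]: r for r in right}
    return [{**lrow, **index[lrow[key]]} for lrow in left if lrow[key] in index]


def sort_rows(rows: List[Row], field: str) -> List[Row]:
    """Sorting: order rows by the given field."""
    return sorted(rows, key=lambda r: r[field])
```

Because every function takes rows and returns rows (or groups of rows), a task descriptor could name a rule and its arguments, and the processor could look the function up and apply it without any transfer-specific code.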
Processor 24 reads source data from source database 4 according to the source data parameters, processes the source data according to the transformation rules to generate target data, and sends the target data to writing module 3.
Writing module 3 writes the target data into target database 5 in the predetermined data format according to the target data parameters. Writing module 3 is, for example and without limitation, an ETL extraction and transformation engine.
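A minimal sketch of the writing step, using Python's built-in `sqlite3` module as a stand-in target database. The function name, the column list as the "predetermined format", and the identifier check are assumptions for illustration; SQLite has no table owners, so the owner parameter is omitted here.

```python
import sqlite3
from typing import Any, Dict, List, Sequence


def write_target(conn: sqlite3.Connection, table: str,
                 columns: Sequence[str], rows: List[Dict[str, Any]]) -> int:
    """Write rows into `table` using the given column order; return row count."""
    # Table and column names cannot be bound as SQL parameters, so accept
    # only plain identifiers before interpolating them into the statement.
    if not table.isidentifier() or not all(c.isidentifier() for c in columns):
        raise ValueError("unsafe table or column name")
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # commits on success, rolls back on error
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
        conn.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})",
            [tuple(r[c] for c in columns) for r in rows],
        )
    return len(rows)
```

Writing the whole batch inside one transaction means a failed transfer leaves the target table unchanged, which suits the batch-oriented design described here.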
Preferably, in the invention, processor 24 is further configured to respond to a data request from subsystem 6 and send the target data to subsystem 6.
Fig. 2 is a schematic diagram of a batch data transfer method according to an embodiment of the invention. The method comprises the following steps:
A. reading source system indicators and judging whether the trigger condition is met and, if so, generating and sending a task descriptor, and otherwise continuing to read the source system indicators;
B. reading source data from the source database according to the task descriptor and processing it to generate target data, and determining target data parameters according to the task descriptor;
C. writing the target data into the target database in the predetermined data format according to the target data parameters.
Step B further comprises the following steps:
detecting whether the system meets the initialization conditions for executing a batch data transfer and, if so, carrying out the subsequent steps, and otherwise sending an alarm signal;
determining source data parameters according to the task descriptor, wherein the source data parameters comprise the source table owner, the source table name, and the source table set number;
determining target data parameters according to the task descriptor, wherein the target data parameters comprise the target table owner, the target table name, and the target table set number;
determining the transformation rules according to the task descriptor, wherein the transformation rule is selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations;
reading source data from the source database according to the source data parameters;
processing the source data according to the transformation rules to generate target data.
Preferably, the batch data transfer method of the invention further comprises the step of responding to a data request sent by a subsystem and sending the target data to that subsystem.
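Steps A, B, and C can be strung together in a short end-to-end sketch, again using in-memory SQLite databases as stand-ins. Everything concrete here is assumed for illustration: the table names `src_param` and `tgt_param`, the row-count trigger, the descriptor layout, and the upper-casing rule are not taken from the patent.

```python
import sqlite3
from typing import List, Optional, Tuple


def step_a(src: sqlite3.Connection, threshold: int) -> Optional[dict]:
    """Step A: read an indicator (row count) and build a task descriptor."""
    (count,) = src.execute("SELECT COUNT(*) FROM src_param").fetchone()
    if count < threshold:
        return None  # trigger not met; keep reading the indicator
    # Table names are fixed constants here, so interpolating them is safe.
    return {"source": "src_param", "target": "tgt_param", "rule": str.upper}


def step_b(src: sqlite3.Connection, task: dict) -> List[Tuple[int, str]]:
    """Step B: read source rows and apply the descriptor's rule to `name`."""
    rows = src.execute(
        f"SELECT code, name FROM {task['source']} ORDER BY code").fetchall()
    return [(code, task["rule"](name)) for code, name in rows]


def step_c(tgt: sqlite3.Connection, task: dict,
           rows: List[Tuple[int, str]]) -> int:
    """Step C: write target rows in the predetermined (code, name) format."""
    with tgt:
        tgt.execute(f"CREATE TABLE IF NOT EXISTS {task['target']} "
                    "(code INTEGER, name TEXT)")
        tgt.executemany(
            f"INSERT INTO {task['target']} (code, name) VALUES (?, ?)", rows)
    return len(rows)


def run_once(src: sqlite3.Connection, tgt: sqlite3.Connection,
             threshold: int) -> int:
    """Run A, B, C once; return the number of rows written (0 if not triggered)."""
    task = step_a(src, threshold)
    if task is None:
        return 0
    return step_c(tgt, task, step_b(src, task))
```

The point of the sketch is that the three steps communicate only through the descriptor and the row data, so a different transfer needs a different descriptor rather than different code.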
In view of these teachings, those of ordinary skill in the art will readily conceive of other embodiments, combinations, and modifications of the invention. Therefore, when read in conjunction with the above description and the accompanying drawings, the invention is limited only by the claims.

Claims (10)

1. A batch data transfer system, characterized in that it comprises:
a trigger module for reading indicators in a source database and judging whether a trigger condition is met and, if so, generating a task descriptor and sending it to a conversion module, and otherwise continuing to read the indicators in the source database;
a conversion module for reading source data from the source database according to the task descriptor and processing it to generate target data, determining target data parameters according to the task descriptor, and sending the target data and the target data parameters to a writing module;
a writing module for writing the target data into a target database according to a predetermined data format; wherein the conversion module comprises:
a preprocessing module for detecting whether the system meets the conditions for executing a batch data transfer and, if so, carrying out data processing, and otherwise sending an alarm signal;
a source data interface module that determines source data parameters according to the task descriptor and sends the source data parameters to a processing module;
a target data interface module that determines target data parameters according to the task descriptor and sends the target data parameters to the writing module;
a transformation rule module that determines transformation rules according to the task descriptor and sends them to the processing module;
a processing module that reads source data from the source database according to the source data parameters, processes the source data according to the transformation rules to generate target data, and sends the target data to the writing module.
2. The system as claimed in claim 1, characterized in that the processing module is further configured to respond to a data request from a subsystem and send the target data to that subsystem.
3. The system as claimed in claim 1 or 2, characterized in that the source data parameters comprise the source table owner, the source table name, and the source table set number.
4. The system as claimed in claim 1 or 2, characterized in that the target data parameters comprise the target table owner, the target table name, and the target table set number.
5. The system as claimed in claim 1 or 2, characterized in that the transformation rule is selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
6. A batch data transfer method, characterized in that it comprises the following steps:
A. reading indicators in a source database and judging whether a trigger condition is met and, if so, generating and sending a task descriptor, and otherwise continuing to read the indicators in the source database;
B. reading source data from the source database according to the task descriptor and processing it to generate target data, and determining target data parameters according to the task descriptor;
C. writing the target data into a target database in a predetermined data format according to the target data parameters; wherein step B further comprises the following steps:
detecting whether the system meets the initialization conditions for executing a batch data transfer and, if so, carrying out the subsequent steps, and otherwise sending an alarm signal;
determining source data parameters according to the task descriptor;
determining target data parameters according to the task descriptor;
determining transformation rules according to the task descriptor;
reading source data from the source database according to the source data parameters;
processing the source data according to the transformation rules to generate target data.
7. The method as claimed in claim 6, characterized in that the method further comprises the following step:
responding to a data request sent by a subsystem and sending the target data to that subsystem.
8. The method as claimed in claim 6, characterized in that the source data parameters comprise the source table owner, the source table name, and the source table set number.
9. The method as claimed in claim 6, characterized in that the target data parameters comprise the target table owner, the target table name, and the target table set number.
10. The method as claimed in claim 6, characterized in that the transformation rule is selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
CN201010619384.7A 2010-12-31 2010-12-31 Data bulk transfer system and method thereof Active CN102541942B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010619384.7A CN102541942B (en) 2010-12-31 2010-12-31 Data bulk transfer system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010619384.7A CN102541942B (en) 2010-12-31 2010-12-31 Data bulk transfer system and method thereof

Publications (2)

Publication Number Publication Date
CN102541942A CN102541942A (en) 2012-07-04
CN102541942B true CN102541942B (en) 2014-09-17

Family

ID=46348857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010619384.7A Active CN102541942B (en) 2010-12-31 2010-12-31 Data bulk transfer system and method thereof

Country Status (1)

Country Link
CN (1) CN102541942B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819589B (en) * 2012-08-06 2015-02-04 北京久其软件股份有限公司 ETL (Extract Transform Load)-based data optimization method and equipment
CN103927314B (en) * 2013-01-16 2017-10-13 阿里巴巴集团控股有限公司 A kind of method and apparatus of batch data processing
CN104424326B (en) * 2013-09-09 2018-06-15 华为技术有限公司 A kind of data processing method and device
CN105095327A (en) * 2014-05-23 2015-11-25 深圳市珍爱网信息技术有限公司 Distributed ELT system and scheduling method
CN105843966A (en) * 2016-04-22 2016-08-10 中国银联股份有限公司 Data processing system and method
CN110019133B (en) * 2017-12-21 2021-07-13 北京京东尚科信息技术有限公司 Data online migration method and device
CN109165200B (en) * 2018-08-10 2022-04-01 北京奇虎科技有限公司 Data synchronization method and device, computing equipment and computer storage medium
CN112988860B (en) * 2019-12-18 2023-09-26 菜鸟智能物流控股有限公司 Data acceleration processing method and device and electronic equipment
CN115600560B (en) * 2022-09-28 2023-06-20 中电金信软件有限公司 Data conversion method, device and system, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1407465A (en) * 2001-08-13 2003-04-02 深圳市丛文软件技术有限公司 Data exchanging method and device between different databases with different structure
CN1477546A (en) * 2002-08-19 2004-02-25 万达信息股份有限公司 Method for duplicating data of identical data table between two data bases

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7376675B2 (en) * 2005-02-18 2008-05-20 International Business Machines Corporation Simulating multi-user activity while maintaining original linear request order for asynchronous transactional events

Also Published As

Publication number Publication date
CN102541942A (en) 2012-07-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant