CN102541942A - Data bulk transfer system and method thereof - Google Patents

Info

Publication number
CN102541942A
Authority
CN
China
Prior art keywords
data
target
source
target data
task descriptor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010106193847A
Other languages
Chinese (zh)
Other versions
CN102541942B (en)
Inventor
杨萌藜
吴金坛
周继恩
冯兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd
Priority to CN201010619384.7A
Publication of CN102541942A
Application granted
Publication of CN102541942B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a batch data transfer system and a method thereof. The system comprises a trigger module, a conversion module and a writing module. The trigger module judges whether a trigger condition is satisfied; if so, it generates a task descriptor and sends it to the conversion module, and otherwise it continues to read indicators in the source database. The conversion module reads source data from the source database according to the task descriptor and processes the source data to generate target data; it also determines target data parameters according to the task descriptor and sends the target data and the target data parameters to the writing module. The writing module writes the target data into a target database according to a predetermined data format. The system and method allow loosely coupled batch data transfer, and the system is also fast and supports concurrent multithreaded processing.

Description

Batch data transfer system and method thereof
Technical field
The present invention relates to a data transfer system and method, and in particular to a batch data transfer system and method thereof.
Background art
The common approach to data extraction in present ETL (Extraction-Transform-Load) work is driven by the target data requirement: the available data resources that meet the transformation rules are extracted from the source database, the relationship between the available data resources in the existing source system and the target data requirement is analyzed, and the overall extraction process is formulated. ETL code is then written for this fixed extraction process to complete the extraction.
The advantage of this approach is that it is simple to implement and quick to develop. The drawback is that the code is special-purpose: it is developed for a single data requirement, and once that requirement changes, the special-purpose code can no longer supply the changed target data, so the developer has to recompile the whole program. In phases where data extraction and requirements change frequently, the development workload is very large, and because every data transfer follows essentially the same pattern, developers repeat the same work and efficiency is low.
Take the batch transfer of common parameters at a certain financial institution as an example. Every day the common parameter database issues some 60 or more common parameter tables to 16 subsystems, and essentially all of these are similar processes that copy a data source to a target database under some business rule. The processes differ, but they overlap in many places, and their core differences before and after the transfer lie mainly in: 1. the trigger condition; 2. the business criteria; 3. the post-extraction processing. As a result, deploying and scheduling so many data interfaces is itself a difficult and very complex problem.
Summary of the invention
In view of this, the object of the invention is to provide a system and method for batch data transfer that modularizes the seven basic elements of data extraction, namely the transfer trigger condition, pre-processing, input, output, transformation rules, processing, and the handling of follow-up requests from subsystems. The aim is to apply batch-processing ideas to data extraction and data transfer so that the work is data-driven rather than process-driven, and to establish a loosely coupled, reusable process configuration mechanism that makes task scheduling easier to manage and control and that copes with frequently changing business criteria and data structures. This effectively addresses the complexity of existing data interfaces, the inconvenience of deployment and scheduling, and the frequent changes in data transfer requirements.
To achieve the above object, the present invention provides a batch data transfer system, comprising:
a trigger module, configured to judge whether a trigger condition is satisfied and, if so, to generate a task descriptor and send it to the conversion module, and otherwise to continue reading indicators in the source database;
a conversion module, configured to read source data from the source database according to the task descriptor and process it to generate target data, and to determine target data parameters according to the task descriptor and send the target data and the target data parameters to the writing module;
a writing module, configured to write the target data into the target database according to a predetermined data format.
Preferably, in the system of the present invention, the conversion module comprises:
a pre-processing module, configured to detect whether the system satisfies the conditions for carrying out a batch data transfer and, if so, to proceed with data processing, and otherwise to send an alarm signal;
a source data interface module, configured to determine source data parameters according to the task descriptor and send the source data parameters to the processor;
a target data interface module, configured to determine the target data parameters according to the task descriptor and send the target data parameters to the writing module;
a transformation rule module, configured to determine transformation rules according to the task descriptor and send them to the processor;
a processor, configured to read source data from the source database according to the source data parameters, process the source data according to the transformation rules to generate target data, and send the target data to the writing module.
Preferably, in the system of the present invention, the processor is further configured to respond to a data request from a subsystem and send the target data to that subsystem.
Preferably, in the system of the present invention, the source data parameters comprise the source data table owner, the source data table name, and the number of source data table sets.
Preferably, in the system of the present invention, the target data parameters comprise the target data table owner, the target data table name, and the number of target data table sets.
Preferably, in the system of the present invention, the transformation rules are selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
The present invention also provides a batch data transfer method, comprising the following steps:
A. judging whether a trigger condition is satisfied; if so, generating and sending a task descriptor, and otherwise continuing to read indicators in the source database;
B. reading source data from the source database according to the task descriptor and processing it to generate target data, and determining target data parameters according to the task descriptor;
C. writing the target data into the target database in a predetermined data format according to the target data parameters.
Preferably, in the method of the present invention, step B further comprises the following steps:
detecting whether the system satisfies the initialization conditions for carrying out a batch data transfer; if so, carrying out the subsequent steps, and otherwise sending an alarm signal;
determining source data parameters according to the task descriptor;
determining target data parameters according to the task descriptor;
determining transformation rules according to the task descriptor;
reading source data from the source database according to the source data parameters;
processing the source data according to the transformation rules to generate target data.
Preferably, in the method of the present invention, the method further comprises the following step:
responding to a data request sent by a subsystem and sending the target data to that subsystem.
Preferably, in the method of the present invention, the source data parameters comprise the source data table owner, the source data table name, and the number of source data table sets.
Preferably, in the method of the present invention, the target data parameters comprise the target data table owner, the target data table name, and the number of target data table sets.
Preferably, in the method of the present invention, the transformation rules are selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
The technical effect of the present invention is that the coding workload is effectively reduced and changes to any one subsystem have less impact on the batch transfer itself, while the performance of the subsystems themselves is not greatly affected. In addition to achieving loosely coupled batch data transfer, the design is fast and supports concurrent multithreaded processing.
Brief description of the drawings
Fig. 1 is a schematic diagram of a batch data transfer system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a batch data transfer method according to an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which identical reference numerals denote identical elements.
Fig. 1 is a schematic diagram of a batch data transfer system according to an embodiment of the present invention. As shown in the figure, the system comprises trigger module 1, conversion module 2 and writing module 3.
Trigger module 1 reads indicators in the source database and judges whether a trigger condition is satisfied. If so, it generates a task descriptor and sends it to conversion module 2; otherwise it continues to read the indicators in the source database. The indicator is, for example and without limitation, the data volume of the source system or certain flag bits in the source system database. Correspondingly, the trigger condition is, for example and without limitation, whether the data volume of the source system has reached a predetermined value or whether the flag bit is valid. Those skilled in the art can define the indicators to be read and the trigger condition according to actual needs.
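The behaviour of trigger module 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `TaskDescriptor` fields, the indicator names `row_count` and `ready_flag`, and the threshold are all hypothetical, since the text does not define the descriptor layout.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class TaskDescriptor:
    """Hypothetical descriptor; the source text does not define its fields."""
    task_id: int
    reason: str


def make_trigger(row_threshold: int) -> Callable[[dict], Optional[str]]:
    """Build a trigger condition over the two example indicators in the text."""
    def condition(indicators: dict) -> Optional[str]:
        if indicators.get("ready_flag"):                      # a flag bit is valid
            return "flag"
        if indicators.get("row_count", 0) >= row_threshold:   # data volume reached
            return "volume"
        return None
    return condition


def trigger_module(readings: Iterable[dict],
                   condition: Callable[[dict], Optional[str]]) -> Iterator[TaskDescriptor]:
    """Read indicators one by one; emit a descriptor whenever the condition holds."""
    next_id = 1
    for indicators in readings:
        reason = condition(indicators)
        if reason is None:
            continue  # condition not met: keep reading indicators
        yield TaskDescriptor(task_id=next_id, reason=reason)
        next_id += 1


# Three successive indicator readings (assumed sample data).
readings = [
    {"row_count": 10, "ready_flag": False},
    {"row_count": 500, "ready_flag": False},
    {"row_count": 20, "ready_flag": True},
]
tasks = list(trigger_module(readings, make_trigger(row_threshold=100)))
```

Keeping the condition as a separate callable mirrors the remark that the indicators and the trigger condition can be defined according to actual needs.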
Conversion module 2 reads source data from source database 4 according to the task descriptor and processes it to generate target data, determines the target data parameters according to the task descriptor, and sends the target data and the target data parameters to writing module 3.
Further, conversion module 2 comprises pre-processing module 20, source data interface module 21, transformation rule module 22, target data interface module 23 and processor 24.
On receiving a task descriptor, pre-processing module 20 detects whether the system satisfies the conditions for carrying out a batch data transfer. If so, data processing proceeds; otherwise an alarm signal is sent. Pre-processing module 20 detects whether these conditions are satisfied, for example, through the initialization settings of the extraction task, which include but are not limited to a check of its own state and monitoring and cleanup of the running environment. Those skilled in the art can configure this according to actual needs.
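One way to picture the pre-processing step is as a set of named checks that must all pass before processing continues. The sketch below is only an assumption about how such checks could be organised; the check names and the convention that a raised exception counts as a failure are illustrative and do not come from the text.

```python
def preprocess(checks: dict) -> list:
    """Run each named check and return the names of the checks that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a check that raises is treated as a failed check
        if not ok:
            failed.append(name)
    return failed


# Hypothetical checks: own state, environment cleanup, source reachability.
checks = {
    "self_state": lambda: True,
    "workdir_clean": lambda: False,
    "source_reachable": lambda: 1 / 0,  # simulates a check that raises
}
alarms = preprocess(checks)  # a non-empty list would lead to an alarm signal
```

An empty result means data processing may proceed; any entry in the list corresponds to the alarm branch described above.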
Source data interface module 21 receives the task descriptor from trigger module 1, parses it to determine the source data parameters, and sends the source data parameters to processor 24. The source data parameters comprise the source data table owner, the source data table name, and the number of source data table sets.
Target data interface module 23 receives the task descriptor from trigger module 1, parses it to determine the target data parameters, and sends the target data parameters to writing module 3. The target data parameters comprise the target data table owner, the target data table name, and the number of target data table sets.
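The two interface modules both resolve table parameters from the same task descriptor. A minimal sketch, assuming the descriptor is a dictionary with `source` and `target` sections (a hypothetical layout), and reading the third parameter as a count of table sets, which is an interpretation of the translated text:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableParams:
    owner: str      # table owner
    name: str       # table name
    set_count: int  # number of table sets (interpretation of the source text)


def parse_params(descriptor: dict, side: str) -> TableParams:
    """Extract the 'source' or 'target' parameters from a descriptor dict."""
    section = descriptor[side]
    return TableParams(owner=section["owner"],
                       name=section["table"],
                       set_count=int(section.get("sets", 1)))


# Hypothetical descriptor; owners and table names are made up for illustration.
descriptor = {
    "source": {"owner": "PARAM", "table": "TBL_FEE", "sets": 2},
    "target": {"owner": "SUB01", "table": "TBL_FEE_COPY"},
    "rules": ["normalize", "sort"],
}
source_params = parse_params(descriptor, "source")
target_params = parse_params(descriptor, "target")
```

Because both sides share one parser, adding a new transfer only requires a new descriptor rather than new code, which is the loose coupling the description aims for.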
Transformation rule module 22 receives the task descriptor from trigger module 1, parses it to determine the transformation rules, and sends them to processor 24. Data transformation rules generally follow this basic form:
Target data = Transformation(source data input field 1, source data input field 2, ...). The transformation rules are, for example and without limitation, data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations. The meaning of each transformation rule is as follows:
Data normalization (Expression): data definitions that have essentially the same meaning in the source system but are described differently are converted into a common, uniformly formatted form that downstream subsystems can recognize.
Filling in preset default values: when a definition does not exist in the source system, new data is produced according to a prior agreement, for example by generating a sequence (Sequence Generator).
Data aggregation (Aggregator): summarizes data. When the source system is based on business transaction records, data aggregation can produce coarser-grained summary statistics.
Data grouping (Router): splits the data flow. Source data is read once and distributed to several subsystems, which relieves pressure on the source system and saves transmission cost.
Data association (Joiner): merges data.
Sorting (Sorter): sorts data in a specified order.
Function operations: certain special function operations, for example taking the modulus of a numeric value, taking a specific part of a time value, or taking a substring of a character string.
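Several of the rules listed above can be illustrated on a small in-memory data set. The rows, field names and the status mapping below are invented for illustration, and plain Python stands in for whatever engine an actual deployment would use.

```python
from collections import defaultdict
from itertools import count

# Assumed sample source rows.
rows = [
    {"merchant": "m1", "status": "Y", "amount": 30},
    {"merchant": "m2", "status": "yes", "amount": 5},
    {"merchant": "m1", "status": "N", "amount": 12},
]

# Data normalization: map equivalent but differently written values to one form.
STATUS_MAP = {"Y": "ACTIVE", "yes": "ACTIVE", "N": "INACTIVE"}
normalized = [{**r, "status": STATUS_MAP[r["status"]]} for r in rows]

# Default value fill: generate a sequence number that the source does not have.
seq = count(1)
with_ids = [{**r, "id": next(seq)} for r in normalized]

# Aggregation: summarize per-transaction rows into coarser totals.
totals = defaultdict(int)
for r in with_ids:
    totals[r["merchant"]] += r["amount"]

# Grouping (routing): read once, then split rows among several subsystems.
routed = defaultdict(list)
for r in with_ids:
    key = "subsystem_a" if r["status"] == "ACTIVE" else "subsystem_b"
    routed[key].append(r["id"])

# Sorting: order rows by a chosen key.
by_amount = sorted(with_ids, key=lambda r: r["amount"])

# Function operations: numeric modulus and taking part of a string.
buckets = [r["amount"] % 10 for r in with_ids]
suffixes = [r["merchant"][-1] for r in with_ids]
```

Each rule is a function of source fields only, matching the basic form Target data = Transformation(source fields), so rules can be chained in whatever order a task descriptor specifies.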
Processor 24 reads source data from source database 4 according to the source data parameters, processes the source data according to the transformation rules to generate target data, and sends the target data to writing module 3.
Writing module 3 writes the target data into target database 5 in a predetermined data format according to the target data parameters. Writing module 3 is, for example and without limitation, an ETL extraction and transformation engine.
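A writing step of this kind can be sketched with Python's built-in `sqlite3` module standing in for target database 5. The table name, column names and the `write_target` helper are hypothetical; the text does not specify the predetermined data format.

```python
import sqlite3


def write_target(conn: sqlite3.Connection, table: str, columns: list, rows: list) -> int:
    """Create the target table if needed and insert all rows in one transaction."""
    if not table.isidentifier() or not all(c.isidentifier() for c in columns):
        raise ValueError("table and column names must be plain identifiers")
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
        conn.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})",
            [tuple(r[c] for c in columns) for r in rows],
        )
    return len(rows)


target_db = sqlite3.connect(":memory:")  # stand-in for target database 5
written = write_target(
    target_db,
    "fee_params",
    ["merchant", "rate"],
    [{"merchant": "m1", "rate": 0.6}, {"merchant": "m2", "rate": 0.38}],
)
stored = target_db.execute(
    "SELECT merchant, rate FROM fee_params ORDER BY merchant"
).fetchall()
```

The table name and column list play the role of the target data parameters: the writer itself contains nothing specific to any one transfer.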
Preferably, in the present invention, processor 24 is further configured to respond to a data request from subsystem 6 and send the target data to subsystem 6.
Fig. 2 is a schematic diagram of a batch data transfer method according to an embodiment of the present invention. The method comprises the following steps:
A. reading source system indicators and judging whether a trigger condition is satisfied; if so, generating and sending a task descriptor, and otherwise continuing to read the source system indicators;
B. reading source data from the source database according to the task descriptor and processing it to generate target data, and determining target data parameters according to the task descriptor;
C. writing the target data into the target database in a predetermined data format according to the target data parameters.
Step B further comprises the following steps:
detecting whether the system satisfies the initialization conditions for carrying out a batch data transfer; if so, carrying out the subsequent steps, and otherwise sending an alarm signal;
determining source data parameters according to the task descriptor, wherein the source data parameters comprise the source data table owner, the source data table name, and the number of source data table sets;
determining target data parameters according to the task descriptor, wherein the target data parameters comprise the target data table owner, the target data table name, and the number of target data table sets;
determining transformation rules according to the task descriptor, wherein the transformation rules are selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations;
reading source data from the source database according to the source data parameters;
processing the source data according to the transformation rules to generate target data.
Preferably, the batch data transfer method of the present invention further comprises the step of responding to a data request sent by a subsystem and sending the target data to that subsystem.
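Steps A, B and C can be strung together in a short end-to-end sketch. Two in-memory SQLite databases stand in for the source and target databases; the table names, the row-count trigger and the single `sort_by_code` rule are assumptions made for the example only.

```python
import sqlite3

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE params (code TEXT, value INTEGER)")
source.executemany("INSERT INTO params VALUES (?, ?)", [("b", 2), ("a", 1), ("c", 3)])
source.commit()


def step_a(conn, threshold):
    """Step A: read an indicator (row count) and emit a descriptor if triggered."""
    (n,) = conn.execute("SELECT COUNT(*) FROM params").fetchone()
    if n < threshold:
        return None  # not triggered; the caller would keep polling
    return {"source_table": "params", "target_table": "params_copy", "rule": "sort_by_code"}


def step_b(conn, descriptor):
    """Step B: read source rows, apply the rule, and resolve the target table."""
    rows = conn.execute(f"SELECT code, value FROM {descriptor['source_table']}").fetchall()
    if descriptor["rule"] == "sort_by_code":
        rows = sorted(rows)
    return rows, descriptor["target_table"]


def step_c(conn, table, rows):
    """Step C: write the target rows into the target database."""
    with conn:
        conn.execute(f"CREATE TABLE {table} (code TEXT, value INTEGER)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)


descriptor = step_a(source, threshold=3)
if descriptor is not None:
    target_rows, target_table = step_b(source, descriptor)
    step_c(target, target_table, target_rows)
copied = target.execute("SELECT code, value FROM params_copy ORDER BY rowid").fetchall()
```

Only the descriptor carries transfer-specific details, so a different transfer would reuse the same three functions with a different descriptor.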
In view of these teachings, those of ordinary skill in the art will readily conceive of other embodiments, combinations and modifications of the invention. Therefore, when read in conjunction with the above description and the accompanying drawings, the invention is limited only by the claims.

Claims (12)

1. A batch data transfer system, characterized by comprising:
a trigger module, configured to judge whether a trigger condition is satisfied and, if so, to generate a task descriptor and send it to the conversion module, and otherwise to continue reading indicators in the source database;
a conversion module, configured to read source data from the source database according to the task descriptor and process it to generate target data, and to determine target data parameters according to the task descriptor and send the target data and the target data parameters to the writing module;
a writing module, configured to write the target data into the target database according to a predetermined data format.
2. The system of claim 1, characterized in that the conversion module comprises:
a pre-processing module, configured to detect whether the system satisfies the conditions for carrying out a batch data transfer and, if so, to proceed with data processing, and otherwise to send an alarm signal;
a source data interface module, configured to determine source data parameters according to the task descriptor and send the source data parameters to the processor;
a target data interface module, configured to determine the target data parameters according to the task descriptor and send the target data parameters to the writing module;
a transformation rule module, configured to determine transformation rules according to the task descriptor and send them to the processor;
a processor, configured to read source data from the source database according to the source data parameters, process the source data according to the transformation rules to generate target data, and send the target data to the writing module.
3. The system of claim 2, characterized in that the processor is further configured to respond to a data request from a subsystem and send the target data to that subsystem.
4. The system of claim 2 or 3, characterized in that the source data parameters comprise the source data table owner, the source data table name, and the number of source data table sets.
5. The system of claim 2 or 3, characterized in that the target data parameters comprise the target data table owner, the target data table name, and the number of target data table sets.
6. The system of claim 2 or 3, characterized in that the transformation rules are selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
7. A batch data transfer method, characterized by comprising the following steps:
A. judging whether a trigger condition is satisfied; if so, generating and sending a task descriptor, and otherwise continuing to read indicators in the source database;
B. reading source data from the source database according to the task descriptor and processing it to generate target data, and determining target data parameters according to the task descriptor;
C. writing the target data into the target database in a predetermined data format according to the target data parameters.
8. The method of claim 7, characterized in that step B further comprises the following steps:
detecting whether the system satisfies the initialization conditions for carrying out a batch data transfer; if so, carrying out the subsequent steps, and otherwise sending an alarm signal;
determining source data parameters according to the task descriptor;
determining target data parameters according to the task descriptor;
determining transformation rules according to the task descriptor;
reading source data from the source database according to the source data parameters;
processing the source data according to the transformation rules to generate target data.
9. The method of claim 7 or 8, characterized in that the method further comprises the following step:
responding to a data request sent by a subsystem and sending the target data to that subsystem.
10. The method of claim 8, characterized in that the source data parameters comprise the source data table owner, the source data table name, and the number of source data table sets.
11. The method of claim 8, characterized in that the target data parameters comprise the target data table owner, the target data table name, and the number of target data table sets.
12. The method of claim 8, characterized in that the transformation rules are selected from the group consisting of: data normalization, filling in preset default values, data aggregation, data grouping, data association, data merging, sorting, and function operations.
CN201010619384.7A 2010-12-31 2010-12-31 Data bulk transfer system and method thereof Active CN102541942B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010619384.7A CN102541942B (en) 2010-12-31 2010-12-31 Data bulk transfer system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010619384.7A CN102541942B (en) 2010-12-31 2010-12-31 Data bulk transfer system and method thereof

Publications (2)

Publication Number Publication Date
CN102541942A (en) 2012-07-04
CN102541942B (en) 2014-09-17

Family

ID=46348857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010619384.7A Active CN102541942B (en) 2010-12-31 2010-12-31 Data bulk transfer system and method thereof

Country Status (1)

Country Link
CN (1) CN102541942B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819589A (en) * 2012-08-06 2012-12-12 北京久其软件股份有限公司 ETL (Extract Transform Load)-based data optimization method and equipment
CN103927314A (en) * 2013-01-16 2014-07-16 阿里巴巴集团控股有限公司 Data batch processing method and device
CN104424326A (en) * 2013-09-09 2015-03-18 华为技术有限公司 Data processing method and device
CN105095327A (en) * 2014-05-23 2015-11-25 深圳市珍爱网信息技术有限公司 Distributed ELT system and scheduling method
WO2017181872A1 (en) * 2016-04-22 2017-10-26 中国银联股份有限公司 Data processing system and method
CN109165200A (en) * 2018-08-10 2019-01-08 北京奇虎科技有限公司 Data synchronization method and device, computing equipment and computer storage medium
CN110019133A (en) * 2017-12-21 2019-07-16 北京京东尚科信息技术有限公司 Online data moving method and device
CN112988860A (en) * 2019-12-18 2021-06-18 菜鸟智能物流控股有限公司 Data acceleration processing method and device and electronic equipment
CN115600560A (en) * 2022-09-28 2023-01-13 中电金信软件有限公司(Cn) Data conversion method, device, system, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1407465A (en) * 2001-08-13 2003-04-02 深圳市丛文软件技术有限公司 Data exchanging method and device between different databases with different structure
CN1477546A (en) * 2002-08-19 2004-02-25 万达信息股份有限公司 Method for duplicating data of identical data table between two data bases
US20080215586A1 (en) * 2005-02-18 2008-09-04 International Business Machines Corporation Simulating Multi-User Activity While Maintaining Original Linear Request Order for Asynchronous Transactional Events

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819589B (en) * 2012-08-06 2015-02-04 北京久其软件股份有限公司 ETL (Extract Transform Load)-based data optimization method and equipment
CN102819589A (en) * 2012-08-06 2012-12-12 北京久其软件股份有限公司 ETL (Extract Transform Load)-based data optimization method and equipment
CN103927314A (en) * 2013-01-16 2014-07-16 阿里巴巴集团控股有限公司 Data batch processing method and device
CN103927314B (en) * 2013-01-16 2017-10-13 阿里巴巴集团控股有限公司 A kind of method and apparatus of batch data processing
CN104424326B (en) * 2013-09-09 2018-06-15 华为技术有限公司 A kind of data processing method and device
CN104424326A (en) * 2013-09-09 2015-03-18 华为技术有限公司 Data processing method and device
CN105095327A (en) * 2014-05-23 2015-11-25 深圳市珍爱网信息技术有限公司 Distributed ELT system and scheduling method
WO2017181872A1 (en) * 2016-04-22 2017-10-26 中国银联股份有限公司 Data processing system and method
CN110019133A (en) * 2017-12-21 2019-07-16 北京京东尚科信息技术有限公司 Online data moving method and device
CN109165200A (en) * 2018-08-10 2019-01-08 北京奇虎科技有限公司 Method of data synchronization, calculates equipment and computer storage medium at device
CN109165200B (en) * 2018-08-10 2022-04-01 北京奇虎科技有限公司 Data synchronization method and device, computing equipment and computer storage medium
CN112988860A (en) * 2019-12-18 2021-06-18 菜鸟智能物流控股有限公司 Data acceleration processing method and device and electronic equipment
CN115600560A (en) * 2022-09-28 2023-01-13 中电金信软件有限公司(Cn) Data conversion method, device, system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN102541942B (en) 2014-09-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant