CN105930389A - Method and system for transferring data - Google Patents

Method and system for transferring data

Info

Publication number
CN105930389A
Authority
CN
China
Prior art keywords
data
task
configuration information
source
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610232604.8A
Other languages
Chinese (zh)
Inventor
王新武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201610232604.8A
Publication of CN105930389A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses a method and system for transferring data. The method comprises the following steps: configuring data transferring information; generating corresponding collecting tasks according to scheduling task configuration information and the related data source configuration information and transferring task configuration information or batch transferring task configuration information; executing the collecting tasks and, according to the data source configuration information and the transferring task configuration information or batch transferring task configuration information of the collecting tasks, grabbing source data to be transferred from a source data source and generating multiple fragmentation tasks; and executing the multiple fragmentation tasks in parallel to transfer the data to be transferred from the source database to a target database. The system comprises a configuration module, a scheduling engine, a data collection module and an execution engine. By configuring a series of rules and parsing them, the method and system realize data transferring dynamically, which greatly increases developer efficiency and reduces the cost of developing and maintaining the application system.

Description

Method and system for transferring data
Technical field
The present invention relates to the field of information data processing, and in particular to a data transfer method and system for moving data from the database of one information system to other databases.
Background art
At present, for information systems that handle a large volume of business transactions, a common way to reduce the amount of live business data, relieve pressure on the database, and improve system stability and response speed is to transfer database data, that is, to move data from the system database into a designated database.
Because the processing logic for transferring differs from one data table to another, the conventional solution is to develop a timed transferring task based on the database tables to be transferred and the data association rules, and to drive its execution by timed scheduling. The conventional solution therefore has to develop one timed transferring task for every data table that needs to be transferred, and then relies on timed scheduling of that transfer policy to transfer the data.
In even a moderately complex information system, many data tables need to be transferred, so a timed task has to be developed and run periodically for every data table or every group of associated data tables, which makes development and maintenance costly.
Summary of the invention
The technical problem to be solved by the present invention is to address the deficiencies of the prior art by providing a data transfer method and system that transfers data dynamically. When a transferring task is added, no new program needs to be developed and no existing program function needs to be modified, which improves developer efficiency and lowers the cost of developing and maintaining the application system.
To solve the above technical problem, according to one aspect of the present invention, a data transfer method is provided, comprising the following steps:
configuring data transfer information, including scheduling task configuration information, data source configuration information, transferring task configuration information and batch transferring task configuration information;
generating a corresponding collecting task according to the scheduling task configuration information, the related data source configuration information, and the transferring task configuration information or batch transferring task configuration information;
executing the collecting task, capturing the source data to be transferred from the source data source according to the data source configuration information and the transferring task configuration information or batch transferring task configuration information of the collecting task, and generating multiple fragmentation tasks; and
executing the multiple fragmentation tasks in parallel, and transferring the data to be transferred from the source database to the target database according to the transferring task configuration information or batch transferring task configuration information.
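The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: plain dictionaries keyed by primary key stand in for the source and target tables, and the names make_fragments, run_fragment, transfer and screening_rule are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_fragments(primary_keys, fragment_size):
    """Split the captured primary keys into fragments of at most fragment_size."""
    return [primary_keys[i:i + fragment_size]
            for i in range(0, len(primary_keys), fragment_size)]


def run_fragment(source, target, keys):
    """Fragmentation task: move the rows for `keys` from source to target."""
    for key in keys:
        target[key] = source.pop(key)
    return len(keys)


def transfer(source, target, screening_rule, fragment_size=2, workers=4):
    """Collect matching primary keys, fragment them, and move rows in parallel."""
    # Collecting task: capture only the primary keys that satisfy the rule.
    keys = sorted(k for k, row in source.items() if screening_rule(row))
    fragments = make_fragments(keys, fragment_size)
    # Fragments hold disjoint key sets, so the workers never touch the same row.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda f: run_fragment(source, target, f), fragments))
```

Calling transfer(source, target, lambda row: row["status"] == "done") would move every row marked done and leave the rest in place; the status field is likewise an assumption made for the example.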
Preferably, the source data to be transferred that is captured from the source data source is primary key data; after the multiple fragmentation tasks are generated, the primary key data of each fragmentation task is written into a collection data table.
Preferably, the step of executing a fragmentation task and transferring the data from the source data source to the target data source according to the transferring task configuration information specifically comprises:
reading the data list of the fragmentation task from the collection data table to obtain multiple items of primary key data;
loading the transferring task configuration information to obtain the database information and the data table information corresponding to the fragmentation task;
obtaining, one by one, the source data corresponding to the primary key data from the source data source in the source database, and writing the source data into the target database; and
deleting the source data from the source data source.
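A minimal sketch of the sub-steps above, using Python's built-in sqlite3 module. The collection table layout (fragment_id, pk_value), the function name transfer_fragment and its parameters are assumptions for illustration; the patent does not specify a schema or a database product. Table and column names are taken from trusted configuration, so they are interpolated into the SQL text, while values are always bound as parameters.

```python
import sqlite3


def transfer_fragment(collect_db, source_db, target_db, fragment_id, table, pk="id"):
    """Transfer the rows of one fragmentation task, one primary key at a time."""
    # Read the primary key list of this fragment from the collection data table.
    keys = [r[0] for r in collect_db.execute(
        "SELECT pk_value FROM collection WHERE fragment_id = ?", (fragment_id,))]
    moved = 0
    for key in keys:
        row = source_db.execute(
            f"SELECT * FROM {table} WHERE {pk} = ?", (key,)).fetchone()
        if row is None:
            continue  # row no longer exists in the source; nothing to move
        placeholders = ", ".join("?" for _ in row)
        target_db.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)
        target_db.commit()
        # Delete from the source only after the target write has been committed.
        source_db.execute(f"DELETE FROM {table} WHERE {pk} = ?", (key,))
        source_db.commit()
        moved += 1
    return moved
```

Committing the target write before deleting from the source means an interruption can leave a row in both databases but never in neither; that ordering is a design choice of this sketch.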
Preferably, if the fragmentation task is a batch transferring task, then after the source data corresponding to each item of primary key data has been successfully transferred to the target database, the method further comprises: generating a corresponding sub-table task according to the association rules in the batch transferring task configuration information.
Preferably, the method of the present invention further comprises executing multiple sub-table tasks in parallel.
Preferably, before the source data is written into the target database, the method further comprises: according to the configuration information, converting the data types of the source data into those of the target data, mapping fields, and constructing the fields and values to be written into the target database.
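The conversion and mapping step just described could look roughly like the following. The field_map format (source field mapped to a target field and a type name), the CONVERTERS table and both function names are hypothetical; the patent only states that types are converted, fields are mapped, and the write fields and values are built from configuration.

```python
# Hypothetical type names that a configuration might use.
CONVERTERS = {"int": int, "float": float, "str": str}


def build_target_row(source_row, field_map):
    """Return {target_field: converted_value} for one source row.

    field_map maps each source field to (target field, target type name).
    """
    target_row = {}
    for src_field, (dst_field, type_name) in field_map.items():
        value = source_row[src_field]
        # NULLs pass through unchanged; everything else is converted.
        target_row[dst_field] = None if value is None else CONVERTERS[type_name](value)
    return target_row


def build_insert(table, target_row):
    """Build a parameterized INSERT statement and its value list."""
    fields = list(target_row)
    placeholders = ", ".join("?" for _ in fields)
    sql = f"INSERT INTO {table} ({', '.join(fields)}) VALUES ({placeholders})"
    return sql, [target_row[f] for f in fields]
```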
Preferably, the step of capturing the source data to be transferred from the source data source according to the data source configuration information and the transferring task configuration information or batch transferring task configuration information specifically comprises:
creating a database access connection according to the database information in the data source configuration information, and connecting to the database; and
obtaining the source data source and the data screening rule according to the transferring task configuration information or batch transferring task configuration information, and capturing from the source data source the source data to be transferred that satisfies the data screening rule.
Preferably, when the scheduling task configuration information is configured, an association is established between the scheduling task configuration information and the corresponding transferring task configuration information or batch transferring task configuration information.
To solve the above technical problem, according to another aspect of the present invention, a data transfer system is provided, comprising:
a configuration module, configured to configure data transfer information, including data source configuration information, scheduling task configuration information, transferring task configuration information and batch transferring task configuration information;
a scheduling engine, connected to the configuration module and configured to control the collection and transfer of data by sending trigger information according to the corresponding data transfer information;
a data collection module, connected to the configuration module and the scheduling engine, and configured to generate a corresponding collecting task according to the scheduling task configuration information, execute the collecting task according to the trigger information of the scheduling engine and the data source configuration information, capture the primary key data of the data to be transferred from the source data source, generate multiple fragmentation tasks, and register the multiple fragmentation tasks with the scheduling engine; and
an execution engine, connected to the configuration module and the scheduling engine, and configured to execute the corresponding fragmentation task according to the trigger information of the scheduling engine and, according to the corresponding transferring task configuration information or batch transferring task configuration information, either transfer the source data to be transferred from the source database to the target database or generate sub-table tasks and register them with the scheduling engine.
Preferably, the system further comprises a data interface, connected to the data collection module and the execution engine respectively, and configured to handle data operations between the data collection module and the database and between the execution engine and the database.
Preferably, the configuration module comprises:
a data source configuration unit, configured to configure information about the source database and the target database;
a transferring task configuration unit, configured to configure information about the data tables that a transferring task needs to transfer;
a batch transferring task configuration unit, configured to configure the transferring task association information of the multiple interrelated data tables in a batch transferring task; and
a scheduling task configuration unit, configured to configure scheduling task policy information and to associate the scheduling task with a transferring task or batch transferring task.
Preferably, the scheduling engine comprises:
a task registration unit, connected to the data collection module and the execution engine respectively, and configured to receive the collecting tasks and fragmentation tasks generated by the data collection module and the sub-table tasks generated by the execution engine; and
a task trigger unit, connected to the task registration unit, the configuration module, the data collection module and the execution engine respectively, and configured to send trigger information to the data collection module and/or the execution engine according to the corresponding data transfer information and the fragmentation tasks and sub-table tasks received by the task registration unit.
Preferably, the data collection module comprises:
a collecting task generation unit, connected to the configuration module and configured to generate a collecting task according to the scheduling task configuration information, the related data source configuration information, and the transferring task configuration information or batch transferring task configuration information;
a data capture unit, connected to the scheduling engine and configured to execute the corresponding collecting task according to the trigger information of the scheduling engine and capture the primary key data of the data to be transferred from the source data source; and
a fragmentation task creation unit, connected to the data capture unit and configured to create multiple fragmentation tasks according to the captured primary key data of the data to be transferred and register the multiple fragmentation tasks with the scheduling engine.
Preferably, the execution engine comprises:
a task receiving unit, connected to the scheduling engine and configured to obtain the specific task information according to the trigger information sent by the scheduling engine;
a configuration information loading unit, connected to the task receiving unit and the configuration module respectively, and configured to obtain the corresponding data transfer information from the configuration module according to the task information received by the task receiving unit; and
a data migration processing unit, connected to the configuration information loading unit and the task receiving unit respectively, and configured to migrate the corresponding source data in the source database to the target database according to the corresponding data transfer information and the task information, and to delete the source data from the source database.
Preferably, the execution engine further comprises:
a sub-table task creation unit, connected to the data migration processing unit and configured to, after the data migration processing unit has successfully transferred the source data to the target database, create corresponding sub-table tasks according to the association rules in the batch transferring task configuration information and register the sub-table tasks with the scheduling engine;
wherein the task information obtained by the task receiving unit of the execution engine includes fragmentation task information and sub-table task information.
By configuring a series of rules, the present invention defines periodic data transfer rules and transfers data dynamically by parsing those rules. When a transferring task is added, the system does not need to be redeveloped or substantially modified; configuring rules is sufficient. This greatly improves developer efficiency and reduces the cost of developing and maintaining the application system.
Brief description of the drawings
The above and other objects, features and advantages of the present invention will become clearer from the following description of embodiments of the invention with reference to the accompanying drawings, in which:
Fig. 1 is an overall flow chart of the data transfer method of the present invention;
Fig. 2 is a structural block diagram of the data transfer system of the present invention;
Fig. 3 is a structural block diagram of the configuration module of the present invention;
Fig. 4 is a structural block diagram of the scheduling engine of the present invention;
Fig. 5 is a structural block diagram of the data collection module of the present invention;
Fig. 6 is a structural block diagram of the execution engine of the present invention when executing a single transferring task;
Fig. 7 is a structural block diagram of the execution engine of the present invention when executing a batch transferring task;
Fig. 8 is a flow diagram of a specific embodiment of data collection of the present invention; and
Fig. 9 is a flow diagram of a specific embodiment in which the execution engine of the present invention executes a transferring task.
Detailed description of the embodiments
The present invention is described below on the basis of embodiments, but it is not limited to these embodiments. In the following detailed description, some specific details are described in depth; a person skilled in the art can fully understand the invention without them. To avoid obscuring the essence of the invention, well-known methods, procedures and flows are not described in detail. In addition, the drawings are not necessarily drawn to scale.
The flow charts and block diagrams in the drawings illustrate possible system architectures, functions and operations of the systems, methods and devices of embodiments of the invention. A box in a flow chart or block diagram may represent a module, a program segment, or merely a piece of code, each of which consists of executable instructions that implement a specified logical function. It should also be noted that the executable instructions implementing the specified logical functions can be recombined to produce new modules and program segments. The boxes in the drawings and their order are therefore only intended to better illustrate the procedures and steps of the embodiments, and should not be taken as limiting the invention itself.
The present invention provides a data transfer method. As shown in Fig. 1, the method comprises:
Step S1: configure data transfer information, specifically data source configuration information, scheduling task configuration information, transferring task configuration information and batch transferring task configuration information. This information includes various rules.
Step S2: generate a corresponding collecting task TCi according to the scheduling task configuration information, the related data source configuration information, and the transferring task configuration information or batch transferring task configuration information, and register the collecting task TCi with the scheduling engine. At registration, the scheduling time policy in the scheduling task configuration information is sent to the scheduling engine, which schedules execution of the collecting task according to that policy.
Step S3: execute the collecting task. According to the data source configuration information and the transferring task configuration information or batch transferring task configuration information, capture the primary key data of the source data to be transferred from the source data source, split the captured data into fragments, generate fragmentation tasks TDi, and write the primary key data of each fragmentation task TDi into the collection data table.
Step S4: execute the multiple fragmentation tasks TDi in parallel, and transfer the data from the source database to the target database according to the transferring task configuration information or batch transferring task configuration information. For example, a SQL query statement is constructed from the collected primary key data and used to query the complete data in the source database. Here, the primary key is a field whose value uniquely identifies a data item; it may be a table's auto-increment id or another field, and the specific rule can be set in the transferring task configuration.
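As an illustration of the query construction mentioned in step S4, the sketch below builds one parameterized SELECT for all primary keys of a fragment. The function name build_select and the choice of a single IN query are assumptions; the patent does not say whether rows are fetched one by one or as a set.

```python
def build_select(table, pk_field, pk_values):
    """Build a parameterized query that fetches the full rows for the given keys.

    table and pk_field come from the transferring task configuration (trusted),
    so they are placed in the SQL text; the key values are bound as parameters.
    """
    if not pk_values:
        raise ValueError("a fragment must contain at least one primary key")
    placeholders = ", ".join("?" for _ in pk_values)
    sql = f"SELECT * FROM {table} WHERE {pk_field} IN ({placeholders})"
    return sql, list(pk_values)
```

Because pk_field is a parameter, the same function works whether the configured primary key is an auto-increment id or some other unique field.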
As shown in Fig. 2, the present invention provides a data transfer system comprising a configuration module 1, a scheduling engine 2, a data collection module 3 and an execution engine 4. The configuration module 1 is used to configure data transfer information, including data source configuration information, scheduling task configuration information, transferring task configuration information and batch transferring task configuration information.
The scheduling engine 2 is connected to the configuration module 1 and controls the collection and transfer of data by sending trigger information according to the corresponding data transfer information.
The data collection module 3 is connected to the configuration module 1 and the scheduling engine 2. It generates a corresponding collecting task TCi according to the scheduling task configuration information, executes the collecting task TCi according to the trigger information of the scheduling engine 2 and the data source configuration information, captures the primary key data of the data to be transferred from the source data source 6, generates multiple fragmentation tasks TDi, and registers the fragmentation tasks TDi with the scheduling engine 2.
The execution engine 4 is connected to the configuration module 1 and the scheduling engine 2. It executes the corresponding fragmentation task TDi according to the trigger information of the scheduling engine 2 and, according to the corresponding transferring task configuration information or batch transferring task configuration information, either transfers the source data to be transferred from the source database 6 to the target database 7 or generates sub-table tasks and registers them with the scheduling engine 2.
The data transfer system of the present invention further comprises a data interface 5, connected between the database and the aforementioned data collection module, or between the database and the execution engine. The data interface 5 is responsible for database connection, querying, writing, data encapsulation, conversion and similar work, thereby shielding the transfer processing logic from differences between databases.
Specifically, Fig. 3 is a structural block diagram of the configuration module 1 of the present invention. The configuration module comprises a data source configuration unit 11, a scheduling task configuration unit 12, a transferring task configuration unit 13 and a batch transferring task configuration unit 14.
The data source configuration unit 11 is used to configure information related to database connections, such as the IP address, user name, password and encoding format of a database. Multiple different databases can be configured, which supports transferring data between the databases of multiple business systems.
The scheduling task configuration unit 12 is used to configure scheduling task policy information and to associate a scheduling task with a transferring task or batch transferring task. The scheduling task policy information includes parameters such as the task execution time range, execution frequency and execution interval. Because the scheduling task policy is associated with a transferring task or batch transferring task, each time the scheduling task runs, the associated transferring task or batch transferring task is executed, so tasks run automatically.
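One way the policy parameters named above (execution time range and execution interval) might be evaluated is sketched here. SchedulePolicy, its field names and is_due are hypothetical, and the sketch assumes the time window does not cross midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class SchedulePolicy:
    window_start: time      # earliest time of day the task may run
    window_end: time        # latest time of day the task may run
    interval: timedelta     # minimum gap between two runs


def is_due(policy: SchedulePolicy, now: datetime, last_run: Optional[datetime]) -> bool:
    """Return True when the task may be triggered at `now`."""
    # Assumption: the window lies within a single day (start <= end).
    if not (policy.window_start <= now.time() <= policy.window_end):
        return False
    return last_run is None or now - last_run >= policy.interval
```

A scheduling engine could call is_due on each tick and trigger the associated transferring task or batch transferring task whenever it returns True.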
The transferring task configuration unit 13 is used to configure information about the data tables to be transferred, such as the source table name, target table name, source data source, target data source and data screening rule for the transfer.
The batch transferring task configuration unit 14 is used to configure the transferring task association information of multiple interrelated data tables. A database usually contains several data tables that are associated with one another, and these associated tables need to be transferred together. The batch transferring task configuration unit 14 therefore configures the field association rules between the tables to be transferred, the order in which the tables are transferred, and so on, and designates one table as the master table. When a batch transferring task is executed, the transfer starts from the master table, and the sub-table data is then transferred according to the association rules between the master table and the sub-tables.
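The four kinds of configuration information described for units 11 to 14 could be modelled with simple data classes, as in the sketch below. All class and field names are assumptions chosen to mirror the description; the patent does not define a concrete configuration format.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataSourceConfig:
    name: str
    host: str
    user: str
    password: str
    encoding: str = "utf-8"


@dataclass
class TransferTaskConfig:
    source_table: str
    target_table: str
    source: str           # name of the source DataSourceConfig
    target: str           # name of the target DataSourceConfig
    screening_rule: str   # e.g. a WHERE-clause fragment
    primary_key: str = "id"


@dataclass
class AssociationRule:
    sub_task: TransferTaskConfig
    sub_field: str        # field in the sub-table ...
    master_field: str     # ... that equals this field in the master table


@dataclass
class BatchTransferTaskConfig:
    master: TransferTaskConfig
    associations: List[AssociationRule] = field(default_factory=list)

    def transfer_order(self):
        """Master table first, then sub-tables in their configured order."""
        return [self.master.source_table] + [
            a.sub_task.source_table for a in self.associations]


@dataclass
class SchedulingTaskConfig:
    name: str
    interval_seconds: int
    # The association between a scheduling task and what it runs.
    task: Union[TransferTaskConfig, BatchTransferTaskConfig]
```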
Fig. 4 is a structural block diagram of the scheduling engine 2 of the present invention. The scheduling engine 2 comprises a task registration unit 21 and a task trigger unit 22.
The task registration unit 21 is connected to the data collection module 3 and the execution engine 4 respectively, and receives the collecting tasks and fragmentation tasks generated by the data collection module 3 and the sub-table tasks generated by the execution engine 4. The task trigger unit 22 is connected to the configuration module 1, the data collection module 3 and the execution engine 4 respectively. According to the scheduling task policy information in the corresponding data transfer information, it sends trigger information to the data collection module 3, triggering it to execute the collecting task, and/or sends trigger information to the execution engine 4, which executes the fragmentation task and/or sub-table task.
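The interplay of registration and triggering can be sketched with a small in-process queue. This is a deliberately simplified, single-threaded stand-in: the class SchedulingEngine, its methods and the task type strings are hypothetical, and timing policies are left out.

```python
from collections import deque


class SchedulingEngine:
    """Toy scheduling engine: a registration queue plus a trigger loop."""

    def __init__(self):
        self._pending = deque()   # registered (task_type, payload) pairs
        self._handlers = {}       # task_type -> callable that executes it

    def set_handler(self, task_type, handler):
        self._handlers[task_type] = handler

    def register(self, task_type, payload):
        """Task registration unit: accept a collecting, fragmentation or sub-table task."""
        self._pending.append((task_type, payload))

    def trigger_all(self):
        """Task trigger unit: dispatch tasks until none is pending.

        Handlers may register follow-up tasks while running, for example a
        collecting task registering the fragmentation tasks it produced.
        """
        dispatched = 0
        while self._pending:
            task_type, payload = self._pending.popleft()
            self._handlers[task_type](payload)
            dispatched += 1
        return dispatched
```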
Fig. 5 is a structural block diagram of the data collection module 3 of the present invention. The data collection module 3 comprises a collecting task generation unit 31, a data capture unit 32 and a fragmentation task creation unit 33.
The collecting task generation unit 31 is connected to the configuration module. It generates a collecting task according to the scheduling task configuration information and the corresponding transferring task configuration information or batch transferring task configuration information, and registers the collecting task with the task registration unit 21 of the scheduling engine 2.
The data capture unit 32 is connected to the scheduling engine 2 and executes the collecting task according to the trigger information of the scheduling engine 2. Specifically, according to the configuration information of the configuration module 1, for example the data source configuration information, it obtains the address, user name and database name of the database to connect to. It then creates a database access connection to the source database 6 to be transferred and, according to the data table information in the transferring task configuration information (such as the source table name, target table name, source data source, target data source and data screening rule for the transfer), queries the data that qualifies for transfer and captures the primary key data from the source data source.
The fragmentation task creation unit 33 is connected to the data capture unit 32. It creates multiple fragmentation tasks according to the captured primary key data of the data to be transferred, writes the primary key data of each fragmentation task into the collection data table, and registers the fragmentation tasks with the scheduling engine 2. The collection data table also serves as the master list of fragmentation tasks: every fragmentation task is recorded in it, and each fragmentation task records the primary key data of its source data.
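The capture and fragmentation described for units 32 and 33 might be sketched as follows with sqlite3. The collection table layout, the function name collect_fragments and the fixed-size fragmentation strategy are assumptions; the screening rule is treated as a trusted WHERE-clause fragment taken from configuration.

```python
import sqlite3


def collect_fragments(source_db, collect_db, table, screening_rule,
                      pk="id", fragment_size=1000):
    """Capture matching primary keys and record them per fragment.

    Returns the list of fragment ids that were written to the collection table.
    """
    # Only primary keys are captured here; full rows are read at transfer time.
    keys = [r[0] for r in source_db.execute(
        f"SELECT {pk} FROM {table} WHERE {screening_rule} ORDER BY {pk}")]
    collect_db.execute(
        "CREATE TABLE IF NOT EXISTS collection (fragment_id INTEGER, pk_value INTEGER)")
    fragment_ids = []
    for start in range(0, len(keys), fragment_size):
        fragment_id = start // fragment_size + 1
        collect_db.executemany(
            "INSERT INTO collection VALUES (?, ?)",
            [(fragment_id, k) for k in keys[start:start + fragment_size]])
        fragment_ids.append(fragment_id)
    collect_db.commit()
    return fragment_ids
```

Each returned fragment id would then be registered with the scheduling engine as one fragmentation task.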
For a batch transferring task, only the master table data is collected during data collection; sub-table data is not collected, and the collected master table data is stored in the collection data table.
In the present invention, when the whole system starts, all configured scheduling tasks are queried, and a corresponding collecting task is generated for each and registered with the scheduling engine. New scheduling tasks can also be added manually. When a scheduling task is added while the system is running, it can be started manually; the data collection module 3 generates a corresponding collecting task from it and registers it with the scheduling engine 2.
In a specific application embodiment of the present invention, fragmentation tasks are executed by a separate fragmentation task executor. The executor is scheduled by the scheduling engine and runs in a distributed manner.
When the scheduling engine 2 triggers the fragmentation task executor, it tells the executor which fragmentation task to execute, and the executor obtains the pending tasks of that fragment from the fragmentation task list (the aforementioned collection data table). After obtaining the pending tasks, it hands them one by one to the execution engine 4 for processing.
After obtaining a fragmentation task, the execution engine 4 loads the necessary configuration information, such as the task configuration information, batch information and data source configuration information, and then performs the transfer. If the task is a batch task, it generates the corresponding associated sub-table tasks.
Specifically, Fig. 6 is a structural block diagram of the execution engine 4 of the present invention when executing a transferring task. When executing a transferring task, the execution engine 4 comprises a task receiving unit 41, a configuration information loading unit 42 and a data migration processing unit 43.
The task receiving unit 41 is connected to the scheduling engine 2 and obtains the corresponding fragmentation task from the collection data table according to the trigger information sent by the scheduling engine. In a specific implementation, the task receiving unit 41 may be the aforementioned fragmentation task executor.
The configuration information loading unit 42 obtains the corresponding configuration information from the configuration module 1 according to the fragmentation task obtained by the task receiving unit 41, for example database connection information such as the database address, user name or database name, and data table information such as the source table name, target table name, source data source, target data source and data screening rule for the transfer.
The data migration processing unit 43 creates a connection to the database according to the configuration information and connects to it, finds the source table and source data source by the source table name, and, according to the primary key data of the fragmentation task, migrates the source data corresponding to the primary key data in the source data source into the target data source of the target table in the target database 7, and deletes the source data from the source data source.
If the transfer task is a batch transfer task, the structure of the execution engine is as shown in FIG. 7, which is a structural block diagram of the execution engine of the present invention when performing a batch transfer task. In this case the execution engine 4 includes the task receiving unit 41, the configuration information loading unit 42, the data migration processing unit 43, and a sub-table task creation unit 44.
The task receiving unit 41 and the configuration information loading unit 42 are similar to the structures shown in FIG. 6. The difference is that the configuration information loading unit 42 obtains from the configuration module 1 the corresponding batch transfer task configuration information, such as the master table information, the field association rules between the associated data tables, the table transfer order, and the designated master table.
After the primary key data of a shard task has been transferred as described for FIG. 6 and the previous embodiment, if the task is a batch transfer task, the sub-table task creation unit 44 creates the corresponding associated-table subtask and records it in the subtask table. Because the master table of a task has multiple primary key data items, and one associated-table subtask is created for each primary key data item, multiple associated sub-table tasks are produced, and they are registered with the scheduling engine 2.
That is, after the master table data has been transferred successfully, the corresponding sub-table tasks are generated according to the table association rules defined in the batch configuration.
For example:
Table_master is the master table; its primary key is id, and the table is hereinafter denoted tm.
Table_slave is the sub-table associated with the master table; its primary key is id, and the table is hereinafter denoted ts.
According to the master/sub-table association specified in the batch transfer task configuration information, the association in this embodiment is: ts.master_id = tm.id.
When the tm table successfully transfers a row with id = 10000, a corresponding task for the ts table is generated, and the associated master table field value is attached to it in the task table.
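The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AssociationRule` class, the `build_sub_table_task` function, and the layout of the task record are hypothetical names chosen here, not structures defined by the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssociationRule:
    """Hypothetical holder for one rule such as ts.master_id = tm.id."""
    sub_table: str      # sub-table name, e.g. "ts"
    sub_field: str      # field in the sub-table, e.g. "master_id"
    master_field: str   # field in the master table, e.g. "id"


def build_sub_table_task(rule: AssociationRule, master_row: dict) -> dict:
    """Create the sub-table task for one successfully transferred master row."""
    return {
        "table": rule.sub_table,
        "filter_field": rule.sub_field,
        # The associated master field value travels with the task record.
        "filter_value": master_row[rule.master_field],
        "status": "pending",
    }


rule = AssociationRule(sub_table="ts", sub_field="master_id", master_field="id")
task = build_sub_table_task(rule, {"id": 10000})
print(task)
```

The resulting record carries everything a later executor would need to select the ts rows whose master_id equals 10000.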
When a batch transfer task is performed, two kinds of tasks execute in parallel: shard tasks generated from the master table data, and sub-table tasks that transfer the sub-tables associated with the master table. After one primary key data item of a master table shard task has been transferred successfully, the corresponding sub-table task is generated and written to the sub-table task list. Master table shard tasks are obtained by the shard task executor and executed by the execution engine; sub-table tasks are obtained by the sub-table task executor and are likewise executed by the execution engine.
The execution of a sub-table task is similar to that of the aforementioned shard task and is not repeated here.
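The parallel execution of the two kinds of tasks might look roughly like the following sketch, which uses a thread pool from Python's standard library. The functions `run_master_shard` and `run_sub_table_task` are hypothetical stand-ins that only simulate a transfer; the patent does not prescribe a thread pool or these names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_master_shard(keys):
    """Simulate transferring master rows; return one sub-table task per key."""
    return [{"table": "ts", "master_id": pk} for pk in keys]


def run_sub_table_task(task):
    """Simulate transferring the sub-table rows tied to one master row."""
    return f"ts rows for master_id={task['master_id']} transferred"


master_shards = [[10000, 10001], [10002]]

with ThreadPoolExecutor(max_workers=4) as pool:
    master_futures = [pool.submit(run_master_shard, keys) for keys in master_shards]
    sub_futures = []
    for done in as_completed(master_futures):
        # Sub-table tasks are submitted as soon as a master shard finishes,
        # so they can run while other master shards are still in progress.
        sub_futures += [pool.submit(run_sub_table_task, t) for t in done.result()]
    results = sorted(f.result() for f in sub_futures)

print(results)
```

Sorting the results only makes the output order stable; the tasks themselves complete in whatever order the pool schedules them.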
Given the above description of the concrete structure of the data transfer system, one or more of these structures can be implemented with existing technologies. For example, the scheduling engine of the present invention performs task scheduling, which means automatically executing a task at a given point in time, at a given interval, or a given number of times. Many implementations currently exist, such as Timer, Scheduler, Quartz, and JCronTab. For the data acquisition module and the execution engine, the functions of some of the aforementioned modules and units can be implemented by created workers when data is actually captured and tasks are executed.
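As a rough illustration of interval-based triggering with a bounded number of executions, the sketch below uses Python's standard `sched` module rather than any of the Java schedulers named above. The helper `run_repeatedly` is a hypothetical name, and a fake clock replaces real sleeping so that the example finishes instantly.

```python
import sched


def run_repeatedly(scheduler, interval, max_runs, action):
    """Trigger `action` every `interval` time units, `max_runs` times in total."""
    def fire(run_no):
        action(run_no)
        if run_no < max_runs:
            # Re-arm the trigger for the next run.
            scheduler.enter(interval, 1, fire, argument=(run_no + 1,))
    scheduler.enter(interval, 1, fire, argument=(1,))


# Fake clock: "sleeping" just advances the current time.
clock = [0.0]


def fake_sleep(seconds):
    clock[0] += seconds


scheduler = sched.scheduler(timefunc=lambda: clock[0], delayfunc=fake_sleep)

fired_at = []
run_repeatedly(scheduler, interval=60, max_runs=3,
               action=lambda n: fired_at.append((n, clock[0])))
scheduler.run()
print(fired_at)
```

In a real deployment the trigger action would send the trigger message to the data acquisition module or the execution engine instead of appending to a list.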
Step S3 shown in FIG. 1 executes the acquisition task: according to the data source configuration information and the transfer task configuration information, it captures the primary key data of the data to be transferred from the source data source, shards the captured data, and generates shard tasks TDi. It can be implemented by the flow shown in FIG. 8.
Step S31: generate the acquisition task.
Step S32: according to the trigger message sent by the scheduling engine, obtain the configuration information of the corresponding acquisition task from the configuration module.
Step S33: automatically build an acquisition worker, automatically build a database access connection according to the database connection information in the data source configuration information, and connect to that database.
Step S34: determine the source table and source data source according to the data table information in the transfer task configuration information.
Step S35: query the source data source in the source table according to the data screening rule for the transfer, and capture the source data.
Step S36: after the source data has been captured, shard the captured data, create the corresponding shard tasks, and register the shard tasks with the scheduling engine.
Step S37: write the primary key data of each shard task to the collection data table. After the above steps have been performed, return and perform the next acquisition task.
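Steps S33 to S37 can be sketched with Python's built-in `sqlite3` module. Several things here are assumptions made for illustration: the `orders` table, the `collection` table layout, the date-based screening rule, and fixed-size sharding (the patent does not state how shards are sized). The screening rule is interpolated into the SQL text because it is treated as trusted configuration, not user input.

```python
import sqlite3


def collect_shard_tasks(conn, source_table, pk_column, screening_rule, shard_size):
    """Capture primary keys matching the rule and split them into shard tasks."""
    # S35: query the source table using the configured screening rule.
    sql = (f"SELECT {pk_column} FROM {source_table} "
           f"WHERE {screening_rule} ORDER BY {pk_column}")
    keys = [row[0] for row in conn.execute(sql)]
    # S36: cut the captured keys into fixed-size shards (an assumed policy).
    shards = [keys[i:i + shard_size] for i in range(0, len(keys), shard_size)]
    # S37: record each shard's primary keys in the collection table.
    conn.execute("CREATE TABLE IF NOT EXISTS collection (shard_id INTEGER, pk INTEGER)")
    for shard_id, shard in enumerate(shards):
        conn.executemany("INSERT INTO collection VALUES (?, ?)",
                         [(shard_id, pk) for pk in shard])
    conn.commit()
    return shards


# S33: an in-memory database stands in for the configured source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    (1, "2015-01-01"), (2, "2015-06-01"), (3, "2016-03-01"),
    (4, "2015-09-01"), (5, "2015-12-31"),
])
shards = collect_shard_tasks(conn, "orders", "id", "created < '2016-01-01'", shard_size=2)
print(shards)
```

Only primary keys are captured at this stage; the rows themselves are read later, when each shard task runs.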
In step S4 in FIG, when performing burst task TDi, implement flow process such as Shown in Fig. 9.In this flow process, realize data by building migration worker and migration process device Carry down, the data of the most single task of carrying down are carried down.
Step S41, as migration worker during scheduling engine calls burst executor, i.e. figure, Migrate worker to initialize, make migration worker process duty.
Step S42, gathers tables of data by inquiry and obtains the data list of burst task.
Step S43, it is judged that whether the size of the data list of burst task is more than 0, if greater than 0, Illustrate that data need to migrate, then carried out step S44, if no more than 0, then explanation does not has data Need to migrate, make worker dormancy.
Step S44, loads configuration information.Such as database information, data table information etc..
Step S45, initializes migration process device.
Step S46, finds source data, and the source data of query source data base according to major key information, Wherein major key is made up of, can identify the data of data uniqueness one or a group field.
Step S47, it may be judged whether obtain appointment source data, if obtaining appointment source data, Then carry out step S48, if it did not, update burst task status in step S51.Described here Burst task status refer to execution state, burst task status be divided into pending, performed With perform unsuccessfully three kinds of states, for indicating the implementation status of current slice task.
Step S48, writes target database by source data, if writing successfully, then in step S49 Delete source data, without being successfully written, then step S50 update burst task perform letter Breath, will original state be updated to perform failure.
Step S49, deletes source data, if successfully deleting source data, then in step S51 more New burst task status.Have failed if deleted, then update the execution of burst task in step S50 Information, the execution information of current slice task will be modified to carry out failure, and record unsuccessfully main Information, performs time, accumulative execution number of times etc..After performing unsuccessfully, after burst task start under Secondary also can perform, accumulative execution exceedes maximum reattempt times, then abandon task, and state is for perform mistake Lose.
After having processed current data, judge that whether this task is that batch is carried down task in step S52, If it is, generate corresponding sublist task in step S53, and write sublist task list.
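The core of this flow (steps S46 to S51) can be sketched as follows, again with `sqlite3`. This is a simplified illustration under assumptions: the task is a plain dictionary, the table has just `id` and `payload` columns, `MAX_RETRIES` is an invented limit, and `INSERT OR REPLACE` is a choice made here so that a retry after a failed delete does not collide with a row already written to the target.

```python
import sqlite3

MAX_RETRIES = 3  # assumed limit; the patent only says a maximum exists


def run_shard_task(source, target, table, task):
    """Move each primary key's row from source to target and update the task status."""
    task["attempts"] += 1
    try:
        for pk in task["keys"]:
            # S46: look up the source row by primary key.
            row = source.execute(
                f"SELECT id, payload FROM {table} WHERE id = ?", (pk,)).fetchone()
            if row is None:  # S47: nothing to move for this key
                continue
            # S48: write to the target first.
            target.execute(f"INSERT OR REPLACE INTO {table} VALUES (?, ?)", row)
            target.commit()
            # S49: delete from the source only after the write succeeded.
            source.execute(f"DELETE FROM {table} WHERE id = ?", (pk,))
            source.commit()
        task["status"] = "executed"  # S51
    except sqlite3.Error as exc:     # S50: record the failure information
        task["status"] = "failed"
        task["error"] = str(exc)
        task["abandoned"] = task["attempts"] > MAX_RETRIES
    return task


source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
source.commit()

task = run_shard_task(source, target, "orders",
                      {"keys": [1, 2, 99], "status": "pending", "attempts": 0})
print(task["status"])
```

Writing before deleting means a failure between the two steps leaves the row in both databases rather than in neither, which is the safer outcome for a transfer that will be retried.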
The execution of a sub-table task is similar to that of a shard task. The sub-table task executor receives the trigger message from the scheduling engine, queries the sub-table task list, determines whether there are pending sub-table tasks, and obtains them. It then loads and initializes the transfer task configuration information corresponding to the current sub-table task, and constructs and initializes a transfer processor to process it. The data transfer logic is essentially the same as the shard task execution logic and is not described again. Depending on the circumstances, a sub-table task may in turn generate sub-table tasks for its own associated data tables.
Through a series of rule configurations, the present invention sets periodic data transfer rules and can perform data transfers dynamically by parsing those rules. When a new transfer task is added, the system does not need to be redeveloped or substantially modified; the task can be realized simply by configuring rules, which greatly improves developer efficiency and reduces the cost of developing and maintaining the application system.
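A minimal sketch of what "adding a task by configuration only" could mean in practice is shown below. The `TRANSFER_TASKS` dictionary, its keys, and the `build_capture_sql` function are hypothetical; the patent does not specify a configuration format.

```python
# Hypothetical rule store: one entry per transfer task.
TRANSFER_TASKS = {
    "archive_orders": {
        "source_table": "orders",
        "target_table": "orders_history",
        "primary_key": "id",
        "screening_rule": "created < '2016-01-01'",
    },
}


def build_capture_sql(task_name, tasks=TRANSFER_TASKS):
    """Derive the primary-key capture query purely from the configured rules."""
    cfg = tasks[task_name]
    return (f"SELECT {cfg['primary_key']} FROM {cfg['source_table']} "
            f"WHERE {cfg['screening_rule']}")


# Adding a new transfer task is a configuration change; no code is modified.
TRANSFER_TASKS["archive_logs"] = {
    "source_table": "logs",
    "target_table": "logs_history",
    "primary_key": "log_id",
    "screening_rule": "level = 'DEBUG'",
}
sql = build_capture_sql("archive_logs")
print(sql)
```

The generic function stays unchanged while the set of tasks grows, which is the property the paragraph above attributes to the rule-driven design.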
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (15)

1. A data transfer method, comprising the following steps:
configuring data transfer information, including scheduling task configuration information, data source configuration information, transfer task configuration information, and batch transfer task configuration information;
generating a corresponding acquisition task according to the scheduling task configuration information and the related data source configuration information and transfer task configuration information or batch transfer task configuration information;
executing the acquisition task, capturing the source data to be transferred from the source data source according to the data source configuration information and the transfer task configuration information or batch transfer task configuration information in the acquisition task, and generating multiple shard tasks; and
executing the multiple shard tasks in parallel and, according to the transfer task configuration information or batch transfer task configuration information, transferring the data to be transferred from the source database to the target database.
2. The data transfer method of claim 1, wherein the source data to be transferred that is captured from the source data source is primary key data; and after the multiple shard tasks are generated, the primary key data of each shard task is written to a collection data table.
3. The data transfer method of claim 2, wherein executing the shard task and transferring the data to be transferred from the source database to the target database according to the transfer task configuration information specifically comprises the following steps:
reading the data list of the shard task from the collection data table to obtain the corresponding multiple primary key data items;
loading the transfer task configuration information to obtain the database information and data table information corresponding to the shard task;
obtaining, in turn, the source data corresponding to each primary key data item from the source data source in the source database, and writing the source data to the target database; and
deleting the source data from the source database.
4. The data transfer method of claim 3, wherein, if the shard task is a batch transfer task, after the source data corresponding to each primary key data item has been successfully transferred to the target database, the method further comprises: generating a corresponding sub-table task according to the association rules in the batch transfer task configuration information.
5. The data transfer method of claim 4, further comprising the step of executing multiple sub-table tasks in parallel.
6. The data transfer method of claim 4 or 5, wherein, before the source data is written to the target database, the method further comprises: constructing, according to the configuration information, the data type conversion from source data to target data, the field mapping, and the fields and values to be written to the target database.
7. The data transfer method of claim 1, wherein capturing the source data to be transferred from the source data source according to the data source configuration information and the transfer task configuration information or batch transfer task configuration information comprises the following steps:
creating a database access connection according to the database information in the data source configuration information, and connecting to the database;
obtaining the source data source and the data screening rule according to the transfer task configuration information or batch transfer task configuration information, and capturing from the source data source the source data to be transferred that satisfies the data screening rule.
8. The data transfer method of claim 1, wherein, when the scheduling task configuration information is configured, an association is established between the scheduling task configuration information and the corresponding transfer task configuration information or batch transfer task configuration information.
9. A data transfer system, comprising:
a configuration module, used to configure data transfer information, including data source configuration information, scheduling task configuration information, transfer task configuration information, and batch transfer task configuration information;
a scheduling engine, connected to the configuration module and used to control the acquisition and transfer of data by sending trigger messages according to the corresponding data transfer information;
a data acquisition module, connected to the configuration module and the scheduling engine and used to generate a corresponding acquisition task according to the scheduling task configuration information, to execute the acquisition task according to the trigger message of the scheduling engine and the data source configuration information, to generate multiple shard tasks, and to register the multiple shard tasks with the scheduling engine; and
an execution engine, connected to the configuration module and the scheduling engine and used to execute the corresponding shard task or sub-table task according to the trigger message of the scheduling engine, and to transfer the source data to be transferred from the source database to the target database according to the corresponding transfer task configuration information or batch transfer task configuration information.
10. The data transfer system of claim 9, further comprising a data interface, connected to the data acquisition module and the execution engine respectively and used to provide data operation processing between the data acquisition module and the database and between the execution engine and the database.
11. The data transfer system of claim 9 or 10, wherein the configuration module comprises:
a data source configuration unit, used to configure the relevant information of the source database and the target database;
a transfer task configuration unit, used to configure the data table information that the transfer task needs to transfer;
a batch transfer task configuration unit, used to configure the transfer task association information of the multiple mutually associated data tables in a batch transfer task; and
a scheduling task configuration unit, used to configure scheduling task policy information and to associate the scheduling task with a transfer task or batch transfer task.
12. The data transfer system of claim 9 or 10, wherein the scheduling engine comprises:
a task registration unit, connected to the data acquisition module and the execution engine respectively and used to receive the acquisition tasks and shard tasks generated by the data acquisition module and the sub-table tasks generated by the execution engine; and
a task trigger unit, connected to the task registration unit, the configuration module, the data acquisition module, and the execution engine respectively, and used to send trigger messages to the data acquisition module and/or the execution engine according to the corresponding data transfer information and the shard tasks and sub-table tasks received by the task registration unit.
13. The data transfer system of claim 9 or 10, wherein the data acquisition module comprises:
an acquisition task generation unit, connected to the configuration module and used to generate an acquisition task according to the scheduling task configuration information and the related data source configuration information and transfer task configuration information or batch transfer task configuration information;
a data capture unit, connected to the scheduling engine and used to execute the corresponding acquisition task according to the trigger message of the scheduling engine and to capture the primary key data of the data to be transferred from the source data source; and
a shard task creation unit, connected to the data capture unit and used to create multiple shard tasks according to the captured primary key data of the data to be transferred, and to register the multiple shard tasks with the scheduling engine.
14. The data transfer system of claim 9, wherein the execution engine comprises:
a task receiving unit, connected to the scheduling engine and used to obtain specific task information according to the trigger message sent by the scheduling engine;
a configuration information loading unit, connected to the task receiving unit and the configuration module respectively and used to obtain the corresponding data transfer information from the configuration module according to the task information received by the task receiving unit; and
a data migration processing unit, connected to the configuration information loading unit and the task receiving unit respectively and used to migrate the corresponding source data in the source database to the target database according to the corresponding data transfer information and the task information, and to delete the source data from the source database.
15. The data transfer system of claim 14, wherein the execution engine further comprises:
a sub-table task creation unit, connected to the data migration processing unit and used, after the data migration processing unit has successfully transferred the source data to the target database, to create a corresponding sub-table task according to the association rules in the batch transfer task configuration information and to register the sub-table task with the scheduling engine;
wherein the task information obtained by the task receiving unit of the execution engine includes shard task information and sub-table task information.
CN201610232604.8A 2016-04-14 2016-04-14 Method and system for transferring data Pending CN105930389A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610232604.8A CN105930389A (en) 2016-04-14 2016-04-14 Method and system for transferring data

Publications (1)

Publication Number Publication Date
CN105930389A true CN105930389A (en) 2016-09-07

Family

ID=56839158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610232604.8A Pending CN105930389A (en) 2016-04-14 2016-04-14 Method and system for transferring data

Country Status (1)

Country Link
CN (1) CN105930389A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160907
