CN104361133B - Data pick-up device and method - Google Patents


Publication number
CN104361133B
Authority
CN
China
Prior art keywords
data
switch
extracts
extraction
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410750223.XA
Other languages
Chinese (zh)
Other versions
CN104361133A (en
Inventor
姜亚健
胡沛兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yonyou Network Technology Co Ltd
Original Assignee
Yonyou Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yonyou Network Technology Co Ltd filed Critical Yonyou Network Technology Co Ltd
Priority to CN201410750223.XA priority Critical patent/CN104361133B/en
Publication of CN104361133A publication Critical patent/CN104361133A/en
Application granted granted Critical
Publication of CN104361133B publication Critical patent/CN104361133B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a data extraction device comprising: a switch state loading unit, configured to load the state of a control switch for the current extraction and to save the state the control switch had at the previous extraction; a data extraction controller, configured to determine from the loaded switch states whether the current extraction is a full extraction or an incremental extraction and, when it is an incremental extraction, to compute the portions of data that need to be extracted, filtered, and archived; and a data extraction unit, configured to extract data according to the computed portions of data to be extracted, filtered, and archived. The present invention also provides a data extraction method. With this technical solution, extraction of multiple object types can be completed by making full use of a single object type on the basis of existing extraction modes, establishing a general, unified approach to data extraction involving multiple object types.

Description

Data extraction device and method
Technical field
The present invention relates to the technical field of data processing, and in particular to a data extraction device and a data extraction method.
Background
Big data is now everywhere, and more and more enterprises are planning data warehouses and using data mining and business analysis tools to help them make sound business decisions. The data generated by an enterprise's business systems is delivered to the data warehouse as high-quality data by ETL tools. ETL comprises three stages: extraction, transformation, and loading. Extraction faces a variety of data sources, such as heterogeneous databases and files, and how to extract data from them correctly, stably, and efficiently is one of the key issues that must be considered in ETL design. The extraction methods in common use today each have limitations:
1. Full extraction: every run loads all data of the business system into the data warehouse through the ETL tool. Extraction is simple, but as the data volume grows this approach hits a performance bottleneck and cannot keep the data up to date in real time.
2. Trigger-based push incremental extraction: extraction performance is high and data can be updated in real time, but triggers must be created in the business database, or program code must push data to the data warehouse. This requires changes to the business system or business database, which affects the stability of the business system and the security of the business data, and also has some impact on business system performance.
3. Timestamp-based incremental extraction: like the trigger approach, performance is good and the process is fairly clear and simple. However, the business system is required to have a timestamp field that is updated whenever data changes, and deleted data must remain recorded in the database and cannot be physically deleted, which limits data accuracy.
4. Full-table comparison incremental extraction: inserts, deletes, and updates are determined by comparing entire tables, so performance is poor.
5. Database log comparison: changed data is identified by analyzing the database's own logs. This is constrained by database type and version and does not support heterogeneous databases.
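The limitation of timestamp-based extraction (item 3) can be illustrated with a minimal sketch. The row layout and the `modified_at` field name are assumptions made for illustration only; they do not come from the patent.

```python
from datetime import datetime

def extract_by_timestamp(source_rows, last_run):
    """Timestamp-based incremental extraction: rows modified after last_run."""
    return [row for row in source_rows if row["modified_at"] > last_run]

# Hypothetical source table as it looked at the previous run.
source = [
    {"id": 1, "modified_at": datetime(2014, 1, 5)},
    {"id": 2, "modified_at": datetime(2014, 1, 6)},
]
warehouse = {row["id"]: row for row in source}  # warehouse after the previous run
last_run = datetime(2014, 1, 10)

# Since the previous run: row 1 is updated and row 2 is physically deleted.
source = [{"id": 1, "modified_at": datetime(2014, 1, 12)}]

delta = extract_by_timestamp(source, last_run)
for row in delta:
    warehouse[row["id"]] = row

# Rows still in the warehouse that no longer exist in the source.
stale_ids = sorted(set(warehouse) - {row["id"] for row in source})
```

The delta contains only the updated row, so the physically deleted row is never reported and remains in the warehouse as stale data. This is why the timestamp approach requires deletions to be recorded in the source rather than performed physically.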
A new data extraction technique is therefore needed that, on the basis of existing extraction modes, makes full use of a single object type to complete the extraction of multiple object types, establishing a general, unified approach to data extraction involving multiple object types.
Summary of the invention
In view of the above problems, the present invention proposes a new data extraction technique that, on the basis of existing extraction modes, makes full use of a single object type to complete the extraction of multiple object types, establishing a general, unified approach to data extraction involving multiple object types.
To this end, the invention proposes a data extraction device comprising: a switch state loading unit, configured to load the state of a control switch for the current extraction and to save the state the control switch had at the previous extraction; a data extraction controller, configured to determine from the loaded switch states whether the current extraction is a full extraction or an incremental extraction and, when it is an incremental extraction, to compute the portions of data that need to be extracted, filtered, and archived; and a data extraction unit, configured to extract data according to the computed portions of data to be extracted, filtered, and archived. In this technical solution, incremental extraction can be performed according to data switch information and business dates, which effectively avoids the limitations of current extraction modes; the data warehouse and the business database run independently, which helps the enterprise control the stability of the business system and the security of the business data.
In the above technical solution, preferably, the switch state loading unit specifically comprises: a state loading module, configured to load the state of the control switch for the current extraction; and a state saving module, configured to retrieve and save, based on the loaded current state of the control switch, the state the control switch had at the previous extraction. In this technical solution, incremental extraction remains possible even when the business system performs physical deletes, meeting the requirements of high data accuracy and low impact on the performance of the source business system.
In the above technical solution, preferably, the data extraction controller specifically comprises: an extraction type judging module, configured to determine from the loaded switch states whether the current extraction is a full extraction or an incremental extraction; and a data computation module, configured to compute, when the current extraction is an incremental extraction, the portions of data that need to be extracted, filtered, and archived. In this technical solution, the data switches in the business system serve as the basis for incremental extraction: the corresponding data is re-extracted, filtered, and archived in three cases, namely when the switch is open, when it has been newly closed, and when it has been closed again after being reopened.
In the above technical solution, preferably, the data extraction unit specifically comprises: an initial full extraction module, configured to perform the initial full extraction of data according to the computed portions of data to be extracted, filtered, and archived; and a daily incremental extraction module, configured to perform the daily incremental extraction on the basis of the data from the initial full extraction. In this technical solution, incremental extraction of the corresponding business data is achieved mainly by comparing the data switch states of the business system.
In the above technical solution, preferably, the initial full extraction performed by the initial full extraction module further comprises: (1) loading the current switch state into the last-extraction state; (2) setting data extraction to full mode; (3) extracting the business data in full and initializing the data warehouse. And/or, the daily incremental extraction performed by the daily incremental extraction module further comprises: (1) loading the current switch state into the current-extraction state slot; (2) computing, from the states of the current and previous extractions, the time points for which incremental data must be extracted; (3) incrementally extracting, filtering, and loading the data according to the computed extraction times; (4) copying the current switch state data to the last-extraction state slot for use in the next extraction. In this technical solution, incremental extraction handles a small data volume and therefore offers high extraction performance.
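The two operation sequences above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `SwitchStateStore`, `run_extraction`, the callback parameters, and the rule that the first run is the full one are all assumptions introduced here, and `changed_or_open` is only a placeholder for the comparison in step (2).

```python
from dataclasses import dataclass, field

@dataclass
class SwitchStateStore:
    """Two state slots: the previous run's snapshot and this run's snapshot."""
    last: dict = field(default_factory=dict)
    current: dict = field(default_factory=dict)
    initialized: bool = False

def run_extraction(store, read_switches, extract_full, extract_keys, pick_keys):
    """Perform one run and return the mode used: "full" or "incremental"."""
    if not store.initialized:
        # Initial full extraction: (1) load the current switch state into the
        # last-extraction slot, (2) use full mode, (3) extract everything.
        store.last = dict(read_switches())
        extract_full()
        store.initialized = True
        return "full"
    # Daily incremental extraction.
    store.current = dict(read_switches())          # (1) load current state
    keys = pick_keys(store.last, store.current)    # (2) compare the two states
    extract_keys(keys)                             # (3) extract, filter, load
    store.last = dict(store.current)               # (4) save for the next run
    return "incremental"

def changed_or_open(last, current):
    """Placeholder rule: periods that are open now or whose state changed."""
    return sorted(k for k, v in current.items() if v == "open" or last.get(k) != v)

# Hypothetical switch states keyed by period.
switches = {"2014-01": "closed", "2014-02": "open"}
log = []
store = SwitchStateStore()

mode1 = run_extraction(store, lambda: switches, lambda: log.append("ALL"),
                       log.extend, changed_or_open)
switches["2014-02"] = "closed"
switches["2014-03"] = "open"
mode2 = run_extraction(store, lambda: switches, lambda: log.append("ALL"),
                       log.extend, changed_or_open)
```

Snapshots are copied with `dict(...)` so that later changes in the source system do not silently alter the saved last-extraction state.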
According to a further aspect of the invention, a data extraction method is also proposed, comprising: step 202: loading the state of a control switch for the current extraction, and saving the state the control switch had at the previous extraction; step 204: determining from the loaded switch states whether the current extraction is a full extraction or an incremental extraction and, when it is an incremental extraction, computing the portions of data that need to be extracted, filtered, and archived; step 206: extracting data according to the computed portions of data to be extracted, filtered, and archived. In this technical solution, incremental extraction can be performed according to data switch information and business dates, which effectively avoids the limitations of current extraction modes; the data warehouse and the business database run independently, which helps the enterprise control the stability of the business system and the security of the business data.
In the above technical solution, preferably, step 202 specifically comprises: step 302: loading the state of the control switch for the current extraction; step 304: retrieving and saving, based on the loaded current state of the control switch, the state the control switch had at the previous extraction. In this technical solution, incremental extraction remains possible even when the business system performs physical deletes, meeting the requirements of high data accuracy and low impact on the performance of the source business system.
In the above technical solution, preferably, step 204 specifically comprises: step 402: determining from the loaded switch states whether the current extraction is a full extraction or an incremental extraction; step 404: when the current extraction is an incremental extraction, computing the portions of data that need to be extracted, filtered, and archived. In this technical solution, the data switches in the business system serve as the basis for incremental extraction: the corresponding data is re-extracted, filtered, and archived in three cases, namely when the switch is open, when it has been newly closed, and when it has been closed again after being reopened.
In the above technical solution, preferably, step 206 specifically comprises: step 502: performing the initial full extraction of data according to the computed portions of data to be extracted, filtered, and archived; step 504: performing the daily incremental extraction on the basis of the data from the initial full extraction. In this technical solution, incremental extraction of the corresponding business data is achieved mainly by comparing the data switch states of the business system.
In the above technical solution, preferably, the initial full extraction performed in step 502 further comprises: (1) loading the current switch state into the last-extraction state; (2) setting data extraction to full mode; (3) extracting the business data in full and initializing the data warehouse. And/or, the daily incremental extraction performed in step 504 further comprises: (1) loading the current switch state into the current-extraction state slot; (2) computing, from the states of the current and previous extractions, the time points for which incremental data must be extracted; (3) incrementally extracting, filtering, and loading the data according to the computed extraction times; (4) copying the current switch state data to the last-extraction state slot for use in the next extraction. In this technical solution, incremental extraction handles a small data volume and therefore offers high extraction performance.
With the above technical solution, extraction of multiple object types can be completed by making full use of a single object type on the basis of existing extraction modes, establishing a general, unified approach to data extraction involving multiple object types.
Brief description of the drawings
Fig. 1 shows a block diagram of a data extraction device according to an embodiment of the present invention;
Fig. 2 shows a flow chart of a data extraction method according to an embodiment of the present invention;
Fig. 3 shows a flow chart of the switch state loading unit according to an embodiment of the present invention;
Fig. 4 shows a flow chart of the data extraction controller according to an embodiment of the present invention;
Fig. 5 shows a flow chart of the data extraction unit according to an embodiment of the present invention;
Fig. 6 shows a schematic diagram of incremental data extraction based on data switches according to an embodiment of the present invention;
Fig. 7 shows a flow chart of the full extraction logic according to an embodiment of the present invention;
Fig. 8 shows a flow chart of the daily incremental extraction according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, features, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another.
Many specific details are set forth in the following description to provide a full understanding of the present invention. However, the invention may also be implemented in ways other than those described here, so its scope of protection is not limited by the specific embodiments disclosed below.
Fig. 1 shows a block diagram of a data extraction device according to an embodiment of the present invention.
As shown in Fig. 1, the data extraction device 100 according to an embodiment of the present invention comprises: a switch state loading unit 102, configured to load the state of a control switch for the current extraction and to save the state the control switch had at the previous extraction; a data extraction controller 104, configured to determine from the loaded switch states whether the current extraction is a full extraction or an incremental extraction and, when it is an incremental extraction, to compute the portions of data that need to be extracted, filtered, and archived; and a data extraction unit 106, configured to extract data according to the computed portions of data to be extracted, filtered, and archived. In this technical solution, incremental extraction can be performed according to data switch information and business dates, which effectively avoids the limitations of current extraction modes; the data warehouse and the business database run independently, which helps the enterprise control the stability of the business system and the security of the business data.
In the above technical solution, preferably, the switch state loading unit 102 specifically comprises: a state loading module 1022, configured to load the state of the control switch for the current extraction; and a state saving module 1024, configured to retrieve and save, based on the loaded current state of the control switch, the state the control switch had at the previous extraction. In this technical solution, incremental extraction remains possible even when the business system performs physical deletes, meeting the requirements of high data accuracy and low impact on the performance of the source business system.
In the above technical solution, preferably, the data extraction controller 104 specifically comprises: an extraction type judging module 1042, configured to determine from the loaded switch states whether the current extraction is a full extraction or an incremental extraction; and a data computation module 1044, configured to compute, when the current extraction is an incremental extraction, the portions of data that need to be extracted, filtered, and archived. In this technical solution, the data switches in the business system serve as the basis for incremental extraction: the corresponding data is re-extracted, filtered, and archived in three cases, namely when the switch is open, when it has been newly closed, and when it has been closed again after being reopened.
In the above technical solution, preferably, the data extraction unit 106 specifically comprises: an initial full extraction module 1062, configured to perform the initial full extraction of data according to the computed portions of data to be extracted, filtered, and archived; and a daily incremental extraction module 1064, configured to perform the daily incremental extraction on the basis of the data from the initial full extraction. In this technical solution, incremental extraction of the corresponding business data is achieved mainly by comparing the data switch states of the business system.
In the above technical solution, preferably, the initial full extraction performed by the initial full extraction module 1062 further comprises: (1) loading the current switch state into the last-extraction state; (2) setting data extraction to full mode; (3) extracting the business data in full and initializing the data warehouse. And/or, the daily incremental extraction performed by the daily incremental extraction module 1064 further comprises: (1) loading the current switch state into the current-extraction state slot; (2) computing, from the states of the current and previous extractions, the time points for which incremental data must be extracted; (3) incrementally extracting, filtering, and loading the data according to the computed extraction times; (4) copying the current switch state data to the last-extraction state slot for use in the next extraction. In this technical solution, incremental extraction handles a small data volume and therefore offers high extraction performance.
Fig. 2 shows a flow chart of a data extraction method according to an embodiment of the present invention.
As shown in Fig. 2, the data extraction method according to an embodiment of the present invention comprises: step 202: loading the state of a control switch for the current extraction, and saving the state the control switch had at the previous extraction; step 204: determining from the loaded switch states whether the current extraction is a full extraction or an incremental extraction and, when it is an incremental extraction, computing the portions of data that need to be extracted, filtered, and archived; step 206: extracting data according to the computed portions of data to be extracted, filtered, and archived. In this technical solution, incremental extraction can be performed according to data switch information and business dates, which effectively avoids the limitations of current extraction modes; the data warehouse and the business database run independently, which helps the enterprise control the stability of the business system and the security of the business data.
In the above technical solution, preferably, as shown in Fig. 3, step 202 specifically comprises: step 302: loading the state of the control switch for the current extraction; step 304: retrieving and saving, based on the loaded current state of the control switch, the state the control switch had at the previous extraction. In this technical solution, incremental extraction remains possible even when the business system performs physical deletes, meeting the requirements of high data accuracy and low impact on the performance of the source business system.
In the above technical solution, preferably, as shown in Fig. 4, step 204 specifically comprises: step 402: determining from the loaded switch states whether the current extraction is a full extraction or an incremental extraction; step 404: when the current extraction is an incremental extraction, computing the portions of data that need to be extracted, filtered, and archived. In this technical solution, the data switches in the business system serve as the basis for incremental extraction: the corresponding data is re-extracted, filtered, and archived in three cases, namely when the switch is open, when it has been newly closed, and when it has been closed again after being reopened.
In the above technical solution, preferably, as shown in Fig. 5, step 206 specifically comprises: step 502: performing the initial full extraction of data according to the computed portions of data to be extracted, filtered, and archived; step 504: performing the daily incremental extraction on the basis of the data from the initial full extraction. In this technical solution, incremental extraction of the corresponding business data is achieved mainly by comparing the data switch states of the business system.
In the above technical solution, preferably, the initial full extraction performed in step 502 further comprises: (1) loading the current switch state into the last-extraction state; (2) setting data extraction to full mode; (3) extracting the business data in full and initializing the data warehouse. And/or, the daily incremental extraction performed in step 504 further comprises: (1) loading the current switch state into the current-extraction state slot; (2) computing, from the states of the current and previous extractions, the time points for which incremental data must be extracted; (3) incrementally extracting, filtering, and loading the data according to the computed extraction times; (4) copying the current switch state data to the last-extraction state slot for use in the next extraction. In this technical solution, incremental extraction handles a small data volume and therefore offers high extraction performance.
The technical solution of the present invention provides a data extraction method and device based on data switches. In practice, enterprises frequently need to load, cleanse, and transform business data. As data volumes grow, performing a full extraction every time becomes increasingly difficult, while incremental extraction based on timestamps, triggers, full-table comparison, and similar means has many limitations. In real business systems, however, enterprises usually set up data switches to guarantee the stability of business data. The technical solution of the present invention therefore performs incremental extraction according to data switch information and business dates, which effectively avoids the limitations of current extraction modes; the data warehouse and the business database run independently, which helps the enterprise control the stability of the business system and the security of the business data.
With the arrival of the big data era, the demands on data extraction, filtering, and transformation are becoming more complex, and the requirements placed on extraction methods are becoming more stringent. A given extraction method may perform extremely well in some scenarios yet perform poorly, or not be applicable at all, in others.
The technical solution of the present invention is mainly suitable for the following situations:
1. The business system has data switches and records the operation version of each switch.
2. After being closed, a data switch may or may not be allowed to reopen, but it should eventually end up closed.
3. Data corresponding to a closed data switch cannot be modified; to modify it, the data switch must be reopened.
The technical solution of the present invention achieves incremental extraction of the corresponding business data mainly by comparing the data switch states of the business system.
To solve the problems of the prior art effectively, a device that performs incremental data extraction according to data switches is designed here. It consists of the following three parts (see Fig. 6):
Switch state loading unit: loads the state of the control switch for the current extraction and saves the switch state of the previous extraction, so that the data extraction controller can determine which portions of data need to be re-extracted, filtered, and archived.
Data extraction controller: determines, from the switch states loaded by the switch state loading unit, whether to perform a full or an incremental extraction; for an incremental extraction, it computes the portions of data that need to be extracted, filtered, and archived.
Data extraction unit: extracts, filters, and archives the portions of data specified by the data extraction controller. Its work falls into two parts: initial full extraction and daily incremental extraction. The logic of full extraction is comparatively simple and can be divided into two steps; see Fig. 7 for the specific steps.
For the daily incremental extraction steps, see Fig. 8. Note: the comparison rules used in step 2 of this method are as follows:
(1) the switch is currently open: the corresponding data must be re-extracted and archived;
(2) the switch is currently closed but was open at the previous extraction: the corresponding data must be re-extracted and archived;
(3) the switch is currently closed and was also closed at the previous extraction, but was opened at some point in between: the corresponding data must be re-extracted and archived.
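The three comparison rules can be sketched as a small function. The text states that the business system records an operation version for each switch; here that version is assumed, for illustration, to be an integer that changes on every open or close operation, which allows rule (3) to be detected. The `Switch` type and the function names are hypothetical.

```python
from typing import NamedTuple, Optional

class Switch(NamedTuple):
    is_open: bool
    version: int  # assumed: changes on every open/close operation

def needs_reextract(last: Optional[Switch], current: Switch) -> bool:
    """Apply the three comparison rules to one switch."""
    if current.is_open:
        return True   # rule (1): open now
    if last is None:
        return True   # assumption: a switch not seen before is re-extracted
    if last.is_open:
        return True   # rule (2): closed now, open at the previous extraction
    # rule (3): closed both times, but the version shows it was opened in between
    return current.version != last.version

def periods_to_reextract(last_state, current_state):
    """Return the sorted switch keys whose data must be re-extracted and archived."""
    return sorted(key for key, cur in current_state.items()
                  if needs_reextract(last_state.get(key), cur))

last_state = {
    "2014-01": Switch(False, 1),
    "2014-02": Switch(False, 1),
    "2014-03": Switch(True, 0),
    "2014-04": Switch(True, 0),
}
current_state = {
    "2014-01": Switch(False, 1),  # closed, untouched: skipped
    "2014-02": Switch(False, 3),  # reopened and closed again: rule (3)
    "2014-03": Switch(False, 1),  # closed since the previous run: rule (2)
    "2014-04": Switch(True, 0),   # still open: rule (1)
}
result = periods_to_reextract(last_state, current_state)
```

Only switches that stayed closed with an unchanged version are skipped, which is what keeps the incremental data volume small.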
Let us now look at a concrete example from inventory accounting. The inventory accounting module performs a month-end closing for each combination of accounting month and cost domain. Once a month is closed, information such as the documents belonging to that cost domain and accounting month can no longer be modified. Because mistakes can occur, a closing may be reversed so that the data can be modified, and this follows rules: closing proceeds in accounting-month order, so the previous month must already be closed before the next month can be closed, and to reverse the closing of a given month, the closings of all later months must be reversed first. The data volume in the inventory accounting module is very large, so full extraction consumes a great deal of resources and time. Moreover, because of that volume, some business systems physically delete rows from certain business tables, which makes timestamp-based incremental extraction impossible.
We now apply the present invention to the incremental extraction, filtering, and archiving of this data. First, the structure of the main fields of the month-end closing table, which serves as the data switch, is as follows:
Table 1
Table 1 stores one record each time a combination of accounting month and cost domain is closed; if a closing is reversed, the record is flagged as deleted. The data tables involved are illustrated here by the material monthly report and the detail documents; the structure of their main fields is shown in the following tables:
Table 2
Table 3
Table 2 stores, for each material in each accounting month and cost domain, the quantity and amount together with their average unit price. This table does not use physical deletion: when a reverse closing occurs, the record is marked as deleted, and if the combination is later closed again, the record is modified and its deletion field is set back to not deleted. Table 3 is the detail document table. Each detail document is one record, so the data volume is very large, and the business system therefore uses physical deletion on this table.
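The difference between the two tables matters because timestamp-based extraction only sees rows that still exist. The following sketch illustrates this under assumed names (`ts`, `id`, and the in-memory dictionaries are hypothetical stand-ins for real tables): a physically deleted row never appears in the increment, so the warehouse copy silently goes stale.

```python
def timestamp_increment(rows, since):
    """Return the rows changed after `since` (timestamp-based incremental extraction)."""
    return [r for r in rows if r["ts"] > since]


def stale_ids_after_increment():
    # Source table and warehouse copy are identical after the last extraction at ts=10.
    source = {1: {"id": 1, "ts": 10, "qty": 5}, 2: {"id": 2, "ts": 10, "qty": 7}}
    warehouse = {k: dict(v) for k, v in source.items()}

    # Later, row 1 is updated and row 2 is physically deleted in the source.
    source[1] = {"id": 1, "ts": 15, "qty": 6}
    del source[2]

    # The increment contains only the updated row; the deletion leaves no trace.
    for row in timestamp_increment(source.values(), since=10):
        warehouse[row["id"]] = dict(row)

    # Rows still in the warehouse but gone from the source are stale.
    return set(warehouse) - set(source)
```

A soft-deleted row (a deletion flag plus an updated timestamp, as in Table 2) would have been picked up by the same increment, which is why Table 2 can use timestamps and Table 3 cannot.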
We now use the method of the present invention to solve the problems described above. Table 1 can serve as the data switch table: if a closing has been performed, that is, Table 1 holds a valid record, then the data switch for the cost domain and accounting month combination of that record is regarded as closed; otherwise it is regarded as open. First, consider the initialization steps:
(1) Table 1 is used as the data switch, and its data is loaded in full into the data warehouse as the result of the last extraction;
(2) The data extraction controller sets the extraction state to full extraction and archiving;
(3) The data extraction device, according to the state of the controller, extracts the data of Table 2 and Table 3 in full into the data warehouse, and archives the data in full to the data presentation layer.
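The three initialization steps can be sketched as follows. This is a minimal illustration, assuming the business database, data warehouse and presentation layer are modelled as plain dictionaries; the keys `table1`, `table2`, `table3` and `last_switch` are hypothetical names introduced for the example.

```python
import copy


def initialize(source: dict, warehouse: dict, presentation: dict) -> str:
    """Run the one-off full extraction and return the extraction mode used."""
    # Step (1): snapshot the switch table as the "last extraction" state.
    warehouse["last_switch"] = copy.deepcopy(source["table1"])

    # Step (2): the controller selects full extraction and archiving.
    mode = "full"

    # Step (3): copy the business tables in full into the warehouse,
    # then archive them in full to the presentation layer.
    for name in ("table2", "table3"):
        warehouse[name] = copy.deepcopy(source[name])
        presentation[name] = copy.deepcopy(warehouse[name])
    return mode
```

Deep copies are used so that later changes in the source do not leak into the warehouse snapshot, which is what makes the snapshot usable as the comparison baseline for the next run.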
Data initialization is relatively simple, but continuing to use full extraction in routine scheduled runs would consume a great deal of resources, so incremental extraction must be used. First, because Table 1 satisfies the prerequisite for incremental extraction, it is extracted incrementally based on timestamps. Every record in this increment belongs to a data switch that has changed since the last extraction, so the data corresponding to those switches must be re-extracted and re-archived. In addition, the business data corresponding to data switches that were not closed at the last extraction may also have changed, so that data must be re-extracted and re-archived as well. This is the strategy for daily incremental extraction. The steps of one daily incremental extraction are as follows:
(1) Table 1 is extracted incrementally based on timestamps and loaded into the data warehouse.
(2) The data extraction controller sets the extraction mode to incremental extraction. The combinations of accounting month and cost domain that correspond to the data extracted in step (1) are marked as needing re-extraction and re-archiving, and the combinations whose data switch was not closed at the last extraction are marked in the same way. The set of cost domain and accounting month combinations that need re-extraction and re-archiving is denoted set A.
(3) For Table 2, since there is no physical deletion, the more efficient timestamp-based incremental extraction can be used: the data is extracted into the data warehouse through the data interface layer, and then the portion whose accounting month and cost domain combination belongs to set A is incrementally re-archived.
(4) For Table 3, since physical deletion occurs, timestamp-based incremental extraction cannot be used. The detail documents in Table 3 whose accounting month and cost domain combination belongs to set A are incrementally extracted into the data warehouse through the data interface layer, and this data is re-archived to the data tables of the data presentation layer.
(5) The newly changed data extracted incrementally in step (1) is applied to the last-extraction data switch state table, so that it is consistent with the business database; it is retained as the state of the data switches for this run, for use next time.
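The five daily steps can be sketched end to end. This is an illustrative sketch rather than the patented implementation: the dictionary layout, the field names (`month`, `domain`, `material`, `deleted`, `ts`) and the `last_run` timestamp are assumptions, and combinations that have no switch record at all are left out for brevity.

```python
from typing import Set, Tuple

Key = Tuple[str, str]  # (accounting month, cost domain)


def key_of(row: dict) -> Key:
    return (row["month"], row["domain"])


def daily_increment(source: dict, warehouse: dict, presentation: dict,
                    last_run: int) -> Set[Key]:
    """Run one daily incremental extraction and return set A."""
    # Step (1): timestamp-based increment of the switch table (Table 1).
    changed = [r for r in source["table1"] if r["ts"] > last_run]

    # Step (2): set A = switches that changed + switches not closed last time.
    set_a = {key_of(r) for r in changed}
    set_a |= {k for k, r in warehouse["last_switch"].items() if r["deleted"]}

    # Step (3): Table 2 has no physical deletes, so a timestamp increment is safe;
    # only the rows that fall in set A are re-archived.
    for r in source["table2"]:
        if r["ts"] > last_run:
            warehouse["table2"][(key_of(r), r["material"])] = dict(r)
    kept = {k: v for k, v in presentation["table2"].items() if k[0] not in set_a}
    kept.update({k: dict(v) for k, v in warehouse["table2"].items() if k[0] in set_a})
    presentation["table2"] = kept

    # Step (4): Table 3 has physical deletes, so replace every row in set A wholesale.
    fresh = [dict(r) for r in source["table3"] if key_of(r) in set_a]
    for layer in (warehouse, presentation):
        layer["table3"] = [r for r in layer["table3"] if key_of(r) not in set_a]
        layer["table3"] += [dict(r) for r in fresh]

    # Step (5): bring the saved switch state in line with the business database.
    for r in changed:
        warehouse["last_switch"][key_of(r)] = dict(r)
    return set_a
```

Replacing the set A slice of Table 3 wholesale is what removes physically deleted documents from the warehouse, while closed and unchanged periods are never touched.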
The technical solution of the present invention uses the data switches in the business system as the basis for incremental data extraction. In the three cases where a switch is open, newly closed, or closed again after having been reopened, the corresponding data is re-extracted, filtered and archived. In this way, incremental extraction remains possible even when the business system uses physical deletion, the incrementally extracted data is highly accurate, and the impact on the performance of the source business system is small. At the same time, the volume of incrementally extracted data is small, so extraction performance is high.
The technical solution of the present invention has been explained in detail above with reference to the accompanying drawings. The related art lacks a simple, unified solution for extracting data of complex types, and existing data extraction cannot complete an extraction process in which complex types participate. The present invention therefore proposes a data extraction device and a data extraction method which, on the basis of existing data extraction approaches, make full use of single object types to complete the data extraction of multiple object types, establishing a general, unified approach to data extraction in which multiple object types participate.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention. For those skilled in the art, the invention may be modified and varied in various ways. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (5)

1. A data extraction device, characterized by comprising:
a switch state loading unit, configured to load the switch state of a control switch at the time of the current data extraction, and to save the switch state of the control switch at the time of the last data extraction;
a data extraction controller, configured to determine, according to the loaded switch states, whether the type of the current data extraction is a full extraction type or an incremental extraction type, and, when the type of the current data extraction is the incremental extraction type, to calculate the specified data that needs to be extracted, filtered and archived;
the data extraction controller specifically comprising:
an extraction type judging module, configured to determine, according to the loaded switch states, whether the type of the current data extraction is the full extraction type or the incremental extraction type;
a data calculation module, configured to calculate, when the type of the current data extraction is the incremental extraction type, the specified data that needs to be extracted, filtered and archived;
a data extraction unit, configured to perform data extraction according to the calculated specified data that needs to be extracted, filtered and archived;
the data extraction unit specifically comprising:
an initialization full extraction module, configured to perform the initialization full extraction of data according to the calculated specified data that needs to be extracted, filtered and archived;
a daily incremental extraction module, configured to perform the daily incremental extraction of data based on the data extracted by the initialization full extraction;
wherein the operation by which the initialization full extraction module performs the initialization full extraction of data further comprises:
(1) loading the current state of the switch into the last-extraction state;
(2) setting data extraction control to full mode;
(3) extracting the business data in full and performing the initialization of the data warehouse;
and/or
wherein the operation by which the daily incremental extraction module performs the daily incremental extraction of data further comprises:
(1) loading the current switch state into the current-extraction state position;
(2) calculating, from the current state and the last-extraction state, the time points for which incremental data extraction is needed, the comparison principles being as follows: if the switch is currently open, the corresponding data must be re-extracted and re-archived; if the switch is currently closed but was open at the last extraction, the corresponding data must be re-extracted and re-archived; if the switch is currently closed and was also closed at the last extraction, but was opened in between, the corresponding data must be re-extracted and re-archived;
or, in the three cases where the switch is open, newly closed, or closed again after having been reopened, re-extracting, filtering and archiving the corresponding data;
(3) incrementally extracting, filtering and loading data according to the calculated data extraction times;
(4) dumping the current switch state data into the last-extraction state position, for use at the next extraction.
2. The data extraction device according to claim 1, characterized in that the switch state loading unit specifically comprises:
a state loading module, configured to load the switch state of the control switch at the time of the current data extraction;
a state saving module, configured to retrieve and save, based on the loaded switch state of the control switch at the time of the current data extraction, the switch state of the control switch at the time of the last data extraction.
3. A data extraction method, characterized by comprising:
Step 202: loading the switch state of a control switch at the time of the current data extraction, and saving the switch state of the control switch at the time of the last data extraction;
Step 204: determining, according to the loaded switch states, whether the type of the current data extraction is a full extraction type or an incremental extraction type, and, when the type of the current data extraction is the incremental extraction type, calculating the specified data that needs to be extracted, filtered and archived;
Step 206: performing data extraction according to the calculated specified data that needs to be extracted, filtered and archived;
wherein step 206 specifically comprises:
Step 502: performing the initialization full extraction of data according to the calculated specified data that needs to be extracted, filtered and archived;
Step 504: performing the daily incremental extraction of data based on the data extracted by the initialization full extraction;
wherein the operation of performing the initialization full extraction of data in step 502 further comprises:
(1) loading the current state of the switch into the last-extraction state;
(2) setting data extraction control to full mode;
(3) extracting the business data in full and performing the initialization of the data warehouse;
and/or
wherein the operation of performing the daily incremental extraction of data in step 504 further comprises:
(1) loading the current switch state into the current-extraction state position;
(2) calculating, from the current state and the last-extraction state, the time points for which incremental data extraction is needed, the comparison principles being as follows: if the switch is currently open, the corresponding data must be re-extracted and re-archived; if the switch is currently closed but was open at the last extraction, the corresponding data must be re-extracted and re-archived; if the switch is currently closed and was also closed at the last extraction, but was opened in between, the corresponding data must be re-extracted and re-archived;
or, in the three cases where the switch is open, newly closed, or closed again after having been reopened, re-extracting, filtering and archiving the corresponding data;
(3) incrementally extracting, filtering and loading data according to the calculated data extraction times;
(4) dumping the current switch state data into the last-extraction state position, for use at the next extraction.
4. The data extraction method according to claim 3, characterized in that step 202 specifically comprises:
Step 302: loading the switch state of the control switch at the time of the current data extraction;
Step 304: based on the loaded switch state of the control switch at the time of the current data extraction, retrieving and saving the switch state of the control switch at the time of the last data extraction.
5. The data extraction method according to claim 3 or 4, characterized in that step 204 specifically comprises:
Step 402: determining, according to the loaded switch states, whether the type of the current data extraction is the full extraction type or the incremental extraction type;
Step 404: when the type of the current data extraction is the incremental extraction type, calculating the specified data that needs to be extracted, filtered and archived.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410750223.XA CN104361133B (en) 2014-12-10 2014-12-10 Data pick-up device and method

Publications (2)

Publication Number Publication Date
CN104361133A CN104361133A (en) 2015-02-18
CN104361133B true CN104361133B (en) 2019-06-21

Family

ID=52528393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410750223.XA Active CN104361133B (en) 2014-12-10 2014-12-10 Data pick-up device and method

Country Status (1)

Country Link
CN (1) CN104361133B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100094 Beijing city Haidian District North Road No. 68, UFIDA Software Park

Applicant after: Yonyou Network Technology Co., Ltd.

Address before: 100094 Beijing city Haidian District North Road No. 68, UFIDA Software Park

Applicant before: UFIDA Software Co., Ltd.

COR Change of bibliographic data
GR01 Patent grant