CN108228908A - Data extraction method and device - Google Patents
Data extraction method and device
- Publication number
- CN108228908A (application CN201810132705.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- pick
- data pick
- mode
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data extraction method and device. The method includes: parsing an obtained data extraction task of a source system, and generating a data extraction list corresponding to the task according to data partition granularity; determining the data extraction mode for the source system according to the data volume in the data extraction list; when extracting according to a first preset data extraction mode, extracting the data of each data partition, generating a first data file, and saving the first data file to a target system; when extracting according to a second preset data extraction mode, extracting data from the source system and saving the extracted data file to the target system. The invention extracts data tables partition by partition, which improves data extraction efficiency and reduces data extraction errors.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a data extraction method and device.
Background
With the development of Internet technology, more and more systems need to exchange and use one another's data, which requires the data of certain systems to be extracted and imported into or exported to the corresponding target system.
Existing data extraction schemes are usually implemented by the following steps: identify the range of source system data tables to be exported; write the export and import statements; export the DMP files and pass them to the target system, or import the DMP files. The whole process requires operators to control and execute each step, and this manual intervention makes data extraction inefficient. In addition, exporting and importing a large number of data tables is error-prone: when hundreds or even thousands of tables are exported at once and the exported DMP file is faulty, none of the tables can be exported and imported successfully, so both efficiency and accuracy are low. Moreover, because each data table has a different structure, the data extraction statements in existing schemes cannot extract only the portion of data that meets the requirement from all tables, which leads to data extraction errors.
Summary of the invention
In view of the above problems, the present invention provides a data extraction method and device that extract data tables partition by partition, thereby improving data extraction efficiency and reducing data extraction errors.
To achieve the above purpose, the present invention provides the following technical solutions:
A data extraction method, including:
parsing an obtained data extraction task of a source system, and generating a data extraction list corresponding to the data extraction task according to data partition granularity;
determining the data extraction mode for the source system according to the data volume in the data extraction list;
when extracting data according to a first preset data extraction mode, extracting the data of each data partition, generating a first data file, and saving the first data file to a target system;
when extracting data according to a second preset data extraction mode, extracting data from the source system and saving the extracted data file to the target system.
Preferably, parsing the obtained data extraction task of the source system and generating the data extraction list corresponding to the data extraction task according to data partition granularity includes:
parsing the obtained data extraction task to obtain configuration information corresponding to the data extraction task;
partitioning the data extraction task according to the configuration information, and generating, from the data extraction task of each partition, a data extraction subtask corresponding to that partition;
generating the data extraction list from the data extraction subtasks.
Preferably, before the data file is saved to the target system, the method further includes:
judging whether a partition corresponding to the data file exists in the target system; if not, adding a partition corresponding to the data file in the target system;
if so, finding the target partition corresponding to the data file in the target system and emptying the data in the target partition.
Preferably, determining the data extraction mode for the source system according to the data volume in the data extraction list includes:
judging whether the data volume in the data extraction list is greater than a preset data volume threshold; if so, extracting data from the source system using the first preset data extraction mode; otherwise, extracting data using the second preset data extraction mode;
the first data extraction mode is the DMP file extraction mode, and the second data extraction mode is the DBLINK data extraction mode.
Preferably, saving the first data file to the target system further includes:
transmitting the first data file to the target system in parallel, at a preset degree of parallelism, using a preset data transfer mode;
judging whether the target system has successfully obtained the first data file in parallel according to the data extraction list; if so, continuing to upload the first data file; if not, judging whether the first data file contains an error.
A data extraction device, including:
a generation module, configured to parse an obtained data extraction task of a source system and generate a data extraction list corresponding to the data extraction task according to data partition granularity;
a determining module, configured to determine the data extraction mode for the source system according to the data volume in the data extraction list;
a first extraction module, configured to, when extracting data according to a first preset data extraction mode, extract the data of each data partition, generate a first data file, and save the first data file to a target system;
a second extraction module, configured to, when extracting data according to a second preset data extraction mode, extract data from the source system and save the extracted data file to the target system.
Preferably, the generation module includes:
a parsing unit, configured to parse the obtained data extraction task to obtain configuration information corresponding to the data extraction task;
a partitioning unit, configured to partition the data extraction task according to the configuration information and generate, from the data extraction task of each partition, a data extraction subtask corresponding to that partition;
a generation unit, configured to generate the data extraction list from the data extraction subtasks.
Preferably, for use before the data file is saved to the target system, the device further includes:
a judgment module, configured to judge whether a partition corresponding to the data file exists in the target system; if not, to add a partition corresponding to the data file in the target system;
if so, to find the target partition corresponding to the data file in the target system and empty the data in the target partition.
Preferably, the determining module includes:
a capacity judging unit, configured to judge whether the data volume in the data extraction list is greater than a preset data volume threshold; if so, to extract data from the source system using the first preset data extraction mode; otherwise, to extract data using the second preset data extraction mode;
the first data extraction mode is the DMP file extraction mode, and the second data extraction mode is the DBLINK data extraction mode.
Preferably, for saving the first data file to the target system, the device further includes:
a transmission unit, configured to transmit the first data file to the target system in parallel, at a preset degree of parallelism, using a preset data transfer mode;
an acquisition judging unit, configured to judge whether the target system has successfully obtained the first data file in parallel according to the data extraction list; if so, to continue uploading the first data file; if not, to judge whether the first data file contains an error.
Compared with the prior art, the data extraction method and device provided by the present invention parse the obtained data extraction task of the source system, split the task into individual subtasks according to partition granularity, and finally generate a task extraction list. Data can then be extracted (that is, imported or exported) in parallel, which solves the prior-art problem of low extraction efficiency caused by extracting all the data to be extracted at once. Two data extraction modes are also provided, so the mode that matches the data volume can be used. This improves extraction efficiency, requires no manual intervention, keeps the data of the source system and the target system synchronized, and reduces the impact of data extraction errors.
Description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow diagram of a data extraction method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of another data extraction method provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of a data extraction device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first" and "second" in the description, the claims and the drawings are used to distinguish different objects, not to describe a specific order. The terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or units is not limited to the listed steps or units, but may include steps or units that are not listed.
An embodiment of the present invention provides a data extraction method. Referring to Fig. 1, the method may include the following steps:
S11, parsing an obtained data extraction task of a source system, and generating a data extraction list corresponding to the data extraction task according to data partition granularity;
Another embodiment of the present invention further provides a method for generating the data extraction list, which may include the following steps:
parsing the obtained data extraction task to obtain configuration information corresponding to the data extraction task;
partitioning the data extraction task according to the configuration information, and generating, from the data extraction task of each partition, a data extraction subtask corresponding to that partition;
generating the data extraction list from the data extraction subtasks.
The data extraction task configured at the front end is parsed. A data extraction task is generated by front-end operators, who select the table range, date range, region range and so on as needed. In the embodiments of the present invention, data extraction may specifically be data export or data import.
The back end can then use preset code, such as p_gen_task_file, to parse the configuration information of the data extraction task and generate one export subtask per subpartition, finally producing the data extraction list; that is, the data extraction list contains multiple data extraction subtasks.
It should be noted that p_gen_task_file is the program that parses the import task configured at the front end and generates the subtasks (file list). The program that actually performs the export is exp_process.sh. This shell script reads the subtasks (file list) and, according to the configured degree of parallelism, invokes expdp for each subtask to export a dmp file. For example, if the degree of parallelism is set to 10, ten expdp statements run in the background at the same moment and generate 10 dmp files. The import program is imp_process.sh; it reads the subtasks (file list) and, according to the configured degree of parallelism, invokes impdp for each subtask to perform the import.
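The bounded parallelism described for exp_process.sh and imp_process.sh can be sketched as follows. This is a minimal illustration in Python, not the patent's shell scripts: the function name run_subtasks and the worker callable are hypothetical, and the worker stands in for whatever actually invokes expdp or impdp for one subtask.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subtasks(subtasks, worker, parallelism=10):
    """Run worker(subtask) for every subtask, at most `parallelism` at a time.

    `worker` is a placeholder for the real export or import call (assumption).
    Returns a dict mapping each subtask to "done" or "failed: <reason>".
    """
    results = {}
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = {pool.submit(worker, st): st for st in subtasks}
        for future, subtask in futures.items():
            try:
                future.result()  # re-raises any exception from the worker
                results[subtask] = "done"
            except Exception as exc:
                # One failing subtask does not stop the others.
                results[subtask] = f"failed: {exc}"
    return results
```

Because each subtask is submitted independently, a failure in one partition is recorded without affecting the export or import of the others.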
S12, determining the data extraction mode for the source system according to the data volume in the data extraction list;
An embodiment of the present invention further provides a method for determining the data extraction mode, which may include:
judging whether the data volume in the data extraction list is greater than a preset data volume threshold; if so, extracting data from the source system using the first preset data extraction mode; otherwise, extracting data using the second preset data extraction mode;
the first data extraction mode is the DMP file extraction mode, and the second data extraction mode is the DBLINK data extraction mode.
The data extraction mode is selected according to the data volume of the data extraction list: DBLINK mode is used when the data volume is small, and DMP file mode is used when the data volume is large. Whether the data volume is large or small is judged against the preset threshold. For example, a data volume below 100M is generally considered small; the small tables can be configured in a table for the program to use.
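The threshold rule above can be written as a one-line decision. The sketch below is illustrative only: the function and constant names are hypothetical, and reading "100M" as 100 MiB of bytes is an assumption, since the text does not state the unit precisely.

```python
# "100M" from the text, interpreted here as 100 MiB (assumption).
SMALL_TABLE_THRESHOLD_BYTES = 100 * 1024 * 1024


def choose_mode(data_bytes, threshold=SMALL_TABLE_THRESHOLD_BYTES):
    """Return "DMP" when the volume exceeds the threshold, else "DBLINK"."""
    return "DMP" if data_bytes > threshold else "DBLINK"
```

A volume exactly equal to the threshold falls into DBLINK mode here, following the "greater than ... otherwise" wording of the method.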
S13, when extracting data according to the first preset data extraction mode, extracting the data of each data partition, generating a first data file, and saving the first data file to a target system;
S14, when extracting data according to the second preset data extraction mode, extracting data from the source system and saving the extracted data file to the target system.
For example, when data is extracted, the export list (EXP_TASK_FILES) is designed as follows: its primary key is DMP_TASK_ID and DMP_FILENAME. The table is created in the source system; it provides export information and records the state of the exported files, with each record corresponding to one partition or one table (for a non-partitioned table). EXP_TASK_FILES is populated by the target system's program through DBLINK. When the target system and the source system cannot use DBLINK, the two systems need to agree on rules, and the source system generates the DMP files itself according to the agreed rules.
The data extraction method provided by the present invention parses the obtained data extraction task of the source system, splits the task into individual subtasks according to partition granularity, and finally generates a task extraction list. Data can then be extracted (that is, imported or exported) in parallel, which solves the prior-art problem of low extraction efficiency caused by extracting all the data to be extracted at once. Two data extraction modes are also provided, so the mode that matches the data volume can be used. This improves extraction efficiency, requires no manual intervention, keeps the data of the source system and the target system synchronized, and reduces the impact of data extraction errors.
An embodiment of the present invention further provides a partition adding and clearing method, which may include:
judging whether a partition corresponding to the data file exists in the target system; if not, adding a partition corresponding to the data file in the target system;
if so, finding the target partition corresponding to the data file in the target system and emptying the data in the target partition.
After a data file is generated, the partition corresponding to it may not exist in the target system, so it is necessary to judge whether the partition exists. If it does not exist, it must be added in the target system; if it does exist, it must be cleared, deleting all of the partition's original data. For a non-partitioned table, the whole table's data is emptied before the data file is imported.
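The add-or-empty rule can be illustrated with an in-memory stand-in for the target system. Everything here is hypothetical: target_tables is a plain dict, not a database, and prepare_partition is an illustrative name; a real implementation would issue the corresponding partition DDL against the target database instead.

```python
def prepare_partition(target_tables, table, partition=None):
    """Make sure `partition` of `table` exists and is empty before an import.

    target_tables maps table name -> {partition name -> list of rows}.
    A non-partitioned table is modeled as a single partition named None,
    so emptying it empties the whole table.
    Returns "added" or "emptied" to show which branch was taken.
    """
    partitions = target_tables.setdefault(table, {})
    if partition not in partitions:
        partitions[partition] = []   # partition missing: add it
        return "added"
    partitions[partition].clear()    # partition present: delete old data
    return "emptied"
```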
Non-partitioned tables basically all have small data volumes. Such tables can generally be extracted in dblink mode; dmp file mode can also be used, in which case only one dmp is generated. For tables without subpartitions, subtasks (file list) are generated at the finest granularity available. A non-partitioned table yields exactly one subtask (file list entry), namely the table itself. A table with single-level partitioning (not composite partitioning, that is, without subpartitions) yields as many subtasks (file list entries) as it has partitions. A table with composite partitioning yields as many subtasks (file list entries) as it has subpartitions.
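The three granularity cases can be sketched as a small enumeration function. The function name and the way partition metadata is passed in (None, a list, or a dict) are assumptions made for illustration; the patent does not specify how p_gen_task_file represents this information.

```python
def enumerate_subtasks(table, partitions=None):
    """List (table, partition, subpartition) subtasks at the finest granularity.

    partitions is:
      None or empty -> non-partitioned table: one subtask, the table itself
      list of names -> single-level partitioning: one subtask per partition
      dict name -> list of subpartition names -> composite partitioning:
                       one subtask per subpartition
    """
    if not partitions:
        return [(table, None, None)]
    if isinstance(partitions, dict):
        return [(table, part, sub)
                for part, subs in partitions.items()
                for sub in subs]
    return [(table, part, None) for part in partitions]
```

For example, a composite-partitioned table with two partitions of three subpartitions each produces six subtasks, each of which would later become one DMP file.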
An embodiment of the present invention further provides a data file transmission method, which may include:
transmitting the first data file to the target system in parallel, at a preset degree of parallelism, using a preset data transfer mode;
judging whether the target system has successfully obtained the first data file in parallel according to the data extraction list; if so, continuing to upload the first data file; if not, judging whether the first data file contains an error.
In the embodiments of the present invention, the preset data transfer mode is FTP (File Transfer Protocol), which is used to transmit the dmp files to the target system. Specifically, the generated DMP files are sent to the target system by FTP in parallel according to the configured degree of parallelism. If the FTP transfer succeeds, the file state in imp_task_files is set to ready for use by the import function:
When the target system and the source system can use DBLINK, the source system updates the file state to "ready" through DBLINK;
When the target system and the source system cannot use DBLINK, the target system checks whether an empty file named DMP_TASK_ID + source system name exists in its receiving directory, and thereby determines whether the tables of the source systems involved in the DMP_TASK_ID have all been received. If the file exists, the target system sets the state of the files of all source systems under the task to "ready".
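The marker-file check for the case without DBLINK can be sketched as follows. The function names are hypothetical, and joining DMP_TASK_ID and the source system name with no separator is an assumption; the text only says the file is named "DMP_TASK_ID + source system name".

```python
from pathlib import Path


def marker_name(dmp_task_id, source_system):
    # Assumed naming rule: task id immediately followed by the system name.
    return f"{dmp_task_id}{source_system}"


def all_sources_ready(recv_dir, dmp_task_id, source_systems):
    """True if every source system has left its empty marker file in recv_dir."""
    recv = Path(recv_dir)
    for name in source_systems:
        marker = recv / marker_name(dmp_task_id, name)
        # The marker must exist and be empty, as described in the text.
        if not (marker.is_file() and marker.stat().st_size == 0):
            return False
    return True
```

Only when this check passes would the target system mark the task's files as "ready" for the import function.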
The data import function in the data extraction process is divided into DMP file mode and DBLINK mode:
DMP file mode:
The target system processes the corresponding partition for each file whose state in the import file list is "ready", and performs the import according to the configured degree-of-parallelism parameter;
When the data volume is large, DMP file mode is selected. Each subpartition is exported to one DMP file, and a non-partitioned table exports its whole-table data to one DMP file. The source system passes the DMP files to the target system using the preset transfer mode, FTP. The advantage of DMP file mode is that import and export support resuming from a breakpoint: if an export or import reports an error, the program automatically identifies it and performs the export or import again. Because resuming is supported, the batch peak period can be avoided, which makes file import and export more efficient.
DBLINK mode:
When the target system can connect to the source system through DBLINK, DBLINK-mode data extraction is provided. The specific configuration data are: cfg_value is the table name, part_col is the partition column, and subpart_col is the subpartition column. Based on this configuration information and the generated import file list, these items are spliced into select statements. According to the configured degree of parallelism, dbms_scheduler.create_job runs the different select statements in parallel to extract data from the source system to the target system.
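How the configuration fields might be spliced into one select statement per file-list entry is sketched below. The exact statement shape is not given in the source, so this is an assumption: the function name, the database link name passed as dblink, and the bind-variable names are all hypothetical. Only cfg_value, part_col and subpart_col come from the text.

```python
def build_select(cfg_value, dblink, part_col=None, part_value=None,
                 subpart_col=None, subpart_value=None):
    """Build a SELECT over a database link for one partition or subpartition.

    Returns (sql, binds). Partition values are passed as bind variables
    rather than being pasted into the SQL text.
    """
    sql = f"SELECT * FROM {cfg_value}@{dblink}"
    conditions, binds = [], {}
    if part_col is not None:
        conditions.append(f"{part_col} = :part_value")
        binds["part_value"] = part_value
    if subpart_col is not None:
        conditions.append(f"{subpart_col} = :subpart_value")
        binds["subpart_value"] = subpart_value
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, binds
```

Each generated statement corresponds to one entry of the import file list, so the statements are independent and can be scheduled in parallel up to the configured degree of parallelism.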
An embodiment of the present invention further provides another data extraction method. Referring to Fig. 2, it mainly includes:
S21, a data import configuration step;
S22, an import configuration parsing step;
S23, a data export step;
S24, a file transmission step;
S25, a data import step;
S26, a front-end display step.
In the S21 data import configuration step, the data import task is configured using the data table range, date range and province range provided by the front-end interface;
In the S22 import configuration parsing step, the data import configuration is parsed according to the partition information processed by the back end, producing the corresponding export and import file list, in which each partition or each table (for a non-partitioned table) corresponds to one record;
In the S23 data export step, the export is performed in parallel according to the export list, and each record generates one DMP file;
In the S24 file transmission step, the generated DMP files are sent to the target system by FTP, and the file state is updated;
In the S25 data import step, according to the import file list, either the DMP files are imported or the data is imported directly from the source system through DBLINK;
In the S26 front-end display step, the front end can display the execution state of the configured import task and provide the list of partitions already imported for each table.
The whole traditional data extraction process requires manual intervention. From determining the data range and writing the execution statements through performing the export, generating the DMP files and performing the import, operators must carry out the entire process step by step, which is time-consuming, laborious and inefficient. The embodiments of the present invention only require the data range to be configured at the front end; all subsequent operations are automated by programs without manual intervention.
Exporting and importing a large number of data tables is error-prone. When hundreds or even thousands of tables are exported at once and the exported DMP file is faulty, none of the tables can be exported and imported successfully, and the whole export and import must be re-executed. When export and import are performed by the present invention, the exported files are at the finest granularity, the subpartition level: each subpartition generates one DMP file, and the program passes the DMP files to the target system one by one. On the target system side, a daemon program polls for new files and imports them automatically when they appear. Export and import parallelism is controlled by a parameter: after one file has been imported successfully, a new import process is started automatically, which keeps the parallelism within the configured value and avoids a system crash caused by excessive load from high parallelism. The export and import of different files do not affect one another, so a problem in an individual partition does not affect the export and import of other partitions.
The embodiments of the present invention can determine whether a file has finished importing by checking whether the export or import process still exists, and then automatically read the generated log file and search for error keywords to determine whether the export or import succeeded. Because the finest granularity is the subpartition level, a data extraction requirement is split into many subpartition exports and imports, so part of a table's data can be extracted without exporting and importing the whole table. And because table structures differ from day to day, the data extraction program synchronizes automatically between the source system and the target system, so that the data extraction process is efficient and concise and needs no manual intervention.
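The log check mentioned above can be sketched as a keyword scan. The function name and the default keyword list are assumptions: "ORA-" is the standard prefix of Oracle error codes, but the patent does not say which keywords its program searches for.

```python
def log_indicates_success(log_text, error_keywords=("ORA-",)):
    """Return True if none of the error keywords appears in the log text.

    The default keyword is an assumption; pass the keywords that the
    actual export or import tool writes on failure.
    """
    lowered = log_text.lower()
    return not any(keyword.lower() in lowered for keyword in error_keywords)
```

Such a check would run only after the export or import process has exited, so that the log file is complete when it is read.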
Corresponding to the data extraction method provided by the embodiments of the present invention, an embodiment of the present invention further provides a data extraction device. Referring to Fig. 3, it includes:
a generation module 1, configured to parse an obtained data extraction task of a source system and generate a data extraction list corresponding to the data extraction task according to data partition granularity;
a determining module 2, configured to determine the data extraction mode for the source system according to the data volume in the data extraction list;
a first extraction module 3, configured to, when extracting data according to a first preset data extraction mode, extract the data of each data partition, generate a first data file, and save the first data file to a target system;
a second extraction module 4, configured to, when extracting data according to a second preset data extraction mode, extract data from the source system and save the extracted data file to the target system.
Optionally, in another embodiment of the present invention, the generation module includes:
a parsing unit, configured to parse the obtained data extraction task to obtain configuration information corresponding to the data extraction task;
a partitioning unit, configured to partition the data extraction task according to the configuration information and generate, from the data extraction task of each partition, a data extraction subtask corresponding to that partition;
a generation unit, configured to generate the data extraction list from the data extraction subtasks.
Optionally, in another embodiment of the present invention, for use before the data file is saved to the target system, the device further includes:
a judgment module, configured to judge whether a partition corresponding to the data file exists in the target system; if not, to add a partition corresponding to the data file in the target system;
if so, to find the target partition corresponding to the data file in the target system and empty the data in the target partition.
Optionally, in another embodiment of the present invention, the determining module includes:
a capacity judging unit, configured to judge whether the data volume in the data extraction list is greater than a preset data volume threshold; if so, to extract data from the source system using the first preset data extraction mode; otherwise, to extract data using the second preset data extraction mode;
the first data extraction mode is the DMP file extraction mode, and the second data extraction mode is the DBLINK data extraction mode.
Optionally, in another embodiment of the present invention, for saving the first data file to the target system, the device further includes:
a transmission unit, configured to transmit the first data file to the target system in parallel, at a preset degree of parallelism, using a preset data transfer mode;
an acquisition judging unit, configured to judge whether the target system has successfully obtained the first data file in parallel according to the data extraction list; if so, to continue uploading the first data file; if not, to judge whether the first data file contains an error.
The data extraction device provided by the present invention parses the obtained data extraction task of the source system, splits the task into individual subtasks according to partition granularity, and finally generates a task extraction list. Data can then be extracted (that is, imported or exported) in parallel, which solves the prior-art problem of low extraction efficiency caused by extracting all the data to be extracted at once. Two data extraction modes are also provided, so the mode that matches the data volume can be used. This improves extraction efficiency, requires no manual intervention, keeps the data of the source system and the target system synchronized, and reduces the impact of data extraction errors.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments can be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
- 1. A data extraction method, characterized by comprising: parsing an acquired data extraction task of a source system, and generating, according to data partition granularity, a data extraction list corresponding to the data extraction task; determining a data extraction mode for the source system according to the data capacity in the data extraction list; when data extraction is performed according to a first preset data extraction mode, extracting the data of each data partition, generating a first data file, and saving the first data file to a target system; when data extraction is performed according to a second preset data extraction mode, performing data extraction on the source system and saving the extracted data file to the target system.
- 2. The method according to claim 1, characterized in that parsing the acquired data extraction task of the source system and generating, according to data partition granularity, the data extraction list corresponding to the data extraction task comprises: parsing the acquired data extraction task to obtain configuration information corresponding to the data extraction task; partitioning the data extraction task according to the configuration information, and generating, from the data extraction task of each partition, a data extraction subtask corresponding to that partition; generating the data extraction list from the data extraction subtasks.
- 3. The method according to claim 1, characterized in that, before a data file is saved to the target system, the method further comprises: judging whether a partition corresponding to the data file exists in the target system; if not, adding a partition corresponding to the data file to the target system; if so, locating the target partition corresponding to the data file in the target system and emptying the data in the target partition.
- 4. The method according to claim 1, characterized in that determining the data extraction mode for the source system according to the data capacity in the data extraction list comprises: judging whether the data capacity in the data extraction list is greater than a preset data volume threshold; if so, performing data extraction on the source system using the first preset data extraction mode; otherwise, performing data extraction using the second preset data extraction mode; wherein the first data extraction mode is a DMP file extraction mode, and the second data extraction mode is a DBLIK data extraction mode.
- 5. The method according to claim 1, characterized in that, when the first data file is saved to the target system, the method further comprises: transmitting the first data file to the target system in parallel, using a preset data transfer mode and a preset degree of parallelism; judging whether the target system has successfully obtained the first data file in parallel according to the data extraction list; if so, continuing to upload the first data file; if not, judging whether the first data file contains an error.
- 6. A data extraction device, characterized by comprising: a generation module, configured to parse an acquired data extraction task of a source system and to generate, according to data partition granularity, a data extraction list corresponding to the data extraction task; a determining module, configured to determine a data extraction mode for the source system according to the data capacity in the data extraction list; a first extraction module, configured to, when data extraction is performed according to a first preset data extraction mode, extract the data of each data partition, generate a first data file, and save the first data file to a target system; a second extraction module, configured to, when data extraction is performed according to a second preset data extraction mode, perform data extraction on the source system and save the extracted data file to the target system.
- 7. The device according to claim 6, characterized in that the generation module comprises: a parsing unit, configured to parse the acquired data extraction task to obtain configuration information corresponding to the data extraction task; a partitioning unit, configured to partition the data extraction task according to the configuration information and to generate, from the data extraction task of each partition, a data extraction subtask corresponding to that partition; a generation unit, configured to generate the data extraction list from the data extraction subtasks.
- 8. The device according to claim 6, further comprising, for use before a data file is saved to the target system: a judgment module, configured to judge whether a partition corresponding to the data file exists in the target system; if not, to add a partition corresponding to the data file to the target system; if so, to locate the target partition corresponding to the data file in the target system and empty the data in the target partition.
- 9. The device according to claim 6, characterized in that the determining module comprises: a capacity judging unit, configured to judge whether the data capacity in the data extraction list is greater than a preset data volume threshold; if so, data extraction is performed on the source system using the first preset data extraction mode; otherwise, data extraction is performed using the second preset data extraction mode; wherein the first data extraction mode is a DMP file extraction mode, and the second data extraction mode is a DBLIK data extraction mode.
- 10. The device according to claim 6, characterized in that, for use when the first data file is saved to the target system, the device further comprises: a transmission unit, configured to transmit the first data file to the target system in parallel, using a preset data transfer mode and a preset degree of parallelism; an acquisition judging unit, configured to judge whether the target system has successfully obtained the first data file in parallel according to the data extraction list; if so, uploading of the first data file continues; if not, the unit judges whether the first data file contains an error.
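The partition check recited in claims 3 and 8 can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the target system is modeled as an in-memory dictionary that maps a partition name to its rows, and the names `prepare_target_partition` and `save_rows` are hypothetical.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the target system:
# partition name -> list of rows stored in that partition.
Target = Dict[str, List[dict]]


def prepare_target_partition(target: Target, partition: str) -> str:
    """Ensure an empty partition exists before a data file is saved.

    Adds the partition if it is missing; otherwise empties the data
    already held in it. Returns which of the two actions was taken.
    """
    if partition not in target:
        target[partition] = []
        return "added"
    target[partition].clear()
    return "emptied"


def save_rows(target: Target, partition: str, rows: List[dict]) -> None:
    """Prepare the partition, then store the extracted rows in it."""
    prepare_target_partition(target, partition)
    target[partition].extend(rows)
```

One consequence of emptying an existing partition first, at least in this sketch, is that saving the same data file twice leaves the partition with a single copy of the rows rather than duplicates.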
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810132705.7A CN108228908B (en) | 2018-02-09 | 2018-02-09 | Data extraction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108228908A (en) | 2018-06-29 |
CN108228908B (en) | 2021-11-12 |
Family
ID=62661325
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810132705.7A Active CN108228908B (en) | 2018-02-09 | 2018-02-09 | Data extraction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108228908B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108984738A (en) * | 2018-07-16 | 2018-12-11 | 中国银行股份有限公司 | A kind of data shop fixtures method and device |
CN110032559A (en) * | 2019-04-19 | 2019-07-19 | 成都四方伟业软件股份有限公司 | A kind of data pick-up method and device |
EP4160432A4 (en) * | 2020-05-27 | 2024-06-12 | Bcore | Data loading and processing system, and method therefor |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060173926A1 (en) * | 2000-07-06 | 2006-08-03 | Microsoft Corporation | Data transformation to maintain detailed user information in a data warehouse |
CN101216821A (en) * | 2007-01-05 | 2008-07-09 | 中兴通讯股份有限公司 | Data acquisition system storage management method |
CN101329676A (en) * | 2007-06-20 | 2008-12-24 | 华为技术有限公司 | Data paralleling abstracting method and apparatus and database system |
US7769648B1 (en) * | 2003-12-04 | 2010-08-03 | Drugstore.Com | Method and system for automating keyword generation, management, and determining effectiveness |
US9426219B1 (en) * | 2013-12-06 | 2016-08-23 | Amazon Technologies, Inc. | Efficient multi-part upload for a data warehouse |
CN107040608A (en) * | 2017-05-19 | 2017-08-11 | 宁波绮耘软件股份有限公司 | A kind of data processing method and system |
Non-Patent Citations (1)
Title |
---|
邓绪斌: "面向复杂数据源的数据抽取模型和算法研究", 《中国博士学位论文全文数据库》 * |
Also Published As
Publication number | Publication date |
---|---|
CN108228908B (en) | 2021-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||