CN105095384B - Method and apparatus for data carry-over - Google Patents

Publication number
CN105095384B
Authority
CN
China
Prior art keywords
data
source
carried over
primary key
carried
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510376695.8A
Other languages
Chinese (zh)
Other versions
CN105095384A (en)
Inventor
刘喜男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201510376695.8A
Publication of CN105095384A
Application granted
Publication of CN105095384B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support

Abstract

The present invention provides a method and apparatus for data carry-over that help improve the correctness of the carry-over. The method includes: determining whether data exists in an intermediate table; if so, taking the data in the source table that corresponds to the primary keys in the intermediate table as the data to be carried over; otherwise, saving the primary keys of pre-selected data in the source table to the intermediate table and taking the pre-selected data as the data to be carried over; sharding the data to be carried over and then starting multiple threads in a thread pool; the threads, without overlapping one another, each obtaining from the source table the data corresponding to primary keys in the intermediate table and carrying the obtained data over to a target table; and, after the threads have carried the obtained data over to the target table, determining whether the data to be carried over is consistent before and after the carry-over and outputting the determination result.

Description

Method and apparatus for data carry-over
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for data carry-over.
Background
As computer application systems continue to develop, the volume of data in databases keeps growing, so data carry-over schemes have become an important means of reducing database load. Data carry-over mainly refers to transferring some or all of the data in one database to another database.
Some current file systems provide schemes for data carry-over. For example, Hadoop with Sqoop can transfer data between the Hadoop Distributed File System (HDFS) and a relational database, so data can be moved between relational databases by way of HDFS. Specifically, Sqoop first carries the data over from a relational database (for example, a SQL Server database) into HDFS, and the data in HDFS is then carried over in the reverse direction into another relational database (for example, another SQL Server database).
Carry-over schemes built on current file systems can achieve very high data throughput, that is, each batch transfers a large volume of data; the system can also scale horizontally and transfer data across databases. Their main drawback is that it is difficult to guarantee that each piece of data is carried over correctly.
Summary of the invention
In view of this, the present invention provides a method and apparatus for data carry-over that help improve the correctness of the carry-over. Other objects and effects of the invention can be understood from the detailed description.
To achieve the above object, according to one aspect of the present invention, a method for data carry-over is provided.
The method for data carry-over of the present invention includes: determining whether data exists in an intermediate table; if so, taking the data in the source table that corresponds to the primary keys in the intermediate table as the data to be carried over; otherwise, saving the primary keys of pre-selected data in the source table to the intermediate table and taking the pre-selected data as the data to be carried over; sharding the data to be carried over and then starting multiple threads in a thread pool; the threads, without overlapping one another, each obtaining from the source table the data corresponding to primary keys in the intermediate table and then carrying the obtained data over to a target table; and, after the threads have carried the obtained data over to the target table, determining whether the data to be carried over is consistent before and after the carry-over and then outputting the determination result.
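The first decision in the method, resuming leftover keys or seeding the intermediate table, can be sketched as follows. This is a minimal illustration, assuming the intermediate table is modelled as an in-memory list of primary keys; the class name PendingKeys and the method select are hypothetical and do not come from the patent.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: the intermediate table is modelled as a list of primary keys. */
class PendingKeys {
    /**
     * Returns the primary keys to carry over in this run.
     * If the intermediate table still holds keys, an earlier run did not finish
     * and those keys are resumed; otherwise the pre-selected keys are saved first.
     */
    static List<Long> select(List<Long> intermediateTable, List<Long> preselectedKeys) {
        if (intermediateTable.isEmpty()) {
            // Nothing left over: record what this run intends to move.
            intermediateTable.addAll(preselectedKeys);
        }
        // Return a copy so callers cannot alter the table by accident.
        return new ArrayList<>(intermediateTable);
    }
}
```

Because the keys are recorded before any row is moved, a run that stops halfway leaves behind exactly the information the next run needs to continue.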
Optionally, the step of determining whether the data to be carried over is consistent before and after the carry-over includes: separately computing binary checksums of the data to be carried over in the source table and of the data carried over into the target table, yielding a source hash table and a target hash table respectively; and determining, according to whether the source hash table is consistent with the target hash table, whether the data to be carried over is consistent before and after the carry-over.
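To illustrate the two hash tables, the sketch below maps each primary key to a checksum of its row and compares the resulting tables. It rests on assumptions: rows are modelled as strings, and CRC32 is used purely as a stand-in for the binary checksum that the database would compute. The class name ChecksumTables is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Illustrative only: CRC32 stands in for the database-side binary checksum. */
class ChecksumTables {
    /** Maps each primary key to a checksum of the row stored under it. */
    static Map<Long, Long> build(Map<Long, String> rowsByKey) {
        Map<Long, Long> table = new HashMap<>();
        for (Map.Entry<Long, String> row : rowsByKey.entrySet()) {
            CRC32 crc = new CRC32();
            crc.update(row.getValue().getBytes(StandardCharsets.UTF_8));
            table.put(row.getKey(), crc.getValue());
        }
        return table;
    }

    /** The data is treated as consistent when both hash tables hold the same pairs. */
    static boolean consistent(Map<Long, String> sourceRows, Map<Long, String> targetRows) {
        return build(sourceRows).equals(build(targetRows));
    }
}
```

Comparing checksums instead of full rows keeps the comparison small, at the cost that two different rows could in principle share a checksum.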
Optionally, the source table and the target table are data tables in SQL Server databases, and the step of computing the binary checksums of the data to be carried over in the source table and of the data carried over into the target table includes: obtaining those binary checksums using the binary_checksum function.
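In SQL Server, BINARY_CHECKSUM(*) returns a checksum computed over the columns of a row. The sketch below only assembles a parameterized query that returns each primary key with its row checksum; it does not open a connection. The table name, key column, and the alias row_checksum are hypothetical, and the patent does not say how its query is written.

```java
import java.util.Collections;

/** Illustrative only: assembles the query text; names must come from trusted configuration. */
class ChecksumQuery {
    /** Builds a query returning (primary key, BINARY_CHECKSUM of the row) for keyCount keys. */
    static String build(String table, String keyColumn, int keyCount) {
        if (keyCount < 1) {
            throw new IllegalArgumentException("keyCount must be positive");
        }
        // One "?" placeholder per primary key, to be bound through a PreparedStatement.
        String placeholders = String.join(", ", Collections.nCopies(keyCount, "?"));
        return "SELECT " + keyColumn + ", BINARY_CHECKSUM(*) AS row_checksum FROM " + table
                + " WHERE " + keyColumn + " IN (" + placeholders + ")";
    }
}
```

Identifiers cannot be bound as parameters, which is why the table and column names are concatenated and should never come from untrusted input. Equal checksums are strong evidence that two rows match rather than proof, since different rows can produce the same value.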
Optionally, the step of determining, according to whether the source hash table is consistent with the target hash table, whether the data to be carried over is consistent before and after the carry-over includes: using the target hash table as the reference, operating on the source hash table with the Java removeAll statement; and, after the operation completes, if no data remains in the source hash table, determining that the data to be carried over is consistent before and after the carry-over, and otherwise determining that it is inconsistent.
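In Java, removeAll is defined on collections, so one way to apply it to key-to-checksum maps is through their entry sets. The sketch below assumes the hash tables are java.util.Map instances from primary key to checksum, which the patent does not specify; the class name HashTableCompare is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: hash tables are modelled as maps from primary key to checksum. */
class HashTableCompare {
    /** Returns the source entries that have no identical (key, checksum) pair in the target. */
    static Map<Long, Integer> leftover(Map<Long, Integer> sourceHashes,
                                       Map<Long, Integer> targetHashes) {
        Map<Long, Integer> remaining = new HashMap<>(sourceHashes); // leave the caller's map intact
        remaining.entrySet().removeAll(targetHashes.entrySet());    // drop every matching pair
        return remaining;
    }

    /** Consistent when nothing is left in the source hash table after removeAll. */
    static boolean consistent(Map<Long, Integer> sourceHashes, Map<Long, Integer> targetHashes) {
        return leftover(sourceHashes, targetHashes).isEmpty();
    }
}
```

Whatever remains identifies the rows that are missing or altered in the target, which is useful when reporting a failure. Note that this check is one-directional: extra rows in the target hash table are not detected.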
Optionally, after the step of outputting the determination result, the method further includes: if the result is consistent, deleting the data in the intermediate table and the data in the source table; if the result is inconsistent, outputting a prompt message.
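The follow-up action can be sketched as below, again with in-memory stand-ins: the intermediate table is a list of primary keys and the source table is a map from key to row. The class name PostVerification and the message texts are hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Illustrative only: in-memory stand-ins for the intermediate and source tables. */
class PostVerification {
    /** Cleans up after a verified carry-over, or reports the failure and keeps everything. */
    static String finish(boolean consistent, List<Long> intermediateTable,
                         Map<Long, String> sourceTable) {
        if (!consistent) {
            // Keep the keys so that a later run can pick the work up again.
            return "carry-over verification failed: " + intermediateTable.size()
                    + " keys kept for retry";
        }
        sourceTable.keySet().removeAll(intermediateTable); // delete the carried-over rows
        intermediateTable.clear();                          // nothing left to resume
        return "carry-over verified";
    }
}
```

Deleting only after a successful verification means the source rows remain available whenever the check fails.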
According to another aspect of the present invention, an apparatus for data carry-over is provided.
The apparatus for data carry-over of the present invention includes: a determination module, for determining whether data exists in an intermediate table; a primary key saving module, for saving the primary keys of pre-selected data in the source table to the intermediate table when no data exists in the intermediate table; a sharding module, for sharding the data in the source table that corresponds to the primary keys in the intermediate table when data exists in the intermediate table, and for sharding the pre-selected data in the source table when no data exists in the intermediate table; a thread pool module, for managing multiple threads in a thread pool, wherein the threads, without overlapping one another, each obtain from the source table the data corresponding to primary keys in the intermediate table and then carry the obtained data over to a target table; and a verification module, for determining, after the threads have carried the obtained data over to the target table, whether the data to be carried over is consistent before and after the carry-over, and then outputting the determination result.
Optionally, the verification module is further configured to: separately compute binary checksums of the data to be carried over in the source table and of the data carried over into the target table, yielding a source hash table and a target hash table respectively; and determine, according to whether the source hash table is consistent with the target hash table, whether the data to be carried over is consistent before and after the carry-over.
Optionally, the source table and the target table are data tables in SQL Server databases, and the verification module is further configured to obtain the binary checksums of the data to be carried over in the source table and of the data carried over into the target table using the binary_checksum function.
Optionally, the verification module is further configured to: using the target hash table as the reference, operate on the source hash table with the Java removeAll statement; and, after the operation completes, if no data remains in the source hash table, determine that the data to be carried over is consistent before and after the carry-over, and otherwise determine that it is inconsistent.
Optionally, the apparatus further includes a deletion module and a prompt module, wherein: the deletion module is configured to delete the data in the intermediate table and the data in the source table when the comparison result is consistent; and the prompt module is configured to output a prompt message when the comparison result is inconsistent.
According to the technical solution of the present invention, a multi-threading mechanism is used, which helps ensure the efficiency of the carry-over. On that basis, the data is verified for consistency before and after the carry-over, which ensures the correctness of the carried-over data. In addition, an intermediate table is used to save the primary keys of the data, which helps ensure that no data is missed and no data is processed twice.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the main steps of a method for data carry-over according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the main components of an apparatus for data carry-over according to an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present invention are described below with reference to the drawings. The description includes various details of the embodiments to aid understanding, and these details should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the invention. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted below.
In an embodiment of the present invention, data is carried over from a source table to a target table; depending on the configuration, the data may be all or part of the source table. The primary keys of the data to be carried over are saved in an intermediate table, and the data is then queried from the source table by the primary keys in the intermediate table and carried over to the target table. The flow can follow Fig. 1, which is a schematic diagram of the main steps of the method for data carry-over according to an embodiment of the present invention. The flow can be carried out by a separately developed program component.
Step S11: Load the configuration file. The configuration file holds, for example, the source database information and target database information for the source table and the target table respectively, the user names and passwords needed to log in to the source and target databases, and which data is to be carried over.
Step S12: Determine whether data exists in the intermediate table. Such data may be left over because the previous carry-over failed verification, or because of a power failure or some other abnormal condition during the carry-over. If data exists, the remaining data must be carried over first, so the flow goes directly to step S14; otherwise it goes to step S13.
Step S13: Put the primary keys of the data to be carried over in the source table into the intermediate table. The primary key of a piece of data is its unique identifier.
Step S14: Shard the data, then start multiple threads. The total amount of data to be carried over can be computed first, and the size of each slice determined from it. The threads can be managed centrally by a thread pool so that the slices handled by the individual threads do not overlap.
Step S15: Each thread carries over its data. In this step, each thread takes from the intermediate table the primary keys of the data it is to process, queries the source table for the data corresponding to those keys, and carries that data over to the target table. After the carry-over, a flag can be added in the intermediate table to indicate that the data for that primary key has been carried over. Once all threads have finished carrying over their data, the flow goes to step S16.
Step S16: Determine whether the data to be carried over is consistent before and after the carry-over. This step verifies the consistency of the data. The source table holds the data before the carry-over and the target table holds the data after it; comparing, for the same primary key, the data before the carry-over with the data after it determines whether the data to be carried over is consistent. If it is consistent, the carry-over is correct and the flow goes to step S17; otherwise it goes to step S18.
Step S17: Delete the data in the intermediate table and the source table. The data carry-over flow is now complete.
Step S18: Output a prompt message, so that staff learn that the carry-over went wrong and can handle it promptly.
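Steps S14 and S15 can be sketched as follows. This is a minimal illustration under stated assumptions: the source and target tables are modelled as in-memory maps from primary key to row, slices are consecutive runs of keys, and the class and method names are hypothetical. A real implementation would query and write databases and could also flag finished keys in the intermediate table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: tables are in-memory maps from primary key to row. */
class ShardedCarryOver {
    /** Splits the keys into consecutive, non-overlapping slices of at most sliceSize keys. */
    static List<List<Long>> shard(List<Long> keys, int sliceSize) {
        if (sliceSize < 1) {
            throw new IllegalArgumentException("sliceSize must be positive");
        }
        List<List<Long>> slices = new ArrayList<>();
        for (int from = 0; from < keys.size(); from += sliceSize) {
            int to = Math.min(from + sliceSize, keys.size());
            slices.add(new ArrayList<>(keys.subList(from, to)));
        }
        return slices;
    }

    /** Copies the rows for the given keys from source to target, one pool task per slice. */
    static void carryOver(List<Long> keys, Map<Long, String> source, Map<Long, String> target,
                          int sliceSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<Long> slice : shard(keys, sliceSize)) {
                pending.add(pool.submit(() -> {
                    for (Long key : slice) {
                        target.put(key, source.get(key)); // look up by primary key, then write
                    }
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for the slice and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("carry-over interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a slice failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<Long, String> source = new ConcurrentHashMap<>();
        for (long id = 1; id <= 10; id++) {
            source.put(id, "row-" + id);
        }
        Map<Long, String> target = new ConcurrentHashMap<>();
        carryOver(new ArrayList<>(source.keySet()), source, target, 3, 4);
        System.out.println("rows carried over: " + target.size());
    }
}
```

Because every key belongs to exactly one slice and every slice to exactly one task, no row is handled by two threads. The target map should be safe for concurrent writes, for example a ConcurrentHashMap as in main.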
Unlike the carry-over schemes provided by existing file systems, the flow above uses a multi-threading mechanism, which helps ensure the efficiency of the carry-over. On that basis, the data is verified for consistency before and after the carry-over, which ensures the correctness of the carried-over data. In addition, an intermediate table is used to save the primary keys of the data, which helps ensure that no data is missed and no data is processed twice.
In an embodiment of the present invention, the determination in step S16 can be made by first separately computing binary checksums of the data to be carried over in the source table and of the data carried over into the target table, yielding a source hash table and a target hash table, and then determining from whether the two hash tables are consistent whether the data is consistent before and after the carry-over. If the source and target databases are SQL Server databases, the binary_checksum function can be used to obtain the binary checksums. The consistency check itself can be done quickly in a Java program: with the target hash table as the reference, the Java removeAll statement is applied to the source hash table, which deletes from the source hash table every entry that matches the target hash table. After the operation, the source hash table is checked for remaining data: if none remains, the data to be carried over is consistent before and after the carry-over; otherwise it is inconsistent.
Fig. 2 is a schematic diagram of the main components of an apparatus for data carry-over according to an embodiment of the present invention. As shown in Fig. 2, the apparatus 20 for data carry-over mainly includes a determination module 21, a primary key saving module 22, a sharding module 23, a thread pool module 24, and a verification module 25.
The determination module 21 determines whether data exists in the intermediate table. The primary key saving module 22 saves the primary keys of pre-selected data in the source table to the intermediate table when no data exists in the intermediate table. The sharding module 23 shards the data in the source table that corresponds to the primary keys in the intermediate table when data exists in the intermediate table, and shards the pre-selected data in the source table when no data exists in the intermediate table. The thread pool module 24 manages multiple threads in a thread pool, wherein the threads, without overlapping one another, each obtain from the source table the data corresponding to primary keys in the intermediate table and then carry the obtained data over to the target table. The verification module 25 determines, after the threads have carried the obtained data over to the target table, whether the data to be carried over is consistent before and after the carry-over, and then outputs the determination result.
According to the embodiment of the present invention, a multi-threading mechanism is used, which helps ensure the efficiency of the carry-over. On that basis, the data is verified for consistency before and after the carry-over, which ensures the correctness of the carried-over data. In addition, an intermediate table is used to save the primary keys of the data, which helps ensure that no data is missed and no data is processed twice.
The specific embodiments above do not limit the scope of protection of the present invention. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions can occur. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (12)

1. A method for data carry-over, characterized by comprising:
determining whether data exists in an intermediate table; if so, taking the data in the source table that corresponds to the primary keys in the intermediate table as the data to be carried over; otherwise, saving the primary keys of pre-selected data in the source table to the intermediate table and taking the pre-selected data as the data to be carried over;
sharding the data to be carried over, and then starting multiple threads in a thread pool;
the multiple threads, without overlapping one another, each obtaining from the source table the data corresponding to primary keys in the intermediate table, and then carrying the obtained data over to a target table;
after the multiple threads have carried the obtained data over to the target table, determining whether the data to be carried over is consistent before and after the carry-over, and then outputting the determination result.
2. The method according to claim 1, characterized in that the step of determining whether the data to be carried over is consistent before and after the carry-over comprises:
separately computing binary checksums of the data to be carried over in the source table and of the data carried over into the target table, yielding a source hash table and a target hash table respectively;
determining, according to whether the source hash table is consistent with the target hash table, whether the data to be carried over is consistent before and after the carry-over.
3. The method according to claim 2, characterized in that
the source table and the target table are data tables in SQL Server databases;
the step of computing the binary checksums of the data to be carried over in the source table and of the data carried over into the target table comprises: obtaining the binary checksums of the data to be carried over in the source table and of the data carried over into the target table using the binary_checksum function.
4. The method according to claim 2, characterized in that the step of determining, according to whether the source hash table is consistent with the target hash table, whether the data to be carried over is consistent before and after the carry-over comprises:
using the target hash table as the reference, operating on the source hash table with the Java removeAll statement;
after the operation completes, if no data remains in the source hash table, determining that the data to be carried over is consistent before and after the carry-over, and otherwise determining that it is inconsistent.
5. The method according to any one of claims 1 to 4, characterized in that, after the step of outputting the determination result, the method further comprises: if the determination result is consistent, deleting the data in the intermediate table and the data in the source table; if the determination result is inconsistent, outputting a prompt message.
6. An apparatus for data carry-over, characterized by comprising:
a determination module, for determining whether data exists in an intermediate table; if so, taking the data in the source table that corresponds to the primary keys in the intermediate table as the data to be carried over; otherwise, taking pre-selected data as the data to be carried over;
a primary key saving module, for saving the primary keys of the pre-selected data in the source table to the intermediate table when no data exists in the intermediate table;
a sharding module, for sharding the data in the source table that corresponds to the primary keys in the intermediate table when data exists in the intermediate table, and for sharding the pre-selected data in the source table when no data exists in the intermediate table;
a thread pool module, for managing multiple threads in a thread pool, wherein the multiple threads, without overlapping one another, each obtain from the source table the data corresponding to primary keys in the intermediate table, and then carry the obtained data over to a target table;
a verification module, for determining, after the multiple threads have carried the obtained data over to the target table, whether the data to be carried over is consistent before and after the carry-over, and then outputting the determination result.
7. The apparatus according to claim 6, characterized in that the verification module is further configured to:
separately compute binary checksums of the data to be carried over in the source table and of the data carried over into the target table, yielding a source hash table and a target hash table respectively;
determine, according to whether the source hash table is consistent with the target hash table, whether the data to be carried over is consistent before and after the carry-over.
8. The apparatus according to claim 7, characterized in that
the source table and the target table are data tables in SQL Server databases;
the verification module is further configured to obtain the binary checksums of the data to be carried over in the source table and of the data carried over into the target table using the binary_checksum function.
9. The apparatus according to claim 7, characterized in that the verification module is further configured to:
using the target hash table as the reference, operate on the source hash table with the Java removeAll statement;
after the operation completes, if no data remains in the source hash table, determine that the data to be carried over is consistent before and after the carry-over, and otherwise determine that it is inconsistent.
10. The apparatus according to any one of claims 6 to 9, characterized by further comprising a deletion module and a prompt module, wherein:
the deletion module is configured to delete the data in the intermediate table and the data in the source table when the determination result is consistent;
the prompt module is configured to output a prompt message when the determination result is inconsistent.
11. An electronic device for data carry-over, characterized by comprising:
one or more processors;
a storage device, for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 5.
12. A computer-readable medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 5.
CN201510376695.8A (priority date 2015-07-01, filing date 2015-07-01): Method and apparatus for data carry-over; Active; CN105095384B (en)

Publications (2)

Publication Number Publication Date
CN105095384A (en) 2015-11-25
CN105095384B (en) 2018-09-14

Family

ID=54575821

Country Status (1)

Country Link
CN (1) CN105095384B (en)

Also Published As

Publication number Publication date
CN105095384A (en) 2015-11-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant