CN105045917A - Example-based distributed data recovery method and device - Google Patents

Example-based distributed data recovery method and device

Publication number
CN105045917A
Authority
CN
China
Prior art keywords
node
secondary storage
delaying
storage unit
master
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510515919.9A
Other languages
Chinese (zh)
Other versions
CN105045917B (en)
Inventor
赖春波
薛英飞
王仆
赵博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201510515919.9A (CN105045917B)
Publication of CN105045917A
Priority to PCT/CN2015/095766 (WO2017028394A1)
Priority to US15/533,955 (US10783163B2)
Application granted
Publication of CN105045917B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees

Abstract

The invention discloses an instance-based distributed data recovery method and device. One specific implementation of the method comprises: detecting a non-master node that has gone down; allocating the multiple second-level storage units belonging to the down node to at least one online node; hash-classifying the instances recorded in the log and assigning them to multiple threads; and recovering the data of multiple first-level storage units in parallel inside the online node. The method and device thereby allow the data of a down node in a distributed database to be recovered in parallel inside each recovering node.

Description

Instance-based distributed data recovery method and device
Technical field
The present invention relates to the field of databases, and in particular to an instance-based distributed data recovery method and device.
Background
With the development of the Internet, distributed databases are used more and more widely, and the requirements on their reliability keep rising. To reduce service interruption time, the method used to recover the data of a database cluster node after it goes down is critical. The distributed data recovery methods currently used in the industry distribute the data of the down node to multiple online nodes for recovery, and inside each node they either use a single thread or achieve multithreaded recovery only after reorganizing the log records, for example by sorting them. Recovering data with these methods has clear drawbacks: the data of the down node is recovered inefficiently, and the nodes are poorly utilized.
Summary of the invention
Embodiments of the present invention provide an instance-based distributed data recovery method. When a node of a distributed database system goes down, data can be recovered in parallel, which improves data recovery efficiency and node utilization and thereby improves the availability of the database system.
One aspect of the application provides an instance-based distributed data recovery method, comprising:
detecting a non-master node that has gone down; allocating the multiple second-level storage units belonging to the down node to at least one online node; hash-classifying the instances recorded in the log and assigning them to multiple threads; and recovering the data of multiple first-level storage units in parallel inside the online node.
In an exemplary embodiment of the first aspect of the application, a third-level storage unit holds the index of the second-level storage units, each of the multiple second-level storage units stores the index of multiple first-level storage units, each of the multiple first-level storage units stores one instance, and the data stored in the first-level storage units is ordered by instance. The non-master nodes and the master node together form the nodes of the cluster; each non-master node manages the first-level storage units indexed by second-level storage units, and the master node manages the third-level storage unit and the second-level storage units.
In addition, during data recovery, hash classification maps the log records of the same instance to the same thread, so that the log is distributed across multiple threads by instance. The at least one online node recovers data by logically replaying the contents of the log within its own process. After the at least one online node completes data recovery, the managing node of the second-level storage units is changed to the online node that performed the recovery.
A second aspect of the application provides a device comprising a master node device and a non-master node device; the master node device is used to manage the master node, and the non-master node device is used to manage the non-master nodes.
In an exemplary embodiment of the second aspect of the application, the master node device comprises a detection module for detecting a non-master node that has gone down, and an allocation module for allocating the multiple second-level storage units corresponding to the down node to at least one online node.
In addition, the non-master node device comprises: a receiving module for receiving information about the multiple second-level storage units of the down node that have been allocated to the non-master node; a scanning module for scanning the log of the down node; and a processing module for performing hash classification so that the log records of the same instance are mapped to the same one of multiple threads.
The allocation module is further used, after the at least one online node completes data recovery, to change the managing node of the second-level storage units to the online node that performed the recovery. The receiving module is further used to receive the network address and port name of the down node.
The beneficial effect of the application is as follows: after a node goes down, the instances recorded in the log are hash-classified and assigned to multiple threads, so that an online node recovers data in parallel inside the node. This improves data recovery efficiency and node utilization.
Brief description of the drawings
Fig. 1 is an overall architecture diagram of a distributed data system provided by an embodiment of the invention;
Fig. 2 is a block diagram of the instance-based data storage structure provided by an embodiment of the invention;
Fig. 3 is a data recovery flowchart of the instance-based distributed data recovery method provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of the hash classification process during data recovery in the instance-based distributed data recovery method provided by an embodiment of the invention;
Fig. 5 is a block diagram of a master node device provided by an embodiment of the invention; and
Fig. 6 is a block diagram of a non-master node device provided by an embodiment of the invention.
Detailed description
The present invention provides an instance-based distributed data recovery method. Preferred embodiments of the invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here only illustrate and explain the invention and are not intended to limit it. Where no conflict arises, the embodiments in this application and the features of those embodiments may be combined with one another.
Fig. 1 is an overall architecture diagram of a distributed data system provided by an embodiment of the invention. It should be understood that embodiments of the invention are not limited to the architecture shown in Fig. 1.
In this embodiment, the database cluster contains two kinds of nodes: a master node 100 and non-master nodes 102. A cluster is usually configured with one master node 100. In another embodiment, multiple standby master nodes may also be configured, but only one master node is active at a time. As Fig. 1 shows, the cluster also includes multiple non-master nodes 102. When the database system is working normally, all of the nodes run online and are called online nodes, indicated by numeral 104 in Fig. 1. A node of the distributed database may go down at any time; in that case the non-master nodes 102 are divided into N1 down nodes 106 (1≤N1<N2) and N2 online nodes 104 (N1<N2<N, where N denotes the total number of non-master nodes). The master node 100 and the non-master nodes 102 jointly manage the data in the database. In a distributed database, data is kept as files in a distributed file system 108, which resides persistently in storage. Nodes can read and write the data in the file system 108. The log 110 in the file system 108 records every change a node makes to the data (insertions, deletions and so on), and the distributed file system is shared by the nodes.
Fig. 2 is a block diagram of the instance-based data storage structure provided by an embodiment of the invention. In this embodiment, the technique is built on instance-based storage. Specifically, an instance may be a storage object such as a machine name (for example a server name) or a program name. The database uses a three-level storage structure: instances are stored in first-level storage units (such as SSTABLE) 202, and the other two levels are second-level storage units (such as LeafTablet) 204 and third-level storage units (such as RootTablet) 206.
Optionally, the database storage structure comprises multiple first-level storage units 202. A first-level storage unit 202 may be the smallest storage unit in the database, and the data in each first-level storage unit 202 is ordered by primary key. The instance name is part of the primary key, so the stored data is ordered by instance. In addition, each first-level storage unit 202 stores the data of only one instance, and the sequence number of each first-level storage unit 202 is unique. The database storage structure may also comprise multiple second-level storage units 204; a second-level storage unit 204 may be the smallest unit of metadata storage on the cluster master node 100. Each second-level storage unit 204 holds an index of first-level storage units 202 ordered by primary key. The database storage structure may further comprise one or more third-level storage units 206, which index the second-level storage units 204: each holds an index, ordered by primary key, that points to second-level storage units 204.
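The three-level structure described above can be illustrated with a minimal Python sketch. The class names, the tuple-shaped primary key `(instance_name, row_id)` and the `lookup` helper are assumptions made for illustration only; the patent does not specify field layouts or a lookup routine.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class FirstLevelUnit:
    """Smallest unit (e.g. an SSTABLE): rows of exactly one instance."""
    seq_no: int                                 # unique sequence number
    instance: str                               # the single instance stored here
    rows: dict = field(default_factory=dict)    # primary key -> value


@dataclass
class SecondLevelUnit:
    """Index over first-level units (e.g. a LeafTablet), ordered by primary key."""
    start_keys: list    # sorted first primary key of each indexed unit
    units: list         # FirstLevelUnit objects, same order as start_keys


@dataclass
class ThirdLevelUnit:
    """Index over second-level units (e.g. a RootTablet), ordered by primary key."""
    start_keys: list    # sorted first primary key covered by each tablet
    tablets: list       # SecondLevelUnit objects, same order as start_keys


def _locate(start_keys, key):
    """Index of the last entry whose start key is <= key."""
    pos = bisect_right(start_keys, key) - 1
    if pos < 0:
        raise KeyError(key)
    return pos


def lookup(root, key):
    """Walk third level -> second level -> first level to fetch one row."""
    tablet = root.tablets[_locate(root.start_keys, key)]
    unit = tablet.units[_locate(tablet.start_keys, key)]
    return unit.rows[key]


# Hypothetical data: the primary key is (instance_name, row_id), so sorting
# by primary key also groups rows by instance.
u1 = FirstLevelUnit(1, "host-a", {("host-a", 1): "x", ("host-a", 2): "y"})
u2 = FirstLevelUnit(2, "host-b", {("host-b", 1): "z"})
u3 = FirstLevelUnit(3, "host-c", {("host-c", 5): "w"})
leaf1 = SecondLevelUnit([("host-a", 1), ("host-b", 1)], [u1, u2])
leaf2 = SecondLevelUnit([("host-c", 5)], [u3])
root = ThirdLevelUnit([("host-a", 1), ("host-c", 5)], [leaf1, leaf2])
```

Because the instance name leads the primary key, a binary search at each level is enough to reach the one first-level unit that holds a given instance's row.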
Further, in the cluster provided by the embodiments of the invention, the non-master nodes 102 manage the first-level storage units 202; each non-master node 102 manages one or more first-level storage units 202 indexed by second-level storage units 204. A second-level storage unit 204 cannot be managed across multiple nodes: the first-level storage units 202 indexed by one second-level storage unit 204 can be managed by only one non-master node 102. Specifically, as shown in Fig. 2, first-level storage units 208 and 210 cannot be managed by two different non-master nodes. In addition, a first-level storage unit cannot be indexed by two different second-level storage units at the same time. Specifically, as shown in Fig. 2, first-level storage unit 212 cannot be indexed by both second-level storage units 214 and 216. The master node 100 manages the second-level storage units 204 and the third-level storage units 206.
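The two ownership constraints in this paragraph can be expressed as a small consistency check. This is only a sketch: the dictionary-based metadata layout and the function name `check_ownership` are assumptions, not part of the patent.

```python
def check_ownership(tablet_owner, tablet_units):
    """Validate the ownership rules and return first-level unit -> node.

    tablet_owner: second-level unit id -> the single non-master node managing it
    tablet_units: second-level unit id -> list of first-level unit ids it indexes
    (Both layouts are hypothetical.)
    """
    indexed_by = {}
    for tablet, units in tablet_units.items():
        if tablet not in tablet_owner:
            raise ValueError(f"second-level unit {tablet!r} has no managing node")
        for unit in units:
            # A first-level unit may be indexed by only one second-level unit.
            if unit in indexed_by and indexed_by[unit] != tablet:
                raise ValueError(
                    f"first-level unit {unit!r} is indexed by both "
                    f"{indexed_by[unit]!r} and {tablet!r}"
                )
            indexed_by[unit] = tablet
    # Each second-level unit maps to exactly one node, so every first-level
    # unit it indexes is managed by that node.
    return {unit: tablet_owner[tablet] for unit, tablet in indexed_by.items()}
```

Keying ownership by second-level unit makes the first rule hold by construction: a second-level unit has one owner, so the first-level units it indexes cannot be split across nodes.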
Fig. 3 is a data recovery flowchart of the instance-based distributed data recovery method provided by an embodiment of the invention. In one embodiment, a node in the cluster goes down at some moment. According to the method provided by the invention, data recovery may comprise the following steps.
In step 302, the master node 100 detects the node that has gone down.
According to an embodiment of the invention, with the exemplary database storage structure shown in Fig. 2, the master node 100 manages the second-level storage units 204 that index the first-level storage units 202 managed by the down node 106. In this embodiment, the master node 100 allocates the second-level storage units corresponding to the first-level storage units managed by the down node 106 to online nodes 104 (step 304).
As described above, one non-master node 102 may correspond to multiple second-level storage units 204. In one embodiment, to ensure recovery efficiency, when performing step 304 the master node 100 distributes the second-level storage units 204 of the down node 106 evenly across multiple online nodes 104. In another embodiment, the log of each node may be kept in a directory named after the network address and port of that node; when second-level storage units 204 are allocated to an online node 104, the network address and port of the down node 106 to be recovered are passed to the online node at the same time, so that the online node 104 can locate the log area of the down node 106 in the log.
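A minimal sketch of step 304 follows, assuming a simple round-robin policy for the even distribution and a hypothetical `address_port` directory naming scheme. The patent only states that the distribution is even and that the directory is named after the address and port; the exact policy and naming format are not specified.

```python
def assign_tablets(tablets, online_nodes):
    """Spread the down node's second-level units evenly over the online nodes.

    Round-robin is an assumed policy; any even distribution would satisfy
    the description.
    """
    if not online_nodes:
        raise ValueError("no online node available for recovery")
    plan = {node: [] for node in online_nodes}
    for i, tablet in enumerate(tablets):
        plan[online_nodes[i % len(online_nodes)]].append(tablet)
    return plan


def log_directory(log_root, address, port):
    """Directory holding a node's log; the '<address>_<port>' format is hypothetical."""
    return f"{log_root}/{address}_{port}"
```

With five second-level units and two online nodes, `assign_tablets` gives one node three units and the other two, and each online node would also be handed the down node's address and port so it can compute the log directory.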
In step 306, the log records are hash-classified, and the classified log records are assigned to multiple threads.
In step 308, after hash classification and thread assignment are complete, the online node 104 recovers data in parallel using multiple threads inside the node.
Further, in some embodiments, each of the multiple online nodes 104 uses its assigned threads to logically replay, inside the node, the operations of the down node 106 according to the contents stored in the log.
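Steps 306 and 308 can be sketched together as follows. The record layout (`instance`, `op`, `key`, `value`), the `put`/`delete` operations and the in-memory result are assumptions for illustration; the bucketing uses the ASCII-sum hash that the description gives for the hash classification stage.

```python
import threading


def thread_for(instance, num_threads):
    """Sum of ASCII codes, kept to 32 bits, modulo the number of recovery threads."""
    return (sum(instance.encode("ascii")) & 0xFFFFFFFF) % num_threads


def parallel_replay(records, num_threads):
    """Replay log records in parallel, one bucket of instances per thread."""
    # Step 306: classify records by instance; log order is kept within a bucket.
    buckets = [[] for _ in range(num_threads)]
    for rec in records:
        buckets[thread_for(rec["instance"], num_threads)].append(rec)

    # Each worker writes only to its own result dict, so no locking is needed.
    results = [dict() for _ in range(num_threads)]

    def worker(idx):
        local = results[idx]
        for rec in buckets[idx]:
            table = local.setdefault(rec["instance"], {})
            if rec["op"] == "put":
                table[rec["key"]] = rec["value"]
            elif rec["op"] == "delete":
                table.pop(rec["key"], None)

    # Step 308: run the recovery threads in parallel and wait for all of them.
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Instances are disjoint across threads, so merging cannot overwrite data.
    merged = {}
    for part in results:
        merged.update(part)
    return merged
```

Since all records of one instance land in the same bucket in their original order, each thread can replay its share without coordinating with the others.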
In one embodiment, after the online nodes 104 complete data recovery, the master node updates, in the third-level storage unit 206, the mapping of the second-level storage units that formerly corresponded to the down node 106, so that these second-level storage units correspond to the online nodes that recovered them (step 310).
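Step 310 amounts to rewriting an ownership map. The sketch below assumes the mapping is a plain dictionary from second-level unit to managing node and that the recovery plan lists which online node recovered which units; both shapes are hypothetical.

```python
def reassign_after_recovery(tablet_owner, down_node, recovery_plan):
    """Return a new ownership map with recovered units moved to their recoverers.

    tablet_owner:  second-level unit id -> managing node (hypothetical layout)
    recovery_plan: online node -> list of second-level unit ids it recovered
    """
    updated = dict(tablet_owner)
    for node, tablets in recovery_plan.items():
        for tablet in tablets:
            # Only units that belonged to the down node may change hands.
            if updated.get(tablet) != down_node:
                raise ValueError(f"{tablet!r} was not managed by {down_node!r}")
            updated[tablet] = node
    return updated
```

Returning a new map rather than mutating the old one means a failed check leaves the existing mapping untouched.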
Fig. 4 is a schematic diagram of the hash classification process during data recovery in the instance-based distributed data recovery method provided by an embodiment of the invention. In this embodiment, during data recovery the online node 104 uses the network address and port of the down node 106 to find the storage location 402 of that node's log, and then scans the log file record by record. Because every log record carries information about its second-level storage unit 204, the online node can identify, while scanning, the log records that it is responsible for recovering; each time it finds a matching record, it hash-classifies that record.
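The scan described here is essentially a filter over the down node's log. A minimal sketch, assuming each record is a dictionary with a `tablet` field naming its second-level storage unit (the field name and record shape are hypothetical):

```python
def select_records(log_records, assigned_tablets):
    """Yield, in log order, the records this online node must recover."""
    assigned = set(assigned_tablets)
    for record in log_records:
        # Every record names its second-level unit, so one pass is enough.
        if record["tablet"] in assigned:
            yield record
```

Each record yielded by this scan would then be passed to hash classification.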
Specifically, in one embodiment, the hash classification of stage 404 proceeds as follows. The instance name recorded in the log record is converted according to the storage format of the instance. In this embodiment an instance may be a machine name, a program name and so on, which amounts to a character string. Each character of the string is converted to its ASCII code. The ASCII codes are then summed, and the sum is taken as a 32-bit integer. This integer is taken modulo the number of recovery threads, which yields the ID of the thread that recovers the instance. Because an instance name is unique, it always maps to the same thread ID; after this conversion each instance corresponds to exactly one thread. The log of the down node 106 can therefore be split by instance across multiple parallel data recovery threads.
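The hash classification of this paragraph is short enough to write out directly. The function name is an assumption, and the sketch treats the 32-bit integer as unsigned, which the text does not specify.

```python
def recovery_thread_id(instance_name, num_threads):
    """Map an instance name to a recovery thread ID.

    Sums the ASCII codes of the name, keeps the low 32 bits of the sum
    (unsigned is assumed), and takes the result modulo the thread count.
    Raises UnicodeEncodeError for non-ASCII names.
    """
    total = sum(instance_name.encode("ascii")) & 0xFFFFFFFF
    return total % num_threads
```

For example, "ab" sums to 97 + 98 = 195, and 195 modulo 4 gives thread 3. Different instances can share a thread (any anagram of a name has the same sum); what recovery relies on is that one instance never maps to two threads.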
A second aspect of the application provides a device for instance-based distributed database data recovery. The device comprises a master node device and a non-master node device.
Fig. 5 shows a block diagram of a master node device provided by an embodiment of the invention. Optionally, the master node device 500 comprises a detection module 502 and an allocation module 504. In one embodiment, the detection module 502 is used to detect a non-master node 102 that has gone down. The allocation module 504 is used to allocate the multiple second-level storage units 204 corresponding to the down node 106 to at least one online node 104. In another embodiment, the allocation module 504 may also be used, after the online node 104 completes data recovery, to change the managing node of the second-level storage units 204 to the online node 104 that performed the recovery.
Fig. 6 shows a block diagram of a non-master node device provided by an embodiment of the invention. Optionally, the non-master node device 600 comprises a receiving module 602, a scanning module 604 and a processing module 606.
In one embodiment, the receiving module 602 is used to receive information about the multiple second-level storage units 204 of the down node 106 that have been allocated to the non-master node. The scanning module 604 is used to scan the log of the down node. The processing module 606 is used to perform hash classification so that the log records 110 of the same instance are mapped to the same one of multiple threads. In another embodiment, the receiving module 602 is also used to receive the network address and port name of the down node 106, so that the online node 104 performing data recovery can use the received network address and port name to find, in the file system 108, the area where the log 110 of the down node 106 is stored.
Those skilled in the art will understand that all or part of the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the methods described above. The description of the invention is intended to teach those skilled in the art the best mode of carrying out the invention and does not limit the scope of rights of the invention; equivalent variations made according to the claims of the invention therefore still fall within the scope covered by the invention.

Claims (13)

1. An instance-based distributed data recovery method, comprising:
detecting a non-master node that has gone down;
allocating multiple second-level storage units corresponding to the down non-master node to at least one online node;
hash-classifying the instances recorded in the log and assigning them to multiple threads inside the online node; and
recovering the data of multiple first-level storage units in parallel in the multiple threads.
2. The method according to claim 1, wherein the index of the second-level storage units is stored in a third-level storage unit, each of the multiple second-level storage units stores the index of the multiple first-level storage units, each of the multiple first-level storage units stores one instance, and the data stored in the multiple first-level storage units is ordered by the instance.
3. The method according to claim 1 or 2, wherein a master node and non-master nodes together form the nodes of a cluster, each of the non-master nodes manages the first-level storage units indexed by the second-level storage units, and the master node manages the third-level storage unit and the second-level storage units.
4. The method according to claim 1, wherein the multiple second-level storage units corresponding to the down node are allocated evenly to the at least one online node.
5. The method according to claim 1, wherein hash classification is used so that the log records of the same instance are mapped to the same one of the multiple threads, whereby the log is assigned to the multiple threads according to the instance.
6. The method according to claim 5, wherein the hash classification step comprises: converting the instance name recorded in a log record by converting each character of the character string to its ASCII code and summing the codes, and taking the sum as a 32-bit integer; and taking this integer modulo the number of recovery threads to obtain the thread ID that recovers the instance.
7. The method according to claim 1, wherein the at least one online node recovers data by logically replaying the contents of the log within its own process.
8. The method according to claim 1, further comprising:
after the at least one online node completes data recovery, changing the managing node of the second-level storage units to the online node that performed the recovery.
9. A distributed data recovery device for the method according to claim 1, comprising:
a master node device for managing second-level storage units and a third-level storage unit; and
a non-master node device for managing first-level storage units.
10. The device according to claim 9, wherein the master node device comprises:
a detection module for detecting a non-master node that has gone down; and
an allocation module for allocating the multiple second-level storage units corresponding to the down node to at least one online node.
11. The device according to claim 9, wherein the non-master node device comprises:
a receiving module for receiving information about the multiple second-level storage units of the down node that have been allocated to the non-master node;
a scanning module for scanning the log of the down node; and
a processing module for performing hash classification so that the log records of the same instance are mapped to the same one of multiple threads.
12. The instance-based distributed data recovery device according to claim 10, wherein the allocation module is further used, after the at least one online node completes data recovery, to change the managing node of the second-level storage units to the online node that performed the recovery.
13. The device according to claim 11, wherein the receiving module is further used to receive the network address and port name of the down node.
CN201510515919.9A 2015-08-20 2015-08-20 Instance-based distributed data recovery method and device Active CN105045917B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201510515919.9A CN105045917B (en) 2015-08-20 2015-08-20 Instance-based distributed data recovery method and device
PCT/CN2015/095766 WO2017028394A1 (en) 2015-08-20 2015-11-27 Example-based distributed data recovery method and apparatus
US15/533,955 US10783163B2 (en) 2015-08-20 2015-11-27 Instance-based distributed data recovery method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510515919.9A CN105045917B (en) 2015-08-20 2015-08-20 Instance-based distributed data recovery method and device

Publications (2)

Publication Number Publication Date
CN105045917A (en) 2015-11-11
CN105045917B CN105045917B (en) 2019-06-18

Family

ID=54452464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510515919.9A Active CN105045917B (en) 2015-08-20 2015-08-20 Instance-based distributed data recovery method and device

Country Status (3)

Country Link
US (1) US10783163B2 (en)
CN (1) CN105045917B (en)
WO (1) WO2017028394A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930397A (en) * 2016-04-15 2016-09-07 北京思特奇信息技术股份有限公司 Message processing method and system
WO2017028394A1 (en) * 2015-08-20 2017-02-23 北京百度网讯科技有限公司 Example-based distributed data recovery method and apparatus
CN106919679A (en) * 2017-02-27 2017-07-04 北京小米移动软件有限公司 Method, device and terminal are recurred in the daily record for being applied to distributed file system
CN110825706A (en) * 2018-08-07 2020-02-21 华为技术有限公司 Data compression method and related equipment
CN111459896A (en) * 2019-01-18 2020-07-28 阿里巴巴集团控股有限公司 Data recovery system and method, electronic device, and computer-readable storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9921910B2 (en) * 2015-02-19 2018-03-20 Netapp, Inc. Virtual chunk service based data recovery in a distributed data storage system
NL2027048B1 (en) * 2020-12-04 2022-07-07 Ing Bank N V Methods, systems and networks for recovering distributed databases, and computer program products, data carrying media and non-transitory tangible data storage media with computer programs and/or databases stored thereon useful in recovering a distributed database.
CN113268470A (en) * 2021-06-17 2021-08-17 重庆富民银行股份有限公司 Efficient database rollback scheme verification method
CN115437843B (en) * 2022-08-25 2023-03-28 北京万里开源软件有限公司 Database storage partition recovery method based on multi-level distributed consensus

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100161565A1 (en) * 2008-12-18 2010-06-24 Electronics And Telecommunications Research Institute Cluster data management system and method for data restoration using shared redo log in cluster data management system
CN101853186A (en) * 2008-12-31 2010-10-06 Sap股份公司 Distributed transactional recovery system and method
CN102364448A (en) * 2011-09-19 2012-02-29 浪潮电子信息产业股份有限公司 Fault-tolerant method for computer fault management system
CN103049355A (en) * 2012-12-25 2013-04-17 华为技术有限公司 Method and equipment for database system recovery
CN103198159A (en) * 2013-04-27 2013-07-10 国家计算机网络与信息安全管理中心 Transaction-redo-based multi-copy consistency maintaining method for heterogeneous clusters
CN104376082A (en) * 2014-11-18 2015-02-25 中国建设银行股份有限公司 Method for importing data in data source file to database

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8965849B1 (en) 2012-08-06 2015-02-24 Amazon Technologies, Inc. Static sorted index replication
CN105045917B (en) 2015-08-20 2019-06-18 北京百度网讯科技有限公司 Instance-based distributed data recovery method and device

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017028394A1 (en) * 2015-08-20 2017-02-23 北京百度网讯科技有限公司 Example-based distributed data recovery method and apparatus
US10783163B2 (en) 2015-08-20 2020-09-22 Beijing Baidu Netcom Science And Technology Co., Ltd. Instance-based distributed data recovery method and apparatus
CN105930397A (en) * 2016-04-15 2016-09-07 北京思特奇信息技术股份有限公司 Message processing method and system
CN105930397B (en) * 2016-04-15 2019-05-17 北京思特奇信息技术股份有限公司 A kind of message treatment method and system
CN106919679A (en) * 2017-02-27 2017-07-04 北京小米移动软件有限公司 Method, device and terminal are recurred in the daily record for being applied to distributed file system
CN106919679B (en) * 2017-02-27 2019-12-13 北京小米移动软件有限公司 Log replay method, device and terminal applied to distributed file system
CN110825706A (en) * 2018-08-07 2020-02-21 华为技术有限公司 Data compression method and related equipment
CN110825706B (en) * 2018-08-07 2022-09-16 华为云计算技术有限公司 Data compression method and related equipment
CN111459896A (en) * 2019-01-18 2020-07-28 阿里巴巴集团控股有限公司 Data recovery system and method, electronic device, and computer-readable storage medium
CN111459896B (en) * 2019-01-18 2023-05-02 阿里云计算有限公司 Data recovery system and method, electronic device, and computer-readable storage medium

Also Published As

Publication number Publication date
US10783163B2 (en) 2020-09-22
WO2017028394A1 (en) 2017-02-23
US20180150536A1 (en) 2018-05-31
CN105045917B (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN105045917A (en) Example-based distributed data recovery method and device
US7992037B2 (en) Scalable secondary storage systems and methods
US11093468B1 (en) Advanced metadata management
US20130227194A1 (en) Active non-volatile memory post-processing
CN107180113B (en) Big data retrieval platform
WO2015109250A1 (en) CREATING NoSQL DATABASE INDEX FOR SEMI-STRUCTURED DATA
CN102214205A (en) Logical replication in clustered database system with adaptive cloning
CN110297866A (en) Method of data synchronization and data synchronization unit based on log analysis
CN103744906A (en) System, method and device for data synchronization
CN111460023A (en) Service data processing method, device, equipment and storage medium based on elastic search
CN104036029B (en) Large data consistency control methods and system
CN106649676B (en) HDFS (Hadoop distributed File System) -based duplicate removal method and device for stored files
KR20100070968A (en) Cluster data management system and method for data recovery using parallel processing in cluster data management system
CN102725739A (en) Distributed database system by sharing or replicating the meta information on memory caches
CN107807787B (en) Distributed data storage method and system
US9189489B1 (en) Inverse distribution function operations in a parallel relational database
CN110674152B (en) Data synchronization method and device, storage medium and electronic equipment
CN107070645A (en) Compare the method and system of the data of tables of data
CN104484131A (en) Device and corresponding method for processing data of multi-disk servers
CN102262591A (en) Garbage collection method and system for memory copy system
CN110427364A (en) A kind of data processing method, device, electronic equipment and storage medium
CN111680017A (en) Data synchronization method and device
CN107818106B (en) Big data offline calculation data quality verification method and device
Liu et al. Hadoop based scalable cluster deduplication for big data
Tsai et al. Data Partitioning and Redundancy Management for Robust Multi-Tenancy SaaS.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant