CN108121618A - A kind of method and apparatus of repair data - Google Patents

Info

Publication number
CN108121618A
Authority
CN
China
Prior art keywords
record
subclass
address
sample
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611069108.1A
Other languages
Chinese (zh)
Other versions
CN108121618B (en)
Inventor
李科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201611069108.1A
Publication of CN108121618A
Application granted
Publication of CN108121618B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1458: Management of the backup or restore process
    • G06F11/1464: Management of the backup or restore process for networked environments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Debugging And Monitoring (AREA)

Abstract

An embodiment of the present application provides a method and apparatus for repairing data, which reduce the time a service node takes to repair records in a cache cluster and thereby improve the system performance of the service node. The method includes: obtaining a sample group, where the sample group includes at least one sample set and each sample set includes some of the records in the cache cluster; obtaining an error rate from the sample group, where the error rate is the ratio of exception records in the sample group to all records in the sample group, an exception record is a record that differs from its reference record in the database cluster, and the address of an exception record corresponds one-to-one with the address of its reference record; and, if the error rate is greater than 0 and less than a first threshold, repairing the exception records in a first set in the cache cluster according to a second set in the database cluster when the summary of the first set differs from the summary of the second set. The first set includes some of the records in the cache cluster, and the address of the first set corresponds one-to-one with the address of the second set.

Description

A method and apparatus for repairing data
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and apparatus for repairing cached data.
Background
As shown in Fig. 1, a Web control device cluster includes a cache cluster, a database cluster and a business cluster. The cache cluster includes multiple cache nodes, the database cluster includes multiple database nodes, and the business cluster includes multiple service nodes.
When writing a record, a service node first writes the record to the database cluster and then writes it to the cache cluster. Because the read/write performance of the cache cluster is better than that of the database cluster, the service node generally reads records from the cache cluster. If the service node detects that the record to be read is not in the cache cluster (that is, the record is missing from the cache cluster), it sends a request to the database cluster; the database cluster sends the requested record to the service node according to the request; and after receiving the record sent by the database cluster, the service node sends the database cluster a response message carrying the record. It follows that a record missing from the cache cluster adds to the overhead of the service node accessing records in the cache cluster and also adds to the overhead of the service node accessing records in the database cluster.
To solve the above technical problem, the service node can, while writing a record to the cache cluster, also persist the record in a persistent database, and then repair the records in the cache cluster by comparing the records in the persistent database with those in the cache cluster. Specifically, the service node obtains some of the records in the cache cluster as sample cache records, and obtains from the persistent database the sample persistent records whose addresses are the same as those of the sample cache records. If it determines that the sample cache records and the sample persistent records are not identical, it compares all records in the cache cluster with the corresponding records in the persistent database to find the addresses of the records lost from the cache cluster. Finally, it repairs the lost records in the cache cluster using the records in the persistent database whose addresses correspond one-to-one with the addresses of the lost records, thereby ensuring that the records in the cache cluster are consistent with those in the persistent database, that is, consistent with the records in the database cluster.
However, in the above method, the service node must compare all records in the cache cluster with the corresponding records in the persistent database when repairing the lost records in the cache cluster. This lengthens the time the service node takes to repair the records in the cache cluster and degrades system performance.
Summary of the invention
The present application provides a method and apparatus for repairing data, which reduce the time a service node takes to repair records in a cache cluster and thereby improve the system performance of the service node.
To achieve the above object, the embodiments of the present invention adopt the following technical solutions:
According to a first aspect, a method for repairing data is provided, applied to a system that includes a cache cluster and a database cluster. The method may include the following. A service node obtains a sample group, where the sample group includes at least one sample set and each sample set includes some of the records in the cache cluster. The service node obtains an error rate from the sample group, where the error rate is the ratio of exception records in the sample group to all records in the sample group, an exception record is a record that differs from its reference record in the database cluster, and the address of an exception record corresponds one-to-one with the address of its reference record. If the error rate is greater than 0 and less than a first threshold, then when the summary of a first set in the cache cluster differs from the summary of a second set in the database cluster, the service node repairs the exception records in the first set according to the second set, where the first set includes some of the records in the cache cluster and the address of the first set corresponds one-to-one with the address of the second set. In this technical solution, if the summary of a set in the cache cluster is identical to the summary of the corresponding set in the database cluster, the service node need not compare each record in the cache-cluster set with its reference record in the database cluster. This shortens the time the service node takes to determine the exception records in the cache cluster and improves system performance.
In a first possible implementation of the first aspect, the first set is a sample set, and the service node obtaining the error rate from the sample group may include: the service node obtains the summary of the first set and the summary of the second set. If the two summaries are identical, no record in the first set is repaired. If the two summaries differ, the service node obtains the number of exception records in the first set, and then obtains the error rate from the number of exception records in the first set.
In a second possible implementation of the first aspect, the first set is a non-sample set. If the error rate is greater than 0 and less than the first threshold, the method may further include: obtaining the summary of the first set and the summary of the second set.
With reference to the first aspect or any possible implementation of the first aspect, in a third possible implementation of the first aspect, the method may further include: if the error rate is greater than or equal to the first threshold, repairing all records in the cache cluster according to all records in the database cluster. Because the service node treats every record in the cache cluster as an exception record, it saves the time needed to determine which records in the cache cluster are exception records, and thus shortens the time needed to repair the exception records in the cache cluster.
With reference to the first aspect or any possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the method may further include: if the error rate is equal to 0, reducing the number of sample sets in the sample group; and if the error rate is greater than 0, increasing the number of sample sets in the sample group. Reducing the number of sample sets when the error rate is 0 lowers the overhead of obtaining sample sets. Increasing the number of sample sets when the error rate is greater than 0 lets the service node determine the error rate more accurately the next time it obtains sample sets.
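The adjustment of the number of sample sets can be sketched as follows. This is only an illustration: the function name, the step size of one set, and the lower and upper bounds are assumptions, since the text states the direction of the adjustment but not its size or limits.

```python
def adjust_sample_set_count(count, error_rate, step=1, minimum=1, maximum=None):
    """Return the number of sample sets to use in the next sampling round.

    step, minimum and maximum are hypothetical tuning parameters.
    """
    if error_rate == 0:
        # No exception records were seen: sample less to lower the overhead,
        # but keep at least `minimum` sample sets.
        return max(minimum, count - step)
    # Exception records were seen: sample more for a more accurate error rate.
    grown = count + step
    return grown if maximum is None else min(maximum, grown)
```

For instance, a service node currently using 4 sample sets would drop to 3 after a clean round and grow to 5 after a round that found exception records.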
With reference to the first aspect or any possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the first set includes M subsets, each of which includes at least two records; the second set includes M subsets; and the address of the m-th subset in the first set corresponds one-to-one with the address of the m-th subset in the second set, where M ≥ 2, 1 ≤ m ≤ M, and M and m are integers. In this case, the service node repairing the exception records in the first set according to the second set may include: obtaining the summary of a first subset and the summary of a second subset, where the first subset is any one of the M subsets of the first set and the second subset is the subset in the second set whose address corresponds one-to-one with the address of the first subset; and, if the summary of the first subset differs from the summary of the second subset, determining the addresses of the exception records in the first subset and repairing those exception records according to the second subset. By comparing the summary of each subset of the first set with the summary of the corresponding subset of the second set, the service node reduces the number of record-by-record comparisons between the first set and the second set, which reduces its overhead.
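A minimal sketch of the subset-level comparison described above, assuming that both sets are held as in-memory dictionaries keyed by record address and that MD5 is used as the summary operation. The function names and the way the set is split into M contiguous subsets are illustrative assumptions, not details taken from the text.

```python
import hashlib


def summary(records):
    """MD5 over the records of a subset, visited in address order."""
    digest = hashlib.md5()
    for address in sorted(records):
        digest.update(f"{address}={records[address]};".encode("utf-8"))
    return digest.hexdigest()


def split(addresses, m):
    """Split the addresses into m contiguous subsets of near-equal size."""
    ordered = sorted(addresses)
    size, extra = divmod(len(ordered), m)
    subsets, start = [], 0
    for i in range(m):
        end = start + size + (1 if i < extra else 0)
        subsets.append(ordered[start:end])
        start = end
    return subsets


def repair_by_subsets(first_set, second_set, m):
    """Repair first_set in place from second_set; return repaired addresses."""
    repaired = []
    for addresses in split(first_set.keys(), m):
        cached = {a: first_set[a] for a in addresses}
        reference = {a: second_set[a] for a in addresses}
        if summary(cached) == summary(reference):
            continue  # summaries match: no record-by-record comparison needed
        for a in addresses:
            if first_set[a] != second_set[a]:
                first_set[a] = second_set[a]  # overwrite the exception record
                repaired.append(a)
    return repaired
```

Only the subsets whose summaries differ are scanned record by record, which is the saving the paragraph above describes.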
With reference to the first aspect, in a sixth possible implementation of the first aspect, the method may further include: if the error rate is equal to 0, the service node does not repair any record in the cache cluster.
With reference to the first aspect or the fifth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the method may further include: if the error rate is greater than or equal to a second threshold, increasing the value of M; and if the error rate is greater than 0 and less than the second threshold, reducing the value of M. An error rate greater than or equal to the second threshold indicates that the cache cluster includes relatively many exception records, so the value of M is increased; in this way the service node can reduce the number of subsets into which each set in the cache cluster other than the sample sets is divided, and thus the number of comparisons between the summary of a subset in the cache cluster and the summary of the corresponding subset in the database cluster, which reduces the overhead of the service node. An error rate greater than 0 and less than the second threshold indicates that the cache cluster includes relatively few exception records; in this case the service node can increase the number of subsets into which each set in the cache cluster other than the sample sets is divided, so that exception records are determined more accurately.
According to a second aspect, an apparatus for repairing data in a cache cluster is provided. The apparatus may include an acquisition module and a repair module. The acquisition module is configured to obtain a sample group and to obtain an error rate from the sample group. The sample group includes at least one sample set, and each sample set includes some of the records in the cache cluster. The error rate is the ratio of exception records in the sample group to all records in the sample group. An exception record is a record that differs from its reference record in the database cluster, and the address of an exception record corresponds one-to-one with the address of its reference record. The repair module is configured to: if the error rate is greater than 0 and less than a first threshold, repair the exception records in a first set in the cache cluster according to a second set in the database cluster when the summary of the first set differs from the summary of the second set. The first set includes some of the records in the cache cluster, and the address of the first set corresponds one-to-one with the address of the second set.
In a first possible implementation of the second aspect, the first set is a sample set, and the acquisition module is specifically configured to: obtain the summary of the first set and the summary of the second set; if the summary of the first set differs from the summary of the second set, obtain the number of exception records in the first set; and obtain the error rate from the number of exception records in the first set.
In a second possible implementation of the second aspect, the first set is a non-sample set, and the acquisition module is further configured to: if the error rate is greater than 0 and less than the first threshold, obtain the summary of the first set and the summary of the second set.
With reference to the second aspect or any possible implementation of the second aspect, in a third possible implementation of the second aspect, the repair module is further configured to: if the error rate is greater than or equal to the first threshold, repair all records in the cache cluster according to all records in the database cluster.
With reference to the second aspect or any possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the apparatus further includes a swap module, configured to: if the error rate is equal to 0, reduce the number of sample sets in the sample group; or, if the error rate is greater than 0, increase the number of sample sets in the sample group.
With reference to the second aspect or any possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the first set includes M subsets, each of which includes at least two records; the second set includes M subsets; and the address of the m-th subset in the first set corresponds one-to-one with the address of the m-th subset in the second set, where M ≥ 2, 1 ≤ m ≤ M, and M and m are integers. The acquisition module is specifically configured to obtain the summary of a first subset and the summary of a second subset, where the first subset is any one of the M subsets of the first set and the second subset is the subset in the second set whose address corresponds one-to-one with the address of the first subset. The repair module is specifically configured to: if the summary of the first subset differs from the summary of the second subset, determine the addresses of the exception records in the first subset and repair those exception records according to the second subset.
With reference to the second aspect, in a sixth possible implementation of the second aspect, the repair module is further configured to: if the error rate is equal to 0, not repair any record in the cache cluster.
With reference to the fifth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the apparatus further includes a swap module, configured to: if the error rate is greater than or equal to a second threshold, increase the value of M; or, if the error rate is greater than 0 and less than the second threshold, reduce the value of M.
According to a third aspect, an apparatus for repairing data is provided. The apparatus can implement the functions performed in the example of the method for repairing data provided in the first aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In a possible implementation of the third aspect, the structure of the apparatus includes a processor, a system bus and a communication interface. The processor is configured to support the apparatus in performing the corresponding functions in the above method. The communication interface is configured to support communication between the apparatus and other network elements (for example, the database cluster).
According to a fourth aspect, a computer storage medium is provided that stores computer program code. The computer program code includes instructions, and when a processor of a service node executes the instructions, the service node performs the method for repairing data of the first aspect.
Any apparatus for repairing data or computer storage medium provided above is used to perform the method for repairing data provided above. For the beneficial effects it can achieve, refer to the beneficial effects of the corresponding method provided above; they are not repeated here.
Brief description of the drawings
Fig. 1 is a system architecture diagram provided in the prior art;
Fig. 2 is a schematic flowchart of a method for repairing cached data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of sets in a cache cluster and sets in a database cluster according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a method for repairing the records in a first set according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of subsets in a cache cluster and subsets in a database cluster according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an apparatus for repairing cached data according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an apparatus for repairing cached data according to an embodiment of the present invention.
Detailed description of embodiments
The embodiments provided by the present invention are applicable to the system architecture shown in Fig. 1. The system shown in Fig. 1 includes a cache cluster, a database cluster and a business cluster. The cache cluster includes multiple cache nodes, the database cluster includes multiple database nodes, and the business cluster includes multiple service nodes. Each service node in the business cluster can communicate with one or more cache nodes in the cache cluster, and can also communicate with one or more database nodes in the database cluster.
The technical solutions in the embodiments of the present invention are described below by way of example with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention.
Fig. 2 is a schematic flowchart of a method for repairing cached data according to an embodiment of the present invention. The method can be applied to the system architecture shown in Fig. 1 and may include the following steps:
S101: A service node obtains the records in each sample set in the cache cluster and the records in the set in the database cluster whose address corresponds one-to-one with the address of that sample set.
Multiple records are stored in the cache cluster, and each record has an address. The address of a record may be the identifier (identity, ID) of the record. Multiple records are also stored in the database cluster, and each record has an address. The address of each record in the cache cluster corresponds one-to-one with the address of a record in the database cluster. In the embodiments of the present invention, the service node may divide the records in the cache cluster into at least two sets, each of which includes at least two records, and, according to the correspondence between the addresses of the records in the cache cluster and the addresses of the records in the database cluster, divide the records in the database cluster into at least two sets.
The service node in S101 may be any service node in the business cluster. The sets into which the cache cluster is divided may include sample sets and non-sample sets. The sample sets in the cache cluster together form what may be called a sample group. The sample sets in the sample group are used to determine whether the cache cluster contains exception records. An exception record is a record that differs from its reference record in the database cluster, and the address of an exception record corresponds one-to-one with the address of its reference record. If any sample set in the sample group includes an exception record, the cache cluster is considered to include exception records. If no sample set in the sample group includes an exception record, the cache cluster is considered not to include exception records. Any set in the cache cluster can serve as a sample set. The more sample sets the sample group contains, the more accurate the service node's determination of whether the cache cluster includes exception records.
The address of a set is information that can represent the addresses of all records in the set. Optionally, the address of a set may be the set formed by the addresses of all records in the set. Optionally, if the addresses of the records in a set are contiguous, the address of the set may be represented by the minimum and maximum of the addresses of the records in the set, which can also be understood as a start address and an end address.
For example, Fig. 3 is a schematic diagram of sets in the cache cluster and sets in the database cluster. Assume that the sets in the cache cluster obtained by the service node are set 1, set 2 and set 3, and that the addresses of the records in each of these sets are contiguous; the sets in the database cluster obtained by the service node are then set 4, set 5 and set 6. In this case, the addresses of the records in set 1 are ID1 to ID5, those in set 2 are ID6 to ID10, and those in set 3 are ID11 to ID15; the addresses of the records in set 4 are ID1 to ID5, those in set 5 are ID6 to ID10, and those in set 6 are ID11 to ID15.
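The division in the Fig. 3 example can be reproduced with a short sketch. It assumes integer record IDs 1 to 15 and a fixed set size of 5; the function names are hypothetical, and a set's address is represented by its (start, end) pair as the text allows for contiguous addresses.

```python
def partition_into_sets(addresses, set_size):
    """Divide record addresses into sets of contiguous addresses."""
    ordered = sorted(addresses)
    return [ordered[i:i + set_size] for i in range(0, len(ordered), set_size)]


def set_address(record_addresses):
    """Represent a contiguous set by its (start address, end address) pair."""
    return (min(record_addresses), max(record_addresses))


record_ids = range(1, 16)                           # ID1 .. ID15
cache_sets = partition_into_sets(record_ids, 5)     # sets 1, 2 and 3
database_sets = partition_into_sets(record_ids, 5)  # sets 4, 5 and 6

# Sets at the same position cover the same address range, so their
# addresses correspond one-to-one.
cache_addresses = [set_address(s) for s in cache_sets]
database_addresses = [set_address(s) for s in database_sets]
```

Because both clusters are partitioned by the same rule over the same addresses, set 1 pairs with set 4, set 2 with set 5, and set 3 with set 6.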
If a sample set includes multiple records, the service node can obtain the sample set in the cache cluster in either of the following ways:
Mode 1: The service node sends the cache cluster a request carrying the address of the sample set. The cache cluster receives the request, obtains the records in the sample set according to the request, and sends those records to the service node. The service node receives the records in the sample set sent by the cache cluster.
For example, if the sample set is the set {a1, a2, a3}, the address of the sample set is the set {address of a1, address of a2, address of a3}. The service node can then carry the set {address of a1, address of a2, address of a3} in the request it sends to the cache cluster in order to obtain a1, a2 and a3.
Mode 2: The service node obtains each record in the sample set through the following steps: the service node sends the cache cluster a request carrying the address of one record in the sample set; the cache cluster receives the request, obtains the record according to the request, and sends the record to the service node; and the service node receives the record sent by the cache cluster.
For example, if the sample set is the set {a1, a2, a3}, the service node sends the cache cluster a request carrying the address of a1 to obtain a1, a request carrying the address of a2 to obtain a2, and a request carrying the address of a3 to obtain a3.
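The difference between the two modes is the number of requests sent to the cache cluster. The sketch below uses an in-memory stand-in for the cache cluster; the class `FakeCacheCluster` and its methods `get_many` and `get_one` are hypothetical names introduced for illustration, not an interface described in the text.

```python
class FakeCacheCluster:
    """In-memory stand-in for a cache cluster that counts incoming requests."""

    def __init__(self, records):
        self.records = dict(records)
        self.requests = 0

    def get_many(self, addresses):
        """Serve one request that carries the address of a whole sample set."""
        self.requests += 1
        return {a: self.records[a] for a in addresses}

    def get_one(self, address):
        """Serve one request that carries the address of a single record."""
        self.requests += 1
        return self.records[address]


def fetch_mode_1(cluster, sample_addresses):
    # Mode 1: a single request carries the addresses of all records in the set.
    return cluster.get_many(sample_addresses)


def fetch_mode_2(cluster, sample_addresses):
    # Mode 2: one request per record in the sample set.
    return {a: cluster.get_one(a) for a in sample_addresses}
```

For the sample set {a1, a2, a3}, mode 1 costs one request and mode 2 costs three, while both return the same records.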
S102: The service node obtains the summary of each sample set in the sample group and the summary of the set in the database cluster whose address corresponds one-to-one with the address of that sample set.
The service node can obtain the summary of a sample set by performing a summary (digest) operation on the records in the sample set. If the storage format of the records in the cache cluster differs from that of the records in the database cluster, then before obtaining the summary of the set in the database cluster whose address corresponds one-to-one with the address of a sample set, the service node can first convert that set into the cache format and then perform the summary operation on the converted set. The summary operation may use, but is not limited to, a message digest algorithm (MD) such as MD5 or MD4.
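As an illustration of the summary operation with format conversion, the sketch below uses MD5 from Python's standard `hashlib` module. The two storage formats are invented for the example (cache entries as "key=value" strings, database rows as (key, value) tuples), and `to_cache_format` and `set_summary` are hypothetical helper names.

```python
import hashlib


def to_cache_format(db_record):
    """Hypothetical conversion from a database row to the cache format."""
    key, value = db_record
    return f"{key}={value}"


def set_summary(records):
    """MD5 summary over all records of a set, visited in address order."""
    digest = hashlib.md5()
    for address in sorted(records):
        digest.update(f"{address}:{records[address]}\n".encode("utf-8"))
    return digest.hexdigest()


cache_set = {1: "name=alice", 2: "name=bob"}
database_set = {1: ("name", "alice"), 2: ("name", "bob")}

# Convert the database set into the cache format before summarising it,
# so that identical content yields identical summaries.
converted = {address: to_cache_format(row) for address, row in database_set.items()}
summaries_match = set_summary(cache_set) == set_summary(converted)
```

Visiting the records in a fixed address order matters here: the same records hashed in a different order would generally produce a different summary.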
S103: The service node compares each sample set in the sample group against the set in the database cluster whose address corresponds one-to-one with the address of that sample set, using the summaries and the records obtained above.
In one possible implementation, the service node determines whether each record in a sample set is identical to the corresponding record in the set in the database cluster whose address corresponds one-to-one with the address of the sample set. If all are identical, the service node determines that the sample set does not include exception records. If any differ, it determines that the sample set includes exception records and records the addresses of the exception records in the sample set.
In another possible implementation, the sample set includes a first record and the database cluster includes a second record, where the address of the first record corresponds to the address of the second record. The service node performs a hash code operation on the first record to obtain the hash code value of the first record, and performs a hash code operation on the second record to obtain the hash code value of the second record. It then compares the two hash code values. If they are identical, the service node determines that the first record is not an exception record; if they differ, it determines that the first record is an exception record and determines the address of the first record. Using the same method, the service node determines whether each of the other records in the sample set is an exception record. If the sample set includes exception records, the service node records their addresses. Because the hash code value of a record is shorter than the record itself, this implementation reduces the time the service node takes to determine the addresses of the exception records in the sample set.
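The hash-based comparison of individual records might look like the following sketch. The text does not name a hash function, so SHA-256 from `hashlib` is an assumption, as are the function names and the representation of each set as a dictionary keyed by record address.

```python
import hashlib


def record_hash(record):
    """Hash code value of one record (SHA-256 is an assumed choice)."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def find_exception_addresses(sample_set, reference_set):
    """Return the addresses whose cached record differs from its reference."""
    exception_addresses = []
    for address in sorted(sample_set):
        # Compare fixed-length hash values instead of the full records.
        if record_hash(sample_set[address]) != record_hash(reference_set[address]):
            exception_addresses.append(address)
    return exception_addresses
```

In a deployed system the two hash values would typically be computed where the records live, so that only the short hash values need to be compared by the service node.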
S104: The service node obtains the error rate, where the error rate is the ratio of exception records in the sample group to all records in the sample group.
If the sample group includes one sample set, the error rate can be expressed as $w = \frac{a}{b}$, where $w$ is the error rate, $a$ is the number of exception records in the sample set, and $b$ is the number of all records in the sample set. If the sample group includes multiple sample sets, the error rate can be expressed as $w = \frac{\sum_{n=1}^{N} a_n}{\sum_{n=1}^{N} b_n}$, where $w$ is the error rate, $a_n$ is the number of exception records in the n-th sample set in the sample group, $b_n$ is the number of all records in the n-th sample set, $N$ is the number of sample sets in the sample group, and $N$ is an integer greater than or equal to 1.
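The ratio defined in S104 (exception records in the sample group over all records in the sample group) can be computed as in this small sketch. The input layout, a list of (exception count, total count) pairs with one pair per sample set, is an assumption made for the example.

```python
def error_rate(sample_group):
    """Error rate of a sample group.

    sample_group is a list of (exception_count, total_count) pairs,
    one pair per sample set.
    """
    exception_total = sum(a for a, _ in sample_group)
    record_total = sum(b for _, b in sample_group)
    return exception_total / record_total
```

With two sample sets of five records each, one of which contains a single exception record, the call `error_rate([(1, 5), (0, 5)])` gives 1/10.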
If the error rate is 0, the cache cluster is considered not to include exception records, and S105 is performed. If the error rate is greater than 0 and less than the first threshold, the cache cluster includes a small number of exception records, and S106 is performed. If the error rate is greater than or equal to the first threshold, the cache cluster includes a large number of exception records, and S109 is performed. The first threshold is greater than 0; the embodiments of the present invention do not limit its specific value.
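The three-way branch on the error rate can be summarised in code. The returned labels and the function name are illustrative, and the threshold value used in any call is an assumption, since the text leaves the first threshold unspecified.

```python
def choose_repair_step(error_rate, first_threshold):
    """Map an error rate to the next step of the flow in Fig. 2."""
    if error_rate == 0:
        return "S105"  # no repair needed
    if error_rate < first_threshold:
        return "S106"  # repair only the exception records (then S107, S108)
    return "S109"      # repair every record in the cache cluster
```

For example, with an assumed first threshold of 0.3, an error rate of 0.1 leads to S106, while an error rate of exactly 0.3 leads to S109 because the boundary belongs to the "greater than or equal to" branch.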
S105: The service node does not repair any record in the cache cluster. The procedure ends after S105 is performed.
S106: The service node repairs the exception records in the sample sets in the sample group.
Specifically, the service node sends the database cluster a request carrying the addresses of the exception records in the sample set. The database cluster receives the request and, according to the request, sends the service node the records that correspond to the exception records in the sample set. After receiving these records from the database cluster, the service node sends the cache cluster a request carrying them, so that they are written to the cache cluster in place of the exception records in the sample set.
For example, assume that the sample set includes a record a1 and that a1 is an exception record. The service node sends the database cluster a request carrying the address of a1. The database cluster receives the request and obtains b1 according to the request, where the address of a1 corresponds to the address of b1, and then sends b1 to the service node. After obtaining b1, the service node sends the cache cluster a request carrying b1. The cache cluster receives the request and, according to the address of b1, overwrites a1 with b1.
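The a1/b1 example can be sketched with two dictionaries standing in for the cache cluster and the database cluster, both keyed by record address. The function name and the in-memory representation are assumptions; a real service node would issue network requests instead of dictionary lookups.

```python
def repair_exception_records(cache, database, exception_addresses):
    """Overwrite each exception record in the cache with its reference record."""
    # Request to the database cluster: fetch the reference records by address.
    reference_records = {a: database[a] for a in exception_addresses}
    # Request to the cache cluster: write the reference records back,
    # replacing the exception records stored at the same addresses.
    cache.update(reference_records)
    return reference_records


cache = {"addr1": "a1 (stale)", "addr2": "a2"}
database = {"addr1": "b1", "addr2": "a2"}
repair_exception_records(cache, database, ["addr1"])
```

After the call, the cache holds b1 at the address that previously held a1, and the record at the other address is untouched.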
S107: The service node determines the addresses of the abnormal records in each set in the cache cluster other than the sample sets (that is, in each non-sample set).
The service node may determine these addresses in the same manner in which, in the example of S103, it determines the addresses of the abnormal records in a sample set.
S108: The service node repairs the abnormal records in the sets in the cache cluster other than the sample sets (that is, in the non-sample sets). After S108 is performed, the procedure ends.
The service node may repair the abnormal records in the non-sample sets in the cache cluster in the same manner in which it repairs the abnormal records in a sample set in S106.
S109: The service node repairs all records in all sets in the cache cluster. After S109 is performed, the procedure ends.
The service node obtains all records in all sets in the database cluster and then sends a request to the cache cluster to overwrite all records in the cache cluster. It should be noted that an error rate greater than the first preset threshold indicates that the cache cluster contains a large number of abnormal records. In this case, the service node treats every record in the cache cluster as an abnormal record, which saves the time otherwise needed to determine the addresses of the abnormal records and therefore shortens the time the service node needs to repair the abnormal records in the cache cluster.
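The choice among S105, S106 to S108, and S109 can be summarized in a short sketch. It assumes that S105 is taken when the error rate is 0 and that the full repair of S109 is taken when the error rate is greater than or equal to the first threshold, as the claims state; the function name and the returned labels are hypothetical.

```python
def choose_repair_action(error_rate, first_threshold):
    """Map the sampled error rate to one of the three branches (labels are illustrative)."""
    if error_rate == 0:
        return "no_repair"        # S105: nothing in the cache cluster is repaired
    if error_rate < first_threshold:
        return "repair_abnormal"  # S106 to S108: locate and repair abnormal records
    return "repair_all"           # S109: overwrite every record in the cache cluster
```

With a first threshold of 0.2, for instance, an error rate of 0.05 selects the targeted repair and an error rate of 0.2 selects the full repair.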
In a possible implementation, the service node determines the addresses of the abnormal records in the sets in the cache cluster (including sample sets and non-sample sets) as follows. The service node compares the digest of one set in the cache cluster with the digest of the set in the database cluster whose address corresponds to that set. If the digests differ, the set in the cache cluster contains abnormal records, and the service node records the addresses of the abnormal records in that set before moving on to the next set in the cache cluster and its corresponding set in the database cluster. If the digests are identical, the service node moves on directly to the next set. The service node continues in this way until the digest of every set in the cache cluster has been compared with the digest of its corresponding set in the database cluster. It has then determined, for each set in the cache cluster, whether the set contains abnormal records, and has recorded the addresses of any abnormal records found.
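The set-by-set comparison described above can be sketched as follows. This is an illustration only: the use of SHA-256 as the digest, the dictionary model of a set (address mapped to record value), and the names `set_digest` and `find_abnormal_addresses` are assumptions made for the example.

```python
import hashlib

def set_digest(records):
    """Assumed digest: SHA-256 over the (address, value) pairs in address order."""
    h = hashlib.sha256()
    for address in sorted(records):
        h.update(f"{address}={records[address]};".encode("utf-8"))
    return h.hexdigest()

def find_abnormal_addresses(cache_sets, database_sets):
    """Compare set digests; compare records only inside sets whose digests differ."""
    abnormal = []
    for set_id, cache_records in cache_sets.items():
        db_records = database_sets[set_id]
        if set_digest(cache_records) == set_digest(db_records):
            continue  # identical digests: no record-level comparison is needed
        for address, value in cache_records.items():
            if db_records.get(address) != value:
                abnormal.append(address)
    return abnormal

# Hypothetical data: only the record at ID1 differs between the two clusters.
cache_sets = {"set1": {"ID1": "a1", "ID2": "b2"}, "set2": {"ID3": "b3"}}
database_sets = {"set1": {"ID1": "b1", "ID2": "b2"}, "set2": {"ID3": "b3"}}
abnormal = find_abnormal_addresses(cache_sets, database_sets)
```

Here set2 is skipped after a single digest comparison, and only the two records of set1 are compared individually.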
With the addresses determined in this manner, when repairing the abnormal records in the sets in the cache cluster (for example, in S106 and S108), the service node may send the database cluster a single request carrying the addresses of all abnormal records in the cache cluster. The database cluster receives the request and, according to the request, sends the service node the records that correspond to all of those addresses. After receiving these records, the service node sends the cache cluster a request carrying them, so that they are written into the cache cluster. This reduces the number of requests that the service node sends to the database cluster to obtain the records corresponding to the addresses of the abnormal records, and thereby reduces the overhead of the service node.
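The saving in requests can be illustrated with a small sketch. The class `DatabaseCluster` is a hypothetical stand-in that only counts the requests it receives, and the cache is modeled as a dictionary; neither reflects a real cluster interface.

```python
class DatabaseCluster:
    """Hypothetical stand-in for the database cluster that counts incoming requests."""

    def __init__(self, records):
        self.records = records
        self.requests = 0

    def fetch(self, addresses):
        """Serve one request that may carry any number of addresses."""
        self.requests += 1
        return {address: self.records[address] for address in addresses}

def repair_in_one_batch(cache, database, abnormal_addresses):
    """Fetch all reference records with a single request, then write them to the cache."""
    reference = database.fetch(abnormal_addresses)
    cache.update(reference)
    return database.requests

# Hypothetical data: two abnormal records are repaired with one database request.
db = DatabaseCluster({"ID1": "b1", "ID2": "b2", "ID3": "b3"})
cache = {"ID1": "a1", "ID2": "b2", "ID3": "a3"}
request_count = repair_in_one_batch(cache, db, ["ID1", "ID3"])
```

Fetching the same two records one address at a time would cost two requests instead of one.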
In the method for repairing cached data provided in this embodiment of the present invention, the service node divides the records in the cache cluster into at least two sets and divides the records in the database cluster into at least two corresponding sets. It then compares the digest of a set in the cache cluster with the digest of the corresponding set in the database cluster. If the digests differ, the set in the cache cluster (the first set) contains abnormal records, and the service node determines the addresses of the abnormal records in the first set and repairs them. Compared with the prior art, if the digest of a set in the cache cluster is identical to the digest of its corresponding set in the database cluster, the service node need not compare each record in that set with the corresponding record in the database cluster. This shortens the time the service node takes to determine the abnormal records in the cache cluster and improves system performance.
In a possible implementation, the method may further include adjusting the number of sample sets in the sample group according to the error rate. Specifically, if the error rate is 0, which indicates that the cache cluster contains no abnormal records, the service node may reduce the number of sample sets in the sample group. This saves time the next time the sample sets in the sample group are used to determine whether the cache cluster contains abnormal records. If the error rate is greater than 0, which indicates that the cache cluster contains abnormal records, the service node may increase the number of sample sets in the sample group. This improves the accuracy of the error rate determined the next time.
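A sketch of this adjustment is given below. The step size of one sample set per adjustment is an assumption, since the text does not say by how much the number changes; the lower bound of one follows from the claims, which require the sample group to include at least one sample set. The function name is hypothetical.

```python
def adjust_sample_set_count(current_count, error_rate, step=1, minimum=1):
    """Shrink the sample group after an error-free check; grow it otherwise.

    Assumption: the count changes by `step` per adjustment and never drops
    below `minimum`, because a sample group holds at least one sample set.
    """
    if error_rate == 0:
        return max(minimum, current_count - step)
    return current_count + step
```

For example, a sample group of three sample sets shrinks to two after an error rate of 0 and grows to four after an error rate of 0.05.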
In a possible implementation, the first set is any set in the cache cluster and may be a sample set or a non-sample set. The second set is the set in the database cluster whose address corresponds to the address of the first set. In this case, Fig. 4 is a schematic flowchart of a method for repairing the records in the first set according to the present invention. The method may include the following steps S201 to S204:
S201: The service node obtains the subsets of the first set and the subsets of the second set.
The service node may divide the first set into M subsets, where M ≥ 2 and M is an integer; each subset includes at least two records. Correspondingly, the service node may divide the second set into M subsets. The address of the m-th subset in the first set corresponds to the address of the m-th subset in the second set, where 1 ≤ m ≤ M and m is an integer. Using the method by which the first set is divided into M subsets, the service node may divide any set in the cache cluster into multiple subsets. Different sets in the cache cluster may be divided into the same or different numbers of subsets, and different subsets may contain the same or different numbers of records.
For example, based on Fig. 3, Fig. 5 is a schematic diagram of subsets in the cache cluster and subsets in the database cluster. Assume that set 1 is the first set. Because the address of set 4 corresponds one-to-one to the address of set 1 in the cache cluster, set 4 is the second set. In this case, the service node divides the first set into three subsets: subset 1, subset 2, and subset 3. The addresses of the records in subset 1 include ID1 to ID2, the addresses of the records in subset 2 include ID3 to ID4, and the addresses of the records in subset 3 include ID5. Likewise, the service node divides the second set into three subsets: subset 4, subset 5, and subset 6. The addresses of the records in subset 4 include ID1 to ID2, the addresses of the records in subset 5 include ID3 to ID4, and the addresses of the records in subset 6 include ID5.
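The division in this example can be reproduced with a short sketch. It assumes that a set is divided into consecutive runs of addresses of a fixed size, which is one possible rule that yields the subsets listed above; the text does not prescribe a particular rule, and the function name is hypothetical.

```python
def split_into_subsets(addresses, subset_size):
    """Split an ordered list of record addresses into consecutive subsets."""
    return [addresses[i:i + subset_size]
            for i in range(0, len(addresses), subset_size)]

# Addresses shared by the first set (cache cluster) and the second set (database cluster).
addresses = ["ID1", "ID2", "ID3", "ID4", "ID5"]
first_set_subsets = split_into_subsets(addresses, 2)   # subsets 1, 2, and 3
second_set_subsets = split_into_subsets(addresses, 2)  # subsets 4, 5, and 6
```

Applying the same rule to both sets keeps the m-th subset of the first set aligned with the m-th subset of the second set.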
S202: The service node obtains the digest of each subset of the first set and the digest of the subset of the second set whose address corresponds one-to-one to that subset.
The service node determines the digest of each subset of the first set and of each subset of the second set in the same manner in which it obtains the digest of each sample set in the sample group in S102.
S203: The service node compares the digest of each subset of the first set with the digest of the subset of the second set whose address corresponds one-to-one to that subset.
The first set includes a first subset, which contains some of the records in the first set and may be any subset of the first set. The second set includes a second subset, which contains some of the records in the second set. The address of the first subset corresponds one-to-one to the address of the second subset. If the digest of the first subset is identical to the digest of the second subset, the first subset contains no abnormal records, and the service node need not compare each record in the first subset with the record in the second subset whose address corresponds to it. If the digests differ, the first subset contains abnormal records, and the service node compares each record in the first subset with the record at the corresponding address in the second subset to determine the addresses of the abnormal records. The service node may determine the addresses of the abnormal records in the other subsets of the first set in the same manner.
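The subset-level comparison of S203 can be sketched as follows. The SHA-256 digest, the dictionary model of a set, and the function names are assumptions made for illustration; the sketch also counts record-level comparisons to show what the digest check avoids.

```python
import hashlib

def digest(records, addresses):
    """Assumed digest of one subset: SHA-256 over its (address, value) pairs."""
    h = hashlib.sha256()
    for address in addresses:
        h.update(f"{address}={records.get(address)};".encode("utf-8"))
    return h.hexdigest()

def abnormal_addresses_by_subset(first_set, second_set, subsets):
    """Return the abnormal addresses and the number of record comparisons made."""
    abnormal, compared = [], 0
    for addresses in subsets:
        if digest(first_set, addresses) == digest(second_set, addresses):
            continue  # identical digests: this subset holds no abnormal record
        for address in addresses:
            compared += 1
            if first_set.get(address) != second_set.get(address):
                abnormal.append(address)
    return abnormal, compared

# Hypothetical data: only the record at ID3 in the cache differs from the database.
first_set = {"ID1": "b1", "ID2": "b2", "ID3": "a3", "ID4": "b4", "ID5": "b5"}
second_set = {"ID1": "b1", "ID2": "b2", "ID3": "b3", "ID4": "b4", "ID5": "b5"}
subsets = [["ID1", "ID2"], ["ID3", "ID4"], ["ID5"]]
result = abnormal_addresses_by_subset(first_set, second_set, subsets)
```

In this data, only the two records of the middle subset are compared individually; the other three records are cleared by digest comparison alone.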
S204: The service node repairs the abnormal records in the subsets of the first set.
Specifically, the service node repairs the abnormal records in the subsets of the first set in the same manner in which it repairs the abnormal records in a sample set in S106.
In a possible implementation, other service nodes in the service cluster may repair other sets in the cache cluster. Any service node in the service cluster may repair the records in a set in the cache cluster according to the method of S201 to S204 above. Therefore, in one repair process, multiple service nodes in the service cluster may each repair different sets in the cache cluster, which reduces the time needed to repair the abnormal records in the cache cluster. The number of sets in the cache cluster that each service node obtains is not limited.
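The division of work among service nodes can be pictured with the following sketch, in which each worker thread stands in for one service node. This is only an analogy on a single machine: the thread pool, the dictionary model of the sets, and the function names are assumptions, and a real service cluster would coordinate over a network.

```python
from concurrent.futures import ThreadPoolExecutor

def repair_set(cache_set, database_set):
    """Repair one set in place and return the addresses that were repaired."""
    repaired = [a for a, v in cache_set.items() if database_set[a] != v]
    for address in repaired:
        cache_set[address] = database_set[address]
    return repaired

def repair_sets_in_parallel(cache_sets, database_sets, node_count=2):
    """Each worker thread stands in for one service node handling distinct sets."""
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        futures = {set_id: pool.submit(repair_set, cache_sets[set_id], database_sets[set_id])
                   for set_id in cache_sets}
        return {set_id: future.result() for set_id, future in futures.items()}

# Hypothetical data: set1 and set3 each contain one abnormal record.
cache_sets = {"set1": {"ID1": "a1"}, "set2": {"ID2": "b2"}, "set3": {"ID3": "a3"}}
database_sets = {"set1": {"ID1": "b1"}, "set2": {"ID2": "b2"}, "set3": {"ID3": "b3"}}
repaired = repair_sets_in_parallel(cache_sets, database_sets)
```

Because every set is handled by exactly one worker, no two workers write to the same record.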
It should be noted that if the service node reads records from the database cluster or the cache cluster at the granularity of sets, the number of read operations is reduced, which saves time in reading records. In addition, if the service node determines at the granularity of subsets whether the first set contains abnormal records, it need not search for abnormal records in any subset of the first set whose digest is identical to that of the corresponding subset in the second set. This shortens the time the service node takes to determine the addresses of the abnormal records in the cache cluster and improves the performance of the service node.
In a possible implementation, the method may further include the following when the error rate is less than the first threshold. If the error rate is greater than 0 and less than a second threshold, where the second threshold is less than the first threshold, the cache cluster contains relatively few abnormal records, and the service node may reduce the number of subsets into which each non-sample set is divided. This reduces the number of subset digests to be compared and thereby saves overhead. If the error rate is greater than or equal to the second threshold and less than the first threshold, the cache cluster contains relatively many abnormal records, and the service node may increase the number of subsets into which each non-sample set is divided. This increases the number of subset digests to be compared but reduces the time needed to determine the addresses of the abnormal records.
For example, assume that the first threshold is 0.2 and the second threshold is 0.01, that the service node divides each sample set into 8 subsets, and that each set the service node obtains from the cache cluster includes 24 records. In this case, if the error rate is greater than 0 and less than 0.01, the service node may divide each set in the cache cluster other than the sample sets (that is, each non-sample set) into 6 subsets of 4 records each. If the error rate is greater than or equal to 0.01 and less than 0.2, the service node may divide each non-sample set into 12 subsets of 2 records each.
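The numbers in this example can be checked with a short sketch. The thresholds and subset counts are taken from the example; the function name and the choice to reject error rates outside the open interval between 0 and the first threshold are assumptions.

```python
def choose_subset_count(error_rate, first_threshold=0.2, second_threshold=0.01,
                        fewer=6, more=12):
    """Pick how many subsets each non-sample set is divided into.

    Assumption: this rule is consulted only when 0 < error_rate < first_threshold,
    the range in which abnormal records are located subset by subset.
    """
    if not 0 < error_rate < first_threshold:
        raise ValueError("expected 0 < error rate < first threshold")
    return fewer if error_rate < second_threshold else more

records_per_set = 24
low = choose_subset_count(0.005)   # few abnormal records: fewer, larger subsets
high = choose_subset_count(0.05)   # many abnormal records: more, smaller subsets
records_per_subset = (records_per_set // low, records_per_set // high)
```

With 24 records per set, the two cases give subsets of 4 records and 2 records respectively, as in the example.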
The foregoing mainly describes the solutions provided in the embodiments of the present invention from the perspective of the service node. It can be understood that, to implement the foregoing functions, the service node includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the modules and algorithm steps of the examples described in the embodiments disclosed herein, the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
In the embodiments of the present invention, the service node may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of the present invention is schematic and is merely a division by logical function; other divisions are possible in actual implementation.
Fig. 6 is a schematic structural diagram of an apparatus 6 for repairing data in a cache cluster. The apparatus 6 may include an acquisition module 601 and a repair module 602. The acquisition module 601 is configured to support the apparatus 6 in performing S101, S102, and S104 in Fig. 2, S201 and S202 in Fig. 4, and/or other processes of the techniques described herein. The repair module 602 is configured to support the apparatus 6 in performing S103 to S109 in Fig. 2, S203 and S204 in Fig. 4, and/or other processes of the techniques described herein. In addition, the apparatus 6 may further include a storage module, which stores the program code and data corresponding to any of the methods for repairing cached data provided above that the apparatus 6 performs.
Fig. 7 is a schematic structural diagram of an apparatus 7 for repairing data according to an embodiment of the present invention. The apparatus 7 may include a memory 700, a processor 701, a communication interface 702, and a bus 703, where the memory 700, the processor 701, and the communication interface 702 are connected to one another by the bus 703. The memory 700 stores the program instructions and data necessary for the service node; when the apparatus 7 runs, the processor 701 executes the program to perform any of the methods for repairing data provided above. The processor 701 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP. The processor 701 may further include a hardware chip. The hardware chip may be a digital signal processor, an application-specific integrated circuit, a field programmable gate array, or a combination thereof. The processor can implement or execute the various example modules and circuits described in connection with the embodiments of the present invention. The bus 703 may be classified as an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by only one thick line in Fig. 7, but this does not mean that there is only one bus or one type of bus.
The steps of the methods or algorithms described in connection with the disclosure of the present invention may be implemented in hardware or by a processing module executing program instructions. The program instructions may be stored in a non-volatile memory, such as a flash memory, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be a component of the processor.
A person skilled in the art should appreciate that, in one or more of the foregoing examples, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium accessible to a general-purpose or special-purpose computer.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing descriptions are merely specific embodiments of the present invention and are not intended to limit the protection scope of the present invention.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. A person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

  1. A method for repairing data, characterized in that the method is applied to a system including a cache cluster and a database cluster, and the method includes:
    obtaining a sample group, where the sample group includes at least one sample set, and the sample set includes some of the records in the cache cluster;
    obtaining an error rate according to the sample group, where the error rate represents the proportion of abnormal records in the sample group to all records in the sample group, an abnormal record is a record that differs from a reference record in the database cluster, and the address of the abnormal record corresponds one-to-one to the address of the reference record; and
    if the error rate is greater than 0 and less than a first threshold, when the digest of a first set in the cache cluster differs from the digest of a second set in the database cluster, repairing the abnormal records in the first set according to the second set, where the first set includes some of the records in the cache cluster, and the address of the first set corresponds one-to-one to the address of the second set.
  2. The method according to claim 1, characterized in that the first set is a sample set, and the obtaining an error rate according to the sample group includes:
    obtaining the digest of the first set and the digest of the second set, and if the digest of the first set differs from the digest of the second set, obtaining the number of abnormal records in the first set; and
    obtaining the error rate according to the number of abnormal records in the first set.
  3. The method according to claim 1, characterized in that the first set is a non-sample set, and if the error rate is greater than 0 and less than the first threshold, the method further includes:
    obtaining the digest of the first set and the digest of the second set.
  4. The method according to any one of claims 1 to 3, characterized in that the method further includes:
    if the error rate is greater than or equal to the first threshold, repairing all records in the cache cluster according to all records in the database cluster.
  5. The method according to any one of claims 1 to 4, characterized in that the method further includes:
    if the error rate is equal to 0, reducing the number of sample sets in the sample group; and
    if the error rate is greater than 0, increasing the number of sample sets in the sample group.
  6. The method according to any one of claims 1 to 5, characterized in that the first set includes M subsets, and each subset includes at least two records; the second set includes M subsets, the address of the m-th subset in the first set corresponds one-to-one to the address of the m-th subset in the second set, M ≥ 2, 1 ≤ m ≤ M, and M and m are integers;
    the repairing the abnormal records in the first set according to the second set includes:
    obtaining the digest of a first subset and the digest of a second subset, where the first subset is any one of the M subsets included in the first set, and the second subset is the subset in the second set whose address corresponds one-to-one to the address of the first subset; and
    if the digest of the first subset differs from the digest of the second subset, determining the addresses of the abnormal records in the first subset, and repairing the abnormal records in the first subset according to the second subset.
  7. An apparatus for repairing data, characterized in that the apparatus is applied to a system including a cache cluster and a database cluster, and the apparatus includes:
    an acquisition module, configured to obtain a sample group and obtain an error rate according to the sample group, where the sample group includes at least one sample set, and the sample set includes some of the records in the cache cluster; the error rate represents the proportion of abnormal records in the sample group to all records in the sample group, an abnormal record is a record that differs from a reference record in the database cluster, and the address of the abnormal record corresponds one-to-one to the address of the reference record; and
    a repair module, configured to: if the error rate is greater than 0 and less than a first threshold, when the digest of a first set in the cache cluster differs from the digest of a second set in the database cluster, repair the abnormal records in the first set according to the second set, where the first set includes some of the records in the cache cluster, and the address of the first set corresponds one-to-one to the address of the second set.
  8. The apparatus according to claim 7, characterized in that the first set is a sample set;
    the acquisition module is specifically configured to obtain the digest of the first set and the digest of the second set, obtain the number of abnormal records in the first set if the digest of the first set differs from the digest of the second set, and obtain the error rate according to the number of abnormal records in the first set.
  9. The apparatus according to claim 7, characterized in that the first set is a non-sample set;
    the acquisition module is further configured to obtain the digest of the first set and the digest of the second set if the error rate is greater than 0 and less than the first threshold.
  10. The apparatus according to any one of claims 7 to 9, characterized in that
    the repair module is further configured to repair all records in the cache cluster according to all records in the database cluster if the error rate is greater than or equal to the first threshold.
  11. The apparatus according to any one of claims 7 to 10, characterized in that the apparatus further includes an adjustment module, configured to reduce the number of sample sets in the sample group if the error rate is equal to 0, and to increase the number of sample sets in the sample group if the error rate is greater than 0.
  12. The apparatus according to any one of claims 7 to 11, characterized in that the first set includes M subsets, and each subset includes at least two records; the second set includes M subsets, the address of the m-th subset in the first set corresponds one-to-one to the address of the m-th subset in the second set, M ≥ 2, 1 ≤ m ≤ M, and M and m are integers;
    the acquisition module is specifically configured to obtain the digest of a first subset and the digest of a second subset, where the first subset is any one of the M subsets included in the first set, and the second subset is the subset in the second set whose address corresponds one-to-one to the address of the first subset; and
    the repair module is specifically configured to: if the digest of the first subset differs from the digest of the second subset, determine the addresses of the abnormal records in the first subset, and repair the abnormal records in the first subset according to the second subset.
CN201611069108.1A 2016-11-28 2016-11-28 Method and device for repairing data Active CN108121618B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611069108.1A CN108121618B (en) 2016-11-28 2016-11-28 Method and device for repairing data


Publications (2)

Publication Number Publication Date
CN108121618A true CN108121618A (en) 2018-06-05
CN108121618B CN108121618B (en) 2021-02-12

Family

ID=62225406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611069108.1A Active CN108121618B (en) 2016-11-28 2016-11-28 Method and device for repairing data

Country Status (1)

Country Link
CN (1) CN108121618B (en)


Also Published As

Publication number Publication date
CN108121618B (en) 2021-02-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant