Content of the Invention
To solve the above technical problems, the present invention adopts the following technical solution:
A space-saving data de-duplication method in a cloud storage system, wherein the cloud storage system consists of clients that perform file operations, a metadata server that stores the metadata information of the file system, a secondary metadata server that synchronously backs up the image file and operation log of the metadata, and storage nodes that store the data blocks. The method comprises the following steps:
Step 1: Each client pre-processes the local files to be uploaded, performing local de-duplication at both the file level and the block level to prevent duplicate data from being uploaded again, and then uploads the metadata information of the files to be uploaded to the metadata server;
Step 2: The metadata server receives the metadata information from the different clients, reads the file fingerprints and data block fingerprints in turn, compares them against the fingerprint index information in memory, on hard disk and in the write buffer, and finally returns to each client the fingerprint values of the data that has not yet been uploaded.
Step 3: Each client uploads the data that has not yet been uploaded to the storage end; the storage end stores the new data and updates its metadata information table.
Step 4: A client sends a request for the data to be modified, obtains from the metadata server the number of the storage node where that data resides, then connects to the storage node and modifies the data at the storage end directly.
Step 5: The storage end inspects the modified data block. If, by comparing fingerprint values, the modified block is found to already exist on the local node, it is de-duplicated directly. If the modified block is not on the local node, it is first saved on the local node; if comparison at the metadata server then finds it on another node, it is de-duplicated with a delay. If the modified block is found, by comparing the fingerprint indexes on the local node and the metadata server, to exist neither on the local node nor on any other node, then besides saving the block on the local node, the metadata server also creates a replica for it.
The cloud storage system is characterized in that the metadata server further contains a filtering module and an update module. The filtering module filters the duplicate data information from the different clients; the update module updates the global metadata information of the storage end, that is, it updates the metadata information of duplicate data blocks immediately, while the metadata information of non-duplicate data blocks is updated only after feedback is received from the storage nodes.
The client has a file pre-processing module, a local de-duplication module, a metadata management module and a data transmission module. The file pre-processing module classifies files by type and hands them to the local de-duplication module for file-level de-duplication; the non-duplicate files left after file-level de-duplication are returned to the file pre-processing module for further filtering, which filters out non-duplicate files smaller than 64 MB; the remaining files are finally handed back to the local de-duplication module for block-level de-duplication. The metadata management module records the fingerprint values of the data blocks the client has already uploaded, so as to avoid uploading locally duplicated data. The data transmission module is the client's interface to the metadata server and the storage nodes: it uploads the metadata information of the files to be uploaded to the metadata server and uploads the non-duplicate data blocks to the storage nodes.
The storage node comprises a storage module, a metadata management module, a self-check reporting module and a delayed de-duplication module. The storage module is responsible for storing data blocks and allocating their physical addresses; the metadata management module records the metadata information of the data blocks on the local node; the self-check reporting module detects duplicate data produced by the modification of data blocks and hands it to the delayed de-duplication module, which judges whether a duplicate block is a hot-spot duplicate block, handles it accordingly, and feeds the modified metadata information back to the self-check reporting module, which then reports it to the metadata server.
File-level de-duplication in Step 1: the MD5 algorithm is used to compute file fingerprint values; fingerprints are compared only between files of equal size and type, and are then compared again with the local metadata information table to determine which files are duplicates and which are not;
Block-level de-duplication in Step 1 proceeds as follows: for the non-duplicate files remaining after those smaller than 64 MB have been filtered out, a fixed-length chunking algorithm is used with the block length set to 64 MB; the fingerprint value of each data block is computed with the MD5 algorithm, and blocks of equal length are compared to determine the duplicate data blocks.
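The file-level and block-level fingerprinting of Step 1 can be sketched as follows; this is a minimal illustration, and the function names are our own rather than part of the specification:

```python
import hashlib

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB fixed-length blocks, as in Step 1

def file_fingerprint(path):
    """MD5 fingerprint of a whole file (file-level de-duplication)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def block_fingerprints(path):
    """MD5 fingerprints of fixed-length 64 MB blocks (block-level de-duplication)."""
    fps = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            fps.append(hashlib.md5(block).hexdigest())
    return fps
```

Two files (or blocks) are then treated as duplicates when their fingerprint values are equal, which is why the method only compares blocks of equal length.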
When file fingerprints are compared in Step 2, if a fingerprint value is found to exist already, the fingerprints of that file's data blocks are no longer compared; otherwise the data block fingerprints of the file must also be compared.
In Step 3, each storage end keeps the mapping between the fingerprints of its data blocks and their storage addresses, so that the physical address at which a data block is stored can be determined from its fingerprint.
In Step 4, modifications of data blocks by the multiple users of the clients can introduce new duplicate data blocks, which existing storage systems do not take into account. In a backup system, a user modifies data locally and then backs it up again, and the unmodified parts are filtered out during the backup. Cloud storage, by contrast, gives the user an experience as if the data were local: the user obtains the address of the data to be modified and modifies it directly. This is precisely the difference between cloud storage and a backup system.
In Step 5, delayed de-duplication comprises operations on both hot-spot duplicate data blocks and non-hot-spot duplicate data blocks, judged by the following formula:

\bar{A}_{\bar{i}}(t_{p+1}) = \frac{1}{|Z|-1}\sum_{j \in Z,\, j \neq i}\left[A_j(t_{p+1}) - A_j(t_p)\right] \ge \alpha \quad (1)

In the formula, a certain data block has been modified on node i and found not to be duplicated on node i, while a duplicate of it exists on node j; \bar{A}_{\bar{i}}(t_{p+1}) denotes the average access count of the data block at the storage end, excluding node i, during the period t_{p+1} - t_p; \alpha is a threshold denoting the minimum access count per unit time for a block to become a hot-spot data block; A_j(t_p) and A_j(t_{p+1}) denote the access counts of the data block on node j at times t_p and t_{p+1} respectively; Z is the set of numbers of the nodes on which data block B resides.
Hot-spot duplicate data blocks are then de-duplicated with a delay, to reduce the access response time of the system; for a non-hot-spot duplicate data block, the copy on the node with relatively less remaining storage capacity among the nodes holding it is deleted, to achieve load balancing.
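The two decisions above, the formula (1) hot-spot judgement and the load-balancing choice of which copy to delete, can be sketched as follows. The dictionary layout and function names are illustrative assumptions, not part of the specification:

```python
def is_hotspot(access_counts_prev, access_counts_now, i, alpha):
    """Formula (1): average increase in access count on the nodes other
    than the modifying node i over the period [t_p, t_p+1], compared
    against the hot-spot threshold alpha."""
    others = [j for j in access_counts_now if j != i]
    if not others:
        return False
    avg = sum(access_counts_now[j] - access_counts_prev[j] for j in others) / len(others)
    return avg >= alpha

def node_to_delete_on(residual_capacity, candidate_nodes):
    """Greedy deletion for a non-hot-spot duplicate block: among the
    nodes holding the block, delete the copy on the node with the least
    remaining capacity, freeing space where it is scarcest."""
    return min(candidate_nodes, key=lambda k: residual_capacity[k])
```

For example, with counts rising by 10 and 20 on the two other holding nodes, the average increase is 15, so the block is a hot spot for any threshold alpha of 15 or below.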
Beneficial Effects
1. Existing data de-duplication mainly targets backup and archiving systems, where data is relatively static, and is not suited to cloud storage systems, in which data is shared by multiple users and multi-user modification makes the data increasingly dynamic. Aiming at this dynamic nature, the present invention takes the characteristics of the data itself into account, divides the data into hot-spot data and non-hot-spot data, and applies different de-duplication timings to different data, so as to better guarantee system performance.
2. Compared with existing de-duplication strategies in cloud storage, the present invention combines a replica management mechanism: on the premise of guaranteeing data availability, duplicate hot-spot data blocks are deleted with a delay (being temporarily treated as replicas), which relieves the users' access pressure on hot-spot data blocks within a certain period of time and therefore reduces the system response time more effectively.
3. The present invention also treats a duplicated non-hot-spot data block as a replica: comparing the storage of all the nodes holding copies, the copy on the more heavily loaded node is deleted, so that the storage load is better balanced.
Embodiments
For ease of description, the present invention gives the architecture diagram of the cloud storage de-duplication system, as shown in Figure 1. The system consists of m clients (Client), 1 metadata server (Metadata Server, MS), 1 secondary metadata server (Secondary Metadata Server, SMS) and n storage nodes (Storage Node, Snode). The clients are the parties that initiate operations such as file upload, access, modification and deletion; the metadata server mainly stores all the metadata information of the file system and provides access control and the global basis for de-duplication, being equivalent to the hub of the whole architecture; the secondary metadata server mainly undertakes the synchronous backup of the metadata image file and operation log; the storage nodes are responsible for storing the actual data blocks. In addition, the components of the system are closely connected and cooperate with one another.
Only metadata information is exchanged between the clients and the metadata server, to lighten the load on metadata transmission bandwidth. When a client is to upload data, the metadata server determines which data is non-duplicate; when a client is to access (including modify) data, the metadata server determines the node on which the data resides. Data transmission takes place between the clients and the storage nodes. The storage nodes also interact with the metadata server; for example, the metadata information of data modified on a storage node must be exchanged with the metadata server to determine whether the data is duplicated. Meanwhile, the metadata server can also create replicas for data according to the access pattern on the storage nodes, so as to reduce the access load. In an architecture with only one metadata server, the whole system is paralysed once that server fails; therefore the metadata server and the secondary metadata server stand in an active-standby relationship.
The client mainly has a file pre-processing module, a local de-duplication module, a metadata management module and a data transmission module. The file pre-processing module classifies files by type and, before block-level de-duplication, filters out the non-duplicate files smaller than 64 MB; the local de-duplication module performs de-duplication from the two angles of file level and block level; the metadata management module mainly records the fingerprint values of the data blocks the client has uploaded, to avoid uploading locally duplicated data; the data transmission module uploads the metadata information of the files to be uploaded to the metadata server and uploads the non-duplicate data blocks to the storage nodes. The modules are interconnected: files processed by the file pre-processing module are handed to the local de-duplication module for file-level de-duplication; the non-duplicate files left after file-level de-duplication are returned to the file pre-processing module for further filtering; finally the local de-duplication module performs block-level de-duplication. Whatever concerns metadata information in this process interacts with the metadata management module, and the data transmission module is the client's interface to the metadata server and the storage nodes.
The metadata server has a filtering module and an update module. The filtering module filters out the duplicate data information from the different clients by means of the metadata information in the index tables on the metadata server (distributed over memory and disk) and in the write buffer. For duplicate data blocks, the update module updates the metadata information of the corresponding blocks directly; for non-duplicate data blocks, the update module writes the updated metadata information into the on-disk index table only after feedback has been received from the storage nodes. When the data on a storage node is modified, the node also interacts with the metadata server, thereby triggering the update module to refresh the index table on the metadata server.
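The update module's two-phase behaviour, immediate updates for duplicate blocks versus updates deferred until the storage node confirms the write, can be sketched as follows. The class and method names are illustrative assumptions, not from the specification:

```python
class UpdateModule:
    """Sketch of the metadata server's update module: a duplicate block
    gets an immediate reference-count update, while a non-duplicate
    block is entered into the index only after the storage node's
    feedback arrives."""

    def __init__(self):
        self.index = {}    # fingerprint -> {"refs": count, "node": node id}
        self.pending = {}  # fingerprint -> node id, awaiting confirmation

    def on_upload(self, fingerprint, node):
        if fingerprint in self.index:        # duplicate: update at once
            self.index[fingerprint]["refs"] += 1
            return "duplicate"
        self.pending[fingerprint] = node     # new block: wait for the node
        return "new"

    def on_storage_ack(self, fingerprint):
        """Feedback from the storage node: the block is now safely stored."""
        node = self.pending.pop(fingerprint)
        self.index[fingerprint] = {"refs": 1, "node": node}
```

Deferring the index update for new blocks mirrors the write buffer described in Step 2: a fingerprint whose block is still in flight must not yet appear in the durable index.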
The storage node mainly comprises a storage module, a metadata management module, a self-check reporting module and a delayed de-duplication module. The storage module is mainly responsible for storing data blocks and recording their physical addresses; the metadata management module records the metadata information of the data blocks on the local node; the self-check reporting module mainly detects duplicate data produced by the modification of data blocks, hands it over to the delayed de-duplication module, and reports the modified metadata information to the metadata server; the delayed de-duplication module, upon detecting a duplicate data block, judges whether it is a hot-spot duplicate block: hot-spot duplicate blocks are de-duplicated with a delay, while for non-hot-spot duplicate blocks the identical block on a suitable node is selected for deletion. Whatever concerns metadata information in this module interacts with the metadata management module and the self-check reporting module.
The present invention performs data de-duplication according to the following steps:
Step 1: Each client pre-processes the local files to be uploaded, performing local de-duplication at the file level and block level to prevent duplicate data from being uploaded again, and then uploads the metadata information of the files to be uploaded (including the fingerprint value of each file and the fingerprint values of all its data blocks) to the metadata server. The fingerprint values of duplicate data blocks are uploaded so that the metadata server can update the reference counts of those blocks. The local de-duplication operations are described as follows:
1. file-level data de-duplication:Using MD5 algorithm calculation document fingerprint values, size and the equal text of type are compared
Part fingerprint value, is then compared with local metadata information table, determines duplicate file and non-duplicate file again;
2. block level data de-duplication:For non-duplicate file (having filtered out the file less than 64MB), using calmly
Long block algorithm carries out piecemeal, and block length is set to 64MB, and the fingerprint value of data block is calculated using MD5 algorithms, and it is equal to compare block length
Data block determines repeated data block.
Step 2: The metadata server receives the metadata information from the different clients, reads the file fingerprints and data block fingerprints in turn, compares them against the fingerprint index information in memory, on hard disk and in the write buffer, and finally returns to each client the fingerprint values of the data that has not yet been uploaded.
When file fingerprints are compared, if a fingerprint value is found to exist already, the fingerprints of that file's data blocks are no longer compared; otherwise the data block fingerprints of the file must also be compared. The fingerprint index table is distributed over memory and hard disk, chiefly because memory space is extremely limited, so most of the fingerprint index table is stored on hard disk. In addition, some data block fingerprint values reside in the write buffer, because the storage end has not yet finished storing the new data blocks sent by the clients, so the fingerprints of the new data cannot yet be written to hard disk.
During fingerprint comparison, the present invention sacrifices the time needed to sort files by type and size in order to exploit two observations, namely that "files of the same type and size are very likely similar files" and that "the identical blocks shared by files of different types are almost negligible", thereby continually narrowing the comparison scope.
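This narrowing of the comparison scope can be sketched as a simple bucketing step; the input layout (a list of name, type, size tuples) is an illustrative assumption:

```python
from collections import defaultdict

def group_candidates(files):
    """Narrow the fingerprint-comparison scope: only files sharing both
    type and size are candidate duplicates, so full fingerprints are
    compared only inside each (type, size) bucket."""
    buckets = defaultdict(list)
    for name, ftype, size in files:
        buckets[(ftype, size)].append(name)
    # Only buckets with at least two members need fingerprint comparison.
    return {k: v for k, v in buckets.items() if len(v) > 1}
```

Any file that lands alone in its bucket is known to be non-duplicate without a single fingerprint comparison, which is where the time saved comes from.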
Step 3: Each client uploads the data that has not yet been uploaded to the storage end; the storage end stores the new data and updates its metadata information table.
For duplicate data, the client has already updated its information on the metadata server through Step 1 and Step 2; non-duplicate data is uploaded by the client directly to the storage end. Each storage end keeps the mapping between the fingerprints of its data blocks and their storage addresses, so the physical address at which a data block is stored can be determined from its fingerprint.
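The fingerprint-to-address mapping kept at each storage end can be sketched as follows; the class and attribute names are illustrative assumptions:

```python
class StorageNode:
    """Sketch of a storage node's fingerprint-to-address mapping (Step 3):
    the physical address of a block is resolved directly from its
    fingerprint, and a fingerprint already present is not stored twice."""

    def __init__(self):
        self.blocks = []   # simulated physical storage
        self.addr_of = {}  # fingerprint -> physical address

    def store(self, fingerprint, data):
        if fingerprint not in self.addr_of:  # keep a single physical copy
            self.addr_of[fingerprint] = len(self.blocks)
            self.blocks.append(data)
        return self.addr_of[fingerprint]

    def read(self, fingerprint):
        return self.blocks[self.addr_of[fingerprint]]
```

Because lookup goes through the fingerprint, a repeated store of the same block resolves to the same physical address rather than consuming new space.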
Step 4: A client sends a request for the data to be modified, obtains from the metadata server the number of the storage node where that data resides, then connects to the storage node and modifies the data at the storage end directly.
Clients' modifications of data differ from user to user: users sharing the same data block modify it in different ways, and different data may even be modified into identical data. This is the dynamic nature of cloud storage data, and the difference between cloud storage and a backup system. In a backup system, a user modifies data locally and then backs it up again, filtering out the unmodified parts during the backup; cloud storage gives the user an experience as if the data were local: the user obtains the address of the data to be modified and modifies it directly.
Step 5: The storage end inspects the modified data block, judges which of the three cases in Table 1 the modified block belongs to, and takes the corresponding measures; the specific method and principle are shown in Figure 2.
Table 1: The three cases of a modified data block and the corresponding operations
For a modified data block, its fingerprint value must be recomputed and compared with the metadata information on the local node. If the block is found to exist on the local node, it is de-duplicated directly; if the modified block is not on the local node, it is first saved on the local node, and if comparison at the metadata server then finds it on another node, delayed de-duplication is performed; if, after comparing the fingerprint indexes on the local node and the metadata server, the modified block is found neither on the local node nor on any other node, the metadata server also creates a replica for it. Delayed de-duplication comprises operations on both hot-spot duplicate data blocks and non-hot-spot duplicate data blocks, judged by formula (1): hot-spot duplicate blocks are de-duplicated with a delay to reduce the access response time of the system, while for a non-hot-spot duplicate block the copy on the node with relatively less remaining storage capacity among the nodes holding it is deleted, to achieve load balancing.
For ease of understanding, some concepts are defined here:
Hot-spot data block: a data block whose average access frequency reaches a certain threshold within a period of time, i.e. satisfies formula (1). A data block that does not satisfy the condition is called a non-hot-spot data block.
Hot-spot duplicate data block: a modified data block A' that is not found on the local node, but for which an identical data block A is found on another node, where A is a hot-spot data block; A' is then called a hot-spot duplicate data block.
Non-hot-spot duplicate data block: a modified data block B' that is not found on the local node, but for which an identical data block B is found on another node, where B is a non-hot-spot data block; B' is then called a non-hot-spot duplicate data block.
With reference to Figure 3, the present invention further gives, for Step 5, the specific processing steps performed by the storage end when a user modifies a data block on storage node i (i = 1, 2, 3, ..., n):
1. Request (modification request): after node i receives a client's request to modify a certain data block (denoted A), it reads a copy of block A into memory;
2. Modify: node i modifies data block A in memory (the modified block is denoted B), decrements the reference count of A by 1, and computes the fingerprint value of B with the MD5 algorithm;
3. Check (duplicate detection): node i quickly searches locally for the fingerprint value of B, to avoid storing duplicate data. If it is absent, jump to step 5; otherwise denote the block in node i identical to B as B' and go to the next step;
4. Deduplicate: data block B is deleted and its storage is replaced by a pointer to data block B';
5. Store: the modified new data block B is stored on node i and the local metadata information table of node i is updated;
6. Check (duplicate detection): node i periodically sends the updated metadata information to the metadata server, which judges whether an identical block exists on another node j (j ≠ i). If one is found, jump to step 8; otherwise go to the next step;
7. Replica: the metadata server creates a replica for the new data block B;
8. Classification: the metadata server judges by formula (1) whether the duplicate data block B is a hot-spot duplicate block; if so, jump to step 10, otherwise go to the next step;
\bar{A}_{\bar{i}}(t_{p+1}) = \frac{1}{|Z|-1}\sum_{j \in Z,\, j \neq i}\left[A_j(t_{p+1}) - A_j(t_p)\right] \ge \alpha \quad (1)

In the formula, at time t_{p+1} a certain data block has been modified on node i and found not to be duplicated on node i, while a duplicate of it exists on node j; \bar{A}_{\bar{i}}(t_{p+1}) denotes the average access count of the data block at the storage end (excluding node i) during the period t_{p+1} - t_p; \alpha is a threshold denoting the minimum access count per unit time for a block to become a hot-spot data block; A_j(t_p) and A_j(t_{p+1}) denote the access counts of the data block on node j at times t_p and t_{p+1} respectively; Z is the set of numbers of the nodes where data block B resides.
9. Greedy deletion: at time t_{p+1}, compare the residual capacities S_k(t_{p+1}) of the nodes k (k ∈ Z) holding the non-hot-spot duplicate data block B with the average residual capacity \bar{S}(t_{p+1}), and always delete the copy of data block B on the node with relatively less residual capacity; then update the metadata server. The average residual capacity of the storage end at time t_{p+1} is obtained as in formula (2):

\bar{S}(t_{p+1}) = \frac{1}{n}\sum_{m=1}^{n} S_m(t_{p+1}) \quad (2)

where S_m(t_{p+1}) is the residual storage capacity of node m at time t_{p+1} and n is the total number of nodes at the storage end.
10. Delayed deletion: at time t_{p+1}, hot-spot data block B is not deleted; the metadata of block B is synchronized on node j, and at the next time t_{p+2} step 8 is performed again.
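The decision flow of steps 1 to 10 above can be condensed into a single sketch. This compresses the protocol under simplifying assumptions (the read of A and the decrement of its reference count are omitted, `node` maps fingerprints to entries, `metadata_server` maps fingerprints to the set of holding nodes, and the formula (1) judgement is supplied by the caller); all names are illustrative, not from the specification:

```python
import hashlib

def handle_modification(node, new_data, metadata_server, is_hotspot):
    """Sketch of the storage-end flow of Figure 3 for a modified block B."""
    fp = hashlib.md5(new_data).hexdigest()               # step 2: fingerprint of B
    if fp in node["blocks"]:                             # steps 3-4: local duplicate B'
        node["blocks"][fp]["refs"] += 1                  # replace B by a pointer to B'
        return "deduplicated locally"
    node["blocks"][fp] = {"data": new_data, "refs": 1}   # step 5: store B locally
    holders = metadata_server.get(fp, set())             # step 6: ask the metadata server
    if not holders:                                      # step 7: globally unique
        return "replica created"
    if is_hotspot(fp):                                   # steps 8 and 10: keep for now
        return "delayed deletion"
    return "greedy deletion"                             # step 9: delete on the fullest node
```

Each return value names the branch taken; in the full method, "greedy deletion" would then invoke the formula (2) capacity comparison to pick the node on which the copy is removed.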