CN104933169B

CN104933169B - File system defragmentation method based on hotspot file priority

Info

Publication number: CN104933169B
Application number: CN201510372541.1A
Authority: CN
Inventors: 李旭东
Original assignee: Nankai University
Current assignee: Nankai University
Priority date: 2015-06-29
Filing date: 2015-06-29
Publication date: 2018-05-01
Anticipated expiration: 2035-06-29
Also published as: CN104933169A

Abstract

The invention discloses a file system defragmentation method based on hotspot file priority and belongs to the computer field. The method defragments hotspot files first, which shortens file system defragmentation time and improves its efficiency. The defragmentation method comprises: obtaining the data block status of the disk partition where the file system resides, building a "contiguous free data block area" linked list, a "file" linked list and "contiguous file data block area" linked lists, and building sort indexes on them; setting the "thresholds" for file defragmentation, building a "files to be defragmented" linked list, and adding the highly fragmented hotspot files to that list; and traversing the "files to be defragmented" linked list and defragmenting each file, where, if no set of "contiguous free data block areas" meeting the requirements can be found for a file, free area defragmentation is performed first and the file is then defragmented. The invention can effectively reduce file system defragmentation time and improve the access performance of the file system.

Description

File system defragmentation method based on hotspot file priority

Technical field

The invention belongs to the computer field and is mainly concerned with improving the defragmentation efficiency of disk file systems and the access performance of file systems.

Background art

Disks will remain the main storage medium for massive data now and for a long time to come. After large numbers of file create, delete and modify operations, a file system inevitably suffers from reduced file access performance caused by file fragmentation, so defragmentation of disk file systems is essential. However, existing file system defragmentation is time-consuming and substantially cuts into the time during which the user can use the system.

Summary of the invention

The object of the invention is, on the one hand, to improve the defragmentation efficiency of disk file systems and, on the other hand, to improve the access performance of file systems. To this end the invention provides a file system defragmentation method based on hotspot file priority.

Technical solution of the present invention

The file system defragmentation method based on hotspot file priority comprises the following steps:

Step 1: according to the data block status of the disk partition where the file system resides, build a "contiguous free data block area" linked list. The list consists of "contiguous free data block area" items, each of which includes a starting data block number, a number of contiguous free data blocks, a list pointer for the ordering by starting data block number, and a list pointer for the ordering by number of contiguous free data blocks. Build sort indexes on the "contiguous free data block area" linked list;

Step 2: according to the directory and file metadata of the file system, build a "file" linked list and the "contiguous file data block area" linked lists. Traverse the "file" linked list and compute the fragmentation statistics of each file, which include the total number of data blocks, fragment count, last access time, access frequency, fragmentation degree and weighted fragmentation degree. Then build sort indexes on the "file" linked list and the "contiguous file data block area" linked lists;

Step 3: set the "thresholds" for file defragmentation, including the maximum non-access time "threshold", the access frequency "threshold", the fragmentation degree "threshold" and the weighted fragmentation degree "threshold";

Step 4: build the "files to be defragmented" linked list. For each file in the "file" linked list, set its "defragmentation status" to "not started" and decide, according to the file's fragmentation degree and access frequency, whether to add it to the "files to be defragmented" linked list; then build a sort index on the "files to be defragmented" linked list. The decision whether to add a file is made by one of the following methods:

Method 1 for adding highly fragmented hotspot files to the "files to be defragmented" linked list uses the file's weighted fragmentation degree as the criterion for a highly fragmented hotspot file: if the weighted fragmentation degree of the file is greater than or equal to the weighted fragmentation degree "threshold", the file is added to the "files to be defragmented" linked list, and the list is sort-indexed by weighted fragmentation degree;

Method 2 for adding highly fragmented hotspot files to the "files to be defragmented" linked list is as follows: if the access frequency of the file is greater than or equal to the access frequency "threshold" and the fragmentation degree of the file is greater than or equal to the fragmentation degree "threshold", the file is added to the "files to be defragmented" linked list, and the list is sort-indexed by access frequency and fragmentation degree.

Step 5: for the "files to be defragmented" linked list, perform the following sub-steps:

Step 5.1: if the "files to be defragmented" linked list is empty, set the "system defragmentation status" to success and go to step 6; otherwise take the first file in the list as the "current file to be defragmented";

Step 5.2: set the "defragmentation status" of the "current file to be defragmented" to "in progress"; then, according to the fragmentation degree and weighted fragmentation degree of the "current file to be defragmented" from step 5.1 and the weighted fragmentation degree "threshold" for file defragmentation, determine the "target maximum fragment count" of the file after defragmentation;

Step 5.3: in the "contiguous free data block area" linked list, find a set of "contiguous free data block areas" that can hold the "current file to be defragmented", where the number of areas in the set is less than or equal to the "target maximum fragment count". If a set meeting the requirements is found, record the "original data block sequence" and the "new data block sequence" of the "current file to be defragmented"; if no such set can be found, go to step 5.9;

Step 5.4: according to the "original data block sequence" and "new data block sequence" of the "current file to be defragmented", copy the contents of the data blocks in the original sequence, in order, to the data blocks of the new sequence;

Step 5.5: update the "contiguous file data block area" linked list of the "current file to be defragmented" and the file's on-disk storage metadata, and update the sort index of its "contiguous file data block area" linked list;

Step 5.6: reclaim the data blocks of the "original data block sequence" of the "current file to be defragmented" as free blocks, merge them into the "contiguous free data block area" linked list, and update the sort index of that list;

Step 5.7: recompute the fragmentation statistics of the "current file to be defragmented";

Step 5.8: set the "defragmentation status" of the "current file to be defragmented" to "success", remove the file from the "files to be defragmented" linked list, and go to step 5.1;

Step 5.9: according to the "contiguous free data block area" linked list, perform free area defragmentation. If free area defragmentation succeeds, go to step 5.3; otherwise set the "defragmentation status" of the "current file to be defragmented" to "failed", set the "system defragmentation status" to failed, and go to step 6;

Step 6: defragmentation of this file system ends.
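The loop of steps 5.1 to 5.8 can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions, not the patented implementation: a file is modelled as a list of physical block numbers in logical order, free space as a set of block numbers, and the names `contiguous_runs` and `defragment_hot_files` are hypothetical. The sketch also stops at the first failure instead of attempting the free area defragmentation of step 5.9.

```python
def contiguous_runs(blocks):
    """Group block numbers (in the given order) into runs of consecutive numbers."""
    runs = []
    for b in blocks:
        if runs and runs[-1][-1] + 1 == b:
            runs[-1].append(b)
        else:
            runs.append([b])
    return runs


def defragment_hot_files(files, free_blocks, pending, target_max_fragments=1):
    """files: {name: [physical block numbers in logical order]}
    free_blocks: set of free block numbers (modified in place)
    pending: names of the files to defragment, already ordered (output of step 4)
    Returns {name: "success" | "failed"}."""
    status = {}
    for name in pending:                                   # step 5.1
        old_blocks = files[name]                           # step 5.2
        needed = len(old_blocks)
        # Step 5.3 (assumed strategy): try the largest free runs first and
        # use at most target_max_fragments of them.
        runs = sorted(contiguous_runs(sorted(free_blocks)), key=len, reverse=True)
        candidate = [b for run in runs[:target_max_fragments] for b in run]
        new_blocks = candidate[:needed]
        if len(new_blocks) < needed:
            status[name] = "failed"   # the method would first try step 5.9 here
            break
        # Steps 5.4 and 5.5: the copy and metadata update are modelled as
        # replacing the file's block list.
        files[name] = new_blocks
        free_blocks.difference_update(new_blocks)
        free_blocks.update(old_blocks)                     # step 5.6
        status[name] = "success"                           # step 5.8
    return status


# Hypothetical layout: F2 occupies blocks 7, 8 and 10 (two fragments).
files = {"F1": [1, 2, 6, 11], "F2": [7, 8, 10]}
free_blocks = {3, 4, 5, 9, 12}
status = defragment_hot_files(files, free_blocks, ["F2"])
print(status, files["F2"], sorted(free_blocks))
```

In this hypothetical layout F2 ends up in blocks 3 to 5 as a single fragment, and its old blocks return to the free set.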

In the method, each "contiguous file data block area" item in the "contiguous file data block area" linked list of step 2 includes a starting file logical block number, a starting data block number, a number of contiguous file data blocks, a list pointer for the ordering by starting file logical block number, a list pointer for the ordering by starting data block number, and a list pointer for the ordering by number of contiguous file data blocks.

In the method, each "file" item in the "file" linked list of step 2 includes a file ID number, total number of data blocks, fragment count, last access time, access frequency, fragmentation degree, weighted fragmentation degree, a "contiguous file data block area" list head, a list pointer for the ordering by weighted fragmentation degree, a to-be-defragmented file list pointer, and a defragmentation status. The "contiguous file data block area" list head points to the first item of the file's "contiguous file data block area" linked list as ordered by starting file logical block number. The to-be-defragmented file list pointers in the "file" items form the "files to be defragmented" linked list.

The fragmentation statistics of each file include the total number of data blocks, fragment count, last access time, access frequency, fragmentation degree and weighted fragmentation degree. The fragment count of a file is the total number of its contiguous data block areas. The "access frequency" of a file is determined by its "last access time": the closer the "last access time" is to the current time, the larger the "access frequency", and when the time elapsed since the "last access time" exceeds the maximum non-access time "threshold", the "access frequency" is zero. The "fragmentation degree" of a file is determined by the ratio of its fragment count to its total number of data blocks. The "weighted fragmentation degree" is determined by the "fragmentation degree" and the "access frequency" and is proportional to both.

In step 4, for each file in the "file" linked list, whether to add the file to the "files to be defragmented" linked list is decided according to the file's fragmentation degree and access frequency, and the method defragments only the files in the "files to be defragmented" linked list. By adding highly fragmented hotspot files to that list, the method supports priority defragmentation of hotspot files;

In method 1 for adding highly fragmented hotspot files to the "files to be defragmented" linked list, the number of files to be defragmented is adjusted by adjusting the weighted fragmentation degree "threshold" for file defragmentation;

In method 2 for adding highly fragmented hotspot files to the "files to be defragmented" linked list, the number of files to be defragmented is adjusted by adjusting the access frequency "threshold" and the fragmentation degree "threshold".

In the method, each "contiguous free data block area" item in the "contiguous free data block area" linked list of step 1 includes a starting data block number, a number of contiguous data blocks, a sort index pointer for the ordering by data block number, and a sort index pointer for the ordering by number of contiguous data blocks. Sorting by number of contiguous data blocks improves the efficiency, in step 5.3, of finding a set of "contiguous free data block areas" that can hold the "current file to be defragmented"; sorting by data block number improves the efficiency of the free area defragmentation in step 5.9.
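To illustrate why an ordering by size helps step 5.3, the following Python sketch searches for a set of free areas under a fragment limit. The largest-first greedy choice is an assumption made for this example; the text only requires that the set hold the file and contain no more areas than the target maximum fragment count. The function name `find_free_area_set` and the tuple layout are hypothetical.

```python
def find_free_area_set(free_areas, blocks_needed, max_fragments):
    """free_areas: list of (start_block, block_count) tuples.
    Returns a list of (start_block, blocks_used) covering blocks_needed blocks
    with at most max_fragments areas, or None if no such set exists."""
    # Ordering by block count, largest first (the size-ordered index in the text).
    by_count = sorted(free_areas, key=lambda area: area[1], reverse=True)
    chosen = []
    remaining = blocks_needed
    for start, count in by_count[:max_fragments]:
        if remaining <= 0:
            break
        take = min(count, remaining)
        chosen.append((start, take))
        remaining -= take
    return chosen if remaining <= 0 else None


# Hypothetical free areas: one of 3 blocks and three single blocks.
free_areas = [(3, 3), (8, 1), (10, 1), (12, 1)]
print(find_free_area_set(free_areas, 3, 1))   # [(3, 3)]
print(find_free_area_set(free_areas, 4, 1))   # None
print(find_free_area_set(free_areas, 4, 2))   # [(3, 3), (8, 1)]
```

If the `max_fragments` largest areas together cannot hold the file, no other choice of that many areas can, so a `None` result corresponds to the case in which step 5.3 hands over to step 5.9.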

A method for free area defragmentation of the file system comprises the following steps:

Step 1: set the free area defragmentation target, namely the "contiguous free data block area" total;

Step 2: according to the "contiguous free data block area" linked list, determine the "free area defragmentation range" for this run that satisfies the free area defragmentation target of step 1; the range includes a "starting data block number" and an "ending data block number". If no "free area defragmentation range" satisfying step 1 can be found, set the "free area defragmentation status" to failed and go to step 6;

Step 3: traverse all data block numbers in order from the "starting data block number" of step 2 to the "ending data block number"; set the "current data block number to be processed" to the starting data block number and perform the following sub-steps:

Step 3.1: determine whether the "current data block number to be processed" is greater than the "ending data block number"; if so, set the free area defragmentation status to success and go to step 4;

Step 3.2: according to the "contiguous free data block area" linked list, determine whether the block with the "current data block number to be processed" is a free block; if it is, increase the "current data block number to be processed" by 1 and go to step 3.9;

Step 3.3: according to the "file" linked list and the "contiguous file data block area" linked lists, obtain the file and the "contiguous file data block area" to which the data block with the "current data block number to be processed" belongs;

Step 3.4: according to the "contiguous free data block area" linked list, find, outside the "free area defragmentation range", as many free data blocks as the "number of contiguous file data blocks" of the "contiguous file data block area" from step 3.3, and use them as the "new data block sequence"; if no "new data block sequence" meeting the requirements is found, set the free area defragmentation status to failed and go to step 4;

Step 3.5: record the "original data block sequence" and the "new data block sequence" of the "contiguous file data block area" from step 3.3; record the file that owns the data block with the "current data block number to be processed" in the "associated files" list;

Step 3.6: copy the contents of the data block sequence of the "contiguous file data block area" from step 3.3 to the "new data block sequence" of step 3.4; update the "contiguous file data block area" linked list and the on-disk storage metadata of the file that owns the data block from step 3.3;

Step 3.7: reclaim the "original data block sequence" of step 3.5 as free blocks and merge them into the "contiguous free data block area" linked list;

Step 3.8: set the "current data block number to be processed" to the "current data block number to be processed" plus the number of data blocks in the "original data block sequence" of step 3.5;

Step 3.9: go to step 3.1;

Step 4: traverse the "associated files" list; for each file, recompute the file's fragment count, fragmentation degree and weighted fragmentation degree, update the sort index of the file's "contiguous file data block area" linked list, and update the file's position in the sort index of the "file" linked list; update the sort index of the "contiguous free data block area" linked list;

Step 5: set the "free area defragmentation status" to success;

Step 6: free area defragmentation ends.
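Step 3 of the free area defragmentation can be sketched in Python as below. The sketch makes several simplifying assumptions: the disk is a list indexed by block number holding the owning file name or `None` for a free block, a run of adjacent blocks with the same owner stands in for a "contiguous file data block area", and metadata updates are omitted. The function name `clear_range` is hypothetical.

```python
def clear_range(disk, range_start, range_end):
    """disk[b] is the name of the file owning block b, or None if b is free
    (index 0 is unused). Moves every used block inside the range to free
    blocks outside it. Returns (success, set of files that were moved)."""
    moved_files = set()
    current = range_start
    while current <= range_end:                       # step 3.1
        owner = disk[current]
        if owner is None:                             # step 3.2
            current += 1
            continue
        # Step 3.3 (simplified): the run of blocks with the same owner.
        run_end = current
        while run_end + 1 < len(disk) and disk[run_end + 1] == owner:
            run_end += 1
        run_length = run_end - current + 1
        # Step 3.4: free blocks outside the range form the new block sequence.
        outside_free = [b for b in range(1, len(disk))
                        if disk[b] is None and not range_start <= b <= range_end]
        targets = outside_free[:run_length]
        if len(targets) < run_length:
            return False, moved_files
        # Steps 3.5 to 3.7: copy the blocks and release the originals.
        for src, dst in zip(range(current, run_end + 1), targets):
            disk[dst] = owner
            disk[src] = None
        moved_files.add(owner)
        current = run_end + 1                         # step 3.8
    return True, moved_files


# Hypothetical 12-block partition; blocks 9 to 12 are the chosen range.
disk = [None, "A", "A", None, None, None, "B", "B", "C", None, None, "C", None]
ok, moved = clear_range(disk, 9, 12)
print(ok, moved, disk[9:13])
```

After the call, blocks 9 to 12 form one contiguous free area of four blocks, and the files in `moved` are the ones whose statistics step 4 would recompute.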

In step 1 of the free area defragmentation method, the free area defragmentation target is set as follows: if a "current file to be defragmented" has been set, the target "contiguous free data block area" total should at least be greater than the total number of data blocks of the "current file to be defragmented"; otherwise the target should at least be greater than the total number of data blocks of the first file in the "files to be defragmented" linked list.

In step 2 of the free area defragmentation method, the "free area defragmentation range" is determined as follows: over the whole data block range of the disk partition where the file system resides, starting from the first data block, the partition is divided into several "contiguous data block areas" whose size is the free area defragmentation target, i.e. the "contiguous free data block area" total; then, according to the "contiguous free data block area" linked list, the "contiguous data block area" that contains the most free data blocks is chosen as the "free area defragmentation range".
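The range selection described above might look like the following Python sketch. The function name `choose_range`, the tuple layout of the free areas and the handling of a shorter final window are assumptions made for illustration.

```python
def choose_range(free_areas, total_blocks, window_size, first_block=1):
    """Split blocks first_block .. first_block + total_blocks - 1 into windows
    of window_size blocks and return ((start, end), free_count) for the window
    that contains the most free blocks. free_areas: (start_block, block_count)."""
    last_block = first_block + total_blocks - 1
    best_window, best_free = None, -1
    start = first_block
    while start <= last_block:
        end = min(start + window_size - 1, last_block)
        free_in_window = 0
        for area_start, area_count in free_areas:
            area_end = area_start + area_count - 1
            overlap = min(end, area_end) - max(start, area_start) + 1
            if overlap > 0:
                free_in_window += overlap
        if free_in_window > best_free:       # ties keep the earliest window
            best_window, best_free = (start, end), free_in_window
        start = end + 1
    return best_window, best_free


# Hypothetical 12-block partition with free areas at 3-5, 9-10 and 12.
print(choose_range([(3, 3), (9, 2), (12, 1)], total_blocks=12, window_size=4))
# ((9, 12), 3)
```

Choosing the window that already holds the most free blocks keeps the number of blocks that must be moved out of the range as small as possible.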

The terms used in the invention have the following meanings:

(1) Access frequency: one of the criteria for deciding whether a file is a frequently accessed hotspot file. The "access frequency" of a file is determined by its "last access time": the closer the "last access time" is to the current time, the larger the "access frequency", and when the time elapsed since the "last access time" exceeds the maximum non-access time "threshold", the "access frequency" is zero;

The "access frequency" can be computed as follows: when the time elapsed since the "last access time" exceeds the maximum non-access time "threshold", the "access frequency" is zero; otherwise "access frequency" = maximum non-access time "threshold" - ("current time" - "last access time").

(2) Fragment: several data blocks with contiguous addresses are collectively called one fragment; several file data blocks with contiguous addresses are collectively called one file fragment; several free data blocks with contiguous addresses are collectively called one free area fragment.

(3) Fragmentation degree: one of the criteria for measuring how fragmented a file is. The "fragmentation degree" of a file is the ratio of the file's total number of contiguous data block areas, i.e. its fragment count, to its total number of data blocks; it is at most 1 and at least close to 0. A larger fragmentation degree means the file is more severely fragmented and applications access it less efficiently.

(4) Weighted fragmentation degree: a combined criterion measuring both how fragmented a file is and how frequently it is accessed. The weighted fragmentation degree is determined by the "fragmentation degree" and the "access frequency" and is generally proportional to both; it can simply be the product of the "fragmentation degree" and the "access frequency". A larger weighted fragmentation degree means the file is both frequently accessed and severely fragmented.

(5) Free area fragmentation degree: one of the criteria for measuring how fragmented the set of free data blocks in the file system is. The free area fragmentation degree is the ratio of the total number of contiguous free block areas in the whole file system, i.e. the free fragment count, to the total number of free blocks in the whole file system; it is at most 1 and at least close to 0. A larger free area fragmentation degree means the free space of the file system is more severely fragmented and applications append file content less efficiently.
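The definitions above can be written compactly as formulas. The symbols are introduced here only for illustration and do not appear in the original text: $A$ is the access frequency, $T_{\max}$ the maximum non-access time threshold, $t_{\text{now}}$ and $t_{\text{last}}$ the current and last access times, $D$ the fragmentation degree, $W$ the weighted fragmentation degree (using the product form that the text names as one option), and $D_{\text{free}}$ the free area fragmentation degree. The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
A &= \begin{cases}
       0, & t_{\text{now}} - t_{\text{last}} > T_{\max} \\
       T_{\max} - (t_{\text{now}} - t_{\text{last}}), & \text{otherwise}
     \end{cases} \\
D &= \frac{N_{\text{fragments}}}{N_{\text{blocks}}}, \qquad 0 < D \le 1 \\
W &= D \cdot A \\
D_{\text{free}} &= \frac{N_{\text{free fragments}}}{N_{\text{free blocks}}}, \qquad 0 < D_{\text{free}} \le 1
\end{align*}
```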

Advantages and positive effects of the invention:

On the one hand, files that are accessed infrequently do not really affect the access performance of the file system; on the other hand, heavily fragmented files can have very poor access performance, so the fragmentation of frequently accessed hotspot files must be reduced.

The invention can effectively reduce file system defragmentation time, because infrequently accessed files need not be defragmented; at the same time it effectively improves the access performance of the file system, because the fragmentation of frequently accessed hotspot files is greatly reduced.

Brief description of the drawings

Fig. 1 is a flow chart of the file system defragmentation method based on hotspot file priority.

Fig. 2 is a schematic diagram of the structures related to the "contiguous free data block area" linked list.

Fig. 3 is a schematic diagram of the structures related to the "contiguous file data block area" linked list.

Fig. 4 is a schematic diagram of the structures related to the "file" linked list.

Fig. 5 is a schematic diagram of the data block status of the disk partition at time T1.

Fig. 6 is a schematic diagram of the data block status of the disk partition at time T2, i.e. after file F2 has been defragmented following time T1.

Fig. 7 is a schematic diagram of the data block status of the disk partition at time T3, i.e. after free area defragmentation has been performed following time T2.

Embodiment

Embodiment 1,

The file system defragmentation method based on hotspot file priority comprises the following steps:

Operating systems generally provide low-level I/O interfaces for obtaining the data block status of a specified disk partition. For example, on the Windows platform, information about an NTFS disk partition can be obtained with the Win32 API DeviceIoControl function and suitable parameters, which return the data block status bitmap of the specified disk partition and the data structure holding the basic information of the NTFS partition.

The "contiguous free data block area" linked list is formed by linking "contiguous free data block area" items in sequence, as shown in Fig. 2. Each "contiguous free data block area" item includes a starting data block number, a number of contiguous free data blocks, a list pointer for the ordering by starting data block number, and a list pointer for the ordering by number of contiguous free data blocks. The items can also be organized in a tree structure such as a "contiguous free data block area" binary tree, on which sort indexes over the key fields above can likewise be built;

As shown in Fig. 5, the "contiguous free data block area" linked list information can be obtained; it contains 4 "contiguous free data block areas". The information of the 1st "contiguous free data block area" includes its starting data block number (3), its number of contiguous free data blocks (3), and so on; the information of the other "contiguous free data block areas" is not repeated here.

The "contiguous file data block area" linked list is formed by linking "contiguous file data block area" items in sequence, as shown in Fig. 3. Each "contiguous file data block area" item includes a starting file logical block number, a starting data block number, a number of contiguous file data blocks, a list pointer for the ordering by starting file logical block number, a list pointer for the ordering by starting data block number, and a list pointer for the ordering by number of contiguous file data blocks. The items can also be organized in a tree structure such as a "contiguous file data block area" binary tree, on which sort indexes over the key fields above can likewise be built;

The "file" linked list is formed by linking "file" items in sequence, as shown in Fig. 4. Each "file" item includes a file ID number, total number of data blocks, fragment count, last access time, access frequency, fragmentation degree, weighted fragmentation degree, a "contiguous file data block area" list head, a list pointer for the ordering by weighted fragmentation degree, a to-be-defragmented file list pointer, and a defragmentation status. The "contiguous file data block area" list head points to the first item of the file's "contiguous file data block area" linked list as ordered by starting file logical block number. The "file" items can also be organized in a tree structure such as a "file" binary tree, on which sort indexes over the key fields above can likewise be built;

As shown in Fig. 5, the "file" linked list information can be obtained; it contains 5 "files". The "contiguous file data block area" linked list of file F1 in turn contains 3 "contiguous file data block areas". The information of the 1st "contiguous file data block area" includes its starting file logical block number (1), starting data block number (1), number of contiguous file data blocks (2), etc.; that of the 2nd includes its starting file logical block number (3), starting data block number (6), number of contiguous file data blocks (1), etc.; that of the 3rd includes its starting file logical block number (4), starting data block number (11), number of contiguous file data blocks (1), etc. The information of the "contiguous file data block areas" of the other files is not repeated here;
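The three areas listed for file F1 follow directly from the file's block layout. The Python sketch below derives them from the physical block numbers implied by the text (logical blocks 1 to 4 stored in disk blocks 1, 2, 6 and 11); the class and function names are hypothetical, and the list pointers of the real items are left out.

```python
from dataclasses import dataclass


@dataclass
class FileExtent:
    logical_start: int    # starting file logical block number
    physical_start: int   # starting data block number on disk
    count: int            # number of contiguous file data blocks


def build_extents(physical_blocks, first_logical=1):
    """physical_blocks[i] is the disk block that stores logical block
    first_logical + i of the file."""
    extents = []
    for i, block in enumerate(physical_blocks):
        last = extents[-1] if extents else None
        if last is not None and last.physical_start + last.count == block:
            last.count += 1          # the block extends the current area
        else:
            extents.append(FileExtent(first_logical + i, block, 1))
    return extents


f1_extents = build_extents([1, 2, 6, 11])
for extent in f1_extents:
    print(extent)
# FileExtent(logical_start=1, physical_start=1, count=2)
# FileExtent(logical_start=3, physical_start=6, count=1)
# FileExtent(logical_start=4, physical_start=11, count=1)
```

The number of extents returned is the file's fragment count, here 3.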

As shown in Fig. 5, the "file" item of file F1 can be obtained. It includes the file ID number (known from the file system metadata and used as the unique identifier of the file), total number of data blocks (4), fragment count (3), last access time (known from the file system metadata; assumed in this example to be 1 year ago), access frequency (derived from the last access time; with the maximum non-access time "threshold" set to 30 days, the access frequency of this file is 0), fragmentation degree (3/4 = 0.75), weighted fragmentation degree (3/4 × 0 = 0), etc.;

As shown in Fig. 5, the "file" item of file F2 can be obtained. It includes the file ID number (known from the file system metadata and used as the unique identifier of the file), total number of data blocks (3), fragment count (2), last access time (known from the file system metadata; assumed in this example to be 1 day ago), access frequency (derived from the last access time; with the maximum non-access time "threshold" set to 30 days and the formula given above under the term access frequency, the access frequency of this file is 30 - 1 = 29), fragmentation degree (2/3 ≈ 0.67), weighted fragmentation degree (0.67 × 29 ≈ 19), etc.;

The computation of the "file" items of the other files is not repeated here.
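The figures for F1 and F2 can be reproduced with a few lines of Python. The function names are hypothetical, times are expressed in days, and the weighted value uses the product form that the text gives as one possible definition.

```python
def access_frequency(days_since_access, max_idle_days):
    """Zero once the file has been idle longer than the threshold,
    otherwise threshold minus idle time."""
    if days_since_access > max_idle_days:
        return 0
    return max_idle_days - days_since_access


def fragmentation_degree(fragment_count, block_count):
    return fragment_count / block_count


def weighted_fragmentation_degree(fragment_count, block_count,
                                  days_since_access, max_idle_days):
    return (fragmentation_degree(fragment_count, block_count)
            * access_frequency(days_since_access, max_idle_days))


MAX_IDLE_DAYS = 30   # maximum non-access time threshold used in the example

# F1: 4 blocks, 3 fragments, last accessed about one year ago.
f1_weighted = weighted_fragmentation_degree(3, 4, 365, MAX_IDLE_DAYS)
# F2: 3 blocks, 2 fragments, last accessed one day ago.
f2_weighted = weighted_fragmentation_degree(2, 3, 1, MAX_IDLE_DAYS)

print(f1_weighted)             # 0.0
print(round(f2_weighted, 2))   # 19.33
```

Without intermediate rounding the weighted value for F2 is 58/3, about 19.33, which the example rounds to 19.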

The maximum non-access time "threshold" may be set to 30 (days); the access frequency "threshold" may be set to 21; the fragmentation degree "threshold" may be set to 0.2; the weighted fragmentation degree "threshold" may be set to 3;

As shown in Fig. 5, judging from the information computed for each file in step 2, file F1 has a high fragmentation degree but a very low access frequency, so under the file defragmentation method provided by the invention there is no need to defragment F1. File F2 has a lower fragmentation degree than F1 but a high access frequency, so its weighted fragmentation degree (19) exceeds the weighted fragmentation degree "threshold" and F2 must be defragmented with priority. Compared with F1, F2 is a frequently accessed hotspot file that contains considerable file fragmentation.

As shown in Fig. 4, each "file" item in the "file" linked list contains a to-be-defragmented file list pointer. The "files to be defragmented" linked list is formed by linking, in sequence, these pointers in the "file" items of the files that meet the defragmentation criteria. The "files to be defragmented" can also be organized from "file" items into a tree structure such as a binary tree, on which sort indexes over the key fields above can likewise be built;

Method 1 for adding highly fragmented hotspot files to the "files to be defragmented" linked list uses the file's weighted fragmentation degree as the criterion for a highly fragmented hotspot file: if the weighted fragmentation degree of the file is greater than or equal to the weighted fragmentation degree "threshold", the file is added to the "files to be defragmented" linked list, and the list is sort-indexed by weighted fragmentation degree. The number of files to be defragmented is adjusted by adjusting the weighted fragmentation degree "threshold" for file defragmentation;

As shown in Fig. 5 and continuing the example of steps 2 and 3, method 1 directly compares each file's weighted fragmentation degree with the weighted fragmentation degree "threshold": file F1 cannot be added to the "files to be defragmented" linked list, while file F2 can;

The high hot spot file of fragment degree is added to " file for treating defragmentation " if the method 2 of chained list is the visit of this document Ask that frequency is more than or equal to visiting frequency " threshold value " and the fragment degree of file then adds this document more than or equal to fragment degree " threshold value " " file for treating defragmentation " chained list, and by visiting frequency and fragment degree to " file for treating defragmentation " chained list ranking index, The number of " file for treating defragmentation " is adjusted by adjusting visiting frequency " threshold value " and fragment degree " threshold value "；

As shown in figure 5, the example with reference to given by the 2nd and 3 steps, according to said method 2, be respectively compared visiting frequency " threshold value " and Fragment degree " threshold value "；File F1 cannot add " file for treating defragmentation " chained list, and file F2 can be added and " be treated defragmentation File " chained list；But the linked list order of " file for treating defragmentation " chained list according to said method given by 2 namely next text File defragmentation sequencing may be different in part de-fragmenting steps；
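The two selection methods can be sketched as follows. This is a minimal Python illustration under stated assumptions: the statistics assigned to files F1 to F5 and the threshold values are invented for the example (the values in Figure 5 are not reproduced here), the names `FileStats`, `select_method_1`, and `select_method_2` are hypothetical, and sorting in descending order is an assumed reading of "hot spot files first".

```python
from typing import List, NamedTuple


class FileStats(NamedTuple):
    name: str
    access_frequency: float
    fragmentation: float
    weighted_fragmentation: float


def select_method_1(files: List[FileStats],
                    weighted_threshold: float) -> List[FileStats]:
    """Method 1: keep files whose weighted fragmentation reaches the threshold."""
    picked = [f for f in files if f.weighted_fragmentation >= weighted_threshold]
    # Assumed order: highest weighted fragmentation degree first.
    return sorted(picked, key=lambda f: f.weighted_fragmentation, reverse=True)


def select_method_2(files: List[FileStats], frequency_threshold: float,
                    fragmentation_threshold: float) -> List[FileStats]:
    """Method 2: both access frequency and fragmentation must reach their thresholds."""
    picked = [f for f in files
              if f.access_frequency >= frequency_threshold
              and f.fragmentation >= fragmentation_threshold]
    # Assumed order: access frequency first, then fragmentation degree.
    return sorted(picked, key=lambda f: (f.access_frequency, f.fragmentation),
                  reverse=True)


# Hypothetical statistics; only the file names come from the example.
FILES = [
    FileStats("F1", 0.10, 0.75, 0.075),
    FileStats("F2", 0.90, 0.67, 0.603),
    FileStats("F3", 0.70, 0.20, 0.140),
    FileStats("F4", 0.00, 0.50, 0.000),
    FileStats("F5", 0.95, 0.50, 0.475),
]
```

With these invented values, both methods select F2 and F5 and reject F1, but method 1 places F2 first while method 2 places F5 first, which illustrates why the defragmentation order can differ between the two methods.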

Methods 1 and 2 can also be combined, and files can further be treated differently according to their type, such as data files, executable programs, and temporary files, so that file types that are unimportant to the user, the application, or the operating system are excluded from the "file to be defragmented" linked list.

Following the methods above, the resulting "file to be defragmented" linked list contains files F2 and F5; the other files do not need to be defragmented preferentially.

As shown in Figure 5, continuing the sample result of step 4, the first file in the "file to be defragmented" linked list is file F2, so file F2 is set as the "current file to be defragmented".

The "target maximum fragment count" of the "current file to be defragmented" after defragmentation can be 1 or greater than 1, but it must be much smaller than the file's fragment count before defragmentation; otherwise the access performance of the file may not improve after defragmentation.

If the "current file to be defragmented" is file F2, whose current fragment count is 2, the "target maximum fragment count" after defragmentation is 1; that is, after file F2 is defragmented, all of its data blocks are fully contiguous in the physical data block layout.

If the "current file to be defragmented" is file F2, whose current fragment count is 2 and whose total number of data blocks is 3, the "contiguous free data block area" linked list is searched for a set of "contiguous free data block areas" that can hold the "current file to be defragmented". The number of areas in this set must be less than or equal to the "target maximum fragment count" (1) of the "current file to be defragmented" after defragmentation.

As shown in Figure 5, the first "contiguous free data block area" in the "contiguous free data block area" linked list, namely contiguous free data blocks 3, 4, and 5, satisfies the defragmentation target of file F2, so step 5.4 and the following steps can be performed to move the data blocks of file F2. If no set of "contiguous free data block areas" meeting the requirements can be found, step 5.9 is performed.
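The search of step 5.3 can be sketched as follows. This Python example is an illustration under assumptions, not the claimed implementation: the function names are hypothetical, a best-fit rule is assumed for the single-area case, and the free list `[(3, 3), (8, 1), (12, 1), (15, 3)]` is an assumed layout that is merely consistent with the block numbers mentioned in the text.

```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int]  # (start data block number, number of free blocks)


def find_free_set(free_extents: List[Extent], needed_blocks: int,
                  max_fragments: int) -> Optional[List[Extent]]:
    """Return at most max_fragments free areas that together hold needed_blocks."""
    # Single-area case: assumed best fit, i.e. the smallest area that is large
    # enough, with ties broken by the lowest start block.
    fits = [e for e in free_extents if e[1] >= needed_blocks]
    if fits:
        return [min(fits, key=lambda e: (e[1], e[0]))]
    # Otherwise take the largest areas first; if the max_fragments largest
    # areas cannot hold the file, no other choice of that many areas can.
    chosen: List[Extent] = []
    remaining = needed_blocks
    largest_first = sorted(free_extents, key=lambda e: e[1], reverse=True)
    for extent in largest_first[:max_fragments]:
        chosen.append(extent)
        remaining -= extent[1]
        if remaining <= 0:
            return sorted(chosen)
    return None


def new_block_sequence(chosen: List[Extent], needed_blocks: int) -> List[int]:
    """Expand the chosen areas into the "new data block sequence"."""
    blocks = [b for start, count in chosen for b in range(start, start + count)]
    return blocks[:needed_blocks]
```

For file F2 (3 data blocks, target maximum fragment count 1) the sketch selects the area starting at block 3, giving the new data block sequence (3, 4, 5). When the function returns `None`, the procedure would fall through to step 5.9.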

If the "current file to be defragmented" is file F2, then, continuing from the previous step, the contents of the data blocks in file F2's original data block sequence (9, 10, 7) are copied in order into the data blocks of the "new data block sequence" (3, 4, 5).

If the "current file to be defragmented" is file F2, the previous step has rearranged the data blocks of file F2, occupying some free blocks, and the original data blocks will be released and reclaimed. The "contiguous file data block area" linked list of the "current file to be defragmented" and its on-disk storage metadata must therefore be updated, together with the sort index of that "contiguous file data block area" linked list, so that they represent the latest state of the current file system.

If the "current file to be defragmented" is file F2, the previous step has rearranged the data blocks of file F2, and the original data blocks will be released and reclaimed. When the data blocks of file F2's "original data block sequence" (9, 10, 7) are reclaimed as free blocks, they must be merged with adjacent free blocks already in the "contiguous free data block area" linked list, so as to produce a larger "contiguous free data block area". The existing free block 8 therefore needs to be merged with the newly released data blocks (9, 10, 7) into 1 larger "contiguous free data block area", rather than being left as 3 small "contiguous free data block areas".
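The merge of released blocks into the free list can be sketched as follows. This is a minimal Python illustration: `release_blocks` is a hypothetical name, the free list is modeled as `(start, count)` tuples, and the starting list `[(8, 1), (12, 1), (15, 3)]` is an assumed state after blocks 3 to 5 have been taken by file F2. The sketch expands areas into individual block numbers for clarity, which a real implementation working on the sorted linked list would avoid.

```python
from typing import Iterable, List, Tuple

Extent = Tuple[int, int]  # (start data block number, number of free blocks)


def release_blocks(free_extents: List[Extent],
                   released: Iterable[int]) -> List[Extent]:
    """Return the free list after reclaiming `released`, merging neighbours."""
    blocks = set(released)
    for start, count in free_extents:
        blocks.update(range(start, start + count))
    merged: List[Extent] = []
    for block in sorted(blocks):
        if merged and merged[-1][0] + merged[-1][1] == block:
            # The block directly follows the previous area: extend that area.
            merged[-1] = (merged[-1][0], merged[-1][1] + 1)
        else:
            merged.append((block, 1))
    return merged
```

Releasing blocks (9, 10, 7) next to free block 8 yields a single area of 4 blocks starting at block 7, which matches the first "contiguous free data block area" described for Figure 6.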

If the "current file to be defragmented" is file F2, the previous step has rearranged the data blocks of file F2, occupying some free blocks, and the original data blocks will be released and reclaimed. The fragmentation-related statistics of file F2 therefore need to be recalculated so that they represent the latest state of file F2 and hence of the current file system.

If the "current file to be defragmented" is file F2, file F2 is removed from the "file to be defragmented" linked list. Only file F5 then remains in the list, so execution jumps back to step 5.1 to defragment file F5.

Figure 6 shows the latest state of the disk partition data blocks after file F2 has been defragmented.

Free-area defragmentation can use the free-area defragmentation method of the file system provided by the present invention, or a third-party free-area defragmentation method. As shown in Figure 5, the goal of free-area defragmentation is to merge the 4 "contiguous free data block areas" currently in the "contiguous free data block area" linked list into fewer than 4, preferably 1, "contiguous free data block area", thereby providing sufficiently large runs of contiguous data blocks for defragmenting other files or for creating new files.

When the "file to be defragmented" linked list is empty, this round of file system defragmentation ends. The set of files in the "file to be defragmented" linked list can be adjusted by setting the various file defragmentation "thresholds" in step 3.

Embodiment 2

If a "current file to be defragmented" is set, the free-area defragmentation target, namely the "contiguous free data block area" total, should be at least greater than the total number of data blocks of the "current file to be defragmented". Otherwise, the free-area defragmentation target should be at least greater than the total number of data blocks of the first file in the "file to be defragmented" linked list, or the target is the total number of data blocks in all free areas.

In the example shown in Figure 6, if free-area defragmentation of the file system is carried out at time T2, the "contiguous free data block area" linked list at that time contains 3 "contiguous free data block areas". The information of the first area includes its start data block number (7) and its number of contiguous free data blocks (4); the information of the other areas is not repeated here.

If a "current file to be defragmented" is set (assumed here to be F1), the free-area defragmentation target, namely the "contiguous free data block area" total, should be at least greater than 4, the total number of data blocks of file F1. If no "current file to be defragmented" is set, the target should be at least greater than the total number of data blocks of the first file in the "file to be defragmented" linked list, or the target is the total number of data blocks in all free areas. The target can therefore be set to 8, the total number of data blocks in all free areas; the description below uses a free-area defragmentation target of 8.
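The arithmetic behind the target of 8 can be sketched as follows. This Python example is illustrative: `free_area_target` is a hypothetical helper, and the sizes of the second and third free areas (1 and 3 blocks, starting at blocks 12 and 15) are assumptions chosen to be consistent with the stated total of 8, since the text gives only the first area explicitly.

```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int]  # (start data block number, number of free blocks)


def free_area_target(free_extents: List[Extent],
                     current_file_blocks: Optional[int] = None,
                     first_pending_blocks: Optional[int] = None) -> Optional[int]:
    """Use the total number of free blocks as the target, if it is large enough.

    The requirement comes from the current file when one is set, otherwise
    from the first file in the pending list. Returns None when the total does
    not exceed the requirement.
    """
    total_free = sum(count for _start, count in free_extents)
    required = (current_file_blocks if current_file_blocks is not None
                else first_pending_blocks)
    if required is not None and total_free <= required:
        return None
    return total_free
```

With the assumed free list `[(7, 4), (12, 1), (15, 3)]` and file F1 (4 data blocks) as the current file, the total is 4 + 1 + 3 = 8, which exceeds 4 and is therefore an acceptable target.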

Method 1: Within the full data block range of the disk partition where the file system resides, starting from the first data block, the partition is divided into several "contiguous data block areas", each sized by the free-area defragmentation target (the "contiguous free data block area" total). Using the "contiguous free data block area" linked list, the "contiguous data block area" that contains the most free data blocks is selected as this round's "free-area defragmentation range".

Method 2: The partition can also be divided into several "contiguous data block areas", each sized by an integer multiple (greater than 1) of the free-area defragmentation target (the "contiguous free data block area" total). Using the "contiguous free data block area" linked list, the "contiguous data block area" that contains the most free data blocks is selected as this round's "free-area defragmentation range".

Method 3: Alternatively, starting from the first data block, each data block in turn is taken as a start data block, and the "contiguous data block area" that follows it is determined according to the free-area defragmentation target. Using the "contiguous free data block area" linked list, the "contiguous data block area" that contains the most free data blocks is selected as this round's "free-area defragmentation range".

In the example shown in Figure 6, continuing from the previous step with a free-area defragmentation target of 8, method 3 determines that this round's "free-area defragmentation range" runs from data block 7 to data block 14, and the free-area defragmentation task is carried out on this range.
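The range selection of methods 1 and 3 can be sketched as follows. This Python example rests on assumptions: the function names are hypothetical, the partition is assumed to have 17 data blocks, the free list is the assumed T2 layout `[(7, 4), (12, 1), (15, 3)]`, and ties between equally good ranges are broken by taking the first one, a rule the text does not specify.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start data block number, number of free blocks)
Range = Tuple[int, int]   # (start data block number, end data block number)


def free_in_range(free_extents: List[Extent], first: int, last: int) -> int:
    """Count the free blocks that fall inside [first, last]."""
    total = 0
    for start, count in free_extents:
        lo = max(start, first)
        hi = min(start + count - 1, last)
        if hi >= lo:
            total += hi - lo + 1
    return total


def fixed_ranges(total_blocks: int, size: int) -> List[Range]:
    """Method 1: consecutive ranges of `size` blocks, starting at block 1."""
    return [(s, min(s + size - 1, total_blocks))
            for s in range(1, total_blocks + 1, size)]


def sliding_ranges(total_blocks: int, size: int) -> List[Range]:
    """Method 3: every block in turn as the start of a `size`-block range."""
    return [(s, s + size - 1) for s in range(1, total_blocks - size + 2)]


def best_range(free_extents: List[Extent], ranges: List[Range]) -> Range:
    """Pick the range with the most free blocks (first one on ties)."""
    return max(ranges, key=lambda r: free_in_range(free_extents, r[0], r[1]))
```

Method 2 corresponds to calling `fixed_ranges` with a multiple of the target as `size`. Under the assumed layout, the best sliding range holds 5 free blocks, and the range from block 7 to block 14 chosen in the example holds 5 as well, so it is one of several equally good candidates; which of them is picked depends on the tie-breaking rule.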

In the example shown in Figure 6, the "start data block number" is 7 and the "end data block number" is 14 at this point.

The "start data block number" is 7. If the "current data block number to be processed" is 7, data block 7 is a free block, so the "current data block number to be processed" is incremented by 1 and step 3.9 is performed. Data blocks 8, 9, 10, and 12 are likewise free blocks and are handled the same way.

If the "current data block number to be processed" is 11, data block 11 is not a free block, so step 3.3 is performed next. Data blocks 13 and 14 are handled the same way.

In the example shown in Figure 6, if the "current data block number to be processed" is 11, the "file" linked list and the "contiguous file data block area" linked list show that data block 11 belongs to the third "contiguous file data block area" of file F1.

To improve free-block defragmentation efficiency, the "contiguous free data block area" linked list is used to find, outside this round's "free-area defragmentation range", as many free data blocks as the "number of contiguous file data blocks" of the "contiguous file data block area" obtained in step 3.3; these blocks serve as the "new data block sequence".

Alternatively, using the "contiguous free data block area" linked list, only 1 free data block is found outside this round's "free-area defragmentation range" as the "new data block" that replaces the data block at the "current data block number to be processed".

In the example shown in Figure 6, if the "current data block number to be processed" is 11, free data blocks for the "new data block sequence" are sought outside this round's "free-area defragmentation range"; that is, 1 free data block is chosen from free data blocks 15 to 17 to replace data block 11, and free data block 15 can be used.

In the example shown in Figure 6, if the "current data block number to be processed" is 11, the "original data block sequence" is (11) and the "new data block sequence" is (15), and file F1 is added to the "associated files" list.

In the example shown in Figure 6, if the "current data block number to be processed" is 11, the contents of the "original data block sequence" (11) are copied into the "new data block sequence" (15), and the linked-list information of the third "contiguous file data block area" of file F1 and the on-disk storage metadata of file F1 are updated.

In the example shown in Figure 6, if the "current data block number to be processed" is 11, the "original data block sequence" (11) is reclaimed as a free block and merged into the "contiguous free data block area" linked list; data block 11 needs to be merged with the existing contiguous free data block area (7, 8, 9, 10).

In the example shown in Figure 6, if the "current data block number to be processed" is 11, the new "current data block number to be processed" is 12.

Step 3.9: perform step 3.1 again.

To improve free-block defragmentation efficiency, the updates for the "associated files" and for the sort index of the "contiguous free data block area" linked list are performed only after all free-area fragments have been processed.

Alternatively, these updates can be performed each time before step 3.8, for the "associated files" involved and for the sort index of the "contiguous free data block area" linked list.

In the example shown in Figure 6, after the data blocks have been copied and moved during free-area defragmentation, the "associated files" are file F1, file F5, and file F3. For each of these files, the total fragment count, fragmentation degree, and weighted fragmentation degree are recalculated, and the sort index of the file's "contiguous file data block area" linked list and the sort index of the file in the "file" linked list are updated; the sort index of the "contiguous free data block area" linked list is updated as well. This ensures that the data structures used for defragmentation, including the various linked lists, represent the latest state of the current file system, so that free-area defragmentation and file defragmentation can be repeated multiple times.

Step 5: set the "free-area defragmentation state" to success.

Step 6: free-area defragmentation ends.

The result of free-area defragmentation in the preceding example is shown in Figure 7, which gives the latest state of the disk partition data blocks after free-area defragmentation is carried out following time T2.
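The scan of steps 3.1 to 3.9 can be sketched as a small simulation. This Python example implements the single-block variant described above and rests on assumptions: the function name is hypothetical, the free blocks are kept in a plain set instead of a sorted linked list, the lowest-numbered free block outside the range is chosen as the replacement, and the assignment of blocks 13 and 14 to files F5 and F3 is assumed, since the text names only the owner of block 11.

```python
from typing import Dict, List, Optional, Set, Tuple


def defragment_free_area(free_blocks: Set[int], owner: Dict[int, str],
                         first: int, last: int
                         ) -> Optional[Tuple[List[Tuple[int, int]], List[str]]]:
    """Empty the range [first, last] by moving used blocks outside it.

    Mutates free_blocks and owner. Returns (moves, associated_files), or None
    if no free block outside the range is left (state is then partly updated).
    """
    moves: List[Tuple[int, int]] = []
    associated: List[str] = []
    current = first
    while current <= last:                    # step 3.1
        if current in free_blocks:            # step 3.2: skip free blocks
            current += 1
            continue
        outside = sorted(b for b in free_blocks if b < first or b > last)
        if not outside:                       # step 3.4: nothing available
            return None
        target = outside[0]
        name = owner.pop(current)             # step 3.3: owning file
        owner[target] = name                  # steps 3.5 and 3.6: move block
        free_blocks.remove(target)
        free_blocks.add(current)              # step 3.7: reclaim old block
        moves.append((current, target))
        if name not in associated:
            associated.append(name)
        current += 1                          # step 3.8
    return moves, associated
```

Run on the assumed T2 layout with the range from block 7 to block 14, the sketch moves block 11 to 15, block 13 to 16, and block 14 to 17, leaving blocks 7 to 14 as one contiguous free area and reporting F1, F5, and F3 as the "associated files" whose statistics and sort indexes would then be refreshed in step 4.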

As the above examples show, in the hot-spot-file-priority file system defragmentation method provided by the present invention, on the one hand, only fragmented hot spot files are defragmented rather than all files, and the free-area defragmentation method of the file system likewise only defragments free areas within part of the data block range rather than across the entire disk partition, which greatly reduces file system defragmentation time. On the other hand, because frequently accessed hot spot files have been defragmented into a few contiguous file data block areas with a low degree of fragmentation, file access efficiency is greatly improved when those hot spot files are accessed again. Compared with existing file system defragmentation, the present invention has the clear advantage of being fast and efficient, and the method provided by the present invention supports repeated, incremental file system defragmentation.

In view of the description and the specific embodiments of the present invention disclosed herein, other embodiments of the invention will be apparent to those skilled in the art. The description and embodiments are to be considered as examples only; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A hot-spot-file-priority file system defragmentation method, characterized in that the method comprises the following specific steps:

Step 1: according to the data block states of the disk partition where the file system resides, establish the "contiguous free data block area" linked list, which consists of "contiguous free data block area" items, each item including a start data block number, a number of contiguous free data blocks, a linked-list pointer ordered by start data block number, and a linked-list pointer ordered by number of contiguous free data blocks; and build a sort index on the "contiguous free data block area" linked list;

Step 2: according to the directory and file metadata information of the file system, establish the "file" linked list and the "contiguous file data block area" linked list; traverse the "file" linked list and calculate the fragmentation-related statistics of each file, which include the total number of data blocks, total fragment count, last access time, access frequency, fragmentation degree, and weighted fragmentation degree; and build sort indexes on the "file" linked list and the "contiguous file data block area" linked list;

Step 3: set the "thresholds" for file defragmentation, including the longest non-access time "threshold", the access frequency "threshold", the fragmentation degree "threshold", and the weighted fragmentation degree "threshold";

Step 4: establish the "file to be defragmented" linked list; for each file in the "file" linked list, set the "defragmentation state" of the "current file to be defragmented" to "not started", and determine according to the fragmentation degree and access frequency of the file whether to add the file to the "file to be defragmented" linked list; and build a sort index on the "file to be defragmented" linked list; wherein the method for determining, according to the fragmentation degree and access frequency of the file, whether to add the file to the "file to be defragmented" linked list is as follows:

Method 1 for adding highly fragmented hot spot files to the "file to be defragmented" linked list uses the weighted fragmentation degree of a file as the criterion for a highly fragmented hot spot file: if the weighted fragmentation degree of the file is greater than or equal to the weighted fragmentation degree "threshold", the file is added to the "file to be defragmented" linked list, and the list is sort-indexed by weighted fragmentation degree;

Method 2 for adding highly fragmented hot spot files to the "file to be defragmented" linked list is: if the access frequency of the file is greater than or equal to the access frequency "threshold" and the fragmentation degree of the file is greater than or equal to the fragmentation degree "threshold", the file is added to the "file to be defragmented" linked list, and the list is sort-indexed by access frequency and fragmentation degree;

Step 5.1: if the "file to be defragmented" linked list is empty, set the "system defragmentation state" to success and perform step 6; otherwise take the first file from the "file to be defragmented" linked list as the "current file to be defragmented";

Step 5.2: set the "defragmentation state" of the "current file to be defragmented" to "in progress"; then determine the "target maximum fragment count" of the "current file to be defragmented" after defragmentation according to the fragmentation degree and weighted fragmentation degree of the "current file to be defragmented" of step 5.1 and the weighted fragmentation degree "threshold" of file defragmentation;

Step 5.3: search the "contiguous free data block area" linked list for a set of "contiguous free data block areas" that can hold the "current file to be defragmented", where the number of areas in the set is less than or equal to the "target maximum fragment count" of the "current file to be defragmented" after defragmentation; if a set meeting the requirements is found, record the "original data block sequence" and the "new data block sequence" of the "current file to be defragmented"; if no set meeting the requirements can be found, perform step 5.9;

Step 5.4: according to the "original data block sequence" and the "new data block sequence" of the "current file to be defragmented", copy the contents of the data blocks of the "original data block sequence" in order into the data blocks of the "new data block sequence";

Step 5.5: update the "contiguous file data block area" linked list of the "current file to be defragmented" and the on-disk storage metadata of the "current file to be defragmented"; and update the sort index of the "contiguous file data block area" linked list of the "current file to be defragmented";

Step 5.6: reclaim the data blocks of the "original data block sequence" of the "current file to be defragmented" as free blocks and merge them into the "contiguous free data block area" linked list, and update the sort index of the "contiguous free data block area" linked list;

Step 5.8: set the "defragmentation state" of the "current file to be defragmented" to "success", remove the "current file to be defragmented" from the "file to be defragmented" linked list, and perform step 5.1;

Step 5.9: according to the information in the "contiguous free data block area" linked list, carry out free-area defragmentation; if free-area defragmentation succeeds, perform step 5.3; otherwise set the "defragmentation state" of the "current file to be defragmented" to "failure", set the "system defragmentation state" to failure, and perform step 6;

2. The method according to claim 1, characterized in that: each "contiguous file data block area" item in the "contiguous file data block area" linked list described in step 2 includes a start file logical block number, a start data block number, a number of contiguous file data blocks, a linked-list pointer ordered by start file logical block number, a linked-list pointer ordered by start data block number, and a linked-list pointer ordered by number of contiguous file data blocks.

3. The method according to claim 1, characterized in that: each "file" item in the "file" linked list described in step 2 includes a file ID number, total number of data blocks, total fragment count, last access time, access frequency, fragmentation degree, weighted fragmentation degree, "contiguous file data block area" linked-list head, a linked-list pointer ordered by weighted fragmentation degree, a to-be-defragmented file linked-list pointer, and a defragmentation state; the "contiguous file data block area" linked-list head points to the first "contiguous file data block area" item, ordered by start file logical block number, in the file's "contiguous file data block area" linked list; the to-be-defragmented file linked-list pointers in the "file" items form the "file to be defragmented" linked list.

4. The method according to claim 1 or 3, characterized in that: the total fragment count of a file is the total number of contiguous data block areas of the file; the "access frequency" of a file is determined by its "last access time": the closer the "last access time" is to the current time, the larger the "access frequency", and when the "last access time" exceeds the longest non-access time "threshold", the "access frequency" is zero; the "fragmentation degree" of a file is determined by the ratio between the total number of data blocks of the file and the total fragment count of the file; the "weighted fragmentation degree" is determined by the "fragmentation degree" and the "access frequency", and is proportional to both the "fragmentation degree" and the "access frequency".

5. The method according to claim 1, characterized in that: for each file in the "file" linked list described in step 4, whether to add the file to the "file to be defragmented" linked list is determined according to the fragmentation degree and access frequency of the file, and the method only defragments the files in the "file to be defragmented" linked list; the method supports priority defragmentation of hot spot files by adding highly fragmented hot spot files to the "file to be defragmented" linked list;

In method 1 for adding highly fragmented hot spot files to the "file to be defragmented" linked list, the number of files to be defragmented is adjusted by adjusting the weighted fragmentation degree "threshold" of file defragmentation;

In method 2 for adding highly fragmented hot spot files to the "file to be defragmented" linked list, the number of files to be defragmented is adjusted by adjusting the access frequency "threshold" and the fragmentation degree "threshold".

6. The method according to claim 1, characterized in that: each "contiguous free data block area" item in the "contiguous free data block area" linked list described in step 1 includes a start data block number, a number of contiguous data blocks, a sort index pointer ordered by data block number, and a sort index pointer ordered by number of contiguous data blocks; sorting by number of contiguous data blocks improves the efficiency of searching the "contiguous free data block area" linked list in step 5.3 for a set of "contiguous free data block areas" that can hold the "current file to be defragmented"; sorting by data block number improves the efficiency of free-area defragmentation in step 5.9.

7. A method for free-area defragmentation of a file system, characterized in that the method comprises the following specific steps:

Step 2: according to the "contiguous free data block area" linked list, determine this round's "free-area defragmentation range" that can satisfy the free-area defragmentation target of step 1, the range including a "start data block number" and an "end data block number"; if no "free-area defragmentation range" that can satisfy step 1 is found, set the "free-area defragmentation state" to failure and perform step 6;

Step 3: traverse in order all data block numbers from the "start data block number" of step 2 to the "end data block number", set the "current data block number to be processed" to the start data block number, and perform the following sub-steps:

Step 3.1: determine whether the "current data block number to be processed" is greater than the "end data block number"; if it is, set the free-area defragmentation state to success and perform step 4;

Step 3.2: according to the "contiguous free data block area" linked list, determine whether the data block at the "current data block number to be processed" is a free block; if it is a free block, set the "current data block number to be processed" to the "current data block number to be processed" plus 1 and perform step 3.9;

Step 3.3: according to the "file" linked list and the "contiguous file data block area" linked list, obtain the file information and the "contiguous file data block area" information to which the data block at the "current data block number to be processed" belongs;

Step 3.4: according to the "contiguous free data block area" linked list, search outside this round's "free-area defragmentation range" for as many free data blocks as the "number of contiguous file data blocks" of the "contiguous file data block area" of step 3.3, to serve as the "new data block sequence"; if no "new data block sequence" meeting the requirements is found, set the free-area defragmentation state to failure and perform step 4;

Step 3.5: record the "original data block sequence" of the "contiguous file data block area" of step 3.3 and the "new data block sequence"; record the file to which the data block at the "current data block number to be processed" belongs in the "associated files" list;

Step 3.6: copy the contents of the data block sequence of the "contiguous file data block area" of step 3.3 into the "new data block sequence" of step 3.4; update the "contiguous file data block area" linked list and the on-disk storage metadata of the file, identified in step 3.3, to which the data block at the "current data block number to be processed" belongs;

Step 3.7: reclaim the "original data block sequence" of step 3.5 as free blocks and merge them into the "contiguous free data block area" linked list;

Step 3.8: set the "current data block number to be processed" to the "current data block number to be processed" plus the number of data blocks in the "original data block sequence" of step 3.5;

Step 3.9: perform step 3.1 again;

Step 4: traverse the "associated files"; for each file, recalculate the total fragment count, fragmentation degree, and weighted fragmentation degree of the file, update the sort index of the file's "contiguous file data block area" linked list, and update the sort index of the file in the "file" linked list; update the sort index of the "contiguous free data block area" linked list;

Step 5: set the "free-area defragmentation state" to success;

Step 6: free-area defragmentation ends.

8. The method according to claim 7, characterized in that: in setting the free-area defragmentation target described in step 1, if a "current file to be defragmented" is set, the free-area defragmentation target, namely the "contiguous free data block area" total, should be at least greater than the total number of data blocks of the "current file to be defragmented"; otherwise the free-area defragmentation target should be at least greater than the total number of data blocks of the first file in the "file to be defragmented" linked list.

9. The method according to claim 7, characterized in that: in determining this round's "free-area defragmentation range" described in step 2, within the full data block range of the disk partition where the file system resides, starting from the first data block, the partition is divided into several "contiguous data block areas", each sized by the free-area defragmentation target (the "contiguous free data block area" total); according to the "contiguous free data block area" linked list, the "contiguous data block area" that contains the most free data blocks is selected as this round's "free-area defragmentation range".