CN106502587B - Hard disk data management method and hard disk control device - Google Patents

Info

Publication number
CN106502587B
Authority
CN
China
Prior art keywords
data
log area
hard disk
space
mapping relations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610912077.5A
Other languages
Chinese (zh)
Other versions
CN106502587A (en
Inventor
丁敬文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610912077.5A
Publication of CN106502587A
Application granted
Publication of CN106502587B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device

Abstract

Embodiments of the invention disclose a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk. The embodiments apply to a hard disk control device that includes a hard disk, the hard disk including a data area and a log area. The method includes: writing data to a caching device; determining whether the data is hot spot data, where hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases; if the data is not hot spot data, allocating data area space for the data in the data area and writing the data to that space; and if the data is hot spot data, allocating log area space for the data in the log area and writing the data to that space. Storing different types of data in different regions of the hard disk and managing them in different ways improves the efficiency of fragment management, and the log area's efficient handling of fragments reduces fragment generation.

Description

Hard disk data management method and hard disk control device
Technical field
The present invention relates to the field of data processing, and in particular to a hard disk data management method and a hard disk control device.
Background art
A conventional mechanical hard disk relies on platter rotation and head movement to locate the read/write position, so sequential access is its best-performing access pattern. If the hard disk space is fragmented, contiguous space cannot be allocated when data is written, the head seeks back and forth heavily, and most of the transfer time is spent locating tracks and sectors, leaving little time for the actual data transfer. Because the data of a file is scattered, reading such files is also inefficient.
Most hard disk file systems therefore try to avoid generating large amounts of fragmented space, but fragmentation still cannot be avoided entirely.
For example, a copy-on-write (COW) mechanism can exploit the hard disk's advantage in sequential writes. When a block of data is to be modified, the old version is not overwritten in place. Instead, the old version is read, modified, and written to a new location; the data to be written is aggregated and written to the hard disk sequentially, and the old version is released. Because the location of the data changes, the pointer in the upper-level index block that points to the data must also be modified, and this recurses up to the top level. A large amount of data is therefore released, producing a large number of fragments on the hard disk.
Summary of the invention
Embodiments of the invention provide a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk.
A first aspect of the invention provides a hard disk data management method applied to a hard disk control device that includes a hard disk, the hard disk including a data area and a log area. The method includes:
The hard disk control device writes data to a caching device, which may be, for example, memory, a flash card, a solid-state drive, or another storage device different from the hard disk. The hard disk control device then determines whether the data is hot spot data, where hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases. By examining the written data on the caching device, the device determines the type of the data so that different data can be handled differently.
When the data is written to the hard disk: if the data is not hot spot data, data area space is allocated for it in the data area and the data is written to that space; if the data is hot spot data, log area space is allocated for it in the log area and the data is written to that space.
In the method of the first aspect, the data to be written to the hard disk is divided into hot spot data and non-hot-spot data. Hot spot data readily causes the hard disk to generate fragments, so it is stored in the log area and managed in log mode; even if frequent modification of the data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot-spot data is stored in the data area; releasing it does not readily produce fragments, so the data area need not devote extra resources to fragment management. Storing different types of data in different regions of the hard disk and managing them in different ways thus improves the efficiency of fragment management, manages the fragments on the hard disk effectively, and avoids or reduces fragment generation.
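The routing decision of the first aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Area`, `route_write`, and the `is_hot` callback are hypothetical, each area is modeled as an in-memory list, and the toy predicate in the demo is chosen only to exercise both branches.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Area:
    """Hypothetical stand-in for one region of the hard disk."""
    kind: str                                   # "data" or "log"
    blocks: List[bytes] = field(default_factory=list)


def route_write(payload: bytes,
                is_hot: Callable[[bytes], bool],
                data_area: Area,
                log_area: Area) -> str:
    """Store hot spot data in the log area and everything else in the data area."""
    target = log_area if is_hot(payload) else data_area
    target.blocks.append(payload)   # "allocate space, then write" collapsed into one step
    return target.kind


data_area, log_area = Area("data"), Area("log")
small_is_hot = lambda p: len(p) < 16            # toy predicate, an assumption for the demo
placed_small = route_write(b"tiny", small_is_hot, data_area, log_area)
placed_large = route_write(b"x" * 64, small_is_hot, data_area, log_area)
```

Passing the predicate in as a parameter keeps the routing independent of how hot spot data is recognized, which the text leaves open to several criteria.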
With reference to the first aspect, in a first possible implementation, the caching device is memory, and after the log area space is allocated for the data in the log area, the method further includes: establishing a mapping relation between the data and the log area space. That is, after allocating log area space for the data, the hard disk control device establishes on the caching device a mapping relation that records the correspondence between the data and its allocated log area space. The mapping relation records how the data is stored in the log area, so it can be used to manage the data in the log area and to manage eviction of data from the caching device. In this implementation the caching device is memory, but it may also be another device.
With reference to the first possible implementation of the first aspect, in a second possible implementation, establishing the mapping relation between the data and the log area space includes: establishing a mapping relation between multiple pieces of target data and the log area space allocated to them, where the target data is hot spot data;
Writing the data to the log area space includes: combining the write operations of the multiple pieces of target data into one transaction, and writing all target data of the transaction to the log area space. When the write operation for any one piece of target data in the transaction fails, the write operations for the other target data in the transaction fail as well. "Multiple pieces of target data" means at least two, and correspondingly "multiple write operations" means at least two. This introduces the database concept of a transaction: operations are performed in units of multiple pieces of hot spot data, for example by establishing one mapping relation for all hot spot data belonging to the same transaction and by executing the writes of all hot spot data in a transaction to the log area together. This improves data processing efficiency.
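The all-or-nothing behaviour of such a transaction can be illustrated with a short Python sketch. The class `LogArea`, its byte-capacity model, and the `(offset, length)` form of the mapping relation are assumptions made for the example; the source does not specify how a failed write is detected, so a capacity check stands in for it here.

```python
from typing import Dict, Tuple


class LogArea:
    """Hypothetical append-only log area with a fixed capacity in bytes."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffer = bytearray()

    def write_transaction(self, items: Dict[str, bytes]) -> Dict[str, Tuple[int, int]]:
        """Write every item or none; return a mapping of key -> (offset, length)."""
        total = sum(len(v) for v in items.values())
        if len(self.buffer) + total > self.capacity:
            # One write cannot succeed, so every write in the transaction fails.
            raise IOError("transaction does not fit in the log area")
        mapping = {}
        for key, payload in items.items():
            mapping[key] = (len(self.buffer), len(payload))
            self.buffer += payload
        return mapping


log = LogArea(capacity=16)
mapping = log.write_transaction({"a": b"1234", "b": b"56789"})

try:
    log.write_transaction({"c": b"x" * 4, "d": b"y" * 32})
    failed = False
except IOError:
    failed = True
```

Checking the whole transaction before touching the buffer means a failed transaction leaves the log area exactly as it was.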
With reference to the second possible implementation of the first aspect, a third possible implementation further includes: caching the data that belongs to hot spot data in memory. For example, when hot spot data is written to the log area, it is also retained in memory; alternatively, before data is written to memory, the hot spot data is read from the log area and cached in memory. The data can then be modified directly in memory, and when data is subsequently written to memory the data migrates within memory, which reduces fragment generation on the hard disk and allows the log area data to be tidied according to how the data has migrated.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation, before all target data of the transaction is written to the log area space, the method further includes:
Establishing a data linked list from the multiple pieces of target data, where the data linked list is used to manage target data and the target data it manages is the same as the target data of the transaction; and then managing the target data according to the data linked list. Managing the target data according to the data linked list includes: after a second data linked list is established, when second target data managed by the second data linked list was obtained by modifying first target data managed by a previously established first data linked list, releasing management of the first target data on the first data linked list, and deleting the information about the first target data from the first mapping relation corresponding to the first data linked list. In this way, the migration of data between transactions can be managed in memory through the data linked lists.
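A minimal Python sketch of this bookkeeping follows. It is an illustration only: the `Transaction` class and `supersede` function are hypothetical, a Python set stands in for the data linked list, and the mapping relation is reduced to a dictionary with dummy offsets.

```python
from typing import Dict, Set


class Transaction:
    """Hypothetical in-memory record of one transaction."""

    def __init__(self, number: int, keys: Set[str]) -> None:
        self.number = number
        self.managed: Set[str] = set(keys)                    # plays the role of the data linked list
        self.mapping: Dict[str, int] = {k: 0 for k in keys}   # key -> log area offset (dummy 0)


def supersede(old: Transaction, new: Transaction) -> None:
    """Drop from `old` every key whose newer version is managed by `new`."""
    for key in new.managed & old.managed:
        old.managed.discard(key)      # release management on the first list
        old.mapping.pop(key, None)    # delete the entry from the first mapping relation


t1 = Transaction(1, {"a", "b"})
t2 = Transaction(2, {"b", "c"})       # "b" was modified by the second transaction
supersede(t1, t2)
```

After the call, the first transaction manages only the data that no later transaction has rewritten, which is what makes its log area space a candidate for later reclamation.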
The data in memory may be managed according to the data linked lists as follows: under a preset release condition, the data linked lists are searched in order of creation, oldest first, for data that has not been released from management; the data that the target data linked list has not released from management is released from memory, and the target mapping relation corresponding to the target data linked list is retained in memory. Releasing the data on the target data linked list expands the amount of data that memory can manage. The system can still read the corresponding data from the log area by querying the target mapping relation.
With reference to the fourth possible implementation of the first aspect, in a sixth possible implementation, searching the data linked lists in order of creation, oldest first, for data that has not been released from management under the preset release condition includes: when memory reaches a first preset water level, searching the data linked lists in order of creation, oldest first, for data that has not been released from management. The method also includes a second stage of in-memory data eviction: after the data that the target data linked list has not released from management is released from memory, when memory reaches a second preset water level, the data pointed to by the target mapping relation is read from the log area, that data is written to the data area, and the target mapping relation is deleted from memory. This two-stage eviction mechanism expands the amount of data memory can manage. In the second stage, the data pointed to by the target mapping relation is inactive and unlikely to be modified, so it can be moved from the log area to the data area without adding much fragmentation to the data area.
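The two eviction stages can be sketched in Python as below. This is a simplified model under several assumptions: the class `HotDataCache` is hypothetical, the two water levels are expressed as entry counts rather than memory usage, and dictionaries stand in for the on-disk log area and data area. Reclaiming the stale log area copy is left to a separate step.

```python
from collections import OrderedDict


class HotDataCache:
    """Hypothetical in-memory cache with two eviction stages."""

    def __init__(self, first_level: int, second_level: int) -> None:
        assert first_level <= second_level
        self.first_level = first_level    # max payloads kept in memory
        self.second_level = second_level  # max mapping relations kept in memory
        self.payloads = OrderedDict()     # transaction number -> cached data
        self.mappings = OrderedDict()     # transaction number -> log area key
        self.log_area = {}                # stand-in for the on-disk log area
        self.data_area = {}               # stand-in for the on-disk data area

    def add(self, txn: int, payload: bytes) -> None:
        self.log_area[txn] = payload
        self.payloads[txn] = payload
        self.mappings[txn] = txn
        self._evict()

    def _evict(self) -> None:
        # Stage 1: release the oldest cached data but keep its mapping relation.
        while len(self.payloads) > self.first_level:
            self.payloads.popitem(last=False)
        # Stage 2: move the coldest data to the data area and drop its mapping.
        while len(self.mappings) > self.second_level:
            txn, key = self.mappings.popitem(last=False)
            self.data_area[txn] = self.log_area[key]

    def read(self, txn: int) -> bytes:
        if txn in self.payloads:
            return self.payloads[txn]
        if txn in self.mappings:
            return self.log_area[self.mappings[txn]]
        return self.data_area[txn]


cache = HotDataCache(first_level=2, second_level=3)
for n in range(1, 6):
    cache.add(n, b"d%d" % n)
```

Because the first level never exceeds the second, stage 2 only ever touches mapping relations whose cached data was already released in stage 1.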
With reference to the fourth possible implementation of the first aspect, in a sixth possible implementation, the method further includes: assigning transaction numbers to the data linked lists corresponding to the transactions in increasing order, following the order in which the transactions are written. With transaction numbers assigned, the data linked lists can be managed by transaction number, which improves management efficiency. For example, starting from the data linked list with the smallest current transaction number, the data linked lists are searched in ascending order of transaction number for data that has not been released from management, which realizes the search in order of creation, oldest first.
With reference to the fourth possible implementation of the first aspect, in a seventh possible implementation, the method further includes: under a preset reclamation condition, performing a log area data relocation step. For example, the log area data relocation step includes:
Looking up a mapping relation;
Determining, from the information recorded in the mapping relation, whether the data on the first log area corresponding to the mapping relation has been completely migrated;
If the data on the first log area has not been completely migrated, determining the space utilization of the first log area from the information recorded in the mapping relation;
When the space utilization of the first log area is below a preset utilization threshold, migrating the data of the first log area to a second log area and updating the mapping relations corresponding to the relocated data, where the second log area is an idle log area or a log area already in use for reclamation;
When the total space water level of the current log areas reaches a preset space threshold, stopping the log area data relocation step; otherwise, continuing to perform the log area data relocation step.
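The relocation steps above can be sketched in Python as follows. The sketch rests on assumptions that the source does not fix: `relocate`, `AREA_SIZE`, and the record-count notion of utilization are hypothetical, the mapping relation is a dictionary of key -> (area, slot), and the stop condition is interpreted as "enough log areas have been freed". Overflow of the spare area is not handled.

```python
from typing import Dict, List, Tuple

AREA_SIZE = 8  # assumed capacity of each log area, in records


def relocate(areas: Dict[int, List[str]],
             mapping: Dict[str, Tuple[int, int]],
             spare: int,
             threshold: float,
             free_target: int) -> None:
    """Move live records out of sparsely used log areas into the `spare` area."""
    for area_id in sorted(areas):
        if area_id == spare:
            continue
        live = [k for k, (a, _) in mapping.items() if a == area_id]
        if not live:
            areas[area_id] = []           # already fully migrated: the area is free
        elif len(live) / AREA_SIZE < threshold:
            for key in live:
                areas[spare].append(key)
                mapping[key] = (spare, len(areas[spare]) - 1)  # update the mapping relation
            areas[area_id] = []
        free = sum(1 for a, recs in areas.items() if a != spare and not recs)
        if free >= free_target:
            break                         # assumed stop condition: enough areas reclaimed


areas = {0: ["a", "stale", "b"], 1: list("cdefghi"), 2: []}
mapping = {"a": (0, 0), "b": (0, 2)}
mapping.update({k: (1, i) for i, k in enumerate("cdefghi")})
relocate(areas, mapping, spare=2, threshold=0.5, free_target=2)
```

In the demo, area 0 holds two live records out of eight slots and is emptied into the spare area, while area 1 is well used and is left alone.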
With reference to the seventh possible implementation of the first aspect, in an eighth possible implementation, the method further includes: assigning transaction numbers to the data linked lists corresponding to the transactions in increasing order, following the order in which the transactions are written. Managing the data linked lists by transaction number improves management efficiency. For example, starting from the data linked list with the smallest current transaction number, the mapping relations corresponding to the data linked lists are looked up in ascending order of transaction number, which realizes the lookup of the mapping relations.
With reference to the seventh possible implementation of the first aspect, in a ninth possible implementation, the preset reclamation condition includes at least one of: a timer expiring, a reclamation operation on in-memory data completing, and the total water level of the log areas reaching a preset water level threshold.
With reference to the first possible implementation of the first aspect, in a tenth possible implementation, the data belonging to hot spot data can be cached in memory in several ways. For example, after it is determined whether the data is hot spot data, the data is retained in memory if it is hot spot data; alternatively, before data is written to the caching device, the data is read from the log area into the caching device and cached there.
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in an eleventh possible implementation, hot spot data includes data whose size is smaller than a preset data threshold, and/or hot spot data includes metadata. The preset data threshold may be, for example, 128KB or another size; the specific value can be adjusted according to the type of service. If the size of the data is smaller than the preset data threshold, frequent release of the data may cause the hard disk to generate a large number of fragments. Metadata is data used to manage data, such as indirect blocks that store data addresses and metadata blocks that store object management structures. Metadata may likewise cause the hard disk to generate a large number of fragments. Such hot spot data is filtered out so that it can be stored in the log area.
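One way to express this criterion is the small Python predicate below. The function name `is_hot_spot` and its signature are hypothetical; the 128KB default is the example value given in the text and would be tuned per service type.

```python
HOT_SIZE_THRESHOLD = 128 * 1024  # 128KB, the example threshold mentioned in the text


def is_hot_spot(size_bytes: int, is_metadata: bool,
                threshold: int = HOT_SIZE_THRESHOLD) -> bool:
    """Treat metadata and writes smaller than the threshold as hot spot data."""
    return is_metadata or size_bytes < threshold


small_write = is_hot_spot(4096, is_metadata=False)
boundary_write = is_hot_spot(128 * 1024, is_metadata=False)
large_metadata = is_hot_spot(1 << 20, is_metadata=True)
```

A write of exactly the threshold size is not hot under this sketch, since the text says "smaller than" the preset threshold.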
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a twelfth possible implementation, allocating log area space for the data in the log area and writing the data to the log area space includes: allocating log area space for the data in the log area sequentially, and appending the data sequentially to that space. This makes reads and writes in the log area sequential, so relocating log area data incurs no metadata overhead. The tidying overhead is small, which effectively keeps system performance stable.
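Sequential allocation with append-only writes can be sketched as a tail pointer over a fixed buffer. The class `AppendOnlyLogArea` is a hypothetical illustration, not an interface defined by the source.

```python
class AppendOnlyLogArea:
    """Hypothetical log area that allocates space strictly in write order."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.tail = 0                      # next free byte; only ever moves forward
        self.store = bytearray(size)

    def append(self, payload: bytes) -> int:
        """Allocate the next contiguous range, write the payload, return its offset."""
        if self.tail + len(payload) > self.size:
            raise ValueError("log area is full")
        offset = self.tail
        self.store[offset:offset + len(payload)] = payload
        self.tail += len(payload)
        return offset


area = AppendOnlyLogArea(size=32)
first_offset = area.append(b"abc")
second_offset = area.append(b"de")
```

Because allocation only advances the tail, no free-space bitmap or similar allocation metadata has to be updated when data is written or relocated.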
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a thirteenth possible implementation, the method further includes: when the space utilization of the data areas exceeds a preset data area utilization threshold, converting a currently idle log area into a data area; and when the space utilization of the log areas exceeds a preset log area utilization threshold, converting a data area that was converted from an idle log area back into a log area. Log areas and data areas thus convert into each other to adapt to changes in system capacity, which flexibly accommodates the specific usage scenario and improves hard disk utilization.
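The conversion rule can be sketched in Python as follows. The `Area` record, the `utilization` helper, and the `rebalance` function are hypothetical, utilization is taken as the mean usage over all areas of one kind, and the sketch additionally assumes that a converted data area is only turned back into a log area while it is empty, which the source does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Area:
    kind: str                 # "data" or "log"
    used: float               # fraction of the area in use, 0.0 to 1.0
    converted: bool = False   # True if this data area used to be a log area


def utilization(areas: List[Area], kind: str) -> float:
    group = [a.used for a in areas if a.kind == kind]
    return sum(group) / len(group) if group else 0.0


def rebalance(areas: List[Area], data_threshold: float, log_threshold: float) -> None:
    """Convert at most one area per call, following the two threshold rules."""
    if utilization(areas, "data") > data_threshold:
        for a in areas:
            if a.kind == "log" and a.used == 0.0:      # an idle log area
                a.kind, a.converted = "data", True
                return
    if utilization(areas, "log") > log_threshold:
        for a in areas:
            # Assumption: only an empty, previously converted data area goes back.
            if a.kind == "data" and a.converted and a.used == 0.0:
                a.kind, a.converted = "log", False
                return


areas = [Area("data", 0.95), Area("log", 0.2), Area("log", 0.0)]
rebalance(areas, data_threshold=0.9, log_threshold=0.8)
kinds_after_first = [a.kind for a in areas]

areas[0].used, areas[1].used = 0.5, 0.9   # data pressure drops, log pressure rises
rebalance(areas, data_threshold=0.9, log_threshold=0.8)
kinds_after_second = [a.kind for a in areas]
```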
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a fourteenth possible implementation, the hard disk further includes a superblock, each log area is assigned identification information, and the superblock records the identification information of a log area after that log area is modified. The superblock thus provides further management of the log areas: for example, after the system recovers from a power failure or crash, the hard disk control device can promptly find the modified log areas from the information recorded in the superblock.
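A minimal Python sketch of this role of the superblock follows. The `Superblock` class and its method names are hypothetical, and persistence of the superblock itself is not modeled.

```python
class Superblock:
    """Hypothetical superblock that remembers which log areas were modified."""

    def __init__(self) -> None:
        self.modified_log_areas = set()

    def mark_modified(self, log_area_id: int) -> None:
        """Record the ID of a log area after it has been modified."""
        self.modified_log_areas.add(log_area_id)

    def areas_to_scan(self) -> list:
        """After a crash, only these log areas need to be examined."""
        return sorted(self.modified_log_areas)


sb = Superblock()
for area_id in (7, 3, 7):
    sb.mark_modified(area_id)
to_scan = sb.areas_to_scan()
```

Recording only IDs keeps the superblock small while sparing recovery from scanning every log area on the disk.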
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a fifteenth possible implementation, log areas and data areas are arranged alternately on the hard disk. This places the data of the log areas and the data areas closer together.
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a sixteenth possible implementation, the hard disk further includes block groups. A block group includes a preset number of log areas and data areas arranged contiguously, and the use of the data areas and log areas within a block group can be coordinated through the block group. For example, allocating data area space for the data includes: after an idle target data area is determined from the management information of a block group, allocating data area space for the data in the target data area. After the data is written to the data area space, the method further includes: generating target metadata from the data and the data area space; writing the target metadata to the caching device; after the hard disk control device determines that the metadata is hot spot data, determining the target block group to which the target data area belongs; determining an available log area of the target block group; and writing the target metadata to the available log area. In this way the metadata is stored on the hard disk close to the data it describes, which makes the data convenient to read and write.
With reference to any of the second to tenth possible implementations of the first aspect, in a seventeenth possible implementation, the method further includes: allocating log area space for the mapping relation in the log area, and then writing the mapping relation to the log area space allocated to it. The mapping relation is thus also stored in the log area, so that it is reliably persisted on the hard disk.
A second aspect of the invention provides a hard disk control device. The device includes a hard disk, the hard disk includes a data area and a log area, and the device has the functions of the hard disk control device in the above method. The functions may be implemented in hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions.
In one possible implementation, the hard disk control device includes:
A writing unit, configured to write data to a caching device;
A cache manager, configured to determine whether the data is hot spot data, where hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases;
A data manager, configured to, if the data is not hot spot data, allocate data area space for the data in the data area and write the data to the data area space;
A log manager, configured to, if the data is hot spot data, allocate log area space for the data in the log area and write the data to the log area space.
In another possible implementation, the hard disk control device includes:
A processor;
The processor performs the following action: writing data to a caching device;
The processor performs the following action: determining whether the data is hot spot data, where hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases;
The processor performs the following action: if the data is not hot spot data, allocating data area space for the data in the data area and writing the data to the data area space;
The processor performs the following action: if the data is hot spot data, allocating log area space for the data in the log area and writing the data to the log area space.
In a third aspect, an embodiment of this application provides a computer storage medium that stores program code, the program code being used to instruct execution of the method of the first aspect.
As can be seen from the above technical solutions, the embodiments of the invention have the following advantages:
On a hard disk control device that includes a hard disk with a data area and a log area, after data is written to the caching device, the hard disk control device determines whether the data is hot spot data. If the data is not hot spot data, data area space is allocated for it in the data area and the data is written to that space; if the data is hot spot data, log area space is allocated for it in the log area and the data is written to that space. The data to be written to the hard disk is thus divided into hot spot data and non-hot-spot data, where hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases. Hot spot data readily causes fragments, so it is stored in the log area and managed in log mode; even if frequent modification of the data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot-spot data is stored in the data area; releasing it does not readily produce fragments, so the data area need not devote extra resources to fragment management. Storing different types of data in different regions of the hard disk and managing them in different ways therefore improves the efficiency of fragment management, and the log area's efficient handling of fragments reduces fragment generation.
Brief description of the drawings
Fig. 1 is a logical view of an object on a log area according to an embodiment of the invention;
Fig. 2 is a flowchart of a hard disk data management method according to an embodiment of the invention;
Fig. 3 is a schematic diagram of data migrating in memory in the embodiment shown in Fig. 2;
Fig. 4 is a schematic diagram of data cached in memory in the embodiment shown in Fig. 2;
Fig. 5 is a schematic structural diagram of a hard disk control device according to another embodiment of the invention;
Fig. 6 is a schematic structural diagram of the reclamation unit of the hard disk control device shown in Fig. 5;
Fig. 7 is a schematic hardware structure diagram of a hard disk control device according to another embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention provide a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk.
1. Implementation environment of the hard disk data management method of the embodiments
A hard disk data management system according to an embodiment of the invention includes a hard disk and memory, where the memory can serve as the caching device. The hard disk is divided into data areas (Data zone) and log areas (Journal zone), and a log area manages the data on it in log mode.
Before data is written to the hard disk, the data is first written to the memory serving as the caching device, and the hard disk data management system determines whether the data is hot spot data. Hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases; if stored in a data area, such data would cause the hard disk to generate a large number of fragments as it is modified. Therefore, after the data is written to memory, if the data is not hot spot data, data area space is allocated for it in a data area and the data is written to that space. If the data is hot spot data, log area space is allocated for it in a log area and the data is written to that space.
Data area space is storage space on a data area; it may be part of the space on one data area or all of the space on one data area. Log area space is storage space on a log area; it may be part of the space on one log area or all of the space on one log area.
The data to be written to the hard disk is thus divided into hot spot data and non-hot-spot data. Hot spot data readily causes the hard disk to generate fragments, so it is stored in the log area and managed in log mode; even if frequent modification of the data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot-spot data is stored in the data area; releasing it does not readily produce fragments, so the data area need not devote extra resources to fragment management. Storing different types of data in different regions of the hard disk and managing them in different ways therefore improves the efficiency of fragment management and manages the fragments on the hard disk effectively, and management through the log area also helps avoid fragment generation.
The setting of the log area and data field of hard disk can have various ways, be described in detail as follows, using as its One of implementation.
It is the two kinds of region in data field and log area by hard disk partition, to the space size sheet of data field and log area Inventive embodiments are not especially limited, such as can be 256M.The data field and log area, which can be, to be arranged alternately, such as one institute of table Show, table one is an a kind of example of hard drive space layout, and hard disk is divided into superblock, data field and log area.Optionally, exist 0.1% ratio is as fixed fixation log area, evenly spaced to be distributed on hard disk, this seed type in the set of log area Log area can only use as log area, and others log area can be converted into data when hard drive space deficiency Area.
Table 1
There are many ways to arrange data areas and log areas on the hard disk, and the alternating arrangement above is only one of them; the present invention does not specifically limit this. For example, the data areas may be arranged contiguously in one region of the hard disk and the log areas contiguously in another region; or multiple data areas may be arranged contiguously as a data area group and multiple log areas contiguously as a log area group, with the data area groups and log area groups then arranged alternately; and so on.
The data area stores non-hot-spot data; for example, data larger than 128KB is written directly to the data area. After data is written to the data area, metadata for managing the data is generated. This metadata is first written to the caching device and is then allocated space in the log area for storage.
The log area stores hot spot data; for example, data smaller than 128KB and metadata are saved to the log area. In some embodiments, hot spot data is stored in the log area in append mode. In some embodiments, each log area is also assigned an identification information ID in sequence. In the log area, data is managed in log mode.
In some embodiments, IO to a log area is handled by sequential append writes. When a data block needs to be written to a log area, space is allocated from the tail of the last write to that log area; when the log area cannot hold the data block, the log area with the most free space is selected for the append write.
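The append-write rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `LogArea`, `append_alloc`, and `LOG_AREA_SIZE` are hypothetical, and the 256M size is taken from the example given earlier.

```python
from dataclasses import dataclass

LOG_AREA_SIZE = 256 * 1024 * 1024  # assumed size of one log area (256M example)


@dataclass
class LogArea:
    journal_id: int   # identification information ID of the log area
    tail: int = 0     # offset just past the last appended write

    def free(self) -> int:
        return LOG_AREA_SIZE - self.tail


def append_alloc(current: LogArea, areas: list[LogArea], size: int) -> tuple[LogArea, int]:
    """Return (log area, offset) for a sequential append write of `size` bytes."""
    if current.free() < size:
        # The current log area cannot hold the block:
        # switch to the log area with the most free space.
        current = max(areas, key=LogArea.free)
        if current.free() < size:
            raise RuntimeError("no log area can hold the data block")
    offset = current.tail      # allocate from the tail of the last write
    current.tail += size
    return current, offset
```

A caller would keep the returned log area as the current one for the next write, so that consecutive blocks land at consecutive offsets.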
The layout of a log area is shown in Table 2, where Journal ctrl is the identification information ID, Map is the mapping relations, and Record is the data, for example data smaller than 128KB and metadata.
Table 2
As shown in Table 1, in some embodiments a superblock is also provided on the hard disk of the hard disk control device. After data is written to a log area, that log area has been modified, and the superblock records the identification information ID of the modified log area. For example, when a batch of write operations to log areas is combined into one transaction, after the data of the transaction is saved to the hard disk, the identification information IDs of the log areas modified by this transaction are recorded in the corresponding bitmap of the superblock.
The embodiments of the present invention do not specifically limit the size of the superblock, which can be adjusted to suit the device. For example, a 4T disk has 4T/256M/2 = 8192 log areas, so 1024B is needed in the superblock to record the overall usage of the log areas.
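The arithmetic in this example can be checked with a short sketch. It assumes binary units (1T = 1024^4 bytes, 1M = 1024^2 bytes) and one bit per log area, which is what the figures 8192 and 1024B imply; the function names are hypothetical.

```python
DISK_SIZE = 4 * 1024**4     # 4T, assuming binary units
AREA_SIZE = 256 * 1024**2   # 256M per data area or log area


def log_area_count(disk_size: int, area_size: int) -> int:
    # With alternating data areas and log areas, half of the regions are log areas.
    return disk_size // area_size // 2


def bitmap_bytes(n_areas: int) -> int:
    # One bit per log area, rounded up to whole bytes (assumed encoding).
    return (n_areas + 7) // 8


def mark_modified(bitmap: bytearray, journal_id: int) -> None:
    """Record in the bitmap that the log area with this ID was modified."""
    bitmap[journal_id // 8] |= 1 << (journal_id % 8)


def is_modified(bitmap: bytearray, journal_id: int) -> bool:
    return bool(bitmap[journal_id // 8] & (1 << (journal_id % 8)))
```

With these definitions, `log_area_count(DISK_SIZE, AREA_SIZE)` gives 8192 and `bitmap_bytes(8192)` gives 1024, matching the figures in the text.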
The layout of the superblock may be as shown in Table 3. Super blkctrk records overall management information, such as the used capacity, total capacity, and total free capacity of the hard disk, the number of log areas, and the number of data areas. Journalbitmap records the processed transaction numbers. Super blkctrk and Journalbitmap may each have a capacity of 4K.
Table 3
Super blkctrk Journalbitmap
In some embodiments, to keep the data allocated to data areas and log areas close together, multiple log areas and data areas may be combined into a block group, within which the log areas and data areas are arranged contiguously. Each block group has a space management object that manages the usage of hard disk space within the block group by means of a bitmap. For example, 16 contiguously arranged log areas and data areas may be combined into one block group.
Tables 4 and 5 show the relationship among block groups, data areas, and log areas. Table 4 illustrates the layout of the hard disk in units of block groups, and Table 5 illustrates the layout of block group 1 in Table 4.
Table 4
Table 5
As shown in Table 2, mapping relations (Map) are also stored in the log area. The mapping relations record the correspondence between the data in the log area and the log area space allocated to that data. Figure 1 shows a logical view of a hard disk object, and the mapping relations are explained with reference to that figure.
Figure 1 shows an object in the log area. An object can be divided into multiple levels. The lowest level is level 0, which corresponds to the data blocks of the object. Above level 0 are the indirect blocks, at level 1. The top level, level 2, holds the block containing the object management structure. When there are many data blocks, one indirect block cannot hold the address pointers of all of them, so multiple indirect blocks are needed and the number of levels of the object increases. Blocks in the same level are numbered from left to right; for example, the blkids of the data blocks in the lowest level are 0, 1, 2, and 3.
When a transaction modifies data blocks, the relationship between the data and the log area space allocated to it must be recorded in a mapping relation. For example, if a transaction creates the object in Figure 1, the information in Table 6 must be recorded in the mapping relation. In Table 6, the columns of the mapping relation record, in order, objsetid, objid, levelid, blkid, journalid, offset, and size.
Here, objsetid is the object set ID; objid is the object ID; levelid is the level in which the data block is located; blkid is the sequence number of the data block within its level, counted from left to right; journalid is the ID of the log area to which the data block is written; offset is the relative offset at which the data block is written within the log area; and size is the size of the data block written.
It will be appreciated that the mapping relations may record all of the above information types, only some of them, or additional information types; the embodiments of the present invention do not specifically limit this.
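As an illustration of such a record, the sketch below models one row of the mapping relations with the seven fields named above and builds rows for blocks appended back to back in one log area. The class `MapEntry`, the function `build_map`, and all concrete IDs and sizes are hypothetical and are not taken from Table 6.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    objsetid: int    # object set ID
    objid: int       # object ID
    levelid: int     # level of the block within the object
    blkid: int       # left-to-right sequence number within the level
    journalid: int   # ID of the log area the block was written to
    offset: int      # relative offset of the block inside that log area
    size: int        # size of the block written


def build_map(objsetid, objid, journalid, start_offset, blocks):
    """Build mapping entries for blocks appended consecutively to one log area.

    `blocks` is a list of (levelid, blkid, size) tuples in write order.
    """
    entries, offset = [], start_offset
    for levelid, blkid, size in blocks:
        entries.append(MapEntry(objsetid, objid, levelid, blkid, journalid, offset, size))
        offset += size   # the next block starts where this one ends
    return entries
```

Because blocks are appended sequentially, each entry's offset is simply the previous offset plus the previous size.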
Table 6
It will be appreciated that in some embodiments the caching device may be a device other than memory, such as an NVDIMM, a flash card, or an SSD (solid state drive). It will also be appreciated that in some embodiments the hard disk may not include a log area, with hot spot data stored on the caching device instead; the embodiments of the present invention do not specifically limit this.
It will be appreciated that the hard disk control device of the embodiments of the present invention can be used in equipment such as computers and servers; the embodiments of the present invention do not specifically limit this.
Figure 2 is a flowchart of a hard disk data management method according to an exemplary embodiment. The method is applied to a hard disk control device that includes a hard disk, and the hard disk includes data areas and log areas. With reference to the implementation environment described above, and taking the hard disk control device as the entity that executes the method, the method flow provided by the embodiment of the present invention includes the following steps, as shown in Figure 2:
Step 201: Write data to memory.
Before the device writes data to the hard disk, the hard disk control device first writes the data to the caching device so that the data can be managed before it is written to the hard disk, for example by the cache manager of the hard disk control device.
The caching device may be, for example, memory or flash memory; the embodiments of the present invention do not specifically limit this.
In the embodiments of the present invention, memory is used as the example caching device. The data written to memory includes all external data of the hard disk control device and the metadata generated when data is written to the data area of the hard disk.
Writes to memory are of two kinds, modify writes and create writes. In a modify write, the data written to memory modifies data already in memory, and the new data overwrites the data being modified. In a create write, new data is written to memory, and no original version of the new data is cached in memory.
Step 202: Judge whether the data is hot spot data. If the data is not hot spot data, execute step 203; if the data is hot spot data, execute step 204.
After the data is written to memory, the hard disk control device judges whether the data is hot spot data, for example by means of its cache manager module. Having judged whether the data to be written to the hard disk is hot spot data, the hard disk control device processes the data differently according to the result.
Hot spot data is data which, once stored on the hard disk, causes the hard disk to generate a preset quantity of fragments after a preset number of modifications and releases. For example, in this embodiment, hot spot data may be data whose size is less than a preset data threshold, and/or hot spot data may be metadata.
Data whose size is less than the preset data threshold is small data. Frequent modification and release of small data readily produces hard disk fragments, whereas frequent modification on the hard disk of data whose size is greater than the preset data threshold does not produce large numbers of hard disk fragments. The setting of the preset data threshold depends on the business model; it may be set to, for example, 64KB or 128KB.
Metadata consists of the data blocks that record the object management structure and the data blocks that record data block addresses. Consider modifying a data block in the data area of the hard disk. The data area generally uses a COW mechanism: when a block of data is modified, the old version is not overwritten directly; instead the old version is read, modified, and written to a new position, and the old version is released. Because the position of the data has changed, the pointer to it in the upper-level index block must be modified, that is, the metadata must be modified. The modified metadata is in turn allocated new space and the old metadata is released, and so on recursively up to the top level. Modifying metadata therefore releases a large amount of data, and each released position on the hard disk becomes a fragment, which accelerates the fragmentation of the hard disk.
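The cascade described above can be made concrete with a small sketch. It assumes a simple model in which every block on the path from the modified data block up to the top-level block is rewritten at a fresh address; the function `cow_modify` and the block names are hypothetical.

```python
import itertools


def cow_modify(path, addr, alloc):
    """Copy-on-write update of one data block.

    path:  block names from the modified data block up to the top-level block.
    addr:  dict mapping every block name to its current disk address.
    alloc: iterator yielding free disk addresses.
    Returns the old addresses that were released.
    """
    released = []
    for name in path:
        released.append(addr[name])  # the old version is released, not overwritten
        addr[name] = next(alloc)     # the new version is written to a new position
    return released


# One data block, its indirect block, and the top-level block (assumed layout).
addresses = {"data0": 10, "indirect0": 20, "top": 30}
freed = cow_modify(["data0", "indirect0", "top"], addresses, itertools.count(100))
```

In this model a single data-block modification releases three old positions, two of which held metadata, which is why metadata is treated as a major source of fragments.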
The embodiments of the present invention therefore classify data whose size is less than the preset data threshold, and/or metadata, as hot spot data. Such data readily causes the hard disk to generate fragments and needs to be managed accordingly.
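The classification rule of step 202 can be sketched as below, assuming the 128KB example threshold. The names `is_hot_spot` and `route` are hypothetical, and treating data exactly equal to the threshold as non-hot-spot data is an assumption, since the text only covers "less than" and "greater than".

```python
HOT_THRESHOLD = 128 * 1024  # example preset data threshold (128KB)


def is_hot_spot(size: int, is_metadata: bool, threshold: int = HOT_THRESHOLD) -> bool:
    """Metadata and data smaller than the threshold count as hot spot data."""
    return is_metadata or size < threshold


def route(size: int, is_metadata: bool) -> str:
    """Return the region that steps 203/204 would write the data to."""
    return "log area" if is_hot_spot(size, is_metadata) else "data area"
```

For instance, `route(64 * 1024, False)` selects the log area, while `route(256 * 1024, False)` selects the data area.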
Step 203: Allocate data area space for the data in the data area, and write the data to the data area space.
Data judged not to be hot spot data does not readily cause the hard disk to generate fragments, so it can be saved in the data area of the hard disk. The data area is a region of the hard disk used to store data, and it may use a COW mechanism to manage the data on it. A data area may be a region of the hard disk with a preset space size, for example 256M. For more on the data area, refer to the description of the data area in the implementation environment part above.
For example, when the size of a piece of data is greater than 128KB, the cache manager judges that the data is not hot spot data. The data space manager module of the hard disk control device then allocates data area space for the data in the data area and writes the data to the allocated data area space.
Metadata is generated after data is written to the data area. The metadata records the address of the data, which makes the data easier to manage. For example, multiple blocks of data are generally organized by multi-level indexing: an index block is allocated one level above the data blocks, its content is the addresses of the data blocks, and through the index block multiple data blocks can be spliced into one logically contiguous object. Such an index block is one type of metadata. After the metadata is generated by writing data to the data area, the metadata can be written to memory, that is, step 201 above is executed for it. The cache manager will judge the metadata to be hot spot data, so that it is saved in the log area and in memory.
It will be appreciated that in some embodiments the hard disk of the hard disk control device further includes block groups. A block group includes a preset number of log areas and data areas, which are arranged contiguously. In a hard disk control device that includes block groups, data area space is allocated for data as follows: a suitable block group is selected, for example one with less space in use or with more free data areas, and data area space of a data area in the selected block group is then allocated for the data. After the space is allocated, its usage must be recorded; that is, after a free data block is found and allocated in the space management bitmap of the block group, the management structure of the block group is modified. Subsequently, according to the space usage recorded by the block group, the metadata can be written to a log area belonging to that block group; if the block group has no free log area, a nearby log area is selected for writing the metadata. In this way the metadata and the data it points to are located close together on the hard disk.
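A minimal sketch of this allocation is shown below, assuming that "suitable" means the block group with the most free data blocks and that the space management bitmap is a list of booleans. `BlockGroup` and `alloc_data_block` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    group_id: int
    # Space management bitmap: True means the data block is in use.
    bitmap: list = field(default_factory=list)

    def free_count(self) -> int:
        return self.bitmap.count(False)


def alloc_data_block(groups):
    """Pick the block group with the most free blocks and allocate one block in it.

    Returns (group_id, block_index) and records the allocation in the bitmap.
    """
    group = max(groups, key=BlockGroup.free_count)
    if group.free_count() == 0:
        raise RuntimeError("no free data block in any block group")
    index = group.bitmap.index(False)  # first free block in the bitmap
    group.bitmap[index] = True         # record the usage of the space
    return group.group_id, index
```

The returned group ID is what a caller would later use to look for a log area in the same block group when the metadata for this data is written.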
For the block group, refer to the corresponding description of block groups in the implementation environment part above.
It will be appreciated that, besides data judged to be non-hot-spot data, the data written to the data area may also include data evicted from memory because of insufficient space. For example, in the second stage of the subsequent memory reclamation, the data pointed to by a mapping relation is migrated from the log area to the data area. Such migrated data may likewise be allocated data area space in the data area by the data space manager and then written to that data area space.
Step 204: Allocate log area space for the data in the log area.
Log areas are also provided on the hard disk, and the data in a log area is managed in log mode.
When the data of step 201 is judged to be hot spot data, log area space is allocated for it in the log area. For example, the data is handed to the log manager, which allocates log area space for the hot spot data in the log area so that the hot spot data is stored in the log area.
In the embodiments of the present invention, to make the data in the log area easier to manage, space in the log area can be allocated to data belonging to hot spot data in the order in which it is written to the log area. Of course, in other embodiments log area space need not be allocated to hot spot data sequentially; the embodiments of the present invention do not specifically limit this.
Because data belonging to hot spot data is stored in the log area, when hot spot data in memory is lost because of a power failure, the data can be read from the log area into memory so that the device can continue to operate. Moreover, memory and the log area are used together: storing hot spot data in the log area expands the volume of hot spot data that memory can manage.
A log area may be a region of the hard disk with a certain space size, for example 256M. The hard disk may have multiple log areas and data areas, and the log areas may be assigned identification information in the order in which they are arranged on the hard disk. For the specific arrangement of log areas, refer to the corresponding description of log areas in the implementation environment part above.
Log areas and data areas can be arranged on the hard disk in several ways; for example, they may be arranged alternately, as shown in Table 1 above. Data areas and log areas may of course be arranged in other ways, and the embodiments of the present invention do not specifically limit this. For details, refer to the corresponding description of the arrangement of data areas and log areas in the implementation environment part above.
Step 205: Establish mapping relations between multiple pieces of target data and the log area space allocated to them.
The target data belongs to hot spot data.
After multiple pieces of data are written to memory and judged as to whether they are hot spot data, memory may hold multiple pieces of data belonging to hot spot data. The hard disk control device determines multiple pieces of target data in memory that belong to hot spot data, so that they can be managed together, which improves processing efficiency. Every piece of target data is allocated log area space in the log area, and the hard disk control device establishes mapping relations from the target data and the log area space allocated to it. The target data belongs to hot spot data, that is, the target data includes metadata and/or data whose size is less than the preset data threshold.
Every piece of data written to the log area requires a record of the mapping relation between the data and its log area space. The mapping relations record the data in the log area, so that the hot spot data in memory or the data in the log area can be managed according to them. For example, the device reads the corresponding data stored in the log area according to the index of the mapping relations cached in memory, or reclaims a log area according to the data information recorded in the mapping relations, so as to manage fragments in the log area.
The information types recorded by the mapping relations may include objsetid, objid, levelid, blkid, journalid, offset, size, and the like.
For more on the mapping relations, refer to the corresponding description in the implementation environment part above.
Step 206: Cache the mapping relations in memory.
After the mapping relations are established, they are saved in memory in preparation for subsequent operations.
In some embodiments, the mapping relations may also be saved in the log area and, when they need to be held in memory, read from the log area and cached in memory. Of course, in some embodiments the mapping relations may be stored in memory and in the log area at the same time.
Step 207: Combine the multiple write operations of the multiple pieces of target data into one transaction.
After the multiple pieces of target data are determined, write operations are to be executed to write them to the log area. The hard disk control device combines the multiple write operations of these pieces of target data into one transaction and writes to the hard disk in units of transactions. Here a transaction is the name for a combination of multiple write operations, not the execution of those write operations. "Multiple pieces of target data" means at least two pieces of target data and, correspondingly, "multiple write operations" means at least two write operations.
To improve write efficiency and ensure write reliability, when the device writes data to the log area it generally executes not a single write operation but the write operations of multiple pieces of data in one operation of writing to the log area. These write operations of the same batch belong to multiple pieces of target data and, when written to the hard disk, either all succeed or all fail. The multiple write operations of the same batch of target data are combined into one transaction.
It will be appreciated that the embodiments of the present invention do not specifically limit the order in which step 207 and step 205 are executed. That is, the write operations of multiple pieces of target data are combined into one transaction, and mapping relations can be established from that target data, so that one transaction corresponds to one mapping relation.
In some embodiments, the method further includes establishing a data linked list from the multiple pieces of target data: the metadata and the data smaller than the preset threshold in one transaction form one data linked list, which is used to manage the target data. When the mapping relations are formed, they can be established from the data on the data linked list and the space allocated to that data in the log area.
Step 208: Write all target data of the transaction to the log area space.
The hard disk control device writes all target data of the transaction to the log area space. Once multiple pieces of target data have been combined into one transaction, if the write operation of any one piece of target data of the transaction fails, the write operations of the other target data of the transaction fail as well. The write operation of the transaction succeeds only if the write operation of every piece of target data succeeds.
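The all-or-nothing behavior of a transaction can be sketched as follows. This is only an in-memory illustration under simplifying assumptions: the log area is modeled as a Python list, a failed write is modeled as an `OSError`, and `commit_transaction` is a hypothetical name; an actual device would need a durable commit mechanism that the text does not detail.

```python
def commit_transaction(log_area: list, writes: list, write_one) -> bool:
    """Apply every write of one transaction, or none of them.

    log_area:  list standing in for the records appended to a log area.
    writes:    the pieces of target data that make up the transaction.
    write_one: callable(log_area, data) that appends one piece of data
               and raises OSError if the write fails.
    """
    start = len(log_area)
    try:
        for data in writes:
            write_one(log_area, data)
    except OSError:
        del log_area[start:]  # discard the partial writes of this transaction
        return False
    return True


def append_write(log_area: list, data: bytes) -> None:
    """Trivial writer used for illustration; it never fails."""
    log_area.append(data)
```

If any single write raises, the log area is left exactly as it was before the transaction started, which mirrors the rule that the transaction succeeds only when every write succeeds.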
After the mapping relations are established, the log area manager allocates log area space for the mapping relations. The data belonging to hot spot data has also been allocated log area space, so the mapping relations and the target data can each be written to the log area space allocated to them, and both are thereby saved in the log area.
Once the mapping relations are stored in the log area, the system can read them back into memory, so that memory can reacquire the mapping relations; this is particularly useful for resuming work after a memory power failure. Of course, in some embodiments the mapping relations need not be stored in the log area, so no log area space has to be allocated for them; this still achieves the effect of reducing hard disk fragments in the data area, and the embodiments of the present invention do not specifically limit this.
The embodiments of the present invention do not specifically limit the order in which the mapping relations and the target data of the transaction are written to the log area space.
In some embodiments, allocating log area space for data in the log area and writing the data to the log area space is done by allocating log area space for the data sequentially in the log area and appending the data sequentially to the log area space. Sequential allocation and sequential append writing mean that space is allocated, or data written, on the storage space of the log area in order.
Allocating log area space sequentially and appending data sequentially means that data can be read and written sequentially when the data in the log area is managed, which improves the efficiency of data management, and the data in the log area can be located by its order. Moreover, the mapping relations record the correspondence between the data stored in the log area and its log area space, so the mapping relations can take the place of metadata: the data in the log area is located without metadata, and no additional metadata management overhead is incurred. Writing data to the log area by sequential append also makes it easy for the mapping relations to record where the data is stored in the log area.
In an embodiment of the present invention, the method further includes: if the data is hot spot data, caching the data in memory. That is, when all target data of a transaction is written to the log area space, the target data of the transaction remains cached in memory; after it is judged whether data is hot spot data, data that is hot spot data is retained in memory. Then, during subsequent operation, when data is written to memory and the write operation is a modify write to target data cached in memory, the data can be modified directly in memory, so that the step of managing target data according to the data linked list can be carried out. Retaining the data in memory lets the data migrate within memory. Because hot spot data readily causes the hard disk to generate fragments, caching it in memory rather than storing it in the data area avoids fragments generated in the data area by the migration of that data.
In some embodiments, the step of caching the data in memory if it is hot spot data may be omitted; instead, before step 201, data is read from the log area and cached in memory, that is, before data is written to the caching device, data is read from the log area into the caching device. This also allows the data to be modified directly in memory, so that the step of managing target data according to the data linked list can be carried out. The data is thus retained in memory and migrates within memory, which avoids fragments generated in the data area by the migration of that data.
In some embodiments, the hard disk control device further includes a superblock. When each log area is assigned identification information, the superblock can be used to record the identification information of a log area after that log area is modified. Then, after a memory power failure, the data in the corresponding log areas can be read according to the log area identification information recorded in the superblock, and the data read from the log areas is cached in memory so that subsequent data operations can continue.
For example, after all target data and mapping relations of a transaction are written to the log area space, when all write operations of the transaction are complete, the identification information of the modified log areas is recorded in the bitmap of the superblock.
It will be appreciated that in some embodiments the hard disk further includes block groups, each of which includes a preset number of log areas and data areas arranged contiguously. In this case, as described above, allocating data area space for data in the data area may specifically be: determining a free target data area according to the management information of the block group, and allocating data area space for the data in the target data area.
After the data is written to the data area space, target metadata can be generated from the data and the data area space. The target metadata records the address of the data in the data area, which makes the data easier to manage and query.
After the target metadata is generated, it is written to memory. Once the metadata is judged to be hot spot data, the target block group to which the target data area belongs is determined; an available log area of the target block group is then determined, that is, an available log area is searched for within the target block group, and the target metadata is written to that available log area. If none is found, the nearest available log area in the vicinity of the target data area or the target block group is used.
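The search order just described can be sketched as below. The sketch assumes that regions are identified by their position index on the hard disk, so that the difference between two indices approximates physical distance; `pick_log_area` and `has_space` are hypothetical names.

```python
def pick_log_area(target_data_area, group_log_areas, other_log_areas, has_space):
    """Choose the log area that will receive the target metadata.

    target_data_area: position index of the data area holding the data.
    group_log_areas:  position indices of log areas in the target block group.
    other_log_areas:  position indices of log areas outside the block group.
    has_space:        callable(index) -> bool, True if the log area is available.
    """
    # First preference: an available log area inside the target block group.
    for index in group_log_areas:
        if has_space(index):
            return index
    # Fallback: the available log area closest to the target data area.
    candidates = [index for index in other_log_areas if has_space(index)]
    if not candidates:
        raise RuntimeError("no available log area")
    return min(candidates, key=lambda index: abs(index - target_data_area))
```

Keeping the metadata in or near the block group of its data is what keeps the two close together on the hard disk.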
When data is written to the log area in units of transactions, the above method can also be used to allocate log area space for the multiple pieces of target data of a transaction, so that the target data is placed close to the data that the metadata points to.
In this way, after the metadata and the data it points to are stored on the hard disk by the above method, the storage location of the metadata is close to the storage location of the data it points to, which reduces the distance the magnetic head has to move and improves the read/write efficiency of the hard disk.
In some embodiments of the present invention, the multiple pieces of target data in memory that belong to the same transaction are managed by means of a linked list. As described above, after determining the multiple pieces of target data, the method can establish a data linked list from them, and the data linked list is used to manage the target data. That is, after the data and metadata of each transaction are submitted to the hard disk, the data of the corresponding log area is managed with a linked list.
Because the write operations of this target data are combined into one transaction, one transaction corresponds to one data linked list, and the target data managed by the data linked list is the same as the target data of the transaction. A mapping relation is also established from this target data, so one data linked list corresponds to one mapping relation, which records how the data managed by the data linked list is stored in the log area.
The target data can be managed according to the data linked list as follows:
Following the steps above, after multiple pieces of first target data belonging to hot spot data are determined in memory, a first data linked list is established from them. The first target data belongs to a first transaction; that is, it is written to the log area in the same write batch, and if the write of any one piece of first target data fails, the writes of the other data of the first transaction fail as well. A first mapping relation is also established from the first target data and cached in memory. For example, the first data linked list manages first target data A1, B1, C1, and D1 in memory, and the corresponding first mapping relation records the relationship between A1, B1, C1, and D1 and the log area space allocated to them in the log area. Subsequently, the hard disk control device establishes a second data linked list from multiple pieces of second target data, which belong to hot spot data and to a second transaction, and a second mapping relation is established from them. The second data linked list may manage, for example, second target data A2, E1, F1, and D2.
When second target data managed by the second data linked list is obtained by modifying first target data managed by the previously established first data linked list, the first data linked list's management of that first target data is released, and the information about that first target data is deleted from the first mapping relation corresponding to the first data linked list. This realizes the migration of target data across data linked lists and mapping relations. For example, when second target data A2 managed by the second data linked list is obtained by modifying first target data A1 of the first data linked list, A1 in memory has been modified into A2 and is no longer needed, so the first data linked list's management of A1 can be released and, correspondingly, the information about A1 recorded in the first mapping relation can be deleted. Because the information about A1 has been deleted from the first mapping relation, when the log area is reclaimed it can be determined from the first mapping relation that the log area holding A1 contains a fragment. When the data in the log area is migrated and merged, no information about A1 can be read from the first mapping relation, so A1 in the log area need not be moved to the new log area; that is, A1 in the log area can be released.
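The release rule above can be illustrated with a small sketch. It models each transaction's mapping relation as a dictionary from a logical block name (A, B, ...) to a hypothetical `(journalid, offset)` pair; the function `submit` and this data model are assumptions made for illustration only.

```python
def submit(transactions: dict, txn_id: int, new_map: dict) -> list:
    """Register a new transaction's mapping relation.

    Entries of older transactions that the new transaction has modified are
    deleted from the older mapping relations. Returns the IDs of older
    transactions whose mapping relation became empty, meaning the data they
    wrote to the log area is no longer referenced and can be released.
    """
    emptied = []
    for old_id, old_map in transactions.items():
        superseded = [key for key in new_map if key in old_map]
        for key in superseded:
            del old_map[key]  # release management of the old version
        if superseded and not old_map:
            emptied.append(old_id)
    transactions[txn_id] = dict(new_map)
    return emptied


txns = {}
submit(txns, 1, {"A": (1, 0), "B": (1, 1), "C": (1, 2), "D": (1, 3)})  # A1, B1, C1, D1
submit(txns, 2, {"A": (2, 0), "E": (2, 1), "F": (2, 2), "D": (2, 3)})  # A2, E1, F1, D2
```

After the second call, the first mapping relation only holds B and C, so during reclamation only B1 and C1 would need to be carried over from the first transaction's log area; once those are superseded too, `submit` reports the first transaction as fully released.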
Fig. 3 is a schematic diagram of data migration in memory. Each transaction includes multiple pieces of target data, and some of the data blocks in Fig. 3 are labeled. As shown in Fig. 3, as transactions are continually generated, subsequent transactions modify the data managed by the first transaction because of the locality of I/O. For example, the second transaction modifies first target data A1 to obtain second target data A2; the fifth transaction modifies first target data B1 to obtain fifth target data B2; the fourth transaction modifies first target data C1 to obtain fourth target data C2; the second transaction modifies first target data D1 to obtain second target data D2; and the third transaction modifies second target data D2 to obtain third target data D3. From the first data linked list corresponding to the first transaction it can thus be seen that all data of the first transaction has migrated to subsequent transactions, so the log area can release all target data of the first transaction. As shown in Table 7, the target data of the first transaction is stored on log area 1. After all data of the first transaction has migrated to subsequent transactions, every data block on log area 1 has a newer version that was written to another log area. Log area 1 can therefore be released during log area reclamation and becomes an empty log area. The migration of other target data is similar. When the fifth transaction has been executed, the actual caching state in memory is as shown in Fig. 4.
Table 7
To make it easier to manage target data by data linked list, in some embodiments the method of the embodiment of the present invention further includes allocating transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the write order of the transactions. The write order of transactions is the order in which different transactions are written to the log area. The data linked list of a transaction written earlier is assigned a smaller transaction number, and the data linked list of the transaction written to the log area immediately afterwards is assigned a transaction number one unit larger. Each data linked list thus has a corresponding identifier, and these identifiers increase monotonically, so the order in which the data linked lists were established in memory can be conveniently determined from the transaction numbers.
For example, the data linked list of the first transaction is established first. Because the first transaction is written to the log area first, its data linked list is assigned transaction number 1. The data linked list of the second transaction is then established from multiple pieces of second target data. Because the data of the second transaction is written to the log area after the first transaction, the second data linked list corresponding to the second transaction is assigned transaction number 2. Similarly, the third data linked list is assigned transaction number 3, and so on.
The above describes how data is managed by data linked list. Data belonging to hotspot data is written to the log area; in addition, the data belonging to hotspot data is cached in memory, or data read from the log area is cached in memory. The data can then be managed in memory by data linked list, so that hotspot data migrates in memory rather than on the hard disk, which reduces the fragments that the data generates on the hard disk.
With the data cached in memory, a read can be served directly from memory, which avoids reading the data from the log area.
Furthermore, managing data in memory by data linked list allows the mapping relations to be established and changed. The data and fragments of the log area can be sorted out according to the mapping relations, which avoids or reduces hard disk fragments and improves the efficiency of fragment processing. To make full use of memory space, the hotspot data cached in memory can be released and the memory space reclaimed, so that the memory can cache other data. Therefore, in the method of the embodiment of the present invention, memory space can be reclaimed under a preset release condition. For example, cache eviction is triggered when the reclamation water level of the system memory is reached. An exemplary reclamation method, divided into two stages, is described below.
First stage
When memory usage reaches a first preset water level, starting from the data linked list with the smallest current transaction number, the data linked lists are searched in ascending order of transaction number for data whose management has not been released. Then, for each target data linked list found, the data of that target data linked list whose management has not been released is released from memory, while the target mapping relation corresponding to the target data linked list is retained in memory.
Because data is managed by data linked lists, data managed by an earlier-established data linked list may already have been evicted from memory as a result of being modified by data written later, and the management of such evicted data has already been released from the data linked list; see the description of the data linked list management method above. Reclaiming memory therefore means finding the data in the data linked lists that has not yet been evicted and releasing that data from memory. A data linked list whose data is released during memory reclamation is referred to as a target data linked list, and the mapping relation corresponding to the target data linked list remains in memory. The first stage of memory reclamation stops when the memory space reaches a preset stop water level.
Through the steps above, data no larger than the preset data threshold, together with the metadata generated when data is written, is written to the log area and cached in memory at the same time. The hotspot data in the log area and in memory is managed in transaction order, as described above. When the system has run for some time and the cache reaches the first preset water level, a background cache reclamation thread is triggered and reclaims memory space in ascending order of transaction number. For example, starting from the data linked list with the smallest current transaction number, the data linked lists are searched in ascending order of transaction number for data whose management has not been released, so that data is released from memory starting from the data linked list that was established first.
The data that remains in an earlier-established data linked list without having been evicted from memory is less active data, which the device is less likely to read or modify. Releasing such data from memory therefore has little effect on the read performance of the device. Moreover, because the target mapping relation is kept in memory, when the device needs to read data that is not found in memory, it queries the mapping relations retained in memory; if the data is located through the target mapping relation, the corresponding data can be read from the log area according to that mapping relation. In this way, the memory can manage more hotspot data.
It can be understood that memory reaching the first preset water level is only one kind of preset release condition. The search for data whose management has not been released from the data linked lists may also be triggered under other preset release conditions, for example, when a set timer expires. The embodiment of the present invention does not specifically limit this.
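The first reclamation stage can be sketched as below. This is an illustrative sketch under stated assumptions: each transaction is modeled as a dictionary with `number`, `data`, and `mapping` fields, memory usage is counted in bytes, and the function name and the stop-level parameter are hypothetical.

```python
def reclaim_stage_one(transactions, used_bytes, stop_level):
    """Release cached data, oldest transaction first, until usage drops to stop_level.

    Only the cached data is dropped; every mapping relation stays in memory so
    the data can still be located in the log area later.
    """
    for txn in sorted(transactions, key=lambda t: t["number"]):
        for key in list(txn["data"]):
            if used_bytes <= stop_level:
                return used_bytes
            used_bytes -= len(txn["data"].pop(key))
    return used_bytes


newer = {"number": 2, "data": {"E": b"xxxx"}, "mapping": {"E": (1, 4)}}
older = {"number": 1, "data": {"B": b"xxxx", "C": b"xxxx"},
         "mapping": {"B": (1, 1), "C": (1, 2)}}
used = reclaim_stage_one([newer, older], used_bytes=12, stop_level=4)
```

In this example the two blocks of the older transaction are released first, which is enough to reach the stop level, so the newer transaction keeps its cached data while both mapping relations remain intact.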
Second stage
After the first stage of memory reclamation, when memory usage reaches a second preset water level, the data pointed to by the target mapping relation is read from the log area, the data pointed to by the target mapping relation is written to the data area, and the target mapping relation is deleted from memory.
To make further use of memory space, a second reclamation of memory space can be performed in the second stage. The second preset water level is the water level that triggers the second stage of memory reclamation. The second preset water level may be the same as or different from the first preset water level, and the embodiment of the present invention does not specifically limit this. Nor does the embodiment of the present invention limit the specific values of the first and second preset water levels; for example, they can be set flexibly according to the actual memory size and the type of service.
After the data pointed to by the target mapping relation has been read from the log area, that data can be written to the data area. By the time memory reaches the second preset water level, the memory has been in use for some time and more and more mapping relations have accumulated in it. When the cached mapping relations themselves reach the cache water level, the data they point to needs to be read and written to the data area.
Because the data linked list management method above has been performed, when memory reaches the second preset water level, the data that has not been released from the earlier-established data linked lists is unlikely to be modified: if data managed on a data linked list is modified by data written later, it migrates from that data linked list to the data linked list of the later transaction. The data remaining in an earlier-established data linked list can therefore be regarded as inactive data. Because such inactive data is unlikely to be modified, it generates few hard disk fragments through modification, so the data pointed to by the target mapping relation can be written to the data area. For example, the data is read from the log area according to the target mapping relation, the data manager allocates data area space on the data area, and the data is then written to the allocated data area space.
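The second reclamation stage can be sketched as follows, assuming a mapping entry has the form `key -> (log_area_id, offset)`, each log area is a list of blocks, and a dictionary stands in for the space handed out by the data manager. All names here are hypothetical.

```python
def reclaim_stage_two(target_mapping, log_areas, data_area):
    """Move the data a retained mapping relation points to into the data area."""
    for key, (area_id, offset) in list(target_mapping.items()):
        block = log_areas[area_id][offset]  # read the data from the log area
        data_area[key] = block              # write it to the data area
        del target_mapping[key]             # drop the mapping entry from memory


log_areas = {1: [b"A1", b"B1", b"C1", b"D1"]}
target_mapping = {"B": (1, 1), "C": (1, 2)}
data_area = {}
reclaim_stage_two(target_mapping, log_areas, data_area)
```

Once the mapping relation is empty, nothing in memory refers to log area 1 any more, which is what later allows the log area to be reclaimed as a whole.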
After the steps above have been performed, when the device needs to read data on the hard disk, for example data that has been written to the log area, the hard disk control device first searches memory and, on a hit, returns the data directly. After the first stage of memory reclamation, some hotspot data has been deleted from memory while the corresponding mapping relations are retained. Therefore, if the data to be accessed is not found in memory but can be found in a mapping relation cached in memory, the data can be read directly from the log area corresponding to that mapping relation. If the data cannot be found in the mapping relations either, it is looked up in and read from the data area.
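The three-level lookup order described above can be sketched as a single function. The container shapes (a dictionary as the memory cache, a list of mapping dictionaries, lists of blocks as log areas) and the returned source label are assumptions made for illustration.

```python
def read(key, memory_cache, mappings, log_areas, data_area):
    """Look in memory first, then in the retained mapping relations, then in the data area."""
    if key in memory_cache:
        return memory_cache[key], "memory"
    for mapping in mappings:
        if key in mapping:
            area_id, offset = mapping[key]
            return log_areas[area_id][offset], "log"
    return data_area[key], "data"


memory_cache = {"E": b"E1"}
mappings = [{"B": (1, 1)}]
log_areas = {1: [b"A1", b"B1"]}
data_area = {"Z": b"Z0"}
hit_memory = read("E", memory_cache, mappings, log_areas, data_area)
hit_log = read("B", memory_cache, mappings, log_areas, data_area)
hit_data = read("Z", memory_cache, mappings, log_areas, data_area)
```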
It can be understood that, in embodiments in which data is written to the log area by sequential appending, after the data linked lists of multiple consecutive transactions have been fully evicted, for example after the second stage of memory reclamation above has been performed, the log areas corresponding to the deleted mapping relations are also completely released and can be reused as empty log areas. The specific method for releasing and reclaiming log areas is described below.
In some embodiments, after the method above is performed, the embodiment of the present invention further includes an operation of reclaiming log areas, which reduces the fragments on the log areas and makes full use of the log area space.
The reclamation of log areas is described below on the basis of the embodiment above in which data is written to the log area by sequential appending. That is, in step 208 the target data of a transaction is appended sequentially to the log area space, so the data on the log area is stored in transaction order.
The specific method for reclaiming log areas may, for example, include the following steps:
A1: Under a preset reclamation condition, perform the step of moving log area data.
The preset reclamation condition includes at least one of the following: a timer expires, a reclamation operation on memory data is completed, or the total log area water level reaches a preset water level threshold. Completion of a reclamation operation on memory data means that, during the memory reclamation described above, the completion of each stage triggers the step of moving log area data; that is, it triggers the log reclamation thread to perform the log area reclamation procedure.
A2: When the total space water level of the current log areas reaches a preset space threshold, stop performing the step of moving log area data; otherwise, continue performing the step of moving log area data.
The step of moving log area data includes:
B1: Starting from the data linked list with the smallest current transaction number, search for the mapping relations corresponding to the data linked lists in ascending order of transaction number.
After the step in the method above of allocating transaction numbers in increasing order to the data linked lists corresponding to the transactions according to the write order of the transactions has been performed, the hard disk control device searches for the mapping relations corresponding to the data linked lists in ascending order of transaction number. Because a mapping relation records the relationship between data and the space allocated to that data on the log area, analyzing the distribution on the log area of the data blocks in the mapping relation shows how much data on the corresponding log area has not yet been migrated; the data that has not been migrated is still located on that log area.
Moreover, as transactions advance, the next log area is switched to only after the current log area has been used up, so the data of consecutive transactions is written to consecutive log areas. Analyzing the mapping relations in ascending order of transaction number therefore yields the data storage state of consecutive log areas.
B2: Determine, according to the information recorded in the mapping relation, whether the data on the first log area corresponding to the mapping relation has been fully migrated.
When data in memory is managed by data linked list, the mapping relation corresponding to data that migrates in memory is modified as well. If the information recorded by the data linked list shows that the data on the log area has been fully migrated in memory, for example because all data of that log area has a newer version in other log areas, or because the data on that log area was moved to the data area in the second stage of memory reclamation, it is determined that the data of the log area corresponding to the mapping relation has been fully migrated.
B3: If the data of the first log area has been fully migrated, reclaim the first log area.
If the data of the first log area has been fully moved, the first log area is reclaimed. In embodiments in which a superblock records the usage of the log areas, the information in the superblock corresponding to the first log area can be cleared at this point.
B4: If the data on the first log area has not been fully migrated, determine the space utilization of the first log area according to the information recorded in the mapping relation.
Because the mapping relation records the relationship between data and the space allocated to that data on the log area, the space utilization of the first log area can be derived from the information recorded in the mapping relation.
B5: When the space utilization of the first log area is less than a preset utilization threshold, move the data of the first log area to a second log area, and update the mapping relations corresponding to the moved data.
The second log area is an idle log area or a log area already used during log area reclamation, for example the log area used in the previous log area reclamation. The preset utilization threshold can be set according to the specific usage. For example, when many log areas are in use but the utilization of each is low, the device can lower the preset utilization threshold. The preset utilization threshold may, for example, be set to 50%; the embodiment of the present invention does not specifically limit this. Updating the mapping relations corresponding to the moved data may mean that, after data has been moved on the log areas, the correspondence between that data and the new log area space is promptly updated in the mapping relations related to the data.
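Steps B1 to B5 can be sketched as follows. The sketch makes several assumptions that are not stated in the text: log areas are visited in ascending identifier order as a stand-in for ascending transaction number (which holds when consecutive transactions fill consecutive log areas), utilization is the number of live mapping entries divided by a fixed capacity, a log area is treated as free once its live data has been moved out, and the stop condition of step A2 is omitted for brevity. All function and parameter names are hypothetical.

```python
def live_entries(mappings, area_id):
    """All (mapping, key, offset) entries that still point into the given log area."""
    return [(m, k, off) for m in mappings
            for k, (a, off) in m.items() if a == area_id]


def reclaim_log_areas(mappings, log_areas, capacity, spare_id, threshold=0.5):
    """Free fully migrated log areas and compact sparsely used ones into a spare area."""
    freed = []
    for area_id in sorted(a for a in log_areas if a != spare_id):  # B1 (by area order)
        live = live_entries(mappings, area_id)
        if not live:                                   # B2/B3: everything migrated
            log_areas[area_id].clear()
            freed.append(area_id)
        elif len(live) / capacity < threshold:         # B4/B5: low utilization
            spare = log_areas[spare_id]
            for mapping, key, offset in live:
                new_offset = len(spare)
                spare.append(log_areas[area_id][offset])
                mapping[key] = (spare_id, new_offset)  # update the mapping relation
            log_areas[area_id].clear()
            freed.append(area_id)
    return freed


log_areas = {
    1: [b"A1", b"B1", b"C1", b"D1"],   # first transaction, fully superseded
    2: [b"A2", b"E1", b"F1", b"D2"],   # only E1 is still live
    3: [b"A3", b"B2", b"C2", b"D3"],   # fully live
    4: [],                             # idle log area used as the second log area
}
mappings = [
    {},
    {"E": (2, 1)},
    {"A": (3, 0), "B": (3, 1), "C": (3, 2), "D": (3, 3)},
]
freed = reclaim_log_areas(mappings, log_areas, capacity=4, spare_id=4)
```

Here log area 1 is freed outright, log area 2 is at 25% utilization and is compacted into log area 4, and log area 3 is left alone because it is fully used.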
With the log area reclamation method above, log areas can be reclaimed, fragments on the log areas are reduced, and the log area space is fully used. Moreover, because the log areas are read and written sequentially, in correspondence with the ascending transaction numbers of the data linked lists, the corresponding mapping relations can be queried by transaction number and the data moved accordingly. No metadata is needed to record the positions of data in the log areas, which reduces metadata overhead.
In the method of the embodiment of the present invention, the hard disk control device includes a hard disk, and the hard disk includes a data area and a log area. After data is written to the caching device, the hard disk control device determines whether the data is hotspot data. If the data is not hotspot data, data area space is allocated for the data in the data area and the data is written to the data area space; if the data is hotspot data, log area space is allocated for the data in the log area and the data is written to the log area space. The data to be written to the hard disk is thus divided into hotspot data and non-hotspot data. Hotspot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases; that is, hotspot data readily causes hard disk fragmentation. Hotspot data is stored on the log area and managed in log mode, so that even if frequent modification of the data on the log area generates hard disk fragments, those fragments are easy to reclaim and otherwise manage. Non-hotspot data is stored in the data area; because releasing non-hotspot data does not readily generate hard disk fragments, the data area need not allocate extra resources for fragment management. By storing different types of data in different areas of the hard disk and managing them in different ways, the efficiency of fragment management on the hard disk is improved, and the efficient management of fragments by the log area reduces the generation of hard disk fragments.
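The routing decision can be sketched as below. The hotspot test used here, metadata or data smaller than a size threshold, is the practical criterion mentioned elsewhere in this description rather than the fragment-based definition given above; the 4096-byte threshold and the function names are hypothetical, and plain lists stand in for the two disk areas.

```python
def is_hotspot(data: bytes, is_metadata: bool, size_threshold: int) -> bool:
    # Assumed criterion: metadata, or data smaller than the preset data threshold.
    return is_metadata or len(data) < size_threshold


def route_write(data, is_metadata, size_threshold, log_area, data_area):
    """Append hotspot data to the log area and everything else to the data area."""
    target = log_area if is_hotspot(data, is_metadata, size_threshold) else data_area
    target.append(data)
    return "log" if target is log_area else "data"


log_area, data_area = [], []
small = route_write(b"x" * 512, False, 4096, log_area, data_area)
large = route_write(b"x" * 8192, False, 4096, log_area, data_area)
meta = route_write(b"m" * 8192, True, 4096, log_area, data_area)
```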
Because of the locality of hotspot data, much of the data in a log area has already been migrated away, so little data needs to be read when the log area is actually sorted out, and the data management efficiency of the log area is high. In addition, because hotspot data, which readily produces fragments, is stored in the log area, fragments are concentrated in the log area, which further improves the efficiency of defragmentation.
When data is written to the log area by sequential appending, the log area only needs to read and write data sequentially, moving data generates no metadata overhead, and the log area is defragmented efficiently. Sorting out the data of the log area effectively avoids or reduces hard disk fragments.
In some embodiments, data may also be written to the log area in a manner other than sequential appending. To track the space allocation of the log area, a bitmap can be used to manage the corresponding log area.
For example, each bit corresponds to a block of fixed size, such as 4K, so a 256M log area needs an 8K bitmap to manage it, and this bitmap must be modified on every write to the log area.
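The arithmetic behind the 8K figure is 256M / 4K = 65,536 blocks, one bit each, and 65,536 / 8 = 8,192 bytes. A minimal bitmap sketch follows; the helper names are hypothetical.

```python
BLOCK_SIZE = 4 * 1024                # bytes covered by one bit
LOG_AREA_SIZE = 256 * 1024 * 1024    # size of one log area in bytes

block_count = LOG_AREA_SIZE // BLOCK_SIZE   # one bit per block
bitmap = bytearray(block_count // 8)        # eight bits per byte


def mark_used(bitmap, block):
    """Set the bit for a block when it is written."""
    bitmap[block // 8] |= 1 << (block % 8)


def is_used(bitmap, block):
    """Report whether the bit for a block is set."""
    return bool(bitmap[block // 8] & (1 << (block % 8)))


mark_used(bitmap, 10)
```

Every write that allocates a block has to touch this bitmap, which is the extra cost the sequential-append mode avoids.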
It can be understood that, in an embodiment of the present invention, the data area and the log area may use tiered storage; that is, the data area and the log area are located on different tiers, and hotspot data is moved to a lower storage tier when the log area is reclaimed.
It can be understood that, in some embodiments, the mapping relations may be searched in other ways, for example by querying the mapping relations randomly, then analyzing the space storage state of the corresponding log area according to the mapping relations found, and then moving the log area data. In this case the target data of a transaction need not be written to the log area space by sequential appending, and the specific write mode is not limited. With this approach, however, the mapping relations may not reflect the entire space storage state of the log area, so the log area reclamation may be less effective.
It can be understood that, in embodiments of the present invention that include multiple log areas and multiple data areas, in order to use the data areas and log areas more fully and thereby make full use of hard disk space, the method of the embodiment of the present invention further includes a step of converting between data areas and log areas. For example, when the space utilization of the data areas is greater than a preset data area utilization threshold, a currently idle log area is converted into a data area; when the space utilization of the log areas is greater than a preset log area utilization threshold, a data area that was converted from an idle log area is converted back into a log area.
For example, at initialization, half of the hard disk space is preset as log areas and the rest as data areas. After the system has run for some time and the utilization of the data areas is high, an idle log area is searched for in increasing order of the log area identification information and converted into a data area. After the conversion, in embodiments that include block groups, the space is managed using the space management object of the block group. The state of the log area is set to data area, recorded in the management structure of the block group, and persisted to disk. When the data on that data area is deleted, hard disk space is released; once the space of a data area converted from a log area has been released entirely, the state of that data area is switched back to log area, recorded in the management structure of the block group, and persisted to disk.
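The conversion between log areas and data areas can be sketched as follows. The area records, the 0.8 thresholds, and the requirement that a converted data area be empty before it is switched back are assumptions combining the two paragraphs above; persisting the state change to the block group management structure is not modeled.

```python
def utilization(areas, role):
    """Used space divided by total space over all areas currently in the given role."""
    members = [a for a in areas if a["role"] == role]
    total = sum(a["size"] for a in members)
    return sum(a["used"] for a in members) / total if total else 0.0


def rebalance(areas, data_limit=0.8, log_limit=0.8):
    """Convert one area if a threshold is exceeded; return its id, or None."""
    by_id = sorted(areas, key=lambda a: a["id"])
    if utilization(areas, "data") > data_limit:
        for a in by_id:
            if a["role"] == "log" and a["used"] == 0:       # idle log area
                a["role"], a["origin"] = "data", "log"
                return a["id"]
    if utilization(areas, "log") > log_limit:
        for a in by_id:
            if a["role"] == "data" and a.get("origin") == "log" and a["used"] == 0:
                a["role"] = "log"                           # switch back
                return a["id"]
    return None


areas = [
    {"id": 0, "role": "log", "used": 3, "size": 10},
    {"id": 1, "role": "log", "used": 0, "size": 10},
    {"id": 2, "role": "data", "used": 9, "size": 10},
]
first = rebalance(areas)       # data areas are 90% full: idle log area 1 is converted
role_after_first = areas[1]["role"]
second = rebalance(areas)      # no threshold exceeded now
areas[0]["used"] = 9           # log areas become 90% full
third = rebalance(areas)       # the empty converted area is switched back
```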
Fig. 5 is a schematic structural diagram of a hard disk control device according to an exemplary embodiment. The hard disk control device includes a hard disk, the hard disk includes a data area and a log area, and the hard disk control device is configured to perform the functions performed by the hard disk control device in the embodiment corresponding to Fig. 2 above. Referring to Fig. 5, the hard disk control device includes:
A writing unit 501, configured to write data to a caching device;
A cache manager 502, configured to determine whether the data is hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases;
A data manager 503, configured to, if the data is not hotspot data, allocate data area space for the data in the data area and write the data to the data area space;
A log manager 504, configured to, if the data is hotspot data, allocate log area space for the data in the log area and write the data to the log area space.
Optionally, the caching device is memory, and the hard disk control device further includes:
A mapping relation establishing unit 505, configured to establish in memory the mapping relation between the data and the log area space.
Optionally, the hard disk control device further includes:
A cache unit 506, configured to cache the data belonging to hotspot data in memory.
Optionally,
The mapping relation establishing unit 505 is further configured to establish the mapping relations between multiple pieces of target data and the log area space allocated to them, where the target data belongs to hotspot data;
The log manager 504 is further configured to combine the write operations of the multiple pieces of target data into one transaction and write all target data of the transaction to the log area space, where, if the write operation of any one piece of target data of the transaction fails, the write operations of the other target data of the transaction fail.
Optionally,
The hard disk control device further includes:
A linked list establishing unit 509, configured to establish a data linked list from the multiple pieces of target data, where the data linked list is used to manage target data, and the target data managed by the data linked list is the same as the target data of the transaction;
A linked list management unit 510, configured to manage the target data according to the data linked list;
A searching unit 511, configured to search, under a preset release condition and in order of establishment of the data linked lists from earliest to latest, for data whose management has not been released from the data linked lists;
A memory management unit 512, configured to release from memory the data of the target data linked list whose management has not been released, and retain in memory the target mapping relation corresponding to the target data linked list;
The linked list management unit 510 is further configured to:
After a second data linked list is established, when second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, release the management of the first target data on the first data linked list, and delete the information about the first target data from the first mapping relation corresponding to the first data linked list.
Optionally,
The searching unit 511 is further configured to, when memory reaches the first preset water level, search, in order of establishment of the data linked lists from earliest to latest, for data whose management has not been released from the data linked lists;
The hard disk control device further includes:
A reading unit 523, configured to read from the log area, when memory reaches the second preset water level, the data pointed to by the target mapping relation;
A mapped data writing unit 513, configured to write the data pointed to by the target mapping relation to the data area;
A deleting unit 514, configured to delete the target mapping relation from memory.
Optionally,
The hard disk control device further includes:
A transaction number allocation unit 515, configured to allocate transaction numbers in increasing order to the data linked lists corresponding to the transactions according to the write order of the transactions;
The searching unit 511 is further configured to search, starting from the data linked list with the smallest current transaction number and in ascending order of transaction number, for data whose management has not been released from the data linked lists.
Optionally,
The hard disk control device further includes:
A reclaiming unit 516, configured to perform the step of moving log area data under a preset reclamation condition;
As shown in Fig. 6, for performing the step of moving log area data, the reclaiming unit 516 includes:
A reclamation searching module 517, configured to search for mapping relations;
A reclamation judging module 518, configured to determine, according to the information recorded in a mapping relation, whether the data on the first log area corresponding to the mapping relation has been fully migrated;
A reclamation determining module 519, configured to, if the data on the first log area has not been fully migrated, determine the space utilization of the first log area according to the information recorded in the mapping relation;
A reclamation executing module 520, configured to, when the space utilization of the first log area is less than a preset utilization threshold, move the data of the first log area to a second log area and update the mapping relations corresponding to the moved data, where the second log area is an idle log area or a log area already used during log area reclamation;
A reclaiming module 521, configured to stop performing the step of moving log area data when the total space water level of the current log areas reaches a preset space threshold, and otherwise continue performing the step of moving log area data.
Optionally,
The hard disk control device further includes:
A transaction number allocation unit 515, configured to allocate transaction numbers in increasing order to the data linked lists corresponding to the transactions according to the write order of the transactions;
The reclamation searching module 517 is further configured to search, starting from the data linked list with the smallest current transaction number and in ascending order of transaction number, for the mapping relations corresponding to the data linked lists.
Optionally, the preset reclamation condition includes at least one of the following: a timer expires, a reclamation operation on memory data is completed, or the total log area water level reaches a preset water level threshold.
Optionally,
The cache unit 506 is further configured to cache the data in memory if the data is hotspot data, or to cache in the caching device data read from the log area.
Optionally, the hotspot data includes data whose size is less than a preset data threshold, and/or the hotspot data includes metadata.
Optionally,
The log manager 504 is further configured to allocate log area space for the data sequentially in the log area and append the data sequentially to the log area space.
Optionally,
The hard disk control device further includes:
A data area conversion unit 522, configured to convert a currently idle log area into a data area when the space utilization of the data areas is greater than a preset data area utilization threshold;
A log area conversion unit 524, configured to convert a data area that was converted from an idle log area back into a log area when the space utilization of the log areas is greater than a preset log area utilization threshold.
Optionally,
The hard disk further includes a superblock, each log area is assigned identification information, and the superblock is used to record, after a log area is modified, the identification information of the modified log area.
Optionally, log areas and data areas are arranged alternately on the hard disk.
Optionally,
The hard disk further includes block groups, each block group includes a preset number of log areas and data areas, and the log areas and data areas of a block group are arranged contiguously;
The data manager 503 includes:
A free area determining module 525, configured to determine an idle target data area according to the management information of the block group;
An allocation module 508, configured to allocate data area space for the data in the target data area;
The hard disk control device further includes:
A metadata generation module 507, configured to generate target metadata according to the data and the data area space;
The writing unit 501 is further configured to write the target metadata to the caching device;
After the cache manager 502 determines that the target metadata is hotspot data, the log manager 504 is further configured to determine the target block group to which the target data area belongs, determine an available log area of the target block group, and write the target metadata to the available log area.
Optionally,
The log manager 504 is further configured to allocate log area space for the mapping relation on the log area and write the mapping relation to the log area space allocated to the mapping relation.
In conclusion the hard disk includes data field and log area, writing unit on the hard disk control device for including hard disk After 501 are written data to caching device, cache manager 502 judges whether the data are hot spot datas, if the data are not heat Point data, then data management system 503 matches data field space in data separation for the data, writes the data into data field space; If the data are hot spot datas, log manager 504 is the data in log area distribution log area space, is write the data into Log area space.In this way, the data for being written into hard disk are divided into hot spot data and non-thermal point data, hot spot data is to be stored in firmly Hard disk can be made to generate the data of preset quantity fragment after on disk after the modification of preset times and release, hot spot data is easy to cause Hard disk generates fragment, and hot spot number is stored on log area, is managed with log mode, even if the data on log area are frequent Modification generates hard disk fragment, also facilitates and carries out the management such as recycling to these fragments, and non-thermal point data is stored in data field, non- The release of hot spot data does not easily lead to hard disk and generates fragment, and data field, which may not need, distributes excess resource for hard disk debris management, To be managed, be can be improved in different ways by the way that different types of data are stored in different regions on hard disk Debris management efficiency on hard disk, efficient management of the log area to hard disk fragment, can reduce the generation of hard disk fragment.
Fig. 7 is a schematic diagram of the hardware structure of a hard disk control device provided by another embodiment of the present invention. The hard disk control device includes a processor (CPU) 701, a cache device 703, a hard disk 702, a hard disk controller 705 and a bus 704. The hard disk 702 includes a data area and a log area. In some embodiments the cache device may be, for example, memory.
The steps performed by the hard disk control device in the above embodiments may be based on the structure of the hard disk control device shown in Fig. 7.
The processor 701 executes a program so that the hard disk control device performs the hard disk data management method described above. The various optional designs are as follows.
The processor 701 executes a program so that the hard disk control device has the following functions: writing data to the cache device; judging whether the data is hot spot data, wherein hot spot data is data that, after being stored on the hard disk, can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases; if the data is not hot spot data, allocating data area space for the data in the data area and writing the data to the data area space; if the data is hot spot data, allocating log area space for the data in the log area and writing the data to the log area space.
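The write path described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Disk` class, the `write` function and the `is_hot` predicate are hypothetical names that do not come from the patent, and the two areas are modelled as plain in-memory containers rather than real disk regions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Disk:
    """Toy hard disk: a data area and a log area (hypothetical model)."""
    data_area: Dict[str, bytes] = field(default_factory=dict)
    log_area: List[Tuple[str, bytes]] = field(default_factory=list)


def write(disk: Disk, cache: Dict[str, bytes], key: str, payload: bytes,
          is_hot: Callable[[str, bytes], bool]) -> str:
    """Write to the cache device first, then route by the hot spot judgment."""
    cache[key] = payload                      # step 1: write to the cache device
    if is_hot(key, payload):                  # step 2: judge hot spot data
        disk.log_area.append((key, payload))  # hot: goes to the log area
        return "log"
    disk.data_area[key] = payload             # not hot: goes to the data area
    return "data"
```

The hot spot judgment is passed in as a function because the text leaves the exact criterion to the optional designs.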
A kind of optional design, caching device are memory, which executes program, so that hard disk control device has Following function: after distributing log area space in log area for data, the mapping relations of data and log area space are established.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: establishing more The mapping relations in the log area space that a target data and multiple target datas are assigned to, wherein target data belongs to hot spot number According to;Multiple write operation groups of multiple target datas are combined into an affairs;All target datas write-in log area of affairs is empty Between, when the write operation of one of target data of affairs executes failure, the write operation of other target datas execution of affairs Failure.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: in memory It is upper to cache the data for belonging to hot spot data.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: by affairs All target datas write-in log area space before, establish data link table according to multiple target datas, wherein data link table use In management objectives data, the target data of data link table management and the target data of affairs are identical;According to data link table to target Data are managed;Under default release conditions, according to foundation sequence of the data link table after arriving first, searches data link table and do not solve Except the data of management;The data that target data chained list does not decontrol are discharged on memory, and retain target data on memory The corresponding target mapping relations of chained list;Wherein, target data is managed according to data link table, comprising: establish the second data After chained list, when the second target data of the second data link table management is by the first mesh of the first data link table management pre-established When mark data modification obtains, the management to first object data is released on the first data link table;With the first data link table pair The information of first object data is deleted in the first mapping relations answered
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: working as memory When reaching the first preset water level, according to foundation sequence of the data link table after arriving first, the number that data link table does not decontrol is searched According to;
After discharging the data that target data chained list does not decontrol on memory, reach the second preset water level in memory When, the data that target mapping relations are directed toward are read from log area;
Data field is written in the data that target mapping relations are directed toward;
The delete target mapping relations on memory.
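The two-stage reclaim can be sketched as below. The sketch makes assumptions the text does not state: memory use is modelled as cached payload bytes plus a fixed hypothetical cost per mapping entry (`ENTRY_COST`), stage one drops all cached data at once instead of walking the data linked lists, and the log area is a dictionary from offset to payload.

```python
from typing import Dict, List

ENTRY_COST = 16  # assumed bytes of memory per mapping entry (hypothetical)


def memory_used(cached: Dict[str, bytes], mapping: Dict[str, int]) -> int:
    return sum(len(v) for v in cached.values()) + ENTRY_COST * len(mapping)


def reclaim(cached: Dict[str, bytes], mapping: Dict[str, int],
            log_area: Dict[int, bytes], data_area: Dict[str, bytes],
            first_wm: int, second_wm: int) -> List[str]:
    """Stage 1 frees cached data; stage 2 moves mapped data to the data area."""
    actions: List[str] = []
    if memory_used(cached, mapping) >= first_wm:
        cached.clear()                         # data released, mapping kept
        actions.append("released cached data")
    if memory_used(cached, mapping) >= second_wm:
        for key, offset in list(mapping.items()):
            data_area[key] = log_area[offset]  # read from log, write to data area
            del mapping[key]                   # delete the mapping from memory
        actions.append("moved mapped data to data area")
    return actions
```

Stage two is what finally frees the memory held by the mapping relationships, at the cost of a read from the log area and a write to the data area.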
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
It is the corresponding data link table of affairs according to incremental rule distribution transaction number according to the write sequence of affairs;
Since the smallest data link table of Current transaction number, not according to the ascending sequential search data link table of transaction number The data to decontrol.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: default Under the conditions of recycling, execution journal area data move the step of;
The step of execution journal area data are moved, comprising:
Search mapping relations;
Judge whether the data on the first log area corresponding with mapping relations migrate according to the information of mapping relations record It is complete;
If the data on the first log area have not migrated, according to the information that mapping relations record, the first log area is determined On space utilization rate;
When the space utilization rate of the first log area is less than default utilization rate threshold values, extremely by the Data Migration of the first log area Second log area, and update corresponding with the data moved mapping relations, wherein the second log area be the free time log area or The used log area when recycling log area;
When total space water level reaches pre-set space threshold values when current log area, then stop the resettlement of execution journal area data Otherwise step continues to execute the step of log area data are moved.
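The relocation loop above can be sketched in Python. The function `relocate` and its parameters are hypothetical, and the sketch makes explicit assumptions: each log area is a dictionary of live data, the stop condition is interpreted as "enough log areas are empty", and the capacity of the destination log area is not checked.

```python
from typing import Dict


def relocate(areas: Dict[int, Dict[str, bytes]], mapping: Dict[str, int],
             capacity: int, dest: int, util_threshold: float,
             free_target: int) -> int:
    """Move live data out of sparsely used log areas into `dest`.

    Returns the number of empty log areas (excluding `dest`) afterwards.
    """
    def free_count() -> int:
        return sum(1 for a, live in areas.items() if a != dest and not live)

    for src in sorted(areas):
        if free_count() >= free_target:
            break                              # stop condition reached
        live = areas[src]
        if src == dest or not live:
            continue                           # destination, or already migrated
        utilization = sum(len(v) for v in live.values()) / capacity
        if utilization < util_threshold:
            for key in list(live):
                areas[dest][key] = live.pop(key)   # migrate the data
                mapping[key] = dest                # update the mapping
    return free_count()
```

Only log areas below the utilization threshold are migrated, so the cost of copying is spent where it frees the most space per byte moved.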
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
It is the corresponding data link table of affairs according to incremental rule distribution transaction number according to the write sequence of affairs;From current thing A business number the smallest data link table starts, according to the ascending sequential search of transaction number mapping relations corresponding with data link table;
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
Default recycling condition includes timer expiry, to the reclaimer operation of internal storage data is completed, the total water level in log area reaches At least one of preset water level threshold values.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
It is data cached on memory if data are hot spot datas after judging whether data are hot spot data;Alternatively,
To before caching device write-in data, data are read to caching device caching from log area.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
Hot spot data include size of data be less than preset data threshold values data and/or hot spot data include metadata.
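A hot spot judgment that follows this design could look like the short sketch below. The function name `is_hot_spot` and the 4096-byte default are assumptions made for illustration; the text only says that the threshold is preset.

```python
SMALL_DATA_THRESHOLD = 4096  # assumed value; the text only says "preset"


def is_hot_spot(payload: bytes, is_metadata: bool,
                threshold: int = SMALL_DATA_THRESHOLD) -> bool:
    """Treat metadata and data smaller than the threshold as hot spot data."""
    return is_metadata or len(payload) < threshold
```

Both criteria are cheap to evaluate at write time, which matters because the judgment sits on every write path.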
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
Log area space is distributed in sequence in log area for data, by the additional write-in log area space of data sequence.
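Sequential allocation with append-only writes can be sketched as follows. `AppendLog` is a hypothetical class; the log area is modelled as a byte buffer, and the key-to-offset dictionary stands in for the mapping relationship described in the other optional designs.

```python
from typing import Dict


class AppendLog:
    """Log area modelled as a byte buffer with a single append position."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.tail = 0                       # next free offset
        self.mapping: Dict[str, int] = {}   # key -> offset of the newest version

    def append(self, key: str, payload: bytes) -> int:
        if self.tail + len(payload) > len(self.buf):
            raise ValueError("log area space exhausted")
        offset = self.tail                  # space is handed out in order
        self.buf[offset:offset + len(payload)] = payload
        self.tail += len(payload)
        self.mapping[key] = offset          # older versions become garbage
        return offset
```

A modification never overwrites the old version in place; it appends a new version and repoints the mapping, which is why stale space in the log area can later be recycled in bulk.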
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
When the space utilization rate of data field, which is greater than preset data area, utilizes threshold values, convert the log area of current idle to Data field;
When the space utilization rate of log area, which is greater than default log area, utilizes threshold values, by what is be converted to by idle log area Data field is converted into log area.
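The two conversion rules can be sketched as a small rebalancing function. The `Area` class, the `was_log` flag and the choice to convert one area per call are assumptions made for illustration, and the sketch does not check whether a converted data area still holds data before it is turned back into a log area.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Area:
    kind: str               # "log" or "data"
    idle: bool = True
    was_log: bool = False   # True for a data area converted from an idle log area


def rebalance(areas: List[Area], data_util: float, log_util: float,
              data_threshold: float, log_threshold: float) -> str:
    """Convert at most one area per call, following the two threshold rules."""
    if data_util > data_threshold:
        for a in areas:
            if a.kind == "log" and a.idle:
                a.kind, a.was_log = "data", True
                return "log->data"
    if log_util > log_threshold:
        for a in areas:
            if a.kind == "data" and a.was_log:
                a.kind, a.was_log = "log", False
                return "data->log"
    return "none"
```

Only areas that started out as log areas are ever converted back, so the original data areas are never repurposed.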
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
Hard disk further includes superblock, and each log area is assigned identification information, and superblock is used for after log area is modified Record the identification information for the log area modified.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function:
Log area and data field are arranged alternately on hard disk.
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: hard disk is also Including block group, block group includes the log area and data field of preset quantity, and the log area and data field of chunking are continuously arranged, according to block The management information of group determines idle target data area;Data field space is distributed in target data area for data;
It writes data into after the space of data field, generates target metadata according to data and data field space;To buffer Target metadata is written in part;Metadata is judged to determine object block group belonging to target data area after hot spot data;Determine mesh Mark the available log area of block group;Available log area is written into target metadata.
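Choosing a log area in the same block group as the data can be sketched as below. The function `place_metadata` and the dictionary layout of a block group are hypothetical, and "available" is interpreted here as "has enough free space", which is an assumption.

```python
from typing import Dict, List


def place_metadata(groups: List[dict], free_space: Dict[int, int],
                   data_area_id: int, size: int) -> int:
    """Return a log area in the block group that owns `data_area_id`.

    `free_space` maps a log area id to its remaining bytes and is updated.
    """
    for group in groups:
        if data_area_id in group["data_areas"]:      # the target block group
            for log_id in group["log_areas"]:
                if free_space[log_id] >= size:       # an available log area
                    free_space[log_id] -= size
                    return log_id
            raise RuntimeError("no available log area in the block group")
    raise KeyError(data_area_id)
```

Keeping metadata in the same block group as the data it describes places the two in adjacent regions of the disk, since the areas of a block group are contiguous.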
A kind of optional design, the processor 701 execute program, so that hard disk control device has following function: for mapping Relationship distributes log area space on log area;The log area space that mapping relations write-in mapping relations are assigned to.
In conclusion the hard disk includes data field and log area, the processor on the hard disk control device for including hard disk After 701 are written data to caching device, which judges whether the data are hot spot datas, if the data are not hot spots Data, then the processor 701 matches data field space in data separation for the data, writes the data into data field space;If should Data are hot spot datas, then the processor 701 is the data in log area distribution log area space, write the data into log area Space.In this way, the data for being written into hard disk are divided into hot spot data and non-thermal point data, hot spot data is after being stored on hard disk Hard disk can be made to generate the data of preset quantity fragment after the modification and release of preset times, hot spot data is easy to that hard disk is caused to produce Raw fragment, hot spot number is stored on log area, is managed with log mode, even if the data on log area frequently modify production Stiff disk fragment also facilitates and carries out the management such as recycling to these fragments, and non-thermal point data is stored in data field, non-thermal points According to release do not easily lead to hard disk generate fragment, data field may not need for hard disk debris management distribute excess resource, thus, lead to It crosses and different types of data are stored in different regions on hard disk are managed in different ways, can be improved on hard disk Debris management efficiency, efficient management of the log area to hard disk fragment, can reduce the generation of hard disk fragment.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (24)

1. A hard disk data management method, characterized in that the method is applied to a hard disk control device that includes a hard disk, the hard disk including a data area and a log area, the method comprising:
Writing data to a cache device;
Judging whether the data is hot spot data, wherein the hot spot data is data that, after being stored on the hard disk, can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases;
If the data is not hot spot data, allocating data area space for the data in the data area, and writing the data to the data area space;
If the data is hot spot data, allocating log area space for the data in the log area, and writing the data to the log area space;
The method further comprising:
When the space utilization of the data area is greater than a preset data area utilization threshold, converting a currently idle log area into a data area;
When the space utilization of the log area is greater than a preset log area utilization threshold, converting a data area that was converted from an idle log area back into a log area.
2. The method according to claim 1, characterized in that the cache device is memory,
After the allocating log area space for the data in the log area, the method further comprises:
Establishing a mapping relationship between the data and the log area space.
3. The method according to claim 2, characterized in that
The establishing a mapping relationship between the data and the log area space comprises:
Establishing a mapping relationship between multiple pieces of target data and the log area space allocated to the multiple pieces of target data, wherein the target data is hot spot data;
The writing the data to the log area space comprises:
Combining the write operations of the multiple pieces of target data into one transaction;
Writing all target data of the transaction to the log area space, wherein, when the write operation of one piece of target data of the transaction fails, the write operations of the other target data of the transaction also fail.
4. The method according to claim 3, characterized in that the method further comprises:
Caching, in the memory, the data that is the hot spot data.
5. The method according to claim 4, characterized in that
Before the writing all target data of the transaction to the log area space, the method further comprises:
Establishing a data linked list according to the multiple pieces of target data, wherein the data linked list is used to manage the target data, and the target data managed by the data linked list is the same as the target data of the transaction;
Managing the target data according to the data linked list;
Under a preset release condition, searching the data linked lists, in the order in which they were established from earliest to latest, for data that has not been released from management;
Releasing from the memory the data that the target data linked list has not released from management, and keeping in the memory the target mapping relationship corresponding to the target data linked list;
Wherein the managing the target data according to the data linked list comprises:
After a second data linked list is established, when second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, releasing the management of the first target data on the first data linked list, and deleting the information of the first target data from the first mapping relationship corresponding to the first data linked list.
6. The method according to claim 5, characterized in that
The searching, under a preset release condition, the data linked lists in the order in which they were established from earliest to latest for data that has not been released from management comprises:
When the memory reaches a first preset watermark, searching the data linked lists, in the order in which they were established from earliest to latest, for data that has not been released from management;
After the releasing from the memory the data that the target data linked list has not released from management, the method further comprises:
When the memory reaches a second preset watermark, reading from the log area the data to which the target mapping relationship points;
Writing the data to which the target mapping relationship points to the data area;
Deleting the target mapping relationship from the memory.
7. The method according to claim 5, characterized in that
The method further comprises:
Under a preset recycling condition, performing the step of relocating log area data;
The step of relocating log area data comprises:
Looking up the mapping relationship;
Judging, according to the information recorded in the mapping relationship, whether the data on the first log area corresponding to the mapping relationship has been fully migrated;
If the data on the first log area has not been fully migrated, determining the space utilization of the first log area according to the information recorded in the mapping relationship;
When the space utilization of the first log area is less than a preset utilization threshold, migrating the data of the first log area to a second log area, and updating the mapping relationship corresponding to the relocated data, wherein the second log area is an idle log area or a log area used when recycling log areas;
When the total space watermark of the current log areas reaches a preset space threshold, stopping the step of relocating log area data; otherwise, continuing to perform the step of relocating log area data.
8. The method according to claim 7, characterized in that
The method further comprises:
Assigning transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the write order of the transactions;
In the step of relocating log area data, the looking up the mapping relationship comprises:
Starting from the data linked list with the smallest current transaction number, looking up, in ascending order of the transaction numbers, the mapping relationships corresponding to the data linked lists.
9. The method according to claim 4, characterized in that
The caching, in the memory, the data that is the hot spot data comprises:
After the judging whether the data is hot spot data, if the data is hot spot data, retaining the data in the memory; or,
Before the writing data to the cache device, reading data from the log area and caching it in the cache device.
10. The method according to any one of claims 1 to 9, characterized in that the hot spot data includes data whose size is less than a preset data threshold, and/or the hot spot data includes metadata.
11. The method according to any one of claims 1 to 9, characterized in that
The allocating log area space for the data in the log area and writing the data to the log area space comprises:
Allocating log area space for the data sequentially in the log area, and appending the data sequentially to the log area space.
12. The method according to any one of claims 2 to 9, characterized in that
The method further comprises:
Allocating log area space on the log area for the mapping relationship;
Writing the mapping relationship to the log area space allocated to the mapping relationship.
13. A hard disk control device, characterized in that the hard disk control device comprises:
A writing unit, configured to write data to a cache device;
A cache manager, configured to judge whether the data is hot spot data, wherein the hot spot data is data that, after being stored on the hard disk, can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases;
A data manager, configured to, if the data is not hot spot data, allocate data area space for the data in the data area of the hard disk, and write the data to the data area space;
A log manager, configured to, if the data is hot spot data, allocate log area space for the data in the log area of the hard disk, and write the data to the log area space;
The hard disk control device further comprises:
A data area conversion unit, configured to convert a currently idle log area into a data area when the space utilization of the data area is greater than a preset data area utilization threshold;
A log area conversion unit, configured to convert a data area that was converted from an idle log area back into a log area when the space utilization of the log area is greater than a preset log area utilization threshold.
14. The hard disk control device according to claim 13, characterized in that the cache device is memory, and the hard disk control device further comprises:
A mapping relationship establishing unit, configured to establish a mapping relationship between the data and the log area space.
15. The hard disk control device according to claim 14, characterized in that
The mapping relationship establishing unit is further configured to establish a mapping relationship between multiple pieces of target data and the log area space allocated to the multiple pieces of target data, wherein the target data is hot spot data;
The log manager is further configured to combine the write operations of the multiple pieces of target data into one transaction, and to write all target data of the transaction to the log area space, wherein, when the write operation of one piece of target data of the transaction fails, the write operations of the other target data of the transaction also fail.
16. The hard disk control device according to claim 15, characterized in that the hard disk control device further comprises:
A cache unit, configured to cache, in the memory, the data that is the hot spot data.
17. The hard disk control device according to claim 16, characterized in that
The hard disk control device further comprises:
A linked list establishing unit, configured to establish a data linked list according to the multiple pieces of target data, wherein the data linked list is used to manage the target data, and the target data managed by the data linked list is the same as the target data of the transaction;
A linked list management unit, configured to manage the target data according to the data linked list;
A searching unit, configured to search, under a preset release condition, the data linked lists in the order in which they were established from earliest to latest for data that has not been released from management;
A memory management unit, configured to release from the memory the data that the target data linked list has not released from management, and to keep in the memory the target mapping relationship corresponding to the target data linked list;
Wherein the linked list management unit is further configured to:
After a second data linked list is established, when second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, release the management of the first target data on the first data linked list, and delete the information of the first target data from the first mapping relationship corresponding to the first data linked list.
18. The hard disk control device according to claim 17, characterized in that
The searching unit is further configured to, when the memory reaches a first preset watermark, search the data linked lists, in the order in which they were established from earliest to latest, for data that has not been released from management;
The hard disk control device further comprises:
A reading unit, configured to read from the log area, when the memory reaches a second preset watermark, the data to which the target mapping relationship points;
A mapped data writing unit, configured to write the data to which the target mapping relationship points to the data area;
A deleting unit, configured to delete the target mapping relationship from the memory.
19. The hard disk control device according to claim 17, characterized in that
The hard disk control device further comprises:
A recycling unit, configured to perform, under a preset recycling condition, the step of relocating log area data;
For the step of relocating log area data, the recycling unit comprises:
A recycling search module, configured to look up the mapping relationship;
A recycling judgment module, configured to judge, according to the information recorded in the mapping relationship, whether the data on the first log area corresponding to the mapping relationship has been fully migrated;
A recycling determination module, configured to, if the data on the first log area has not been fully migrated, determine the space utilization of the first log area according to the information recorded in the mapping relationship;
A recycling execution module, configured to, when the space utilization of the first log area is less than a preset utilization threshold, migrate the data of the first log area to a second log area, and update the mapping relationship corresponding to the relocated data, wherein the second log area is an idle log area or a log area used when recycling log areas;
A recycling module, configured to stop the step of relocating log area data when the total space watermark of the current log areas reaches a preset space threshold, and otherwise to continue performing the step of relocating log area data.
20. The hard disk control device according to claim 19, characterized in that
The hard disk control device further comprises:
A transaction number allocation unit, configured to assign transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the write order of the transactions;
In the step of relocating log area data, the looking up the mapping relationship comprises:
Starting from the data linked list with the smallest current transaction number, looking up, in ascending order of the transaction numbers, the mapping relationships corresponding to the data linked lists.
21. The hard disk control device according to claim 16, characterized in that
The cache unit is further configured to retain the data in the memory if the data is hot spot data; or to read data from the log area and cache it in the cache device.
22. The hard disk control device according to any one of claims 13 to 21, characterized in that the hot spot data includes data whose size is less than a preset data threshold, and/or the hot spot data includes metadata.
23. The hard disk control device according to any one of claims 13 to 21, characterized in that
The log manager is further configured to allocate log area space for the data sequentially in the log area, and to append the data sequentially to the log area space.
24. The hard disk control device according to any one of claims 15 to 21, characterized in that
The log manager is further configured to allocate log area space on the log area for the mapping relationship, and to write the mapping relationship to the log area space allocated to the mapping relationship.
CN201610912077.5A 2016-10-19 2016-10-19 Hard disk data management method and hard disk control device Active CN106502587B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610912077.5A CN106502587B (en) 2016-10-19 2016-10-19 Hard disk data management method and hard disk control device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610912077.5A CN106502587B (en) 2016-10-19 2016-10-19 Hard disk data management method and hard disk control device

Publications (2)

Publication Number Publication Date
CN106502587A CN106502587A (en) 2017-03-15
CN106502587B true CN106502587B (en) 2019-10-25

Family

ID=58294298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610912077.5A Active CN106502587B (en) 2016-10-19 2016-10-19 Hard disk data management method and hard disk control device

Country Status (1)

Country Link
CN (1) CN106502587B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107885455B (en) * 2016-09-30 2020-02-07 郑州云海信息技术有限公司 Dynamic adjustment method for disk log area
CN107197191B (en) * 2017-05-27 2021-05-11 深圳市景阳科技股份有限公司 Writing method and device for network hard disk video
CN107688442B (en) * 2017-09-04 2020-11-20 苏州浪潮智能科技有限公司 Virtual block management method for solid state disk
CN107506156B (en) * 2017-09-28 2020-05-12 焦点科技股份有限公司 Io optimization method of block device
CN108920095B (en) * 2018-06-06 2021-06-29 深圳市脉山龙信息技术股份有限公司 Data storage optimization method and device based on CRUSH
EP3789883A4 (en) * 2018-06-30 2021-05-12 Huawei Technologies Co., Ltd. Storage fragment managing method and terminal
KR20200035592A (en) * 2018-09-27 2020-04-06 삼성전자주식회사 Method of operating storage device, storage device performing the same and storage system including the same
CN111125033B (en) * 2018-10-31 2024-04-09 深信服科技股份有限公司 Space recycling method and system based on full flash memory array
CN109558457B (en) * 2018-12-11 2022-04-22 浪潮(北京)电子信息产业有限公司 Data writing method, device, equipment and storage medium
CN111694703B (en) * 2019-03-13 2023-05-02 阿里云计算有限公司 Cache region management method and device and computer equipment
CN113010616A (en) * 2021-04-26 2021-06-22 广州小鹏汽车科技有限公司 Data processing method and data processing system
CN116069261A (en) * 2023-03-03 2023-05-05 苏州浪潮智能科技有限公司 Data processing method, system, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514260A (en) * 2013-08-13 2014-01-15 中国科学技术大学苏州研究院 Internal storage log file system and achieving method thereof
CN103544045A (en) * 2013-10-16 2014-01-29 南京大学镇江高新技术研究院 HDFS-based virtual machine image storage system and construction method thereof
CN105224237A (en) * 2014-05-26 2016-01-06 华为技术有限公司 Data storage method and device
CN105956090A (en) * 2016-04-27 2016-09-21 中国科学技术大学 I/O self-adaption-based file system log mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7272687B2 (en) * 2005-02-01 2007-09-18 Lsi Corporation Cache redundancy for LSI raid controllers

Also Published As

Publication number Publication date
CN106502587A (en) 2017-03-15

Similar Documents

Publication Publication Date Title
CN106502587B (en) Hard disk data management method and hard disk control device
CN106547703B (en) A kind of FTL optimization method based on block group structure
CN104461393B (en) Mixed mapping method of flash memory
CN104298610B (en) Data storage system and its management method
CN104035729B (en) Block device thin-provisioning method for log mapping
CN102789427B (en) Data memory device and its method of operating
CN106527969B (en) A kind of Nand Flash memorizer reading/writing method in a balanced way of life-span
CN104346357B (en) The file access method and system of a kind of built-in terminal
CN103777905B (en) Software-defined fusion storage method for solid-state disc
CN107066393A (en) The method for improving map information density in address mapping table
CN103019958A (en) Method for managing data in solid state memory through data attribute
CN108121503A (en) A kind of NandFlash address of cache and block management algorithm
CN106201916B (en) A kind of nonvolatile cache method towards SSD
US20140258596A1 (en) Memory controller and memory system
CN109582593B (en) FTL address mapping reading and writing method based on calculation
CN109164975A (en) A kind of method and solid state hard disk writing data into solid state hard disk
CN111158604B (en) Internet of things time sequence data storage and retrieval method for flash memory particle array
CN102646069A (en) Method for prolonging service life of solid-state disk
CN101354681A (en) Memory system, abrasion equilibrium method and apparatus of non-volatile memory
CN104598386B (en) By following the trail of and reusing solid-state drive block using two level map index
CN102163175A (en) Hybrid address mapping method based on locality analysis
CN107015763A (en) Mix SSD management methods and device in storage system
CN109947363A (en) A kind of data cache method of distributed memory system
CN110188108A (en) Data storage method, device, system, computer equipment and storage medium
CN109671458A (en) The method of management flash memory module and relevant flash controller

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant