CN106502587B - Hard disk data management method and hard disk control device - Google Patents
- Publication number: CN106502587B
- Application number: CN201610912077.5A
- Authority: CN (China)
- Prior art keywords: data, log area, hard disk, space, mapping relations
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F3/0604: Improving or facilitating administration, e.g. storage management
- G06F3/064: Management of blocks
- G06F3/0643: Management of files
- G06F3/0674: Disk device
Abstract
Embodiments of the present invention disclose a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk. The embodiments are applied to a hard disk control device that includes a hard disk, where the hard disk includes a data area and a log area. The method includes: writing data to a cache device; judging whether the data is hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases; if the data is not hotspot data, allocating data-area space for the data in the data area and writing the data to the data-area space; and if the data is hotspot data, allocating log-area space for the data in the log area and writing the data to the log-area space. Storing different types of data in different regions of the hard disk and managing them in different ways improves the efficiency of fragment management on the hard disk, and the efficient management of fragments by the log area can reduce the generation of hard disk fragments.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a hard disk data management method and a hard disk control device.
Background art
A conventional mechanical hard disk locates each read/write position by rotating the platters and moving the head, so sequential access is its optimal read/write pattern. If the disk space is fragmented, contiguous space cannot be allocated when data is written, and the head has to seek back and forth. Most of the transfer time is then spent locating tracks and sectors, leaving little time for actually transferring data. Because the data of a file ends up scattered, reading such files is also inefficient.
Most hard disk file systems therefore try to avoid producing large amounts of fragmented space, but fragmentation still cannot be avoided entirely.
For example, a copy-on-write (COW) mechanism can exploit the advantage of sequential writes to a hard disk. To modify a data block, the old version is not overwritten in place. Instead, the old version is read, modified, and written to a new location; the data to be written is aggregated and written to the disk sequentially, and the old version is released. Because the location of the data changes, the pointer in the upper-level index block that points to the data must also be modified, and this recurses up to the top level. A large amount of data is released as a result, which produces a large number of fragments on the hard disk.
Summary of the invention
Embodiments of the present invention provide a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk.
A first aspect of the present invention provides a hard disk data management method. The method is applied to a hard disk control device that includes a hard disk, the hard disk includes a data area and a log area, and the method includes:
The hard disk control device writes data to a cache device. The cache device may be, for example, memory, a flash card, a solid-state drive, or another storage device different from the hard disk. The hard disk control device then judges whether the data is hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases. Judging the written data on the cache device determines its type, so that different types of data can be handled in different ways.
When the data is written to the hard disk: if the data is not hotspot data, data-area space is allocated for it in the data area and the data is written to that data-area space; if the data is hotspot data, log-area space is allocated for it in the log area and the data is written to that log-area space.
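The routing step can be illustrated with a minimal Python sketch. The `Area` class, its bump allocator, and the `route_write` function are hypothetical names introduced only for illustration; the patent does not prescribe any particular allocator, and the hotspot judgment is passed in as a flag rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class Area:
    """Hypothetical model of one disk region with a simple bump allocator."""
    name: str
    capacity: int
    used: int = 0

    def allocate(self, size: int) -> int:
        """Reserve `size` bytes and return their starting offset."""
        if self.used + size > self.capacity:
            raise MemoryError(f"{self.name} area is full")
        offset = self.used
        self.used += size
        return offset


def route_write(data: bytes, is_hotspot: bool, data_area: Area, log_area: Area):
    """Pick the target area by data type and allocate space in it."""
    # Hotspot data goes to the log area, everything else to the data area.
    target = log_area if is_hotspot else data_area
    offset = target.allocate(len(data))
    return target.name, offset
```

A caller would first judge the data on the cache device and then pass the result as `is_hotspot`; the returned pair names the area and the offset that was allocated.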
In the hard disk data management method of the first aspect, data to be written to the hard disk is divided into hotspot data and non-hotspot data. Hotspot data tends to cause the hard disk to produce fragments, so it is stored in the log area and managed in log mode; even if frequent modification of the data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hotspot data is stored in the data area; releasing non-hotspot data does not easily produce fragments, so the data area need not devote extra resources to fragment management. Storing different types of data in different regions of the hard disk and managing them in different ways thus improves the efficiency of fragment management, manages the fragments on the hard disk effectively, and avoids or reduces the generation of hard disk fragments.
With reference to the first aspect, in a first possible implementation, the cache device is memory, and after log-area space is allocated for the data in the log area, the method further includes: establishing a mapping relation between the data and the log-area space. That is, after allocating log-area space for the data in the log area, the hard disk control device establishes on the cache device a mapping relation that records the correspondence between the data and the log-area space allocated to it. The mapping relation records how the data is stored in the log area, so it can be used to manage the data in the log area and operations such as eviction of the data in the cache device. In the first possible implementation the cache device is memory, but the cache device may also take other forms.
With reference to the first possible implementation of the first aspect, in a second possible implementation, establishing the mapping relation between the data and the log-area space includes: establishing a mapping relation between multiple pieces of target data and the log-area space allocated to them, where the target data is hotspot data.
Writing the data to the log-area space includes: combining the write operations of the multiple pieces of target data into one transaction, and writing all target data of the transaction to the log-area space. When the write operation of any one piece of target data in the transaction fails, the write operations of the other target data in the transaction fail as well. "Multiple pieces of target data" means at least two pieces of target data and, correspondingly, "multiple write operations" means at least two write operations. This borrows the concept of a transaction from the database field: operations are performed in units of multiple pieces of hotspot data. For example, one mapping relation is established for the hotspot data belonging to the same transaction, and the write operations of all hotspot data of the transaction are executed against the log area together. This can improve the efficiency of data processing.
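The all-or-nothing behaviour of such a transaction can be sketched as follows. This is only an illustration under stated assumptions: `LogArea` is a hypothetical in-memory stand-in for a log area, a write "fails" only when the area is full, and the mapping relation is modelled as a dictionary from a key to an (offset, length) pair.

```python
class LogArea:
    """Hypothetical in-memory stand-in for a log area."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.records = []  # (key, payload) pairs in append order


def write_transaction(log: LogArea, mapping: dict, writes: list) -> bool:
    """Append every (key, payload) in `writes`, or none of them."""
    saved_used, saved_count = log.used, len(log.records)
    staged = {}  # mapping entries of this transaction, not yet published
    for key, payload in writes:
        if log.used + len(payload) > log.capacity:
            # One write failed: undo every write of this transaction.
            log.used = saved_used
            del log.records[saved_count:]
            return False
        staged[key] = (log.used, len(payload))
        log.records.append((key, payload))
        log.used += len(payload)
    # All writes succeeded: publish the mapping for the whole transaction.
    mapping.update(staged)
    return True
```

Publishing the staged mapping entries only after every write has succeeded keeps the mapping relation consistent with what is actually stored in the log area.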
With reference to the second possible implementation of the first aspect, a third possible implementation further includes: caching the data that is hotspot data in memory. Hotspot data can be cached in memory in several ways. For example, when hotspot data is written to the log area, the data is also retained in memory; alternatively, before data is written to memory, the corresponding hotspot data is read from the log area and cached in memory. When data is subsequently written to memory, it can then be modified directly in memory, so the data migrates within memory, which reduces the generation of fragments on the hard disk, and the data in the log area can be consolidated according to how the data has migrated.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation, before all target data of the transaction is written to the log-area space, the method further includes:
establishing a data linked list from the multiple pieces of target data, where the data linked list is used to manage target data, and the target data it manages is the same as the target data of the transaction; and then managing the target data according to the data linked list. Managing the target data according to the data linked list includes: after a second data linked list is established, when second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, releasing the management of the first target data on the first data linked list, and deleting the information of the first target data from the first mapping relation corresponding to the first data linked list. In this way, the migration of data between different transactions can be managed in memory through data linked lists.
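The hand-over of a data block from an older transaction to a newer one might look like the sketch below. It is a simplification: each "data linked list" is modelled as a plain dictionary keyed by a hypothetical block key, and `record_write` is an invented helper name, not something taken from the patent.

```python
def record_write(lists: dict, mappings: dict, txn_id: int, key: str,
                 payload: bytes, log_offset: int) -> None:
    """Register `key` under transaction `txn_id`, migrating it from older ones.

    lists:    txn_id -> {key: payload}     (stand-in for the data linked lists)
    mappings: txn_id -> {key: log_offset}  (stand-in for the mapping relations)
    """
    for old_id, managed in lists.items():
        if old_id != txn_id and key in managed:
            # The older list stops managing the superseded version ...
            del managed[key]
            # ... and its mapping relation forgets the old log location.
            mappings.get(old_id, {}).pop(key, None)
    lists.setdefault(txn_id, {})[key] = payload
    mappings.setdefault(txn_id, {})[key] = log_offset
```

After a later transaction rewrites a block, only that transaction's list and mapping refer to the block, which is what lets the older log-area space be treated as reclaimable.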
The data in memory may be managed according to the data linked lists as follows: under a preset release condition, the data linked lists are searched, in the order in which they were established from earliest to latest, for data whose management has not been released; the data whose management has not been released by the target data linked list is released from memory, while the target mapping relation corresponding to the target data linked list is retained in memory. Releasing the data on the target data linked list expands the amount of data that memory can manage. By querying the target mapping relation, the system can still read the corresponding data from the log area.
With reference to the fourth possible implementation of the first aspect, in a sixth possible implementation, searching the data linked lists, under the preset release condition and in the order in which they were established from earliest to latest, for data whose management has not been released includes: when memory usage reaches a first preset watermark, searching the data linked lists in the order in which they were established, from earliest to latest, for data whose management has not been released. The method further includes a second stage of memory data eviction. That is, after the data whose management has not been released by the target data linked list is released from memory, the method further includes: when memory usage reaches a second preset watermark, reading from the log area the data that the target mapping relation points to; writing that data to the data area; and deleting the target mapping relation from memory. This two-stage memory data eviction mechanism expands the amount of data that memory can manage. In the second stage, the data that the target mapping relation points to is by then inactive and unlikely to be modified, so it can be moved from the log area to the data area for storage without significantly increasing the fragmentation of the data area.
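The two eviction stages can be sketched as two small functions. Everything here is an illustrative assumption: `TxnList` stands in for a data linked list together with its mapping relation, the log area and data area are plain dictionaries, and the caller supplies the current memory usage and the two watermarks.

```python
from dataclasses import dataclass, field


@dataclass
class TxnList:
    """Hypothetical stand-in for one data linked list and its mapping relation."""
    txn_id: int
    cached: dict = field(default_factory=dict)   # key -> bytes still held in memory
    mapping: dict = field(default_factory=dict)  # key -> offset in the log area


def stage1_release(txns, memory_used, first_watermark):
    """Stage 1: drop the cached data of the oldest list, keep its mapping."""
    if memory_used < first_watermark:
        return 0
    for txn in sorted(txns, key=lambda t: t.txn_id):  # earliest list first
        if txn.cached:
            freed = sum(len(v) for v in txn.cached.values())
            txn.cached.clear()
            return freed
    return 0


def stage2_demote(txns, log_store, data_store, memory_used, second_watermark):
    """Stage 2: move released data from the log area to the data area."""
    if memory_used < second_watermark:
        return 0
    for txn in sorted(txns, key=lambda t: t.txn_id):
        if not txn.cached and txn.mapping:
            for key, offset in txn.mapping.items():
                # Read what the mapping points to and store it in the data area.
                data_store[key] = log_store.pop(offset)
            moved = len(txn.mapping)
            txn.mapping.clear()  # the mapping relation is deleted from memory
            return moved
    return 0
```

The split keeps recently released data reachable through its mapping relation after stage 1, and only data that has stayed untouched until stage 2 is demoted to the data area.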
With reference to the fourth possible implementation of the first aspect, in a sixth possible implementation, the method further includes: assigning transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the order in which the transactions are written. Assigning transaction numbers to the data linked lists allows them to be managed by transaction number, which improves management efficiency. For example, starting from the data linked list with the smallest current transaction number, the data linked lists are searched in ascending order of transaction number for data whose management has not been released; this implements the search in order of establishment, from earliest to latest.
With reference to the fourth possible implementation of the first aspect, in a seventh possible implementation, the method further includes: under a preset reclamation condition, executing the step of log-area data migration. The step of log-area data migration includes, for example:
searching the mapping relations;
judging, according to the information recorded in a mapping relation, whether the data in the first log area corresponding to the mapping relation has been completely migrated;
if the data in the first log area has not been completely migrated, determining the space utilization of the first log area according to the information recorded in the mapping relation;
when the space utilization of the first log area is less than a preset utilization threshold, migrating the data of the first log area to a second log area and updating the mapping relation corresponding to the migrated data, where the second log area is an idle log area or the log area in use when log areas are being reclaimed; and
when the total space watermark of the current log areas reaches a preset space threshold, stopping the log-area data migration step; otherwise, continuing to execute the log-area data migration step.
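The migration steps above can be sketched roughly as follows. Several points are assumptions made only for illustration: log areas are dictionaries with a `capacity` and a `live` map of key to size, the "total space watermark" is interpreted as the fraction of log areas that hold no live data, and a capacity check on the target area is added that the text does not mention.

```python
def migrate_log_areas(areas, mapping, target_id, util_threshold, stop_idle_ratio):
    """Move live data out of sparsely used log areas into `target_id`.

    areas:   area_id -> {"capacity": int, "live": {key: size}}
    mapping: key -> area_id currently holding the key
    """
    def idle_ratio():
        # Assumed stop metric: share of log areas without any live data.
        return sum(1 for a in areas.values() if not a["live"]) / len(areas)

    target = areas[target_id]
    for area_id in sorted(areas):
        if idle_ratio() >= stop_idle_ratio:
            break  # enough space has been reclaimed
        area = areas[area_id]
        if area_id == target_id or not area["live"]:
            continue  # nothing left to migrate from this area
        live_bytes = sum(area["live"].values())
        target_free = target["capacity"] - sum(target["live"].values())
        if live_bytes / area["capacity"] < util_threshold and live_bytes <= target_free:
            for key, size in area["live"].items():
                target["live"][key] = size
                mapping[key] = target_id  # update the mapping relation
            area["live"].clear()
    return mapping
```

Only areas whose utilization is below the threshold are emptied, so densely used log areas are left in place and the amount of data copied per reclamation pass stays small.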
With reference to the seventh possible implementation of the first aspect, in an eighth possible implementation, the method further includes: assigning transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the order in which the transactions are written. Managing the data linked lists by transaction number improves management efficiency. For example, starting from the data linked list with the smallest current transaction number, the mapping relations corresponding to the data linked lists are searched in ascending order of transaction number, which implements the search of the mapping relations.
With reference to the seventh possible implementation of the first aspect, in a ninth possible implementation, the preset reclamation condition includes at least one of the following: a timer expires, the reclamation of memory data is completed, or the total watermark of the log area reaches a preset watermark threshold.
With reference to the first possible implementation of the first aspect, in a tenth possible implementation, the data that is hotspot data can be cached in memory in several ways. For example, after judging whether the data is hotspot data, if the data is hotspot data, the data is retained in memory; alternatively, before data is written to the cache device, the data is read from the log area into the cache device and cached there.
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in an eleventh possible implementation, hotspot data includes data whose size is smaller than a preset data threshold, and/or hotspot data includes metadata. The preset data threshold may be, for example, 128 KB or another size; the specific value can be adjusted according to the type of service. If the size of the data is smaller than the preset data threshold, frequently releasing the data may cause the hard disk to produce a large number of fragments. Metadata is data that manages other data, such as indirect blocks that store data addresses and metadata blocks that store object management structures. Metadata can also cause the hard disk to produce a large number of fragments. Such hotspot data is therefore filtered out and stored in the log area.
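A hotspot check along these lines could be written as the short sketch below. The function name and the choice to combine both criteria with "or" are assumptions; the text allows either criterion alone or both, and says the threshold may be tuned to the type of service.

```python
PRESET_DATA_THRESHOLD = 128 * 1024  # the 128 KB example value from the text


def is_hotspot(size: int, is_metadata: bool,
               threshold: int = PRESET_DATA_THRESHOLD) -> bool:
    """Treat metadata and data smaller than the threshold as hotspot data."""
    return is_metadata or size < threshold
```

Note that data of exactly the threshold size is not classified as hotspot data here, because the text speaks of data "smaller than" the preset data threshold.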
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a twelfth possible implementation, allocating log-area space for the data in the log area and writing the data to the log-area space includes: allocating log-area space for the data sequentially in the log area, and appending the data sequentially to the log-area space. This makes reads and writes in the log area sequential, so migrating the data of the log area incurs no metadata overhead. The cost of consolidation is small, which effectively ensures stable system performance.
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a thirteenth possible implementation, the method further includes: when the space utilization of the data area is greater than a preset data-area utilization threshold, converting a currently idle log area into a data area; and when the space utilization of the log area is greater than a preset log-area utilization threshold, converting a data area that was converted from an idle log area back into a log area. Log areas and data areas are thus converted into each other to adapt to changes in system capacity, which flexibly accommodates specific usage scenarios and improves the utilization of the hard disk.
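The conversion rule can be illustrated with the following sketch. The zone dictionaries, the `fixed` and `converted` flags, and the requirement that a data area be empty before it is converted back are assumptions for illustration; the `fixed` flag reflects the fixed log areas that the description says can only ever be used as log areas.

```python
def utilization(zones, kind):
    """Used fraction of all zones of the given kind ("data" or "log")."""
    selected = [z for z in zones if z["kind"] == kind]
    capacity = sum(z["capacity"] for z in selected)
    return sum(z["used"] for z in selected) / capacity if capacity else 0.0


def rebalance(zones, data_threshold, log_threshold):
    """Convert one zone if either kind is over its utilization threshold."""
    if utilization(zones, "data") > data_threshold:
        for z in zones:
            # Only an idle, non-fixed log area may become a data area.
            if z["kind"] == "log" and z["used"] == 0 and not z.get("fixed"):
                z["kind"], z["converted"] = "data", True
                return "log->data"
    if utilization(zones, "log") > log_threshold:
        for z in zones:
            # Assumption: only an empty, previously converted area goes back.
            if z["kind"] == "data" and z.get("converted") and z["used"] == 0:
                z["kind"], z["converted"] = "log", False
                return "data->log"
    return None
```

Each call converts at most one zone, so a caller can invoke it periodically and observe the effect on utilization before converting further zones.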
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a fourteenth possible implementation, the hard disk further includes a superblock, each log area is assigned identification information, and the superblock is used to record, after a log area is modified, the identification information of the modified log area. The superblock provides a further means of managing the log areas: for example, after the system recovers from a power failure or crash, the hard disk control device can promptly find the modified log areas according to the information recorded in the superblock.
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a fifteenth possible implementation, log areas and data areas are arranged alternately on the hard disk. In this way, the data of the log areas and the data of the data areas can be placed closer together.
With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a sixteenth possible implementation, the hard disk further includes block groups. A block group includes a preset number of log areas and data areas, the log areas and data areas of a block group are arranged contiguously, and the use of the data areas and log areas within a block group can be coordinated through the block group. For example, allocating data-area space for the data in the data area includes: after an idle target data area is determined according to the management information of a block group, allocating data-area space for the data in the target data area. After the data is written to the data-area space, the method further includes: generating target metadata according to the data and the data-area space; writing the target metadata to the cache device; after the hard disk control device judges that the metadata is hotspot data, determining the target block group to which the target data area belongs; determining an available log area of the target block group; and writing the target metadata to the available log area. In this way, the metadata is stored on the hard disk close to the location of the data it describes, which facilitates reading and writing the data.
With reference to any of the second to tenth possible implementations of the first aspect, in a seventeenth possible implementation, the method further includes: allocating log-area space in the log area for the mapping relation, and then writing the mapping relation to the log-area space allocated to it. The mapping relation is thus also stored in the log area, so that it is reliably persisted on the hard disk.
A second aspect of the present invention provides a hard disk control device. The hard disk control device includes a hard disk, the hard disk includes a data area and a log area, and the hard disk control device has the functions of the hard disk control device in the foregoing method. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing functions.
In one possible implementation, the hard disk control device includes:
a writing unit, configured to write data to a cache device;
a cache manager, configured to judge whether the data is hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases;
a data manager, configured to, if the data is not hotspot data, allocate data-area space for the data in the data area and write the data to the data-area space; and
a log manager, configured to, if the data is hotspot data, allocate log-area space for the data in the log area and write the data to the log-area space.
In another possible implementation, the hard disk control device includes:
a processor;
the processor performs the following action: writing data to a cache device;
the processor performs the following action: judging whether the data is hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases;
the processor performs the following action: if the data is not hotspot data, allocating data-area space for the data in the data area and writing the data to the data-area space;
the processor performs the following action: if the data is hotspot data, allocating log-area space for the data in the log area and writing the data to the log-area space.
In a third aspect, an embodiment of the present application provides a computer storage medium. The computer storage medium stores program code, and the program code is used to instruct execution of the method of the first aspect.
As can be seen from the foregoing technical solutions, the embodiments of the present invention have the following advantages:
In a hard disk control device that includes a hard disk, where the hard disk includes a data area and a log area, after data is written to the cache device, the hard disk control device judges whether the data is hotspot data. If the data is not hotspot data, data-area space is allocated for it in the data area and the data is written to the data-area space; if the data is hotspot data, log-area space is allocated for it in the log area and the data is written to the log-area space. The data to be written to the hard disk is thus divided into hotspot data and non-hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases. Hotspot data tends to cause the hard disk to produce fragments, so it is stored in the log area and managed in log mode; even if frequent modification of the data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hotspot data is stored in the data area; releasing non-hotspot data does not easily produce fragments, so the data area need not devote extra resources to fragment management. Storing different types of data in different regions of the hard disk and managing them in different ways therefore improves the efficiency of fragment management on the hard disk, and the efficient management of fragments by the log area can reduce the generation of hard disk fragments.
Brief description of the drawings
Fig. 1 is a logical view of an object in a log area according to an embodiment of the present invention;
Fig. 2 is a flowchart of a hard disk data management method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of data migrating in memory in the embodiment shown in Fig. 2;
Fig. 4 is a schematic diagram of data being cached in memory in the embodiment shown in Fig. 2;
Fig. 5 is a schematic structural diagram of a hard disk control device according to another embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the reclamation unit of the hard disk control device shown in Fig. 5;
Fig. 7 is a schematic diagram of the hardware structure of a hard disk control device according to another embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention provide a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk.
1. Implementation environment of the hard disk data management method of the embodiments of the present invention
An embodiment of the present invention provides a hard disk data management system. The system includes a hard disk and memory, and the memory can serve as the cache device. The hard disk is divided into data areas (Data zone) and log areas (Journal zone), where a log area manages the data stored in it in log mode.
Before data is written to the hard disk in the hard disk data management system, the data is first written to the memory that serves as the cache device. The hard disk data management system then judges whether the data is hotspot data, that is, data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases. If such hotspot data were modified in the data area of the hard disk, the hard disk would produce a large number of fragments. Therefore, after the data is written to memory, if the data is not hotspot data, data-area space is allocated for it in the data area and the data is written to that data-area space; if the data is hotspot data, log-area space is allocated for it in the log area and the data is written to that log-area space.
Data-area space is storage space in a data area; it may be part of the space of one data area or all of the space of one data area. Log-area space is storage space in a log area; it may be part of the space of one log area or all of the space of one log area.
The data to be written to the hard disk is thus divided into hotspot data and non-hotspot data, where hotspot data is data that, once stored on the hard disk, causes the hard disk to produce a preset number of fragments after a preset number of modifications and releases. Hotspot data tends to cause the hard disk to produce fragments, so it is stored in the log area and managed in log mode; even if frequent modification of the data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hotspot data is stored in the data area; releasing non-hotspot data does not easily produce fragments, so the data area need not devote extra resources to fragment management. Storing different types of data in different regions of the hard disk and managing them in different ways therefore improves the efficiency of fragment management, manages the fragments on the hard disk effectively, and, through the management performed by the log area, also avoids the generation of hard disk fragments.
The log areas and data areas of the hard disk can be arranged in various ways. One such implementation is described in detail below.
The hard disk is divided into two types of regions, data areas and log areas. The embodiments of the present invention do not specifically limit the size of a data area or a log area; it may be, for example, 256 MB. The data areas and log areas may be arranged alternately, as shown in Table 1, which is an example of a hard disk space layout in which the hard disk is divided into a superblock, data areas, and log areas. Optionally, 0.1% of the set of log areas serve as fixed log areas that are distributed at even intervals across the hard disk. A log area of this type can be used only as a log area, whereas the other log areas can be converted into data areas when hard disk space is insufficient.
Table 1
Data areas and log areas can be laid out on the hard disk in many ways, and the alternating arrangement described above is only one of them; the present invention does not specifically limit this. For example, the data areas may be placed contiguously in one region of the hard disk and the log areas contiguously in another region; or multiple data areas may be placed contiguously to form a data area group and multiple log areas placed contiguously to form a log area group, with the data area groups and log area groups then arranged alternately; and so on.
A data area can store non-hot-spot data; for example, data larger than 128 KB is written directly to a data area. After data is written to a data area, metadata for managing that data is generated. Once this metadata has been written to the cache device, space is allocated for it in a log area so that it can be stored there.
A log area can store hot spot data; for example, data smaller than 128 KB and metadata are saved to a log area. In some embodiments, hot spot data is stored in the log area in append mode. In some embodiments, identification information (an ID) may also be assigned to each log area in sequence. Within a log area, data is managed in log mode.
In some embodiments, I/O in the log area is processed by sequential append writes: when a data block needs to be written to the log area, space is allocated starting from the tail of the previous write to that log area; when the log area cannot accommodate the data block, the log area with the most free space is selected and the append write continues there.
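The append-write behavior described above can be sketched as follows. This is a minimal illustration only: the class names `LogArea` and `LogAllocator`, the `tail` field, and the in-memory bookkeeping are assumptions made for the example and do not come from the patent.

```python
from dataclasses import dataclass

LOG_AREA_SIZE = 256 * 1024 * 1024  # example log area size taken from the text


@dataclass
class LogArea:
    journal_id: int              # identification information ID of the log area
    size: int = LOG_AREA_SIZE
    tail: int = 0                # offset just past the last appended byte

    @property
    def free(self) -> int:
        return self.size - self.tail

    def append(self, length: int) -> int:
        """Reserve `length` bytes at the tail and return their offset."""
        if length > self.free:
            raise ValueError("log area cannot hold the block")
        offset = self.tail
        self.tail += length
        return offset


class LogAllocator:
    """Hypothetical allocator that always appends to one current log area."""

    def __init__(self, areas):
        self.areas = list(areas)
        self.current = self.areas[0]

    def allocate(self, length: int):
        """Return (journal_id, offset) for a block of `length` bytes."""
        if length > self.current.free:
            # The current area cannot hold the block: switch to the log area
            # with the most free space and keep appending there.
            self.current = max(self.areas, key=lambda a: a.free)
        return self.current.journal_id, self.current.append(length)
```

Because space is only ever handed out at the tail, the offset of each block is determined by the order of writes, which is what later lets the mapping relation describe log area contents compactly.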
The layout of a log area is shown in Table 2, where Journal ctrl is the identification information ID, Map is the mapping relation, and Record is the data, for example data smaller than 128 KB and metadata.
Table 2
As shown in Table 1, in some embodiments a superblock is also provided on the hard disk of the hard disk control device. When data is written to a log area, that log area is modified, and the superblock records the identification information ID of the modified log area. For example, when a batch of write operations to the log area is combined into one transaction, then after the data of the transaction has been saved to the hard disk, the identification information IDs of the log areas modified by the transaction are recorded in the corresponding bitmap of the superblock.
The embodiments of the present invention do not specifically limit the size of the superblock, which can be adjusted to suit the device. For example, a 4 TB disk with 256 MB areas has 4T/256M/2 = 8192 log areas, so at one bit per log area, 1024 B are needed in the superblock to record the overall usage of the log areas.
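The arithmetic in this example can be checked with a short calculation. The sketch below assumes binary units, an alternating layout in which half of the areas are log areas, and one bitmap bit per log area; the function name is hypothetical.

```python
TIB = 1 << 40
MIB = 1 << 20


def superblock_bitmap_size(disk_bytes: int, area_bytes: int):
    """Return (number of log areas, bitmap size in bytes).

    Assumes half of all areas are log areas (alternating layout) and
    that the bitmap uses one bit per log area.
    """
    total_areas = disk_bytes // area_bytes
    log_areas = total_areas // 2
    bitmap_bytes = (log_areas + 7) // 8  # round up to whole bytes
    return log_areas, bitmap_bytes


log_areas, bitmap_bytes = superblock_bitmap_size(4 * TIB, 256 * MIB)
print(log_areas, bitmap_bytes)
```

With these assumptions the calculation gives 8192 log areas and a 1024-byte bitmap, matching the figures in the text.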
The layout of the superblock may be as shown in Table 3. Super blkctrk records overall management information, such as the used capacity, total capacity, total free capacity, number of log areas, and number of data areas of the hard disk. Journalbitmap is used to record the processed transaction numbers. Super blkctrk and Journalbitmap may each have a capacity of 4 KB.
Table 3
Super blkctrk | Journalbitmap |
In some embodiments, to keep data allocated to a data area close to data allocated to a log area, multiple log areas and data areas can be combined into a block group, within which the log areas and data areas are arranged contiguously. Each block group has a space management object that uses a bitmap to manage the usage of hard disk space within the block group. For example, 16 contiguous log areas and data areas can be combined into one block group.
Tables 4 and 5 show the relationship among block groups, data areas, and log areas. Table 4 is a schematic layout of the hard disk in units of block groups, and Table 5 illustrates the layout of block group 1 in Table 4.
Table 4
Table 5
As shown in Table 2, a mapping relation Map is also stored in the log area. The mapping relation records the correspondence between the data in the log area and the log area space allocated to that data. Figure 1 shows a logical view of a hard disk object, and the mapping relation is explained below with reference to it.
Figure 1 shows an object in the log area. An object can be divided into multiple levels. The lowest level is level 0, which corresponds to the data blocks of the object. Above level 0 are the indirect blocks, at level 1. The top level, level 2, is the block that holds the object management structure. When there are many data blocks, a single indirect block cannot hold the address pointers of all of them; multiple indirect blocks are then needed, and the number of levels of the object increases accordingly. Blocks in the same level are numbered from left to right; for example, the data blocks at the lowest level have blkid 0, 1, 2, and 3 in order.
When a transaction modifies data blocks, the relationship between the data and the log area space allocated to it must be recorded in a mapping relation. For example, if a transaction creates the object in Fig. 1, the information shown in Table 6 needs to be recorded in the mapping relation. In Table 6, the columns of the mapping relation record, in order, the information types objsetid, objid, levelid, blkid, journalid, offset, and size.
Here, objsetid is the object set ID; objid is the object ID; levelid is the level at which the data block is located; blkid is the sequence number of the data block within its level, counted from left to right; journalid is the ID of the log area to which the data block is written; offset is the relative offset at which the data block is written in the log area; and size is the size of the data block written.
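One way to picture a row of the mapping relation is as a small record keyed by the logical position of the block. The sketch below is illustrative only: the field names follow the text, but the `MapRecord` class, the `build_index` helper, and all concrete values are hypothetical and are not the contents of Table 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapRecord:
    objsetid: int   # object set ID
    objid: int      # object ID
    levelid: int    # level of the block within the object
    blkid: int      # left-to-right sequence number within the level
    journalid: int  # ID of the log area the block was written to
    offset: int     # relative offset of the block inside that log area
    size: int       # size of the block written

    def key(self):
        """Logical identity of the block, independent of where it is stored."""
        return (self.objsetid, self.objid, self.levelid, self.blkid)


def build_index(records):
    """Index records by logical key; later records override earlier ones."""
    return {r.key(): r for r in records}


# Hypothetical records for two level-0 data blocks and one level-1 indirect block.
records = [
    MapRecord(1, 7, 0, 0, journalid=3, offset=0, size=4096),
    MapRecord(1, 7, 0, 1, journalid=3, offset=4096, size=4096),
    MapRecord(1, 7, 1, 0, journalid=3, offset=8192, size=512),
]
index = build_index(records)
where = index[(1, 7, 0, 1)]  # locate data block 1 of object 7
```

Looking a block up by its logical key yields the log area ID, offset, and size needed to read it back.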
It can be understood that the information types recorded in the mapping relation may include all of the above, only some of them, or additional information types; the embodiments of the present invention do not specifically limit this.
Table 6
It can be understood that, in some embodiments, the cache device may be a device other than memory, such as an NVDIMM, a flash card, or an SSD (solid state drive). It can also be understood that, in some embodiments, the hard disk may not include a log area, and the hot spot data is stored on the cache device instead; the embodiments of the present invention do not specifically limit this.
It can be understood that the hard disk control device of the embodiments of the present invention can be used in equipment such as computers and servers; the embodiments of the present invention do not specifically limit this.
Fig. 2 is a flowchart of a hard disk data management method according to an exemplary embodiment. The method is applied to a hard disk control device that includes a hard disk, and the hard disk includes a data area and a log area. With reference to the first part of the description above, namely the implementation environment of the hard disk data management method of the embodiments of the present invention, and taking as an example the hard disk control device executing the method provided by the embodiments of the present invention, the method flow shown in Fig. 2 includes:
Step 201: Write data to memory.
Before the equipment writes data to the hard disk, the hard disk control device first writes the data to a cache device so that the data can be managed before it is written to the hard disk, for example by the cache manager of the hard disk control device.
The cache device may be, for example, memory or flash memory; the embodiments of the present invention do not specifically limit this.
In the embodiments of the present invention, the description takes memory as the cache device. The data written to memory includes all external data of the hard disk control device and the metadata generated when data is written to the data area of the hard disk.
Data can be written to memory in two ways: a modify write and a create write. In a modify write, the data written to memory modifies data already in memory, and the new data overwrites the data being modified. In a create write, new data is written to memory, and memory does not already cache an original version of that data.
Step 202: Determine whether the data is hot spot data. If the data is not hot spot data, go to step 203; if the data is hot spot data, go to step 204.
After the data is written to memory, the hard disk control device determines whether the data is hot spot data, for example by means of its cache manager module. Having determined whether the data to be written to the hard disk is hot spot data, the hard disk control device processes it differently according to the result.
Hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset number of fragments after a preset number of modifications and releases. For example, in this embodiment, hot spot data may be data whose size is less than a preset data threshold, and/or hot spot data may be metadata.
Data whose size is less than the preset data threshold is small data; frequently modifying and releasing such data easily produces hard disk fragments, whereas frequently modifying data whose size is greater than the preset data threshold does not produce a large number of hard disk fragments. The preset data threshold depends on the business model and may be set to, for example, 64 KB or 128 KB.
Metadata consists of the blocks that record the object management structure and the blocks that record data block addresses. The data area generally uses a COW (copy-on-write) mechanism: when a block of data in the data area is to be modified, the old version is not overwritten directly; instead, the old version is read, modified, and written to a new location, and the old version is released. Because the location of the data has changed, the pointer to the data in the upper-level index block must be modified, that is, the metadata must be modified. The modified metadata is in turn allocated new space and the old metadata is released, and so on recursively up to the top level. A modification therefore releases a large amount of metadata, and the locations of the released data become fragments on the hard disk, which accelerates fragmentation of the hard disk.
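The cascading effect of copy-on-write can be illustrated with a toy model. The sketch below is an assumption-laden simplification: an object is reduced to the single path of block addresses from a data block up to the top-level block, and `cow_update` and the address counter are hypothetical names.

```python
import itertools


def cow_update(path, allocate):
    """Rewrite one data block under copy-on-write.

    `path` lists block addresses from the data block (level 0) up to the
    top-level block. Every block on the path gets a new address, because
    each parent must be rewritten to point at its relocated child.
    Returns (new_path, released_addresses).
    """
    new_path = [allocate() for _ in path]
    released = list(path)  # every old block on the path is freed
    return new_path, released


next_addr = itertools.count(100)  # stand-in for a free-space allocator
new_path, released = cow_update([10, 20, 30], lambda: next(next_addr))
```

In this model a single modification of one data block in a three-level object releases three blocks, two of which are metadata, which is why metadata is a major source of fragments.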
The embodiments of the present invention therefore classify data whose size is less than the preset data threshold, and/or metadata, as hot spot data. Such data easily causes the hard disk to generate fragments and needs to be managed accordingly.
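The classification used in step 202 can be sketched as a simple predicate. The 128 KB threshold is one of the example values in the text; the function names are hypothetical, and treating data of exactly the threshold size as non-hot-spot data is an assumption, since the text does not address that case.

```python
HOT_THRESHOLD = 128 * 1024  # example preset data threshold (128 KB)


def is_hot_spot(size: int, is_metadata: bool, threshold: int = HOT_THRESHOLD) -> bool:
    """Metadata is always hot; other data is hot when smaller than the threshold."""
    return is_metadata or size < threshold


def route(size: int, is_metadata: bool) -> str:
    """Step 202: hot spot data goes to the log area, the rest to the data area."""
    return "log_area" if is_hot_spot(size, is_metadata) else "data_area"
```

For instance, a 4 KB write and any metadata block are routed to the log area, while a 1 MB data write is routed to the data area.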
Step 203: Allocate data area space for the data in the data area, and write the data to that data area space.
Data determined not to be hot spot data does not easily cause the hard disk to generate fragments, so it can be saved in the data area of the hard disk. The data area is a region of the hard disk used to store data, and it can use the COW mechanism to manage the data in it. A data area may be a region of the hard disk with a preset size, for example 256 MB. For more on the data area, refer to the description of the data area in the implementation environment part above.
For example, when the size of a piece of data is greater than 128 KB, the cache manager determines that the data is not hot spot data; the data area manager module of the hard disk control device then allocates data area space for the data in a data area and writes the data to the allocated space.
Metadata is generated after the data is written to the data area. The metadata records the address of the data, which makes the data easier to manage. For example, multiple blocks of data are generally organized using multi-level indexing: an index block is allocated one level above the data blocks, and its content is the addresses of those data blocks. Through the index block, multiple data blocks can be joined into one logically contiguous object. Such an index block is one type of metadata. After the data is written to the data area and the metadata is generated, the metadata can be written to memory, that is, step 201 above is executed for it. The cache manager determines that the metadata is hot spot data, so it is saved in the log area and in memory.
It can be understood that, in some embodiments, the hard disk of the hard disk control device further includes block groups. A block group includes a preset number of log areas and data areas, which are arranged contiguously. In a hard disk control device that includes block groups, data area space is allocated as follows: a suitable block group is selected, for example one with less space in use or with more free data areas, and data area space is then allocated for the data in a data area of the selected block group. After the space is allocated, its usage must be recorded; that is, after a free data block is found and allocated in the space management bitmap of the block group, the management structure of the block group is modified. Subsequently, according to the space usage recorded by the block group, the metadata can be written to a log area belonging to the same block group; if that block group has no free log area, a nearby log area is selected for the metadata. In this way the metadata and the data it points to are located close together on the hard disk.
For the block group, refer to the corresponding description of block groups in the implementation environment part above.
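The block group selection and bitmap bookkeeping described above can be sketched as follows. The `BlockGroup` class, the first-fit search, and the "most free blocks first" policy are assumptions chosen for the illustration; the text only says that a suitable block group, such as one with less space in use, is selected.

```python
class BlockGroup:
    """Hypothetical block group whose space is tracked by a bitmap."""

    def __init__(self, group_id: int, blocks: int):
        self.group_id = group_id
        self.bitmap = [False] * blocks  # True means the block is in use

    @property
    def free_blocks(self) -> int:
        return self.bitmap.count(False)

    def allocate(self, count: int):
        """First-fit search for `count` contiguous free blocks.

        Marks them used in the bitmap and returns the start index,
        or None if no such run exists.
        """
        run = 0
        for i, used in enumerate(self.bitmap):
            run = 0 if used else run + 1
            if run == count:
                start = i - count + 1
                self.bitmap[start:i + 1] = [True] * count
                return start
        return None


def allocate_in_freest_group(groups, count: int):
    """Try block groups from most free space to least; return (group_id, start)."""
    for group in sorted(groups, key=lambda g: g.free_blocks, reverse=True):
        start = group.allocate(count)
        if start is not None:
            return group.group_id, start
    raise RuntimeError("no block group can satisfy the request")
```

Recording the allocation in the per-group bitmap is what later allows the metadata for the same data to be steered to a log area in the same block group.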
It can be understood that, besides data determined to be non-hot-spot data, the data written to the data area can also include data evicted from memory because memory space is insufficient. For example, in the second stage of the subsequent memory reclamation, the data pointed to by a mapping relation is migrated from the log area to the data area. Such migrated data can likewise be written to data area space after the data area space manager has allocated that space in a data area.
Step 204: Allocate log area space for the data in the log area.
A log area is also provided on the hard disk, and the data in the log area is managed in log mode.
When the data of step 201 is determined to be hot spot data, log area space is allocated for it in the log area. For example, the data is handed to the log manager, which allocates log area space for the hot spot data so that the hot spot data can be stored in the log area.
In the embodiments of the present invention, to make the data in the log area easier to manage, space in the log area can be allocated to hot spot data in the order in which it is written to the log area. Of course, in other embodiments, log area space need not be allocated to hot spot data sequentially; the embodiments of the present invention do not specifically limit this.
Storing hot spot data in the log area also means that when hot spot data in memory is lost because of a power failure, the data can be read from the log area back into memory so that the equipment can carry on operating. Moreover, because memory and the log area are used together, storing hot spot data in the log area expands the amount of hot spot data that memory can manage.
A log area may be a region of the hard disk with a certain size, for example 256 MB. The hard disk can have multiple log areas and data areas, and the log areas can be assigned identification information in the order in which they are laid out on the hard disk. For the specific arrangement of log areas, refer to the corresponding description of the log area in the implementation environment part above.
Log areas and data areas can be arranged on the hard disk in various ways; for example, they can be arranged alternately, as shown in Table 1 above. Of course, data areas and log areas can also be arranged in other ways; the embodiments of the present invention do not specifically limit this. For details, refer to the corresponding description of the arrangement of data areas and log areas in the implementation environment part above.
Step 205: Establish a mapping relation between multiple target data and the log area space allocated to the multiple target data, where the target data is hot spot data.
After multiple pieces of data have been written to memory and checked for being hot spot data, memory may hold multiple pieces of hot spot data. The hard disk control device determines multiple target data in memory that are hot spot data, so that the multiple target data can be managed together, which improves processing efficiency. All target data is allocated log area space in the log area, and the hard disk control device establishes the mapping relation from the target data and the log area space allocated to it. The target data is hot spot data; that is, the target data includes metadata and/or data whose size is less than the preset data threshold.
For all data written to the log area, the mapping relation between the data and its log area space must be recorded. The mapping relation records the data in the log area so that the hot spot data in memory or the data in the log area can be managed according to it. For example, the equipment reads the corresponding data stored in the log area according to the index of the mapping relation cached in memory, or recycles the log area according to the data information recorded in the mapping relation, thereby managing fragments in the log area.
The information types recorded in the mapping relation may include objsetid, objid, levelid, blkid, journalid, offset, and size.
For more on the mapping relation, refer to the corresponding description in the implementation environment part above.
Step 206: Cache the mapping relation in memory.
After the mapping relation is established, it is saved in memory in preparation for subsequent operations.
In some embodiments, the mapping relation can also be saved in the log area and, when it needs to be held in memory, read from the log area and cached in memory. Of course, in some embodiments, the mapping relation can be stored in memory and in the log area at the same time.
Step 207: Combine the multiple write operations of the multiple target data into one transaction.
After the multiple target data has been determined, write operations are performed to write it to the log area. The hard disk control device combines the multiple write operations for the multiple target data into one transaction and performs writes to the hard disk in units of transactions. A transaction is the name for the combination of multiple write operations; it is not the execution of the write operations. Multiple target data means at least two target data, and correspondingly, multiple write operations means at least two write operations.
To improve write efficiency and ensure write reliability, when the equipment writes data to the log area it generally does not perform a single write operation; instead, it performs the write operations of multiple pieces of data in one operation of writing data to the log area. The write operations of the multiple target data belonging to the same batch either all succeed or all fail when written to the hard disk. The multiple write operations of the multiple target data in the same batch are combined into one transaction.
It can be understood that the embodiments of the present invention do not specifically limit the order in which step 207 and step 205 are executed. That is, the write operations of multiple target data are combined into one transaction, and a mapping relation can be established from that target data, so that one transaction corresponds to one mapping relation.
In the embodiment having, the method for the embodiment of the present invention further includes establishing data-link according to multiple target data
Table forms a data link table, the number according to a series of metadata in an affairs and less than the data of preset threshold
Management objectives data are used for according to chained list.When forming mapping relations, can be existed according to the data on the data link table with these data
Mapping relations are established in the space distributed on log area.
Step 208: Write all target data of the transaction to the log area space.
The hard disk control device writes all target data of the transaction to the log area space. Once the multiple target data has been combined into one transaction, if the write operation of any one target data of the transaction fails, the write operations of the other target data of the transaction fail as well. The write operation of the transaction succeeds only if the write operation of every target data succeeds.
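The all-or-nothing behavior of a transaction can be modeled in a few lines. This is a simulation under stated assumptions, not the device's actual commit protocol: `committed` stands for the set of log records that are considered valid, `write_block` is a hypothetical callable that raises `OSError` on a device error, and staged writes become visible only when every write has succeeded.

```python
def write_transaction(committed: dict, writes, write_block) -> bool:
    """Write every (offset, payload) pair of a transaction, or none of them.

    Returns True and publishes all writes into `committed` on success;
    returns False and leaves `committed` untouched if any write fails.
    """
    staged = {}
    try:
        for offset, payload in writes:
            write_block(offset, payload)  # may raise OSError on a device error
            staged[offset] = payload
    except OSError:
        return False  # one failed write fails the whole transaction
    committed.update(staged)
    return True
```

A caller that sees `False` can retry the whole batch, because no partial result of the failed transaction is treated as valid.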
After the mapping relation is established, the log area manager allocates log area space for it. The hot spot data has also been allocated log area space, so both the mapping relation and the target data can be written to their respective allocated log area space, and thus both are saved in the log area.
Once the mapping relation is stored in the log area, the system can read it back into memory, so memory can reacquire the mapping relation; this is particularly useful for resuming work after memory loses power. Of course, in some embodiments the mapping relation need not be stored in the log area, in which case no log area space has to be allocated for it; this also achieves the effect of reducing hard disk fragments in the data area. The embodiments of the present invention do not specifically limit this.
The embodiments of the present invention do not specifically limit the order in which the mapping relation and the target data of the transaction are written to the log area space.
In some embodiments, allocating log area space for data and writing the data to it is done specifically by allocating log area space for the data sequentially in the log area and appending the data sequentially to that space. Sequential allocation and sequential append writing mean that space is allocated, or data is written, in sequential order within the storage space of the log area.
Allocating log area space sequentially and appending data sequentially means that data can be read and written sequentially when the data in the log area is managed, which improves the efficiency of data management, and that the data in the log area can be located by its order. Because the mapping relation records the correspondence between the data stored in the log area and its log area space, the mapping relation can take the place of metadata: metadata is not needed to locate data in the log area, so no additional metadata management overhead is incurred. Writing data to the log area by sequential appending also makes it easier for the mapping relation to record where the data is stored in the log area.
In an embodiment of the present invention, the method further includes: if the data is hot spot data, caching the data in memory. That is, all target data of the transaction is written to the log area space and is also cached in memory: after it is determined whether data is hot spot data, data that is hot spot data is retained in memory. Then, in subsequent operation, if a write to memory is a modify write to target data cached in memory, the data can be modified directly in memory, carrying out the step of managing target data according to the data linked list. Because the data is retained in memory, it migrates within memory. Hot spot data easily causes the hard disk to generate fragments; caching it in memory rather than storing it in the data area avoids fragments being generated in the data area by the migration of this data.
In some embodiments, the step of caching hot spot data in memory may be omitted; instead, before step 201, the data is read from the log area and cached in memory, that is, before data is written to the cache device, the data is read from the log area into the cache device. This likewise allows the data to be modified directly in memory, which completes the step of managing target data according to the data linked list. The data is thus retained in memory and migrates within memory, avoiding fragments being generated in the data area by the migration of this data.
In some embodiments, the hard disk control device further includes a superblock. When each log area is assigned identification information, the superblock can be used to record the identification information of a log area after that log area is modified. After memory loses power, the data in the corresponding log areas can then be read according to the identification information recorded in the superblock and cached in memory, so that subsequent data operations can continue.
For example, after all target data and the mapping relation of a transaction have been written to the log area space, that is, when all write operations of the transaction have completed, the identification information of the modified log areas is recorded in the bitmap of the superblock.
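Recording modified log areas in a bitmap, and reading them back after a power failure, can be sketched as below. The `JournalBitmap` class and its method names are hypothetical, and the encoding of one bit per log area ID is an assumption consistent with the sizing example given earlier.

```python
class JournalBitmap:
    """Hypothetical superblock bitmap with one bit per log area ID."""

    def __init__(self, log_area_count: int):
        self.bits = bytearray((log_area_count + 7) // 8)

    def mark_modified(self, journal_id: int) -> None:
        """Set the bit for a log area once a transaction touching it completes."""
        self.bits[journal_id // 8] |= 1 << (journal_id % 8)

    def modified_ids(self):
        """Return the IDs of all log areas recorded as modified.

        After a power failure these are the log areas whose contents
        would be read back into memory.
        """
        return [
            i
            for i in range(len(self.bits) * 8)
            if (self.bits[i // 8] >> (i % 8)) & 1
        ]
```

For 8192 log areas the bitmap occupies 1024 bytes, and only the areas whose bits are set need to be scanned during recovery.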
It can be understood that, in some embodiments, the hard disk further includes block groups, each including a preset number of log areas and data areas arranged contiguously. In that case, as described above, allocating data area space for the data may specifically be: determining a free target data area according to the management information of the block groups, and allocating data area space for the data in the target data area.
After the data is written to the data area space, target metadata can be generated from the data and the data area space. The target metadata records the address of the data in the data area, which makes the data easier to manage and query.
After the metadata is generated, the target metadata is written to memory. After the metadata is determined to be hot spot data, the target block group to which the target data area belongs is determined. An available log area of the target block group is then determined, that is, an available log area is searched for within the target block group, and the target metadata is written to it. If none is found, the nearest available log area near the target data area or the target block group is used.
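The placement rule for metadata can be sketched as a two-stage search. Everything named here is an assumption for illustration: `LogAreaInfo`, its `pos` field (the position of the area on disk, in units of areas), and the use of absolute position difference as the measure of "nearest".

```python
from typing import NamedTuple, Optional, Sequence


class LogAreaInfo(NamedTuple):
    journal_id: int
    group: int   # block group the log area belongs to
    pos: int     # position on disk, in units of areas (hypothetical)
    free: int    # free bytes remaining


def pick_log_area(areas: Sequence[LogAreaInfo], target_group: int,
                  target_pos: int) -> Optional[int]:
    """Choose a log area for metadata describing data at `target_pos`.

    Prefer a usable log area in the same block group as the data; if the
    group has none, fall back to the nearest usable log area anywhere.
    """
    usable = [a for a in areas if a.free > 0]
    in_group = [a for a in usable if a.group == target_group]
    candidates = in_group or usable
    if not candidates:
        return None
    return min(candidates, key=lambda a: abs(a.pos - target_pos)).journal_id
```

Keeping the metadata near the data it points to is what shortens head movement when both are read together.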
When data is written to the log area in units of transactions, the above method can be used to allocate log area space to the multiple target data of a transaction, so that the target data is placed close to the data that the metadata points to.
In this way, after the metadata and the data it points to have been stored on the hard disk by the above method, the storage location of the metadata is close to the storage location of the data it points to, which reduces the distance the magnetic head has to move and improves the read/write efficiency of the hard disk.
In some embodiments of the present invention, the multiple target data belonging to the same transaction is managed in memory as a linked list. As described above, after the multiple target data has been determined, the method can establish a data linked list from it, and the data linked list is used to manage the target data. That is, after the data and metadata of each transaction have been submitted to the hard disk, the data of the corresponding log area is managed as a linked list.
Because the write operations of this target data are combined into one transaction, one transaction corresponds to one data linked list, and the target data managed by the data linked list is the same as the target data of the transaction. A mapping relation is also established from this target data, so one data linked list corresponds to one mapping relation, which records how the data managed by the data linked list is stored in the log area.
Target data can be managed according to the data linked list as follows:
Following the steps above, after multiple first target data that are hot spot data have been determined in memory, a first data linked list is established from them. The first target data belongs to a first transaction; that is, the first target data is written to the log area in the same write batch, and if any one first target data fails to be written, the other data of the first transaction fails to be written as well. A first mapping relation is also established for the first target data and is cached in memory. For example, the first data linked list manages first target data A1, B1, C1, and D1 in memory, and the corresponding first mapping relation records the relationship between first target data A1, B1, C1, and D1 and the log area space allocated to them in the log area. Subsequently, the hard disk control device establishes a second data linked list from multiple second target data, which is hot spot data and belongs to a second transaction, and a second mapping relation is established from the multiple second target data. The second data linked list can manage, for example, second target data A2, E1, F1, and D2.
When second target data managed by the second data linked list is obtained by modifying first target data managed by the previously established first data linked list, management of that first target data is released on the first data linked list, and the information about the first target data is deleted from the first mapping relation corresponding to the first data linked list. In this way, target data migrates between data linked lists and between mapping relations. For example, when second target data A2 managed by the second data linked list is obtained by modifying first target data A1 of the first data linked list, first target data A1 in memory has been modified into second target data A2 and is no longer needed. The first data linked list can therefore release its management of first target data A1, and the information about first target data A1 recorded in the first mapping relation can be deleted as well. Because that information has been deleted, when the log area is reclaimed the first mapping relation shows that the log area space corresponding to first target data A1 has become a fragment. When data on the log area is migrated and merged, no information about first target data A1 can be read from the first mapping relation, so first target data A1 is not moved to a new log area; that is, first target data A1 on the log area can be released.
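The migration described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `Transaction`, `TransactionManager`, and `commit` are hypothetical, a dictionary stands in for each data linked list, and a log location is modeled as a `(log_area_id, offset)` pair.

```python
class Transaction:
    """One transaction: a data linked list plus its mapping relation."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.data = {}     # key -> cached payload (stands in for the data linked list)
        self.mapping = {}  # key -> (log_area_id, offset) in the log area


class TransactionManager:
    """Hypothetical helper that tracks which transaction manages each key."""

    def __init__(self):
        self.transactions = []  # ordered by transaction number
        self.owner = {}         # key -> transaction currently managing that key

    def commit(self, writes):
        """writes maps key -> (payload, (log_area_id, offset))."""
        txn = Transaction(len(self.transactions) + 1)
        for key, (payload, location) in writes.items():
            previous = self.owner.get(key)
            if previous is not None:
                # The older version is superseded: release it from its list
                # and delete its entry from the older mapping relation.
                previous.data.pop(key, None)
                previous.mapping.pop(key, None)
            txn.data[key] = payload
            txn.mapping[key] = location
            self.owner[key] = txn
        self.transactions.append(txn)
        return txn


manager = TransactionManager()
# First transaction manages A1, B1, C1, D1 on log area 1.
t1 = manager.commit({k: (k + "1", (1, i)) for i, k in enumerate("ABCD")})
# Second transaction manages A2, E1, F1, D2 on log area 2.
t2 = manager.commit({"A": ("A2", (2, 0)), "E": ("E1", (2, 1)),
                     "F": ("F1", (2, 2)), "D": ("D2", (2, 3))})
```

After the second commit, the first transaction manages only B1 and C1, and its mapping relation no longer mentions A1 or D1, so their log area space can be treated as fragments during reclamation.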
Fig. 3 is a schematic diagram of data migration in memory. Each transaction includes multiple target data, and some of the data blocks in Fig. 3 are labeled. As shown in Fig. 3, transactions are generated continuously, and because of I/O locality, later transactions modify data managed by the first transaction. For example, the second transaction modifies first target data A1 to obtain second target data A2, the fifth transaction modifies first target data B1 to obtain fifth target data B2, the fourth transaction modifies first target data C1 to obtain fourth target data C2, the second transaction modifies first target data D1 to obtain second target data D2, and the third transaction modifies second target data D2 to obtain third target data D3. The first data linked list corresponding to the first transaction thus shows that all data of the first transaction has migrated to later transactions, so the log area can release all target data of the first transaction. As shown in Table 7, the target data of the first transaction is stored on log area 1. After all data of the first transaction has migrated to later transactions, the new versions of the data blocks on log area 1 have been written to other log areas. Log area 1 can then be released during log area reclamation and becomes an empty log area. Other target data migrates similarly; when the fifth transaction is reached, the actual cache state in memory is as shown in Fig. 4.
Table 7
To make it easier to manage target data through data linked lists, in some embodiments the method further includes assigning transaction numbers to the data linked lists of transactions in increasing order according to the write order of the transactions. The write order of transactions is the order in which different transactions are written to the log area. The data linked list of a transaction written earlier receives a smaller transaction number, and the transaction number of the data linked list of the transaction written next increases by one unit. Each data linked list thus has an identifier, and the identifiers increase monotonically, so the order in which the data linked lists were established in memory can be determined easily from their transaction numbers.
For example, the data linked list of the first transaction is established first. Because this transaction is written to the log area first, its data linked list is assigned transaction number 1. The data linked list of the second transaction is then established from multiple second target data. Because the data of the second transaction is written to the log area after the first transaction, the second data linked list is assigned transaction number 2. Similarly, the third data linked list is assigned transaction number 3, and so on.
The above describes how data is managed through data linked lists. Data that is hot spot data is written to the log area. After such data is cached in memory, or after data is read from the log area and cached in memory, it can be managed in memory through data linked lists. Hot spot data thus migrates in memory rather than on the hard disk, which reduces the fragments the data generates on the hard disk.
Caching data in memory also allows data to be read directly from memory, avoiding reads from the log area.
Managing data in memory through data linked lists also establishes and updates the mapping relations. The data and fragments in the log area can be organized according to the mapping relations, which avoids or reduces hard disk fragments and improves the efficiency of fragment handling. To make full use of memory, hot spot data cached in memory can be released and the memory space reclaimed so that memory can cache other data. In the method of the embodiments of the present invention, memory space is therefore reclaimed under a preset release condition. For example, cache eviction is triggered when the system memory reclamation water level is reached.
An exemplary reclamation method, divided into two stages, is described below.
First stage
When memory reaches a first preset water level, the data linked lists are searched in ascending order of transaction number, starting from the data linked list with the smallest current transaction number, for data whose management has not been released. For each target data linked list found, the data whose management has not been released is freed from memory, while the target mapping relation corresponding to the target data linked list is retained in memory.
Because data is managed through data linked lists, data managed by an earlier data linked list may already have been evicted from memory after being modified by data written later; such data has already been released from its data linked list, as described in the management method above. Reclaiming memory means finding the data in the data linked lists that has not been evicted and freeing it from memory. A data linked list whose data is freed during memory reclamation is called a target data linked list. The mapping relation corresponding to the target data linked list remains in memory. First-stage memory reclamation stops when memory space reaches a preset stop water level.
Through the above steps, data no larger than the preset data threshold, together with the metadata generated by writing data, is written to the log area and cached in memory. As described above, the hot spot data in the log areas and in memory is managed in transaction order. When the cache reaches the first preset water level after the system has run for some time, a background cache reclamation thread is triggered and reclaims memory space in ascending order of transaction number. For example, starting from the data linked list with the smallest current transaction number, the data linked lists are searched in ascending order of transaction number for data whose management has not been released, so that data is freed from memory starting with the earliest data linked list.
Data that remains in an earlier data linked list and has not been evicted from memory is relatively inactive, and the device is less likely to read or modify it. Freeing this data from memory therefore has little effect on the read performance of the device. Because the target mapping relation is kept in memory, when the device needs to read data that is not found in memory, it queries the mapping relations retained in memory. If the requested data is identified in a target mapping relation, the data can be read from the log area according to that mapping relation. In this way, memory can manage more hot spot data.
It should be understood that memory reaching the first preset water level is only one preset release condition. The search for data whose management has not been released may also be triggered under other preset release conditions, such as the expiry of a timer; the embodiments of the present invention do not specifically limit this.
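The first stage can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `reclaim_first_stage`, the dictionary layout of a transaction, and the byte-count model of the water level are all assumptions made for the example.

```python
def reclaim_first_stage(transactions, used_bytes, stop_level):
    """Free cached payloads oldest-first until usage drops to stop_level.

    transactions: list of dicts with keys 'txn_id', 'data' (key -> bytes)
    and 'mapping' (key -> log location). Mapping relations are kept so the
    data can still be found in the log area later.
    Returns the memory usage in bytes after reclamation.
    """
    # Visit data linked lists in ascending order of transaction number.
    for txn in sorted(transactions, key=lambda t: t["txn_id"]):
        for key in list(txn["data"]):
            if used_bytes <= stop_level:
                return used_bytes  # preset stop water level reached
            # Free the payload from memory; txn["mapping"] is left untouched.
            used_bytes -= len(txn["data"].pop(key))
    return used_bytes
```

A caller would invoke this function once memory usage crosses the first preset water level, or from a timer, which the text above names as another possible release condition.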
Second stage
After first-stage reclamation, when memory reaches a second preset water level, the data pointed to by the target mapping relation is read from the log area, the data is written to the data area, and the target mapping relation is deleted from memory.
To make further use of memory space, a second reclamation of memory space can be performed in the second stage. The second preset water level is the water level that triggers second-stage reclamation. It may be the same as or different from the first preset water level; the embodiments of the present invention do not specifically limit this, nor do they limit the specific values of the first and second preset water levels, which can be set flexibly according to, for example, the actual memory capacity and the service type.
After the data pointed to by the target mapping relation is read from the log area, it can be written to the data area. By the time memory reaches the second preset water level, the system has been running for some time and more and more mapping relations are cached in memory. When they too reach the cache water level, the data these mapping relations point to must be read and written to the data area.
Because the data linked list management method above has been applied, when memory reaches the second preset water level, data that has not been released from an earlier data linked list is unlikely to be modified: had data managed by a data linked list been modified by data written later, it would have migrated from that data linked list to the data linked list of the later transaction. The remaining data of earlier data linked lists can therefore be regarded as inactive. Because such data is unlikely to be modified, it produces few hard disk fragments through modification, and the data pointed to by the target mapping relation can be written to the data area. For example, the data is read from the log area according to the target mapping relation, the data manager allocates data area space in the data area, and the data is written to the allocated data area space.
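A minimal Python sketch of the second stage follows. The function name `reclaim_second_stage` is hypothetical, and plain dictionaries stand in for the on-disk log area and for the space the data manager would allocate in the data area.

```python
def reclaim_second_stage(mapping, log_area, data_area):
    """Flush everything a target mapping relation points to, then clear it.

    mapping:   key -> offset into log_area (the target mapping relation)
    log_area:  dict offset -> payload, standing in for on-disk log space
    data_area: dict key -> payload, standing in for allocated data area space
    """
    for key, offset in list(mapping.items()):
        payload = log_area[offset]  # read the data from the log area
        data_area[key] = payload    # write the data to the data area
        del mapping[key]            # delete the mapping entry from memory
```

Once the mapping relation is empty, nothing in memory refers to the corresponding log area space any more, which is what allows that space to be reclaimed later.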
After the above steps, when the device needs to read data on the hard disk, for example data that has been written to the log area, the hard disk control device first searches memory. If the data is hit in memory, it is returned directly. After the first stage of memory reclamation, some hot spot data has been deleted from memory while the corresponding mapping relations are retained. If the requested data is not found in memory but is found in a mapping relation cached in memory, it can be read directly from the log area corresponding to that mapping relation. If the requested data is not found in the mapping relations either, it is looked up and read from the data area.
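The three-level lookup can be sketched as below. The function name `read` and the dictionary-based stand-ins for memory, the log area, and the data area are assumptions for illustration only.

```python
def read(key, memory_cache, mappings, log_area, data_area):
    """Return (payload, source) following the lookup order described above."""
    # 1. Memory hit: return the cached copy directly.
    if key in memory_cache:
        return memory_cache[key], "memory"
    # 2. Evicted from memory but still mapped: read from the log area.
    for mapping in mappings:
        if key in mapping:
            return log_area[mapping[key]], "log"
    # 3. Otherwise the data is looked up in the data area.
    return data_area[key], "data"
```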
It should be understood that, in embodiments in which data is written to the log area by sequential append, once the data linked lists of several consecutive transactions have been completely evicted, for example after second-stage memory reclamation, the log areas corresponding to the deleted mapping relations are also fully released and can be reused as empty log areas. The specific methods of releasing and reclaiming log areas are described below.
In some embodiments, after the above method is performed, the embodiments of the present invention further include reclaiming log areas, which reduces fragments on the log areas and makes full use of log area space.
The log area reclamation below is described for the embodiment, based on the method above, in which data is written to the log area by sequential append. That is, in step 208 the target data of a transaction is appended sequentially to the log area space, so that data on the log area is stored in transaction order.
The log area reclamation method may include, for example, the following steps:
A1: Under a preset reclamation condition, perform the log area data relocation step.
The preset reclamation condition includes at least one of: a timer expiring, a memory data reclamation operation completing, and the total log area water level reaching a preset water level threshold. Completion of a memory data reclamation operation means that each time a stage of the memory reclamation described above completes, the log area data relocation step is triggered, that is, the log reclamation thread is triggered to run the log area reclamation process.
A2: When the total space water level of the current log areas reaches a preset space threshold, stop performing the log area data relocation step; otherwise, continue performing it.
The log area data relocation step includes:
B1: Starting from the data linked list with the smallest current transaction number, search in ascending order of transaction number for the mapping relation corresponding to each data linked list.
After transaction numbers have been assigned to the data linked lists in increasing order according to the write order of the transactions, as in the method above, the hard disk control device searches for the mapping relations corresponding to the data linked lists in ascending order of transaction number. Because a mapping relation records the relationship between data and the space allocated to that data on the log area, analyzing how the data blocks in the mapping relation are distributed on the log area shows how much data on the corresponding log area has not yet migrated; the data that has not migrated is still located on that log area.
Because transactions switch to the next log area only after the current log area has been used up, the data of consecutive transactions is written to consecutive log areas. Analyzing the mapping relations in ascending order of transaction number therefore yields the data storage state of consecutive log areas.
B2: Determine, from the information recorded in the mapping relation, whether the data on the first log area corresponding to the mapping relation has completely migrated.
When data in memory is managed through data linked lists, the corresponding mapping relation is also modified for data that migrates in memory. If the recorded information shows that the data on the log area has completely migrated, for example because all data in that log area has a new version in other log areas, or because the data on that log area was moved to the data area in the second stage of memory reclamation, the data of the log area corresponding to the mapping relation is judged to have migrated.
B3: If the data of the first log area has completely migrated, reclaim the first log area.
In embodiments in which a superblock records the usage of log areas, the information in the superblock corresponding to the first log area can be cleared at this point.
B4: If the data on the first log area has not completely migrated, determine the space utilization of the first log area from the information recorded in the mapping relation.
Because the mapping relation records the relationship between data and the space allocated to that data on the log area, the space utilization of the first log area can be derived from the recorded information.
B5: When the space utilization of the first log area is below a preset utilization threshold, move the data of the first log area to a second log area and update the mapping relations corresponding to the moved data.
The second log area is an idle log area or a log area already in use for log area reclamation, for example a log area used during the previous log area reclamation. The preset utilization threshold can be set according to actual usage; for example, when many log areas are in use but the utilization of each is low, the device may lower the preset utilization threshold. The preset utilization threshold may be set to, for example, 50%; the present invention does not specifically limit this. Updating the mapping relations corresponding to the moved data may mean that, after data is moved on the log areas, the correspondence between the data and its new log area space is promptly updated in the mapping relation related to that data.
The log area reclamation method above reclaims log areas, reduces fragments on the log areas, and makes full use of log area space. Because the log area is read and written sequentially, in correspondence with the ascending transaction numbers of the data linked lists, the corresponding mapping relations can be queried by transaction number and the data relocated without additional metadata recording the positions of data in the log area, which reduces metadata overhead.
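Steps B1 to B5 can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not in the text: each transaction occupies exactly one log area, a mapping relation is a dictionary with hypothetical fields `txn_id`, `area`, and `entries` (key to live byte count), and the stop condition of step A2 is omitted.

```python
def reclaim_log_areas(relations, area_size, idle_areas, threshold=0.5):
    """One relocation pass; returns the ids of the log areas reclaimed."""
    reclaimed = []
    target, target_free = None, 0   # second log area receiving relocated data
    for rel in sorted(relations, key=lambda r: r["txn_id"]):   # B1
        live = sum(rel["entries"].values())
        if live == 0:                                          # B2: fully migrated
            new_area = None
        elif live / area_size >= threshold:                    # B4: well used
            continue
        else:                                                  # B5: relocate
            if target_free < live:
                if not idle_areas:
                    continue        # nowhere to move the data in this pass
                target, target_free = idle_areas.pop(0), area_size
            target_free -= live
            new_area = target
        reclaimed.append(rel["area"])                          # B3: area is empty
        idle_areas.append(rel["area"])
        rel["area"] = new_area      # update the mapping relation
    return reclaimed
```

With a 50% threshold, sparsely used log areas are merged into one second log area, while a log area that is still mostly live is left in place.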
In the method of the embodiments of the present invention, on a hard disk control device including a hard disk that includes a data area and a log area, after data is written to the cache device, the hard disk control device determines whether the data is hot spot data. If the data is not hot spot data, data area space is allocated for it in the data area and the data is written to the data area space. If the data is hot spot data, log area space is allocated for it in the log area and the data is written to the log area space. Data to be written to the hard disk is thus divided into hot spot data and non-hot-spot data. Hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset quantity of fragments after a preset number of modifications and releases; it easily causes the hard disk to fragment. Hot spot data is stored on the log area and managed as a log, so even if frequent modification of data on the log area produces hard disk fragments, those fragments are easy to reclaim and otherwise manage. Non-hot-spot data is stored in the data area; its release does not easily cause fragments, so the data area need not devote extra resources to fragment management. By storing different types of data in different regions of the hard disk and managing them in different ways, fragment management on the hard disk becomes more efficient, and the efficient management of fragments in the log area reduces the generation of hard disk fragments.
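The routing decision can be sketched as follows. The threshold value, the function names, and the use of lists for the two regions are assumptions; the rule that small data and metadata are treated as hot spot data follows the optional features described for this embodiment.

```python
HOT_SIZE_THRESHOLD = 64 * 1024  # hypothetical preset data threshold, in bytes


def is_hot(payload, is_metadata=False, threshold=HOT_SIZE_THRESHOLD):
    """Treat metadata and data no larger than the threshold as hot spot data."""
    return is_metadata or len(payload) <= threshold


def write(payload, log_area, data_area, is_metadata=False):
    """Append the payload to the region chosen by is_hot; return its name."""
    if is_hot(payload, is_metadata):
        log_area.append(payload)
        return "log"
    data_area.append(payload)
    return "data"
```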
Because of the locality of hot spot data, much of the data in a log area has already migrated away, so little data needs to be read when the log area is actually organized, and log area data management is efficient. Because hot spot data, which easily produces fragments, is stored in the log area, fragments are concentrated there, which further improves defragmentation efficiency.
When data is written to the log area by sequential append, the log area only needs sequential reads and sequential writes, relocating data incurs no metadata overhead, and defragmentation of the log area is efficient. Organizing the data in the log area effectively avoids or reduces hard disk fragments.
In some embodiments, data need not be written to the log area by sequential append. To track the allocation of log area space, a bitmap can be used to manage the corresponding log area.
For example, each bit corresponds to a block of fixed size, such as 4 KB, so a 256 MB log area needs an 8 KB bitmap to manage it, and every write to the log area requires this bitmap to be modified.
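The bitmap size follows directly from the figures above: 256 MB divided by 4 KB gives 65536 blocks, and 65536 bits occupy 8192 bytes, that is, 8 KB. A minimal Python sketch, in which the first-fit function `allocate` is a hypothetical illustration rather than the allocation policy of the embodiment:

```python
BLOCK_SIZE = 4 * 1024            # bytes covered by one bit
LOG_AREA_SIZE = 256 * 1024 ** 2  # one 256 MB log area

bits = LOG_AREA_SIZE // BLOCK_SIZE  # number of blocks, one bit each
bitmap = bytearray(bits // 8)       # all zero: every block is free


def allocate(bitmap):
    """Mark the first free block as used and return its index, or -1."""
    for i, byte in enumerate(bitmap):
        if byte != 0xFF:
            for bit in range(8):
                if not byte & (1 << bit):
                    bitmap[i] |= 1 << bit
                    return i * 8 + bit
    return -1
```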
It should be understood that, in embodiments of the present invention, the data area and the log area may use tiered storage, that is, the data area and the log area are on different tiers, and hot spot data is moved to a lower storage tier when the log area is reclaimed.
It should be understood that, in some embodiments, mapping relations may be searched in other ways, for example by querying mapping relations at random, analyzing the space storage state of the corresponding log area from the mapping relations found, and then relocating log area data. In this case the target data of a transaction need not be written to the log area space by sequential append, and the specific writing method is not limited. However, because the mapping relations found this way may not reflect the full space storage state of the log areas, the log area reclamation result may be less than ideal.
It should be understood that, in embodiments of the present invention that include multiple log areas and multiple data areas, the method further includes a step of converting between data areas and log areas, so that both are used more fully and hard disk space is fully utilized. For example, when the space utilization of the data areas exceeds a preset data area utilization threshold, a currently idle log area is converted into a data area; when the space utilization of the log areas exceeds a preset log area utilization threshold, a data area that was converted from an idle log area is converted back into a log area.
For example, at initialization half of the hard disk space is preset as log areas and the rest as data areas. After the system has run for some time and data area utilization is high, idle log areas are searched for in increasing order of log area identification information and converted into data areas. In embodiments that include block groups, a converted data area is managed by the space management object of its block group: the state of the log area is set to data area, recorded in the management structure of the block group, and written to disk. When data on data areas is deleted and hard disk space is released, and all the space of a data area converted from a log area has been released, the state of that data area is switched back to log area, recorded in the management structure of the block group, and written to disk.
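The conversion between data areas and log areas can be sketched as below. The dictionary fields (`id`, `state`, `used`, `size`, `converted`) and the function names are hypothetical. Requiring a converted data area to be completely empty before it is switched back combines the two conditions mentioned above and is an assumption of this sketch; persisting the state change to the block group management structure is omitted.

```python
def utilization(areas, state):
    """Overall space utilization of all areas currently in the given state."""
    chosen = [a for a in areas if a["state"] == state]
    total = sum(a["size"] for a in chosen)
    return sum(a["used"] for a in chosen) / total if total else 0.0


def grow_data_areas(areas, threshold):
    """Convert the idle log area with the smallest id into a data area."""
    if utilization(areas, "data") > threshold:
        for a in sorted(areas, key=lambda a: a["id"]):
            if a["state"] == "log" and a["used"] == 0:
                a["state"], a["converted"] = "data", True
                return a["id"]
    return None


def grow_log_areas(areas, threshold):
    """Switch an emptied, previously converted data area back to a log area."""
    if utilization(areas, "log") > threshold:
        for a in areas:
            if a["state"] == "data" and a.get("converted") and a["used"] == 0:
                a["state"], a["converted"] = "log", False
                return a["id"]
    return None
```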
Fig. 5 is a schematic structural diagram of a hard disk control device according to an exemplary embodiment. The hard disk control device includes a hard disk that includes a data area and a log area, and is configured to perform the functions performed by the hard disk control device in the embodiment corresponding to Fig. 2 above. Referring to Fig. 5, the hard disk control device includes:
Writing unit 501, configured to write data to a cache device;
Cache manager 502, configured to determine whether the data is hot spot data, where hot spot data is data that, once stored on the hard disk, causes the hard disk to generate a preset quantity of fragments after a preset number of modifications and releases;
Data manager 503, configured to, if the data is not hot spot data, allocate data area space for the data in the data area and write the data to the data area space;
Log manager 504, configured to, if the data is hot spot data, allocate log area space for the data in the log area and write the data to the log area space.
Optionally, the cache device is memory, and the hard disk control device further includes:
Mapping relation establishing unit 505, configured to establish a mapping relation between the data and the log area space in memory;
Optionally, the hard disk control device further includes:
Cache unit 506, configured to cache data that is hot spot data in memory.
Optionally,
Mapping relation establishing unit 505 is further configured to establish a mapping relation between multiple target data and the log area space allocated to the multiple target data, where the target data is hot spot data;
Log manager 504 is further configured to combine the write operations of the multiple target data into one transaction and write all target data of the transaction to the log area space, where if the write operation of any one target data of the transaction fails, the write operations of the other target data of the transaction fail as well.
Optionally,
the hard disk control device further includes:
Linked list establishing unit 509, configured to establish a data linked list from the multiple target data, where the data linked list is used to manage target data, and the target data managed by the data linked list is the same as the target data of the transaction;
Linked list management unit 510, configured to manage the target data according to the data linked list;
Searching unit 511, configured to, under a preset release condition, search the data linked lists in order of establishment, earliest first, for data whose management has not been released;
Memory management unit 512, configured to free from memory the data of the target data linked list whose management has not been released, and to retain in memory the target mapping relation corresponding to the target data linked list;
where linked list management unit 510 is further configured to:
after a second data linked list is established, when second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, release management of the first target data on the first data linked list, and delete the information about the first target data from the first mapping relation corresponding to the first data linked list.
Optionally,
Searching unit 511 is further configured to, when memory reaches the first preset water level, search the data linked lists in order of establishment, earliest first, for data whose management has not been released;
the hard disk control device further includes:
Reading unit 523, configured to read the data pointed to by the target mapping relation from the log area when memory reaches the second preset water level;
Mapped data writing unit 513, configured to write the data pointed to by the target mapping relation to the data area;
Deleting unit 514, configured to delete the target mapping relation from memory.
Optionally,
the hard disk control device further includes:
Transaction number allocation unit 515, configured to assign transaction numbers to the data linked lists of transactions in increasing order according to the write order of the transactions;
Searching unit 511 is further configured to search the data linked lists, in ascending order of transaction number starting from the data linked list with the smallest current transaction number, for data whose management has not been released.
Optionally,
the hard disk control device further includes:
Reclamation unit 516, configured to perform the log area data relocation step under a preset reclamation condition;
As shown in Fig. 6, for the log area data relocation step, reclamation unit 516 includes:
Reclamation searching module 517, configured to search for mapping relations;
Reclamation judging module 518, configured to determine, from the information recorded in a mapping relation, whether the data on the first log area corresponding to the mapping relation has completely migrated;
Reclamation determining module 519, configured to, if the data on the first log area has not completely migrated, determine the space utilization of the first log area from the information recorded in the mapping relation;
Reclamation executing module 520, configured to, when the space utilization of the first log area is below the preset utilization threshold, migrate the data of the first log area to a second log area and update the mapping relations corresponding to the moved data, where the second log area is an idle log area or a log area already in use for log area reclamation;
Reclamation module 521, configured to stop performing the log area data relocation step when the total space water level of the current log areas reaches the preset space threshold, and otherwise continue performing the log area data relocation step.
Optionally,
the hard disk control device further includes:
Transaction number allocation unit 515, configured to assign transaction numbers to the data linked lists of transactions in increasing order according to the write order of the transactions;
Reclamation searching module 517 is further configured to search, in ascending order of transaction number starting from the data linked list with the smallest current transaction number, for the mapping relation corresponding to each data linked list;
Optionally, the preset reclamation condition includes at least one of: a timer expiring, a memory data reclamation operation completing, and the total log area water level reaching a preset water level threshold.
Optionally,
Cache unit 506 is further configured to cache the data in memory if the data is hot spot data, or to read data from the log area into the cache device for caching.
Optionally, hot spot data includes data whose size is less than a preset data threshold, and/or hot spot data includes metadata.
Optionally,
Log manager 504 is further configured to allocate log area space for the data sequentially in the log area and append the data sequentially to the log area space.
Optionally,
the hard disk control device further includes:
Data area conversion unit 522, configured to convert a currently idle log area into a data area when the space utilization of the data areas exceeds the preset data area utilization threshold;
Log area conversion unit 524, configured to convert a data area that was converted from an idle log area into a log area when the space utilization of the log areas exceeds the preset log area utilization threshold.
Optionally,
the hard disk further includes a superblock, each log area is assigned identification information, and the superblock is used to record, after a log area is modified, the identification information of the modified log area.
Optionally, log areas and data areas are arranged alternately on the hard disk.
Optionally,
Hard disk further includes block group, and block group includes the log area and data field of preset quantity, the log area and data field of chunking
Continuous setting,
Data management system 503, comprising:
Free area determining module 525, for determining idle target data area according to the management information of block group;
Distribution module 508, for distributing data field space in target data area for data;
Hard disk control device further include:
Metadata generation module 507, for generating target metadata according to data and data field space;
Writing unit 501 is also used to caching device write-in target metadata;
Cache manager judges that metadata is after hot spot data, and log manager 504 is also used to determine target data area
Affiliated object block group;Determine the available log area of object block group;Available log area is written into target metadata.
Optionally,
the log manager 504 is further configured to allocate log area space on the log area for the mapping relations, and to write the mapping relations to the log area space allocated to them.
In conclusion the hard disk includes data field and log area, writing unit on the hard disk control device for including hard disk
After 501 are written data to caching device, cache manager 502 judges whether the data are hot spot datas, if the data are not heat
Point data, then data management system 503 matches data field space in data separation for the data, writes the data into data field space;
If the data are hot spot datas, log manager 504 is the data in log area distribution log area space, is write the data into
Log area space.In this way, the data for being written into hard disk are divided into hot spot data and non-thermal point data, hot spot data is to be stored in firmly
Hard disk can be made to generate the data of preset quantity fragment after on disk after the modification of preset times and release, hot spot data is easy to cause
Hard disk generates fragment, and hot spot number is stored on log area, is managed with log mode, even if the data on log area are frequent
Modification generates hard disk fragment, also facilitates and carries out the management such as recycling to these fragments, and non-thermal point data is stored in data field, non-
The release of hot spot data does not easily lead to hard disk and generates fragment, and data field, which may not need, distributes excess resource for hard disk debris management,
To be managed, be can be improved in different ways by the way that different types of data are stored in different regions on hard disk
Debris management efficiency on hard disk, efficient management of the log area to hard disk fragment, can reduce the generation of hard disk fragment.
Fig. 7 is a schematic diagram of the hardware structure of a hard disk control device provided by another embodiment of the present invention. The hard disk control device includes a processor (CPU) 701, a caching device 703, a hard disk 702, a hard disk controller 705 and a bus 704. The hard disk 702 includes a data field and a log area; in some embodiments the caching device may be, for example, a memory.
The steps performed by the hard disk control device in the above embodiments may be based on the structure of the hard disk control device shown in Fig. 7.
The processor 701 executes a program so that the hard disk control device performs the above hard disk data management method; the various optional designs are described below by way of example.
The processor 701 executes a program so that the hard disk control device has the following functions: writing data to the caching device; judging whether the data is hot spot data, where hot spot data is data that, after being stored on the hard disk, can cause the hard disk to generate a preset quantity of fragments after a preset number of modifications and releases; if the data is not hot spot data, allocating data field space in the data field for the data and writing the data to the data field space; if the data is hot spot data, allocating log area space in the log area for the data and writing the data to the log area space.
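The write path described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Disk` class, the `is_hot_spot` classifier and the `SMALL_DATA_THRESHOLD` value are hypothetical, and the classifier simply assumes the optional design in which small data and metadata count as hot spot data.

```python
from dataclasses import dataclass, field

SMALL_DATA_THRESHOLD = 4096  # hypothetical "preset data threshold", in bytes


@dataclass
class Disk:
    """Toy model of a hard disk split into a data field and a log area."""
    data_field: list = field(default_factory=list)
    log_area: list = field(default_factory=list)


def is_hot_spot(payload: bytes, is_metadata: bool) -> bool:
    # Assumed classifier: metadata or small writes are treated as hot spot data.
    return is_metadata or len(payload) < SMALL_DATA_THRESHOLD


def write(disk: Disk, cache: list, payload: bytes, is_metadata: bool = False) -> str:
    cache.append(payload)  # first write the data to the caching device
    if is_hot_spot(payload, is_metadata):
        disk.log_area.append(payload)  # hot spot data goes to the log area
        return "log"
    disk.data_field.append(payload)  # all other data goes to the data field
    return "data"
```

The return value only reports which region received the data, which makes the routing decision easy to observe in a test.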
In an optional design, the caching device is a memory, and the processor 701 executes a program so that the hard disk control device has the following function: after allocating log area space in the log area for the data, establishing a mapping relation between the data and the log area space.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: establishing mapping relations between multiple target data and the log area space allocated to the multiple target data, where the target data is hot spot data; combining the write operations of the multiple target data into one transaction; and writing all target data of the transaction to the log area space, where, if the write operation of any one target data of the transaction fails, the write operations of the other target data of the transaction also fail.
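The all-or-nothing behaviour of such a grouped write can be illustrated with a short sketch. The function name `write_transaction`, the `fail_on` test hook and the list-based log are assumptions made for illustration; the source does not specify how a failed write is detected or rolled back.

```python
class LogWriteError(Exception):
    """Raised when a single write to the log area fails (simulated here)."""


def write_transaction(log_area: list, mapping: dict, items: dict, fail_on=None) -> bool:
    """Append every (key, payload) in `items` to `log_area`, or none of them."""
    start = len(log_area)  # remember where this group of writes began
    staged = {}
    try:
        for key, payload in items.items():
            if key == fail_on:  # hypothetical hook that simulates a failed write
                raise LogWriteError(key)
            log_area.append(payload)
            staged[key] = len(log_area) - 1  # position of the payload in the log
    except LogWriteError:
        del log_area[start:]  # roll back: the other writes of the group fail too
        return False
    mapping.update(staged)  # publish the mapping relations only on success
    return True
```

Staging the mapping relations and publishing them only after every write succeeded keeps the mapping consistent with what is actually in the log area.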
In an optional design, the processor 701 executes a program so that the hard disk control device has the following function: caching, in the memory, the data that is hot spot data.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: before all target data of the transaction is written to the log area space, establishing a data link table according to the multiple target data, where the data link table is used to manage the target data, and the target data managed by the data link table is the same as the target data of the transaction; managing the target data according to the data link table; under a preset release condition, searching, in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management; and releasing from the memory the data that a target data link table has not released from management, while retaining in the memory the target mapping relations corresponding to the target data link table. Managing the target data according to the data link table includes: after a second data link table is established, when second target data managed by the second data link table is obtained by modifying first target data managed by a previously established first data link table, releasing the first target data from management on the first data link table, and deleting the information of the first target data from the first mapping relations corresponding to the first data link table.
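One way to picture the per-transaction bookkeeping is the sketch below. It is only an approximation under stated assumptions: a Python set stands in for each data link table, the class and method names are hypothetical, and a table is marked as freed after its cached data has been released so that it is not visited twice.

```python
from collections import OrderedDict


class TransactionLists:
    def __init__(self):
        self.lists = OrderedDict()  # transaction id -> keys it still manages
        self.mappings = {}          # transaction id -> {key: log area offset}
        self.memory = {}            # key -> payload cached in memory
        self.owner = {}             # key -> transaction id that manages it
        self.freed = set()          # transaction ids whose cached data was released

    def add_transaction(self, txn_id, items: dict, offsets: dict) -> None:
        for key in items:
            old = self.owner.get(key)
            if old is not None:
                # A newer version supersedes the older one: the older table stops
                # managing it and its mapping information is deleted.
                self.lists[old].discard(key)
                self.mappings[old].pop(key, None)
            self.owner[key] = txn_id
        self.lists[txn_id] = set(items)
        self.mappings[txn_id] = dict(offsets)
        self.memory.update(items)

    def release_oldest(self):
        """Free cached data of the oldest table that still manages data."""
        for txn_id, keys in self.lists.items():  # establishment order, oldest first
            if txn_id in self.freed or not keys:
                continue
            released = sorted(keys)
            for key in released:
                self.memory.pop(key, None)  # free memory, keep the mapping relation
            self.freed.add(txn_id)
            return txn_id, released
        return None, []
```

Because the mapping relations survive the release, data that is no longer cached can still be located in the log area.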
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: when the memory reaches a first preset water level, searching, in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management;
after the data that the target data link table has not released from management is released from the memory, when the memory reaches a second preset water level, reading from the log area the data that the target mapping relations point to;
writing the data that the target mapping relations point to into the data field;
deleting the target mapping relations from the memory.
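The two-stage reaction to memory pressure can be sketched as below. The numeric thresholds, the `usage` parameter and the dictionary-based log area and data field are assumptions; for brevity the first stage frees all cached data instead of walking the data link tables oldest first.

```python
FIRST_WATER_LEVEL = 0.7   # hypothetical fraction of memory in use
SECOND_WATER_LEVEL = 0.9  # hypothetical fraction of memory in use


def flush_mappings(log_area: dict, data_field: dict, mappings: dict) -> None:
    """Move the data that retained mapping relations point to into the data field."""
    for key, offset in list(mappings.items()):
        data_field[key] = log_area[offset]  # read from the log area, write to the data field
        del mappings[key]                   # then delete the mapping relation from memory


def on_memory_pressure(usage: float, cached: dict, log_area: dict,
                       data_field: dict, mappings: dict) -> list:
    actions = []
    if usage >= FIRST_WATER_LEVEL:
        cached.clear()  # stage 1: free cached hot spot data, keep the mappings
        actions.append("released-cache")
    if usage >= SECOND_WATER_LEVEL:
        flush_mappings(log_area, data_field, mappings)  # stage 2: free the mappings too
        actions.append("flushed-mappings")
    return actions
```

The second stage is the more expensive one because it touches the disk, which is presumably why it is tied to a separate, later water level.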
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions:
assigning transaction numbers, according to an incrementing rule and following the write order of the transactions, to the data link tables corresponding to the transactions;
starting from the data link table with the smallest current transaction number, searching in ascending order of transaction number for data that the data link tables have not released from management.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: under a preset recycling condition, executing a log area data relocation step.
The log area data relocation step includes:
searching the mapping relations;
judging, according to the information recorded in the mapping relations, whether the data on a first log area corresponding to the mapping relations has been completely migrated;
if the data on the first log area has not been completely migrated, determining the space utilization of the first log area according to the information recorded in the mapping relations;
when the space utilization of the first log area is less than a preset utilization threshold, migrating the data of the first log area to a second log area and updating the mapping relations corresponding to the migrated data, where the second log area is an idle log area or the log area already in use for recycling;
when the total space water level of the current log areas reaches a preset space threshold, stopping the log area data relocation step; otherwise, continuing to execute the log area data relocation step.
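The relocation loop can be approximated by the following sketch. Several points are assumptions rather than statements of the source: utilization is counted in live entries per fixed `capacity`, the stop condition is interpreted as "enough log areas are now empty", the destination's capacity is not checked, and all names are hypothetical.

```python
def relocate_sparse_areas(areas: dict, mapping: dict, capacity: int,
                          util_threshold: float, dest_id, target_free: int) -> list:
    """areas: log area id -> {key: payload}; mapping: key -> log area id."""
    moved = []
    for area_id in sorted(areas):
        free = sum(1 for a, live in areas.items() if not live and a != dest_id)
        if free >= target_free:
            break  # enough log space has been reclaimed: stop relocating
        live = areas[area_id]
        if area_id == dest_id or not live:
            continue  # the destination itself, or an area already fully migrated
        if len(live) / capacity < util_threshold:
            for key, payload in list(live.items()):
                areas[dest_id][key] = payload  # move the data to the second log area
                mapping[key] = dest_id         # update the mapping relation
                del live[key]
            moved.append(area_id)
    return moved
```

Only sparsely used log areas are emptied, so each relocation moves little data while freeing a whole area.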
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions:
assigning transaction numbers, according to an incrementing rule and following the write order of the transactions, to the data link tables corresponding to the transactions; starting from the data link table with the smallest current transaction number, searching in ascending order of transaction number for the mapping relations corresponding to the data link tables.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following function:
the preset recycling condition includes at least one of: a timer expiring, a reclaim operation on in-memory data completing, and the total water level of the log area reaching a preset water level threshold.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions:
after judging whether the data is hot spot data, caching the data in the memory if the data is hot spot data; or,
before writing data to the caching device, reading data from the log area and caching it in the caching device.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following function:
the hot spot data includes data whose size is less than a preset data threshold, and/or the hot spot data includes metadata.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions:
allocating log area space sequentially in the log area for the data, and sequentially appending the data to the log area space.
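Sequential allocation with append-only writes can be sketched as a tail pointer over a fixed buffer. The `LogArea` class and its byte-buffer representation are illustrative assumptions.

```python
class LogArea:
    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.tail = 0  # next free offset; space is only ever handed out from here

    def append(self, payload: bytes) -> int:
        """Allocate the next contiguous span and write the payload into it."""
        if self.tail + len(payload) > len(self.buf):
            raise ValueError("log area full")
        offset = self.tail
        self.buf[offset:offset + len(payload)] = payload
        self.tail += len(payload)
        return offset  # the caller can record this offset in a mapping relation
```

Because space is never handed out from the middle of the area, writes to the log area do not leave holes between live records at write time.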
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions:
when the space utilization of the data field is greater than a preset data field utilization threshold, converting a currently idle log area into a data field;
when the space utilization of the log area is greater than a preset log area utilization threshold, converting a data field that was converted from an idle log area back into a log area.
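The dynamic conversion between the two region types might look like the sketch below. The `Region` record, the default thresholds of 0.8 and the requirement that a data field be empty before it is converted back are assumptions added for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    kind: str                        # "data" or "log"
    used: int = 0
    capacity: int = 100
    converted_from_log: bool = False


def utilization(regions: list, kind: str) -> float:
    members = [r for r in regions if r.kind == kind]
    total = sum(r.capacity for r in members)
    return sum(r.used for r in members) / total if total else 0.0


def rebalance(regions: list, data_threshold: float = 0.8, log_threshold: float = 0.8):
    if utilization(regions, "data") > data_threshold:
        for r in regions:
            if r.kind == "log" and r.used == 0:  # a currently idle log area
                r.kind, r.converted_from_log = "data", True
                return "log->data"
    if utilization(regions, "log") > log_threshold:
        for r in regions:
            # Only data fields that started out as idle log areas are converted back.
            if r.kind == "data" and r.converted_from_log and r.used == 0:
                r.kind, r.converted_from_log = "log", False
                return "data->log"
    return None
```

Tracking `converted_from_log` keeps the original data fields stable, so only borrowed space moves between the two roles.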
In an optional design, the processor 701 executes a program so that the hard disk control device has the following function:
the hard disk further includes a superblock, and each log area is assigned identification information; after a log area is modified, the superblock is used to record the identification information of the modified log area.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following function:
the log areas and data fields are arranged alternately on the hard disk.
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: the hard disk further includes block groups, each block group includes a preset number of log areas and data fields, and the log areas and data fields of the same block group are arranged contiguously; determining an idle target data field according to the management information of the block groups; allocating data field space in the target data field for the data;
after the data is written to the data field space, generating target metadata according to the data and the data field space; writing the target metadata to the caching device; after the metadata is judged to be hot spot data, determining the target block group to which the target data field belongs; determining an available log area of the target block group; and writing the target metadata to the available log area.
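A rough sketch of keeping metadata in the same block group as its data follows. The `BlockGroup` structure, the first-fit choice of data field and the choice of the first log area as the "available" one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    data_fields: dict = field(default_factory=dict)  # data field id -> free bytes
    log_areas: dict = field(default_factory=dict)    # log area id -> list of records


def pick_data_field(groups: list, size: int):
    """Return (group index, data field id) of the first data field with room."""
    for gi, group in enumerate(groups):
        for df_id, free in group.data_fields.items():
            if free >= size:
                return gi, df_id
    raise RuntimeError("no idle data field")


def write_with_metadata(groups: list, payload: bytes):
    gi, df_id = pick_data_field(groups, len(payload))
    groups[gi].data_fields[df_id] -= len(payload)   # allocate data field space
    metadata = {"data_field": df_id, "length": len(payload)}
    log_id = next(iter(groups[gi].log_areas))       # a log area of the same block group
    groups[gi].log_areas[log_id].append(metadata)   # metadata is treated as hot spot data
    return gi, df_id, log_id
```

Since the regions of a block group are contiguous on disk, placing the metadata in that group's log area keeps it physically close to the data it describes.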
In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: allocating log area space on the log area for the mapping relations; and writing the mapping relations to the log area space allocated to them.
In conclusion the hard disk includes data field and log area, the processor on the hard disk control device for including hard disk
After 701 are written data to caching device, which judges whether the data are hot spot datas, if the data are not hot spots
Data, then the processor 701 matches data field space in data separation for the data, writes the data into data field space;If should
Data are hot spot datas, then the processor 701 is the data in log area distribution log area space, write the data into log area
Space.In this way, the data for being written into hard disk are divided into hot spot data and non-thermal point data, hot spot data is after being stored on hard disk
Hard disk can be made to generate the data of preset quantity fragment after the modification and release of preset times, hot spot data is easy to that hard disk is caused to produce
Raw fragment, hot spot number is stored on log area, is managed with log mode, even if the data on log area frequently modify production
Stiff disk fragment also facilitates and carries out the management such as recycling to these fragments, and non-thermal point data is stored in data field, non-thermal points
According to release do not easily lead to hard disk generate fragment, data field may not need for hard disk debris management distribute excess resource, thus, lead to
It crosses and different types of data are stored in different regions on hard disk are managed in different ways, can be improved on hard disk
Debris management efficiency, efficient management of the log area to hard disk fragment, can reduce the generation of hard disk fragment.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (24)
1. A hard disk data management method, characterized in that the method is applied to a hard disk control device including a hard disk, the hard disk includes a data field and a log area, and the method comprises:
writing data to a caching device;
judging whether the data is hot spot data, wherein the hot spot data is data that, after being stored on the hard disk, can cause the hard disk to generate a preset quantity of fragments after a preset number of modifications and releases;
if the data is not hot spot data, allocating data field space in the data field for the data, and writing the data to the data field space;
if the data is hot spot data, allocating log area space in the log area for the data, and writing the data to the log area space;
the method further comprises:
when the space utilization of the data field is greater than a preset data field utilization threshold, converting a currently idle log area into a data field;
when the space utilization of the log area is greater than a preset log area utilization threshold, converting a data field that was converted from an idle log area into a log area.
2. The method according to claim 1, characterized in that the caching device is a memory, and
after the allocating log area space in the log area for the data, the method further comprises:
establishing a mapping relation between the data and the log area space.
3. The method according to claim 2, characterized in that
the establishing a mapping relation between the data and the log area space comprises:
establishing mapping relations between multiple target data and the log area space allocated to the multiple target data, wherein the target data is hot spot data;
the writing the data to the log area space comprises:
combining the write operations of the multiple target data into one transaction;
writing all target data of the transaction to the log area space, wherein, when the write operation of one target data of the transaction fails, the write operations of the other target data of the transaction fail.
4. The method according to claim 3, characterized in that the method further comprises:
caching, in the memory, the data that is the hot spot data.
5. The method according to claim 4, characterized in that
before the writing all target data of the transaction to the log area space, the method further comprises:
establishing a data link table according to the multiple target data, wherein the data link table is used to manage the target data, and the target data managed by the data link table is the same as the target data of the transaction;
managing the target data according to the data link table;
under a preset release condition, searching, in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management;
releasing from the memory the data that a target data link table has not released from management, and retaining in the memory the target mapping relations corresponding to the target data link table;
wherein the managing the target data according to the data link table comprises:
after a second data link table is established, when second target data managed by the second data link table is obtained by modifying first target data managed by a previously established first data link table, releasing the first target data from management on the first data link table, and deleting the information of the first target data from the first mapping relations corresponding to the first data link table.
6. The method according to claim 5, characterized in that
the searching, under a preset release condition and in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management comprises:
when the memory reaches a first preset water level, searching, in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management;
after the releasing from the memory the data that the target data link table has not released from management, the method further comprises:
when the memory reaches a second preset water level, reading from the log area the data that the target mapping relations point to;
writing the data that the target mapping relations point to into the data field;
deleting the target mapping relations from the memory.
7. The method according to claim 5, characterized in that
the method further comprises:
under a preset recycling condition, executing a log area data relocation step;
the log area data relocation step comprises:
searching the mapping relations;
judging, according to the information recorded in the mapping relations, whether the data on a first log area corresponding to the mapping relations has been completely migrated;
if the data on the first log area has not been completely migrated, determining the space utilization of the first log area according to the information recorded in the mapping relations;
when the space utilization of the first log area is less than a preset utilization threshold, migrating the data of the first log area to a second log area, and updating the mapping relations corresponding to the migrated data, wherein the second log area is an idle log area or the log area already in use for recycling;
when the total space water level of the current log areas reaches a preset space threshold, stopping the log area data relocation step; otherwise, continuing to execute the log area data relocation step.
8. The method according to claim 7, characterized in that
the method further comprises:
assigning transaction numbers, according to an incrementing rule and following the write order of the transactions, to the data link tables corresponding to the transactions;
in the log area data relocation step, the searching the mapping relations comprises:
starting from the data link table with the smallest current transaction number, searching in ascending order of transaction number for the mapping relations corresponding to the data link tables.
9. The method according to claim 4, characterized in that
the caching, in the memory, the data that is the hot spot data comprises:
after the judging whether the data is hot spot data, retaining the data in the memory if the data is hot spot data; or,
before the writing data to the caching device, reading data from the log area and caching it in the caching device.
10. The method according to any one of claims 1 to 9, characterized in that the hot spot data includes data whose size is less than a preset data threshold, and/or the hot spot data includes metadata.
11. The method according to any one of claims 1 to 9, characterized in that
the allocating log area space in the log area for the data and writing the data to the log area space comprises:
allocating log area space sequentially in the log area for the data, and sequentially appending the data to the log area space.
12. The method according to any one of claims 2 to 9, characterized in that
the method further comprises:
allocating log area space on the log area for the mapping relations;
writing the mapping relations to the log area space allocated to the mapping relations.
13. A hard disk control device, characterized in that the hard disk control device comprises:
a writing unit, configured to write data to a caching device;
a cache manager, configured to judge whether the data is hot spot data, wherein the hot spot data is data that, after being stored on the hard disk, can cause the hard disk to generate a preset quantity of fragments after a preset number of modifications and releases;
a data management system, configured to, if the data is not hot spot data, allocate data field space in a data field of the hard disk for the data and write the data to the data field space;
a log manager, configured to, if the data is hot spot data, allocate log area space in a log area of the hard disk for the data and write the data to the log area space;
the hard disk control device further comprises:
a data field conversion unit, configured to convert a currently idle log area into a data field when the space utilization of the data field is greater than a preset data field utilization threshold;
a log area conversion unit, configured to convert a data field that was converted from an idle log area into a log area when the space utilization of the log area is greater than a preset log area utilization threshold.
14. The hard disk control device according to claim 13, characterized in that the caching device is a memory, and the hard disk control device further comprises:
a mapping relation establishing unit, configured to establish a mapping relation between the data and the log area space.
15. The hard disk control device according to claim 14, characterized in that
the mapping relation establishing unit is further configured to establish mapping relations between multiple target data and the log area space allocated to the multiple target data, wherein the target data is hot spot data;
the log manager is further configured to combine the write operations of the multiple target data into one transaction, and to write all target data of the transaction to the log area space, wherein, when the write operation of one target data of the transaction fails, the write operations of the other target data of the transaction fail.
16. The hard disk control device according to claim 15, characterized in that the hard disk control device further comprises:
a cache unit, configured to cache, in the memory, the data that is the hot spot data.
17. The hard disk control device according to claim 16, characterized in that
the hard disk control device further comprises:
a link table establishing unit, configured to establish a data link table according to the multiple target data, wherein the data link table is used to manage the target data, and the target data managed by the data link table is the same as the target data of the transaction;
a link table management unit, configured to manage the target data according to the data link table;
a searching unit, configured to search, under a preset release condition and in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management;
a memory management unit, configured to release from the memory the data that a target data link table has not released from management, and to retain in the memory the target mapping relations corresponding to the target data link table;
wherein the link table management unit is further configured to:
after a second data link table is established, when second target data managed by the second data link table is obtained by modifying first target data managed by a previously established first data link table, release the first target data from management on the first data link table, and delete the information of the first target data from the first mapping relations corresponding to the first data link table.
18. The hard disk control device according to claim 17, characterized in that
the searching unit is further configured to search, when the memory reaches a first preset water level and in the order in which the data link tables were established (earliest first), for data that the data link tables have not released from management;
the hard disk control device further comprises:
a reading unit, configured to read from the log area, when the memory reaches a second preset water level, the data that the target mapping relations point to;
a mapped data writing unit, configured to write the data that the target mapping relations point to into the data field;
a deleting unit, configured to delete the target mapping relations from the memory.
19. The hard disk control device according to claim 17, characterized in that
the hard disk control device further comprises:
a recycling unit, configured to execute a log area data relocation step under a preset recycling condition;
for the log area data relocation step, the recycling unit comprises:
a recycling search module, configured to search the mapping relations;
a recycling judgment module, configured to judge, according to the information recorded in the mapping relations, whether the data on a first log area corresponding to the mapping relations has been completely migrated;
a recycling determining module, configured to determine, if the data on the first log area has not been completely migrated, the space utilization of the first log area according to the information recorded in the mapping relations;
a recycling execution module, configured to, when the space utilization of the first log area is less than a preset utilization threshold, migrate the data of the first log area to a second log area and update the mapping relations corresponding to the migrated data, wherein the second log area is an idle log area or the log area already in use for recycling;
a recycling module, configured to stop the log area data relocation step when the total space water level of the current log areas reaches a preset space threshold, and otherwise to continue executing the log area data relocation step.
20. The hard disk control device according to claim 19, characterized in that
the hard disk control device further comprises:
a transaction number allocation unit, configured to assign transaction numbers, according to an incrementing rule and following the write order of the transactions, to the data link tables corresponding to the transactions;
in the log area data relocation step, the searching the mapping relations comprises:
starting from the data link table with the smallest current transaction number, searching in ascending order of transaction number for the mapping relations corresponding to the data link tables.
21. The hard disk control device according to claim 16, characterized in that
the cache unit is further configured to retain the data in the memory if the data is hot spot data; or,
to read data from the log area and cache it in the caching device.
22. The hard disk control device according to any one of claims 13 to 21, characterized in that the hot spot data includes data whose size is less than a preset data threshold, and/or the hot spot data includes metadata.
23. The hard disk control device according to any one of claims 13 to 21, characterized in that
the log manager is further configured to allocate log area space sequentially in the log area for the data, and to sequentially append the data to the log area space.
24. The hard disk control device according to any one of claims 15 to 21, characterized in that
the log manager is further configured to allocate log area space in the log area for the mapping relations, and to
write the mapping relations into the log area space allocated to the mapping relations.
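Claims 23 and 24 describe sequential allocation in the log area for both the data and the mapping relations. The sketch below illustrates that idea under stated assumptions: `LogArea`, `LogAreaFull`, `log_write` and the three-field mapping record (target disk address, log offset, length) are hypothetical and are not taken from the patent text.

```python
from __future__ import annotations

import struct


class LogAreaFull(Exception):
    """Raised when no sequential space is left; reclamation would be needed."""


class LogArea:
    """Append-only region: space is handed out strictly in order."""

    def __init__(self, capacity: int) -> None:
        self.buf = bytearray(capacity)
        self.tail = 0  # next free offset

    def allocate(self, length: int) -> int:
        if self.tail + length > len(self.buf):
            raise LogAreaFull("log area full")
        offset = self.tail
        self.tail += length
        return offset

    def append(self, payload: bytes) -> int:
        offset = self.allocate(len(payload))
        self.buf[offset:offset + len(payload)] = payload
        return offset


# Hypothetical mapping record: disk address, log offset, length (unsigned 64-bit each).
MAPPING = struct.Struct("<QQQ")


def log_write(log: LogArea, disk_addr: int, data: bytes) -> tuple[int, int]:
    """Append the data, then append the mapping relation that points at it."""
    data_off = log.append(data)
    map_off = log.append(MAPPING.pack(disk_addr, data_off, len(data)))
    return data_off, map_off
```

Because every allocation advances a single tail pointer, writes to the underlying disk stay sequential regardless of the target addresses of the data.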
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610912077.5A CN106502587B (en) | 2016-10-19 | 2016-10-19 | Hard disk data management method and hard disk control device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106502587A (en) | 2017-03-15 |
CN106502587B (en) | 2019-10-25 |
Family
ID=58294298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610912077.5A Active CN106502587B (en) | 2016-10-19 | 2016-10-19 | Hard disk data management method and hard disk control device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106502587B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107885455B (en) * | 2016-09-30 | 2020-02-07 | 郑州云海信息技术有限公司 | Dynamic adjustment method for disk log area |
CN107197191B (en) * | 2017-05-27 | 2021-05-11 | 深圳市景阳科技股份有限公司 | Writing method and device for network hard disk video |
CN107688442B (en) * | 2017-09-04 | 2020-11-20 | 苏州浪潮智能科技有限公司 | Virtual block management method for solid state disk |
CN107506156B (en) * | 2017-09-28 | 2020-05-12 | 焦点科技股份有限公司 | Io optimization method of block device |
CN108920095B (en) * | 2018-06-06 | 2021-06-29 | 深圳市脉山龙信息技术股份有限公司 | Data storage optimization method and device based on CRUSH |
EP3789883A4 (en) * | 2018-06-30 | 2021-05-12 | Huawei Technologies Co., Ltd. | Storage fragment managing method and terminal |
KR20200035592A (en) * | 2018-09-27 | 2020-04-06 | 삼성전자주식회사 | Method of operating storage device, storage device performing the same and storage system including the same |
CN111125033B (en) * | 2018-10-31 | 2024-04-09 | 深信服科技股份有限公司 | Space recycling method and system based on full flash memory array |
CN109558457B (en) * | 2018-12-11 | 2022-04-22 | 浪潮(北京)电子信息产业有限公司 | Data writing method, device, equipment and storage medium |
CN111694703B (en) * | 2019-03-13 | 2023-05-02 | 阿里云计算有限公司 | Cache region management method and device and computer equipment |
CN113010616A (en) * | 2021-04-26 | 2021-06-22 | 广州小鹏汽车科技有限公司 | Data processing method and data processing system |
CN116069261A (en) * | 2023-03-03 | 2023-05-05 | 苏州浪潮智能科技有限公司 | Data processing method, system, equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103514260A (en) * | 2013-08-13 | 2014-01-15 | 中国科学技术大学苏州研究院 | Internal storage log file system and achieving method thereof |
CN103544045A (en) * | 2013-10-16 | 2014-01-29 | 南京大学镇江高新技术研究院 | HDFS-based virtual machine image storage system and construction method thereof |
CN105224237A (en) * | 2014-05-26 | 2016-01-06 | 华为技术有限公司 | A kind of data storage method and device |
CN105956090A (en) * | 2016-04-27 | 2016-09-21 | 中国科学技术大学 | I/O self-adaption-based file system log mode |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7272687B2 (en) * | 2005-02-01 | 2007-09-18 | Lsi Corporation | Cache redundancy for LSI raid controllers |
Also Published As
Publication number | Publication date |
---|---|
CN106502587A (en) | 2017-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106502587B (en) | Hard disk data management method and hard disk control device | |
CN106547703B (en) | A kind of FTL optimization method based on block group structure | |
CN104461393B (en) | Mixed mapping method of flash memory | |
CN104298610B (en) | Data storage system and its management method | |
CN104035729B (en) | Block device thin-provisioning method for log mapping | |
CN102789427B (en) | Data memory device and its method of operating | |
CN106527969B (en) | A kind of Nand Flash memorizer reading/writing method in a balanced way of life-span | |
CN104346357B (en) | The file access method and system of a kind of built-in terminal | |
CN103777905B (en) | Software-defined fusion storage method for solid-state disc | |
CN107066393A (en) | The method for improving map information density in address mapping table | |
CN103019958A (en) | Method for managing data in solid state memory through data attribute | |
CN108121503A (en) | A kind of NandFlash address of cache and block management algorithm | |
CN106201916B (en) | A kind of nonvolatile cache method towards SSD | |
US20140258596A1 (en) | Memory controller and memory system | |
CN109582593B (en) | FTL address mapping reading and writing method based on calculation | |
CN109164975A (en) | A kind of method and solid state hard disk writing data into solid state hard disk | |
CN111158604B (en) | Internet of things time sequence data storage and retrieval method for flash memory particle array | |
CN102646069A (en) | Method for prolonging service life of solid-state disk | |
CN101354681A (en) | Memory system, abrasion equilibrium method and apparatus of non-volatile memory | |
CN104598386B (en) | By following the trail of and reusing solid-state drive block using two level map index | |
CN102163175A (en) | Hybrid address mapping method based on locality analysis | |
CN107015763A (en) | Mix SSD management methods and device in storage system | |
CN109947363A (en) | A kind of data cache method of distributed memory system | |
CN110188108A (en) | Date storage method, device, system, computer equipment and storage medium | |
CN109671458A (en) | The method of management flash memory module and relevant flash controller |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||