CN107122124B - Data processing method and device - Google Patents


Info

Publication number
CN107122124B
Authority
CN
China
Prior art keywords
data
page data
page
type
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610103929.6A
Other languages
Chinese (zh)
Other versions
CN107122124A (en
Inventor
杨洪章
罗圣美
王志坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201610103929.6A priority Critical patent/CN107122124B/en
Priority to PCT/CN2017/074290 priority patent/WO2017143972A1/en
Publication of CN107122124A publication Critical patent/CN107122124A/en
Application granted granted Critical
Publication of CN107122124B publication Critical patent/CN107122124B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626Reducing size or complexity of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a data processing method and device. The method comprises: acquiring a recovery request, wherein the recovery request requests data recovery of page data in a solid state disk; in response to the recovery request, obtaining a first type of page data from the valid page data in a cache, wherein the first type of page data is page data that is to be replaced from the cache and stored in the solid state disk; and relocating the first type of page data from the cache to a predetermined relocation position in the solid state disk, wherein the predetermined relocation position is the storage position of the valid page data after the data recovery is performed. The method and device solve the problem in the related art of low data processing efficiency caused by secondary relocation of data, and thereby improve data processing efficiency.

Description

Data processing method and device
Technical Field
The present invention relates to the field of communications, and in particular, to a data processing method and apparatus.
Background
A solid state drive (SSD) is a new generation of hard disk produced by applying advanced semiconductor technology to large-capacity mobile storage. Because an SSD has no mechanical structure such as a magnetic head, it does not need to move a head to locate data, so it starts faster; and because there is no seek time, its storage and read/write speeds also exceed those of a mechanical hard drive. Besides high performance, the advantages of solid state drives over conventional disks include high reliability, strong shock resistance, low power consumption, and low noise. As a result, solid state drives are becoming increasingly popular in both personal and enterprise-level applications.
However, solid state drives also have some disadvantages, such as erase-before-write, a limited number of erase cycles, and garbage collection. 1) Erase-before-write: a solid state drive supports three operations (read, write, and erase), and a location must be erased before it can be written; that is, data cannot be overwritten in place. For example, when written data needs to be modified, the old data must be marked as invalid and the new data written into free space. This characteristic greatly reduces the write performance of the solid state drive. 2) Limited number of erase cycles: a storage unit of a solid state drive can generally be erased ten thousand to one million times. Once a unit reaches this limit it can no longer be used, and the data stored in the damaged unit must be migrated to another unit; when the number of damaged units exceeds a certain number, the whole solid state drive becomes unusable. 3) Garbage Collection (GC for short): when there is no free space for a write operation, part of the space must be released; that is, all valid pages are relocated to one or several data blocks, and the blocks that no longer contain valid data pages are erased to release free space.
Currently, garbage collection in a solid state disk commonly uses the classical greedy algorithm: the data blocks containing the most invalid pages are selected for garbage collection, so that the invalid pages in those blocks are reclaimed first. That is, when the free space in the solid state disk is insufficient, the valid pages in the data recovery block are relocated and the invalid pages in the data recovery block are erased, which completes garbage collection for the solid state disk. However, in the existing garbage collection process the solid state disk does not subdivide the valid page data. As a result, after the data has been relocated, the cold dirty page data among the valid page data is replaced from the cache and must then be relocated a second time: the data that was just moved to a new location is marked as invalid, and the new data in the cache is written to an updated location in the solid state disk. A large number of meaningless secondary relocations are therefore performed on valid pages while the solid state disk is in use, which greatly increases the overhead of the solid state disk and reduces the efficiency of data processing in it.
In view of the problems set forth above, no effective solution has been proposed.
Disclosure of Invention
The invention provides a data processing method and a data processing device, which are used for at least solving the problem of low data processing efficiency caused by secondary data relocation in the related technology.
According to an aspect of the present invention, there is provided a data processing method including: acquiring a recovery request, wherein the recovery request is used for requesting data recovery of page data in a solid state disk; responding to the recovery request to acquire first type page data from the cached effective page data, wherein the first type page data is used for indicating that the page data stored in the solid state disk is to be replaced from the cache; and migrating the first type of page data from the cache to a preset migration position in the solid state disk, wherein the preset migration position is a storage position of the effective page data after the data recovery is executed.
Optionally, the obtaining the first type of page data from the cached valid page data in response to the recycle request includes: acquiring the access frequency and the modification identification of the effective page data in the cache; acquiring a page type of the effective page data according to the access frequency and the modification identifier, wherein the page type of the effective page data includes the first type of page data and a second type of page data, and the second type of page data is used for indicating that the page data stored in the solid state disk is not replaced from the cache; and separating the effective page data according to the page type of the effective page data to obtain the first type of page data.
Optionally, the second type of page data includes first page data and second page data, and obtaining the page type of the valid page data according to the access frequency and the modification identifier includes: taking page data whose modification identifier indicates that it is unmodified as the first page data; taking page data whose modification identifier indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and taking page data whose modification identifier indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
Optionally, before relocating the first type of page data from the cache to the predetermined relocation position in the solid state disk, the method further includes: determining a data recovery block of the solid state disk at least according to the first type of page data, wherein the page types of the page data in each data block in the solid state disk include unwritten page data, invalid page data and the valid page data, and the data blocks include the data recovery block; and performing the data recovery on the data recovery block.
Optionally, determining the data recovery block of the solid state disk according to at least the first type of page data includes: acquiring the data recovery rate of each data block according to the page data of the first type in the cache and the page data in each data block in the solid state disk; and determining the data recovery block according to the data recovery rate.
Optionally, obtaining a data recovery rate of each data block according to the page data of the first type in the cache and the page data in each data block in the solid state disk includes: repeatedly executing the following steps until all the data blocks in the solid state disk are traversed: acquiring a block identifier of a current data block; acquiring the first type page data and the failure page data identified by the block identifier; obtaining the data recovery rate of the current data block by:
Figure BDA0000929661180000021
wherein r represents the data recovery rate of the current data block, a represents the number of pages of invalid page data of the current data block in the solid state disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
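The expression for r is carried by the figure referenced above and is not reproduced in the text. As an assumption that is consistent with the symbol definitions, and not a transcription of the figure, the recovery rate can be read as the fraction of a block's capacity that data recovery would free, writing b for the number of first-type pages in the cache that belong to the current data block:

```latex
r = \frac{(a + b) \cdot P}{B}
```

Under this assumed form, a block of size B = 256 KB with pages of size P = 4 KB, a = 20 invalid pages and b = 12 first-type pages would give r = (20 + 12) x 4 / 256 = 0.5.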
Optionally, the performing the data recovery on the data recovery block includes: relocating the effective page data in the data recovery block to the predetermined relocation position, and marking the effective page data as the invalid page data; and erasing the invalid page data in the data recovery block.
Optionally, before relocating the first type of page data from the cache to the predetermined relocation position in the solid state disk, the method further includes: determining the predetermined relocation position according to the size of the unwritten page data in the data blocks other than the data recovery block in the solid state disk.
According to another aspect of the present invention, there is provided a data processing apparatus comprising: a first obtaining unit, configured to obtain a recovery request, wherein the recovery request requests data recovery of page data in a solid state disk; a second obtaining unit, configured to obtain, in response to the recovery request, a first type of page data from the valid page data in a cache, wherein the first type of page data is page data that is to be replaced from the cache and stored in the solid state disk; and a relocation unit, configured to relocate the first type of page data from the cache to a predetermined relocation position in the solid state disk, wherein the predetermined relocation position is the storage position of the valid page data after the data recovery is performed.
Optionally, the second obtaining unit includes: a first obtaining module, configured to obtain an access frequency and a modification identifier for the valid page data in the cache; a second obtaining module, configured to obtain a page type of the valid page data according to the access frequency and the modification identifier, where the page type of the valid page data includes the first type of page data and a second type of page data, and the second type of page data is used to indicate that the page data stored in the solid state disk is not replaced from the cache; and the separation module is used for separating the effective page data according to the page type of the effective page data to obtain the first type of page data.
Optionally, the second type of page data includes first page data and second page data, and the second obtaining module obtains the page type of the valid page data by: taking page data whose modification identifier indicates that it is unmodified as the first page data; taking page data whose modification identifier indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and taking page data whose modification identifier indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
Optionally, the apparatus further comprises: a first determining unit, configured to determine a data recovery block of the solid state disk at least according to the first type of page data before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk, wherein the page types of the page data in each data block in the solid state disk include unwritten page data, invalid page data and the valid page data, and the data blocks include the data recovery block; and a recovery unit, configured to perform the data recovery on the data recovery block.
Optionally, the first determination unit includes: a third obtaining module, configured to obtain a data recovery rate of each data block according to the page data of the first type in the cache and the page data in each data block in the solid state disk; and the determining module is used for determining the data recovery block according to the data recovery rate.
Optionally, the third obtaining module includes: the processing submodule is used for repeatedly executing the following steps until all the data blocks in the solid state disk are traversed: acquiring a block identifier of a current data block; acquiring the first type page data and the failure page data identified by the block identifier; obtaining the data recovery rate of the current data block by:
Figure BDA0000929661180000041
wherein r represents the data recovery rate of the current data block, a represents the number of pages of invalid page data of the current data block in the solid state disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
Optionally, the recovery unit includes: a relocation module, configured to relocate the valid page data in the data recovery block to the predetermined relocation location, and mark the valid page data as the invalid page data; and the erasing module is used for erasing the invalid page data in the data recovery block.
Optionally, the apparatus further comprises: a second determining unit, configured to determine the predetermined relocation position according to the size of the unwritten page data in the data blocks other than the data recovery block in the solid state disk, before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk.
According to the invention, when a recovery request for data recovery of page data in the solid state disk is acquired, the first type of page data (the valid page data that is about to be replaced from the cache and stored in the solid state disk) is relocated directly, in a single step, to the predetermined relocation position in the solid state disk. The first type of page data does not need to be replaced into the solid state disk first and then relocated again. This solves the problem in the related art of low data processing efficiency caused by secondary relocation of data and improves data processing efficiency. In addition, it reduces the number of data relocations during data recovery and cache replacement in the solid state disk, along with the extra overhead they cause, which improves the performance of the solid state disk.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of an alternative data processing method according to an embodiment of the invention;
FIG. 2 is a block diagram of an alternative data block according to an embodiment of the present invention;
FIG. 3 is a flow diagram of another alternative data processing method according to an embodiment of the invention; and
FIG. 4 is a schematic diagram of an alternative data processing apparatus according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Example 1
In this embodiment, a data processing method is provided, and fig. 1 is a flowchart of an alternative data processing method provided according to an embodiment of the present invention, as shown in fig. 1, the flowchart includes the following steps:
Step S102, acquiring a recovery request, wherein the recovery request requests data recovery of page data in the solid state disk;
Step S104, in response to the recovery request, obtaining a first type of page data from the valid page data in the cache, wherein the first type of page data is page data that is to be replaced from the cache and stored in the solid state disk;
Step S106, relocating the first type of page data from the cache to a predetermined relocation position in the solid state disk, wherein the predetermined relocation position is the storage position of the valid page data after the data recovery is performed.
Optionally, in this embodiment, the data processing method may be applied to, but is not limited to, the garbage data recovery process of a solid state disk. That is, when a recovery request for data recovery of page data in the solid state disk is acquired, the first type of page data, namely the valid page data that is about to be replaced from the cache and stored in the solid state disk, can be relocated directly to the predetermined relocation position in the solid state disk in a single step, without first being replaced into the solid state disk and then relocated again. This overcomes the problem in the related art of low data processing efficiency caused by secondary relocation of data and, while improving data processing efficiency, greatly reduces the overhead that data relocation imposes on the solid state disk.
Optionally, in this embodiment, the solid state disk includes a Flash Translation Layer (FTL), which is used to map logical addresses to physical addresses through a mapping table, to mark the page types in the solid state disk, and to detect free space and trigger data recovery when free space is insufficient. For example, a recovery request for data recovery of page data in the solid state disk is triggered when the number of erased blocks in the solid state disk accounts for less than 20% of the total number of data blocks.
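The trigger condition in the example above can be sketched as follows. This is a minimal illustration only: the function name and its parameters are hypothetical, and the 20% default is simply the example value given in the text.

```python
def should_trigger_recovery(erased_blocks: int,
                            total_blocks: int,
                            min_erased_ratio: float = 0.20) -> bool:
    """Return True when erased blocks make up less than the given share of all blocks.

    Hypothetical helper: the 0.20 default mirrors the 20% example in the text.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    return erased_blocks / total_blocks < min_erased_ratio
```

With 100 data blocks, 19 erased blocks would trigger a recovery request and 20 would not, since the condition is strictly "less than 20%".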
Optionally, in this embodiment, the page types of the effective page data include a first type of page data and a second type of page data, where the first type of page data is page data to be replaced and stored in the solid state disk from the cache, and the second type of page data is page data not replaced and stored in the solid state disk from the cache.
Optionally, in this embodiment, the obtaining, in response to the recycle request, the first type of page data from the cached valid page data includes: acquiring access frequency and modification identification of effective page data in a cache; acquiring the page type of the effective page data according to the access frequency and the modification identifier, wherein the page type of the effective page data comprises a first type of page data and a second type of page data, and the second type of page data is used for indicating that the page data stored in the solid state disk is not replaced from the cache; and separating the effective page data according to the page type of the effective page data to obtain the first type of page data.
For example, a cache layer that temporarily caches valid page data queues the pages in the cache according to the Least Recently Used (LRU) algorithm, forming an LRU queue. The LRU queue may be, but is not limited to being, divided into hot (HOT) pages and cold (COOL) pages according to a predetermined threshold. For example, if the predetermined threshold is 10, the tail 10% of the LRU queue is marked as cold (COOL) pages and the first 90% is marked as hot (HOT) pages. The LRU queue may be, but is not limited to being, sorted according to access frequency.
Further, the page data modified in the caching layer is marked as DIRTY (DIRTY) pages, and the page data not modified is marked as CLEAN (CLEAN) pages. DIRTY pages in cold pages are called cold DIRTY (COOL DIRTY) pages (identified by CD) and DIRTY pages in HOT pages are called HOT DIRTY (HOT DIRTY) pages (identified by HD).
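The hot/cold split and the dirty/clean marking described above can be combined into a single classification pass. The sketch below is illustrative only: the names `PageType`, `CachedPage` and `classify_lru_queue` are hypothetical, the queue is assumed to be ordered from head (hot end) to tail (cold end), and rounding the cold share down to a whole number of pages is an assumption the text does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class PageType(Enum):
    CLEAN = "CLEAN"
    HOT_DIRTY = "HD"
    COLD_DIRTY = "CD"


@dataclass
class CachedPage:
    page_id: int
    dirty: bool  # modification identifier: True if modified in the cache


def classify_lru_queue(lru_queue, cold_percent=10):
    """Map page_id -> PageType for a queue ordered from head (hot) to tail (cold)."""
    n = len(lru_queue)
    cold_count = n * cold_percent // 100  # assumption: round down
    first_cold = n - cold_count           # index where the cold tail begins
    types = {}
    for pos, page in enumerate(lru_queue):
        if not page.dirty:
            types[page.page_id] = PageType.CLEAN
        elif pos >= first_cold:
            types[page.page_id] = PageType.COLD_DIRTY
        else:
            types[page.page_id] = PageType.HOT_DIRTY
    return types
```

Only the pages classified as `COLD_DIRTY` correspond to the first type of page data; clean and hot dirty pages form the second type.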
It should be noted that page data in the cache has two copies: one in the solid state disk and one in the cache. If the page is clean, the contents of the two copies are identical; if the page is dirty, the copy in the cache holds the latest page data and the copy in the solid state disk holds the old page data. That is, a dirty page in the solid state disk and the dirty page in the cache are different copies of the same page: for a cold dirty page, the cache stores the new data and the solid state disk stores the old data. In this embodiment, cache replacement and data recovery are combined: when a cold dirty page in the cache is replaced, the copy in the solid state disk that holds the old data may be marked as invalid page data, and the new data for that cold dirty page is written from the cache directly to an updated position in the solid state disk (e.g., the predetermined relocation position after data recovery is performed). This avoids first moving the cold dirty page to the solid state disk through cache replacement and then performing a secondary relocation during data recovery. It thereby overcomes the problem in the related art of low data processing efficiency caused by secondary relocation of data and, while improving data processing efficiency, greatly reduces the overhead that data relocation imposes on the solid state disk.
For example, as shown in FIG. 2, each data block in the solid state disk may include, but is not limited to, the following 5 types of page data: unwritten page data (unwritten pages), invalid page data (invalid pages), and three kinds of valid page data (valid pages), namely clean page data (clean pages), hot dirty page data (hot dirty pages), and cold dirty page data (cold dirty pages). Unwritten page data is free space in the data block that has been erased or has not been allocated and can be written directly. Clean page data means that the page has been written with data and the page data has not been modified in the cache. Hot dirty page data is page data that has been modified in the cache but, because it is frequently accessed, will not be replaced from the cache for the time being. Cold dirty page data is page data that has been modified in the cache and is not frequently accessed, so it is about to be replaced by the cache replacement algorithm. Invalid page data means that the page data has been modified and the new data written to another position, so the old data is invalid. In addition, in this embodiment, the cache stores valid page data, which may include, but is not limited to, the following 3 types of page data: clean page data, hot dirty page data, and cold dirty page data.
That is, in this embodiment, the second type of page data may include, but is not limited to, clean page data and hot dirty page data, and the first type of page data may include, but is not limited to, cold dirty page data. It should be noted that, because the clean page data in the cache is identical to the content stored in the solid state disk, in this embodiment both the clean page data and the hot dirty page data may be regarded as the second type of page data, which is not replaced from the cache and stored in the solid state disk.
Optionally, in this embodiment, before migrating the first type of page data from the cache to a predetermined migration position in the solid state disk, the method further includes:
S1, determining a data recovery block of the solid state disk at least according to the first type of page data;
S2, performing data recovery on the data recovery block.
Optionally, in this embodiment, the performing data recovery on the data recovery block may include: the effective page data in the data recovery block is moved to a preset moving position, and the effective page data is marked as invalid page data; and erasing the invalid page data in the data recovery block.
Optionally, in this embodiment, the determining the data recovery block of the solid state disk according to at least the first type of page data may be, but is not limited to, obtaining data recovery rates of the data blocks in the solid state disk according to at least the first type of page data, and determining the data recovery block (e.g., determining a block identifier of the data recovery block) by comparing the obtained data recovery rates.
Optionally, in this embodiment, the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of:
1) taking the data block with the highest data recovery rate as a data recovery block;
2) if the data recovery rates of a plurality of data blocks are the same and are the highest values, comparing the number of the first type of page data in the data blocks, and taking the data block with the larger value as a data recovery block;
3) if the data recovery rates of a plurality of data blocks are the same and are the highest values, and the number of the first type of page data in the data blocks is the same, comparing the number of the second type of page data in the data blocks, and taking the data block with the smaller value as a data recovery block;
4) if the data recovery rates of a plurality of data blocks are the same and are the highest value, the number of the first type of page data in those data blocks is the same, and the number of the second type of page data in those data blocks is also the same, taking the data block with the largest block identifier as the data recovery block.
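The four comparison rules above amount to a lexicographic ordering, which can be sketched with a single sort key. The names `BlockStats`, `recovery_rate` and `choose_recovery_block` are hypothetical. The body of `recovery_rate` is an assumption: the formula appears only as a figure in the source, so the sketch assumes r = (a + b) * P / B, which is consistent with the symbol definitions but not confirmed by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockStats:
    block_id: int
    invalid_pages: int      # a: invalid pages of the block in the SSD
    first_type_pages: int   # b: cold dirty pages in the cache that map to this block
    second_type_pages: int  # clean + hot dirty pages of this block


def recovery_rate(block: BlockStats, page_size: int, block_size: int) -> float:
    # ASSUMPTION: reconstructed form r = (a + b) * P / B; not taken from the figure.
    return (block.invalid_pages + block.first_type_pages) * page_size / block_size


def choose_recovery_block(blocks, page_size: int, block_size: int) -> BlockStats:
    """Pick the block per rules 1) to 4); raises ValueError if `blocks` is empty."""
    return max(
        blocks,
        key=lambda b: (
            recovery_rate(b, page_size, block_size),  # 1) highest recovery rate
            b.first_type_pages,                       # 2) then more first-type pages
            -b.second_type_pages,                     # 3) then fewer second-type pages
            b.block_id,                               # 4) then largest block identifier
        ),
    )
```

Because Python compares tuples element by element, each later rule is consulted only when all earlier ones tie, which matches the order in which the rules are stated.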
Optionally, in this embodiment, the obtaining of the data recovery rate of each data block in the solid state disk according to at least the first type of page data may include, but is not limited to: and determining the data recovery rate according to the first type of page data and the invalid page data in each data block in the solid state disk.
Optionally, in this embodiment, before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk, the method further includes: determining the predetermined relocation position according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block. In this way, all of the valid page data can be relocated to the area corresponding to the unwritten page data in the other data blocks.
Specifically, as shown in fig. 3, the method includes:
S302, the system triggers a recovery request;
S304, acquiring the page types of all the page data in the cache, and obtaining the first type of page data from them;
S306, calculating the data recovery rate of each data block in the solid state disk;
S308, determining the data recovery block;
S310, selecting the predetermined relocation position;
S312, directly copying the second type of page data in the data recovery block to the predetermined relocation position;
S314, marking the page data in the data recovery block that corresponds to the first type of page data as invalid pages, and copying the latest first type of page data from the cache to the predetermined relocation position;
S316, erasing the data recovery block.
According to the embodiment provided by the application, when a recovery request for data recovery of page data in the solid state disk is acquired, the first type of page data, i.e. the valid page data in the cache that is to be replaced from the cache and stored in the solid state disk, is moved directly and in a single step to the predetermined relocation position in the solid state disk, without first being replaced into the solid state disk and then moved again. This solves the problem of low data processing efficiency caused by moving data twice in the related art and improves data processing efficiency. In addition, the number of data moves in the data recovery and cache replacement processes of the solid state disk, and the extra overhead they cause, are reduced, which improves the performance of the solid state disk.
As an optional scheme, the obtaining the first type of page data from the cached valid page data in response to the recycle request includes:
S1, acquiring the access frequency and the modification identifier of the valid page data in the cache;
S2, acquiring the page type of the valid page data according to the access frequency and the modification identifier, wherein the page type of the valid page data includes the first type of page data and the second type of page data, and the second type of page data indicates page data that is not replaced from the cache and stored in the solid state disk;
S3, separating the valid page data according to the page type of the valid page data to obtain the first type of page data.
Optionally, in this embodiment, the second type of page data includes first page data and second page data, and obtaining the page type of the valid page data according to the access frequency and the modification identifier includes: identifying page data that has not been modified as the first page data; identifying page data that has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and identifying page data that has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
That is, in this embodiment, the FTL in the solid state disk will detect the free space of the solid state disk in real time, and after detecting that the free space is insufficient and triggering the recycle request, the system will start to traverse the LRU queue in the cache, and mark the page type to the page data in the LRU queue.
For example, assuming that the predetermined threshold for the demarcation is 10%, the cache layer marks the 10% of pages at the tail of the LRU queue as cold pages and the remaining pages as hot pages. Further, the cold pages and the hot pages are traversed respectively, a dirty page among the cold pages is marked as a cold dirty page (CD), a dirty page among the hot pages is marked as a hot dirty page (HD), and the marked page types are notified to the FTL in the solid state disk.
That is to say, once the page types marked in the cache have been obtained, the first type of page data (i.e. the cold dirty pages) can be separated out and moved directly to the storage location that the valid page data will occupy after data recovery. This avoids moving the data twice, once for cache replacement and once for data recovery, and thereby reduces overhead.
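The marking described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `CachedPage`, `classify_lru` and `first_type_pages` are hypothetical, the queue is assumed to be ordered from most recently used to least recently used, and the cold region is taken as the last 10% of the queue by position.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    page_id: int   # hypothetical identifier of the cached page
    dirty: bool    # modification identifier: True if modified in the cache


def classify_lru(queue, cold_percent=10):
    """Label every page in an LRU queue as CLEAN, HD (hot dirty) or CD (cold dirty).

    Assumption: queue[0] is the most recently used page and queue[-1] the
    least recently used, so the tail cold_percent of the queue is "cold".
    """
    cold_count = len(queue) * cold_percent // 100
    cold_start = len(queue) - cold_count
    labels = {}
    for index, page in enumerate(queue):
        if not page.dirty:
            labels[page.page_id] = "CLEAN"
        elif index >= cold_start:
            labels[page.page_id] = "CD"
        else:
            labels[page.page_id] = "HD"
    return labels


def first_type_pages(queue, labels):
    """Separate out the first type of page data (the cold dirty pages)."""
    return [page for page in queue if labels[page.page_id] == "CD"]
```

In this sketch the clean and hot dirty pages together play the role of the second type of page data, and only the pages returned by `first_type_pages` would be handed to the FTL for relocation.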
As an optional scheme, before the first type of page data is migrated from the cache to a predetermined migration position in the solid state disk, the method further includes:
s1, determining a data recovery block of the solid state disk according to at least the first type of page data, wherein the page type of the page data in each data block in the solid state disk includes: unwritten page data, invalid page data and valid page data, wherein each data block comprises a data recovery block;
and S2, performing data recovery on the data recovery block.
Optionally, in this embodiment, the determining, according to at least the first type of page data, the data recovery block of the solid state disk includes:
S12, acquiring the data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state disk;
S14, determining the data recovery block according to the data recovery rate.
Optionally, in this embodiment, the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of:
1) taking the data block with the highest data recovery rate as a data recovery block;
2) if the data recovery rates of a plurality of data blocks are the same and are the highest values, comparing the number of the first type of page data in the data blocks, and taking the data block with the larger value as a data recovery block;
3) if the data recovery rates of a plurality of data blocks are the same and are the highest values, and the number of the first type of page data in the data blocks is the same, comparing the number of the second type of page data in the data blocks, and taking the data block with the smaller value as a data recovery block;
4) if the data recovery rates of a plurality of data blocks are the same and are the highest value, the number of the first type of page data in those data blocks is the same, and the number of the second type of page data in those data blocks is also the same, taking the data block with the largest block identifier as the data recovery block.
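The four comparison rules above amount to a lexicographic ordering, which can be sketched in Python as a single sort key. The `BlockStats` record and the function name are hypothetical, and the sketch assumes that the per-block recovery rates and page counts have already been computed.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    block_id: int            # block identifier
    recovery_rate: float     # data recovery rate of the block
    first_type_pages: int    # number of first-type (cold dirty) pages
    second_type_pages: int   # number of second-type (clean or hot dirty) pages


def choose_recovery_block(blocks):
    """Pick the data recovery block using rules 1) to 4) in order.

    Higher recovery rate wins; ties go to more first-type pages, then to
    fewer second-type pages (hence the negation), then to the largest
    block identifier.
    """
    return max(
        blocks,
        key=lambda b: (b.recovery_rate, b.first_type_pages,
                       -b.second_type_pages, b.block_id),
    )
```

Because the rates are compared for exact equality, a practical implementation might compare integer page counts rather than floating-point ratios; that choice is outside what the text specifies.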
It should be noted that, in this embodiment, the mapping table of the cache layer stores not only the page type marked for each page of data, but also the predetermined location information (such as the block identifier of a data block) at which the page data will reside after it is replaced from the cache and stored in the solid state disk.
For example, assuming that the page data of the first type is a cold dirty page, the FTL may count the number of cold dirty pages marked as the page data of the first type acquired by the cache and the number of invalid page data in each data block of the solid state disk, acquire the number of invalid pages and the number of cold dirty pages in each data block by using the data block as a unit, and calculate the data recovery rate of each data block by using the number of invalid pages and the number of cold dirty pages.
Through the embodiment provided by the application, the data recovery block used for data recovery in the solid state disk is located accurately by means of the data recovery rate, so that accurate and efficient data recovery is achieved for the solid state disk and the data processing efficiency of the solid state disk is ensured.
As an optional scheme, acquiring a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state disk includes:
S1, repeatedly executing the following steps until all data blocks in the solid state disk are traversed:
S12, acquiring the block identifier of the current data block;
S14, acquiring the first type of page data and the invalid page data identified by the block identifier;
S16, acquiring the data recovery rate of the current data block by the following formula:
[Formula image BDA0000929661180000101: expression for the data recovery rate r, not reproduced in this text]
wherein r represents the data recovery rate of the current data block, a represents the number of pages of invalid page data of the current data block in the solid state disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
It should be noted that, in this embodiment, the unit of a read/write operation in the solid state disk is a page, where the page size is usually 2 KB and the access delay is usually 15 µs to 200 µs. The unit of an erase operation is a block, where the block size is typically 128 KB, and an overhead of around 2 ms is required to erase a block.
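The formula itself survives only as an image reference in this text. One reading that is consistent with the symbol definitions is r = (a + b) × P / B, i.e. the fraction of a block occupied by pages that need not be copied out of it; this reading is an assumption, not a transcription of the original formula. A minimal Python sketch under that assumption, using the typical page and block sizes mentioned above (the function and constant names are hypothetical):

```python
PAGE_SIZE = 2 * 1024      # P: typical page size of 2 KB
BLOCK_SIZE = 128 * 1024   # B: typical block size of 128 KB


def recovery_rate(invalid_pages, cold_dirty_pages,
                  page_size=PAGE_SIZE, block_size=BLOCK_SIZE):
    """Assumed reading of the formula: r = (a + b) * P / B.

    invalid_pages    -- a, invalid pages of the current block in the SSD
    cold_dirty_pages -- b, first-type pages in the cache that map to the block
    """
    return (invalid_pages + cold_dirty_pages) * page_size / block_size


# With these sizes a block holds 64 pages.
pages_per_block = BLOCK_SIZE // PAGE_SIZE
```

Under this reading, a block with 10 invalid pages and 6 cold dirty pages has a rate of 16 × 2 KB / 128 KB = 0.25.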
Through the embodiment provided by the application, the data recovery rate of each data block in the solid state disk is calculated in sequence through the mode, so that the accuracy of the determined data recovery block is ensured, and accurate and efficient data recovery of the solid state disk is realized.
As an optional scheme, performing data reclamation on the data reclamation block includes:
S1, moving the valid page data in the data recovery block to the predetermined relocation position, and marking that valid page data in the data recovery block as invalid page data;
S2, erasing the invalid page data in the data recovery block.
For example, taking cold dirty pages as the first type of page data and clean pages and hot dirty pages as the second type of page data, during data recovery the clean pages and hot dirty pages in the data recovery block may be copied directly to the predetermined relocation position, the location information corresponding to that page data in the FTL is modified, and the corresponding valid page data in the data recovery block is marked as invalid page data (which may be represented by an invalid page).
Further, the cold-dirty page in the data recovery block is also marked as invalid page data, and the latest data of the cold-dirty page in the cache is copied to a preset relocation position. And then deleting the latest data of the cold and dirty page in the cache, and modifying the position information corresponding to the page data in the FTL.
And then erasing the page data in the data recovery block, and marking the data recovery block as erased so as to realize the purposes of recovering the data of the solid state disk and releasing the free space.
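The three paragraphs above can be summarized in a small Python sketch. It models the data recovery block, the cache, the relocation target and the FTL mapping as plain dictionaries; all names are hypothetical, and treating a valid page that has no cached copy like a second-type page is an assumption of the sketch, not something the text states.

```python
def reclaim_block(block, cache, target, mapping, target_id):
    """Relocate the valid pages of a data recovery block, then erase it.

    block   -- dict: logical page -> (state, data), state in
               {"valid", "invalid", "unwritten"}
    cache   -- dict: logical page -> (page_type, data), page_type in
               {"CLEAN", "HD", "CD"}
    target  -- dict receiving the pages at the relocation position
    mapping -- dict: logical page -> block identifier (stand-in for the FTL)
    """
    for page, (state, data) in list(block.items()):
        if state != "valid":
            continue
        page_type, cached_data = cache.get(page, ("CLEAN", data))
        if page_type == "CD":
            # Cold dirty: write the latest copy straight from the cache,
            # so the page is moved once instead of twice.
            target[page] = cached_data
            del cache[page]
        else:
            # Clean or hot dirty: copy the on-disk copy as it is.
            target[page] = data
        mapping[page] = target_id          # update the location in the FTL
        block[page] = ("invalid", data)    # old copy is now invalid
    block.clear()                          # erase the whole block
    return "erased"
```

Note that a hot dirty page keeps its newer copy in the cache; only the older on-disk copy is relocated, which matches the direct copy described for second-type pages.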
Through the embodiment provided by the application, the data recovery of the solid state disk is realized through the above manner, so that the first type page data and the second type page data in the effective page data can be moved to the preset moving position at one time, the second moving of the first type page data is avoided, and the effect of reducing the overhead of the solid state disk is realized.
As an optional scheme, before the first type of page data is migrated from the cache to a predetermined migration position in the solid state disk, the method further includes:
S1, determining the predetermined relocation position according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block.
For example, the size of the valid page data in the data recovery block is counted, the FTL, which is updated in real time, is consulted to find unwritten pages in the other data blocks of the solid state disk that can hold valid page data of that size, and the data block found is used as the predetermined relocation position for the valid page data of the data recovery block.
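A minimal Python sketch of this lookup follows. The text does not say which block is chosen when several qualify, so the first-fit order by block identifier used here is an assumption, as are the function name and the dictionary standing in for the FTL's bookkeeping.

```python
def choose_relocation_block(unwritten_pages, recovery_block_id, valid_page_count):
    """Return the identifier of a block that can hold all valid pages, or None.

    unwritten_pages   -- dict: block identifier -> number of unwritten pages
    recovery_block_id -- block being reclaimed, excluded from the search
    valid_page_count  -- number of valid pages that must be relocated
    """
    for block_id in sorted(unwritten_pages):
        if block_id == recovery_block_id:
            continue
        if unwritten_pages[block_id] >= valid_page_count:
            return block_id
    return None
```

For instance, with unwritten page counts `{0: 64, 1: 3, 2: 20}`, recovery block 0 and 10 valid pages to move, the sketch selects block 2.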
According to the embodiment provided by the application, the preset moving position is determined according to the size of the unwritten page data in other data blocks except the data recovery block in the solid state disk, so that all effective page data in the data recovery block can be moved.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
In this embodiment, a data processing apparatus is further provided, and the apparatus is used to implement the foregoing embodiments and preferred embodiments, and details are not repeated for what has been described. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 4 is a schematic diagram of an alternative data processing apparatus according to an embodiment of the present invention, as shown in fig. 4, the apparatus includes:
1) a first obtaining unit 402, configured to obtain a recovery request, where the recovery request is used to request data recovery of page data in a solid state disk;
2) a second obtaining unit 404, configured to obtain, in response to the recycle request, a first type of page data from the cached valid page data, where the first type of page data is used to indicate that the page data stored in the solid state disk is to be replaced from the cache;
3) the relocation unit 406 is configured to relocate the first type of page data from the cache to a predetermined relocation location in the solid state disk, where the predetermined relocation location is a storage location of valid page data after the data recovery is performed.
Optionally, in this embodiment, the data processing apparatus may be applied to, but is not limited to, the garbage data recovery process of a solid state disk. That is, when a recovery request for data recovery of page data in the solid state disk is acquired, the first type of page data, i.e. the valid page data in the cache that is to be replaced from the cache and stored in the solid state disk, can be moved directly and in a single step to the predetermined relocation position in the solid state disk, instead of first being replaced into the solid state disk and then moved again. This overcomes the low data processing efficiency caused by moving data twice in the related art; data processing efficiency is improved, and the overhead that data movement imposes on the solid state disk is greatly reduced.
Optionally, in this embodiment, the solid state disk includes a Flash Translation Layer (FTL), which is configured to map logical addresses to physical addresses through a mapping table, to mark the page types in the solid state disk, and to detect free space and trigger data recovery when the free space is insufficient; for example, a recovery request for requesting data recovery of page data in the solid state disk is triggered when the erased blocks in the solid state disk account for less than 20% of the total number of data blocks.
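The trigger condition in this example can be written as a one-line check. The sketch below is only an illustration: the function name is hypothetical, the 20% figure is the example value from the text, and integer arithmetic is used so that the boundary case is exact.

```python
def needs_recovery(erased_blocks, total_blocks, threshold_percent=20):
    """True when erased blocks make up less than threshold_percent of all blocks."""
    return erased_blocks * 100 < total_blocks * threshold_percent
```

With 100 data blocks, 19 erased blocks would trigger a recovery request, while exactly 20 would not.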
Optionally, in this embodiment, the page types of the effective page data include a first type of page data and a second type of page data, where the first type of page data is page data to be replaced and stored in the solid state disk from the cache, and the second type of page data is page data not replaced and stored in the solid state disk from the cache.
Optionally, in this embodiment, the second obtaining unit 404 includes: (1) the first acquisition module is used for acquiring the access frequency and the modification identification of the effective page data in the cache; (2) the second obtaining module is used for obtaining the page type of the effective page data according to the access frequency and the modification identifier, wherein the page type of the effective page data comprises the first type of page data and the second type of page data, and the second type of page data is used for indicating that the page data stored in the solid state disk is not replaced from the cache; (3) and the separation module is used for separating the effective page data according to the page type of the effective page data to obtain the first type of page data.
For example, a cache layer (Cache Layer) for temporarily caching valid page data queues the pages in the cache according to a Least Recently Used (LRU) algorithm, forming an LRU queue. The LRU queue may be, but is not limited to being, divided into hot (HOT) pages and cold (COOL) pages according to a predetermined threshold; for example, if the predetermined threshold is 10%, the last 10% of the pages in the LRU queue are marked as cold (COOL) pages and the first 90% are marked as hot (HOT) pages. The LRU queue may be, but is not limited to being, sorted according to access frequency.
Further, the page data modified in the caching layer is marked as DIRTY (DIRTY) pages, and the page data not modified is marked as CLEAN (CLEAN) pages. DIRTY pages in cold pages are called cold DIRTY (COOL DIRTY) pages (identified by CD) and DIRTY pages in HOT pages are called HOT DIRTY (HOT DIRTY) pages (identified by HD).
It should be noted that there are two copies of the page data in the cache, one is a copy in the solid state disk, and the other is a copy in the cache. If the page is a clean page, the contents of the two copies are completely the same; if the page is dirty, the copy in the cache is the latest page data, and the copy in the solid state disk is the old page data. That is to say, the dirty page in the solid state disk and the dirty page in the cache are different copies of the same page, the cold dirty page in the cache stores new data, and the solid state disk stores old data of the cold dirty page. In this embodiment, by flexibly combining cache replacement and data recovery, when a cold dirty page in a cache is replaced, the cold dirty page in the solid state disk, where old data is stored, may be marked as invalid page data, and new data corresponding to the cold dirty page in the cache is directly written into an updated position (e.g., a predetermined relocation position after data recovery is performed) in the solid state disk. Therefore, the steps that cold and dirty pages are moved to the solid state disk through cache replacement and then secondary moving is executed in the data recovery process are avoided, the problem that data processing efficiency is low due to secondary moving of data in the related technology is solved, data processing efficiency is improved, and meanwhile, expenditure of the solid state disk due to data moving is greatly reduced.
For example, as shown in fig. 2, each data block in the solid state disk may include, but is not limited to, the following 5 types of page data: unwritten page data (which may be represented by unwritten pages), invalid page data (which may be represented by invalid pages), and valid page data (which may be represented by valid pages), where the valid page data includes clean page data (clean pages), hot dirty page data (hot dirty pages), and cold dirty page data (cold dirty pages). Unwritten page data is free space in the data block that has been erased or has not been allocated and can be written directly. Clean page data means that data has been written to the page and the page data has not been modified in the cache. Hot dirty page data is page data that has been modified in the cache but, because it is accessed frequently, will not be replaced from the cache for the time being. Cold dirty page data is page data that has been modified in the cache and is not accessed frequently, i.e. page data that is about to be replaced by the cache replacement algorithm. Invalid page data means that the page data has been modified and the new data has been written to another position, so the old data is invalid. In addition, in this embodiment, the cache is used for storing valid page data, which may include, but is not limited to, the following 3 types of page data: clean page data, hot dirty page data, and cold dirty page data.
That is, in this embodiment, the second type of page data may include, but is not limited to, clean page data and hot dirty page data, and the first type of page data may include, but is not limited to, cold dirty page data. It should be noted that, because the copy of clean page data in the cache is consistent with the content stored in the solid state disk, in this embodiment clean page data and hot dirty page data may both be regarded as the second type of page data, which is not replaced from the cache and stored in the solid state disk.
Optionally, in this embodiment, the apparatus further includes: (1) the first determining unit is configured to determine a data recovery block of the solid state disk at least according to the first type of page data before the first type of page data is migrated from the cache to a predetermined migration position in the solid state disk, where the page type of the page data in each data block in the solid state disk includes: unwritten page data, invalid page data and valid page data, wherein each data block comprises a data recovery block; (2) and the recovery unit is used for recovering the data of the data recovery block.
Optionally, in this embodiment, the foregoing recovery unit performs data recovery on the data recovery block by: the effective page data in the data recovery block is moved to a preset moving position, and the effective page data is marked as invalid page data; and erasing the invalid page data in the data recovery block.
Optionally, in this embodiment, the first determining unit determines the data recycling block of the solid state disk according to at least the first type of page data by: and acquiring the data recovery rate of each data block in the solid state disk at least according to the first type of page data, and determining the data recovery block (such as determining the block identifier of the data recovery block) by comparing the acquired data recovery rates.
Optionally, in this embodiment, the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of:
1) taking the data block with the highest data recovery rate as a data recovery block;
2) if the data recovery rates of a plurality of data blocks are the same and are the highest values, comparing the number of the first type of page data in the data blocks, and taking the data block with the larger value as a data recovery block;
3) if the data recovery rates of a plurality of data blocks are the same and are the highest values, and the number of the first type of page data in the data blocks is the same, comparing the number of the second type of page data in the data blocks, and taking the data block with the smaller value as a data recovery block;
4) if the data recovery rates of a plurality of data blocks are the same and are the highest value, the number of the first type of page data in those data blocks is the same, and the number of the second type of page data in those data blocks is also the same, taking the data block with the largest block identifier as the data recovery block.
Optionally, in this embodiment, the obtaining of the data recovery rate of each data block in the solid state disk according to at least the first type of page data may include, but is not limited to: and determining the data recovery rate according to the first type of page data and the invalid page data in each data block in the solid state disk.
Optionally, in this embodiment, before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk, the predetermined relocation position is determined according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block. In this way, all of the valid page data can be relocated to the area corresponding to the unwritten page data in the other data blocks.
Specifically, taking the following example, as shown in fig. 3, the data processing apparatus may implement data recovery on a solid state disk by the following steps:
S302, the system triggers a recovery request;
S304, acquiring the page types of all the page data in the cache, and obtaining the first type of page data from them;
S306, calculating the data recovery rate of each data block in the solid state disk;
S308, determining the data recovery block;
S310, selecting the predetermined relocation position;
S312, directly copying the second type of page data in the data recovery block to the predetermined relocation position;
S314, marking the page data in the data recovery block that corresponds to the first type of page data as invalid pages, and copying the latest first type of page data from the cache to the predetermined relocation position;
S316, erasing the data recovery block.
According to the embodiment provided by the application, when a recovery request for data recovery of page data in the solid state disk is acquired, the first type of page data, i.e. the valid page data in the cache that is to be replaced from the cache and stored in the solid state disk, is moved directly and in a single step to the predetermined relocation position in the solid state disk, without first being replaced into the solid state disk and then moved again. This solves the problem of low data processing efficiency caused by moving data twice in the related art and improves data processing efficiency. In addition, the number of data moves in the data recovery and cache replacement processes of the solid state disk, and the extra overhead they cause, are reduced, which improves the performance of the solid state disk.
As an optional solution, the second obtaining unit includes:
1) the first acquisition module is used for acquiring the access frequency and the modification identification of the effective page data in the cache;
2) the second obtaining module is used for obtaining the page type of the effective page data according to the access frequency and the modification identifier, wherein the page type of the effective page data comprises the first type of page data and the second type of page data, and the second type of page data is used for indicating that the page data stored in the solid state disk is not replaced from the cache;
3) and the separation module is used for separating the effective page data according to the page type of the effective page data to obtain the first type of page data.
Optionally, in this embodiment, the second type of page data may include first page data and second page data, and the second obtaining module obtains the page type of the valid page data in the following manner: page data that has not been modified is identified as the first page data; page data that has been modified and whose access frequency is greater than or equal to a first predetermined threshold is identified as the second page data; and page data that has been modified and whose access frequency is less than the first predetermined threshold is identified as the first type of page data.
That is, in this embodiment, the FTL in the solid state disk will detect the free space of the solid state disk in real time, and after detecting that the free space is insufficient and triggering the recycle request, the system will start to traverse the LRU queue in the cache, and mark the page type to the page data in the LRU queue.
For example, assuming that the predetermined threshold for the demarcation is 10%, the cache layer marks the 10% of pages at the tail of the LRU queue as cold pages and the remaining pages as hot pages. Further, the cold pages and the hot pages are traversed respectively, a dirty page among the cold pages is marked as a cold dirty page (CD), a dirty page among the hot pages is marked as a hot dirty page (HD), and the marked page types are notified to the FTL in the solid state disk.
That is to say, once the page types marked in the cache have been obtained, the first type of page data (i.e. the cold dirty pages) can be separated out and moved directly to the storage location that the valid page data will occupy after data recovery. This avoids moving the data twice, once for cache replacement and once for data recovery, and thereby reduces overhead.
As an optional scheme, the method further comprises the following steps:
1) the first determining unit is configured to determine a data recovery block of the solid state disk at least according to the first type of page data before the first type of page data is migrated from the cache to a predetermined migration position in the solid state disk, where the page type of the page data in each data block in the solid state disk includes: unwritten page data, invalid page data and valid page data, wherein each data block comprises a data recovery block;
2) and the recovery unit is used for recovering the data of the data recovery block.
Optionally, in this embodiment, the first determining unit includes:
(1) the third acquisition module is used for acquiring the data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state disk;
(2) and the determining module is used for determining the data recovery block according to the data recovery rate.
Optionally, in this embodiment, the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of:
1) taking the data block with the highest data recovery rate as a data recovery block;
2) if the data recovery rates of a plurality of data blocks are the same and are the highest values, comparing the number of the first type of page data in the data blocks, and taking the data block with the larger value as a data recovery block;
3) if the data recovery rates of a plurality of data blocks are the same and are the highest values, and the number of the first type of page data in the data blocks is the same, comparing the number of the second type of page data in the data blocks, and taking the data block with the smaller value as a data recovery block;
4) if the data recovery rates of a plurality of data blocks are the same and are the highest value, the number of the first type of page data in those data blocks is the same, and the number of the second type of page data in those data blocks is also the same, taking the data block with the largest block identifier as the data recovery block.
It should be noted that, in this embodiment, the mapping table of the cache layer serving as the cache stores not only the page type with which each piece of page data is marked, but also the predetermined location information (such as the block identifier of a data block) at which the page data will be stored once it is evicted from the cache into the solid state disk.
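To make the role of this mapping table concrete, the sketch below shows one possible shape for an entry and how per-block counts of first type pages could be derived from it. The names `PageType`, `CacheMapEntry` and `cold_dirty_per_block`, and the keying by logical page number, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class PageType(Enum):
    CLEAN = "clean"
    HOT_DIRTY = "hot_dirty"
    COLD_DIRTY = "cold_dirty"   # treated here as the first type of page data


@dataclass
class CacheMapEntry:
    """Hypothetical mapping-table entry for one cached page."""
    page_type: PageType   # page type the page data is marked with
    block_id: int         # block the page will occupy once evicted to the SSD


def cold_dirty_per_block(cache_map):
    """Count cached cold dirty pages per target block.

    cache_map maps a logical page number (assumed key) to a CacheMapEntry.
    """
    counts = {}
    for entry in cache_map.values():
        if entry.page_type is PageType.COLD_DIRTY:
            counts[entry.block_id] = counts.get(entry.block_id, 0) + 1
    return counts
```

Storing the block identifier next to the page type is what lets the counts be grouped by data block without consulting the solid state disk.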
For example, assuming that the first type of page data is cold dirty pages, the FTL may count, with the data block as the unit, the number of cold dirty pages in the cache that are marked as first type page data and the number of invalid pages in each data block of the solid state disk, and then calculate the data recovery rate of each data block from these two numbers.
Through the embodiment provided by the present application, the data recovery block on which data recovery is to be performed is located accurately by means of the data recovery rate, which enables accurate and efficient data recovery for the solid state disk and in turn ensures the efficiency of data processing in the solid state disk.
As an optional scheme, the third obtaining module includes:
1) a processing submodule, configured to repeatedly execute the following steps until all data blocks in the solid state disk have been traversed:
S1, acquiring the block identifier of the current data block;
S2, acquiring the first type of page data and the invalid page data identified by the block identifier;
S3, acquiring the data recovery rate of the current data block in the following manner:
Figure BDA0000929661180000171
where r represents the data recovery rate of the current data block, a represents the number of pages of invalid page data of the current data block in the solid state disk, b represents the number of pages of first type page data in the cache, P represents the page size, and B represents the block size.
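The formula itself is only available as an image and is not reproduced in this text. As a sketch under a stated assumption, the code below uses r = (a + b) × P / B, which is one form consistent with the variable definitions (reclaimable pages times page size, divided by block size); it should not be taken as the exact published formula. The function names and the default sizes are hypothetical.

```python
PAGE_SIZE = 2 * 1024      # P in bytes (illustrative default)
BLOCK_SIZE = 128 * 1024   # B in bytes (illustrative default)


def recovery_rate(invalid_pages, cached_first_type_pages,
                  page_size=PAGE_SIZE, block_size=BLOCK_SIZE):
    """Assumed form r = (a + b) * P / B; the published formula may differ."""
    return (invalid_pages + cached_first_type_pages) * page_size / block_size


def recovery_rates(invalid_by_block, cached_by_block):
    """Traverse all blocks (steps S1 to S3) and return {block_id: r}.

    invalid_by_block: {block_id: a}, invalid pages of each block in the SSD
    cached_by_block:  {block_id: b}, cached first type pages mapped to each block
    """
    return {
        block_id: recovery_rate(a, cached_by_block.get(block_id, 0))
        for block_id, a in invalid_by_block.items()
    }
```

For instance, under this assumed form a block with 16 invalid pages and 16 cached first type pages yields a rate of 0.5 with the default sizes.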
It should be noted that, in this embodiment, read and write operations in the solid state disk are performed in units of pages, where the page size is typically 2 KB and the access latency is typically 15 µs to 200 µs. Erase operations are performed in units of blocks, where the block size is typically 128 KB and erasing one block takes about 2 ms.
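As a quick check of the typical figures quoted above, the number of pages per block and the cost of one erase relative to a single page access work out as follows (plain arithmetic on the stated values, nothing more):

```latex
\frac{128\ \text{KB}}{2\ \text{KB}} = 64 \ \text{pages per block},
\qquad
\frac{2\ \text{ms}}{200\ \mu\text{s}} = 10,
\qquad
\frac{2\ \text{ms}}{15\ \mu\text{s}} \approx 133
```

So one block erase costs roughly as much as 10 to 133 individual page accesses, which is why the choice of the block to erase matters.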
Through the embodiment provided by the present application, the data recovery rate of each data block in the solid state disk is calculated in turn in the above manner, which ensures that the data recovery block is determined accurately and enables accurate and efficient data recovery for the solid state disk.
As an alternative, the recovery unit comprises:
1) a moving module, configured to relocate the valid page data in the data recovery block to the predetermined relocation position and mark that valid page data as invalid page data;
2) an erasing module, configured to erase the invalid page data in the data recovery block.
For example, taking cold dirty pages as the first type of page data and clean pages and hot dirty pages as the second type of page data, during data recovery the clean pages and hot dirty pages in the data recovery block may be copied directly to the predetermined relocation position, and the location information corresponding to that page data in the FTL is modified accordingly. The corresponding valid page data in the data recovery block is then marked as invalid page data (which may be represented as invalid pages).
Further, the cold dirty pages in the data recovery block are also marked as invalid page data, and the latest data of those cold dirty pages in the cache is copied to the predetermined relocation position. The latest data of the cold dirty pages is then deleted from the cache, and the location information corresponding to that page data in the FTL is modified.
Finally, the page data in the data recovery block is erased and the data recovery block is marked as erased, thereby recovering the data of the solid state disk and releasing free space.
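The recovery sequence described above can be sketched as follows. This is an illustrative model only: the dictionary layouts for the solid state disk, the FTL and the cache, and the names `blank_page` and `reclaim_block`, are assumptions introduced here and do not appear in the source.

```python
def blank_page():
    """An erased (unwritten) page slot."""
    return {"state": "unwritten", "lpn": None, "data": None}


def reclaim_block(ssd, ftl, cache, victim_id, target_id):
    """Relocate valid pages of the victim block, then erase it.

    ssd:   {block_id: [page, ...]}, page = {"state", "lpn", "data"}
    ftl:   {lpn: (block_id, page_index)} location information
    cache: {lpn: {"type": "clean" | "hot_dirty" | "cold_dirty", "data": ...}}
    """
    victim, target = ssd[victim_id], ssd[target_id]
    free_slots = [i for i, p in enumerate(target) if p["state"] == "unwritten"]
    valid_pages = [p for p in victim if p["state"] == "valid"]
    if len(valid_pages) > len(free_slots):
        raise ValueError("target block has too few unwritten pages")

    for page, slot in zip(valid_pages, free_slots):
        lpn = page["lpn"]
        cached = cache.get(lpn)
        if cached is not None and cached["type"] == "cold_dirty":
            # Cold dirty page: write the latest cached copy and drop it from the cache.
            data = cache.pop(lpn)["data"]
        else:
            # Clean or hot dirty page: copy the on-flash data directly.
            data = page["data"]
        target[slot] = {"state": "valid", "lpn": lpn, "data": data}
        ftl[lpn] = (target_id, slot)   # update the location information
        page["state"] = "invalid"      # old copy becomes invalid page data

    # Erase the whole victim block; every page becomes unwritten again.
    ssd[victim_id] = [blank_page() for _ in victim]
```

Because the cached copy of a cold dirty page is written during the same pass, that page does not have to be written to flash again when it is later evicted.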
Through the embodiment provided by the present application, data recovery of the solid state disk is performed in the above manner, so that both the first type of page data and the second type of page data among the valid page data are relocated to the predetermined relocation position in a single pass. This avoids relocating the first type of page data a second time and thereby reduces the overhead of the solid state disk.
As an optional scheme, the apparatus further comprises:
1) a second determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk, the predetermined relocation position according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block.
For example, the size of the valid page data in the data recovery block is counted, the FTL, which is updated in real time, is consulted to find unwritten pages in other data blocks of the solid state disk that can hold valid page data of that size, and the data block found is used as the predetermined relocation position for the valid page data of the data recovery block.
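A minimal sketch of this lookup is given below. The text only requires that the chosen block have enough unwritten pages; the best-fit preference (fewest unwritten pages that still suffice) and the function name are assumptions added for illustration.

```python
def find_relocation_block(unwritten_pages, victim_id, pages_needed):
    """Return the id of a block able to hold the valid pages, or None.

    unwritten_pages: {block_id: number of unwritten pages}, as read from the FTL
    victim_id:       the data recovery block, which is excluded from the search
    pages_needed:    number of valid pages to relocate
    """
    candidates = [
        (free, block_id)
        for block_id, free in unwritten_pages.items()
        if block_id != victim_id and free >= pages_needed
    ]
    if not candidates:
        return None
    # Assumed policy: best fit, with ties broken by the smaller block id.
    return min(candidates)[1]
```

Returning `None` when no single block fits leaves the caller free to choose another policy, such as spreading the pages over several blocks.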
According to the embodiment provided by the present application, the predetermined relocation position is determined according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block, which ensures that all valid page data in the data recovery block can be relocated.
It should be noted that the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in a plurality of processors respectively.
Example 3
An embodiment of the present invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1, acquiring a recovery request, where the recovery request is used to request data recovery for page data in the solid state disk;
S2, in response to the recovery request, acquiring the first type of page data from the valid page data in the cache, where the first type of page data indicates page data that is to be evicted from the cache and stored in the solid state disk;
S3, relocating the first type of page data from the cache to the predetermined relocation position in the solid state disk, where the predetermined relocation position is the storage position of the valid page data after data recovery is performed.
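Step S2 depends on telling the first type of page data apart from the rest of the cached pages. According to the claims, this uses the access frequency and the modification identifier of each page. The sketch below illustrates that classification. The function names and the tuple layout are hypothetical, and treating a modified page below the threshold as cold dirty is an assumption consistent with the cold/hot terminology used in the description.

```python
def classify_page(access_count, modified, hot_threshold):
    """Classify one cached page from its access frequency and modification flag."""
    if not modified:
        return "clean"                      # second type: unmodified page
    if access_count >= hot_threshold:
        return "hot_dirty"                  # second type: modified, frequently accessed
    return "cold_dirty"                     # first type: modified, rarely accessed


def first_type_pages(cache_pages, hot_threshold):
    """Return the logical page numbers of first type (cold dirty) pages.

    cache_pages: {lpn: (access_count, modified)}
    """
    return [
        lpn
        for lpn, (access_count, modified) in cache_pages.items()
        if classify_page(access_count, modified, hot_threshold) == "cold_dirty"
    ]
```

Only the pages returned by `first_type_pages` would then be written to the predetermined relocation position in step S3.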
Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein. Alternatively, they may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A data processing method, comprising:
acquiring a recovery request, wherein the recovery request is used for requesting data recovery of page data in a solid state disk;
in response to the recovery request, acquiring a first type of page data from valid page data in a cache, wherein the first type of page data indicates page data that is to be evicted from the cache and stored in the solid state disk;
and relocating the first type of page data from the cache to a predetermined relocation position in the solid state disk, wherein the predetermined relocation position is the storage position of the valid page data after the data recovery is performed.
2. The method of claim 1, wherein acquiring the first type of page data from the valid page data in the cache in response to the recovery request comprises:
acquiring the access frequency and the modification identifier of the valid page data in the cache;
acquiring the page type of the valid page data according to the access frequency and the modification identifier, wherein the page types of the valid page data comprise the first type of page data and a second type of page data, and the second type of page data indicates page data that is not to be evicted from the cache and stored in the solid state disk;
and separating the valid page data according to the page type of the valid page data to obtain the first type of page data.
3. The method of claim 2, wherein the second type of page data comprises first page data and second page data, and wherein acquiring the page type of the valid page data according to the access frequency and the modification identifier comprises:
taking page data whose modification identifier indicates that it has not been modified as the first page data; taking page data whose access frequency is greater than or equal to a first predetermined threshold and whose modification identifier indicates that it has been modified as the second page data; and taking page data whose access frequency is less than the first predetermined threshold and whose modification identifier indicates that it has been modified as the first type of page data.
4. The method according to claim 2, further comprising, before relocating the first type of page data from the cache to the predetermined relocation position in the solid state disk:
determining a data recovery block of the solid state disk at least according to the first type of page data, wherein the page types of the page data in each data block in the solid state disk comprise: unwritten page data, invalid page data and the valid page data, and the data blocks comprise the data recovery block;
and performing the data recovery on the data recovery block.
5. The method of claim 4, wherein determining the data recovery block of the solid state disk at least according to the first type of page data comprises:
acquiring the data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state disk;
and determining the data recovery block according to the data recovery rate.
6. The method of claim 5, wherein obtaining the data recovery rate of each data block according to the page data of the first type in the cache and the page data in each data block in the solid state disk comprises:
repeatedly executing the following steps until all the data blocks in the solid state disk are traversed:
acquiring a block identifier of a current data block;
acquiring the first type of page data and the invalid page data identified by the block identifier;
obtaining the data recovery rate of the current data block by:
Figure FDA0000929661170000021
wherein r represents the data recovery rate of the current data block, a represents the number of pages of the invalid page data of the current data block in the solid state disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
7. The method of claim 4, wherein performing the data recovery on the data recovery block comprises:
relocating the valid page data in the data recovery block to the predetermined relocation position, and marking the valid page data as the invalid page data;
and erasing the invalid page data in the data recovery block.
8. The method according to claim 7, further comprising, before relocating the first type of page data from the cache to the predetermined relocation position in the solid state disk:
determining the predetermined relocation position according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block.
9. A data processing apparatus, comprising:
a first obtaining unit, configured to obtain a recovery request, where the recovery request is used for requesting data recovery of page data in a solid state disk;
a second obtaining unit, configured to obtain, in response to the recovery request, a first type of page data from valid page data in a cache, where the first type of page data indicates page data that is to be evicted from the cache and stored in the solid state disk;
and a moving unit, configured to relocate the first type of page data from the cache to a predetermined relocation position in the solid state disk, where the predetermined relocation position is the storage position of the valid page data after the data recovery is performed.
10. The apparatus of claim 9, wherein the second obtaining unit comprises:
a first obtaining module, configured to obtain the access frequency and the modification identifier of the valid page data in the cache;
a second obtaining module, configured to obtain the page type of the valid page data according to the access frequency and the modification identifier, where the page types of the valid page data include the first type of page data and a second type of page data, and the second type of page data indicates page data that is not to be evicted from the cache and stored in the solid state disk;
and a separation module, configured to separate the valid page data according to the page type of the valid page data to obtain the first type of page data.
11. The apparatus of claim 10, wherein the second type of page data comprises first page data and second page data, and wherein the second obtaining module obtains the page type of the valid page data by:
taking page data whose modification identifier indicates that it has not been modified as the first page data; taking page data whose access frequency is greater than or equal to a first predetermined threshold and whose modification identifier indicates that it has been modified as the second page data; and taking page data whose access frequency is less than the first predetermined threshold and whose modification identifier indicates that it has been modified as the first type of page data.
12. The apparatus of claim 10, further comprising:
a first determining unit, configured to determine a data recovery block of the solid state disk at least according to the first type of page data before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk, where the page types of the page data in each data block in the solid state disk include: unwritten page data, invalid page data and the valid page data, and the data blocks include the data recovery block;
and a recovery unit, configured to perform the data recovery on the data recovery block.
13. The apparatus of claim 12, wherein the first determining unit comprises:
a third obtaining module, configured to obtain a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state disk;
and the determining module is used for determining the data recovery block according to the data recovery rate.
14. The apparatus of claim 13, wherein the third obtaining module comprises:
a processing submodule, configured to repeatedly execute the following steps until all the data blocks in the solid state disk have been traversed:
acquiring a block identifier of a current data block;
acquiring the first type of page data and the invalid page data identified by the block identifier;
obtaining the data recovery rate of the current data block by:
Figure FDA0000929661170000041
wherein r represents the data recovery rate of the current data block, a represents the number of pages of the invalid page data of the current data block in the solid state disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
15. The apparatus of claim 12, wherein the recovery unit comprises:
a moving module, configured to relocate the valid page data in the data recovery block to the predetermined relocation position and mark the valid page data as the invalid page data;
and an erasing module, configured to erase the invalid page data in the data recovery block.
16. The apparatus of claim 15, further comprising:
a second determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation position in the solid state disk, the predetermined relocation position according to the size of the unwritten page data in the data blocks of the solid state disk other than the data recovery block.
CN201610103929.6A 2016-02-25 2016-02-25 Data processing method and device Active CN107122124B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610103929.6A CN107122124B (en) 2016-02-25 2016-02-25 Data processing method and device
PCT/CN2017/074290 WO2017143972A1 (en) 2016-02-25 2017-02-21 Data processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610103929.6A CN107122124B (en) 2016-02-25 2016-02-25 Data processing method and device

Publications (2)

Publication Number Publication Date
CN107122124A CN107122124A (en) 2017-09-01
CN107122124B true CN107122124B (en) 2021-06-15

Family

ID=59684803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610103929.6A Active CN107122124B (en) 2016-02-25 2016-02-25 Data processing method and device

Country Status (2)

Country Link
CN (1) CN107122124B (en)
WO (1) WO2017143972A1 (en)

Also Published As

Publication number Publication date
CN107122124A (en) 2017-09-01
WO2017143972A1 (en) 2017-08-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant