CN113672166A - Data processing method and device, electronic equipment and storage medium - Google Patents

Data processing method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN113672166A
CN113672166A
Authority
CN
China
Prior art keywords
data
solid state
cache region
memory cache
storage unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110772517.2A
Other languages
Chinese (zh)
Inventor
吴本卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ruijie Networks Co Ltd
Original Assignee
Ruijie Networks Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ruijie Networks Co Ltd filed Critical Ruijie Networks Co Ltd
Priority to CN202110772517.2A priority Critical patent/CN113672166A/en
Publication of CN113672166A publication Critical patent/CN113672166A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0649 Lifecycle management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0688 Non-volatile semiconductor memory arrays

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The embodiment of the application provides a data processing method and device, electronic equipment and a storage medium. The method includes: determining a storage unit to be recovered in the solid state disk; for any data object in the storage unit to be recovered, detecting whether the heat information of the data object is stored in a first memory cache region; if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region; if not, clearing the data of the data object; and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk. The technical scheme provided by the embodiment of the application ensures that the solid state disk maintains a high cache hit rate.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a data processing method and device, electronic equipment and a storage medium.
Background
At present, a Solid State Drive (SSD) is commonly used for data caching. A solid state drive is a hard disk built from an array of solid-state electronic memory chips, and offers high read/write speed, low power consumption, a wide operating temperature range, and other advantages. To manage the solid state disk, it is typically divided into a plurality of storage units, each used for storing data. When little storage space remains available on the solid state disk, a recovery operation is performed on a storage unit, that is, the data stored in the unit is emptied, after which the emptied unit can be used to store data again.
In the prior art, when a storage unit is recycled, all data on the unit is typically emptied. However, this approach easily causes hot data, i.e. data that is frequently or recently accessed, to be deleted from the solid state disk. A subsequent access request for that hot data then cannot be served from the solid state disk, which lowers the cache hit rate of the solid state disk.
Disclosure of Invention
The embodiment of the application provides a data processing method and device, electronic equipment and a storage medium, and aims to solve the problem that the cache hit rate of a solid state disk is low in the prior art.
In a first aspect, an embodiment of the present application provides a data processing method, including:
determining a storage unit to be recovered in the solid state disk;
detecting whether the heat information of the data object is stored in a first memory cache region or not aiming at any data object in the storage unit to be recovered;
if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region;
if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object;
and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
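The steps of the first aspect can be sketched in Python. The names `hot_keys` (standing in for the heat information held in the first memory cache region) and `staging` (standing in for the second memory cache region) are illustrative and not taken from the patent; this is a minimal sketch of the claimed flow, not a definitive implementation.

```python
def reclaim_unit(unit, hot_keys, staging):
    """Reclaim one storage unit: migrate hot objects, drop cold ones.

    unit      -- dict mapping object name -> data (the unit to be recovered)
    hot_keys  -- set of object names whose heat info sits in the first
                 memory cache region
    staging   -- dict acting as the second memory cache region
    """
    for name, data in unit.items():
        if name in hot_keys:
            staging[name] = data   # heat info present: migrate the data
        # otherwise the data is simply dropped (cleared)
    unit.clear()                   # the unit becomes a free storage unit
    return staging
```

Once `staging` meets the preset trigger condition, its contents would be flushed to a free storage unit of the solid state disk in one write.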
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
the determining module is used for determining a storage unit to be recovered in the solid state disk;
the processing module is used for detecting whether the heat information of the data object is stored in a first memory cache region or not aiming at any data object in the storage unit to be recovered; if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region; if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object; and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
In a third aspect, an embodiment of the present application provides an electronic device, including a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are used for being called and executed by the processing component to realize the data processing method.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a computer program, where the computer program is executed by a computer to implement the data processing method.
In the embodiment of the application, when any data object in a storage unit to be recovered of the solid state disk is recovered, the heat information stored in the first memory cache region is used to identify whether the data of the data object needs to be migrated or cleared, and migration is then performed in a targeted manner. Hot data in the solid state disk is therefore not cleared and is ultimately retained in the solid state disk, ensuring a high cache hit rate. In addition, clearing is performed in a targeted manner, so that data that needs to be removed from the solid state disk is cleared and the utilization of the solid state disk is improved.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart illustrating one embodiment of a data processing method provided herein;
FIG. 2 shows a schematic structural diagram of a solid state disk in an actual application of the present application;
FIG. 3 is a schematic diagram illustrating a virtual buffer queue according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating another embodiment of a data processing method provided herein;
FIG. 5 is a schematic block diagram illustrating an embodiment of a data processing apparatus provided herein;
fig. 6 shows a schematic structural diagram of an embodiment of an electronic device provided by the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
In some of the flows described in the specification, claims, and figures of this application, a number of operations appear in a particular order, but it should be clearly understood that these operations may be performed out of that order or in parallel. Operation numbers such as 101 and 102 merely distinguish operations and do not by themselves imply any order of execution. The flows may also include more or fewer operations, performed sequentially or in parallel. It should be noted that the terms "first", "second", and the like herein distinguish different messages, devices, modules, etc.; they imply neither a sequential order nor that "first" and "second" must be of different types.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 shows a flowchart of an embodiment of a data processing method provided in the present application. Referring to fig. 1, the data processing method may include the steps of:
101: and determining a storage unit to be recovered in the solid state disk.
In this embodiment of the present application, when the solid state disk is initialized, it may be divided into a plurality of idle storage units. A free storage unit may be a block, or it may be a chunk; of course, the specific form of the free storage unit is not limited in this embodiment. A chunk can be regarded as a large block whose size is measured in MB (megabytes).
At present, most solid state disks use NAND flash as their flash memory medium. The inside of a solid state disk consists of multiple flash chips; each chip is subdivided into blocks, the internal organization unit of the solid state disk, and each block is further subdivided into pages, the minimum read/write unit of the solid state disk. When the solid state disk is used as a cache, it is common to divide its logical address space into larger blocks for management, that is, to manage it by chunk.
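The chunk-level management of the logical address space can be illustrated with a one-line mapping. The 4 MB chunk size below is an assumed value for illustration only; the patent does not specify one.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size (4 MB), for illustration

def chunk_index(logical_addr):
    """Map a logical byte address to the index of the chunk that holds it."""
    return logical_addr // CHUNK_SIZE
```

For example, addresses 0 through CHUNK_SIZE − 1 all fall into chunk 0, and the next CHUNK_SIZE bytes into chunk 1.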
In addition, the write amplification factor of the solid state disk directly affects its service life. The write amplification factor equals the data volume written to the flash memory divided by the data volume written by the host. By the nature of NAND flash, the total volume that can be written to the flash memory over its lifetime is limited, so the write amplification factor can be reduced, and the service life extended, only by minimizing the extra flash writes incurred for a given volume of host writes.
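The write amplification factor defined above is a simple ratio and can be computed directly; the figures in the example are hypothetical.

```python
def write_amplification(flash_bytes_written, host_bytes_written):
    """WAF = data volume written to flash / data volume written by the host."""
    return flash_bytes_written / host_bytes_written
```

For instance, if the host writes 100 MB but garbage collection causes 150 MB of actual flash writes, the write amplification factor is 1.5.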
Taking the solid state disk shown in fig. 2 as an example, the solid state disk includes 4 chunks, and each chunk can store a plurality of data therein. For example, the data stored in chunk1 includes object1, object2, object3, and object 4. Of course, fig. 2 only illustrates that the solid state disk includes 4 chunks, and does not indicate that the solid state disk includes only 4 chunks, nor does it indicate that the storage unit of the solid state disk is only a chunk.
In the embodiment of the application, any non-idle storage unit in the solid state disk may be selected as the storage unit to be recovered. A non-idle storage unit is one in which data is stored; an idle storage unit, by contrast, stores no data.
In the embodiment of the application, the operation of recovering the non-idle storage unit in the solid state disk is executed, so that the non-idle storage unit can become an idle storage unit again, the solid state disk can continue to cache data, and the utilization rate of the solid state disk is improved.
In practical application, the operation of recovering the non-idle storage units in the solid state disk may be triggered and executed periodically, and certainly, the operation of recovering the non-idle storage units in the solid state disk may also be triggered and executed when a trigger condition set according to an actual service requirement is met, which is not limited in this embodiment.
102: and detecting whether the heat information of the data object is stored in the first memory cache region or not aiming at any data object in the storage unit to be recovered.
103: and if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to the second memory cache region.
104: and if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object.
In practical applications, the access characteristics of the service data in any service scenario differ, so some service data are hot data of the scenario and some are cold data. Hot data are accessed more frequently than cold data. For example, if the service data of a scenario total 100 GB, about 10 GB of it may be hot data and the remainder cold data. In the embodiment of the present application, to ensure that the solid state disk has a high cache hit rate, a memory cache region may be allocated in advance as the first memory cache region, which mainly stores the heat information of the hot data of the service scenario. The heat information indicates the access characteristics of a data object, such as the number of recent accesses or how long ago the most recent access occurred.
It should be understood that, for any data object of any service scenario, the more recent accesses the data object has, or the shorter the interval since its most recent access, the more likely the data object is hot data, and accordingly the more likely its heat information is stored in the first memory cache region.
It should be understood that if the heat information of a data object is stored in the first memory cache region, the data of that object is hot data; conversely, if its heat information is not stored there, its data is not hot data. In the embodiment of the application, when a non-idle storage unit of the solid state disk is recovered, for any data object of the storage unit to be recovered, whether the first memory cache region holds the object's heat information determines whether its data is to be migrated or to be removed. Migration can thus be performed in a targeted manner, so that hot data in the solid state disk is not removed and is ultimately retained in the solid state disk, guaranteeing a high cache hit rate; removal is likewise performed in a targeted manner, so that data that should be removed is cleared and the utilization of the solid state disk is improved.
In the embodiment of the present application, a cache region is allocated in advance in memory as the second memory cache region. The second memory cache region is distinct from the first memory cache region, and mainly stores data read from the mechanical hard disk or data migrated from the solid state disk.
It can be understood that the data to be migrated in the storage unit to be recovered is migrated from the storage unit to be recovered to the second memory cache region, and the subsequent data in the second memory cache region can be migrated to the solid state disk again.
It can be understood that the data to be cleared in the storage unit to be reclaimed is directly cleared from the solid state disk.
It can be understood that after the data migration and data clearing operations are performed on the unit to be reclaimed, it changes from a non-free storage unit into a free storage unit.
It should be noted that, the execution order of step 103 and step 104 is not limited in this embodiment of the application.
105: and when the second memory cache region meets the preset trigger condition, storing the data in the second memory cache region to the idle storage unit of the solid state disk.
In this embodiment, the data read from the mechanical hard disk may be written into the second memory cache region, and the data in the solid state disk may also be migrated into the second memory cache region. Therefore, more and more data is stored in the second memory buffer area as time goes by. And when the second memory cache region meets the preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
It can be understood that the data migrated from the storage unit to be recovered to the second memory cache region is only temporarily absent from the solid state disk; subsequently, when the second memory cache region meets the preset trigger condition, that data is migrated back to the solid state disk, so that the hot data is ultimately retained in the solid state disk, guaranteeing a high cache hit rate.
It should be noted that the preset trigger condition is set according to specific service requirements; for example, it may be one or more of the following: the remaining capacity of the second memory cache region falls below a preset residual storage capacity, or the remaining capacity is smaller than the storage capacity required by the data to be written. The preset residual storage capacity is set according to actual service requirements.
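The two example trigger conditions can be expressed as a short predicate. The 64 MB default threshold below is an illustrative value, not one given by the patent.

```python
def should_flush(free_capacity, pending_write_size,
                 min_free_capacity=64 * 1024 * 1024):
    """Decide whether the second memory cache region should be flushed to SSD.

    Mirrors the two example conditions: remaining capacity below a preset
    residual storage capacity, or insufficient room for the pending write.
    """
    return (free_capacity < min_free_capacity
            or free_capacity < pending_write_size)
```

When the predicate is true, all staged data would be written to a free storage unit of the solid state disk in a single aggregated write.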
It should be noted that the second memory cache region aggregates data to be cached before writing it. Compared with writing the data to the solid state disk directly, the second memory cache region allows more data to be written to the solid state disk in a single operation, which reduces the write amplification factor of the solid state disk and extends its service life.
Continuing with the solid state disk shown in fig. 2: when chunk1 is recovered, object2 is the data object to be migrated; object2 is first migrated to the second memory cache region and later migrated back from it into the new chunk4 in the solid state disk, while object1, object3, and object4 are cleared. Likewise, when chunk2 is recovered, object8 is migrated to the second memory cache region and later back into chunk4, while object5 and object7 are cleared. When chunk3 is recovered, object12 is migrated to the second memory cache region and later back into chunk4, while object9, object10, and object11 are cleared. Finally, after the recovery process completes, chunk1, chunk2, and chunk3 become free storage units again, and data objects object2, object8, and object12 are retained in chunk4.
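The figure-2 walkthrough can be reproduced in a few lines; this is a toy model of the recovery pass (chunk contents and hot set taken from the example, the helper name `recycle` is ours).

```python
def recycle(chunks, hot):
    """Recover every non-free chunk: stage hot objects, clear the rest,
    then flush the staged objects into a new free chunk (chunk4)."""
    staging = []
    for name in sorted(chunks):
        staging += [o for o in chunks[name] if o in hot]
        chunks[name] = []          # the chunk becomes a free unit again
    chunks["chunk4"] = staging     # hot data is retained in the SSD
    return chunks
```

Running it on the figure's contents leaves chunk1 through chunk3 empty and chunk4 holding object2, object8, and object12, matching the description above.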
According to the data processing method provided by the embodiment of the application, when any data object in a storage unit to be recovered of the solid state disk is recovered, the heat information stored in the first memory cache region is used to identify whether the data of the data object needs to be migrated or cleared, and migration is then performed in a targeted manner. Hot data in the solid state disk is therefore not cleared and is ultimately retained in the solid state disk, ensuring a high cache hit rate. In addition, clearing is performed in a targeted manner, so that data that needs to be removed from the solid state disk is cleared and the utilization of the solid state disk is improved.
In practical applications, in any service scenario some service data may reside in the second memory cache region and some in the solid state disk. Therefore, in some embodiments, to accurately determine the hot data of the scenario, the access characteristics of the data objects in both the second memory cache region and the solid state disk may be counted; according to the statistics, the data objects whose access characteristics meet a preset access condition, together with their heat information, are determined; and the heat information of those data objects is stored in the first memory cache region in the form of a queue.
It is noted that the preset access condition depends on the algorithm used to identify hot data. For example, if the algorithm is LRU (Least Recently Used), the preset access condition is that the most recent access is closest to the current time; that is, the service data accessed most recently is hot data, while the service data whose last access lies furthest in the past is cold data to be evicted. For another example, if the algorithm is LFU (Least Frequently Used), the preset access condition is that the number of recent accesses is the largest; that is, the most frequently accessed service data is hot data, while the least frequently accessed is cold data to be evicted.
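An LRU-style virtual cache queue of the kind described here can be sketched with `collections.OrderedDict`, whose insertion order models queue position. The class name and methods are ours, a minimal sketch rather than the patent's implementation.

```python
from collections import OrderedDict

class VirtualCacheQueue:
    """LRU-style queue of heat info: most recent at the head, evict from tail."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.q = OrderedDict()

    def touch(self, obj, heat=None):
        """Record an access: move (or insert) obj at the head of the queue."""
        if obj in self.q:
            self.q.move_to_end(obj, last=False)     # refresh: move to head
        else:
            self.q[obj] = heat
            self.q.move_to_end(obj, last=False)     # new entries enter at head
            if len(self.q) > self.capacity:
                self.q.popitem(last=True)           # evict least recently used

    def is_hot(self, obj):
        """True if the object's heat info is currently in the queue."""
        return obj in self.q
```

During recovery, `is_hot` would answer the migrate-or-clear question for each data object of the unit being reclaimed.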
It should be noted that the algorithm used for identifying the hot data of any service scenario is not limited to the LRU algorithm and the LFU algorithm illustrated above, and may be any other cache elimination algorithm. Of course, more details about the cache eviction algorithm are described in the related art.
In some embodiments, before storing the heat information of the data objects satisfying the preset access condition in the first memory buffer area in a queue, the method may further include: and setting the queue length of the queue according to the preset cache hit rate and/or the data volume of the data object meeting the preset access condition.
For ease of understanding, the queue in the first memory buffer region that stores the heat information is referred to as a virtual buffer queue. The queue length of the virtual buffer queue determines the amount of data that the virtual buffer queue can store. It can be understood that the longer the queue length of the virtual cache queue is, the more the heat information stored in the virtual cache queue is, the higher the cache hit rate is; the shorter the queue length of the virtual cache queue is, the less the heat information is stored in the virtual cache queue, and the lower the cache hit rate is.
In some embodiments, the dequeue time of data in the head of the queue is later than the dequeue time of data in the tail of the queue; correspondingly, after storing the heat information of the data objects meeting the preset access condition in the first memory buffer area in a queue form, the method may further include: and if the data object meeting the preset access condition is detected to be accessed currently, adjusting the position of the heat information of the data object meeting the preset access condition in the queue to the head of the queue.
It should be noted that, since data at the head of the queue dequeues later than data at the tail, entries enter (or are refreshed) at the head of the queue and are evicted from the tail; the entry that has gone unrefreshed longest therefore leaves first.
It can be understood that the closer a piece of heat information is to the head of the virtual cache queue, the more recent accesses its data object has, or the shorter the interval since its most recent access, and the later it is evicted from the virtual cache queue, that is, the longer it is retained there. Conversely, the closer to the tail, the fewer recent accesses its data object has, or the longer the interval since its most recent access, and the earlier it is evicted, that is, the shorter its retention time.
Fig. 3 is a schematic diagram of the structure of a virtual cache queue in a practical application. In the queue shown in fig. 3, the data object objecty near the head has more recent accesses than the data object objectx near the tail, or objecty's most recent access is closer to the current time than objectx's. The queue length of the virtual cache queue may be set according to the specific service scenario, and the heat information of the data objects meeting the preset access condition is kept in the queue. When a storage unit is to be recovered, whether the data of each of its data objects is migrated is decided by checking whether the object's heat information exists in the virtual cache queue, so as to reduce the volume of data written to the solid state disk.
In practical applications, the heat information entered into the virtual buffer queue will be slowly eliminated from the virtual buffer queue over time. Of course, over time, the hot information eliminated from the virtual cache queue becomes hot again due to the corresponding data object being revisited, and then the hot information returns to the virtual cache queue. Therefore, the heat information in the virtual cache queue needs to be adjusted in time based on the access characteristics of the data object, so that the virtual cache queue can be effectively managed, and the heat information of the current latest heat data of any service scene is ensured to be stored in the virtual cache queue.
It should be noted that eliminating the heat information of a data object from the virtual cache queue does not mean that the data of that data object is eliminated from the solid state disk SSD. When the position of a data object's heat information in the virtual cache queue changes, this reflects a change in the access characteristics of that data object.
In practical applications, the storage capacity of the virtual cache queue may be smaller than that of the solid state disk SSD. For example, the storage capacity of the virtual cache queue is ratio times the storage capacity of the solid state disk SSD, where the value range of ratio is (0, 1), 0.5 being typical.
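The capacity relationship can be written as a one-line helper. This is a hypothetical sketch; the function name is illustrative, and the only assumption taken from the text is that ratio lies in the open interval (0, 1), with 0.5 as the typical value.

```python
def virtual_queue_capacity(ssd_capacity_bytes, ratio=0.5):
    """Sketch: the virtual cache queue's capacity as a fraction of the
    SSD's capacity, with ratio constrained to (0, 1) as in the text."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must be in the open interval (0, 1)")
    return int(ssd_capacity_bytes * ratio)
```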
In practical applications, a storage unit to be recovered may contain many data objects satisfying the preset access condition, that is, much hot data. If such a unit is recovered, a large amount of data must be migrated to the second memory cache region, and subsequently from the second memory cache region back to the solid state disk. In other words, if recovering the storage unit writes much additional data through migration, the amount of data written to the solid state disk increases greatly and the write amplification factor of the solid state disk rises, shortening its service life and reducing its performance.
Therefore, in some embodiments, in order to control the increase in the SSD's write data amount caused by recycling the storage unit to be recovered and to optimize the service life of the solid state disk, before migrating the data of the data object to the second memory cache region, the method may further include: judging whether the data volume of the data to be migrated is smaller than a preset first number threshold; if so, executing the step of migrating the data of the data object to the second memory cache region. Specifically, the first number threshold is set according to actual traffic demands. It can be understood that if the data volume of the data to be migrated is smaller than the preset first number threshold, recovering the storage unit will not greatly increase the write data amount of the solid state disk, and the step of migrating the data of the data object to the second memory cache region is performed. Conversely, if the data volume of the data to be migrated is greater than or equal to the preset first number threshold, recovering the storage unit is likely to greatly increase the write data amount of the solid state disk, and the operation of converting the currently determined storage unit to be recovered into a free storage unit is no longer executed.
Of course, when the currently determined storage unit to be recovered cannot be recovered, a new storage unit to be recovered may be selected from the solid state disk, until one is found whose data volume of data to be migrated is smaller than the preset first number threshold.
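The selection loop described in the two paragraphs above can be sketched as a single scan. This is an illustrative sketch only: the function name and the unit/heat data structures are made up, and heat membership stands in for "heat information stored in the first memory cache region".

```python
def select_reclaimable_unit(candidate_units, heat_info, first_threshold):
    """Sketch: scan candidate non-free units and return the first one
    whose to-be-migrated (hot) data volume is below the first number
    threshold; units with too much hot data are skipped to limit
    write amplification."""
    for unit in candidate_units:
        to_migrate = sum(size for object_id, size in unit["objects"].items()
                         if object_id in heat_info)  # heat present -> migrate
        if to_migrate < first_threshold:
            return unit["id"], to_migrate
    return None, 0  # no unit can currently be recovered

units = [{"id": "u1", "objects": {"a": 40, "b": 30}},
         {"id": "u2", "objects": {"c": 10, "d": 5}}]
heat = {"a", "b", "c"}  # objects whose heat information is cached
```

With a threshold of 50, `u1` (70 units of hot data) is skipped and `u2` (10 units) is chosen, matching the re-selection behavior the text describes.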
It should be noted that there may be one or more data objects in the storage unit to be reclaimed. The data volume of the data to be migrated can be understood as the sum of the data volumes of all the data objects in the storage unit to be reclaimed whose data is to be migrated.
It should be noted that, for any data object in the storage unit to be recovered, if it is detected that the heat information of the data object has been stored in the first memory cache region, the data object may be marked with a "migratable" flag; conversely, if the heat information of the data object is not stored in the first memory cache region, the data object may be left unmarked, or marked with a "non-migratable" flag, or marked with a "cleanable" flag. By counting the "migratable" flags, it can then be identified which data of the storage unit to be recovered is to be migrated, which is to be cleared, and what the total volume of the data to be migrated is.
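The marking step can be sketched as follows. This is a minimal sketch under assumed data shapes; the function and flag names mirror the text, but the dict layout is invented for illustration.

```python
def mark_objects(unit_objects, heat_info):
    """Sketch: tag each object in a to-be-recovered unit as "migratable"
    (its heat information is in the first memory cache region) or
    "cleanable" (it is not), and total up the migratable data volume."""
    marks = {}
    migrate_volume = 0
    for object_id, size in unit_objects.items():
        if object_id in heat_info:
            marks[object_id] = "migratable"
            migrate_volume += size
        else:
            marks[object_id] = "cleanable"
    return marks, migrate_volume

marks, volume = mark_objects({"a": 4, "b": 6}, {"a": {"last_access": 1}})
```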
It can be understood that, for a non-free storage unit containing hot data, whether the unit is recycled is determined based on the data amount of that hot data. When the amount of hot data is small, the operation of reclaiming the non-free storage unit is performed. When the amount of hot data is large, a different non-free storage unit is selected for the recovery operation, so that the recovery of non-free storage units containing large amounts of hot data is skipped, further protecting the service life of the solid state disk.
In some embodiments, in order to timely and effectively ensure that the solid state disk provides enough free storage units and to reduce waste of the cache space of the solid state disk, one implementation of "determining the storage unit to be recovered in the solid state disk" is as follows: detecting the number of free storage units in the solid state disk; and if the number of free storage units is smaller than a preset second number threshold, selecting a storage unit to be recovered from at least one non-free storage unit in the solid state disk. The second number threshold is set according to actual conditions.
It can be understood that when the number of free storage units in the solid state disk is smaller than the preset second number threshold, the free storage units in the solid state disk may be considered insufficient, and it is then difficult to meet the requirement of storing data from outside the solid state disk into it.
It can be understood that when the number of the free storage units in the solid state disk is greater than or equal to the preset second number threshold, the free storage units in the solid state disk can be considered to be sufficient, and the storage requirement that data is stored from the outside of the solid state disk to the solid state disk is easily met.
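The trigger condition in the two paragraphs above reduces to a simple check. This is an illustrative sketch; the function name and the choice of the FIFO head of the non-free list as the reclaim candidate are assumptions, not statements of the patent.

```python
def maybe_pick_reclaim_target(free_list, non_free_list, second_threshold):
    """Sketch: if the number of free units is at or above the second
    number threshold, no reclaim is needed; otherwise pick a candidate
    from the non-free list (FIFO head, as one possible policy)."""
    if len(free_list) >= second_threshold:
        return None                          # enough free units
    return non_free_list[0] if non_free_list else None
```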
In some embodiments, in order to accurately know which non-free storage units exist in the solid state disk, a first list may be maintained, the identifier of the non-free storage unit is recorded in the first list, and then the non-free storage unit is determined by querying the first list.
In some embodiments, in order to accurately know which free storage units exist in the solid state disk, a second list may be maintained, where the second list records the identifiers of the free storage units, and then the free storage units are determined by querying the second list.
In some embodiments, the first list and the second list may each be implemented as a queue. For example, the first list is a first queue and the second list is a second queue. The first queue and the second queue may be first-in-first-out queues, which are characterized by enqueuing at the tail of the queue and dequeuing at the head of the queue.
In an optional embodiment, the "determining the storage unit to be recycled in the solid state disk" may be: and selecting a storage unit to be recycled from at least one non-idle storage unit of the solid state disk according to the identification of the non-idle storage unit recorded in the first list.
In some embodiments, in order to better maintain the first list and the second list, after the clearing the data of the data object, the method may further include: the identity of the storage unit to be reclaimed is deleted from the first list and added to a second list that records the identity of free storage units.
In some embodiments, to better maintain the first list, before selecting a storage unit to be recycled from at least one non-free storage unit of the solid state disk according to the identification of the non-free storage unit recorded in the first list, the method further includes:
initializing a first list, wherein the record in the initialized first list is null; after detecting that the free storage units in the solid state disk are stored with data, determining the free storage units of the stored data as non-free storage units, and adding the identification of the non-free storage units to the first list.
In some embodiments, in order to better maintain the first list and the second list, after the data in the second memory cache region is stored in the free storage unit of the solid state disk, the method may further include: the identification of free memory locations of stored data is deleted from the second list recording the identification of free memory locations, and the identification of free memory locations of stored data is added to the first list.
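The list maintenance described across these embodiments can be sketched with two FIFO queues. This is a minimal sketch; the unit ids and handler names are invented, and only the moves between the two lists follow the text.

```python
from collections import deque

# Sketch of the two FIFO lists: the first queue tracks non-free units,
# the second tracks free units.  Unit ids here are illustrative.
first_queue = deque()                  # identifiers of non-free storage units
second_queue = deque(["u1", "u2"])    # identifiers of free storage units

def on_unit_filled(unit_id):
    """A free unit received data: its id moves from the second list
    (free) to the first list (non-free)."""
    second_queue.remove(unit_id)
    first_queue.append(unit_id)

def on_unit_reclaimed(unit_id):
    """A unit was reclaimed: its id moves from the first list back to
    the second list, making the unit free again."""
    first_queue.remove(unit_id)
    second_queue.append(unit_id)

on_unit_filled("u1")      # u1 stores data and becomes non-free
on_unit_reclaimed("u1")   # u1 is reclaimed and becomes free again
```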
The foregoing embodiment describes a data processing method for a recovery stage of a solid state disk, and the embodiment shown in fig. 4 describes a data processing method for a stage of caching data from a mechanical hard disk to the solid state disk. The data processing method in the embodiment shown in fig. 4 may be executed before the recovery operation for the solid state disk, or may be executed after the recovery operation for the solid state disk, which is not limited in this embodiment.
Fig. 4 shows a flowchart of another embodiment of the data processing method provided in the present application. On the basis of the above embodiment, referring to fig. 4, the data processing method may further include the steps of:
401. and acquiring the data of the data object to be cached from the mechanical hard disk.
402. And storing the data of the data object to be cached to the second memory cache region, and storing the heat information of the data object to be cached to the first memory cache region.
A mechanical hard disk (HDD) has the advantages of low price and large capacity but, constrained by its mechanical principles, its random I/O latency ranges from several milliseconds to tens of milliseconds, which seriously affects user experience and performance. The random access performance of a solid state disk is far better than that of a mechanical hard disk, but the solid state disk is expensive.
In the embodiment of the present application, a solid state disk may be used as a system cache (cache memory), and a mechanical hard disk may be used as a back-end storage device, so that a balance between performance and cost is achieved.
In practical application, data stored in the mechanical hard disk can be cached in the solid state disk in advance, so that subsequent application software can read the required data from the solid state disk. Of course, when the application software fails to find the required data in the solid state disk, the data stored in the mechanical hard disk may then be cached in the solid state disk so that the application software can read it from there.
For ease of understanding and distinction, data that is obtained from the mechanical hard disk and eventually needs to be cached to the solid state hard disk is referred to as data of the data object to be cached.
After the data of the data object to be cached is acquired from the mechanical hard disk, it is not stored directly in the solid state disk; instead, it is stored in a second memory cache region in the memory, and when the data in the second memory cache region accumulates to a certain amount, it is written into a free storage unit of the solid state disk as a sequential batch, to reduce the write amplification factor of the solid state disk.
It should be noted that the cache structure according to the embodiment of the present application may be regarded as a two-level cache structure, that is, the second memory cache region is used as a first-level cache, and the solid state disk is used as a second-level cache. When the application software needs to read data, firstly reading the data from a first-level cache, namely a second memory cache region; if the required data is not hit in the first-level cache, reading the data from a second-level cache, namely the solid state disk; if the required data is not hit in the second-level cache, reading the data from the mechanical hard disk at the rear end and storing the data into the first-level cache; and when the storage capacity of the first-level cache is full, writing the data in the first-level cache into the second-level cache.
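The two-level lookup path above can be sketched with plain dictionaries standing in for the three tiers. This is an illustrative sketch under assumed data shapes; the tier names are taken from the text, the function signature is not.

```python
def read(key, l1_cache, l2_ssd, hdd):
    """Sketch of the two-level cache read path: L1 is the second memory
    cache region, L2 is the solid state disk, and the mechanical hard
    disk is the backing store.  On an L2 miss the data is loaded from
    the HDD into the first-level cache."""
    if key in l1_cache:
        return l1_cache[key], "l1"     # hit in the second memory cache region
    if key in l2_ssd:
        return l2_ssd[key], "l2"       # hit in the solid state disk
    value = hdd[key]                   # backing store holds all data
    l1_cache[key] = value              # populate the first-level cache
    return value, "hdd"

l1, l2, hdd = {}, {"b": 2}, {"a": 1, "b": 2}
```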
In addition, if the hot data of the service scenario is identified by the LRU algorithm, then when the data of the data object to be cached is stored in the second memory cache region, that data object has just been accessed and its data is therefore identified as hot data; accordingly, the heat information of the data object to be cached needs to be written into the first memory cache region for storage.
The heat information in the first memory cache region is used for determining data migrated to the second memory cache region and data to be cleared in the storage unit to be recovered of the solid state disk. For details of the first memory buffer, it is described in the above embodiments, and no further description is provided herein.
It should be noted that, if the hot data of any service scenario is identified as another cache elimination algorithm except for the LRU algorithm, the hot information of the data object to be cached may not be stored in the first memory cache area when the data of the data object to be cached is stored in the second memory cache area.
403. And when the second memory cache region meets the preset trigger condition, storing the data in the second memory cache region to a first idle storage unit of the solid state disk.
In practical application, when some application software needs to read a piece of data, if the data is not hit in the solid state cache, the data needs to be read from the mechanical hard disk and stored in a second memory cache region of the memory, and when the data amount in the second memory cache region is accumulated to a certain amount, the data is written into an idle storage unit of the solid state disk in a batch sequence manner to reduce the write amplification factor of the solid state disk. When the available space of the solid state disk is less and less, the recovery operation of the storage unit is needed.
According to the data processing method provided by the embodiment of the application, when the data of the data object to be cached read from the mechanical hard disk is stored in the second memory cache region, the heat information of the data object to be cached can be stored in the first memory cache region. And when the second memory cache region meets the preset triggering condition, migrating the data of the second memory cache region to the solid state disk. Therefore, the write amplification factor of the solid state disk can be reduced, and the service life of the solid state disk is prolonged. In addition, the heat information in the first memory cache region can be used for determining data migrated to the second memory cache region and data to be cleared in the storage unit to be recovered of the solid state disk, so that when the unit to be recovered is subsequently recovered, the heat information can be ensured to be finally reserved in the solid state disk, and the solid state disk is ensured to have higher cache hit rate.
In practical application, some service data may be stored in the second memory cache region and some service data may be stored in the solid state disk for any service scenario. Therefore, in some embodiments, in order to accurately determine the thermal data of any service scenario, the access characteristics of the data objects in the second memory cache region and the solid state disk may be counted; according to the statistical result, determining the data object with the access characteristic meeting the preset access condition and the heat information thereof; and storing the heat information of the data objects meeting the preset access condition in a first memory buffer area in a queue mode.
It is noted that the preset access condition depends on the algorithm used to identify hot data. For example, if the algorithm is the LRU (Least Recently Used) algorithm, the preset access condition is that the latest access time is closest to the current time; that is, the service data accessed most recently is hot data, and the service data whose last access is furthest from the current time is cold data to be eliminated. For another example, if the algorithm is the LFU (Least Frequently Used) algorithm, the preset access condition is that the number of recent accesses is the largest; that is, the service data with the most recent accesses is hot data, and the service data with the fewest recent accesses is cold data to be eliminated.
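The difference between the two example policies can be sketched as two ranking rules over the same statistics. This is an illustrative sketch only; the field names in the per-object statistics are invented.

```python
def hottest(objects, policy):
    """Sketch of the two example preset access conditions: under "lru"
    the object with the most recent access time is hottest; under "lfu"
    the object with the highest recent access count is hottest."""
    if policy == "lru":
        return max(objects, key=lambda o: objects[o]["last_access"])
    if policy == "lfu":
        return max(objects, key=lambda o: objects[o]["access_count"])
    raise ValueError("unknown policy: " + policy)

stats = {"a": {"last_access": 10, "access_count": 1},
         "b": {"last_access": 5,  "access_count": 9}}
```

Under LRU, `a` (accessed most recently) is hottest; under LFU, `b` (accessed most often) is, illustrating why the preset access condition must match the chosen eviction algorithm.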
It should be noted that the algorithm used for identifying the hot data of any service scenario is not limited to the LRU algorithm and the LFU algorithm illustrated above, and may be any other cache elimination algorithm. Of course, more details about the cache eviction algorithm are described in the related art.
In some embodiments, before storing the heat information of the data objects satisfying the preset access condition in the first memory buffer area in a queue, the method may further include: and setting the queue length of the queue according to the preset cache hit rate and/or the data volume of the data object meeting the preset access condition.
For ease of understanding, the queue in the first memory buffer region that stores the heat information is referred to as a virtual buffer queue. The queue length of the virtual buffer queue determines the amount of data that the virtual buffer queue can store. It can be understood that the longer the queue length of the virtual cache queue is, the more the heat information stored in the virtual cache queue is, the higher the cache hit rate is; the shorter the queue length of the virtual cache queue is, the less the heat information is stored in the virtual cache queue, and the lower the cache hit rate is.
In some embodiments, the dequeue time of data in the head of the queue is later than the dequeue time of data in the tail of the queue; correspondingly, after storing the heat information of the data objects meeting the preset access condition in the first memory buffer area in a queue form, the method may further include: and if the data object meeting the preset access condition is detected to be accessed currently, adjusting the position of the heat information of the data object meeting the preset access condition in the queue to the head of the queue.
For the related description of the virtual buffer queue, reference is made to the foregoing description, and details are not repeated here.
In practical applications, as time goes by, more and more data are stored in the second memory buffer, and the available storage capacity of the second memory buffer is less and less. Therefore, in order to ensure that the second memory buffer has enough available storage capacity to store the data to be cached, the implementation manner of "storing the data of the data object to be cached in the second memory buffer" may be: detecting whether the current storage capacity of the second memory cache region meets the storage capacity required by the data object to be cached; if so, storing the data of the data object to be cached in a second memory cache region; and if not, storing the data in the second memory cache region into a second idle storage unit in the solid state disk, emptying the data in the second memory cache region, and storing the data of the data object to be cached into the emptied second memory cache region.
It can be understood that the current storage capacity of the second memory cache region satisfies the storage capacity required by the data object to be cached, that is, the storage capacity of the second memory cache region is still remained. The current storage capacity of the second memory cache region does not meet the storage capacity required by the data object to be cached, that is, the storage capacity of the second memory cache region is full, at this time, the data of the second memory cache region can be integrally written into a second idle storage unit in the solid state disk, the data in the second memory cache region is emptied, and the emptied second memory cache region can provide sufficient storage capacity.
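The capacity check and flush described above can be sketched as follows. This is a minimal sketch with invented names; capacity is counted in objects rather than bytes purely for brevity.

```python
def store_to_l1(l1_cache, l1_capacity, key, value, free_units):
    """Sketch: if the second memory cache region still has capacity,
    store the new object directly; otherwise flush its whole contents
    as one batch to a second free SSD unit, empty the region, then
    store the new object into the emptied region."""
    if len(l1_cache) >= l1_capacity:
        unit = free_units.pop(0)          # take a second free storage unit
        unit["data"] = dict(l1_cache)     # batch sequential write to the SSD
        l1_cache.clear()                  # empty the second memory cache region
        l1_cache[key] = value
        return unit                       # the unit that received the flush
    l1_cache[key] = value
    return None

l1 = {"a": 1}
free = [{"id": "u1", "data": {}}]
flushed = store_to_l1(l1, 1, "b", 2, free)  # region full -> flush, then store
```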
In some embodiments, in order to improve the utilization rate of the solid state disk, before the data of the data object to be cached is stored in the second memory cache region, the method may further include: and determining a second memory cache area with the total storage capacity equal to that of the free storage units in the memory.
It can be understood that, when the total storage capacity provided by the second memory cache region is equal to the total storage capacity of a single free storage unit, the utilization rate of the solid state disk is the maximum. Of course, the total storage capacity provided by the second memory cache region may also be greater than or less than the total storage capacity of a single free storage unit, and the embodiment of the present application does not limit the total storage capacity of the second memory cache region.
In some embodiments, in order to accurately know which free storage units exist in the solid state disk, a second list may be maintained, where the second list records the identifiers of the free storage units, and then the free storage units are determined by querying the second list.
In some embodiments, the "storing the data in the second memory cache region to the first free storage unit of the solid state disk" specifically includes: selecting a first free storage unit from the solid state disk according to a second list recording the identification of the free storage unit; and storing the data in the second memory cache region to the first idle storage unit.
In some embodiments, before selecting the first free storage unit from the solid state disk according to the second list recording the identification of the free storage unit, the method may further include: initializing a second list, wherein the records in the initialized second list are null; initializing a solid state disk, wherein the initialized solid state disk comprises at least one idle storage unit; an identification of at least one free storage unit is added to the second list.
In some embodiments, in order to better maintain the second list and improve the utilization rate of the solid state disk, after the first free storage unit is selected from the solid state disk according to the second list recording the identifier of the free storage unit, the method may further include: removing the identity of the first free storage unit from the second list; determining the number of idle storage units in the solid state disk according to the number of the identifiers of the idle storage units currently recorded in the second list; if the number of the idle storage units is smaller than a preset second number threshold, selecting a storage unit to be recovered from at least one non-idle storage unit in the solid state disk; and carrying out recovery processing on the storage unit to be recovered according to the heat information in the first memory cache region.
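The post-allocation bookkeeping in this embodiment can be sketched in a few lines. This is an illustrative sketch; the function name is invented, and returning the head of the first list as the reclaim candidate is one assumed policy, not the patent's prescription.

```python
def after_allocation(second_list, first_list, unit_id, second_threshold):
    """Sketch: after a free unit is chosen, its id leaves the second
    list and joins the first list (it now holds data).  If the
    remaining free-unit count drops below the second number threshold,
    a reclaim candidate from the first list is returned."""
    second_list.remove(unit_id)
    first_list.append(unit_id)            # the unit is now non-free
    if len(second_list) < second_threshold:
        return first_list[0] if first_list else None  # unit to reclaim
    return None

second = ["u1", "u2"]   # free units
first = ["n0"]          # non-free units
target = after_allocation(second, first, "u1", 2)
```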
The content of performing the recycling process on the to-be-recycled storage unit based on the heat information in the first memory cache region is described in detail in the foregoing description, and is not described herein again.
Fig. 5 is a schematic structural diagram illustrating an embodiment of a data processing apparatus provided in the present application. Referring to fig. 5, the data processing apparatus may include:
a determining module 501, configured to determine a storage unit to be recovered in a solid state disk;
a processing module 502, configured to detect, for any data object in the to-be-recovered storage unit, whether the heat information of the data object has been stored in a first memory cache region; if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region; if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object; and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
In some embodiments, the processing module 502 is further configured to count access characteristics of the data objects in the second memory cache region and the solid state disk;
according to the statistical result, determining the data object with the access characteristic meeting the preset access condition and the heat information thereof;
and storing the heat information of the data objects meeting the preset access condition in a queue form in the first memory buffer area.
In some embodiments, the dequeue time of data in the head of the queue is later than the dequeue time of data in the tail of the queue;
accordingly, in some embodiments, after storing the hot data in the first memory buffer in the form of a queue, the processing module 502 is further configured to:
and if the data object meeting the preset access condition is detected to be accessed currently, adjusting the position of the heat information of the data object meeting the preset access condition in the queue to the head of the queue.
In some embodiments, before storing the heat information of the data objects satisfying the preset access condition in the first memory buffer area in a queue, the processing module 502 is further configured to:
and setting the queue length of the queue according to a preset cache hit rate and/or the data volume of the data object meeting a preset access condition.
In some embodiments, before migrating the data of the data object to the second memory cache region, the processing module 502 is further configured to:
judging whether the data volume of the data to be migrated is smaller than a preset first quantity threshold value or not;
and if so, executing the step of migrating the data of the data object to a second memory cache region.
In some embodiments, the processing module 502 determines that the storage unit to be recovered in the solid state disk is specifically:
detecting the number of idle storage units in the solid state disk;
and if the number of the idle storage units is smaller than a preset second number threshold value, selecting a storage unit to be recovered from at least one non-idle storage unit in the solid state disk.
In some embodiments, the processing module 502 determines that the storage unit to be recovered in the solid state disk is specifically:
and selecting a storage unit to be recycled from at least one non-idle storage unit of the solid state disk according to the identification of the non-idle storage unit recorded in the first list.
In some embodiments, after clearing the data of the data object, the processing module 502 is further configured to:
the identity of the storage unit to be reclaimed is deleted from the first list and added to a second list that records the identity of free storage units.
In some embodiments, before selecting the storage unit to be recycled from the at least one non-free storage unit of the solid state disk according to the identification of the non-free storage unit recorded in the first list, the processing module 502 is further configured to:
initializing a first list, wherein the record in the initialized first list is null;
after detecting that the free storage units in the solid state disk are stored with data, determining the free storage units of the stored data as non-free storage units, and adding the identification of the non-free storage units to the first list.
In some embodiments, after the data in the second memory cache region is stored in the free storage unit of the solid state disk, the processing module 502 is further configured to:
the identification of free memory locations of stored data is deleted from the second list recording the identification of free memory locations, and the identification of free memory locations of stored data is added to the first list.
The data processing apparatus in fig. 5 may execute the data processing method in the embodiment shown in fig. 1, and the implementation principle and the technical effect are not described again. The specific manner in which each module and unit of the data processing apparatus in the above embodiments perform operations has been described in detail in the embodiments related to the method, and will not be described in detail herein.
In some embodiments, the data processing apparatus of fig. 5 may also perform the data processing method of the embodiment shown in fig. 4. Specifically, the method comprises the following steps:
an obtaining module 501, configured to obtain data of a data object to be cached from a mechanical hard disk;
a processing module 502, configured to store data of a data object to be cached in a second memory cache region, and store heat information of the data object to be cached in a first memory cache region;
the processing module 502 is further configured to store the data in the second memory cache region to the first idle storage unit of the solid state disk when the second memory cache region meets a preset trigger condition;
the heat information in the first memory cache region is used for determining data migrated to the second memory cache region and data to be cleared in the storage unit to be recovered of the solid state disk.
In some embodiments, the processing module 502 is further configured to:
counting the access characteristics of the data objects in the second memory cache region and the solid state disk;
determining data objects meeting the access characteristics and meeting preset access conditions and heat information thereof according to the statistical result;
and storing the heat information of the data objects meeting the preset access condition in a queue form in the first memory buffer area.
In some embodiments, the step of the processing module 502 storing the data of the data object to be cached in the second memory cache region specifically includes:
detecting whether the current storage capacity of the second memory cache region meets the storage capacity required by the data object to be cached;
if so, storing the data of the data object to be cached in a second memory cache region;
and if not, storing the data in the second memory cache region into a second idle storage unit in the solid state disk, emptying the data in the second memory cache region, and storing the data of the data object to be cached into the emptied second memory cache region.
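The capacity-check-then-flush write path above may be sketched as follows. This is a minimal illustration under assumptions: the names `StagingBuffer` and `FakeSsd`, and the byte-length capacity accounting, are inventions of this sketch; the embodiment's point is only that the second memory cache region is flushed into one free SSD storage unit and emptied before new data is stored.

```python
class FakeSsd:
    """Minimal stand-in for the solid state disk: hands out free storage
    units and records whole-unit writes."""
    def __init__(self, unit_ids):
        self.free_units = list(unit_ids)
        self.written = {}                  # unit id -> dict of objects

    def allocate_free_unit(self):
        return self.free_units.pop(0)

    def write_unit(self, unit, objects):
        self.written[unit] = objects


class StagingBuffer:
    """Sketch of the second memory cache region: incoming data objects
    accumulate here; when the current storage capacity cannot hold the
    next object, the whole region is flushed into one free SSD storage
    unit and emptied, so the SSD receives unit-sized writes."""
    def __init__(self, capacity_bytes, ssd):
        self.capacity = capacity_bytes     # sized to one storage unit's capacity
        self.used = 0
        self.objects = {}
        self.ssd = ssd

    def put(self, obj_id, data):
        if self.used + len(data) > self.capacity:
            self.flush()                   # capacity check fails: flush first
        self.objects[obj_id] = data
        self.used += len(data)

    def flush(self):
        if not self.objects:
            return
        unit = self.ssd.allocate_free_unit()       # a free storage unit
        self.ssd.write_unit(unit, dict(self.objects))
        self.objects.clear()                       # empty the cache region
        self.used = 0
```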
In some embodiments, before storing the data of the data object to be cached in the second memory buffer, the processing module 502 is further configured to:
and determining, in the memory, a second memory cache region whose total storage capacity is equal to that of a free storage unit of the solid state disk.
In some embodiments, the step of the processing module 502 storing the data in the second memory cache region to the first free storage unit of the solid state disk specifically includes:
selecting a first free storage unit from the solid state disk according to a second list recording the identification of the free storage unit;
and storing the data in the second memory cache region to the first idle storage unit.
In some embodiments, before selecting the first free storage unit from the solid state disk according to the second list recording the identification of the free storage unit, the processing module 502 is further configured to:
initializing a second list, wherein the records in the initialized second list are null;
initializing a solid state disk, wherein the initialized solid state disk comprises at least one idle storage unit;
an identification of at least one free storage unit is added to the second list.
In some embodiments, after selecting the first free storage unit from the solid state disk according to the second list recording the identification of the free storage unit, the processing module 502 is further configured to:
removing the identity of the first free storage unit from the second list;
determining the number of idle storage units in the solid state disk according to the number of the identifiers of the idle storage units currently recorded in the second list;
if the number of the idle storage units is smaller than a preset second number threshold, selecting a storage unit to be recovered from at least one non-idle storage unit in the solid state disk;
and carrying out recovery processing on the storage unit to be recovered according to the heat information in the first memory cache region.
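The two-list bookkeeping described above (a second list of free-unit identifiers, a first list of non-free-unit identifiers, and a reclamation trigger when free units fall below the second number threshold) may be sketched as follows. The class name `SsdAllocator` and the parameter name `min_free` are assumptions of this sketch.

```python
class SsdAllocator:
    """Sketch of free-unit bookkeeping with the two lists of the
    embodiment: the second list records identifiers of free storage
    units, the first list records identifiers of non-free units."""

    def __init__(self, unit_ids, min_free):
        # Initialization: the second list holds the initialized SSD's
        # free units; the first list starts empty (its records are null).
        self.free_list = list(unit_ids)     # "second list"
        self.non_free_list = []             # "first list"
        self.min_free = min_free            # preset second number threshold

    def allocate(self):
        # Select a first free storage unit and move its identifier
        # from the second list to the first list.
        unit = self.free_list.pop(0)
        self.non_free_list.append(unit)
        return unit, self.needs_reclaim()

    def needs_reclaim(self):
        # If the number of free units drops below the threshold, a
        # storage unit to be recovered must be selected from the
        # non-free units.
        return len(self.free_list) < self.min_free

    def reclaim(self, unit):
        # After its data has been migrated or cleared, the unit's
        # identifier moves back from the first list to the second list.
        self.non_free_list.remove(unit)
        self.free_list.append(unit)
    ```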
In one possible design, the data processing apparatus of the embodiment shown in fig. 5 may be implemented as an electronic device which, as shown in fig. 6, may include a storage component 601 and a processing component 602.
The storage component 601 stores one or more computer instructions, which are invoked and executed by the processing component 602.
The processing component 602 is configured to:
determining a storage unit to be recovered in the solid state disk; detecting whether the heat information of the data object is stored in a first memory cache region or not aiming at any data object in the storage unit to be recovered; if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region;
if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object; and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
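The per-object reclamation decision above (migrate if heat information is present in the first memory cache region, otherwise clear) may be sketched as a single pass over the storage unit to be recovered. The function name `reclaim_unit` is an assumption; `hot_ids` stands in for the heat information held in the first memory cache region, and the plain dictionary `staging_buffer` stands in for the second memory cache region.

```python
def reclaim_unit(unit_objects, hot_ids, staging_buffer):
    """Sketch of reclaiming one storage unit: for each data object in
    the unit, hot data (heat info present in the first memory cache
    region) is migrated to the second memory cache region; cold data
    is cleared."""
    migrated, cleared = [], []
    for obj_id, data in unit_objects.items():
        if obj_id in hot_ids:
            staging_buffer[obj_id] = data   # migrate hot data back to memory
            migrated.append(obj_id)
        else:
            cleared.append(obj_id)          # cold data is simply dropped
    return migrated, cleared
```

Once the pass completes, the whole unit can be returned to the free list, since every object in it has either been migrated or cleared.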
The processing component 602 may include one or more processors that execute computer instructions to perform all or some of the steps of the methods described above. Alternatively, the processing component may be implemented as one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components configured to perform the above-described methods.
The storage component 601 is configured to store various types of data to support operation of the electronic device. The storage component may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The electronic device may further comprise a display component 603, and the display component 603 may be an Electroluminescent (EL) element, a liquid crystal display or a micro-display having a similar structure, or a retina-directable display or a similar laser scanning type display.
Of course, the electronic device may also comprise other components, such as input/output interfaces and communication components.
The input/output interface provides an interface between the processing components and peripheral interface modules, which may be output devices, input devices, etc.
The communication component is configured to facilitate wired or wireless communication between the electronic device and other devices, and the like.
The electronic device may be a physical device or an elastic computing host provided by a cloud computing platform. When the electronic device is a cloud server, the processing component, the storage component, and the like may be basic server resources rented or purchased from the cloud computing platform.
The embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a computer, the data processing method of the embodiment shown in fig. 1 and/or fig. 4 may be implemented.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. A data processing method, comprising:
determining a storage unit to be recovered in the solid state disk;
detecting whether the heat information of the data object is stored in a first memory cache region or not aiming at any data object in the storage unit to be recovered;
if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region;
if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object;
and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
2. The method of claim 1, further comprising:
counting the access characteristics of the data objects in the second memory cache region and the solid state disk;
according to the statistical result, determining the data object with the access characteristic meeting the preset access condition and the heat information thereof;
and storing the heat information of the data objects meeting the preset access condition in a queue form in the first memory buffer area.
3. The method of claim 2, wherein the dequeue time of data in the head of the queue is later than the dequeue time of data in the tail of the queue;
correspondingly, after storing the heat information of the data objects meeting the preset access condition in the first memory buffer area in a queue form, the method further includes:
and if the data object meeting the preset access condition is detected to be accessed currently, adjusting the position of the heat information of the data object meeting the preset access condition in the queue to the head of the queue.
4. The method according to claim 2, wherein before storing the heat information of the data objects satisfying the preset access condition in the first memory buffer area in a queue form, the method further comprises:
and setting the queue length of the queue according to a preset cache hit rate and/or the data volume of the data object meeting a preset access condition.
5. The method of claim 1, wherein prior to migrating the data of the data object to the second memory cache, the method further comprises:
judging whether the data volume of the data to be migrated is smaller than a preset first quantity threshold value or not;
and if so, executing the step of migrating the data of the data object to a second memory cache region.
6. The method of claim 1, wherein determining the storage units to be recycled in the solid state disk comprises:
and selecting a storage unit to be recovered from at least one non-idle storage unit of the solid state disk according to the identification of the non-idle storage unit recorded in the first list.
7. The method of claim 6, wherein after purging the data of the data object, the method further comprises:
and deleting the identifier of the storage unit to be recovered from the first list, and adding the identifier of the storage unit to be recovered to a second list recording the identifier of a free storage unit.
8. The method of claim 6, wherein before selecting the storage unit to be reclaimed from the at least one non-free storage unit of the solid state disk according to the identification of the non-free storage unit recorded in the first list, the method further comprises:
initializing a first list, wherein the record in the initialized first list is null;
after detecting that the free storage units in the solid state disk are stored with data, determining the free storage units with the stored data as non-free storage units, and adding the identification of the non-free storage units to the first list.
9. The method according to claim 1, wherein after the data in the second memory buffer area is stored in the free storage unit of the solid state disk, the method further comprises:
deleting the identifier of the free storage unit in which the data is stored from a second list recording identifiers of free storage units, and adding the identifier of that free storage unit to the first list.
10. The method of claim 1, further comprising:
acquiring data of a data object to be cached from a mechanical hard disk;
storing the data of the data object to be cached to a second memory cache region, and storing the heat information of the data object to be cached to a first memory cache region;
when the second memory cache region meets a preset trigger condition, storing data in the second memory cache region to a first idle storage unit of the solid state disk;
the heat information in the first memory cache region is used for determining data migrated to the second memory cache region and data to be cleared in the storage unit to be recovered of the solid state disk.
11. The method of claim 10, wherein storing the data of the data object to be cached in a second memory buffer comprises:
detecting whether the current storage capacity of the second memory cache region meets the storage capacity required by the data object to be cached;
if so, storing the data of the data object to be cached in the second memory cache region;
if not, storing the data in the second memory cache region into a second idle storage unit in the solid state disk, emptying the data in the second memory cache region, and storing the data of the data object to be cached into the emptied second memory cache region.
12. The method according to claim 10, wherein before storing the data of the data object to be cached in the second memory buffer, the method further comprises:
and determining the second memory cache area with the total storage capacity equal to that of the free storage units in the memory.
13. A data processing apparatus, comprising:
the determining module is used for determining a storage unit to be recovered in the solid state disk;
the processing module is used for detecting whether the heat information of the data object is stored in a first memory cache region or not aiming at any data object in the storage unit to be recovered; if the heat information of the data object is stored in the first memory cache region, migrating the data of the data object to a second memory cache region; if the heat information of the data object is not stored in the first memory cache region, clearing the data of the data object; and when the second memory cache region meets a preset trigger condition, storing the data in the second memory cache region to an idle storage unit of the solid state disk.
14. An electronic device comprising a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions for execution by the processing component to perform the data processing method of any of claims 1 to 12.
15. A computer storage medium storing a computer program which, when executed by a computer, implements a data processing method according to any one of claims 1 to 12.
CN202110772517.2A 2021-07-08 2021-07-08 Data processing method and device, electronic equipment and storage medium Pending CN113672166A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110772517.2A CN113672166A (en) 2021-07-08 2021-07-08 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110772517.2A CN113672166A (en) 2021-07-08 2021-07-08 Data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113672166A 2021-11-19

Family

ID=78538722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110772517.2A Pending CN113672166A (en) 2021-07-08 2021-07-08 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113672166A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080155183A1 (en) * 2006-12-18 2008-06-26 Zhiqing Zhuang Method of managing a large array of non-volatile memories
CN106227598A (en) * 2016-07-20 2016-12-14 浪潮电子信息产业股份有限公司 A kind of recovery method of cache resources
CN108776614A (en) * 2018-05-03 2018-11-09 华为技术有限公司 The recovery method and device of memory block
CN111159066A (en) * 2020-01-07 2020-05-15 杭州电子科技大学 Dynamically-adjusted cache data management and elimination method
CN112463057A (en) * 2020-11-28 2021-03-09 济南华芯算古信息科技有限公司 Intelligent garbage recycling method and device compatible with NVMe solid state disk
CN112905129A (en) * 2021-05-06 2021-06-04 蚂蚁金服(杭州)网络技术有限公司 Method and device for eliminating cache memory block and electronic equipment


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328065A (en) * 2022-09-16 2022-11-11 中国核动力研究设计院 Method for automatically migrating control unit functions applied to industrial control system
CN115543221A (en) * 2022-11-29 2022-12-30 苏州浪潮智能科技有限公司 Data migration method and device for solid state disk, electronic equipment and storage medium
CN116301644A (en) * 2023-03-24 2023-06-23 四川水利职业技术学院 Data storage method, system, terminal and medium based on multi-hard disk coordination
CN116301644B (en) * 2023-03-24 2023-10-13 四川水利职业技术学院 Data storage method, system, terminal and medium based on multi-hard disk coordination

Similar Documents

Publication Publication Date Title
US9767140B2 (en) Deduplicating storage with enhanced frequent-block detection
Eisenman et al. Flashield: a hybrid key-value cache that controls flash write amplification
US10380035B2 (en) Using an access increment number to control a duration during which tracks remain in cache
EP3229142B1 (en) Read cache management method and device based on solid state drive
CN113672166A (en) Data processing method and device, electronic equipment and storage medium
CN108268219B (en) Method and device for processing IO (input/output) request
US8595451B2 (en) Managing a storage cache utilizing externally assigned cache priority tags
US9342458B2 (en) Cache allocation in a computerized system
CN106547476B (en) Method and apparatus for data storage system
KR20120090965A (en) Apparatus, system, and method for caching data on a solid-state strorage device
EP3252609A1 (en) Cache data determination method and device
CN105095116A (en) Cache replacing method, cache controller and processor
CN109086141B (en) Memory management method and device and computer readable storage medium
US11620219B2 (en) Storage drive dependent track removal in a cache for storage
CN108845957B (en) Replacement and write-back self-adaptive buffer area management method
US11169968B2 (en) Region-integrated data deduplication implementing a multi-lifetime duplicate finder
CN109144431B (en) Data block caching method, device, equipment and storage medium
KR101105127B1 (en) Buffer cache managing method using ssdsolid state disk extension buffer and apparatus for using ssdsolid state disk as extension buffer
CN111459402B (en) Magnetic disk controllable buffer writing method, controller, hybrid IO scheduling method and scheduler
US11301395B2 (en) Method and apparatus for characterizing workload sequentiality for cache policy optimization
CN112379841A (en) Data processing method and device and electronic equipment
CN111290974A (en) Cache elimination method for storage device and storage device
EP4307129A1 (en) Method for writing data into solid-state hard disk
CN111290975A (en) Method for processing read command and pre-read command by using unified cache and storage device thereof
US20140359228A1 (en) Cache allocation in a computerized system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination