WO2023174394A1 - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
WO2023174394A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage block
data
valid data
reads
unit
Prior art date
Application number
PCT/CN2023/082053
Other languages
English (en)
French (fr)
Inventor
邓京涛
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2023174394A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Definitions

  • This application relates to the field of data storage technology, and in particular, to a data processing method and device.
  • Solid-state drives (SSD)
  • The flash memory chip in a solid-state drive contains multiple blocks, and each block contains multiple pages.
  • Garbage collection: the valid data in pages that contain both invalid and valid data must be collected into new blocks, freeing the space occupied by the invalid data for use by valid data. This process is called garbage collection.
  • After the SSD main controller reads data from a page once, it adds one to the read count of the current block A, writes it to memory (double data rate, DDR), and can obtain the read count of the page.
  • When the read count reaches the preset value, a random number generator generates a random value within a certain range. When this value equals the preset value, garbage collection is triggered: the data currently being read is moved to a new block B, and the location of the read data in block A is marked invalid.
  • This method has the following disadvantage: before the data in the block reaches the preset read value, all of it may be valid (assume the amount of valid data is n). Because the preset probability is fixed (assume it is p), once the read count reaches the preset value, the probability of the first move in the current block is p, that is, (n/n)*p.
  • After one move, the valid data in the current block becomes n-1, so the probability of the next move is (n-1)/n*p, and the probability of the move after that is (n-2)/n*p.
  • This application provides a data processing method and device that smooth out the poorly balanced garbage collection caused by read interference and improve the user's read and write experience. This solves the problem of large fluctuations in read and write performance in the prior art.
  • A data processing method includes the following steps: obtaining the optimal performance value, the maximum number of reads, and the number of unit data of a storage block; determining the first read threshold of the storage block based on the optimal performance value, the maximum number of reads, and the number of unit data; obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored; and determining the execution probability value for garbage collection based on the optimal performance value, the number of valid data, and the maximum amount of valid data that can be stored.
  • Determining the first read threshold of the storage block specifically includes: obtaining the maximum number of reads of the storage block and the number of unit data of the storage block; and determining the first read threshold of the storage block from the optimal performance value of the storage block, based on the following formula:
  • Determining the execution probability value for garbage collection specifically includes: obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored in the storage block; and determining the execution probability value for garbage collection from the optimal performance value of the storage block, based on the following formula:
  • Performing garbage collection based on the execution probability value specifically includes: obtaining the determined execution probability value for garbage collection; if the execution probability value is not less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value; and if the execution probability value is greater than 0 and less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold equals the execution probability value.
  • The method further includes: determining a second read threshold of the storage block based on the optimal performance value; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the second read threshold; and, if the number of reads exceeds the second read threshold, continuing to determine whether valid data exists in the storage block.
  • If the number of reads exceeds the second read threshold, continuing to determine whether valid data exists in the storage block specifically includes: if valid data exists in the storage block, performing garbage collection on the valid data in the storage block.
  • the garbage collection process includes: writing valid data in the storage block into the second storage block, erasing corresponding invalid data in the storage block, and updating the address mapping table corresponding to the valid data.
  • A data processing device includes: an acquisition unit, used to obtain the optimal performance value, the maximum number of reads, and the number of unit data of a storage block, and also used to obtain the number of valid data in the storage block and the maximum amount of valid data that can be stored; a first calculation unit, communicatively connected to the acquisition unit, which determines the first read threshold of the storage block based on the optimal performance value, the maximum number of reads, and the number of unit data, and which also determines the execution probability value for garbage collection based on the number of valid data and the maximum amount of valid data that can be stored; and a first judgment unit, communicatively connected to the calculation unit, which obtains the number of reads of valid data in the storage block and determines whether the number of reads is greater than the first read threshold. If it is, garbage collection is performed based on the execution probability value.
  • The device further includes: a second calculation unit, communicatively connected to the acquisition unit and used to determine the second read threshold of the storage block based on the optimal performance value;
  • a second judgment unit, communicatively connected to the second calculation unit, which obtains the number of reads of valid data in the storage block and determines whether the number of reads is greater than the second read threshold; if the number of reads exceeds the second read threshold, it continues to determine whether valid data exists in the storage block;
  • and a third judgment unit, communicatively connected to the second judgment unit and used to continue determining whether valid data exists in the storage block. If valid data exists in the storage block, the valid data in the storage block is garbage collected.
  • The device further includes: a garbage collection unit, used to write the valid data in the storage block into the second storage block, erase the corresponding invalid data in the storage block, and update the address mapping table corresponding to the valid data.
  • The data processing method of this application includes: obtaining the optimal performance value of the storage block; determining, based on the optimal performance value, the first read threshold of the storage block and the execution probability value for garbage collection; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the first read threshold; and, if the number of reads is greater than the first read threshold, performing garbage collection based on the execution probability value. The data processing method of this application smooths out the poorly balanced garbage collection caused by read interference, improves the user's read and write experience, and solves the problem of large fluctuations in read and write performance in the prior art.
  • Figure 1 is a method flow chart of this application
  • Figure 2 is a schematic diagram of the relationship between the first reading threshold and the second reading threshold of the present application
  • Figure 3 is a structural diagram of the device of this application.
  • Read interference: when the flash memory holds valid data, performing data operations on a page many times may interfere with data operations on other pages in the same block and cause read errors.
  • Garbage collection: the valid data in pages that contain both invalid and valid data is collected into new blocks, freeing the space occupied by invalid data for use by valid data. This process is called garbage collection.
  • Figure 1 is a flow chart of a method in some embodiments.
  • The data processing method in some embodiments includes the following steps: step S1, obtain the optimal performance value of the storage block; step S2, based on the optimal performance value, determine the first read threshold of the storage block and the execution probability value for garbage collection; step S3, obtain the number of reads of valid data in the storage block and determine whether the number of reads is greater than the first read threshold; if the number of reads is greater than the first read threshold, perform garbage collection based on the execution probability value.
  • The method further includes: step S0, receiving a valid-data read request sent by the target user, using the request to determine the target valid data to be read, and determining the storage block in which the target valid data is located based on the target valid data to be read and the address mapping table corresponding to the target valid data.
  • The method also includes: step S4, determining the second read threshold of the storage block based on the optimal performance value; step S5, obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the second read threshold; if the number of reads exceeds the second read threshold, continuing to determine whether valid data exists in the storage block; step S6, if valid data exists in the storage block, performing garbage collection on the valid data in the storage block.
  • FIG. 2 is a schematic diagram of the relationship between the first read threshold and the second read threshold, where B represents the first read threshold and A represents the second read threshold.
  • Determining the first read threshold of the storage block specifically includes: obtaining the maximum number of reads of the storage block and the number of unit data of the storage block; and determining the first read threshold of the storage block from the optimal performance value of the storage block, based on the following formula:
  • The optimal performance value is usually defined during actual testing as the maximum value that does not affect the data read performance of the storage block. That is, while a storage block is being read and written, its read performance is affected by internal data moves, and once the number of data moves reaches a fourth threshold, the read performance of the storage block drops sharply. In practice, therefore, this fourth threshold is defined as the optimal performance value. It should be understood that the optimal performance values in this application are all numbers greater than 0 and less than 1.
  • If the optimal performance value is greater than 1, its reciprocal is taken. Given the read characteristics of storage blocks, the more times a block that stores data is read, the less stable the data stored in it becomes.
  • The maximum number of reads of a storage block in this application is the maximum number of times the data in the storage block can be read while the data remains highly accurate and highly stable. The number of unit data of a storage block is the maximum number of unit data that can be stored in one block. The number of valid data in a storage block is the number of valid data stored in one block. The maximum amount of valid data that can be stored is the maximum amount of valid data that one block can store.
  • Determining the execution probability value for garbage collection specifically includes: obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored in the storage block; and determining the execution probability value for garbage collection from the optimal performance value of the storage block, based on the following formula:
  • The maximum amount of valid data that can be stored in the storage block is the capacity of the storage block, that is, the rated amount of valid data the storage block can hold. The number of valid data in the storage block is the number of valid data currently stored in the storage block.
  • Performing garbage collection based on the execution probability value specifically includes: obtaining the determined execution probability value for garbage collection; if the execution probability value is not less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value. That is, once the number of reads exceeds the first read threshold, garbage collection of one data unit is performed each time the number of reads grows by an amount equal to the execution probability value. If the execution probability value is greater than 0 and less than 1, garbage collection of one data unit is performed when the difference between the number of reads and the first read threshold equals the execution probability value.
  • the garbage collection process includes: writing valid data in the storage block into the second storage block, erasing corresponding invalid data in the storage block, and updating the address mapping table corresponding to the valid data.
  • Erasing the corresponding invalid data in the storage block specifically includes: once the valid data in the storage block has been written into the second storage block, the corresponding valid data becomes invalid data that remains in the storage block; the corresponding invalid data in the storage block is then erased so that the storage block can be used as a new storage unit for valid data.
  • Updating the address mapping table corresponding to the valid data means updating the address of the storage block that originally held the valid data to the address of the second storage block. It should be understood that writing the valid data in the storage block to the second storage block is the data move; the invalid data in this application is valid data that has been invalidated.
  • The data processing method in some embodiments includes the following steps: obtaining the optimal performance value of the storage block; determining, based on the optimal performance value, the first read threshold of the storage block and the execution probability value for garbage collection; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the first read threshold; and, if the number of reads is greater than the first read threshold, performing garbage collection based on the execution probability value.
  • Determining the first read threshold of the storage block specifically includes: obtaining the maximum number of reads of the storage block and the number of unit data of the storage block; and determining the first read threshold of the storage block from the optimal performance value of the storage block, based on the following formula:
  • Determining the execution probability value for garbage collection specifically includes: obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored in the storage block; and determining the execution probability value for garbage collection from the optimal performance value of the storage block, based on the following formula:
  • Performing garbage collection based on the execution probability value specifically includes: obtaining the determined execution probability value for garbage collection; if the execution probability value is not less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value; and if the execution probability value is greater than 0 and less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold equals the execution probability value.
  • The method further includes: determining a second read threshold of the storage block based on the optimal performance value; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the second read threshold; and, if the number of reads exceeds the second read threshold, continuing to determine whether valid data exists in the storage block.
  • The second read threshold in this application acts as a critical value. Its specific size can be determined by those skilled in the art according to the needs of the actual application scenario.
  • the garbage collection process includes: writing valid data in the storage block into the second storage block, erasing corresponding invalid data in the storage block, and updating the address mapping table corresponding to the valid data.
  • Although the steps in the flowchart of FIG. 1 are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on their execution, and they can be executed in other orders. Moreover, at least some of the steps in FIG. 1 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times. Nor are they necessarily executed one after another; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
  • Figure 3 is a device structure diagram in some embodiments of the present application.
  • The data processing device in some embodiments of the present application includes: an acquisition unit, used to obtain the optimal performance value of the storage block; a first calculation unit, communicatively connected to the acquisition unit and used to determine, based on the optimal performance value, the first read threshold of the storage block and the execution probability value for garbage collection; and a first judgment unit, communicatively connected to the calculation unit, which obtains the number of reads of valid data in the storage block and determines whether the number of reads is greater than the first read threshold. If it is, garbage collection is performed based on the execution probability value.
  • The device further includes: a second calculation unit, communicatively connected to the acquisition unit and used to determine the second read threshold of the storage block based on the optimal performance value;
  • a second judgment unit, communicatively connected to the second calculation unit, which obtains the number of reads of valid data in the storage block and determines whether the number of reads is greater than the second read threshold; if the number of reads exceeds the second read threshold, it continues to determine whether valid data exists in the storage block; and a third judgment unit, communicatively connected to the second judgment unit and used to continue determining whether valid data exists in the storage block. If valid data exists in the storage block, the valid data in the storage block is garbage collected.
  • The device further includes: a garbage collection unit, communicatively connected to the first judgment unit and the third judgment unit, used to write the valid data in the storage block into the second storage block, erase the corresponding invalid data in the storage block, and update the address mapping table corresponding to the valid data.
  • Each module in the above data processing device can be implemented in whole or in part by software, hardware and combinations thereof.
  • Each of the above modules may be embedded in or independent of the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • Some embodiments of the present application provide a computer-readable storage medium that stores a program. When the program is executed by a processor, the processor is caused to execute the steps of the data processing method in Embodiment 1 and Embodiment 2 above.
  • The embodiments of the present application may be provided as methods, systems, or computer program products. They may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, they may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System (AREA)

Abstract

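This application relates to a data processing method and device. The method includes the following steps: obtaining the optimal performance value of a storage block; determining, based on the optimal performance value, a first read threshold of the storage block and an execution probability value for garbage collection; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the first read threshold; and, if the number of reads is greater than the first read threshold, performing garbage collection based on the execution probability value. The data processing method of this application smooths out the poorly balanced garbage collection caused by read interference, improves the user's read and write experience, and solves the problem of large fluctuations in read and write performance.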

Description

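Data processing method and device
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202210268230.0, titled "一种数据处理方法和装置" (Data processing method and device), filed with the China Patent Office on March 18, 2022, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of data storage technology, and in particular to a data processing method and device.
Background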
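A solid-state drive (SSD) is a drive built from an array of solid-state electronic storage chips and consists of a control unit and a storage unit. The flash memory chips in an SSD contain multiple blocks, and each block contains multiple pages. When data operations are performed on a page many times, they may interfere with data operations on other pages in the same block and cause read errors. The valid data in pages that contain both invalid and valid data therefore has to be collected into a new block, freeing the space occupied by the invalid data for use by valid data. This process is called garbage collection.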
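In the prior art, after the SSD main controller reads data from a page once, it adds one to the read count of the current block A and writes it to memory (double data rate, DDR); it can also obtain the read count of the page. When that count reaches a preset value, a random number generator produces a random value within a certain range. When this value equals the preset value, garbage collection is triggered: the data currently being read is moved to a new block B, and the location of the read data in block A is marked invalid. Analysis shows that this method has the following drawback. Before the data in the block reaches the preset read value, all of it may be valid (assume the amount of valid data is n). Because the preset probability is fixed (assume it is p), once the read count reaches the preset value, the probability of the first move in the current block is p, that is, (n/n)*p. After one move, the valid data in the current block becomes n-1, so the probability of the next move is (n-1)/n*p, and the probability of the move after that is (n-2)/n*p. By extension, the probability of a move in the current block gradually decreases as the number of moves increases. In other words, in practice the probability of garbage collection in the current block falls linearly, and it is highest just after the preset value is reached.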
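The decreasing sequence described in the background can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name `prior_art_move_probabilities` and the sample values n = 4 and p = 0.5 are assumptions chosen for the example and do not come from the application.

```python
def prior_art_move_probabilities(n: int, p: float) -> list[float]:
    """Probability of each successive move in the prior-art scheme.

    With k valid data units left out of n and a fixed preset probability p,
    the background gives the probability of the next move as (k / n) * p.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of valid data units")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    # k runs n, n-1, ..., 1: one valid unit leaves the block per move.
    return [(k / n) * p for k in range(n, 0, -1)]


probs = prior_art_move_probabilities(4, 0.5)
print(probs)  # [0.5, 0.375, 0.25, 0.125]
```

The first entry equals p and every later entry is smaller by p/n, which is the linear decline the background identifies as the source of uneven garbage collection.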
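In practical scenarios, both user reads and writes of data inside the SSD and SSD garbage collection consume CPU computing power and cache resources, and both are limited. For a good user experience, the amount of garbage collection performed at any time must not be too large and must not fluctuate noticeably. The prior-art garbage collection scheme runs counter to this expectation.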
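Summary of the invention
To solve the above technical problem, this application provides a data processing method and device that smooth out the poorly balanced garbage collection caused by read interference and improve the user's read and write experience, solving the problem of large fluctuations in read and write performance in the prior art.
To achieve the above purpose, this application proposes a first technical solution: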
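A data processing method, including the following steps: obtaining the optimal performance value, the maximum number of reads, and the number of unit data of a storage block; determining a first read threshold of the storage block based on the optimal performance value, the maximum number of reads, and the number of unit data; obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored; determining an execution probability value for garbage collection based on the optimal performance value, the number of valid data, and the maximum amount of valid data that can be stored; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the first read threshold; and, if the number of reads is greater than the first read threshold, performing garbage collection based on the execution probability value.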
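In some embodiments of this application, determining the first read threshold of the storage block specifically includes: obtaining the maximum number of reads of the storage block and the number of unit data of the storage block; and determining the first read threshold of the storage block from the optimal performance value of the storage block, based on the following formula:
In some embodiments of this application, determining the execution probability value for garbage collection specifically includes: obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored in the storage block; and determining the execution probability value for garbage collection from the optimal performance value of the storage block, based on the following formula: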
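In some embodiments of this application, performing garbage collection based on the execution probability value if the number of reads is greater than the first read threshold specifically includes: obtaining the determined execution probability value for garbage collection; if the execution probability value is not less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value; and if the execution probability value is greater than 0 and less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold equals the execution probability value.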
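In some embodiments of this application, the method further includes: determining a second read threshold of the storage block based on the optimal performance value; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the second read threshold; and, if the number of reads exceeds the second read threshold, continuing to determine whether valid data exists in the storage block.
In some embodiments of this application, continuing to determine whether valid data exists in the storage block if the number of reads exceeds the second read threshold specifically includes: if valid data exists in the storage block, performing garbage collection on the valid data in the storage block.
In some embodiments of this application, garbage collection includes: writing the valid data in the storage block into a second storage block, erasing the corresponding invalid data in the storage block, and updating the address mapping table corresponding to the valid data.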
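To achieve the above purpose, this application also proposes a second technical solution:
A data processing device, including: an acquisition unit, used to obtain the optimal performance value, the maximum number of reads, and the number of unit data of a storage block, and also used to obtain the number of valid data in the storage block and the maximum amount of valid data that can be stored; a first calculation unit, communicatively connected to the acquisition unit, which determines the first read threshold of the storage block based on the optimal performance value, the maximum number of reads, and the number of unit data, and which is also used to determine the execution probability value for garbage collection based on the number of valid data and the maximum amount of valid data that can be stored; and a first judgment unit, communicatively connected to the calculation unit, which obtains the number of reads of valid data in the storage block and determines whether the number of reads is greater than the first read threshold; if it is, garbage collection is performed based on the execution probability value.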
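In some embodiments of this application, the device further includes: a second calculation unit, communicatively connected to the acquisition unit and used to determine the second read threshold of the storage block based on the optimal performance value; a second judgment unit, communicatively connected to the second calculation unit, which obtains the number of reads of valid data in the storage block and determines whether the number of reads is greater than the second read threshold, and, if the number of reads exceeds the second read threshold, continues to determine whether valid data exists in the storage block; and a third judgment unit, communicatively connected to the second judgment unit and used to continue determining whether valid data exists in the storage block; if it does, the valid data in the storage block is garbage collected.
In some embodiments of this application, the device further includes: a garbage collection unit, used to write the valid data in the storage block into the second storage block, erase the corresponding invalid data in the storage block, and update the address mapping table corresponding to the valid data.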
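Compared with the prior art, the above technical solutions of this application have the following advantages:
The data processing method of this application includes: obtaining the optimal performance value of the storage block; determining, based on the optimal performance value, the first read threshold of the storage block and the execution probability value for garbage collection; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the first read threshold; and, if the number of reads is greater than the first read threshold, performing garbage collection based on the execution probability value. The data processing method of this application smooths out the poorly balanced garbage collection caused by read interference, improves the user's read and write experience, and solves the problem of large fluctuations in read and write performance in the prior art.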
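Brief description of the drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flow chart of the method of this application;
Figure 2 is a schematic diagram of the relationship between the first read threshold and the second read threshold of this application;
Figure 3 is a structural diagram of the device of this application.
Detailed description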
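To make the purpose, technical solutions, and advantages of this application clearer, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
Some terms used in the embodiments of this application are introduced below:
Read interference: when the flash memory holds valid data, performing data operations on a page many times may interfere with data operations on other pages in the same block and cause read errors.
Garbage collection: the valid data in pages that contain both invalid and valid data is collected into a new block, freeing the space occupied by invalid data for use by valid data. This process is called garbage collection.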
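Referring to Figure 1, Figure 1 is a flow chart of the method in some embodiments.
The data processing method in some embodiments includes the following steps: step S1, obtain the optimal performance value of the storage block; step S2, based on the optimal performance value, determine the first read threshold of the storage block and the execution probability value for garbage collection; step S3, obtain the number of reads of valid data in the storage block and determine whether the number of reads is greater than the first read threshold; if the number of reads is greater than the first read threshold, perform garbage collection based on the execution probability value.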
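In some implementations, the method further includes: step S0, receiving a valid-data read request sent by a target user, using the request to determine the target valid data to be read, and determining the storage block in which the target valid data is located based on the target valid data to be read and the address mapping table corresponding to the target valid data.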
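In some implementations, the method further includes: step S4, determining the second read threshold of the storage block based on the optimal performance value; step S5, obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the second read threshold; if the number of reads exceeds the second read threshold, continuing to determine whether valid data exists in the storage block; step S6, if valid data exists in the storage block, performing garbage collection on the valid data in the storage block. Figure 2 shows the relationship between the first read threshold and the second read threshold, where B represents the first read threshold and A represents the second read threshold.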
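Steps S3 and S5 to S6 together can be read as a three-way decision on the read count. The Python sketch below is one possible reading, not the application's implementation. It assumes that the second read threshold is larger than the first and is checked first; the text does not state either point, and only calls the second threshold a critical value. The names `Action` and `decide_action` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"            # no garbage collection for this read
    PACED_GC = "paced_gc"    # collect according to the execution probability value (S3)
    FORCED_GC = "forced_gc"  # collect the remaining valid data (S5 and S6)


def decide_action(read_count: int, first_threshold: int,
                  second_threshold: int, valid_count: int) -> Action:
    """Pick the garbage collection action for a storage block.

    Assumption (not stated in the source): second_threshold > first_threshold,
    and the second threshold is tested before the first.
    """
    if read_count > second_threshold:
        # S5/S6: past the critical value, collect whatever valid data remains.
        return Action.FORCED_GC if valid_count > 0 else Action.NONE
    if read_count > first_threshold:
        # S3: collect according to the execution probability value.
        return Action.PACED_GC
    return Action.NONE


print(decide_action(150, first_threshold=100, second_threshold=200, valid_count=10))
```

Checking the larger threshold first keeps the two branches mutually exclusive; a controller that evaluates them independently would need its own rule for reads that satisfy both conditions.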
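In some implementations, determining the first read threshold of the storage block specifically includes: obtaining the maximum number of reads of the storage block and the number of unit data of the storage block; and determining the first read threshold of the storage block from the optimal performance value of the storage block, based on the following formula:
Here, it should be understood that this application does not limit the specific value of the optimal performance value; those skilled in the art determine it from the actual performance of the storage block. In general, the optimal performance value is defined during actual testing as the maximum value that does not affect the data read performance of the storage block. That is, while a storage block is being read and written, its read performance is affected by internal data moves, and once the number of data moves reaches a fourth threshold, the read performance of the storage block drops sharply. In practice, therefore, this fourth threshold is defined as the optimal performance value. It should be understood that the optimal performance values in this application are all numbers greater than 0 and less than 1; in practical use, if the optimal performance value is greater than 1, its reciprocal is taken. Given the read characteristics of storage blocks, the more times a block that stores data is read, the less stable the data stored in it becomes. The maximum number of reads of a storage block in this application is therefore the maximum number of times the data in the storage block can be read while the data remains highly accurate and highly stable. The number of unit data of a storage block is the maximum number of unit data that can be stored in one block. The number of valid data in a storage block is the number of valid data stored in one block. The maximum amount of valid data that can be stored is the maximum amount of valid data that one block can store.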
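The reciprocal rule for the optimal performance value is simple enough to sketch directly. The helper name `normalize_optimal_performance` is hypothetical, and the handling of a value exactly equal to 1 is an assumption, since the text only covers values between 0 and 1 and values greater than 1.

```python
def normalize_optimal_performance(value: float) -> float:
    """Map a measured optimal performance value into the range the text expects.

    The description says the value should be greater than 0 and less than 1,
    and that the reciprocal is taken when the measured value is greater than 1.
    """
    if value <= 0:
        raise ValueError("optimal performance value must be positive")
    if value > 1:
        return 1.0 / value
    # Assumption: a value of exactly 1 is returned unchanged; the source
    # does not say how this boundary case is treated.
    return value


print(normalize_optimal_performance(4.0))  # 0.25
```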
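In some implementations, determining the execution probability value for garbage collection specifically includes: obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored in the storage block; and determining the execution probability value for garbage collection from the optimal performance value of the storage block, based on the following formula:
Here, the maximum amount of valid data that can be stored in the storage block is the capacity of the storage block, that is, the rated amount of valid data the storage block can hold; the number of valid data in the storage block is the number of valid data currently stored in the storage block.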
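In some implementations, performing garbage collection based on the execution probability value if the number of reads is greater than the first read threshold specifically includes: obtaining the determined execution probability value for garbage collection; if the execution probability value is not less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value. That is, once the number of reads exceeds the first read threshold, garbage collection of one data unit is performed each time the number of reads grows by an amount equal to the execution probability value. If the execution probability value is greater than 0 and less than 1, garbage collection of one data unit is performed when the difference between the number of reads and the first read threshold equals the execution probability value.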
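The trigger rule in this paragraph can be written as a small predicate. The sketch below takes the execution probability value as an input because its formula is not reproduced in this text. The function name `should_collect_one_unit` and the tolerance `eps` are hypothetical. For a value between 0 and 1 the text gives the condition "difference equals the value" without explaining how it is matched against an integer read count, so the sketch encodes the stated comparison literally.

```python
import math


def should_collect_one_unit(read_count: int, first_threshold: int,
                            exec_value: float, eps: float = 1e-9) -> bool:
    """Return True when one data unit should be garbage collected on this read."""
    diff = read_count - first_threshold
    if diff <= 0 or exec_value <= 0:
        # The rule only applies once the read count exceeds the first threshold.
        return False
    if exec_value >= 1:
        # Collect when the difference is an integer multiple of the value.
        ratio = diff / exec_value
        return abs(ratio - round(ratio)) < eps
    # 0 < exec_value < 1: literal reading of "difference equals the value".
    return math.isclose(diff, exec_value, abs_tol=eps)


# With a first threshold of 100 and a value of 3, one unit is collected
# on every third read after the threshold.
triggers = [r for r in range(100, 110) if should_collect_one_unit(r, 100, 3)]
print(triggers)  # [103, 106, 109]
```

Spacing collections evenly in this way is what keeps the garbage collection load steady, in contrast to the declining probability of the prior-art scheme.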
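In some implementations, garbage collection includes: writing the valid data in the storage block into a second storage block, erasing the corresponding invalid data in the storage block, and updating the address mapping table corresponding to the valid data. It should be understood that erasing the corresponding invalid data in the storage block specifically means the following: once the valid data in the storage block has been written into the second storage block, the corresponding valid data becomes invalid data that remains in the storage block; the corresponding invalid data in the storage block is then erased so that the storage block can be used as a new storage unit for valid data. Updating the address mapping table corresponding to the valid data means updating the address of the storage block that originally held the valid data to the address of the second storage block. It should be understood that writing the valid data in the storage block into the second storage block is the data move, and that the invalid data in this application is valid data that has been invalidated.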
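The three parts of garbage collection described here (move, erase, remap) can be modeled in memory as follows. This is a toy sketch under stated assumptions: the `Block` class, the page-indexed layout, and a mapping table of the form logical key to (block id, page index) are all illustrative choices, not structures defined by the application.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    block_id: int
    pages: dict[int, bytes] = field(default_factory=dict)  # page index -> stored data
    valid: set[int] = field(default_factory=set)           # page indices holding valid data


def garbage_collect(src: Block, dst: Block,
                    mapping: dict[str, tuple[int, int]]) -> None:
    """Move valid data from src to dst, remap it, then erase src."""
    # Reverse lookup: physical location in src -> logical key.
    reverse = {loc: key for key, loc in mapping.items() if loc[0] == src.block_id}
    for page in sorted(src.valid):
        new_page = max(dst.pages, default=-1) + 1  # next free page index in dst
        dst.pages[new_page] = src.pages[page]      # write valid data into the second block
        dst.valid.add(new_page)
        key = reverse.get((src.block_id, page))
        if key is not None:
            mapping[key] = (dst.block_id, new_page)  # update the address mapping table
    # Everything left in src is now invalid, so the block can be erased and reused.
    src.valid.clear()
    src.pages.clear()


src = Block(0, {0: b"a", 1: b"b", 2: b"c"}, {0, 2})
dst = Block(1)
table = {"x": (0, 0), "y": (0, 2)}
garbage_collect(src, dst, table)
print(table)  # {'x': (1, 0), 'y': (1, 1)}
```

The sketch moves all valid data at once; the paced variant described earlier would move one data unit per trigger and erase the block only after the last unit has left it.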
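The data processing method in some embodiments includes the following steps: obtaining the optimal performance value of the storage block; determining, based on the optimal performance value, the first read threshold of the storage block and the execution probability value for garbage collection; obtaining the number of reads of valid data in the storage block and determining whether the number of reads is greater than the first read threshold; and, if the number of reads is greater than the first read threshold, performing garbage collection based on the execution probability value.
In some implementations, determining the first read threshold of the storage block specifically includes: obtaining the maximum number of reads of the storage block and the number of unit data of the storage block; and determining the first read threshold of the storage block from the optimal performance value of the storage block, based on the following formula:
In some implementations, determining the execution probability value for garbage collection specifically includes: obtaining the number of valid data in the storage block and the maximum amount of valid data that can be stored in the storage block; and determining the execution probability value for garbage collection from the optimal performance value of the storage block, based on the following formula:
In some implementations, performing garbage collection based on the execution probability value if the number of reads is greater than the first read threshold specifically includes: obtaining the determined execution probability value for garbage collection; if the execution probability value is not less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value; and if the execution probability value is greater than 0 and less than 1, performing garbage collection of one data unit when the difference between the number of reads and the first read threshold equals the execution probability value.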
在其中一些实施方式中,方法还包括:基于最优性能值,确定存储块的第二读取阈值;获取读取存储块内有效数据的读取次数,判断读取次数是否大于第二读取阈值;若读取次数超出第二读取阈值,则继续判断存储块内是否存在有效数据。需要理解的是,本申请的第一读取阈值与第二读取阈值并非是同一概念,本申请的第二读取阈值相当于一个临界值,具体地,关于上述临界值,即第二读取阈值的具体大小可以根据实际应用场景的需要,由本领域技术人员进行确定。
在其中一些实施方式中,若读取次数超出第二读取阈值,则继续判断存储块内是否存在有效数据具体包括:若存储块内存在有效数据,则将存储块内的有效数据进行垃圾回收处理。
在其中一些实施方式中,垃圾回收处理包括:将存储块内的有效数据写入第二存储块,擦除存储块内对应的无效数据,并更新有效数据对应的地址映射表。
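It should be understood that although the steps in the flowchart of Figure 1 are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figure 1 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed sequentially, as they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Referring to Figure 3, Figure 3 is a structural diagram of the device in some embodiments of this application.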
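The data processing device in some embodiments of this application includes: an obtaining unit, configured to obtain the optimal performance value of a storage block; a first calculation unit, communicatively connected to the obtaining unit and configured to determine, based on the optimal performance value, a first read threshold of the storage block and an execution probability value for performing garbage collection; and a first judging unit, communicatively connected to the first calculation unit and configured to obtain the number of reads of valid data in the storage block, judge whether the number of reads is greater than the first read threshold, and, if the number of reads is greater than the first read threshold, perform garbage collection processing based on the execution probability value.
In some of these embodiments, the device further includes: a second calculation unit, communicatively connected to the obtaining unit and configured to determine a second read threshold of the storage block based on the optimal performance value; a second judging unit, communicatively connected to the second calculation unit and configured to obtain the number of reads of valid data in the storage block, judge whether the number of reads is greater than the second read threshold, and, if the number of reads exceeds the second read threshold, further judge whether valid data exists in the storage block; and a third judging unit, communicatively connected to the second judging unit and configured to further judge whether valid data exists in the storage block and, if valid data exists in the storage block, perform garbage collection processing on the valid data in the storage block.
In some of these embodiments, the device further includes: a garbage collection unit, communicatively connected to the first judging unit and to the third judging unit, and configured to write the valid data in the storage block to a second storage block, erase the corresponding invalid data in the storage block, and update the address mapping table corresponding to the valid data.
For specific limitations on the data processing device, refer to the limitations on the data processing method above; they are not repeated here. Each module in the data processing device may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in, or independent of, a processor in a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.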
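Some embodiments of this application provide a computer-readable storage medium storing a program which, when executed by a processor, causes the processor to perform the steps of the data processing method in Embodiment 1 and Embodiment 2 above.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Note that the above describes only the preferred embodiments of this application and the technical principles applied. Those skilled in the art will understand that this application is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the scope of protection of this application. Therefore, although this application has been described in some detail through the above embodiments, it is not limited to them and may include further equivalent embodiments without departing from the concept of this application; the scope of this application is determined by the scope of the appended claims.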

Claims (20)

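  1. A data processing method, characterized in that the method comprises the following steps:
    obtaining an optimal performance value, a maximum number of reads, and a number of unit data of a storage block; determining a first read threshold of the storage block based on the optimal performance value, the maximum number of reads, and the number of unit data;
    obtaining a number of valid data in the storage block and a maximum amount of valid data that can be stored; determining an execution probability value for performing garbage collection based on the optimal performance value, the number of valid data, and the maximum amount of valid data that can be stored;
    obtaining a number of reads of valid data in the storage block, and judging whether the number of reads is greater than the first read threshold; if the number of reads is greater than the first read threshold, performing garbage collection processing based on the execution probability value.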
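  2. The data processing method according to claim 1, characterized in that the first read threshold of the storage block is determined based on the following formula:
  3. The method according to claim 2, characterized in that the optimal performance value is the maximum value that does not affect the data read performance of the storage block in an actual test process.
  4. The method according to claim 3, characterized in that the optimal performance value is a fourth threshold corresponding to the number of data migrations.
  5. The method according to any one of claims 2 to 4, characterized in that the optimal performance value is greater than 0 and less than 1.
  6. The method according to claim 2, characterized in that the maximum amount of valid data that can be stored denotes the maximum amount of valid data that can be stored in one block.
  7. The method according to claim 2, characterized in that the maximum number of reads of the storage block is the maximum number of times the data in the storage block can be read while the data in the storage block has high accuracy and high stability.
  8. The method according to claim 2, characterized in that the number of unit data of the storage block is the maximum number of unit data that can be stored in one block.
  9. The data processing method according to claim 1, characterized in that the execution probability value for performing garbage collection is determined based on the following formula:
  10. The method according to claim 9, characterized in that the maximum amount of valid data that can be stored in the storage block is the capacity of the storage block.
  11. The method according to claim 9, characterized in that the number of valid data is the number of existing valid data stored in the current storage block under current conditions.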
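  12. The data processing method according to claim 1, characterized in that performing garbage collection processing based on the execution probability value if the number of reads is greater than the first read threshold specifically comprises:
    obtaining the execution probability value for performing garbage collection;
    if the execution probability value is not less than 1, and the difference between the number of reads and the first read threshold is an integer multiple of the execution probability value, performing garbage collection of one data unit;
    if the execution probability value is greater than 0 and less than 1, and the difference between the number of reads and the first read threshold equals the execution probability value, performing garbage collection of one data unit.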
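  13. The data processing method according to claim 1 or 12, characterized in that the method further comprises:
    determining a second read threshold of the storage block based on the optimal performance value;
    obtaining the number of reads of valid data in the storage block, and judging whether the number of reads is greater than the second read threshold;
    if the number of reads exceeds the second read threshold, further judging whether valid data exists in the storage block.
  14. The data processing method according to claim 13, characterized in that further judging whether valid data exists in the storage block if the number of reads exceeds the second read threshold specifically comprises:
    if valid data exists in the storage block, performing garbage collection processing on the valid data in the storage block.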
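  15. The method according to claim 14, characterized in that the garbage collection refers to collecting the valid garbage in pages containing the invalid data and the valid data into a new block, thereby freeing the space occupied by the invalid data for use by valid data.
  16. The data processing method according to claim 1 or 14, characterized in that the garbage collection processing comprises:
    writing the valid data in the storage block to a second storage block, erasing the corresponding invalid data in the storage block, and updating the address mapping table corresponding to the valid data.
  17. The data processing method according to claim 16, characterized in that erasing the corresponding invalid data in the storage block comprises:
    after the valid data in the storage block is written to the second storage block, the valid data becomes invalid data in the storage block, and the corresponding invalid data in the storage block is erased so that the storage block can be used as a new storage unit for storing valid data.
  18. The data processing method according to claim 16, characterized in that updating the address mapping table corresponding to the valid data comprises:
    updating the address of the storage block that originally corresponded to the valid data to the address of the second storage block.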
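  19. A data processing device, characterized in that the device comprises:
    an obtaining unit, configured to obtain an optimal performance value, a maximum number of reads, and a number of unit data of a storage block; the obtaining unit being further configured to obtain a number of valid data in the storage block and a maximum amount of valid data that can be stored;
    a first calculation unit, communicatively connected to the obtaining unit, the first calculation unit determining a first read threshold of the storage block based on the optimal performance value, the maximum number of reads, and the number of unit data; the first calculation unit being further configured to determine an execution probability value for performing garbage collection based on the number of valid data and the maximum amount of valid data that can be stored;
    a first judging unit, communicatively connected to the calculation unit, the first judging unit being configured to obtain the number of reads of valid data in the storage block and judge whether the number of reads is greater than the first read threshold, and, if the number of reads is greater than the read threshold, perform garbage collection processing based on the execution probability value.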
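  20. The data processing device according to claim 19, characterized in that the device further comprises:
    a second calculation unit, communicatively connected to the obtaining unit, the second calculation unit being configured to determine a second read threshold of the storage block based on the optimal performance value;
    a second judging unit, communicatively connected to the second calculation unit, the second judging unit being configured to obtain the number of reads of valid data in the storage block and judge whether the number of reads is greater than the second read threshold, and, if the number of reads exceeds the second read threshold, further judge whether valid data exists in the storage block;
    a third judging unit, communicatively connected to the second judging unit, the third judging unit being configured to further judge whether valid data exists in the storage block and, if valid data exists in the storage block, perform garbage collection processing on the valid data in the storage block.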
PCT/CN2023/082053 2022-03-18 2023-03-17 Data processing method and device WO2023174394A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210268230.0A CN114356248B (zh) 2022-03-18 2022-03-18 Data processing method and device
CN202210268230.0 2022-03-18

Publications (1)

Publication Number Publication Date
WO2023174394A1 (zh) 2023-09-21

Family

ID=81094808

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/082053 WO2023174394A1 (zh) 2022-03-18 2023-03-17 Data processing method and device

Country Status (2)

Country Link
CN (1) CN114356248B (zh)
WO (1) WO2023174394A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
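CN114356248B (zh) * 2022-03-18 2022-06-14 苏州浪潮智能科技有限公司 Data processing method and device
CN114996173B (zh) * 2022-08-04 2022-11-18 合肥康芯威存储技术有限公司 Method and device for managing write operations of a storage device
CN115269451B (zh) * 2022-09-28 2023-05-12 珠海妙存科技有限公司 Flash memory garbage collection method, device, and readable storage medium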

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160179386A1 (en) * 2014-12-17 2016-06-23 Violin Memory, Inc. Adaptive garbage collection
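CN109976671A (zh) * 2019-03-19 2019-07-05 苏州浪潮智能科技有限公司 Read disturb processing method, apparatus, device, and readable storage medium
CN112256198A (zh) * 2020-10-21 2021-01-22 成都佰维存储科技有限公司 SSD data reading method, device, readable storage medium, and electronic device
CN114356248A (zh) * 2022-03-18 2022-04-15 苏州浪潮智能科技有限公司 Data processing method and device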

Also Published As

Publication number Publication date
CN114356248B (zh) 2022-06-14
CN114356248A (zh) 2022-04-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23769896

Country of ref document: EP

Kind code of ref document: A1