WO2022110196A1 - Data processing method, device and system - Google Patents

Data processing method, device and system

Info

Publication number
WO2022110196A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage area
storage
grouping
data processing
period
Prior art date
Application number
PCT/CN2020/132897
Other languages
English (en)
French (fr)
Inventor
许璐
胡海燕
金加靖
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202080103535.5A priority Critical patent/CN115989485A/zh
Priority to PCT/CN2020/132897 priority patent/WO2022110196A1/zh
Publication of WO2022110196A1 publication Critical patent/WO2022110196A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Definitions

  • the present application relates to the field of computer technology, and in particular, to a data processing method, device and system.
  • SSD solid state drive
  • NAND Flash flash memory
  • the present application provides a data processing method, device and system that group multiple storage areas in a storage array according to historical feature information of those storage areas, so as to optimize the data arrangement in the SSD and reduce the write amplification of the SSD, thereby guaranteeing system performance and SSD lifespan.
  • an embodiment of the present application provides a data processing method, and the method may be implemented by a controller, and the controller may be, for example, a controller in a solid-state storage device SSD.
  • the method may include: the controller acquires characteristic information of multiple storage areas in the storage array according to a grouping period, where the characteristic information of each storage area represents the life cycle of that storage area; within the grouping period, the controller groups the multiple storage areas according to the characteristic information to obtain multiple storage area groups, where each storage area group includes at least one storage area and different storage area groups correspond to different life cycle intervals; and the controller updates a mapping relationship table according to the multiple storage area groups, where the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the multiple storage area groups.
  • the controller can group a plurality of storage areas and update the mapping table according to the historical feature information of each storage area, so that, in real time, the controller can use the mapping table and the logical address included in a received data processing request to determine the corresponding target physical address and complete the corresponding data processing operation. This optimizes the data arrangement in the storage array and reduces the write amplification of the storage array, thereby ensuring system performance and the life of the storage array.
  • the data processing request is a write request
  • the data in the corresponding area can be written into the corresponding physical area in the storage array of the SSD, thereby completing the intelligent arrangement of data in the SSD.
  • the grouping period is: a time period; or, a period of access times.
  • the characteristic information of each storage area can be obtained periodically, so that the multiple storage areas can be regrouped periodically based on the actual usage of each storage area, which optimizes the data arrangement in the SSD.
  • the use and wear of the storage areas are thereby balanced, protecting the life of the SSD and the performance of the computer system where the SSD is located as far as possible.
  • the grouping period may be configured in any suitable metering manner, so that the data processing solution can be flexibly applied to different scenarios, and details are not described herein again.
  • grouping the multiple storage areas according to the characteristic information to obtain multiple storage area groups includes: performing iterative grouping processing on the multiple storage areas according to the characteristic information and a set grouping condition, until none of the resulting storage area groups satisfies the grouping condition; the grouping condition is that the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read/write frequencies of all storage areas included in the storage area group to the sum of the read/write frequencies of the multiple storage areas.
  • when the multiple storage areas in the storage array are grouped, the grouping can be based on the periodically collected characteristic information and an iterative clustering method.
  • more groups can be obtained adaptively, so that data with different characteristics is divided into corresponding categories and stored in corresponding physical areas, which optimizes the data layout in the SSD and avoids problems such as write amplification caused by mixing data.
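  • the grouping condition described above can be written compactly. The following is only a notational sketch: the symbols A (the set of all storage areas), G (one storage area group), f_a (the read/write frequency of storage area a) and θ (the preset proportion threshold) are introduced here for illustration and do not appear in the application.

```latex
r(G) = \frac{\sum_{a \in G} f_a}{\sum_{a \in A} f_a},
\qquad
\text{grouping condition: } r(G) > \theta .
```

  • under this notation, iterative grouping continues while some group G satisfies r(G) > θ and stops once no group satisfies the condition.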
  • the controller acquiring the characteristic information of the plurality of storage areas in the storage array according to the grouping period includes: within the grouping period, acquiring the characteristic information of the plurality of storage areas according to the command information for each storage area contained in the received data processing requests.
  • the controller can adaptively decide the characteristic parameters and the corresponding grouping algorithm according to the received command information, so that the characteristic parameters and the statistical characteristic information can be determined flexibly in different application scenarios, and the multiple storage areas in the storage array are grouped adaptively.
  • the characteristic parameters determined by the controller may include but are not limited to the type, group, and value period of the characteristic, etc., which will not be repeated here.
  • the data processing request is an I/O request;
  • the characteristic information includes I/O characteristic information.
  • a corresponding data processing solution can be completed according to I/O requests from other devices or functional modules of the storage system or computer system.
  • the controller is a controller in a solid-state drive (SSD), and the storage array includes NAND flash memory chips. It can be understood that, in this embodiment of the present application, the storage array may include, but is not limited to, NAND flash memory chips, which will not be repeated here.
  • an embodiment of the present application provides a data processing apparatus, including an interface, a processor, and a cache; the interface is used to provide a channel connecting the processor and a storage array; the processor is used to periodically acquire characteristic information of multiple storage areas in the storage array, to group the multiple storage areas within the grouping period according to the characteristic information to obtain multiple storage area groups, and to update the mapping relationship table according to the multiple storage area groups, where the characteristic information of each storage area represents the life cycle of that storage area, each storage area group includes at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses and physical addresses of the plurality of storage area groups; the cache is used to store the mapping relationship table.
  • the grouping period is: a time period; or, a period of access times.
  • the processor is specifically configured to: perform iterative grouping processing on the plurality of storage areas according to the feature information and a set grouping condition, until none of the resulting storage area groups satisfies the grouping condition; the grouping condition is that the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read and write frequencies of all storage areas included in the storage area group to the sum of the read and write frequencies of the plurality of storage areas.
  • the processor is specifically configured to: within the grouping period, acquire the characteristic information of the multiple storage areas according to the command information for each storage area included in the received data processing requests.
  • the data processing request is an I/O request;
  • the characteristic information includes I/O characteristic information.
  • an embodiment of the present application further provides a storage system, where the storage system may include a storage array and the data processing apparatus according to any implementation of the foregoing second aspect, wherein the data processing apparatus is configured to periodically acquire characteristic information of a plurality of storage areas in the storage array.
  • an embodiment of the present application further provides a computer system, where the computer system may include a host and the storage system described in the third aspect, wherein the host is configured to send a data processing request to the storage system, The storage system is configured to execute the stored instructions, and the storage system implements the data processing method according to any one of the first aspects above by executing the instructions.
  • an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is run on a computer, the computer is made to execute the above-mentioned first aspect provided method.
  • an embodiment of the present application further provides a computer program product, which, when the computer program product runs on a computer, causes the computer to execute the method provided in the first aspect.
  • FIG. 2 is a schematic structural diagram of a storage system to which an embodiment of the present application is applicable;
  • FIG. 3 is a schematic diagram of a logical structure of a controller according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a data processing method according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a data processing method according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a computer system according to an embodiment of the present application.
  • the temperature information of the data is passed down to the SSD layer, so that data at different temperatures is written to different storage areas of the SSD.
  • the Linux layer defines 4 write hint levels: short, medium, long, extreme.
  • solution 1 cannot be flexibly applied to complex scenarios supporting multiple applications or multiple systems.
  • parameters such as temperature level and temperature range are usually pre-configured in the SSD, and the SSD control unit adjusts and judges the corresponding temperature in real time according to the I/O information, so as to write the data to be written into the corresponding storage area of the SSD.
  • the embodiments of the present application provide a data processing solution, which can be flexibly applied to various application scenarios, and can guarantee the performance of the computer system and the lifespan of the SSD.
  • the method and the device are based on the same technical concept. Since the method and the device solve the problem on similar principles, their implementations can be referred to each other, and repeated descriptions are omitted.
  • the controller may acquire feature information of multiple storage areas in the storage array according to the grouping period, group the multiple storage areas according to the feature information to obtain multiple storage area groups, and then update the mapping relationship table according to the plurality of storage area groups, where the characteristic information of each storage area represents the life cycle of that storage area, each storage area group includes at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the multiple storage area groups.
  • the controller can look up, in the mapping relationship table, the group identifier and the target physical address of the target storage area group corresponding to the target logical address contained in the data processing request, and then process the data processing request according to the group identifier and the target physical address.
  • the controller can record the characteristic information of each storage area in real time according to the received data processing requests, and can periodically compile the historical characteristic information of each storage area to group the multiple storage areas into multiple storage area groups, so that data is arranged intelligently in the storage array according to the periodically updated grouping information of the storage area groups.
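  • the lookup described above can be illustrated with a small in-memory table. This is a minimal sketch under stated assumptions, not the layout used in the application: the names MappingEntry and MappingTable, the dictionary keyed by logical address, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class MappingEntry:
    """One row of the mapping relationship table (hypothetical layout)."""
    group_id: int          # identifier of the storage area group
    physical_address: int  # physical address backing the logical address


class MappingTable:
    """Maps a logical address to (group identifier, physical address)."""

    def __init__(self) -> None:
        self._entries: Dict[int, MappingEntry] = {}

    def update(self, logical_address: int, group_id: int,
               physical_address: int) -> None:
        # Called after each regrouping to refresh the three-way mapping.
        self._entries[logical_address] = MappingEntry(group_id, physical_address)

    def lookup(self, logical_address: int) -> Optional[Tuple[int, int]]:
        # Returns None when the logical address has not been mapped yet.
        entry = self._entries.get(logical_address)
        if entry is None:
            return None
        return entry.group_id, entry.physical_address


table = MappingTable()
table.update(logical_address=0x10, group_id=2, physical_address=0x8000)
print(table.lookup(0x10))  # (2, 32768)
```

  • a real controller would keep this table in the cache and persist it; the dictionary here only shows the relationship among the three fields.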
  • This solution can be flexibly applied to any scenario applicable to SSD, etc., and can reduce the write amplification of SSD, thereby improving system performance and SSD life.
  • a computer system consists of a hardware (sub) system and a software (sub) system.
  • the hardware (sub) system includes the organic combination of various physical components composed of electrical, magnetic, optical, mechanical and other principles, and is the entity on which the system works;
  • the software (sub) system includes various programs and files that direct the whole system to work according to the specified requirements.
  • the computer system in the embodiments of the present application may be a computer system in a terminal device, which is a device that provides business services to users and has a voice or data connectivity function.
  • the terminal device may also be referred to as a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), etc.
  • UE user equipment
  • MS mobile station
  • MT mobile terminal
  • the terminal device may also be a chip.
  • a terminal device is taken as an example for specific description.
  • the terminal device may be a handheld device with a wireless connection function, a vehicle-mounted device, or the like.
  • some examples of terminal devices are: mobile phones, tablet computers, notebook computers, PDAs, mobile internet devices (MID), smart point of sale (POS) terminals, wearable devices, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, wireless terminals in self-driving, wireless terminals in remote medical surgery, wireless terminals in smart grids, wireless terminals in transportation safety, wireless terminals in smart cities, wireless terminals in smart homes, and various smart meters (smart water meters, smart electricity meters, smart gas meters).
  • the computer system in this embodiment of the present application may be a server, which is a device that provides a data connection service. Since the server can respond to the service request of the terminal device and process it, generally speaking, the server should have the ability to undertake and guarantee the service.
  • the server may be a server located in a data network (DN), such as a common server or a server in a cloud platform; or it may be a multi-access edge computing (MEC) server located in the core network, etc.
  • DN data network
  • MEC multi-access edge computing
  • OS Operating system
  • Kernel: the core of an operating system. It is the first layer of software extension on top of the hardware, provides the most basic functions of the operating system, and is the basis on which the operating system works. It is responsible for managing the system's processes, memory, drivers, files and network, and determines the performance and stability of the system.
  • (5) at least one, refers to one or more.
  • FIG. 2 is a schematic structural diagram of a storage system to which an embodiment of the present application is applied.
  • the storage system 200 may include a storage array 210 and a controller 220 .
  • the storage array 210 may implement the function of data storage
  • the controller 220 may implement the function of control.
  • the storage array 210 may include multiple storage areas, and the controller 220 may periodically divide the multiple storage areas into multiple groups, that is, multiple storage area groups. Each storage area group includes at least one storage area, and different storage area groups correspond to different life cycle intervals.
  • the controller 220 may update the mapping relationship table according to the plurality of storage area groups, and the mapping relationship table may be used to record the mapping relationship among the group identifiers, logical addresses, and physical addresses of the multiple storage area groups.
  • the controller 220 can look up, in the mapping table, the group identifier and the target physical address of the target storage area group corresponding to the target logical address included in the data processing request, and then process the data processing request according to the group identifier and the target physical address, for example by reading, writing or erasing data at the target physical address, to obtain the processing result of the data processing request.
  • when the controller 220 groups the multiple storage areas in the storage array 210, it can adaptively decide the characteristic parameters used for grouping according to application scenarios or business requirements. It can then group the multiple storage areas according to those characteristic parameters and the statistically obtained historical characteristic information of each storage area, and update the mapping table according to the resulting storage area groups, so as to refresh the mapping relationship among the logical address, the group identifier and the physical address. Therefore, during the operation of the storage system, a data processing request received in real time can be mapped to the corresponding target physical address based on the mapping relationship table.
  • the storage system can be implemented in multiple ways. For example, it can be implemented as a solid-state drive (SSD) system, as a storage system with characteristics similar to an SSD system, such as a phase change memory (PCM) system, or as at least one storage device in a computer system, which is not limited in this application.
  • PCM phase change memory
  • the medium of the storage array in the storage system may be SSD memory chips, or may be any other storage medium developed subsequently, which is not limited in this application.
  • the controller 220 may include the following logic modules: a statistics module 221 , a processing module 222 and a query module 223 .
  • the statistics module 221 and the query module 223 can be used to implement foreground operations
  • the processing module 222 can be used to implement background operations.
  • the statistics module 221 can be used to realize the function of feature collection. After the controller 220 receives the data processing request, it can perform information recording and statistics according to the command information contained in the data processing request, so as to obtain corresponding feature information, and The collected characteristic information may be provided to the processing module 222 .
  • the processing module 222 can be used to realize the related algorithm function of data processing, and it can periodically perform calculation according to the obtained characteristic information of multiple storage areas, so as to group multiple storage areas to obtain multiple storage area groups, each Each storage area group includes at least one storage area, and different storage area groups correspond to different life cycle intervals. Moreover, the processing module 222 may generate or update the mapping relationship table according to the obtained multiple storage area groups. Wherein, the mapping relationship table is used to record the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups.
  • the query module 223 can be used to realize the query function of the data processing request.
  • the query module 223 can, in real time, look up in the above-mentioned mapping table the group identifier and the target physical address corresponding to the logical address contained in the data processing request.
  • the controller 220 may process the data processing request according to the group identifier and the target physical address, and obtain a data processing result.
  • the statistics module 221 can perform information recording and statistics according to any information contained in the data processing request, and convert it into corresponding feature information.
  • the statistics module 221 can also record and count some specified information contained in the data processing request according to the instructions of the processing module 222, and convert it into corresponding feature information.
  • data processing requests may include different processing types such as reading, writing, and erasing
  • the statistics module 221 may also apply different processing algorithms to the different information contained in different types of data processing requests, so as to obtain the feature information used for grouping the multiple storage areas, which is not limited in this application.
  • the characteristic information obtained by the statistics module 221 may include I/O characteristic information, such as read frequency, write frequency, read/write ratio, sequentiality and randomness, number of worker threads, queue depth, data record size, etc.
  • the I/O information that can be obtained from the I/O request can include processing type information: write; correspondingly, the I/O feature information can include a cumulative increase in the number of writes.
  • the I/O request is a read/write request for a certain logical address
  • the available I/O information may include information on the processing type of the logical address: read/write
  • the I/O characteristic information may include a cumulative increase in the number of reads/writes to the logical address and a read/write ratio characteristic of the logical address.
  • the number of accesses for example, the sum of the number of reads, the number of writes, etc.
  • the statistics module 221 can use corresponding algorithms to convert the recorded information into corresponding feature information, which will not be repeated here.
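  • the per-address counting described above (number of reads, number of writes, read/write ratio, number of accesses) can be sketched as follows. This is an illustration only: the class names AreaStats and StatisticsModule, the string operation types, and the convention of reporting an infinite ratio when there are no writes are assumptions, not part of the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AreaStats:
    """Accumulated I/O feature information for one logical address."""
    reads: int = 0
    writes: int = 0

    @property
    def accesses(self) -> int:
        # Number of accesses, taken here as reads plus writes.
        return self.reads + self.writes

    @property
    def read_write_ratio(self) -> float:
        # With no writes the ratio is undefined; infinity is a convention
        # chosen for this sketch.
        return self.reads / self.writes if self.writes else float("inf")


class StatisticsModule:
    """Records command information and turns it into feature information."""

    def __init__(self) -> None:
        self.stats = defaultdict(AreaStats)

    def record(self, logical_address: int, op: str) -> None:
        if op not in ("read", "write"):
            raise ValueError(f"unsupported operation type: {op!r}")
        entry = self.stats[logical_address]
        if op == "read":
            entry.reads += 1
        else:
            entry.writes += 1


stats = StatisticsModule()
for op in ["read", "read", "write", "read"]:
    stats.record(7, op)
area = stats.stats[7]
print(area.reads, area.writes, area.accesses, area.read_write_ratio)  # 3 1 4 3.0
```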
  • the statistics module 221 may, according to a preset grouping period, count the characteristic information of the plurality of storage areas in each grouping period.
  • the processing module 222 is triggered to execute a related algorithm for grouping the multiple storage areas, to obtain an updated grouping result.
  • the grouping period may be the time period T.
  • the statistics module 221 can start a timer to accumulate time while collecting the characteristic information of each storage area. At the end of the first period T1, it can provide the characteristic information accumulated during T1 to the processing module 222 for one round of grouping; when the time reaches the end of the second period T2 following T1, the characteristic information accumulated during T2 is provided to the processing module 222 for the next round of grouping.
  • the processing module 222 periodically performs calculation according to the obtained characteristic information, so as to group the multiple storage areas to obtain multiple storage area groups.
  • multiple storage areas can be grouped according to the actual usage of each storage area. While the probability of write amplification is reduced, the use and wear of each storage area can be balanced, protecting the lifespan of the storage array as far as possible.
  • the above example of triggering the processing module 222 to perform a grouping algorithm based on the time period as the grouping period is only an example of an optional implementation manner of the present application, rather than any limitation.
  • the grouping period may also be a grouping period determined by other measurement parameters, which is not limited in this application.
  • the number of accesses can be used as the grouping period, with the increase in the number of accesses serving as a relative measure, and the statistics module 221 can start a counter to accumulate the number of accesses while collecting the feature information.
  • the processing module 222 can be triggered to execute the related algorithm for grouping, so as to obtain the updated grouping result, so as to update the mapping relationship table according to the updated multiple storage area groups. This will not be repeated here.
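  • the two ways of measuring the grouping period described above (a time period, or a number of accesses) can be sketched with one small trigger class. This is a hedged illustration: the class name GroupingPeriod, its parameters, and the choice to check the time period only when an access arrives are simplifying assumptions rather than the mechanism used in the application.

```python
import time
from typing import Callable, Optional


class GroupingPeriod:
    """Signals the end of a grouping period measured by time or by accesses."""

    def __init__(self, seconds: Optional[float] = None,
                 accesses: Optional[int] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if seconds is None and accesses is None:
            raise ValueError("configure a time period or an access-count period")
        self._seconds = seconds
        self._accesses = accesses
        self._clock = clock      # injectable so the sketch can be tested
        self._start = clock()
        self._count = 0

    def on_access(self) -> bool:
        """Record one access; return True if the grouping period has ended."""
        self._count += 1
        elapsed = self._clock() - self._start
        due = ((self._accesses is not None and self._count >= self._accesses)
               or (self._seconds is not None and elapsed >= self._seconds))
        if due:
            # Start the next period; the caller would now trigger regrouping.
            self._start = self._clock()
            self._count = 0
        return due


by_count = GroupingPeriod(accesses=3)
print([by_count.on_access() for _ in range(4)])  # [False, False, True, False]
```

  • when on_access returns True, the caller would hand the feature information collected in that period to the grouping step and then continue collecting for the next period.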
  • when the processing module 222 is triggered to perform the grouping algorithm, it can perform the calculation according to the characteristic information obtained from the statistics module 221 in the grouping period, and obtain the updated grouping result and the updated mapping relationship table. Subsequently, after receiving a data processing request, the controller 220 can determine, in real time, the target storage area group and the corresponding target physical address according to the updated mapping relationship table and the logical address contained in the data processing request, then perform operations such as reading, writing or erasing data at the target physical address according to the data processing request, and obtain the processing result of the data processing request.
  • when the processing module 222 groups the multiple storage areas in the storage array, it can adaptively decide the characteristic parameters and the corresponding grouping algorithm to be used for grouping. Therefore, when the above-mentioned data processing scheme is applied to an SSD system in different user scenarios, the controller in the SSD can implement the recognition function and the address translation function for data processing requests from the upper layer. The scheme can be flexibly applied to any scenario where an SSD is applicable, and it does not introduce large algorithm overhead.
  • the target physical address is helpful to realize the intelligent arrangement of the data to be processed in the storage array of the SSD, so as to reduce the probability of the write amplification problem of the SSD, so as to ensure the life of the storage array of the SSD and the performance of the computer system where the SSD is located.
  • the data processing method may be implemented by the controller shown in FIG. 2 or FIG. 3 , and the controller may be a controller in the SSD.
  • the data processing method may include:
  • S401 The controller receives a data processing request.
  • the data processing request may include command information for the storage area.
  • the command information may include, for example, the processing type of the data to be processed (such as read, write, or erase, etc.), the logical address of the data to be processed in the storage array, and the length of the data to be processed (or called the data record size). ), etc., which are not limited in this application.
  • the storage area may be obtained by dividing the entire storage space of the storage array at any suitable granularity. For example, it may be a block, or it may be based on a logical block address (LBA), etc., which is not limited in this application.
  • LBA logical block address
  • the command information included in a data processing request may include command information for one LBA, or may include command information for multiple LBAs, which is not limited in this application.
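  • the command information listed above (processing type, logical address, data length, possibly for several LBAs in one request) can be modelled with a few small types. This is a sketch under assumptions: the names OpType, Command and DataProcessingRequest are hypothetical, and the length is assumed to be counted in logical blocks, which the application does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class OpType(Enum):
    READ = "read"
    WRITE = "write"
    ERASE = "erase"


@dataclass(frozen=True)
class Command:
    """Command information for one starting LBA."""
    op: OpType
    lba: int     # logical block address of the data to be processed
    length: int  # data record size, assumed here to be in logical blocks


@dataclass(frozen=True)
class DataProcessingRequest:
    """A request may carry command information for one or several LBAs."""
    commands: List[Command]

    def target_lbas(self) -> List[int]:
        # Expand every command into the individual LBAs it touches.
        return [c.lba + i for c in self.commands for i in range(c.length)]


request = DataProcessingRequest([
    Command(OpType.WRITE, lba=100, length=2),
    Command(OpType.READ, lba=5, length=1),
])
print(request.target_lbas())  # [100, 101, 5]
```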
  • the data processing method may include the following branch processing flow:
  • the controller may look up the group identifier and the target physical address of the target storage area group corresponding to the target logical address included in the data processing request according to the mapping relationship table.
  • the controller may process the data processing request according to the group identifier and the target physical address.
  • the data processing method may further include the following branch processing flow:
  • the controller may record the command information for the storage area contained in the data processing request to obtain characteristic information of the corresponding storage area.
  • the controller may acquire characteristic information of multiple storage areas in the storage array according to the grouping period. At the end of the grouping period, the controller may group the plurality of storage areas according to the characteristic information collected in that grouping period to obtain a plurality of storage area groups, where each storage area group contains at least one storage area and different storage area groups correspond to different life cycle intervals. Based on the plurality of storage area groups, an updated mapping relationship table can then be obtained. The mapping relationship table is used to record the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. The mapping relationship table used in S402 is the updated table obtained by regrouping the multiple storage areas after each grouping period.
  • S401-S404 may be real-time, and S405 may be performed periodically based on the grouping period.
  • the grouping period may be measured in terms of time or in terms of the number of accesses.
  • the above-mentioned grouping period may be set according to application scenarios or business requirements, which is not limited in this application.
  • the grouping calculation process is triggered when the statistics of the historical feature information of the multiple storage areas reach the grouping period, so that the storage areas are regrouped periodically according to their historical usage. Based on this grouping, a storage area better suited to the logical address in the data processing request can be obtained, which facilitates the intelligent arrangement of data in the storage array of the SSD, reduces the write amplification of the SSD, protects the life of the SSD, and improves the performance of the computer system in which the SSD is located.
  • S405 may specifically include the following step: performing iterative grouping processing on the plurality of storage areas according to the feature information and a set grouping condition, until a plurality of storage area groups that do not satisfy the grouping condition is obtained.
  • the grouping condition includes: the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read and write frequencies of all storage areas included in the storage area group to the sum of the read and write frequencies of the plurality of storage areas.
  • the following example uses a heat characteristic to represent the life cycle of a storage area.
  • the historical feature information collected periodically in the grouping period may include the value of the heat characteristic of each storage area (the heat characteristic value for short).
  • when the grouping calculation process is periodically started, the following process can be used to divide multiple storage areas into multiple storage area groups:
  • S501 The controller traverses the heat characteristic values obtained by statistics in the grouping period, and obtains the maximum value and the minimum value of the heat characteristic.
  • S503 Determine whether the respective traffic proportion of each cluster exceeds a preset proportion threshold.
  • the plurality of storage areas can be divided into a cold data storage area group and a hot data storage area group.
  • the corresponding heat characteristic value intervals are (0, 3000] and (3001, 6000], respectively.
  • the multiple storage areas can also be divided into an extremely cold data storage area group, a cold data storage area group, a hot data storage area group, and an extremely hot data storage area group.
  • the corresponding heat characteristic value intervals are (0, 500], (501, 2000], (2001, 4000], and (4001, 6000], respectively. Therefore, when the scheme is applied to different scenarios, more groups can be obtained adaptively, which optimizes the arrangement of data in the SSD and avoids the write amplification caused by mixed writing of data.
  • the number of storage areas included in different storage area groups may be the same or different, and the life cycle intervals corresponding to different storage area groups may be divided uniformly or non-uniformly; this application does not limit either.
  • the heat characteristic is used here only as an example to illustrate the grouping of the plurality of storage areas and is not limiting in any way.
  • the controller may also determine specified characteristic parameters according to application scenarios, business requirements, etc., and the specified characteristic parameters may be, for example, any parameters used to represent the life cycle of the storage area.
  • the corresponding historical feature information of each storage area can be collected periodically based on the specified feature parameters, the multiple storage areas can be grouped based on that information, and the mapping relationships recorded in the mapping relationship table can be updated.
  • the controller may determine, according to the logical address of the data to be processed contained in the data processing request, a target storage area group among the plurality of storage area groups and the corresponding target physical address, and then perform data processing operations according to the data processing request and the target physical address.
  • the controller can record the characteristic information of each storage area in real time according to the received data processing requests, and can periodically compile the historical characteristic information of each storage area and group the multiple storage areas, so that the intelligent arrangement of data in the storage array is realized according to the periodically updated grouping information.
  • an embodiment of the present application also provides a data processing apparatus.
  • the structure of the apparatus 600 is shown in FIG. 6 , including an interface 601 , a processor 602 and a cache 603 .
  • the apparatus 600 can be applied to the controller in the storage system shown in FIG. 2 , and can implement the above embodiments and the data processing methods provided by the embodiments.
  • the functions of each unit/module in the apparatus 600 will be introduced below.
  • the interface 601 is configured to provide a channel connecting the processor and the storage array; the processor 602 is configured to acquire characteristic information of multiple storage areas in the storage array according to the grouping period, to group the multiple storage areas within the grouping period according to the feature information to obtain multiple storage area groups, and to update the mapping relationship table according to the multiple storage area groups, where the feature information of each storage area represents the life cycle of that storage area, each storage area group contains at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the multiple storage area groups; the cache 603 is used to store the mapping relationship table.
  • the grouping period is: a time period; or, an access-count period.
  • the processor 602 is specifically configured to: perform iterative grouping processing on the plurality of storage areas according to the feature information and a set grouping condition, until a plurality of storage area groups that do not satisfy the grouping condition is obtained; the grouping condition includes: the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read and write frequencies of all storage areas included in the storage area group to the sum of the read and write frequencies of the plurality of storage areas.
  • the processor 602 is specifically configured to: within the grouping period, obtain the characteristic information of the multiple storage areas according to the command information for each storage area included in the received data processing requests.
  • the data processing request is an I/O request
  • the characteristic information includes I/O characteristic information
  • the controller is a controller in a solid state storage device SSD, and the storage array includes NAND flash memory particles.
  • the embodiments of the present application also provide a computer system
  • the computer system may include a host and a storage system as shown in FIG. 2, may implement the methods provided by the above embodiments, and has the function of the data processing apparatus shown in FIG. 6.
  • the computer system 700 may include a memory 701 , a processor 702 , and a transceiver 703 .
  • the memory 701, the processor 702 and the transceiver 703 are connected to each other.
  • the memory 701, the processor 702, and the transceiver 703 are connected to each other through a bus 704; the memory 701 is used to store program code, and the processor 702 can obtain the program code from the memory 701 and perform the corresponding processing.
  • the bus 704 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus or the like.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in FIG. 7, but it does not mean that there is only one bus or one type of bus.
  • the embodiments of the present application further provide a computer program, which, when the computer program runs on a computer, causes the computer to execute the data processing method provided by the above embodiments.
  • the embodiments of the present application further provide a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a computer, it causes the computer to execute the data processing method provided by the above embodiments.
  • the storage medium may be any available medium that the computer can access, such as SSD memory, PCM memory, and the like.
  • a controller can acquire feature information of multiple storage areas in a storage array and group the multiple storage areas according to the feature information to obtain multiple storage area groups; the mapping relationship table is updated according to the multiple storage area groups, where the feature information of each storage area represents the life cycle of that storage area; each storage area group includes at least one storage area, and different storage area groups correspond to different life cycle intervals; the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the multiple storage area groups.
  • the controller can look up, according to the mapping relationship table, the group identifier and the target physical address of the target storage area group corresponding to the target logical address included in the data processing request, and then process the data processing request according to the group identifier and the target physical address.
  • the intelligent arrangement of data in the SSD is realized according to the periodically updated grouping information.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

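A data processing method, apparatus, and system, relating to the field of computer technology. In the method, a controller (220) acquires characteristic information of a plurality of storage areas in a storage array (210) according to a grouping period and groups the plurality of storage areas to obtain a plurality of storage area groups; a mapping relationship table is updated according to the plurality of storage area groups, the mapping relationship table recording the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. By periodically updating the plurality of storage area groups, the arrangement of data in the storage array is optimized so as to reduce the write amplification of, for example, an SSD, thereby protecting the life of the SSD and improving the performance of the computer system in which the SSD is located.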

Description

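Data Processing Method, Apparatus, and System
Technical Field
The present application relates to the field of computer technology, and in particular to a data processing method, apparatus, and system.
Background
With the continuous development of storage technology, solid state drives (SSDs) that use flash memory (NAND Flash) as the storage medium are gradually becoming the mainstream storage form. An SSD usually contains multiple NAND Flash dies. Because of the media characteristics of NAND Flash, SSDs often adopt an asynchronous-update write strategy, so that the data layout in a conventional SSD is determined solely by the order in which data is written.
However, in real service scenarios, different data is accessed at different frequencies, so different data has different degrees of "hotness" or "coldness", that is, different life cycles. Laying data out in write order often leads to mixed writing of data with different life cycles. NAND Flash must be erased before it can be rewritten, is read and written in units of pages, and is erased in units of blocks (each consisting of multiple pages). Given these characteristics, mixed writing causes extra data copying during garbage collection (GC), that is, additional write amplification (WA), which shortens the life of the SSD and degrades the read/write performance of the computer system in which the SSD is located.
Therefore, how to optimize the data layout within an SSD so as to reduce its write amplification, and thereby safeguard system performance and SSD life, remains a problem that urgently needs to be solved.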
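Summary
The present application provides a data processing method, apparatus, and system that group a plurality of storage areas in a storage array according to the historical characteristic information of those storage areas, so as to optimize the data layout within an SSD, reduce the write amplification of the SSD, and thereby safeguard system performance and SSD life.
In a first aspect, an embodiment of the present application provides a data processing method. The method may be implemented by a controller, which may be, for example, a controller in a solid-state storage device (SSD). The method may include: the controller acquires characteristic information of a plurality of storage areas in a storage array according to a grouping period, where the characteristic information of each storage area characterizes the life cycle of that storage area; within the grouping period, the plurality of storage areas are grouped according to the characteristic information to obtain a plurality of storage area groups, where each storage area group contains at least one storage area and different storage area groups correspond to different life cycle intervals; and a mapping relationship table is updated according to the plurality of storage area groups, where the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups.
With this solution, the controller can group the storage areas according to the historical characteristic information of each storage area and update the mapping relationship table, so that the controller can determine, in real time, the target physical address from the mapping relationship table and the logical address contained in a received data processing request and complete the corresponding data processing operation. This optimizes the data layout within the storage array and reduces its write amplification, thereby safeguarding system performance and the life of the storage array. Taking an SSD as an example, when the data processing request is a write request, the data can be written to the corresponding physical area in the SSD's storage array, completing the intelligent arrangement of data within the SSD.
In one possible design, the grouping period is a time period or an access-count period.
With this solution, the historical characteristic information of each storage area can be acquired periodically so that the storage areas are regrouped periodically on the basis of their actual usage. This optimizes the data layout within the SSD while also balancing the wear of the storage areas, safeguarding as far as possible the life of the SSD and the performance of the computer system in which the SSD is located.
It can be understood that, in the embodiments of the present application, the grouping period may be configured using any suitable unit of measurement, so that the data processing solution can be applied flexibly in different scenarios; details are not repeated here.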
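In one possible design, grouping the plurality of storage areas according to the characteristic information within the grouping period to obtain a plurality of storage area groups includes: performing iterative grouping on the plurality of storage areas according to the characteristic information and a set grouping condition until a plurality of storage area groups that do not satisfy the grouping condition is obtained. The grouping condition includes: the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the total read/write frequency of all storage areas contained in the storage area group to the total read/write frequency of the plurality of storage areas.
With this solution, when the storage areas in the storage array are divided into groups, the grouping can be carried out by iterative clustering on the periodically collected characteristic information. When applied in different scenarios, more groups can be obtained adaptively, so that data with different characteristics is classified and stored in the corresponding physical areas. This optimizes the arrangement of data within the SSD and avoids problems such as the write amplification caused by mixed writing of data.
In one possible design, the controller acquiring the characteristic information of the plurality of storage areas in the storage array according to the grouping period includes: within the grouping period, acquiring the characteristic information of the plurality of storage areas according to the command information for each storage area contained in received data processing requests.
With this solution, the controller can adaptively decide on the characteristic parameters and the corresponding grouping algorithm according to the received command information, so that in different application scenarios the characteristic parameters can be determined flexibly and the characteristic information collected, thereby adaptively grouping the storage areas in the storage array. It can be understood that, in the embodiments of the present application, the characteristic parameters determined by the controller may include, but are not limited to, the type, category, and value period of a characteristic; details are not repeated here.
In one possible design, the data processing request is an I/O request, and the characteristic information includes I/O characteristic information.
With this solution, for example when the controller is applied in a storage system or a computer system, the corresponding data processing can be completed according to I/O requests from other devices or functional modules of the storage system or computer system.
In one possible design, the controller is a controller in a solid-state storage device (SSD), and the storage array includes NAND flash dies. It can be understood that, in the embodiments of the present application, the storage array may include, but is not limited to, NAND flash dies; details are not repeated here.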
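In a second aspect, an embodiment of the present application provides a data processing apparatus including an interface, a processor, and a cache. The interface is configured to provide a channel connecting the processor and a storage array. The processor is configured to acquire characteristic information of a plurality of storage areas in the storage array according to a grouping period, to group the plurality of storage areas within the grouping period according to the characteristic information to obtain a plurality of storage area groups, and to update a mapping relationship table according to the plurality of storage area groups, where the characteristic information of each storage area characterizes the life cycle of that storage area, each storage area group contains at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. The cache is configured to store the mapping relationship table.
In one possible design, the grouping period is a time period or an access-count period.
In one possible design, the processor is specifically configured to: perform iterative grouping on the plurality of storage areas according to the characteristic information and a set grouping condition until a plurality of storage area groups that do not satisfy the grouping condition is obtained. The grouping condition includes: the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read/write frequencies of all storage areas contained in the storage area group to the sum of the read/write frequencies of the plurality of storage areas.
In one possible design, the processor is specifically configured to: within the grouping period, acquire the characteristic information of the plurality of storage areas according to the command information for each storage area contained in received data processing requests.
In one possible design, the data processing request is an I/O request, and the characteristic information includes I/O characteristic information.
In a third aspect, an embodiment of the present application further provides a storage system, which may include a storage array and the data processing apparatus according to any design of the second aspect, where the data processing apparatus is configured to acquire characteristic information of a plurality of storage areas in the storage array according to a grouping period.
In a fourth aspect, an embodiment of the present application further provides a computer system, which may include a host and the storage system of the third aspect, where the host is configured to send data processing requests to the storage system, the storage system is configured to execute stored instructions, and the storage system implements the data processing method according to any design of the first aspect by executing the instructions.
In a fifth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when run on a computer, causes the computer to execute the method provided in the first aspect.
In a sixth aspect, an embodiment of the present application further provides a computer program product that, when run on a computer, causes the computer to execute the method provided in the first aspect.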
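Brief Description of the Drawings
FIG. 1 illustrates one solution;
FIG. 2 is a schematic structural diagram of a storage system to which the embodiments of the present application are applicable;
FIG. 3 is a schematic diagram of the logical structure of the controller according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of the data processing method according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of the data processing method according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the data processing apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of the computer system according to an embodiment of the present application.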
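Detailed Description
To address the SSD write amplification problem described in the Background, the industry has proposed two solutions:
Solution 1:
As shown in FIG. 1, in this solution the upper-layer application (APP) and file system identify the "temperature" of data and pass the temperature information down to the SSD layer, so that data of different temperatures is written to different storage areas of the SSD. In a Linux system, the Linux layer defines four write hint levels: short, medium, long, and extreme. After the APP and file system classify different data into these four levels, they pass the result down to the NVMe driver layer, and the NVMe driver maps the upper-layer result to the storage area corresponding to each level, thereby conveying the temperature information of the data to the SSD layer.
In this solution, different applications or file systems differ in their ability to interpret data, so data or metadata mapped to the same level may still differ considerably. Obtaining reasonably accurate identification results requires designers with strong understanding and development skills, which places high demands on them. In addition, because the number of storage-area levels is fixed, when the APP or file system identifies more data classes than there are levels, each class cannot be assigned its own level; and as SSDs continue to improve, when an SSD supports more storage-area levels than the number originally set, the current level mapping cannot make full use of the SSD's resources.
As a result, Solution 1 cannot be applied flexibly in complex scenarios that must support multiple applications or multiple systems.
Solution 2:
In this solution, parameters such as temperature levels and temperature ranges are usually preconfigured in the SSD, and the SSD control unit adjusts and determines the temperature of the data in real time according to I/O information, so as to write the data to the corresponding storage area of the SSD.
In this solution, the need to adjust data temperature in real time imposes a very large algorithmic overhead on the SSD. Moreover, the solution relies on hard-coded parameters, which often constrains the temperature determination; when the parameters are poorly configured, the accuracy of the determination also drops. Hard-coded parameters also make Solution 2 unsuitable for diverse service scenarios.
Thus, although Solutions 1 and 2 can to some extent reduce the additional write amplification caused by the way conventional SSDs write data, their inherent limitations prevent either from being applied flexibly in different scenarios.
In view of this, the embodiments of the present application provide a data processing solution that can be applied flexibly in a variety of application scenarios while safeguarding computer system performance and SSD life. In this solution, the method and the apparatus are based on the same technical concept; because they solve the problem on similar principles, the implementations of the apparatus and the method may be referred to each other, and repeated content is not described again.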
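In the embodiments of the present application, the controller may acquire characteristic information of a plurality of storage areas in the storage array according to a grouping period, group the plurality of storage areas according to the characteristic information to obtain a plurality of storage area groups, and then update a mapping relationship table according to the plurality of storage area groups. The characteristic information of each storage area characterizes the life cycle of that storage area, each storage area group contains at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. After receiving a real-time data processing request (for example from a host), the controller may look up in the mapping relationship table the group identifier and target physical address of the target storage area group corresponding to the target logical address contained in the request, and then process the request according to the group identifier and the target physical address.
With this solution, the controller can record the characteristic information of each storage area in real time from the received data processing requests, and can periodically compile the historical characteristic information of each storage area to group the storage areas into a plurality of storage area groups, so that data is arranged intelligently in the storage array according to the periodically updated grouping information. The solution can be applied flexibly in any scenario for which an SSD or similar device is suitable, and it can reduce the write amplification of the SSD, thereby improving system performance and SSD life.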
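Some terms used in this application are explained below to aid understanding by those skilled in the art.
(1) Computer system: composed of a hardware (sub)system and a software (sub)system. The hardware (sub)system is an organic combination of various physical components built on electrical, magnetic, optical, mechanical, and other principles, and is the physical entity on which the system operates; the software (sub)system comprises the programs and files that direct the whole system to work as required. With the development of computer technology, modern computer systems range from microcomputers and personal computers to supercomputers and their networks, take many forms with many characteristics, and are widely used in scientific computing, transaction processing, and process control, increasingly penetrating all areas of society and profoundly influencing social progress.
In one implementation, the computer system in the embodiments of the present application may be a computer system within a terminal apparatus, which is an apparatus that provides services to users and has voice or data connectivity. A terminal apparatus may also be called a terminal device, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on; a terminal apparatus may also be a chip. In the subsequent embodiments and description of this application, a terminal device is used as the example.
For example, the terminal device may be a handheld device or vehicle-mounted device with a wireless connection function. Current examples of terminal devices include: mobile phones, tablet computers, notebook computers, palmtop computers, mobile internet devices (MIDs), point of sale (POS) terminals, wearable devices, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, wireless terminals in self-driving, wireless terminals in remote medical surgery, wireless terminals in smart grids, wireless terminals in transportation safety, wireless terminals in smart cities, wireless terminals in smart homes, and various smart meters (smart water meters, smart electricity meters, smart gas meters).
In another implementation, the computer system in the embodiments of the present application may be a server, which is a device that provides data connectivity services. Because a server responds to and processes service requests from terminal devices, a server should generally be able to carry and guarantee services. In this application, the server may be a server located in a data network (DN), for example an ordinary server or a server in a cloud platform, or a multi-access edge computing (MEC) server located in a core network.
(2) Operating system (OS): the most basic system software running on a computer system, for example Windows, Android, iOS, Windows Server, NetWare, Unix, or Linux. Those skilled in the art will understand that similar algorithms may also be implemented on other operating systems, which is not limited in this application.
(3) Kernel: the core of an operating system and the first layer of software extension on top of the hardware. It provides the most basic functions of the operating system and is the foundation on which the operating system works. It manages the system's processes, memory, drivers, files, and network system, and determines the performance and stability of the system.
(4) "A plurality of" means two or more.
(5) "At least one" means one or more.
(6) "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
In addition, it should be understood that, in the description of this application, terms such as "first" and "second" are used only to distinguish the objects described and are not to be understood as indicating or implying relative importance or order.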
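The data processing solution of this application is described in detail below with reference to the drawings and embodiments.
FIG. 2 is a schematic structural diagram of a storage system to which the embodiments of the present application are applicable. As shown in FIG. 2, the storage system 200 may include a storage array 210 and a controller 220. The storage array 210 provides the data storage function, and the controller 220 provides the control function.
Referring to FIG. 2, the storage array 210 may include a plurality of storage areas, which the controller 220 may periodically divide into a plurality of groups, that is, a plurality of storage area groups. Each storage area group contains at least one storage area, and different storage area groups correspond to different life cycle intervals. The controller 220 may update a mapping relationship table according to the plurality of storage area groups; the table may record the mapping relationship among the group identifiers of the storage area groups, the logical addresses, and the physical addresses of the storage areas. After receiving a data processing request (for example from a host), the controller 220 may look up in the mapping relationship table the group identifier and target physical address of the target storage area group corresponding to the target logical address contained in the request, and then process the request according to the group identifier and the target physical address, for example by reading, writing, or erasing data at the target physical address, to obtain the processing result of the request.
In the embodiments of the present application, when grouping the storage areas in the storage array 210, the controller 220 may adaptively decide, according to the application scenario or service requirements, which characteristic parameters to use for grouping. It may then group the storage areas according to those parameters and the collected historical characteristic information of each storage area, and update the mapping relationship table according to the resulting storage area groups so as to refresh the mapping relationship among logical addresses, group identifiers, and physical addresses. While the storage system is running, data processing requests received in real time can then be mapped to the corresponding target physical addresses on the basis of this table. Because grouping and mapping take the characteristic information of the storage areas into account, when the data processing request is a write request, mixed writing of data with different life cycles can be avoided. This lowers the probability of write amplification and safeguards the life of the storage system and the performance of the computer system in which the storage system is located.
It should be noted that the storage system may be implemented in many ways. For example, it may be implemented as a solid-state storage (SSD) system, as a storage system with characteristics similar to an SSD system, such as a phase change memory (PCM) system, or as at least one storage device in a computer system; this application does not limit the implementation. It can be understood that, as storage technology continues to develop, the medium of the storage array in the storage system may be SSD dies or any other storage medium developed later; this application does not limit the medium.
For ease of understanding, the data processing solution of this application is described below in detail using an SSD system as the example; the alternatives are not distinguished or repeated one by one hereafter.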
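In one example, as shown in FIG. 3, in terms of the functions it implements, the controller 220 may include the following logical modules: a statistics module 221, a processing module 222, and a query module 223. The statistics module 221 and the query module 223 may implement foreground operations, and the processing module 222 implements background operations.
The statistics module 221 may implement feature collection. After the controller 220 receives a data processing request, the statistics module may record and compile the command information contained in the request to obtain the corresponding characteristic information, and may provide the collected characteristic information to the processing module 222.
The processing module 222 may implement the algorithms related to data processing. It may periodically perform calculations on the acquired characteristic information of the plurality of storage areas so as to group the storage areas into a plurality of storage area groups, where each storage area group contains at least one storage area and different storage area groups correspond to different life cycle intervals. The processing module 222 may also generate or update the mapping relationship table according to the resulting storage area groups. The mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups.
The query module 223 may implement the query function for data processing requests. When a real-time data processing request is received, the query module 223 may look up, in real time in the mapping relationship table, the group identifier and target physical address of the target storage area group corresponding to the logical address contained in the request. The controller 220 may then process the request according to the group identifier and the target physical address and obtain the data processing result.
It can be understood that, in a specific implementation, the statistics module 221 may record and compile any information contained in the data processing request and convert it into the corresponding characteristic information. Alternatively, the statistics module 221 may, as instructed by the processing module 222, record and compile only certain specified information contained in the data processing request and convert it into the corresponding characteristic information. It should be noted that, because data processing requests may have different processing types such as read, write, and erase, the statistics module 221 may also apply different processing algorithms to the different information contained in different types of requests in order to obtain the characteristic information used to group the storage areas; this application does not limit this.
Taking the case where the data processing request is an I/O request as an example, the characteristic information compiled by the statistics module 221 may include I/O characteristic information, such as read frequency, write frequency, read/write ratio, sequential versus random access, number of worker threads, queue depth, and data record size. After the controller 220 receives an I/O request to be processed, the statistics module 221 may record in real time the I/O information contained in the request and process it to obtain the corresponding I/O characteristic information.
For example, if the I/O request is a write request, the I/O information obtainable from the request may include the processing type "write", and the I/O characteristic information may accordingly include an increment of the cumulative write count. As another example, if the I/O request is a read or write request for a particular logical address, the obtainable I/O information may include the processing type for that logical address (read or write), and the I/O characteristic information may accordingly include an increment of the cumulative read or write count for that logical address and the read/write ratio of that logical address. As a further example, the access count of a logical address (for example the sum of its read count and write count) may also be compiled as the required I/O characteristic information.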
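The per-address counters described in these examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `AreaStats` and `StatisticsModule`, the string operation codes, and the "reads per write" convention for the read/write ratio are hypothetical and do not come from the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AreaStats:
    """Counters for one logical address within the current grouping period."""
    reads: int = 0
    writes: int = 0

    @property
    def accesses(self) -> int:
        # Access count taken as read count plus write count.
        return self.reads + self.writes

    @property
    def read_write_ratio(self) -> float:
        # Assumed convention: reads per write; infinity if nothing was written.
        return self.reads / self.writes if self.writes else float("inf")


class StatisticsModule:
    """Illustrative stand-in for statistics module 221 (names are assumptions)."""

    def __init__(self) -> None:
        self.stats = defaultdict(AreaStats)

    def record(self, op: str, lba: int) -> None:
        """Update the counters for `lba` from one I/O request."""
        if op == "read":
            self.stats[lba].reads += 1
        elif op == "write":
            self.stats[lba].writes += 1
        else:
            raise ValueError(f"unsupported operation: {op!r}")

    def reset(self) -> None:
        """Start a new grouping period with empty counters."""
        self.stats.clear()
```

Recording one write and two reads for logical address 7 leaves `stats[7]` with an access count of 3 and a read/write ratio of 2.0.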
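It can be understood that, when the data processing request is implemented in another way, or when it contains other information, the statistics module 221 may likewise use a corresponding algorithm to convert the recorded information into the corresponding characteristic information; details are not repeated here.
Based on the applicable collection rules, the statistics module 221 may compile the characteristic information of the plurality of storage areas within each grouping period according to a preset grouping period. When the statistics of the acquired characteristic information reach the predetermined grouping time, the processing module 222 is triggered to execute the algorithm that groups the storage areas and obtain an updated grouping result.
For example, the grouping period may be a time period T. While compiling the characteristic information of each storage area according to the time period, the statistics module 221 may start a timer to accumulate time. At the end of the first period T1, it may provide the characteristic information accumulated during T1 to the processing module 222 for one round of grouping; at the end of the second period T2, which follows T1, it provides the characteristic information accumulated during T2 to the processing module 222 for the next round of grouping.
In this way, the characteristic information of the plurality of storage areas is compiled periodically, so that the processing module 222 periodically performs calculations on it and groups the storage areas into a plurality of storage area groups. The grouping thus reflects the actual usage of each storage area, which lowers the probability of write amplification while also balancing the wear of the storage areas and safeguarding the life of the storage array as far as possible.
It can be understood that using a time period as the grouping period that triggers the grouping algorithm of the processing module 222 is only an example of one optional implementation of this application and is not limiting. In a specific implementation, in different service scenarios, the grouping period may also be determined using other units of measurement; this application does not limit this.
For example, an access-count period may be used as the grouping period, with the increase in the number of accesses serving as the relative measure. While compiling the characteristic information, the statistics module 221 may start a counter to accumulate the number of accesses. Each time the accumulated count increases by the set access-count period (for example 100 accesses), the processing module 222 is triggered to execute the grouping algorithm and obtain an updated grouping result, so that the mapping relationship table can be updated according to the updated storage area groups; details are not repeated here.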
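Both kinds of grouping period, time-based and access-count-based, can be expressed as a small trigger object. The sketch below is an assumption-laden illustration: the class name `GroupingPeriod`, its parameters, and the injectable `clock` argument are hypothetical and are not taken from the application.

```python
import time
from typing import Callable, Optional


class GroupingPeriod:
    """Signals when a grouping period measured in accesses and/or seconds ends."""

    def __init__(self, max_accesses: Optional[int] = None,
                 max_seconds: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if max_accesses is None and max_seconds is None:
            raise ValueError("set max_accesses, max_seconds, or both")
        self.max_accesses = max_accesses
        self.max_seconds = max_seconds
        self._clock = clock
        self._count = 0
        self._start = clock()

    def on_access(self) -> bool:
        """Count one access; return True (and start a new period) when due."""
        self._count += 1
        by_count = (self.max_accesses is not None
                    and self._count >= self.max_accesses)
        by_time = (self.max_seconds is not None
                   and self._clock() - self._start >= self.max_seconds)
        if by_count or by_time:
            self._count = 0
            self._start = self._clock()
            return True
        return False
```

With `GroupingPeriod(max_accesses=100)`, every hundredth call to `on_access()` returns `True`, which is the point at which a controller would hand the accumulated statistics to the grouping step. Note that in this sketch the time condition is only evaluated when an access arrives; a real controller might use an independent timer instead.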
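When the processing module 222 is triggered to run the grouping algorithm, it may perform calculations on the characteristic information obtained from the statistics module 221 during the grouping period and obtain an updated grouping result and an updated mapping relationship table. Subsequently, after a data processing request is received, the controller 220 may determine, in real time from the updated mapping relationship table and the logical address contained in the request, the target storage area group and the corresponding target physical address, and then read, write, or erase data at the target physical address according to the request to obtain the processing result.
It can be understood that, in the embodiments of the present application, when grouping the storage areas in the storage array, the processing module 222 may adaptively decide which characteristic parameters and which grouping algorithm to use. Thus, when the above data processing solution is applied to an SSD system and to different user scenarios, the controller in the SSD can itself identify data processing requests from the upper layers and perform address translation. The solution can be applied flexibly in any scenario for which an SSD is suitable, without introducing a large algorithmic overhead.
Moreover, compared with a conventional SSD and with Solutions 1 and 2 above, the historical characteristic information of the storage areas is obtained from the command information contained in historical data processing requests, and the storage areas are regrouped and the mapping relationship table updated periodically. When the logical address contained in a subsequent data processing request is translated on the basis of the resulting storage area groups and mapping relationship table, a target physical address better suited to that logical address can be obtained fairly accurately. This helps arrange the data to be processed intelligently in the SSD's storage array, lowers the probability of write amplification in the SSD, and thereby safeguards the life of the SSD's storage array and the performance of the computer system in which the SSD is located.
To aid understanding of the specific implementation of the data processing solution of this application, the method steps are described in detail below with reference to the flowchart shown in FIG. 4. The data processing method may be implemented by the controller shown in FIG. 2 or FIG. 3, and the controller may be a controller in an SSD.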
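Referring to FIG. 4, the data processing method may include:
S401: The controller receives a data processing request.
The data processing request may include command information for a storage area. The command information may include, for example, the processing type of the data to be processed (such as read, write, or erase), the logical address of the data to be processed in the storage array, and the length of the data to be processed (also called the data record size); this application does not limit this.
It can be understood that, in the embodiments of the present application, the storage areas may be obtained by dividing the entire storage space of the storage array at any suitable granularity, for example by block or by logical block address (LBA); this application does not limit the granularity. Taking LBAs as an example, the command information included in one data processing request may concern a single LBA or multiple LBAs; this application does not limit this either.
On one hand, referring to the right side of FIG. 4, the data processing method may include the following branch:
S402: The controller may look up in the mapping relationship table the group identifier and target physical address of the target storage area group corresponding to the target logical address contained in the data processing request.
S403: The controller may process the data processing request according to the group identifier and the target physical address.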
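The lookup in S402 can be pictured as a table keyed by logical address whose entries carry the group identifier and physical address. The following is a minimal sketch under assumptions: the application does not specify the table layout, so the dictionary-based `MappingTable`, the `MappingEntry` fields, and the integer addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MappingEntry:
    group_id: int           # identifier of the storage area group
    physical_address: int   # target physical address in the storage array


class MappingTable:
    """Illustrative logical address -> (group id, physical address) table."""

    def __init__(self) -> None:
        self._entries: Dict[int, MappingEntry] = {}

    def update(self, lba: int, group_id: int, physical_address: int) -> None:
        """Record or refresh the mapping for one logical address."""
        self._entries[lba] = MappingEntry(group_id, physical_address)

    def lookup(self, lba: int) -> MappingEntry:
        """Return the entry for `lba`; raises KeyError if it is unmapped."""
        return self._entries[lba]
```

In this sketch, regrouping at the end of a grouping period would simply call `update` again for the affected logical addresses, so that later lookups see the refreshed group identifier.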
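On the other hand, referring to the left side of FIG. 4, the data processing method may further include the following branch:
S404: The controller may record the command information for the storage area contained in the data processing request so as to obtain the characteristic information of the corresponding storage area.
S405: The controller may acquire the characteristic information of the plurality of storage areas in the storage array according to the grouping period. At the end of the grouping period, the controller may group the storage areas according to the characteristic information compiled during that period to obtain a plurality of storage area groups, where each storage area group contains at least one storage area and different storage area groups correspond to different life cycle intervals. An updated mapping relationship table can then be obtained from the storage area groups. The mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. The mapping relationship table used in S402 is the updated table obtained by regrouping the storage areas after each grouping period.
It can be understood that, in the flowchart shown in FIG. 4, S401 to S404 may be performed in real time, whereas S405 may be performed periodically according to the grouping period. For example, the grouping period may be measured in terms of time or the number of accesses. When the statistics of the historical characteristic information of the storage areas reach the preset grouping period, the grouping calculation is triggered and the storage areas are divided into a plurality of storage area groups.
It can be understood that, in a specific implementation, the grouping period may be set according to the application scenario or service requirements; this application does not limit this. The grouping calculation is triggered when the statistics of the historical characteristic information of the storage areas reach the grouping period, so that the storage areas are regrouped periodically according to their historical usage. Based on this grouping, a storage area better suited to the logical address in the data processing request can be obtained, which facilitates the intelligent arrangement of data in the SSD's storage array, reduces the write amplification of the SSD, safeguards the life of the SSD, and improves the performance of the computer system in which the SSD is located.
In one example, to allow more flexible application in a variety of scenarios, S405 may specifically include the following step: performing iterative grouping on the plurality of storage areas according to the characteristic information and a set grouping condition until a plurality of storage area groups that do not satisfy the grouping condition is obtained. The grouping condition includes: the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read/write frequencies of all storage areas contained in the storage area group to the sum of the read/write frequencies of the plurality of storage areas.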
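The traffic proportion and the grouping condition amount to a ratio of two sums and a comparison. A minimal sketch follows; the function names `traffic_share` and `needs_split` are illustrative assumptions, and returning 0.0 when there is no traffic at all is a choice made here, not something the application specifies.

```python
from typing import Iterable


def traffic_share(group_freqs: Iterable[float], all_freqs: Iterable[float]) -> float:
    """Sum of the group's read/write frequencies over the sum for all areas."""
    total = sum(all_freqs)
    # Assumed edge case: with no traffic at all, treat every share as zero.
    return sum(group_freqs) / total if total else 0.0


def needs_split(group_freqs: Iterable[float], all_freqs: Iterable[float],
                threshold: float) -> bool:
    """Grouping condition: the group's traffic share exceeds the threshold."""
    return traffic_share(group_freqs, all_freqs) > threshold
```

For read/write frequencies `[10, 20, 30, 40]`, a group holding the last two areas carries a share of 70/100, so with a threshold of 0.6 it satisfies the grouping condition and would be split further, while the group holding the first two (share 0.3) would not.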
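The following example uses a heat characteristic to represent the life cycle of a storage area. The historical characteristic information compiled periodically within the grouping period may include the value of the heat characteristic of each storage area (the heat characteristic value for short). Referring to FIG. 5, when the grouping calculation is started periodically, the storage areas may be divided into a plurality of storage area groups through the following process:
S501: The controller traverses the heat characteristic values compiled within the grouping period and obtains the maximum and minimum heat characteristic values.
S502: Taking the maximum and minimum heat characteristic values as the two center points, the heat characteristic values are divided into clusters according to the distance of each value from the center points (that is, the difference between heat characteristic values), and the statistics of each cluster, such as its traffic and cluster center, are recorded.
S503: Determine whether the traffic proportion of each cluster exceeds the preset proportion threshold.
If so, return to S501 and perform iterative clustering to split the cluster concerned.
If not, proceed to S504, output the grouping result according to the clustering result, and end the grouping calculation for this grouping period.
It can be understood that, in a specific implementation of S501 to S503, when clustering begins at the start of the grouping calculation, the maximum and minimum heat characteristic values are first obtained over all the characteristic information compiled in this grouping period; then, taking these two values as center points, each heat characteristic value is assigned to the cluster of the center point to which it is closer. Subsequently, if S503 finds that the traffic proportion of any resulting cluster exceeds the preset proportion threshold, S501 to S503 are repeated for that cluster, splitting it further by iterating within it, until the storage areas have been divided into a plurality of storage area groups that do not satisfy the grouping condition.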
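One way to read S501 to S504 is as a repeated two-way split around the minimum and maximum heat values. The sketch below follows that reading under several assumptions that are not in the application: the heat value of an area also serves as its read/write frequency when computing the traffic proportion, ties go to the cluster of the minimum, and a cluster whose values are all equal is accepted as final because it cannot be split. The function names are hypothetical.

```python
from typing import Dict, List


def split_cluster(areas: List[int], heat: Dict[int, int]) -> List[List[int]]:
    """S501/S502: split around the min and max heat values of the cluster."""
    lo = min(heat[a] for a in areas)
    hi = max(heat[a] for a in areas)
    if lo == hi:
        return [areas]  # all values equal: nothing left to split
    near_lo = [a for a in areas if heat[a] - lo <= hi - heat[a]]
    near_hi = [a for a in areas if heat[a] - lo > hi - heat[a]]
    return [near_lo, near_hi]


def group_storage_areas(heat: Dict[int, int], threshold: float) -> List[List[int]]:
    """Iteratively split clusters whose traffic share exceeds `threshold`."""
    if not heat:
        return []
    total = sum(heat.values())
    pending = split_cluster(sorted(heat), heat)
    groups: List[List[int]] = []
    while pending:
        cluster = pending.pop()
        share = sum(heat[a] for a in cluster) / total if total else 0.0
        # S503: only clusters above the threshold are split again.
        parts = split_cluster(cluster, heat) if share > threshold else [cluster]
        if len(parts) == 1:
            groups.append(cluster)   # S504: cluster becomes a final group
        else:
            pending.extend(parts)
    # Order the groups from coldest to hottest for readability.
    return sorted(groups, key=lambda g: min(heat[a] for a in g))
```

With heat values `{0: 1, 1: 2, 2: 3, 3: 50, 4: 60, 5: 100}` and a threshold of 0.5, the first split yields areas 0 to 3 and areas 4 to 5; the hot cluster carries 160/216 of the traffic and is split once more, giving three groups. With a threshold of 0.8 the first split is already final.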
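For example, with the iterative grouping algorithm above, when grouping is performed adaptively according to heat characteristic values and the actual usage of each storage area, the storage areas may be divided into a cold data storage area group and a hot data storage area group, with corresponding heat characteristic value intervals of (0, 3000] and (3001, 6000], respectively. Alternatively, the storage areas may be divided into an extremely cold data storage area group, a cold data storage area group, a hot data storage area group, and an extremely hot data storage area group, with corresponding heat characteristic value intervals of (0, 500], (501, 2000], (2001, 4000], and (4001, 6000], respectively. Thus, when applied in different scenarios, more groups can be obtained adaptively, which optimizes the arrangement of data within the SSD and avoids the write amplification caused by mixed writing of data.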
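Once the intervals are known, assigning a heat characteristic value to its group is a sorted-boundary lookup. The sketch below uses Python's standard `bisect` module and assumes integer heat values and the four-group example, reading the third interval as ending at 4000; the constant and function names are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the four example intervals (assumed: third bound is 4000).
UPPER_BOUNDS = [500, 2000, 4000, 6000]
GROUP_NAMES = ["extremely cold", "cold", "hot", "extremely hot"]


def group_for_heat(heat: int) -> str:
    """Return the name of the group whose interval contains `heat`."""
    if heat <= 0 or heat > UPPER_BOUNDS[-1]:
        raise ValueError(f"heat value {heat} is outside (0, {UPPER_BOUNDS[-1]}]")
    # bisect_left finds the first upper bound that is >= heat, which matches
    # intervals that are open on the left and closed on the right.
    return GROUP_NAMES[bisect_left(UPPER_BOUNDS, heat)]
```

A heat value of 500 falls in the extremely cold group and 501 in the cold group, reflecting the closed upper bound of each interval.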
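It can be understood that, in the grouping algorithm of this application, the number of storage areas contained in different storage area groups may be the same or different, and the life cycle intervals corresponding to different storage area groups may be divided uniformly or non-uniformly; this application does not limit either.
It can be understood that the heat characteristic is used here only as an example to illustrate the grouping of the storage areas and is not limiting in any way. In specific embodiments, the controller may also determine specified characteristic parameters according to the application scenario, service requirements, and so on; a specified characteristic parameter may be, for example, any parameter that represents the life cycle of a storage area. The corresponding historical characteristic information of each storage area can then be compiled periodically on the basis of the specified characteristic parameters, the storage areas can be grouped on the basis of that information, and the mapping relationship among logical addresses, group identifiers, and physical addresses recorded in the mapping relationship table can be updated. When a data processing request is received (for example from a host), the controller may determine, according to the logical address of the data to be processed contained in the request, the target storage area group among the plurality of storage area groups and the corresponding target physical address, and then perform the data processing operation according to the request and the target physical address.
With this solution, the controller can record the characteristic information of each storage area in real time from the received data processing requests, and can periodically compile the historical characteristic information of each storage area and group the storage areas, so that data is arranged intelligently in the storage array according to the periodically updated grouping information. The solution can be applied flexibly in any scenario for which an SSD is suitable, and it can reduce the write amplification of the SSD, thereby improving system performance and SSD life.
Compared with a conventional SSD, when the above data processing solution is applied in a computer system, the real-time compilation of the information contained in data processing requests, combined with the periodic refresh of the storage area grouping, simplifies operation, lowers management cost, and allows hardware acceleration to be used to the fullest. At the same time, the adaptive design of the algorithm allows it to match different scenarios quickly and overcomes the limitations of current industry solutions that rely on hard-coded parameters. In real service scenarios, such as a MySQL database, it can greatly improve metrics such as WA and TPMC.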
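Based on the same technical concept, an embodiment of the present application further provides a data processing apparatus. The structure of the apparatus 600 is shown in FIG. 6 and includes an interface 601, a processor 602, and a cache 603. The apparatus 600 may be applied to the controller in the storage system shown in FIG. 2 and may implement the data processing method provided by the above embodiments. The functions of the units and modules of the apparatus 600 are described below.
In one example, the interface 601 is configured to provide a channel connecting the processor and the storage array. The processor 602 is configured to acquire characteristic information of a plurality of storage areas in the storage array according to a grouping period, to group the plurality of storage areas within the grouping period according to the characteristic information to obtain a plurality of storage area groups, and to update the mapping relationship table according to the plurality of storage area groups, where the characteristic information of each storage area characterizes the life cycle of that storage area, each storage area group contains at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. The cache 603 is configured to store the mapping relationship table.
In one possible design, the grouping period is a time period or an access-count period.
In one possible design, the processor 602 is specifically configured to: perform iterative grouping on the plurality of storage areas according to the characteristic information and a set grouping condition until a plurality of storage area groups that do not satisfy the grouping condition is obtained. The grouping condition includes: the traffic proportion of a storage area group is greater than a preset proportion threshold, where the traffic proportion is the ratio of the sum of the read/write frequencies of all storage areas contained in the storage area group to the sum of the read/write frequencies of the plurality of storage areas.
In one possible design, the processor 602 is specifically configured to: within the grouping period, acquire the characteristic information of the plurality of storage areas according to the command information for each storage area contained in received data processing requests.
In one possible design, the data processing request is an I/O request, and the characteristic information includes I/O characteristic information.
In one possible design, the controller is a controller in a solid-state storage device (SSD), and the storage array includes NAND flash dies.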
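Based on the above embodiments, an embodiment of the present application further provides a computer system. The computer system may include a host and the storage system shown in FIG. 2, may implement the methods provided by the above embodiments, and has the functions of the data processing apparatus shown in FIG. 6. Referring to FIG. 7, the computer system 700 may include a memory 701, a processor 702, and a transceiver 703, which are connected to one another.
Optionally, the memory 701, the processor 702, and the transceiver 703 are connected to one another through a bus 704. The memory 701 is configured to store program code, and the processor 702 may obtain the program code from the memory 701 and perform the corresponding processing. The bus 704 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
Based on the above embodiments, an embodiment of the present application further provides a computer program that, when run on a computer, causes the computer to execute the data processing method provided by the above embodiments.
Based on the above embodiments, an embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a computer, causes the computer to execute the data processing method provided by the above embodiments. The storage medium may be any available medium that the computer can access, such as SSD memory or PCM memory.
In summary, the embodiments of the present application provide a data processing method, apparatus, and system. In this solution, the controller may acquire characteristic information of a plurality of storage areas in a storage array, group the storage areas according to the characteristic information to obtain a plurality of storage area groups, and update a mapping relationship table according to the storage area groups. The characteristic information of each storage area characterizes the life cycle of that storage area; each storage area group contains at least one storage area, and different storage area groups correspond to different life cycle intervals; the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups. After receiving a real-time data processing request, the controller may look up in the mapping relationship table the group identifier and target physical address of the target storage area group corresponding to the target logical address contained in the request, and then process the request according to the group identifier and the target physical address. Data is thus arranged intelligently within the SSD according to the periodically updated grouping information. The solution can be applied flexibly in any scenario for which an SSD is suitable, and it can reduce the write amplification of the SSD, thereby improving system performance and SSD life.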
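Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its scope of protection. If these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.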

Claims (16)

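  1. A data processing method, characterized by comprising:
    acquiring, by a controller according to a grouping period, characteristic information of a plurality of storage areas in a storage array, wherein the characteristic information of each storage area characterizes the life cycle of the storage area;
    grouping, within the grouping period, the plurality of storage areas according to the characteristic information to obtain a plurality of storage area groups, wherein each storage area group contains at least one storage area and different storage area groups correspond to different life cycle intervals; and
    updating a mapping relationship table according to the plurality of storage area groups, wherein the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups.
  2. The method according to claim 1, characterized in that the grouping period is a time period or an access-count period.
  3. The method according to claim 1 or 2, characterized in that grouping, within the grouping period, the plurality of storage areas according to the characteristic information to obtain a plurality of storage area groups comprises:
    performing iterative grouping on the plurality of storage areas according to the characteristic information and a set grouping condition until the plurality of storage area groups that do not satisfy the grouping condition is obtained;
    wherein the grouping condition comprises: the traffic proportion of a storage area group is greater than a preset proportion threshold, the traffic proportion being the ratio of the sum of the read/write frequencies of all storage areas contained in the storage area group to the sum of the read/write frequencies of the plurality of storage areas.
  4. The method according to any one of claims 1 to 3, characterized in that acquiring, by the controller according to the grouping period, the characteristic information of the plurality of storage areas in the storage array comprises:
    acquiring, within the grouping period, the characteristic information of the plurality of storage areas according to command information for each storage area contained in received data processing requests.
  5. The method according to claim 4, characterized in that the data processing request is an I/O request, and the characteristic information comprises I/O characteristic information.
  6. The method according to any one of claims 1 to 5, characterized in that the controller is a controller in a solid-state storage device (SSD), and the storage array comprises NAND flash dies.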
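  7. A data processing apparatus, characterized by comprising an interface, a processor, and a cache, wherein:
    the interface is configured to provide a channel connecting the processor and a storage array;
    the processor is configured to acquire characteristic information of a plurality of storage areas in the storage array according to a grouping period, to group the plurality of storage areas within the grouping period according to the characteristic information to obtain a plurality of storage area groups, and to update a mapping relationship table according to the plurality of storage area groups, wherein the characteristic information of each storage area characterizes the life cycle of the storage area, each storage area group contains at least one storage area, different storage area groups correspond to different life cycle intervals, and the mapping relationship table records the mapping relationship among the group identifiers, logical addresses, and physical addresses of the plurality of storage area groups; and
    the cache is configured to store the mapping relationship table.
  8. The apparatus according to claim 7, characterized in that the grouping period is a time period or an access-count period.
  9. The apparatus according to claim 7 or 8, characterized in that the processor is specifically configured to:
    perform iterative grouping on the plurality of storage areas according to the characteristic information and a set grouping condition until the plurality of storage area groups that do not satisfy the grouping condition is obtained;
    wherein the grouping condition comprises: the traffic proportion of a storage area group is greater than a preset proportion threshold, the traffic proportion being the ratio of the sum of the read/write frequencies of all storage areas contained in the storage area group to the sum of the read/write frequencies of the plurality of storage areas.
  10. The apparatus according to any one of claims 7 to 9, characterized in that the processor is specifically configured to:
    acquire, within the grouping period, the characteristic information of the plurality of storage areas according to command information for each storage area contained in received data processing requests.
  11. The apparatus according to claim 10, characterized in that the data processing request is an I/O request, and the characteristic information comprises I/O characteristic information.
  12. The apparatus according to any one of claims 7 to 11, characterized in that the controller is a controller in a solid-state storage device (SSD), and the storage array comprises NAND flash dies.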
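  13. A storage system, characterized by comprising a storage array and the data processing apparatus according to any one of claims 7 to 12, wherein the data processing apparatus is configured to acquire characteristic information of a plurality of storage areas in the storage array according to a grouping period.
  14. A computer system, characterized by comprising a host and the storage system according to claim 13, wherein the host is configured to send data processing requests to the storage system, the storage system is configured to execute stored instructions, and the storage system implements the data processing method according to any one of claims 1 to 6 by executing the instructions.
  15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when run on a computer, causes the computer to execute the method according to any one of claims 1 to 6.
  16. A computer program product, characterized in that, when the computer program product runs on a computer, it causes the computer to execute the method according to any one of claims 1 to 6.
PCT/CN2020/132897 2020-11-30 2020-11-30 Data processing method, apparatus, and system WO2022110196A1 (zh)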

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080103535.5A CN115989485A (zh) 2020-11-30 2020-11-30 Data processing method, apparatus, and system
PCT/CN2020/132897 WO2022110196A1 (zh) 2020-11-30 2020-11-30 Data processing method, apparatus, and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/132897 WO2022110196A1 (zh) 2020-11-30 2020-11-30 Data processing method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2022110196A1 true WO2022110196A1 (zh) 2022-06-02

Family

ID=81753916

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132897 WO2022110196A1 (zh) 2020-11-30 2020-11-30 Data processing method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN115989485A (zh)
WO (1) WO2022110196A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118426712A (zh) * 2024-07-05 2024-08-02 深圳市天创伟业科技有限公司 Flash memory card data storage method, apparatus, device, and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118051948B (zh) * 2024-04-16 2024-07-19 深圳迅策科技股份有限公司 Dynamic secure storage method based on a big data platform

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148702A1 (en) * 2014-11-20 2016-05-26 HGST Netherlands B.V. Calibrating optimal read levels
US20170139591A1 (en) * 2015-11-13 2017-05-18 Samsung Electronics Co., Ltd. Multimode storage device
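CN109445681A (zh) * 2018-08-27 2019-03-08 华为技术有限公司 Data storage method, apparatus, and storage system
CN110554999A (zh) * 2018-05-31 2019-12-10 华为技术有限公司 Method and apparatus for identifying and separating hot and cold attributes based on a log-structured file system and a flash memory device, and related products
CN111159058A (zh) * 2019-12-27 2020-05-15 深圳大普微电子科技有限公司 Wear leveling method and apparatus, and non-volatile storage device
CN111782134A (zh) * 2019-06-14 2020-10-16 北京京东尚科信息技术有限公司 Data processing method, apparatus, system, and computer-readable storage medium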

Also Published As

Publication number Publication date
CN115989485A (zh) 2023-04-18

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20963084; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20963084; Country of ref document: EP; Kind code of ref document: A1)