EP4044039A1 - Data processing method and apparatus, and storage medium - Google Patents

Data processing method and apparatus, and storage medium

Info

Publication number
EP4044039A1
EP4044039A1 (application EP20885263.2A)
Authority
EP
European Patent Office
Prior art keywords
hard disk
logical space
data
address
storage device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP20885263.2A
Other languages
English (en)
French (fr)
Other versions
EP4044039A4 (de)
Inventor
Yang Liu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of EP4044039A1
Publication of EP4044039A4
Legal status: Pending

Classifications

    • G06F12/10 Address translation
    • G06F12/1081 Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G06F12/0623 Address space extension for memory modules
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F3/064 Management of blocks
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G06F3/0688 Non-volatile semiconductor memory arrays
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD
    • G06F2212/1024 Latency reduction
    • G06F2212/7201 Logical to physical mapping or translation of blocks or pages
    • G06F2212/7205 Cleaning, compaction, garbage collection, erase control
    • G06F2212/7208 Multiple device management, e.g. distributing data over multiple flash devices

Definitions

  • This application relates to the field of data storage technologies, and in particular, to a data access method and apparatus and a storage medium.
  • Streaming computing engines are becoming mainstream.
  • Data arrives continuously during computing and requires real-time processing; sequential read/write operations on such data are therefore more efficient.
  • A current storage device usually processes data organized as blocks, files, or objects, and cannot directly process streaming data.
  • Streaming data refers to a data sequence that arrives continuously over time. Therefore, before the data is processed by using the storage device, the organization manner of the data needs to be converted.
  • Most hard disks currently use a logical block address (logical block address, LBA) access manner.
  • An organization manner of the data in the storage device needs to be converted into an organization manner of the data in the hard disk. That is, in a current data access process, a logical address accessed by a client needs to be translated into a logical address for accessing the storage device, which in turn needs to be translated into a logical address for accessing the hard disk. This leads to two translations, and the overheads are relatively high.
  • This application provides a data access method and apparatus and a storage medium, to reduce overheads and improve data read/write efficiency.
  • The technical solutions are as follows:
  • A data access method is provided, where the method is applied to a storage device, the storage device includes a plurality of hard disks, and the method includes: The storage device receives a first data write request from a client, where the first data write request carries target data to be written and an address of a service logical space corresponding to the target data; determines, based on the address of the service logical space, a target hard disk in the plurality of hard disks and an address of a hard disk logical space corresponding to the service logical space; and writes, in an append-only manner, the target data into a physical space that is in the target hard disk and corresponds to the hard disk logical space, based on the determined address of the hard disk logical space.
  • The client requests the storage device to create the service logical space when it needs to store data.
  • The address of the service logical space may include an identifier and an offset of the service logical space.
  • The identifier of the service logical space uniquely identifies the service logical space, and the offset indicates length information of the area, in which data is currently written, of the physical space allocated to the service logical space.
  • The address of the hard disk logical space may also include an identifier and an offset of the hard disk logical space.
  • The offset included in the address of the hard disk logical space may be the offset included in the address of the service logical space.
  • Append-only write means that data is written only by appending, that is, the written data is organized based on the time sequence in which it is written.
  • After data is written, the process does not subsequently perform a write operation on the area in which the data is located, and only performs read operations on it.
  • The offset indicates the length information of the area, in which data is currently written, of the physical space allocated to the service logical space, and the address of the hard disk logical space includes the identifier and the offset of the hard disk logical space. Therefore, the storage device may determine, based on the address of the hard disk logical space, a start address in the physical space that is in the target hard disk and corresponds to the hard disk logical space, and then write the target data starting from that address.
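  The append-only semantics and the offset-based start address described above can be sketched as follows. This is a minimal illustrative model; `AppendOnlySpace` and its fields are names assumed for this example, not taken from the patent.

```python
# Minimal sketch of an append-only logical space: data is written only
# at the current offset, and earlier areas are never rewritten.
class AppendOnlySpace:
    def __init__(self, space_id, size):
        self.space_id = space_id      # identifier of the logical space
        self.offset = 0               # length of the area already written
        self.buf = bytearray(size)    # physical space backing the logical space

    def append(self, data):
        """Write data at the current offset and advance the offset."""
        start = self.offset
        if start + len(data) > len(self.buf):
            raise ValueError("physical space exhausted")
        self.buf[start:start + len(data)] = data
        self.offset += len(data)
        return (self.space_id, start)  # address = (identifier, offset)

    def read(self, start, length):
        return bytes(self.buf[start:start + length])
```

  Appending returns the address as an (identifier, offset) pair, matching the address structure described above; areas already written are only ever read afterwards.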
  • The client accesses the storage device by using the address of the service logical space.
  • The storage device determines the address of the hard disk logical space from the address of the service logical space, and then accesses the target hard disk based on the address of the hard disk logical space.
  • The service logical space is specific to the client.
  • The hard disk logical space is specific to the target hard disk.
  • Both the service logical space and the hard disk logical space are logical spaces.
  • Although different limitations are imposed on them, the manner in which the client accesses the storage device and the manner in which the storage device accesses the target hard disk are the same. Therefore, when the storage device writes, in the append-only manner, the target data into the physical space that is in the target hard disk and corresponds to the hard disk logical space, no address translation is performed inside the hard disk. This reduces overheads and increases the data write rate.
  • The service logical space is specific to the client, and the hard disk logical space is specific to the hard disk.
  • The service logical space that the client requests the storage device to create facilitates subsequent data access.
  • The storage device may include a plurality of hard disks. Therefore, to ensure reliability of subsequent data access, after creating the service logical space and the hard disk logical space, the storage device may obtain the identifier of the service logical space and the identifier of the hard disk logical space, and store a correspondence between the two identifiers. That is, a correspondence between the service logical space and the hard disk logical space is created.
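  The correspondence described above can be pictured as a simple lookup table inside the storage device. The sketch below is illustrative only; `SpaceDirectory` and the identifiers are assumed names, not the patent's implementation.

```python
# Hypothetical sketch of the correspondence the storage device keeps
# between service logical spaces and hard disk logical spaces.
class SpaceDirectory:
    def __init__(self):
        # service space id -> (target hard disk id, hard disk logical space id)
        self.mapping = {}

    def record(self, service_id, disk_id, disk_space_id):
        self.mapping[service_id] = (disk_id, disk_space_id)

    def resolve(self, service_id, offset):
        """Translate a service-space address into a hard-disk-space address.

        The offset carries over unchanged, so only this single
        translation step is needed on the write path."""
        disk_id, disk_space_id = self.mapping[service_id]
        return disk_id, (disk_space_id, offset)
```

  Because the offset is reused as-is, resolving the mapping is the only translation the storage device performs.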
  • The hard disk logical space created in the target hard disk corresponds to one or more erase blocks, and the data stored in the one or more erase blocks has a same cold/hot degree or a same life cycle.
  • A size of the physical space allocated by the target hard disk to the created hard disk logical space is an integer multiple of the size of one erase block.
  • Write amplification in the garbage collection process can be reduced by using the foregoing method, or by using another method.
  • For example, the size of the hard disk logical space in the target hard disk is adjusted so that it is equal to an integer multiple of the size of one erase block.
  • A data access apparatus, according to a second aspect, has a function of implementing the behavior of the data access method according to the first aspect.
  • The data access apparatus includes one or more modules, and the one or more modules are configured to implement the data access method provided in the first aspect.
  • A storage device, according to a third aspect, includes a processor and a memory, where the memory is configured to store a program for performing the data access method provided in the first aspect and to store data related to implementation of that method.
  • The processor is configured to execute the program stored in the memory.
  • An operating apparatus of the storage device may further include a communication bus, where the communication bus is configured to establish a connection between the processor and the memory.
  • According to a fourth aspect, a computer-readable storage medium stores instructions. When the instructions are run on a computer, the computer is enabled to perform the data access method according to the first aspect.
  • According to a fifth aspect, a computer program product including instructions is provided. When the computer program product runs on a computer, the computer is enabled to perform the data access method according to the first aspect.
  • The storage device may determine, based on the address of the service logical space carried in the first data write request sent by the client, the target hard disk and the address of the hard disk logical space corresponding to the service logical space, and then write the target data into the target hard disk based on the address of the hard disk logical space. That is, the client accesses the storage device by using the address of the service logical space; inside the storage device, the address of the service logical space is translated into the address of the hard disk logical space, and the data is written into the hard disk based on that address. Only one address translation is therefore required, which reduces overheads and improves data read/write efficiency.
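  The single-translation write path summarized above can be sketched end to end. All names and data structures below are assumptions made for illustration; the patent does not prescribe this code.

```python
# Illustrative end-to-end sketch of the single-translation write path:
# the client addresses a service logical space, the storage device
# translates once to a hard disk logical space, and the hard disk
# appends the data at the carried-over offset.
def write_request(directory, disks, service_id, offset, data):
    # One translation: service space address -> hard disk space address.
    disk_id, disk_space_id = directory[service_id]
    space = disks[disk_id][disk_space_id]
    # Append-only: the offset must point at the current tail of the space.
    assert offset == len(space), "append-only: writes go at the tail"
    space.extend(data)
    return disk_id, disk_space_id, offset
```

  A write thus touches exactly one mapping lookup before reaching the hard disk; no further translation happens inside the disk.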
  • FIG. 1 is an architectural diagram of a storage system according to a data access method provided in an embodiment of this application. As shown in FIG. 1, the system includes a client 01 and a storage device 02, and a communication connection is established between the client 01 and the storage device 02.
  • The client 01 may send a data read request or a data write request to the storage device 02.
  • The storage device 02 may include a processor 021, a memory 022, and a hard disk 023.
  • The processor 021 may be a general-purpose central processing unit (central processing unit, CPU), a network processor (network processor, NP), or a microprocessor, or may be one or more integrated circuits configured to implement the solutions of this application, for example, an application-specific integrated circuit (application-specific integrated circuit, ASIC), a programmable logic device (programmable logic device, PLD), or a combination thereof.
  • The PLD may be a complex programmable logic device (complex programmable logic device, CPLD), a field-programmable gate array (field-programmable gate array, FPGA), a generic array logic (generic array logic, GAL), or any combination thereof.
  • The storage device may include a plurality of processors 021.
  • Each of these processors 021 may be a single-core processor or a multi-core processor.
  • The processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • The processor 021 may implement data read/write by running the operating system.
  • The memory 022 may further store program code of the solutions of this application, and the processor 021 controls execution thereof. That is, the memory 022 is configured to store the program code for executing the solutions of this application, and the processor 021 may execute the program code stored in the memory 022, to implement the data access method provided in the embodiment in FIG. 4 below.
  • The memory 022 may be a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), an optical disc (including a compact disc read-only memory (compact disc read-only memory, CD-ROM), a compact optical disc, a laser disc, a digital versatile disc, a Blu-ray disc, and the like), a magnetic disk storage medium or another magnetic storage device, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and capable of being accessed by a computer, but is not limited thereto.
  • The memory 022 may exist independently and be connected to the processor 021. Alternatively, the memory 022 may be integrated with the processor 021.
  • The storage device 02 may be a storage array, or may be a server.
  • When the storage device 02 is a storage array, the storage device 02 includes a controller and several hard disks.
  • In this case, the processor 021 and the memory 022 may be located in the controller of the storage array.
  • The controller is connected to the several hard disks by using back-end interface cards.
  • When the storage device 02 is a server, the processor 021, the memory 022, and the several hard disks are all located inside the server.
  • FIG. 1 is merely a schematic diagram of some components included in the device.
  • The storage device 02 may further include a communication bus and a communication interface (not shown in FIG. 1).
  • The communication bus is configured to transmit information between the components included in the storage device 02.
  • The communication interface is configured to communicate with another device or a communication network, such as Ethernet, a radio access network (radio access network, RAN), or a wireless local area network (wireless local area network, WLAN).
  • The following describes in detail the data access method provided in an embodiment of this application.
  • The method provided in this embodiment of this application is applied to a storage device, and the storage device includes a plurality of hard disks.
  • Data access is mainly implemented by using a logical space, that is, a client accesses data in the storage device by using the logical space. Therefore, for ease of understanding, the process in which the client requests to create the logical space is described first.
  • The storage device receives a first logical space creation request from the client, where the first logical space creation request carries a first logical space size.
  • The storage device creates a service logical space, allocates an identifier to the service logical space, and sends the identifier of the service logical space to the client.
  • The storage device may further determine a target hard disk from the plurality of included hard disks based on the first logical space size, create a hard disk logical space in the target hard disk, allocate an identifier to the created hard disk logical space, and allocate, in the target hard disk, a physical space corresponding to the created hard disk logical space.
  • A size of the physical space is the same as the first logical space size.
  • An implementation process in which the storage device creates the hard disk logical space in the target hard disk and allocates, in the target hard disk, the physical space corresponding to the created hard disk logical space may be as follows:
  • The processor runs an operating system installed in the memory to send a second logical space creation request to the target hard disk, where the second logical space creation request carries the first logical space size.
  • After receiving the request, the target hard disk may create the hard disk logical space, allocate the identifier to the created hard disk logical space, and allocate the physical space corresponding to the created hard disk logical space, where the size of the physical space is equal to the first logical space size.
  • The target hard disk may then send the identifier of the created hard disk logical space to the processor.
  • Sizes of data currently stored in different hard disks may differ, that is, sizes of the remaining storage spaces (or physical spaces) in the plurality of hard disks may differ. Therefore, when the storage device determines the target hard disk from the plurality of hard disks based on the first logical space size, it may determine the size of the remaining storage space of each of the plurality of hard disks, and select, as the target hard disk, a hard disk whose remaining storage space is greater than or equal to the first logical space size.
  • The storage device may further select, from the plurality of hard disks based on a load balancing algorithm, a hard disk whose remaining storage space is greater than or equal to the first logical space size as the target hard disk.
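  The selection step above can be sketched as follows. The patent does not fix a particular load balancing algorithm, so the sketch uses a simple most-free-space rule as a stand-in; all names are illustrative.

```python
# Hedged sketch of target hard disk selection: keep only disks with
# enough remaining space, then pick the one with the most free space
# as a simple stand-in for a load balancing algorithm.
def pick_target_disk(disks, requested_size):
    """disks: mapping of disk id -> remaining space in bytes."""
    candidates = {d: free for d, free in disks.items() if free >= requested_size}
    if not candidates:
        return None  # no disk can hold the requested logical space
    return max(candidates, key=candidates.get)
```

  Any policy that only considers disks whose remaining space covers the requested size satisfies the condition stated above; the tie-breaking rule is a design choice.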
  • All data in an erase block needs to be erased before new data can be written to the erase block.
  • When garbage collection is performed on the target hard disk, if all data in an erase block is invalid, the erase block may be erased directly. If part of the data in an erase block is invalid and the remaining part is valid, the valid data needs to be migrated first. That valid data has already been written once and is written again during the migration; this phenomenon is referred to as write amplification.
  • The hard disk logical space created in the target hard disk corresponds to one or more erase blocks, and the data stored in the one or more erase blocks has a same cold/hot degree or a same life cycle. That is, the size of the physical space allocated by the target hard disk to the created hard disk logical space is an integer multiple of the size of one erase block.
  • The size of the hard disk logical space created in the target hard disk may not be an integer multiple of the size of one erase block.
  • In that case, an erase block may correspond to two hard disk logical spaces.
  • The data stored in the erase block may also have a same cold/hot degree, or have a same life cycle.
  • The data corresponding to the two hard disk logical spaces may also have the same cold/hot degree, or have the same life cycle.
  • When the target hard disk allocates physical spaces corresponding to the two hard disk logical spaces, the physical spaces corresponding to the two hard disk logical spaces are consecutive.
  • A difference between cold/hot degrees of the data stored in the erase block is less than a specific threshold, or a difference between life cycles of the data stored in the erase block is less than a specific threshold.
  • In that case, the erase block may be erased directly.
  • As shown in FIG. 3, for an erase block, if data corresponding to two service logical spaces is written into the erase block and the cold/hot degrees of that data differ, then after the physical space corresponding to one of the service logical spaces is subsequently released, the data corresponding to the other service logical space is still valid. In this case, the erase block cannot be erased directly; the valid data needs to be migrated first. Therefore, in this embodiment of this application, the cold/hot degrees of data stored in a same erase block are set to be the same or similar, so that write amplification in the garbage collection process can be reduced.
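  The effect described above can be illustrated with a toy model of one erase block. This is not the patent's algorithm; it only counts how many valid pages would have to be rewritten before the block can be erased.

```python
# Illustrative model: an erase block is a list of page states, each
# 'valid' or 'invalid'. Valid pages must be copied elsewhere before
# the block is erased; each copy is an extra write (write amplification).
def pages_migrated(block):
    """Return the number of pages rewritten to free the block."""
    return sum(1 for page in block if page == "valid")

# A block holding only same-life-cycle data becomes all-invalid together,
# so it can be erased directly with no extra writes:
assert pages_migrated(["invalid"] * 4) == 0
# A block mixing data with different life cycles still holds valid pages:
assert pages_migrated(["invalid", "valid", "valid", "invalid"]) == 2
```

  Keeping same-life-cycle data together drives the first case, where garbage collection frees a block without migrating anything.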
  • Write amplification in the garbage collection process can be reduced by using the foregoing method, or by using another method.
  • For example, the size of the hard disk logical space in the target hard disk is adjusted so that it is equal to an integer multiple of the size of one erase block.
  • The following describes two implementations of adjusting the size of the hard disk logical space in the target hard disk.
  • In the first implementation, the storage device may determine whether the first logical space size is an integer multiple of the size of one erase block. If it is not, the storage device may determine a second logical space size that is equal to an integer multiple of the size of one erase block. Then, the target hard disk is determined from the plurality of hard disks based on the second logical space size, the hard disk logical space is created in the target hard disk, and the physical space corresponding to the created hard disk logical space is allocated in the target hard disk. That is, the size of the physical space is the same as the second logical space size.
  • In the second implementation, the storage device may determine whether the first logical space size is an integer multiple of the size of one erase block. If it is not, the storage device may send a configuration recommendation message to the client, where the configuration recommendation message carries a recommended second logical space size, and the recommended second logical space size is an integer multiple of the size of one erase block.
  • The storage device then determines the target hard disk from the plurality of hard disks based on the second logical space size, creates the hard disk logical space in the target hard disk, and allocates, in the target hard disk, the physical space corresponding to the created hard disk logical space. That is, the size of the physical space is the same as the second logical space size.
  • The second logical space size may be greater than the first logical space size, and the difference between the two sizes is less than the size of one erase block.
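  The size adjustment in both implementations amounts to rounding the requested size up to the next erase-block multiple. A minimal sketch, with the function name assumed:

```python
# Round a requested logical space size up to an integer multiple of the
# erase block size, as in the adjustment described above.
def adjust_space_size(requested, erase_block):
    if requested % erase_block == 0:
        return requested          # already an integer multiple
    adjusted = (requested // erase_block + 1) * erase_block
    # The adjusted size exceeds the request by less than one erase block,
    # matching the bound stated above.
    assert adjusted - requested < erase_block
    return adjusted
```

  For example, with a 4096-byte erase block, a 10 000-byte request is adjusted to 12 288 bytes, three full erase blocks.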
  • an implementation in which the storage device selects the target hard disk based on the second logical space size is the same as the foregoing implementation in which the storage device selects the target hard disk based on the first logical space size.
  • an implementation in which the storage device creates the hard disk logical space in the target hard disk based on the first logical space size and allocates the physical space corresponding to the hard disk logical space is the same as the foregoing implementation in which the storage device creates the hard disk logical space in the target hard disk based on the first logical space size and allocates the physical space corresponding to the hard disk logical space. Details are not described in this embodiment of this application.
  • the storage device may reduce write amplification by storing, on a same erase block, data having a same or similar cold/hot degree or a same or similar life cycle.
  • the over-provisioning space is normally used to store valid data migrated during garbage collection. If processing is performed based on the foregoing method, however, migration of valid data during garbage collection can be avoided. In this way, the size of the over-provisioning space can be reduced, more data can be stored in the hard disk, and costs are reduced.
  • the service logical space is specific to the client, and the hard disk logical space is specific to the hard disk.
  • the client requests the storage device to create the service logical space in order to facilitate subsequent data access.
  • the storage device may include a plurality of hard disks. Therefore, to ensure reliability of subsequent data access, after the storage device obtains the identifier of the service logical space and the identifier of the hard disk logical space, the storage device may store a correspondence between the identifier of the service logical space and the identifier of the hard disk logical space. That is, a correspondence between the service logical space and the hard disk logical space is created.
  • the storage device may further store a correspondence between an identifier of the target hard disk and the identifier of the service logical space.
  • the storage device may store a correspondence among the identifier of the service logical space, the identifier of the target hard disk, and the identifier of the hard disk logical space.
  • the target hard disk may store a mapping relationship between the hard disk logical space and the physical space allocated to the hard disk logical space. That is, a mapping relationship between the identifier of hard disk logical space and an address range of the physical space allocated to the hard disk logical space is stored.
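The two levels of mapping described above (the correspondence kept by the storage device, and the physical mapping kept by the target hard disk) can be modeled with two tables. The structures and names below are illustrative assumptions only:

```python
# Kept by the storage device: identifier of the service logical space ->
# (identifier of the target hard disk, identifier of the hard disk logical space).
service_map = {}

# Kept inside each hard disk: identifier of the hard disk logical space ->
# address range (physical start, allocated length) of the physical space.
disk_map = {}

def record_correspondence(svc_id, disk_id, plog_id, phys_start, phys_len):
    """Store both correspondences after the hard disk logical space is created."""
    service_map[svc_id] = (disk_id, plog_id)
    disk_map.setdefault(disk_id, {})[plog_id] = (phys_start, phys_len)
```

A lookup in `service_map` resolves a service logical space to a target hard disk and a hard disk logical space; the disk's own `disk_map` entry then resolves that to a physical address range.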
  • the first logical space creation request further needs to carry a data protection attribute.
  • the data protection attribute may include a plurality of copies, an erasure code (erasure code, EC), and the like.
  • the plurality of copies are used as an example.
  • a plurality of target hard disks may be selected. That is, the plurality of selected target hard disks are in a one-to-one correspondence with the plurality of copies.
  • a hard disk logical space may be created in each target hard disk in the plurality of target hard disks based on the foregoing same method. Further, a correspondence among the identifier of the service logical space, an identifier of the target hard disk, and an identifier of the hard disk logical space is stored, so as to create a correspondence between the service logical space and the hard disk logical spaces in the plurality of target hard disks.
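The one-to-one selection of target hard disks for a plurality of copies can be sketched as follows. The selection policy shown (taking the first N disks) is a placeholder assumption; a real device would also weigh free capacity and wear:

```python
def select_target_disks(disk_ids: list, num_copies: int) -> list:
    """Pick one distinct target hard disk per copy, so the selected disks
    are in a one-to-one correspondence with the copies."""
    if len(disk_ids) < num_copies:
        raise ValueError("not enough hard disks for the requested number of copies")
    return disk_ids[:num_copies]  # placeholder policy; see lead-in
```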
  • when the data protection attribute is the EC or another attribute, the process in which the storage device creates the logical space may be different from the foregoing process, but the main idea is the same: both processes are obtained by making appropriate variations to the foregoing process based on the characteristics of the different data protection attributes.
  • FIG. 4 is a flowchart of a data access method according to an embodiment of this application. The method includes the following steps.
  • Step 401: A storage device receives a first data write request from a client, where the first data write request carries target data to be written and an address of a service logical space corresponding to the target data.
  • the client requests the storage device to create the service logical space when needing to store data.
  • the address of the service logical space may include an identifier and an offset of the service logical space.
  • the identifier of the service logical space is used to uniquely identify the service logical space, and the offset indicates the length of the area that has already been written in the physical space allocated to the service logical space.
  • Step 402: The storage device determines a target hard disk in the plurality of hard disks and an address of a hard disk logical space corresponding to the service logical space based on the address of the service logical space.
  • the storage device stores a correspondence among the identifier of the service logical space, an identifier of the target hard disk, and an identifier of the hard disk logical space. Therefore, the storage device may determine the identifier of the target hard disk and the identifier of the hard disk logical space from the correspondence based on the identifier of the service logical space.
  • the address of the hard disk logical space may also include the identifier and an offset of the hard disk logical space.
  • the offset included in the address of the hard disk logical space may be the offset included in the address of the service logical space.
  • Step 403: The storage device writes, in an append-only write manner, the target data into a physical space corresponding to the hard disk logical space and in the target hard disk based on the determined address of the hard disk logical space.
  • append-only write means that data is written only by appending, that is, the written data is organized based on the time sequence in which it is written.
  • after data is written, the area in which the data is located is no longer written to; only read operations are subsequently performed on it.
  • the offset is used to indicate length information of an area in which data is currently written, in the physical space allocated to the service logical space, and the address of the hard disk logical space includes the identifier and the offset of the hard disk logical space. Therefore, the storage device may determine a start address in the physical space corresponding to the hard disk logical space and in the target hard disk based on the address of the hard disk logical space, and then start to write the target data from the start address.
  • an implementation process in which the storage device writes, in the append-only write manner, the target data into the physical space corresponding to the hard disk logical space and in the target hard disk based on the determined address of the hard disk logical space is as follows:
  • the processor runs an operating system installed in the memory, to send a second data write request to the target hard disk, where the second data write request carries the target data and the determined address of the hard disk logical space.
  • after receiving the second data write request, the target hard disk writes, in the append-only write manner, the target data into the physical space corresponding to the hard disk logical space based on the address of the hard disk logical space.
  • the target hard disk stores a mapping relationship between the identifier of the hard disk logical space and an address range of the physical space allocated to the hard disk logical space. Therefore, after receiving the second data write request, the target hard disk may determine, based on the identifier of the hard disk logical space, the address range of the physical space allocated to the hard disk logical space from the mapping relationship. Further, a start address is determined in the address range based on the offset included in the address of the hard disk logical space, and the target data is written starting from the start address.
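The write path above (one translation from service logical space to hard disk logical space, then an append at the current offset) can be sketched with a toy model. All class and attribute names are assumptions made for the illustration:

```python
class ToyDisk:
    """Toy hard disk exposing a plog-style interface (structure assumed)."""
    def __init__(self, size: int):
        self.media = bytearray(size)
        self.plogs = {}  # hard disk logical space id -> (physical start, allocated length)

    def append(self, plog_id, offset, data):
        start, length = self.plogs[plog_id]
        if offset + len(data) > length:
            raise IOError("append exceeds the allocated physical space")
        self.media[start + offset : start + offset + len(data)] = data

class ToyStorageDevice:
    def __init__(self):
        self.disks = {}
        self.service_map = {}  # service space id -> (disk id, plog id)
        self.offsets = {}      # service space id -> current append offset

    def write(self, svc_id, data):
        # The only address translation: service logical space -> hard disk logical space.
        disk_id, plog_id = self.service_map[svc_id]
        offset = self.offsets.get(svc_id, 0)
        self.disks[disk_id].append(plog_id, offset, data)  # append-only: always at the tail
        self.offsets[svc_id] = offset + len(data)
        return offset  # start address of this write within the plog
```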
  • the client accesses the storage device by using the address of the service logical space.
  • the storage device determines the address of the hard disk logical space by using the address of the service logical space, and further accesses the target hard disk based on the address of the hard disk logical space.
  • the service logical space is specific to the client
  • the hard disk logical space is specific to the target hard disk.
  • Both the service logical space and the hard disk logical space are logical spaces.
  • although different limitations are imposed on the two logical spaces, the manner in which the client accesses the storage device and the manner in which the storage device accesses the target hard disk are the same. Therefore, when the storage device writes, in the append-only write manner, the target data into the physical space corresponding to the hard disk logical space and in the target hard disk, no address translation is performed in the hard disk. This reduces overheads and increases the data write rate.
  • the storage device may further write metadata corresponding to the service logical space into the target hard disk.
  • the metadata includes information related to the hard disk logical space, for example, a total size of the hard disk logical space, a size of currently written data, and an end location of the currently written data.
  • a process in which the storage device writes the metadata is similar to the process of writing the target data. Details are not described in this embodiment of this application.
  • a physical space in the target hard disk and corresponding to the metadata corresponding to the service logical space may be different from the physical space corresponding to the address of the hard disk logical space and in the target hard disk.
  • the storage device may create a logical space in the target hard disk, where a physical space corresponding to the logical space is different from the physical space corresponding to the hard disk logical space, and the physical space corresponding to the logical space is used to store the metadata corresponding to the service logical space.
  • Steps 401 to 403 are an implementation process in which the storage device writes data based on the logical space.
  • the following describes a process in which the storage device reads the data based on the logical space.
  • the storage device receives a first data read request from the client.
  • the first data read request carries a length of data to be read and an address of a service logical space corresponding to the data to be read.
  • the storage device determines a target hard disk in the plurality of hard disks and an address of a hard disk logical space corresponding to the service logical space based on the address of the service logical space. Then, the storage device may read the data from a physical space corresponding to the hard disk logical space and in the target hard disk based on the length of the data to be read and the address of the hard disk logical space.
  • the address of the service logical space also includes an identifier and an offset of the service logical space.
  • the offset is used to indicate information about a length between a start address of the data to be read and a start address of the physical space allocated to the service logical space.
  • An implementation process in which the storage device reads the data from the physical space corresponding to the hard disk logical space and in the target hard disk based on the length of the data to be read and the address of the hard disk logical space may be as follows: The storage device determines a start address in the physical space corresponding to the hard disk logical space and in the target hard disk based on the address of the hard disk logical space, and then starts to read the data from the start address based on the length of the data to be read.
  • an implementation process in which the storage device determines the start address in the physical space corresponding to the hard disk logical space and in the target hard disk based on the address of the hard disk logical space is as follows:
  • the processor runs the operating system installed in the memory, to send a second data read request to the target hard disk, where the second data read request carries the length of the data to be read and the determined address of the hard disk logical space.
  • the target hard disk determines the start address in the physical space corresponding to the hard disk logical space and in the target hard disk based on the address of the hard disk logical space.
  • the target hard disk stores a mapping relationship between the identifier of the hard disk logical space and an address range of the physical space allocated to the hard disk logical space. Therefore, after receiving the second data read request, the target hard disk may determine, based on the identifier of the hard disk logical space, the address range of the physical space allocated to the hard disk logical space from the mapping relationship. Further, the start address is determined in the address range based on the offset included in the address of the hard disk logical space.
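The read path above (resolve the plog's physical address range from the mapping, derive the start address from the offset, then read the requested length) can be sketched as follows. The table contents and names are illustrative assumptions:

```python
# Illustrative on-disk mapping: plog id -> (physical start, allocated length).
plog_table = {"p0": (4096, 8192)}
media = bytearray(65536)
media[4096:4101] = b"hello"  # pretend previously appended data

def read_plog(plog_id, offset, length):
    """Resolve the start address from the plog's physical range and the
    offset carried in the hard disk logical space address, then read."""
    start, alloc_len = plog_table[plog_id]
    if offset + length > alloc_len:
        raise IOError("read exceeds the allocated physical space")
    return bytes(media[start + offset : start + offset + length])
```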
  • states of the service logical space and the hard disk logical space are in an open state by default.
  • the storage device may set both the state of the service logical space and the state of the hard disk logical space in the target hard disk to a closed state. In this way, data can subsequently no longer be written by using the service logical space and the hard disk logical space in the target hard disk, but data can still be read normally.
  • the storage device not only sets the state of the service logical space to the closed state, but also needs to set the state of the hard disk logical space in the target hard disk to the closed state. In this way, when the target hard disk maintains resources, the physical space corresponding to the hard disk logical space and in the closed state may not be maintained, thereby saving resources.
  • the storage device may set both the state of the service logical space and the state of the hard disk logical space to a deleted state, and release the corresponding physical space. In this way, during garbage collection, the physical space corresponding to the hard disk logical space and in the deleted state may be erased.
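The lifecycle described above (open by default, then closed or deleted) can be summarized as a small state model. The state names come from the text; the enum and helper functions are an illustrative sketch, not an API of the patent:

```python
from enum import Enum

class PlogState(Enum):
    OPEN = "open"        # default after creation: readable and appendable
    CLOSED = "closed"    # no further writes allowed; reads still succeed
    DELETED = "deleted"  # physical space released; erased during garbage collection

def writable(state: PlogState) -> bool:
    return state is PlogState.OPEN

def readable(state: PlogState) -> bool:
    return state in (PlogState.OPEN, PlogState.CLOSED)
```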
  • the logical space may be referred to as a persistence log (persistence log, plog), and a method for operating the logical space is referred to as a plog interface. That is, both the processor and the hard disk of the storage device provide the plog interface.
  • a solid-state drive (solid state drive, SSD) is used as an example of the hard disk for description.
  • a client is installed on a user host, the storage device provides a plog interface for the user host, and the SSD also provides a plog interface.
  • when the client needs to access data in the storage device, the client may interact with the storage device based on the plog interface that the storage device provides for the user host, and data may also be transmitted inside the storage device based on the plog interface. This ensures that the access manners are consistent, so that the data organization manner does not need to be converted and data read/write efficiency is high.
  • the storage device does not need to manage a physical space in the hard disk and does not need to process a complex task such as garbage collection, thereby greatly simplifying the storage system.
  • an intelligent network interface card may be disposed in both the user host and the storage device.
  • data read/write may be directly implemented by using the intelligent network interface card subsequently, without participation of a processor in the storage device. That is, when data is written, the client may send the data to the intelligent network interface card of the storage device by using the intelligent network interface card in the user host, and the intelligent network interface card of the storage device may identify the received data, so as to write the data into the hard disk. A case of reading data is similar to this.
  • the processor of the storage device may further process another task in parallel.
  • a control-layer operation further requires participation of the processor of the storage device. For example, operations such as creating, deleting, and closing a logical space further require participation of the processor of the storage device.
  • the storage device may determine the target hard disk and the address of the hard disk logical space corresponding to the service logical space based on the address of the service logical space carried in the first data write request sent by the client, and further write the target data into the target hard disk based on the address of the hard disk logical space. That is, the client accesses the storage device by using the address of the service logical space. Inside the storage device, the address of the service logical space is translated into the address of the hard disk logical space. Data is written into the hard disk based on the address of the hard disk logical space. Therefore, it can be learned that only one time of address translation is required, thereby reducing overheads and improving data read/write efficiency.
  • FIG. 6 is a schematic structural diagram of a data access apparatus according to an embodiment of this application.
  • the apparatus is located in a storage device, and is executed by a processor 021 by invoking program code in a memory 022.
  • the apparatus may be implemented as a part or all of the storage device by using software, hardware, or a combination of the two.
  • the storage device may include a plurality of hard disks.
  • the apparatus includes a receiving module 601, a determining module 602, and a writing module 603.
  • the receiving module 601 is configured to receive a first data write request from a client, where the first data write request carries target data to be written and an address of a service logical space corresponding to the target data;
  • the apparatus further includes: a creation module, configured to create a correspondence between the service logical space and the hard disk logical space.
  • the hard disk logical space corresponds to one or more erase blocks, and data stored in the one or more erase blocks has a same cold/hot degree or a same life cycle.
  • the apparatus further includes: an adjustment module, configured to adjust a size of the hard disk logical space, so that the size of the hard disk logical space is equal to an integer multiple of a size of one erase block.
  • the storage device may determine the target hard disk and the address of the hard disk logical space corresponding to the service logical space based on the address of the service logical space carried in the first data write request sent by the client, and further write the target data into the target hard disk based on the address of the hard disk logical space. That is, the client accesses the storage device by using the address of the service logical space. Inside the storage device, the address of the service logical space is translated into the address of the hard disk logical space. Data is written into the hard disk based on the address of the hard disk logical space. Therefore, it can be learned that only one time of address translation is required, thereby reducing overheads and improving data read/write efficiency.
  • when the data access apparatus provided in the foregoing embodiments accesses data, division into the foregoing function modules is used only as an example for description.
  • in actual application, the foregoing functions may be allocated to different function modules and implemented according to a requirement, that is, the inner structure of the apparatus is divided into different function modules to implement all or some of the functions described above.
  • the data access apparatus provided in the foregoing embodiment belongs to a same concept as the data access method embodiment. For a specific implementation process, refer to the method embodiment. Details are not described herein again.
  • All or some of the foregoing embodiments may be implemented by using software, hardware, firmware, or any combination thereof.
  • the embodiments may be implemented completely or partially in a form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a dedicated computer, a computer network, or other programmable apparatuses.
  • the computer instructions may be stored in a computer-readable storage medium or may be transmitted from a computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, a coaxial cable, an optical fiber, or a digital subscriber line (digital subscriber line, DSL)) or wireless (for example, infrared, radio, or microwave) manner.
  • the computer storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (digital versatile disc, DVD)), a semiconductor medium (for example, a solid-state drive (solid-state drive, SSD)), or the like.
  • the computer-readable storage medium mentioned in this application may be a non-volatile storage medium, in other words, may be a non-transitory storage medium.

EP20885263.2A 2019-11-08 2020-10-09 Datenverarbeitungsverfahren und -vorrichtung sowie speichermedium Pending EP4044039A4 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201911085845 2019-11-08
CN202010117250.9A CN112783804A (zh) 2019-11-08 2020-02-25 数据访问方法、装置及存储介质
PCT/CN2020/119935 WO2021088587A1 (zh) 2019-11-08 2020-10-09 数据访问方法、装置及存储介质

Publications (2)

Publication Number Publication Date
EP4044039A1 true EP4044039A1 (de) 2022-08-17
EP4044039A4 EP4044039A4 (de) 2022-12-07

Family

ID=75749982

Family Applications (1)

Application Number Title Priority Date Filing Date
EP20885263.2A Pending EP4044039A4 (de) 2019-11-08 2020-10-09 Datenverarbeitungsverfahren und -vorrichtung sowie speichermedium

Country Status (4)

Country Link
US (1) US12050539B2 (de)
EP (1) EP4044039A4 (de)
CN (1) CN112783804A (de)
WO (1) WO2021088587A1 (de)


Also Published As

Publication number Publication date
US12050539B2 (en) 2024-07-30
EP4044039A4 (de) 2022-12-07
WO2021088587A1 (zh) 2021-05-14
US20220261354A1 (en) 2022-08-18
CN112783804A (zh) 2021-05-11


Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20220511

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

A4 Supplementary search report drawn up and despatched

Effective date: 20221108

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 12/1081 20160101AFI20221102BHEP

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)