WO2018028529A1 - Lock-free IO processing method and apparatus - Google Patents

Lock-free IO processing method and apparatus

Info

Publication number
WO2018028529A1
Authority
WO
WIPO (PCT)
Prior art keywords
logical address
thread
request
storage
command
Prior art date
Application number
PCT/CN2017/096152
Inventor
易正利
吴忠杰
Original Assignee
北京忆恒创源科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京忆恒创源科技有限公司
Publication of WO2018028529A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G06F3/0622 Securing storage systems in relation to access
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662 Virtualisation aspects
    • G06F3/0665 Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Definitions

  • The present application relates to storage system technology and, in particular, to an IO request processing method and apparatus for a storage system.
  • SSD: Solid State Drive
  • RAID: Redundant Array of Independent Disks
  • RAID technology lengthens the IO path and increases computational overhead.
  • Multi-core, multi-CPU technology is currently used. If the CPUs process IO requests concurrently, both data protection and high performance can be achieved.
  • The present invention aims to solve, at least to some extent, the technical problems in the related art described above.
  • According to a first aspect of the present invention, an IO request processing method for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and the virtual storage disks comprise a plurality of logical address regions. The method comprises: receiving a first IO request, wherein the first IO request accesses a first logical address region; and determining a first thread according to the first logical address region and causing the first thread to process the first IO request.
  • According to the first aspect of the present invention, a second IO request processing method for the storage system is provided, wherein the logical addresses of the plurality of logical address regions do not overlap one another, and the method further comprises: receiving a second IO request, wherein the second IO request accesses a second logical address region; and determining a second thread according to the second logical address region and causing the second thread to process the second IO request.
  • According to the first aspect of the present invention, a third IO request processing method for the storage system is provided, wherein determining the first thread according to the first logical address region and causing the first thread to process the first IO request comprises: generating a first IO command according to the first IO request; placing the first IO command, according to the first logical address region, into a first queue corresponding to the first thread; and the first thread fetching the first IO command from the first queue and processing it, wherein the first thread processes only commands of the first queue.
  • According to the first aspect of the present invention, a fourth IO request processing method for the storage system is provided, wherein determining the second thread according to the second logical address region and causing the second thread to process the second IO request comprises: generating a second IO command according to the second IO request; placing the second IO command, according to the second logical address region, into a second queue corresponding to the second thread; and the second thread fetching the second IO command from the second queue and processing it, wherein the second thread processes only commands of the second queue.
  • According to the first aspect of the present invention, a fifth IO request processing method for the storage system is provided, further comprising: taking the result of the index of the first logical address region modulo the number of threads as the index of the first thread; or taking the result of the index of the second logical address region modulo the number of threads as the index of the second thread.
  • According to the first aspect of the present invention, a sixth IO request processing method for the storage system is provided, further comprising: computing a hash of the index of the first logical address region to obtain the index of the first thread; or computing a hash of the index of the second logical address region to obtain the index of the second thread.
  • According to the first aspect of the present invention, a seventh IO request processing method for the storage system is provided, further comprising: taking the result of the index of the first logical address region modulo the number of threads as the index of the first queue; or taking the result of the index of the second logical address region modulo the number of threads as the index of the second queue.
  • According to the first aspect of the present invention, an eighth IO request processing method for the storage system is provided, further comprising: computing a hash of the index of the first logical address region to obtain the index of the first queue; or computing a hash of the index of the second logical address region to obtain the index of the second queue.
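The two index rules above (modulo and hash) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and CRC32 is an arbitrary stand-in because the text does not name a hash function.

```python
import zlib


def index_by_modulo(region_index: int, count: int) -> int:
    """Index of the thread (or queue): region index modulo the thread count."""
    return region_index % count


def index_by_hash(region_index: int, count: int) -> int:
    """Hash the region index, then reduce the digest to a valid index.

    CRC32 is an assumption made for this sketch only.
    """
    digest = zlib.crc32(region_index.to_bytes(8, "little"))
    return digest % count


NUM_THREADS = 4  # hypothetical thread count
for region in range(8):
    print(region, index_by_modulo(region, NUM_THREADS), index_by_hash(region, NUM_THREADS))
```

Either rule is deterministic, so every request for a given region is always routed to the same thread and queue.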
  • According to the first aspect of the present invention, a ninth IO request processing method for the storage system is provided, further comprising: determining a first mapping table entry according to the first logical address region; and the first thread accessing the first mapping table entry, obtaining a first storage object from the first mapping table entry according to the first logical address accessed by the first IO request, and accessing the first storage object to process the first IO request.
  • According to the first aspect of the present invention, a tenth IO request processing method for the storage system is provided, further comprising: determining a second mapping table entry according to the second logical address region; and the second thread accessing the second mapping table entry, obtaining a second storage object from the second mapping table entry according to the second logical address accessed by the second IO request, and accessing the second storage object to process the second IO request.
  • According to the first aspect of the present invention, an eleventh IO request processing method for the storage system is provided, further comprising: taking the result of the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry; or computing a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
  • According to the first aspect of the present invention, a twelfth IO request processing method for the storage system is provided, further comprising: taking the result of the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry; or computing a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
  • According to the first aspect of the present invention, a thirteenth IO request processing method for the storage system is provided, further comprising: if no storage object can be obtained from the first mapping table entry, then for a write request in the IO request, creating a third storage object, recording the third storage object in the first mapping table entry, and writing the data to the third storage object.
  • According to the first aspect of the present invention, a fourteenth IO request processing method for the storage system is provided, further comprising: if no storage object can be obtained from the second mapping table entry, then for a write request in the IO request, creating a fourth storage object, recording the fourth storage object in the second mapping table entry, and writing the data to the fourth storage object.
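The mapping table lookup and the create-on-write behavior described above can be sketched as follows. The class names, the fixed `region_size`, and the choice to key each entry by region index are all assumptions made for illustration; the text only states that an entry is selected from the region and that a storage object is obtained from it by logical address.

```python
from dataclasses import dataclass, field


@dataclass
class StorageObject:
    """Hypothetical stand-in for a storage object: data keyed by logical address."""
    data: dict = field(default_factory=dict)


class MappingTable:
    def __init__(self, num_entries: int, region_size: int):
        self.region_size = region_size  # logical addresses per region (assumed fixed)
        # Each entry records the storage objects of the regions that map to it.
        self.entries = [dict() for _ in range(num_entries)]

    def _entry_for(self, logical_address: int):
        region_index = logical_address // self.region_size
        # Entry index = region index modulo the number of mapping table entries.
        return region_index, self.entries[region_index % len(self.entries)]

    def lookup(self, logical_address: int):
        """Return the storage object for this address, or None if there is none."""
        region_index, entry = self._entry_for(logical_address)
        return entry.get(region_index)

    def write(self, logical_address: int, payload: bytes) -> None:
        region_index, entry = self._entry_for(logical_address)
        obj = entry.get(region_index)
        if obj is None:
            # Miss on a write: create a storage object and record it in the entry.
            obj = StorageObject()
            entry[region_index] = obj
        obj.data[logical_address] = payload


table = MappingTable(num_entries=4, region_size=8)
table.write(10, b"x")
print(table.lookup(10), table.lookup(100))
```

In the scheme described by the text, only the thread that owns the region touches its mapping table entry, which is why this sketch takes no lock around the entry.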
  • According to a second aspect of the present invention, an IO request processing method for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and the virtual storage disks comprise a plurality of logical address regions. The method comprises: receiving a first IO request; generating a first IO command and a second IO command according to the first IO request, wherein the first IO command accesses a first logical address region and the second IO command accesses a second logical address region; determining a first thread according to the first logical address region and causing the first thread to process the first IO command; and determining a second thread according to the second logical address region and causing the second thread to process the second IO command.
  • According to the second aspect of the present invention, a second IO request processing method for the storage system is provided, further comprising: placing the first IO command, according to the first logical address region, into a first queue corresponding to the first thread, the first thread fetching the first IO command from the first queue and processing it, wherein the first thread processes only commands of the first queue; and placing the second IO command, according to the second logical address region, into a second queue corresponding to the second thread, the second thread fetching the second IO command from the second queue and processing it, wherein the second thread processes only commands of the second queue.
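Splitting one IO request into per-region IO commands, as the second aspect describes, might look like the sketch below. It assumes fixed-size regions and a request expressed as a start address plus a length; `IOCommand`, `split_request`, and `queue_index` are hypothetical names, and the modulo rule is used for queue selection as one of the options the text allows.

```python
from typing import List, NamedTuple


class IOCommand(NamedTuple):
    region_index: int  # logical address region the command falls in
    start: int         # first logical address covered by the command
    length: int        # number of logical addresses covered


def split_request(start: int, length: int, region_size: int) -> List[IOCommand]:
    """Cut a request at region boundaries so each command stays inside one region."""
    commands = []
    end = start + length
    while start < end:
        region_index = start // region_size
        region_end = (region_index + 1) * region_size
        piece_end = min(end, region_end)
        commands.append(IOCommand(region_index, start, piece_end - start))
        start = piece_end
    return commands


def queue_index(command: IOCommand, num_threads: int) -> int:
    """Queue (and thread) selected for a command: region index modulo thread count."""
    return command.region_index % num_threads


for cmd in split_request(start=6, length=10, region_size=8):
    print(cmd, "-> queue", queue_index(cmd, 4))
```

With a region size of 8, the request covering addresses 6 through 15 yields one command for region 0 (addresses 6 and 7) and one for region 1 (addresses 8 through 15), and each is routed independently.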
  • According to the second aspect of the present invention, a third IO request processing method for the storage system is provided, further comprising: determining a first mapping table entry according to the first logical address region; and the first thread accessing the first mapping table entry, obtaining a first storage object from the first mapping table entry according to the first logical address accessed by the first IO command, and accessing the first storage object to process the first IO command.
  • According to the second aspect of the present invention, a fourth IO request processing method for the storage system is provided, further comprising: determining a second mapping table entry according to the second logical address region; and the second thread accessing the second mapping table entry, obtaining a second storage object from the second mapping table entry according to the second logical address accessed by the second IO command, and accessing the second storage object to process the second IO command.
  • According to the second aspect of the present invention, a fifth IO request processing method for the storage system is provided, further comprising: taking the result of the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry; or computing a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
  • According to the second aspect of the present invention, a sixth IO request processing method for the storage system is provided, further comprising: taking the result of the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry; or computing a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
  • According to the second aspect of the present invention, a seventh IO request processing method for the storage system is provided, wherein, for a write request in the IO request, a third storage object is created, the third storage object is recorded in the first mapping table entry, and data is written to the third storage object.
  • According to the second aspect of the present invention, an eighth IO request processing method for the storage system is provided, wherein, for a write request in the IO request, a fourth storage object is created, the fourth storage object is recorded in the second mapping table entry, and data is written to the fourth storage object.
  • According to the second aspect of the present invention, a ninth IO request processing method for the storage system is provided, wherein the number of mapping table entries is an integer multiple of the number of threads that process IO commands.
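One plausible reading of the integer-multiple condition, offered here as an interpretation rather than something the text states, is that it keeps each mapping table entry exclusive to a single thread when both indices are computed by modulo: if the entry count E is a multiple of the thread count T, then (r mod E) mod T = r mod T for every region index r. The hypothetical helper below checks this by enumeration.

```python
def threads_touching_each_entry(num_entries: int, num_threads: int, num_regions: int) -> dict:
    """Map each mapping table entry index to the set of thread indices that reach it,
    assuming both indices are the region index taken modulo the respective count."""
    owners = {}
    for region in range(num_regions):
        entry = region % num_entries
        thread = region % num_threads
        owners.setdefault(entry, set()).add(thread)
    return owners


def is_exclusive(owners: dict) -> bool:
    """True when every entry is reached by exactly one thread."""
    return all(len(threads) == 1 for threads in owners.values())


print(is_exclusive(threads_touching_each_entry(8, 4, 1000)))  # True: 8 is a multiple of 4
print(is_exclusive(threads_touching_each_entry(6, 4, 1000)))  # False: 6 is not
```

When hashing is used instead of modulo, the same exclusivity holds only if the thread index is derived from the entry index in a compatible way; the sketch does not cover that case.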
  • According to the second aspect of the present invention, a tenth IO request processing method for the storage system is provided, further comprising: if no storage object can be obtained from the first mapping table entry, then for a read request in the IO request, returning a result indicating that the read request is abnormal.
  • According to the second aspect of the present invention, an eleventh IO request processing method for the storage system is provided, further comprising: if no storage object can be obtained from the second mapping table entry, then for a read request in the IO request, returning a result indicating that the read request is abnormal.
  • According to the second aspect of the present invention, a twelfth IO request processing method for the storage system is provided, wherein the first thread is executed only by a first CPU and the second thread is executed only by a second CPU.
  • According to a third aspect of the present invention, an IO request processing apparatus for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and the virtual storage disks comprise a plurality of logical address regions. The apparatus comprises: a first receiving module configured to receive a first IO request, wherein the first IO request accesses a first logical address region; and a first processing module configured to determine a first thread according to the first logical address region and cause the first thread to process the first IO request.
  • According to the third aspect of the present invention, a second IO request processing apparatus for the storage system is provided, wherein the logical addresses of the plurality of logical address regions do not overlap one another; the first receiving module is further configured to receive a second IO request, wherein the second IO request accesses a second logical address region; and the first processing module is further configured to determine a second thread according to the second logical address region and cause the second thread to process the second IO request.
  • According to the third aspect of the present invention, a third IO request processing apparatus for the storage system is provided, wherein the first processing module is configured to generate a first IO command and, according to the first logical address region, place the first IO command into a first queue corresponding to the first thread; the first thread fetches the first IO command from the first queue and processes it, and the first thread processes only commands of the first queue.
  • According to the third aspect of the present invention, a fourth IO request processing apparatus for the storage system is provided, wherein the first processing module is configured to generate a second IO command and, according to the second logical address region, place the second IO command into a second queue corresponding to the second thread; the second thread fetches the second IO command from the second queue and processes it, and the second thread processes only commands of the second queue.
  • According to the third aspect of the present invention, a fifth IO request processing apparatus for the storage system is provided, further comprising a first index calculation module configured to take the result of the index of the first logical address region modulo the number of threads as the index of the first thread, or to take the result of the index of the second logical address region modulo the number of threads as the index of the second thread.
  • According to the third aspect of the present invention, a sixth IO request processing apparatus for the storage system is provided, further comprising a second index calculation module configured to compute a hash of the index of the first logical address region to obtain the index of the first thread, or to compute a hash of the index of the second logical address region to obtain the index of the second thread.
  • According to the third aspect of the present invention, a seventh IO request processing apparatus for the storage system is provided, further comprising a third index calculation module configured to take the result of the index of the first logical address region modulo the number of threads as the index of the first queue, or to take the result of the index of the second logical address region modulo the number of threads as the index of the second queue.
  • According to the third aspect of the present invention, an eighth IO request processing apparatus for the storage system is provided, further comprising a fourth index calculation module configured to compute a hash of the index of the first logical address region to obtain the index of the first queue, or to compute a hash of the index of the second logical address region to obtain the index of the second queue.
  • According to the third aspect of the present invention, a ninth IO request processing apparatus for the storage system is provided, further comprising a first mapping table determining module configured to determine a first mapping table entry according to the first logical address region; the first processing module is further configured to access the first mapping table entry through the first thread, obtain a first storage object from the first mapping table entry according to the first logical address accessed by the first IO request, and access the first storage object to process the first IO request.
  • According to the third aspect of the present invention, a tenth IO request processing apparatus for the storage system is provided, further comprising a second mapping table determining module configured to determine a second mapping table entry according to the second logical address region; the first processing module is further configured to access the second mapping table entry through the second thread, obtain a second storage object from the second mapping table entry according to the second logical address accessed by the second IO request, and access the second storage object to process the second IO request.
  • According to the third aspect of the present invention, an eleventh IO request processing apparatus for the storage system is provided, further comprising a fifth index calculation module configured to take the result of the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry, or to compute a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
  • According to the third aspect of the present invention, a twelfth IO request processing apparatus for the storage system is provided, further comprising a sixth index calculation module configured to take the result of the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry, or to compute a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
  • According to the third aspect of the present invention, a thirteenth IO request processing apparatus for the storage system is provided, wherein the first processing module is further configured to, when no storage object can be obtained from the first mapping table entry, create a third storage object for a write request in the IO request, record the third storage object in the first mapping table entry, and write the data to the third storage object.
  • According to the third aspect of the present invention, a fourteenth IO request processing apparatus for the storage system is provided, wherein the first processing module is further configured to, when no storage object can be obtained from the second mapping table entry, create a fourth storage object for a write request in the IO request, record the fourth storage object in the second mapping table entry, and write the data to the fourth storage object.
  • According to a fourth aspect of the present invention, an IO request processing apparatus for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and the virtual storage disks comprise a plurality of logical address regions. The apparatus comprises: a second receiving module configured to receive a first IO request; and a second processing module configured to generate a first IO command and a second IO command according to the first IO request, wherein the first IO command accesses a first logical address region and the second IO command accesses a second logical address region, to determine a first thread according to the first logical address region and cause the first thread to process the first IO command, and to determine a second thread according to the second logical address region and cause the second thread to process the second IO command.
  • According to the fourth aspect of the present invention, a second IO request processing apparatus for the storage system is provided, wherein the second processing module is configured to place the first IO command, according to the first logical address region, into a first queue corresponding to the first thread, the first thread fetching the first IO command from the first queue and processing it, wherein the first thread processes only commands of the first queue; and to place the second IO command, according to the second logical address region, into a second queue corresponding to the second thread, the second thread fetching the second IO command from the second queue and processing it, wherein the second thread processes only commands of the second queue.
  • According to the fourth aspect of the present invention, a third IO request processing apparatus for the storage system is provided, further comprising a third mapping module configured to determine a first mapping table entry according to the first logical address region; the second processing module is further configured to access the first mapping table entry through the first thread, obtain a first storage object from the first mapping table entry according to the first logical address accessed by the first IO command, and access the first storage object to process the first IO command.
  • According to the fourth aspect of the present invention, a fourth IO request processing apparatus for the storage system is provided, further comprising a fourth mapping module configured to determine a second mapping table entry according to the second logical address region; the second processing module is further configured to access the second mapping table entry through the second thread, obtain a second storage object from the second mapping table entry according to the second logical address accessed by the second IO command, and access the second storage object to process the second IO command.
  • According to the fourth aspect of the present invention, a fifth IO request processing apparatus for the storage system is provided, further comprising a seventh index calculation module configured to take the result of the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry, or to compute a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
  • According to the fourth aspect of the present invention, a sixth IO request processing apparatus for the storage system is provided, further comprising an eighth index calculation module configured to take the result of the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry, or to compute a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
  • According to the fourth aspect of the present invention, a seventh IO request processing apparatus for the storage system is provided, wherein, for a write request in the IO request, a third storage object is created, the third storage object is recorded in the first mapping table entry, and data is written to the third storage object.
  • According to the fourth aspect of the present invention, an eighth IO request processing apparatus for the storage system is provided, wherein, for a write request in the IO request, a fourth storage object is created, the fourth storage object is recorded in the second mapping table entry, and data is written to the fourth storage object.
  • According to the fourth aspect of the present invention, a ninth IO request processing apparatus for the storage system is provided, wherein the number of mapping table entries is an integer multiple of the number of threads that process IO commands.
  • According to the fourth aspect of the present invention, a tenth IO request processing apparatus for the storage system is provided, wherein the second processing module is further configured to, when no storage object can be obtained from the first mapping table entry, return a result indicating that the read request is abnormal for a read request in the IO request.
  • According to the fourth aspect of the present invention, an eleventh IO request processing apparatus for the storage system is provided, wherein the second processing module is further configured to, when no storage object can be obtained from the second mapping table entry, return a result indicating that the read request is abnormal for a read request in the IO request.
  • According to the fourth aspect of the present invention, a twelfth IO request processing apparatus for the storage system is provided, wherein the first thread is executed only by a first CPU and the second thread is executed only by a second CPU.
  • A computer program is provided, comprising computer program code which, when loaded into a computer system and executed on the computer system, causes the computer system to perform the IO request processing method of the storage system provided according to the first aspect or the second aspect of the present invention.
  • A program is provided, comprising program code which, when loaded into a storage system and executed on the storage system, causes the storage system to perform the IO request processing method of the storage system provided according to the first aspect or the second aspect of the present invention.
  • The logical address space of the virtual storage disk is divided into multiple regions that do not overlap one another, and IO requests for a given region are processed by one corresponding thread. This avoids two threads processing IO requests for the same region.
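The ownership model summarized above can be sketched end to end as follows. This is an illustration under stated assumptions: `NUM_THREADS`, `worker`, and `run` are hypothetical names, routing uses the modulo rule, and Python's `queue.Queue` takes locks internally, so the sketch shows only the one-region-one-thread routing, not a lock-free queue.

```python
import queue
import threading

NUM_THREADS = 4   # hypothetical number of IO processing threads
STOP = object()   # sentinel telling a worker to exit


def worker(thread_index, commands, handled):
    # Each worker drains only its own queue, so it is the sole owner of the
    # regions routed to it and needs no lock around per-region state.
    while True:
        item = commands.get()
        if item is STOP:
            break
        handled[thread_index].append(item)


def run(region_indices):
    queues = [queue.Queue() for _ in range(NUM_THREADS)]
    handled = [[] for _ in range(NUM_THREADS)]
    threads = [
        threading.Thread(target=worker, args=(i, queues[i], handled))
        for i in range(NUM_THREADS)
    ]
    for t in threads:
        t.start()
    for region in region_indices:
        # Route by region: region index modulo the number of threads.
        queues[region % NUM_THREADS].put(region)
    for q in queues:
        q.put(STOP)
    for t in threads:
        t.join()
    return handled


handled = run([0, 1, 5, 4, 9, 2, 5, 0])
owners = {}
for thread_index, regions in enumerate(handled):
    for region in regions:
        owners.setdefault(region, set()).add(thread_index)
print(owners)
```

Every region ends up with exactly one owning thread, and requests for the same region keep their arrival order because they pass through a single FIFO queue.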
  • The embodiments of the present invention have the following advantages: while data reliability is ensured, the CPUs can run fully concurrently, making full use of the high performance of solid-state drives; system performance scales linearly; and the configuration can be adjusted dynamically according to performance requirements and the usage of resources such as CPU and memory.
  • FIG. 1 illustrates an architecture of a storage system in accordance with an embodiment of the present invention
  • FIG. 2 illustrates a structure of a storage object according to an embodiment of the present invention
  • FIG. 3 illustrates a structure of a storage object according to still another embodiment of the present invention
  • FIG. 4 is a schematic diagram of a lockless IO request processing model according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of mapping a logical address space of a virtual storage disk to a storage object according to an embodiment of the present invention
  • FIG. 6 is a schematic diagram of mapping a logical address space of a virtual storage disk to a storage object according to another embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a logical address area of a virtual storage disk according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram showing a logical address area of a virtual storage disk according to another embodiment of the present invention.
  • FIG. 9 is a flowchart of an IO request process of a storage system according to an embodiment of the present invention.
  • FIG. 10 is a flow diagram of distributing an IO request to an IO processing thread in accordance with an embodiment of the present invention
  • FIG. 11 is a flow diagram of an IO processing thread processing an IO request in accordance with an embodiment of the present invention.
  • FIG. 1 illustrates an architecture of a storage system in accordance with an embodiment of the present invention.
  • a storage system in accordance with the present invention includes a computer or server (collectively referred to as a host) and a plurality of storage devices (e.g., drives) coupled to the host.
  • the drive is a solid state drive (SSD).
  • a disk drive may also be included in an embodiment in accordance with the invention.
  • The data blocks or chunks (Chunk, also referred to as large blocks) of each drive are recorded in the storage resource pool.
  • A chunk is a group of data blocks of a predetermined size that are contiguous in the logical space or physical space of a drive.
  • The size of a chunk can be hundreds of KB (kilobytes) or on the order of MB (megabytes).
  • The storage resource pool records the data blocks or chunks of each drive that have not been allocated to a storage object; these are also referred to as free data blocks or free chunks.
  • The storage resource pool is also a virtualization technique: it virtualizes the storage resources of the physical drives into data blocks or chunks for access or use by the upper layers.
  • The resource allocation layer allocates storage objects to the storage object layer, and each storage object includes a plurality of chunks.
  • the allocator allocates chunks in the storage resource pool to create storage objects.
  • a storage object provided in accordance with an embodiment of the present invention represents a portion of a storage space of a storage system.
  • the storage object is a storage unit with a RAID function, and the storage object structure will be described in detail later with reference to FIG.
  • Storage objects can be created and destroyed. When a storage object is created, the required number of large blocks are obtained from the storage resource pool through the allocator, and these large blocks constitute a storage object.
  • a large block can belong to only one storage object at the same time. Large chunks that have been allocated to storage objects are no longer assigned to other storage objects. When the storage object is destroyed, the chunks that make up the storage object are released back to the storage resource pool and can be reassigned to other storage objects.
  • the storage system includes a plurality of virtual storage disks.
  • the virtual storage disk provides an access interface to the application and provides external services.
  • a virtual storage disk consists of several storage objects, and the application can create multiple virtual storage disks with different attributes as needed.
  • a virtual storage disk provides a logical address space, and a logical address space includes a plurality of logical address areas.
  • FIG. 2 shows the structure of a storage object according to an embodiment of the present invention.
  • a storage object includes multiple data blocks or large data blocks.
  • The storage object includes chunk 220, chunk 222, chunk 224, and chunk 226.
  • The chunks that make up the storage object come from different drives. Each drive provides at most one chunk to a storage object.
  • Chunk 220 comes from drive 210, chunk 222 from drive 212, chunk 224 from drive 214, and chunk 226 from drive 216.
  • The storage object includes a plurality of RAID stripes (stripe 230, stripe 232, ... stripe 238), each stripe consisting of storage spaces from different chunks. The storage spaces that different chunks contribute to the same stripe may have the same or different address ranges.
  • A stripe is the smallest unit of writing for a storage object; writing data to multiple drives in parallel improves performance. Read operations on a storage object have no size restriction.
  • RAID is implemented within each stripe. Of the storage spaces from the four chunks that make up stripe 230, three are used to store user data and the fourth is used to store check (parity) data, so that RAID 5 level data protection is provided on stripe 230.
  • metadata is also stored in each chunk.
  • Each of chunk 220 through chunk 226 stores the same metadata, which ensures the reliability of the metadata: even if some of the chunks belonging to the same storage object fail, the metadata can still be obtained from the other chunks.
  • the metadata is used to record information such as the storage object to which the large block belongs, the data protection level (RAID level) of the storage object where the large block is located, and the number of times the large block is erased.
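  • The RAID 5 protection described for stripe 230 can be illustrated with byte-wise XOR parity. The following is a minimal sketch, not the patent's implementation: the function name `xor_parity` and the sample data are hypothetical, and a real RAID 5 layout would also rotate the parity position across drives.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three user-data blocks of one stripe; the fourth chunk would hold the parity.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_parity(data)

# If the second block is lost, XOR-ing the survivors with the parity restores it.
recovered = xor_parity([data[0], data[2], parity])
```

  • Because XOR is its own inverse, any single missing block of the stripe can be rebuilt from the remaining three.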
  • FIG. 3 shows the structure of a storage object according to still another embodiment of the present invention.
  • When a storage object needs to be created for a logical address area, resource allocation takes place to create the storage object, and the created storage object is associated with that logical address area.
  • the storage system includes a plurality of drives (see FIG. 3, drive 0, drive 1, drive 2, and drive 3), and the storage space of the drive is divided into fixed-size storage resources, which are called large blocks. A number of large blocks are organized by a RAID algorithm to form a data protection unit called a storage object. Referring to FIG.
  • Drive 0 provides chunk 0, chunk i, ...; drive 1 provides chunk 1, chunk j, ...; drive 2 provides chunk 2, chunk k, ...; and drive 3 provides chunk 3, chunk t, ....
  • Two storage objects are shown in the embodiment of FIG. 3. One chunk from each of drive 0, drive 1, and drive 2 (chunk 0, chunk 1, and chunk 2) constitutes storage object m, and one chunk from each of drive 1, drive 2, and drive 3 (chunk j, chunk k, and chunk t) constitutes storage object n.
  • The storage capacity of a chunk is at the MB (megabyte) level, while the storage capacity of a drive is at the TB (terabyte) level. Therefore, storage objects are created frequently in the storage system.
  • the use of storage resources can be effectively controlled and global wear leveling and inverse equalization can be realized.
  • A drive that provides a chunk for the storage object can be selected at random from the plurality of drives according to the weight of each drive, which balances the use of drive resources and achieves global wear leveling.
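  • Weighted random selection of drives can be sketched as follows. This is an illustration under assumptions: the function `pick_drives` is hypothetical, the weights are assumed to be positive numbers (for example, derived from free capacity or remaining endurance, which the text does not specify), and each drive contributes at most one chunk to a storage object.

```python
import random


def pick_drives(weights, count, rng=random):
    """Pick `count` distinct drives, favouring drives with a larger weight."""
    if count > len(weights):
        raise ValueError("not enough drives")
    candidates = dict(weights)
    chosen = []
    for _ in range(count):
        names = list(candidates)
        pick = rng.choices(names, weights=[candidates[n] for n in names], k=1)[0]
        chosen.append(pick)
        del candidates[pick]  # at most one chunk per drive per storage object
    return chosen


drive_weights = {"drive0": 5, "drive1": 1, "drive2": 3, "drive3": 2}
selected = pick_drives(drive_weights, 3, random.Random(0))
```

  • Over many allocations, drives with larger weights are chosen more often, which is what spreads chunk usage across the pool.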
  • FIG. 4 is a schematic diagram of a lockless IO request processing model in accordance with an embodiment of the present invention.
  • the application accesses the virtual storage disk.
  • IO processing threads (IO Handlers) are created for the storage resource pool according to the user's configuration, and these IO processing threads can run fully concurrently on multiple CPUs.
  • the IO request is distributed to a specific IO processing thread for processing according to the parameters of the IO request.
  • IO requests are distributed to different IO processing threads depending on the type of IO request (read request or write request) and/or the logical address of the IO request access.
  • the IO processing thread is responsible for processing IO requests, such as read and write requests of the application and/or data reconstruction requests within the storage system.
  • the main tasks of the IO processing thread include: mutual exclusion and synchronization between IO requests, RAID encoding and decoding, distribution of IO requests to the underlying solid state drive, and processing of return requests.
  • An IO processing thread can be a thread, a process, or other piece of code that executes on the CPU.
  • each IO processing thread is bound to a CPU or CPU core.
  • the CPU or CPU core is dedicated to executing the IO processing threads bound to it, reducing the overhead introduced by thread switching.
  • FIG. 5 is a schematic diagram of a mapping of a logical address space of a virtual storage disk to a storage object according to an embodiment of the present invention.
  • Ct#K indicates a storage object numbered K.
  • The logical address space of the virtual storage disk 0 runs from "LBA#0" to "LBA#N", where LBA#0 indicates the starting address 0 of the logical address space and LBA#N indicates its maximum address N.
  • the unmapped logical address portion is indicated by the shadow, and the non-shaded portion is the logical address portion to which the storage object has been mapped.
  • the logical address portion of the virtual storage disk is mapped to the storage object such that access to the logical address portion is carried by the mapped storage object.
  • For example, when the logical address portion of virtual storage disk 0 that is mapped to storage object Ct#0 is accessed, the access is completed by accessing storage object Ct#0.
  • Initially, all logical addresses of virtual storage disk 0 are unmapped.
  • When a logical address is written, a storage object is allocated and the logical address being written is mapped to that storage object.
  • A logical address to which a storage object has been mapped belongs to the mapped area.
  • When a logical address is read, the storage object mapped to that logical address is looked up, and the data to be read is obtained by accessing the storage object.
  • each logical address portion is mapped to at most one storage object.
  • FIG. 6 is a schematic diagram of mapping a logical address space of a virtual storage disk to a storage object according to another embodiment of the present invention.
  • the storage objects have different sizes, and thus different storage objects are mapped to different numbers of logical addresses.
  • the storage objects have the same size.
  • some logical addresses of the virtual storage disk 1 are mapped to storage objects of different sizes, and some logical addresses have not been mapped to storage objects.
  • When a logical address of the virtual storage disk 1 is written, it is first checked whether the logical address has been mapped to a storage object. If it has (for example, to storage object Ct#2), storage object Ct#2 is written directly; if no storage object is mapped to the logical address, a storage object is created and added to the address space mapping management system of virtual storage disk 1 before the write operation continues.
  • FIG. 7 is a schematic diagram of a logical address area of a virtual storage disk according to an embodiment of the present invention.
  • the logical address space of the virtual storage disk 0 is divided into consecutive logical address areas of equal size.
  • the logical address area numbered n is indicated by "Re#n".
  • When an IO request is processed, it is distributed to one of the IO processing threads according to the logical address area that the IO request accesses.
  • Each logical address area may correspond to one or more storage objects.
  • the logical addresses corresponding to the respective logical address areas do not overlap each other.
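  • With equal-sized, non-overlapping logical address areas, the area number Re#n of a logical address follows from integer division. The sketch below assumes a hypothetical area size of 1024 logical blocks; the names `REGION_SIZE`, `region_of`, and `region_bounds` are illustrative and do not appear in the source.

```python
REGION_SIZE = 1024  # logical blocks per area; hypothetical value


def region_of(lba, region_size=REGION_SIZE):
    """Number n of the logical address area Re#n that contains `lba`."""
    return lba // region_size


def region_bounds(region_no, region_size=REGION_SIZE):
    """First and last logical address of the area with the given number."""
    start = region_no * region_size
    return start, start + region_size - 1
```

  • Since every address falls into exactly one area, the area number can serve as the key that decides which IO processing thread owns a request.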
  • FIG. 8 is a schematic diagram of a logical address area of a virtual storage disk according to another embodiment of the present invention.
  • the storage objects have different sizes, and the address space of the virtual storage disk is divided into a plurality of logical address areas of equal size, and each logical address area corresponds to one or more storage objects.
  • The logical address area Re#0 includes the storage objects Ct#1 and Ct#2; the logical addresses corresponding to the logical address area Re#1 have not yet been allocated a storage object; and in the logical address area Re#3, part of the logical addresses are assigned the storage object Ct#K, while another part has not yet been allocated a storage object.
  • FIG. 9 is a flowchart of IO request processing of a storage system in accordance with an embodiment of the present invention. IO requests that access different logical address areas are assigned to different IO processing threads, so that multiple threads process IO requests simultaneously. Because each thread processes IO requests of a different logical address area, there is no access dependency between these read and write requests; the threads therefore do not affect one another and can run in parallel.
  • A first IO request is received, where the first IO request accesses a first logical address area (910).
  • a first thread is determined based on the first logical address region, causing the first thread to process the first IO request (920).
  • the first logical address area is the logical address area indicated by "Re#0" of FIG.
  • the first thread may be, for example, the IO processing thread (T1) bound to the CPU 1 shown in FIG.
  • an IO request for the region "Re#0" is assigned to the thread T1 processing
  • an IO request for the region "Re#2" is assigned to the thread T2 for processing.
  • Thread T2 is an IO processing thread bound to CPU 2 (see Figure 4).
  • The number of the logical address area is taken modulo the number of IO processing threads that have been created; the result is the index of an IO processing thread, and the IO processing thread indicated by that index processes the IO requests for the logical address area with that number.
  • Alternatively, a hash is computed over the number R of the logical address area, and the hash result is used as the index of the IO processing thread.
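  • Both ways of choosing a thread can be sketched in a few lines. The source does not name a hash function, so CRC32 is used here purely as an assumption, and the hash result is reduced modulo the thread count so that it is a valid index; the function names are hypothetical.

```python
import zlib


def handler_by_modulo(region_no, handler_count):
    """Thread index as the area number modulo the number of threads."""
    return region_no % handler_count


def handler_by_hash(region_no, handler_count):
    """Thread index derived from a hash of the area number (CRC32 assumed)."""
    digest = zlib.crc32(region_no.to_bytes(8, "little"))
    return digest % handler_count
```

  • Either rule is deterministic, so every request for the same logical address area always lands on the same thread.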
  • a storage object mapping table is provided to maintain a mapping of the address of the virtual storage disk to the storage object.
  • The storage object mapping table is a shared resource of the entire virtual storage disk, and in the embodiment according to the present invention, the plurality of IO Handlers can access the storage object mapping table concurrently and without locks.
  • the storage object mapping table is composed of a plurality of mapping table entries, each mapping table entry is a red-black tree, and the node of the red-black tree stores the correspondence between the starting logical address of the storage object and the storage object.
  • A plurality of "<storage object start logical address, storage object>" pairs are recorded in each mapping table entry. The mapping table entry in which a storage object is placed can be obtained by taking the number of the logical address area in which the storage object is located modulo the number of mapping table entries.
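  • A mapping table entry can be modelled as an ordered map from start logical address to storage object. The sketch below substitutes a sorted list searched with `bisect` for the red-black tree named in the text; both give ordered lookup, but this is a stand-in, not the patent's data structure. The class `MapEntry`, the per-object `length` field, and `entry_index` are hypothetical, and start addresses are assumed to be unique.

```python
import bisect


class MapEntry:
    """One mapping table entry: start LBA -> (length, storage object)."""

    def __init__(self):
        self._starts = []   # sorted start addresses
        self._objects = {}  # start address -> (length, storage object)

    def insert(self, start_lba, length, obj):
        bisect.insort(self._starts, start_lba)
        self._objects[start_lba] = (length, obj)

    def lookup(self, lba):
        """Return the storage object covering `lba`, or None if unmapped."""
        i = bisect.bisect_right(self._starts, lba) - 1
        if i < 0:
            return None
        start = self._starts[i]
        length, obj = self._objects[start]
        return obj if lba < start + length else None


def entry_index(region_no, map_entry_num):
    """Mapping table entry that holds the objects of a logical address area."""
    return region_no % map_entry_num


entry = MapEntry()
entry.insert(100, 50, "Ct#1")
entry.insert(0, 10, "Ct#0")
```

  • Keying the lookup on the start address lets an address in the middle of a storage object find its object by locating the nearest start at or below it.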
  • An IO request for the logical address area R1 is processed by the IO processing thread T1, and the mapping table entry (referred to as M1) that records the mapping relationship between the logical addresses of the logical address area R1 and storage objects is also processed by the IO processing thread T1.
  • the IO request for the logical address area R2 is processed by the IO processing thread T2; and the mapping table entry (referred to as M2) for recording the mapping relationship between the logical address of the logical address area R2 and the storage object is also processed by the IO processing thread T2.
  • the IO processing thread T1 and the IO processing thread T2 thus process IO requests accessing different logical address regions and access different mapping table entries. In this way, multiple IO processing threads process their respective tasks concurrently, and there is no need to use locks to synchronize between multiple IO processing threads.
  • map_entry_num is chosen to be divisible by io_handler_num, which ensures that for any IO request the corresponding ioh_index is equal to map_ioh_index.
  • The number of mapping table entries is set to an integral multiple of the number of IO processing threads allocated to the storage resource pool, which ensures that each IO processing thread is the only visitor of certain mapping table entries. In this case, when multiple IO processing threads access the storage object mapping table, they do not need locks to synchronize with one another, which yields a lock-free design.
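  • Why divisibility matters can be checked numerically. The definitions below are assumptions, since the formulas are not given in this text: `ioh_index` is taken to be the area number modulo io_handler_num, and `map_ioh_index` is taken to be the mapping table entry index (area number modulo map_entry_num) reduced modulo io_handler_num.

```python
def ioh_index(region_no, io_handler_num):
    """Thread that processes IO requests for the area (assumed definition)."""
    return region_no % io_handler_num


def map_ioh_index(region_no, map_entry_num, io_handler_num):
    """Thread that owns the area's mapping table entry (assumed definition)."""
    map_index = region_no % map_entry_num
    return map_index % io_handler_num


def owners_agree(map_entry_num, io_handler_num, regions=range(1000)):
    """True if both indices coincide for every area number checked."""
    return all(
        ioh_index(r, io_handler_num)
        == map_ioh_index(r, map_entry_num, io_handler_num)
        for r in regions
    )
```

  • Under these assumed definitions, 6 entries with 3 threads agree for every area, whereas 6 entries with 4 threads do not: area 6 would be processed by thread 2 while its entry 0 belongs to thread 0.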
  • The logical address space of the virtual storage disk 1 is divided into 12 areas, and the storage resource pool of the virtual storage disk 1 is served by three IO processing threads.
  • The storage object mapping table of the virtual storage disk includes 6 mapping table entries. The first 4 logical address areas (numbered 0/1/2/3) and the first 2 mapping table entries (index 0/1) are assigned to the IO processing thread T0; the middle 4 logical address areas (numbered 4/5/6/7) and the mapping table entries with index 2/3 are assigned to the IO processing thread T1; and the last 4 logical address areas and the mapping table entries with index 4/5 are assigned to the IO processing thread T2. This prevents two or more IO processing threads from accessing the same mapping table entry, and also prevents two or more IO processing threads from accessing the same logical address area.
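  • The 12-area, 6-entry, 3-thread example can be reproduced with a contiguous split, with mapping table entries indexed from 0 to 5. This sketch assumes range-based assignment, as the example describes, rather than the modulo rule; all names are hypothetical.

```python
REGIONS, ENTRIES, THREADS = 12, 6, 3


def thread_for_region(region_no):
    return region_no // (REGIONS // THREADS)


def entry_for_region(region_no):
    return region_no // (REGIONS // ENTRIES)


def thread_for_entry(entry_no):
    return entry_no // (ENTRIES // THREADS)


def build_layout():
    """Group areas and mapping table entries by the thread that owns them."""
    layout = {t: {"regions": [], "entries": []} for t in range(THREADS)}
    for r in range(REGIONS):
        layout[thread_for_region(r)]["regions"].append(r)
    for e in range(ENTRIES):
        layout[thread_for_entry(e)]["entries"].append(e)
    return layout


layout = build_layout()
```

  • In this layout the thread that owns an area also owns the mapping table entry the area resolves to, so no entry is ever shared between threads.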
  • a single storage resource pool can support multiple virtual storage disks.
  • an IO processing thread is provided for a pool of storage resources.
  • The IO processing threads of a storage resource pool can be shared among the multiple virtual storage disks supported by that storage resource pool, so that one IO processing thread can handle IO requests that access different virtual storage disks. Further, multiple storage resource pools can be provided in the storage system.
  • the number of IO processing threads provided for the storage system or storage resource pool can be adjusted, but no more than the number of CPUs or CPU cores in the storage system.
  • The processing of an IO request to a virtual storage disk can be divided into two phases: in the first phase, the IO request is received and distributed to the corresponding IO processing thread; in the second phase, the IO processing thread processes the IO request and sends it to the storage device or the storage device driver.
  • FIG. 10 is a flowchart of distributing an IO request to an IO processing thread, which corresponds to the first phase described above; and
  • FIG. 11 is a flowchart of processing an IO request by an IO processing thread according to an embodiment of the present invention, which corresponds to the second phase described above.
  • An IO request issued by a user accessing the virtual storage disk is received (1010), and it is determined from the logical address of the IO request whether the request spans multiple logical address areas (1020).
  • Whether the request spans multiple logical address areas can be determined from the starting logical address and the access length of the IO request. If the IO request accesses only one logical address area, one IO command is created (1030), and the response to that IO command can serve as the response to the IO request. The number of the logical address area accessed by the IO command is then calculated; for example, it is determined from the starting logical address accessed by the IO request.
  • If the IO request spans multiple logical address areas, a plurality of IO commands are created for the IO request, one for each logical address area that the IO request accesses (1040).
  • For example, if the IO request accesses the logical address area Re#2 and the logical address area Re#3 (see FIG. 7), an IO command C1 is generated for the logical address area Re#2 and an IO command C2 is generated for the logical address area Re#3. The responses to the IO command C1 and the IO command C2 can be combined into the response to the IO request.
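  • Splitting a request at area boundaries can be sketched as follows. The `IOCommand` type, the `split_request` function, and the equal area size passed as `region_size` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOCommand:
    region_no: int
    start_lba: int
    length: int


def split_request(start_lba, length, region_size):
    """Create one IO command per logical address area touched by the request."""
    if length <= 0:
        raise ValueError("length must be positive")
    commands = []
    lba, remaining = start_lba, length
    while remaining > 0:
        region_no = lba // region_size
        region_end = (region_no + 1) * region_size  # first address of next area
        span = min(remaining, region_end - lba)
        commands.append(IOCommand(region_no, lba, span))
        lba += span
        remaining -= span
    return commands
```

  • With an area size of 1024, a request for 1000 blocks starting at address 2500 yields one command for area 2 (572 blocks) and one for area 3 (428 blocks), matching the C1 and C2 case above.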
  • the IO processing thread that processes the IO command is determined according to the logical address area number corresponding to the IO command, and the IO command is distributed to the command queue of the IO processing thread (1050).
  • A command queue is provided for each IO processing thread, and the entries of the command queue hold IO commands.
  • the IO processing thread fetches IO commands from its command queue for processing.
  • In one example, the logical address area number corresponding to the IO command is hashed, or taken modulo the number of IO processing threads, to obtain the index of the IO processing thread that processes the IO command; the IO processing thread is obtained according to the index, and the IO command is placed into the command queue of that IO processing thread.
  • In another example, the logical address area number corresponding to the IO command is hashed, or taken modulo the number of IO processing threads, to obtain an index; the command queue is obtained directly according to the index, and the IO command is inserted into that command queue.
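  • The distribution step can be modelled with one queue and one worker thread per IO processing thread. This sketch only demonstrates the routing rule: Python's `queue.Queue` is internally synchronized, so it does not reproduce the lock-free property, and the function `run_handlers` and the tuple command format are hypothetical.

```python
import queue
import threading


def run_handlers(commands, handler_count):
    """Dispatch (region_no, payload) commands; return what each handler saw."""
    queues = [queue.Queue() for _ in range(handler_count)]
    seen = [[] for _ in range(handler_count)]

    def handler(index):
        while True:
            cmd = queues[index].get()
            if cmd is None:  # sentinel: no more commands
                break
            seen[index].append(cmd)  # only this thread touches seen[index]

    threads = [threading.Thread(target=handler, args=(i,)) for i in range(handler_count)]
    for t in threads:
        t.start()
    for region_no, payload in commands:
        queues[region_no % handler_count].put((region_no, payload))
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return seen


seen = run_handlers([(0, "a"), (1, "b"), (3, "c"), (4, "d")], 3)
```

  • Each worker reads only its own queue and writes only its own result list, so the workers share no mutable state with one another.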
  • the IO processing thread processes the IO commands and sends them to the storage device or the storage device driver.
  • the IO processing thread fetches the IO command from its command queue (1110).
  • the storage object mapping table is searched to determine whether a storage object corresponding to the logical address exists (1120). If the storage object mapping table indicates that the logical address accessed by the processed IO command has been mapped to the storage object, the storage object is accessed, and the storage object is read or written according to the IO command (1150).
  • When the IO processing thread searches the storage object mapping table, it determines the mapping table entry to be accessed according to the number of the logical address area accessed by the IO command, and searches that entry for the storage object to which the logical address of the IO command is mapped. In the embodiment of the present invention, no two IO processing threads access the same mapping table entry, so an IO processing thread does not need to lock the storage object mapping table when accessing it, which improves processing efficiency.
  • If the storage object mapping table indicates that the logical address accessed by the IO command has not been mapped to a storage object, it is further determined whether the IO command is a read command or a write command (1130). If the IO command is a read command, reading a logical address to which no storage object has been allocated is illegal, and a specified value is used as the response to the IO command (1160). For example, a result of all 0s is returned for the IO command, or a return value indicates that the IO command has accessed a logical address that is illegal or has not been allocated.
  • If the IO command is a write command, a storage object is first created, and the created storage object is inserted into the storage object mapping table and recorded in an entry of the storage object mapping table.
  • Alternatively, after the IO processing thread fetches the IO command from the command queue, it can first check whether the IO command is a read command or a write command, and then check whether the logical address of the IO command has been allocated a storage object. For a read command, if the logical address of the IO command has been allocated a storage object, the data is read from the storage object; if it has not, a predetermined value is used as the read result, or the response indicates that an illegal address was read.
  • For a write command, if the logical address of the IO command has been allocated a storage object, data is written to that storage object; if it has not, a new storage object is allocated, a mapping relationship is established between the logical address and the newly allocated storage object, and data is written to the newly allocated storage object in response to the IO command.
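  • The read and write paths of one IO processing thread can be sketched as follows. Several simplifications are assumed: storage objects have a fixed size, one byte stands for one logical address, a command never crosses a storage object boundary, and an unmapped read returns zeros. The class `Handler` and its fields are hypothetical.

```python
class Handler:
    """Per-thread view of the mapping: object start address -> object data."""

    def __init__(self, object_size):
        self.object_size = object_size
        self.mapping = {}

    def _locate(self, lba, length):
        start = lba - lba % self.object_size
        offset = lba - start
        if offset + length > self.object_size:
            raise ValueError("command crosses a storage object boundary")
        return start, offset

    def read(self, lba, length):
        start, offset = self._locate(lba, length)
        obj = self.mapping.get(start)
        if obj is None:
            return bytes(length)  # unmapped address: respond with all zeros
        return bytes(obj[offset:offset + length])

    def write(self, lba, data):
        start, offset = self._locate(lba, len(data))
        obj = self.mapping.get(start)
        if obj is None:
            obj = bytearray(self.object_size)  # allocate on first write
            self.mapping[start] = obj
        obj[offset:offset + len(data)] = data


h = Handler(16)
before = h.read(20, 4)
h.write(20, b"abcd")
after = h.read(20, 4)
```

  • Because each handler keeps its own mapping, allocating a storage object on the first write needs no coordination with other threads.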
  • The logical address space of the virtual storage disk is divided into multiple logical address areas that do not overlap one another, and IO requests for a given logical address area are processed by the one thread that corresponds to it. This prevents two threads from processing IO requests for the same logical address area and, further, prevents two threads from accessing the same mapping table entry at the same time. Resource accesses by different threads therefore do not conflict, and no locks or other mechanisms are needed to synchronize critical resources between threads, which simplifies processing and improves parallelism between threads.
  • The new lock-free IO processing method proposed by the present invention can exploit the concurrent processing capability of multiple CPU cores or multiple CPUs while ensuring data reliability, and can bring the high performance of multiple solid state drives into full play. It can also achieve linear scalability of performance relative to the CPU cores or CPUs, thereby meeting customers' needs for both data reliability and performance. It can further be configured dynamically according to performance requirements and the usage requirements of resources such as CPU and memory.
  • embodiments of the present invention cannot guarantee that there is no conflict between multiple IO requests on the same stripe within the storage object.
  • Read requests for the same stripe inside a storage object can be executed concurrently with one another; but a write request and another write request, or a write request and a reconstruction request, must be synchronized or executed serially to ensure the correctness of the data. Synchronization among multiple IO requests within the same storage object is handled by each IO processing thread.
  • the embodiment of the present invention further provides a program containing a program code that, when loaded into a CPU and executed in a CPU, causes the CPU to execute one of the methods according to the embodiments of the present invention provided above.
  • Embodiments of the present invention also provide a program including program code that, when loaded into a host and executed on a host, causes the processor of the host to perform one of the methods provided above in accordance with an embodiment of the present invention.
  • These computer program instructions can be loaded onto a general purpose computer, special purpose computer, or other programmable data control device to produce a machine, such that the instructions executed on the computer or other programmable data control device create means for implementing the functions specified in one or more flowchart blocks.
  • The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data control device to function in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including computer readable instructions for implementing the functions specified in one or more flowchart blocks.
  • The computer program instructions can also be loaded onto a computer or other programmable data control device to cause a series of operations to be performed on the computer or other programmable data control device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable data control device provide operations for implementing the functions specified in one or more flowchart blocks.
  • Blocks of the block diagrams and flowcharts support combinations of means for performing the specified functions, combinations of operations for performing the specified functions, and combinations of program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowcharts, and combinations of blocks of the block diagrams and flowcharts, can be implemented by a hardware-based, special-purpose computer system that performs the specified function or operation, or by a combination of special purpose hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

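An IO request processing method and apparatus for a storage system, the storage system including a plurality of virtual storage disks, each virtual storage disk including a plurality of logical address areas. The IO request processing method includes: receiving a first IO request that accesses a first logical address area (910); and determining a first thread according to the first logical address area, so that the first thread processes the first IO request (920). The method allows the CPUs to work fully concurrently while ensuring data reliability, brings the performance of solid state drives into full play, and ensures linear scalability of system performance.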

Description

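Lock-free IO processing method and apparatus
Technical Field
The present application relates to storage system technology, and in particular to an IO request processing method and apparatus for a storage system.
Background
Solid state drives (SSDs) are built from semiconductor storage media and offer superior read and write performance. However, despite the high performance of SSDs, data reliability and the cost of SSDs have limited their adoption. In the prior art, RAID (Redundant Array of Independent Disks) technology is used to ensure the reliability of data on SSDs while also improving the utilization efficiency of the SSDs, thereby reducing cost.
However, RAID lengthens the IO path and increases computational overhead. To bring the performance of multiple SSDs into full play, multi-core, multi-CPU technology is commonly used today. As long as the CPUs process IO requests as concurrently as possible, it is possible to achieve both data protection and high performance. But read requests, write requests, and reconstruction requests for the same stripe on the same RAID array are related to one another, and also depend on certain management and control requests of the RAID array; that is, access to shared resources must be synchronized and mutually exclusive.
At present there is no unified method for handling synchronization and mutual exclusion among related requests on a RAID array. Most implementations create a separate thread for each RAID array, dedicated to handling that array's management and control requests as well as its read, write, and reconstruction requests. However, when a RAID system creates thousands of RAID arrays in a resource pool made up of SSDs, this approach creates thousands of threads, leading to huge memory overhead. Moreover, the overhead of thread scheduling and switching causes CPU efficiency to drop sharply.
Summary of the Invention
The present invention aims to solve, at least to some extent, the technical problems in the related art described above.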
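According to a first aspect of the invention, an IO request processing method for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and a virtual storage disk comprises a plurality of logical address regions. The method comprises: receiving a first IO request, wherein the first IO request accesses a first logical address region; and determining a first thread according to the first logical address region, so that the first thread processes the first IO request.
Based on the method of the first aspect, a second IO request processing method according to the first aspect is provided, wherein the logical addresses of the plurality of logical address regions do not overlap one another, and the method further comprises: receiving a second IO request, wherein the second IO request accesses a second logical address region; and determining a second thread according to the second logical address region, so that the second thread processes the second IO request.
Based on the method of the first aspect, a third IO request processing method according to the first aspect is provided, wherein determining the first thread according to the first logical address region so that the first thread processes the first IO request comprises: generating a first IO command according to the first IO request; placing the first IO command, according to the first logical address region, into a first queue corresponding to the first thread; and the first thread taking the first IO command out of the first queue and processing it, wherein the first thread processes only commands of the first queue.
Based on the method of the first aspect, a fourth IO request processing method according to the first aspect is provided, wherein determining the second thread according to the second logical address region so that the second thread processes the second IO request comprises: generating a second IO command according to the second IO request; placing the second IO command, according to the second logical address region, into a second queue corresponding to the second thread; and the second thread taking the second IO command out of the second queue and processing it, wherein the second thread processes only commands of the second queue.
Based on the method of the first aspect, a fifth IO request processing method according to the first aspect is provided, further comprising: taking the index of the first logical address region modulo the number of threads as the index of the first thread; or taking the index of the second logical address region modulo the number of threads as the index of the second thread.
Based on the method of the first aspect, a sixth IO request processing method according to the first aspect is provided, further comprising: computing a hash of the index of the first logical address region to obtain the index of the first thread; or computing a hash of the index of the second logical address region to obtain the index of the second thread.
Based on the method of the first aspect, a seventh IO request processing method according to the first aspect is provided, further comprising: taking the index of the first logical address region modulo the number of threads as the index of the first queue; or taking the index of the second logical address region modulo the number of threads as the index of the second queue.
Based on the method of the first aspect, an eighth IO request processing method according to the first aspect is provided, further comprising: computing a hash of the index of the first logical address region to obtain the index of the first queue; or computing a hash of the index of the second logical address region to obtain the index of the second queue.
Based on the method of the first aspect, a ninth IO request processing method according to the first aspect is provided, further comprising: determining a first mapping table entry according to the first logical address region; and the first thread accessing the first mapping table entry, obtaining a first storage object from the first mapping table entry according to a first logical address accessed by the first IO request, and accessing the first storage object to process the first IO request.
Based on the method of the first aspect, a tenth IO request processing method according to the first aspect is provided, further comprising: determining a second mapping table entry according to the second logical address region; and the second thread accessing the second mapping table entry, obtaining a second storage object from the second mapping table entry according to a second logical address accessed by the second IO request, and accessing the second storage object to process the second IO request.
Based on the method of the first aspect, an eleventh IO request processing method according to the first aspect is provided, further comprising: taking the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry; or computing a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
Based on the method of the first aspect, a twelfth IO request processing method according to the first aspect is provided, further comprising: taking the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry; or computing a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
Based on the method of the first aspect, a thirteenth IO request processing method according to the first aspect is provided, further comprising: if no storage object can be obtained from the first mapping table entry, then for a write request among the IO requests, creating a third storage object, recording the third storage object in the first mapping table entry, and writing data to the third storage object.
Based on the method of the first aspect, a fourteenth IO request processing method according to the first aspect is provided, further comprising: if no storage object can be obtained from the second mapping table entry, then for a write request among the IO requests, creating a fourth storage object, recording the fourth storage object in the second mapping table entry, and writing data to the fourth storage object.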
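According to a second aspect of the invention, an IO request processing method for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and a virtual storage disk comprises a plurality of logical address regions. The method comprises: receiving a first IO request; generating a first IO command and a second IO command according to the first IO request, wherein the first IO command accesses a first logical address region and the second IO command accesses a second logical address region; determining a first thread according to the first logical address region, so that the first thread processes the first IO command; and determining a second thread according to the second logical address region, so that the second thread processes the second IO command.
Based on the method of the second aspect, a second IO request processing method according to the second aspect is provided, further comprising: placing the first IO command, according to the first logical address region, into a first queue corresponding to the first thread, the first thread taking the first IO command out of the first queue and processing it, wherein the first thread processes only commands of the first queue; and placing the second IO command, according to the second logical address region, into a second queue corresponding to the second thread, the second thread taking the second IO command out of the second queue and processing it, wherein the second thread processes only commands of the second queue.
Based on the method of the second aspect, a third IO request processing method according to the second aspect is provided, further comprising: determining a first mapping table entry according to the first logical address region; and the first thread accessing the first mapping table entry, obtaining a first storage object from the first mapping table entry according to a first logical address accessed by the first IO command, and accessing the first storage object to process the first IO command.
Based on the method of the second aspect, a fourth IO request processing method according to the second aspect is provided, further comprising: determining a second mapping table entry according to the second logical address region; and the second thread accessing the second mapping table entry, obtaining a second storage object from the second mapping table entry according to a second logical address accessed by the second IO command, and accessing the second storage object to process the second IO command.
Based on the method of the second aspect, a fifth IO request processing method according to the second aspect is provided, further comprising: taking the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry; or computing a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
Based on the method of the second aspect, a sixth IO request processing method according to the second aspect is provided, further comprising: taking the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry; or computing a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
Based on the method of the second aspect, a seventh IO request processing method according to the second aspect is provided, wherein, for a write request among the IO requests, a third storage object is created, the third storage object is recorded in the first mapping table entry, and data is written to the third storage object.
Based on the method of the second aspect, an eighth IO request processing method according to the second aspect is provided, wherein, for a write request among the IO requests, a fourth storage object is created, the fourth storage object is recorded in the second mapping table entry, and data is written to the fourth storage object.
Based on the method of the second aspect, a ninth IO request processing method according to the second aspect is provided, wherein the number of mapping table entries is an integer multiple of the number of threads that process IO commands.
Based on the method of the second aspect, a tenth IO request processing method according to the second aspect is provided, further comprising: if no storage object can be obtained from the first mapping table entry, then for a read request among the IO requests, returning a result indicating that the read request is abnormal.
Based on the method of the second aspect, an eleventh IO request processing method according to the second aspect is provided, further comprising: if no storage object can be obtained from the second mapping table entry, then for a read request among the IO requests, returning a result indicating that the read request is abnormal.
Based on the method of the second aspect, a twelfth IO request processing method according to the second aspect is provided, wherein the first thread is executed only by a first CPU and the second thread is executed only by a second CPU.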
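According to a third aspect of the invention, an IO request processing apparatus for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and a virtual storage disk comprises a plurality of logical address regions. The apparatus comprises: a first receiving module configured to receive a first IO request, wherein the first IO request accesses a first logical address region; and a first processing module configured to determine a first thread according to the first logical address region, so that the first thread processes the first IO request.
Based on the apparatus of the third aspect, a second IO request processing apparatus according to the third aspect is provided, wherein the logical addresses of the plurality of logical address regions do not overlap one another; the first receiving module is further configured to receive a second IO request, wherein the second IO request accesses a second logical address region; and the first processing module is further configured to determine a second thread according to the second logical address region, so that the second thread processes the second IO request.
Based on the apparatus of the third aspect, a third IO request processing apparatus according to the third aspect is provided, wherein the first processing module is configured to generate a first IO command according to the first IO request and to place the first IO command, according to the first logical address region, into a first queue corresponding to the first thread; the first thread takes the first IO command out of the first queue and processes it, wherein the first thread processes only commands of the first queue.
Based on the apparatus of the third aspect, a fourth IO request processing apparatus according to the third aspect is provided, wherein the first processing module is configured to generate a second IO command according to the second IO request and to place the second IO command, according to the second logical address region, into a second queue corresponding to the second thread; the second thread takes the second IO command out of the second queue and processes it, wherein the second thread processes only commands of the second queue.
Based on the apparatus of the third aspect, a fifth IO request processing apparatus according to the third aspect is provided, further comprising a first index calculation module configured to take the index of the first logical address region modulo the number of threads as the index of the first thread, or to take the index of the second logical address region modulo the number of threads as the index of the second thread.
Based on the apparatus of the third aspect, a sixth IO request processing apparatus according to the third aspect is provided, further comprising a second index calculation module configured to compute a hash of the index of the first logical address region to obtain the index of the first thread, or to compute a hash of the index of the second logical address region to obtain the index of the second thread.
Based on the apparatus of the third aspect, a seventh IO request processing apparatus according to the third aspect is provided, further comprising a third index calculation module configured to take the index of the first logical address region modulo the number of threads as the index of the first queue, or to take the index of the second logical address region modulo the number of threads as the index of the second queue.
Based on the apparatus of the third aspect, an eighth IO request processing apparatus according to the third aspect is provided, further comprising a fourth index calculation module configured to compute a hash of the index of the first logical address region to obtain the index of the first queue, or to compute a hash of the index of the second logical address region to obtain the index of the second queue.
Based on the apparatus of the third aspect, a ninth IO request processing apparatus according to the third aspect is provided, further comprising a first mapping table determination module configured to determine a first mapping table entry according to the first logical address region; the first processing module is further configured to access the first mapping table entry through the first thread, obtain a first storage object from the first mapping table entry according to a first logical address accessed by the first IO request, and access the first storage object to process the first IO request.
Based on the apparatus of the third aspect, a tenth IO request processing apparatus according to the third aspect is provided, further comprising a second mapping table determination module configured to determine a second mapping table entry according to the second logical address region; the first processing module is further configured to access the second mapping table entry through the second thread, obtain a second storage object from the second mapping table entry according to a second logical address accessed by the second IO request, and access the second storage object to process the second IO request.
Based on the apparatus of the third aspect, an eleventh IO request processing apparatus according to the third aspect is provided, further comprising a fifth index calculation module configured to take the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry, or to compute a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
Based on the apparatus of the third aspect, a twelfth IO request processing apparatus according to the third aspect is provided, further comprising a sixth index calculation module configured to take the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry, or to compute a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
Based on the apparatus of the third aspect, a thirteenth IO request processing apparatus according to the third aspect is provided, wherein the first processing module is further configured, when no storage object can be obtained from the first mapping table entry, to create a third storage object for a write request among the IO requests, record the third storage object in the first mapping table entry, and write data to the third storage object.
Based on the apparatus of the third aspect, a fourteenth IO request processing apparatus according to the third aspect is provided, wherein the first processing module is further configured, when no storage object can be obtained from the second mapping table entry, to create a fourth storage object for a write request among the IO requests, record the fourth storage object in the second mapping table entry, and write data to the fourth storage object.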
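According to a fourth aspect of the invention, an IO request processing apparatus for a storage system is provided. The storage system comprises a plurality of virtual storage disks, and a virtual storage disk comprises a plurality of logical address regions. The apparatus comprises: a second receiving module configured to receive a first IO request; and a second processing module configured to generate a first IO command and a second IO command according to the first IO request, wherein the first IO command accesses a first logical address region and the second IO command accesses a second logical address region, to determine a first thread according to the first logical address region so that the first thread processes the first IO command, and to determine a second thread according to the second logical address region so that the second thread processes the second IO command.
Based on the apparatus of the fourth aspect, a second IO request processing apparatus according to the fourth aspect is provided, wherein the second processing module is configured to place the first IO command, according to the first logical address region, into a first queue corresponding to the first thread, the first thread taking the first IO command out of the first queue and processing it, wherein the first thread processes only commands of the first queue; and to place the second IO command, according to the second logical address region, into a second queue corresponding to the second thread, the second thread taking the second IO command out of the second queue and processing it, wherein the second thread processes only commands of the second queue.
Based on the apparatus of the fourth aspect, a third IO request processing apparatus according to the fourth aspect is provided, further comprising a third mapping module configured to determine a first mapping table entry according to the first logical address region; the second processing module is further configured to access the first mapping table entry through the first thread, obtain a first storage object from the first mapping table entry according to a first logical address accessed by the first IO command, and access the first storage object to process the first IO command.
Based on the apparatus of the fourth aspect, a fourth IO request processing apparatus according to the fourth aspect is provided, further comprising a fourth mapping module configured to determine a second mapping table entry according to the second logical address region; the second processing module is further configured to access the second mapping table entry through the second thread, obtain a second storage object from the second mapping table entry according to a second logical address accessed by the second IO command, and access the second storage object to process the second IO command.
Based on the apparatus of the fourth aspect, a fifth IO request processing apparatus according to the fourth aspect is provided, further comprising a seventh index calculation module configured to take the index of the first logical address region modulo the number of mapping table entries as the index of the first mapping table entry, or to compute a hash of the index of the first logical address region to obtain the index of the first mapping table entry.
Based on the apparatus of the fourth aspect, a sixth IO request processing apparatus according to the fourth aspect is provided, further comprising an eighth index calculation module configured to take the index of the second logical address region modulo the number of mapping table entries as the index of the second mapping table entry, or to compute a hash of the index of the second logical address region to obtain the index of the second mapping table entry.
Based on the apparatus of the fourth aspect, a seventh IO request processing apparatus according to the fourth aspect is provided, wherein, for a write request among the IO requests, a third storage object is created, the third storage object is recorded in the first mapping table entry, and data is written to the third storage object.
Based on the apparatus of the fourth aspect, an eighth IO request processing apparatus according to the fourth aspect is provided, wherein, for a write request among the IO requests, a fourth storage object is created, the fourth storage object is recorded in the second mapping table entry, and data is written to the fourth storage object.
Based on the apparatus of the fourth aspect, a ninth IO request processing apparatus according to the fourth aspect is provided, wherein the number of mapping table entries is an integer multiple of the number of threads that process IO commands.
Based on the apparatus of the fourth aspect, a tenth IO request processing apparatus according to the fourth aspect is provided, wherein the second processing module is further configured, if no storage object can be obtained from the first mapping table entry, to return a result indicating that the read request is abnormal for a read request among the IO requests.
Based on the apparatus of the fourth aspect, an eleventh IO request processing apparatus according to the fourth aspect is provided, wherein the second processing module is further configured, if no storage object can be obtained from the second mapping table entry, to return a result indicating that the read request is abnormal for a read request among the IO requests.
Based on the apparatus of the fourth aspect, a twelfth IO request processing apparatus according to the fourth aspect is provided, wherein the first thread is executed only by a first CPU and the second thread is executed only by a second CPU.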
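According to a fifth aspect of the invention, a computer program comprising computer program code is provided. When loaded into a computer system and executed on it, the computer program code causes the computer system to perform the IO request processing method for a storage system provided according to the first through second aspects of the invention.
According to a sixth aspect of the invention, a program comprising program code is provided. When loaded into a storage system and executed on it, the program code causes the storage system to perform the IO request processing method for a storage system provided according to the first through second aspects of the invention.
In embodiments of the invention, the logical address space of a virtual storage disk is divided into multiple non-overlapping regions, and the IO requests to a region are processed by one corresponding thread, which prevents two threads from processing IO requests for the same region. Embodiments of the invention have the following advantages: while data reliability is ensured, the CPUs can run fully concurrently and the high performance of solid state drives is fully exploited; system performance scales linearly; and the configuration can be adjusted dynamically according to performance requirements and constraints on the use of resources such as CPU and memory.
Additional aspects and advantages of the invention are set forth in part in the description below, will in part become apparent from that description, or may be learned through practice of the invention.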
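Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of embodiments in conjunction with the drawings, in which:
FIG. 1 shows the architecture of a storage system according to an embodiment of the invention;
FIG. 2 shows the structure of a storage object according to an embodiment of the invention;
FIG. 3 shows the structure of a storage object according to yet another embodiment of the invention;
FIG. 4 is a schematic diagram of a lock-free IO request processing model according to an embodiment of the invention;
FIG. 5 is a schematic diagram of the mapping between the logical address space of a virtual storage disk and storage objects according to an embodiment of the invention;
FIG. 6 is a schematic diagram of the mapping between the logical address space of a virtual storage disk and storage objects according to another embodiment of the invention;
FIG. 7 is a schematic diagram of the logical address regions of a virtual storage disk according to an embodiment of the invention;
FIG. 8 is a schematic diagram of the logical address regions of a virtual storage disk according to another embodiment of the invention;
FIG. 9 is a flowchart of IO request processing in a storage system according to an embodiment of the invention;
FIG. 10 is a flowchart of dispatching IO requests to IO handler threads according to an embodiment of the invention; and
FIG. 11 is a flowchart of an IO handler thread processing an IO request according to an embodiment of the invention.
Detailed Description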
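Embodiments of the invention are described in detail below and illustrated in the drawings, in which identical or similar reference numerals denote identical or similar elements, or elements with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
FIG. 1 shows the architecture of a storage system according to an embodiment of the invention. The storage system comprises a computer or server (collectively, the host) and multiple storage devices (for example, drives) coupled to the host. Preferably, the drives are solid state drives (SSDs). Optionally, embodiments of the invention may also include magnetic disk drives.
A storage resource pool maintains the storage resources provided by the drives. The pool records the data blocks or data chunks (Chunk, hereinafter simply "chunk") of each drive. As an example, a chunk is a set of data blocks of a predetermined size that are contiguous in the logical or physical space of a drive. A chunk may be, for example, hundreds of KB (kilobytes) or MB (megabytes) in size. Optionally, the storage resource pool records only those data blocks or chunks that have not yet been allocated to a storage object; these are also called free data blocks or free chunks. The storage resource pool is also a form of virtualization: it virtualizes the storage resources of the physical drives into data blocks or chunks for the upper layers to access or use. A storage system may have multiple storage resource pools; the example of FIG. 1 shows only one.
The resource allocation layer allocates storage objects for the storage object layer. A storage object comprises multiple chunks, and an allocator creates storage objects from chunks in the storage resource pool. A storage object as provided in embodiments of the invention represents part of the storage space of the storage system. It is a storage unit with RAID functionality; its structure is described in detail below with reference to FIG. 2. The storage object layer provides multiple storage objects, which can be created and destroyed. When a storage object is created, the allocator obtains the required number of chunks from the storage resource pool, and these chunks make up the storage object. At any given time a chunk belongs to at most one storage object: a chunk already allocated to one storage object is not allocated to another. When a storage object is destroyed, its chunks are released back to the storage resource pool and can be allocated again to other storage objects.
The storage system includes multiple virtual storage disks. A virtual storage disk provides an access interface to applications and serves external requests. A virtual storage disk is made up of a number of storage objects, and applications can create multiple virtual storage disks with different attributes as needed. A virtual storage disk provides a logical address space, which comprises multiple logical address regions.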
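FIG. 2 shows the structure of a storage object according to an embodiment of the invention. A storage object comprises multiple data blocks or chunks. In the example of FIG. 2, the storage object comprises chunks 220, 222, 224, and 226. The chunks of a storage object come from different drives, and each drive contributes at most one chunk to a given storage object. Referring to FIG. 2, chunk 220 comes from drive 210, chunk 222 from drive 212, chunk 224 from drive 214, and chunk 226 from drive 216. Thus, when a single drive fails, only one or a few chunks of the storage object become inaccessible, and the data of the storage object can be rebuilt from its remaining chunks, which meets the data reliability requirement.
RAID technology provides data protection for the storage object and high-performance access to it. Referring to FIG. 2, the storage object comprises multiple RAID stripes (stripe 230, stripe 232, ..., stripe 238), each composed of storage space from different chunks. The storage spaces from different chunks that belong to the same stripe may have the same or different address ranges. The stripe is the smallest unit of writing for a storage object, so performance improves because data is written to multiple drives in parallel. Read operations on a storage object have no size restriction. RAID is implemented within the stripe: of the storage spaces from the 4 chunks that make up stripe 230, 3 store user data and 1 stores parity data, so that stripe 230 provides data protection at a level such as RAID 5.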
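The 3 data + 1 parity layout of stripe 230 can be illustrated with a minimal sketch. It assumes plain XOR parity, which is the usual RAID 5 scheme; the source does not specify the encoding, and the function name `xor_parity` and the sample bytes are hypothetical.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of bytes per byte position in the stripe units.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data units and one parity unit form one stripe (illustrative values).
data_units = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity_unit = xor_parity(data_units)

# If the chunk holding data_units[1] is lost, XOR of the survivors and the
# parity unit rebuilds it.
rebuilt = xor_parity([data_units[0], data_units[2], parity_unit])
```

Because XOR is its own inverse, any single missing unit of the stripe, data or parity, can be recomputed from the other three.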
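Optionally, each chunk also stores metadata. In the example of FIG. 2, chunks 220 through 226 each hold the same metadata, which ensures the reliability of the metadata: even if some chunks of a storage object fail, the metadata can still be obtained from the others. The metadata records information such as the storage object the chunk belongs to, the data protection level (RAID level) of that storage object, and the erase count of the chunk.
FIG. 3 shows the structure of a storage object according to yet another embodiment of the invention. When a user writes to a logical address of the storage space for the first time, a storage object is created for that logical address region: a resource allocation takes place to create the storage object, and the created storage object is associated with the logical address region of the storage space. The storage system includes multiple drives (see FIG. 3: drive 0, drive 1, drive 2, and drive 3). The storage space of each drive is divided into fixed-size storage resources called chunks. Several chunks organized by a RAID algorithm form a data protection unit called a storage object. Referring to FIG. 3, drive 0 provides chunk 0, chunk i, ...; drive 1 provides chunk 1, chunk j, ...; drive 2 provides chunk 2, chunk k, ...; and drive 3 provides chunk 3, chunk t, .... The embodiment of FIG. 3 shows 2 storage objects: storage object m consists of one chunk from each of drive 0, drive 1, and drive 2 (chunk 0, chunk 1, and chunk 2), and storage object n consists of one chunk from each of drive 1, drive 2, and drive 3 (chunk j, chunk k, and chunk t).
As an example, in the embodiment of FIG. 3 the storage capacity of a chunk is on the order of MB (megabytes), while that of a drive is on the order of TB (terabytes). Storage objects are therefore created frequently in the storage system, and controlling how they are created makes it possible to control the use of storage resources effectively and to implement global wear leveling and reverse wear leveling.
In embodiments of the invention, when a storage object is created, the drives that provide its chunks may be picked at random from the available drives according to per-drive weights, which balances the use of drive resources and achieves global wear leveling.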
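A weighted random pick of drives can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how weights are derived (free space or remaining endurance would be plausible inputs), and the function `pick_drives` and the drive names are hypothetical. The sketch draws without replacement so that each drive contributes at most one chunk to the storage object.

```python
import random


def pick_drives(weights: dict[str, float], count: int, rng: random.Random) -> list[str]:
    """Pick `count` distinct drives; a higher weight means a higher chance."""
    if count > len(weights):
        raise ValueError("not enough drives to take one chunk per drive")
    remaining = dict(weights)
    chosen = []
    for _ in range(count):
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # a drive gives at most one chunk to this storage object
    return chosen


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
drive_weights = {"drive0": 1.0, "drive1": 3.0, "drive2": 2.0, "drive3": 2.0}
drives = pick_drives(drive_weights, 3, rng)
```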
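FIG. 4 is a schematic diagram of the lock-free IO request processing model according to an embodiment of the invention. As shown in FIG. 4, applications access virtual storage disks. According to the user's configuration, a number of IO handler threads (IO Handlers) are created for the storage resource pool, and these threads can run fully concurrently on multiple CPUs. Each IO request arriving from a virtual storage disk is dispatched to a specific IO handler thread according to its parameters. As an example, IO requests are dispatched to different IO handler threads according to the request type (read or write) and/or the logical address the request accesses.
An IO handler thread processes IO requests, such as read and write requests from applications and/or data rebuild requests internal to the storage system. Its main work includes mutual exclusion and synchronization among IO requests, RAID encoding and decoding, dispatching IO requests to the underlying solid state drives, and callback handling for returned requests. An IO handler thread may be a thread, a process, or any other piece of code executed on a CPU. In the example of FIG. 4, each IO handler thread is bound to one CPU or CPU core. That CPU or CPU core is dedicated to executing the IO handler thread bound to it, which reduces the extra overhead introduced by thread switching.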
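FIG. 5 is a schematic diagram of the mapping between the logical address space of a virtual storage disk and storage objects according to an embodiment of the invention. In FIG. 5, "Ct#K" denotes the storage object numbered K. The logical address space of virtual storage disk 0 runs from "LBA#0" to "LBA#N", where "LBA#0" denotes the starting address 0 and "LBA#N" the maximum address N. As shown in FIG. 5, the shaded parts of the logical address space of virtual storage disk 0 are unmapped, and the unshaded parts are already mapped to storage objects. Mapping a part of the logical address space to a storage object means that accesses to that part are served by the mapped storage object. For example, in FIG. 5, an access to the part of virtual storage disk 0 that is mapped to storage object Ct#0 is completed by accessing Ct#0. Initially, all logical addresses of virtual storage disk 0 are unmapped. When data is first written to a logical address, a storage object is allocated and the written logical address is mapped to it; logical addresses that have been mapped to a storage object belong to the mapped area. When a logical address is read, the storage object mapped at that address is looked up, and the requested data is obtained by accessing that storage object. Furthermore, each part of the logical address space is mapped to at most one storage object.
FIG. 6 is a schematic diagram of the mapping between the logical address space of a virtual storage disk and storage objects according to another embodiment of the invention. In FIG. 6 the storage objects differ in size, so different storage objects are mapped to different numbers of logical addresses. In FIG. 5, by comparison, all storage objects have the same size.
As shown in FIG. 6, some logical addresses of virtual storage disk 1 are mapped to storage objects of various sizes, while others are not yet mapped. When a write targets a logical address of virtual storage disk 1, the system first checks whether that address is already mapped to a storage object. If it is (for example, to storage object Ct#2), the write is performed directly on Ct#2; if no storage object is mapped to the address, a storage object is created and added to the address space mapping management system of virtual storage disk 1, and the write then proceeds.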
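FIG. 7 is a schematic diagram of the logical address regions of a virtual storage disk according to an embodiment of the invention. As shown in FIG. 7, the logical address space of virtual storage disk 0 is divided into contiguous logical address regions of equal size; "Re#n" denotes the region numbered n. In embodiments of the invention, an IO request is dispatched to an IO handler thread according to the logical address region it accesses. Each logical address region may correspond to one or more storage objects. The logical addresses of different regions do not overlap.
The logical address regions of a virtual storage disk have a fixed size, whereas storage objects may be of fixed or variable length. FIG. 8 is a schematic diagram of the logical address regions of a virtual storage disk according to another embodiment of the invention. As shown in FIG. 8, the storage objects differ in size, and the address space of the virtual storage disk is divided into a number of equally sized logical address regions, each corresponding to one or more storage objects. For example, referring to FIG. 8, region Re#0 contains storage objects Ct#1 and Ct#2; the logical addresses of region Re#1 have not yet been allocated a storage object; and part of the logical addresses of region Re#3 have been allocated storage object Ct#K, while the rest have not yet been allocated a storage object.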
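FIG. 9 is a flowchart of IO request processing in the storage system according to an embodiment of the invention. IO requests that access different logical address regions are assigned to different IO handler threads, so multiple threads work at the same time. Because each thread handles IO requests for different logical address regions, there are no access dependencies among these read and write requests; the threads therefore do not affect one another and can run in parallel.
Referring to FIG. 9, to process an IO request to the storage system, a first IO request is received, the first IO request accessing a first logical address region (910). A first thread is determined according to the first logical address region, and the first thread processes the first IO request (920).
As an example, the first logical address region is the region denoted "Re#0" in FIG. 7, and the first thread may be, for example, the IO handler thread (T1) bound to CPU 1 in FIG. 4.
For example, IO requests to region "Re#0" (see FIG. 7) are assigned to thread T1, and IO requests to region "Re#2" (see FIG. 7) are assigned to thread T2, the IO handler thread bound to CPU 2 (see FIG. 4). Read and write requests to different logical regions of the virtual storage disk's address space can thus be processed by different threads at the same time.
Those skilled in the art will recognize that there are many ways to map a logical address region R to a thread T. For example, the region number is taken modulo the number of IO handler threads created; the result is the index of the corresponding IO handler thread, and the thread with that index processes the IO requests to that region. In another example, a hash of the number of region R is computed, and the hash result serves as the index of the IO handler thread.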
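Both mappings from a region number to a thread index can be sketched in a few lines. The function names are hypothetical, and the choice of BLAKE2b as the hash is an assumption made for illustration; the source only says that a hash of the region number is computed. The sketch reduces the hash modulo the thread count so that the result is always a valid index.

```python
import hashlib


def thread_index_by_modulo(region_index: int, io_handler_num: int) -> int:
    """Region number modulo the number of IO handler threads."""
    return region_index % io_handler_num


def thread_index_by_hash(region_index: int, io_handler_num: int) -> int:
    """Hash the region number, then reduce the digest to a valid thread index."""
    digest = hashlib.blake2b(region_index.to_bytes(8, "little"), digest_size=8).digest()
    return int.from_bytes(digest, "little") % io_handler_num


modulo_owner = thread_index_by_modulo(5, 4)
hash_owner = thread_index_by_hash(5, 4)
```

Either function is deterministic, which is the property that matters: every IO request to a given region always reaches the same thread.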
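A storage object mapping table maintains the mapping from addresses of a virtual storage disk to storage objects. Each virtual storage disk has its own storage object mapping table, which is a resource shared across the entire virtual storage disk. In embodiments of the invention, multiple IO Handlers can access the storage object mapping table concurrently and without locks.
The storage object mapping table consists of a number of mapping table entries. Each entry is a red-black tree whose nodes store the correspondence between the starting logical address of a storage object and that storage object, so each entry records multiple "<starting logical address of storage object, storage object>" pairs. The entry into which a storage object is placed can be obtained by taking the number of the logical address region containing the storage object modulo the number of mapping table entries.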
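A single mapping table entry can be sketched as an ordered map from starting logical address to storage object. Python's standard library has no red-black tree, so this sketch substitutes a sorted list searched with `bisect`; that swap, the class names, and the `length` field are assumptions for illustration, not the structure the source describes.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class StorageObject:
    name: str
    start_lba: int
    length: int  # number of logical addresses covered (assumed field)


@dataclass
class MapEntry:
    """One mapping table entry: storage objects ordered by starting address."""
    starts: list = field(default_factory=list)
    objects: list = field(default_factory=list)

    def insert(self, obj: StorageObject) -> None:
        pos = bisect.bisect_left(self.starts, obj.start_lba)
        self.starts.insert(pos, obj.start_lba)
        self.objects.insert(pos, obj)

    def lookup(self, lba: int):
        """Return the storage object covering `lba`, or None if unmapped."""
        pos = bisect.bisect_right(self.starts, lba) - 1
        if pos >= 0:
            obj = self.objects[pos]
            if lba < obj.start_lba + obj.length:
                return obj
        return None


entry = MapEntry()
entry.insert(StorageObject("Ct#2", 300, 50))
entry.insert(StorageObject("Ct#1", 0, 100))
```

Keying on the starting address and searching for the greatest start not above the requested address is what lets variable-sized storage objects share one entry.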
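In embodiments of the invention, IO requests to logical address region R1 are processed by IO handler thread T1, and the mapping table entry that records the mapping between the logical addresses of R1 and storage objects (call it M1) is also handled by T1. Likewise, IO requests to logical address region R2 are processed by IO handler thread T2, and the mapping table entry for R2 (call it M2) is also handled by T2. T1 and T2 thus process IO requests to different logical address regions and access different mapping table entries. In this way multiple IO handler threads carry out their tasks concurrently, with no need for locks to synchronize them.
As an example, a storage resource pool of the storage system is given io_handler_num IO handler threads, and the storage object mapping table of virtual storage disk 0 has map_entry_num entries. For the logical address region with index region_index: the IO handler thread that owns the region is identified by ioh_index, where ioh_index = region_index % io_handler_num; the mapping table entry that the region belongs to is identified by map_index, where map_index = region_index % map_entry_num; and that mapping table entry is accessed by the IO handler thread numbered map_ioh_index, where map_ioh_index = map_index % io_handler_num.
Furthermore, map_entry_num is chosen to be divisible by io_handler_num, which ensures that for any IO request the corresponding ioh_index equals map_ioh_index: when io_handler_num divides map_entry_num, region_index and region_index % map_entry_num leave the same remainder modulo io_handler_num.
Therefore, for a virtual storage disk, the number of its mapping table entries is set to an integer multiple of the number of IO handler threads assigned to the storage resource pool, which guarantees that each IO handler thread is the sole accessor of certain mapping table entries. Under this condition, the IO handler threads need no locks to synchronize their accesses to the storage object mapping table, which yields a lock-free design.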
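The three index formulas and the divisibility condition can be checked directly. The variable names follow the text; the helper `owners_agree` and the sample counts are hypothetical.

```python
def indices(region_index: int, io_handler_num: int, map_entry_num: int):
    """Return (ioh_index, map_index, map_ioh_index) for one region."""
    ioh_index = region_index % io_handler_num
    map_index = region_index % map_entry_num
    map_ioh_index = map_index % io_handler_num
    return ioh_index, map_index, map_ioh_index


def owners_agree(io_handler_num: int, map_entry_num: int, regions: int = 1000) -> bool:
    """True if every region's handler is also the handler of its map entry."""
    return all(
        ioh == map_ioh
        for ioh, _, map_ioh in (
            indices(r, io_handler_num, map_entry_num) for r in range(regions)
        )
    )


multiple_ok = owners_agree(io_handler_num=4, map_entry_num=8)
non_multiple_ok = owners_agree(io_handler_num=4, map_entry_num=6)
```

With 4 threads and 6 entries the property fails: region 6 is owned by thread 2, but it lands in entry 0, which thread 0 accesses, so two threads would touch the same entry.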
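In another example, the logical address space of virtual storage disk 1 is divided into 12 regions, the storage resource pool that provides resources for virtual storage disk 1 is served by 3 IO handler threads, and the storage object mapping table of the virtual storage disk has 6 entries. The first 4 logical address regions (numbered 0/1/2/3) and the first 2 mapping table entries (indices 0/1) are assigned to IO handler thread T0; the middle 4 regions (numbered 4/5/6/7) and the entries with indices 2/3 are assigned to IO handler thread T1; and the last 4 regions (numbered 8/9/10/11) and the entries with indices 4/5 are assigned to IO handler thread T2. This prevents two or more IO handler threads from accessing the same mapping table entry or the same logical address region.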
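This contiguous assignment can be written with integer division instead of modulo. The sketch uses the counts from the example (12 regions, 6 entries, 3 threads); the function names are hypothetical, and the rule that two consecutive regions share one mapping table entry is an assumption, since the source does not state how a region picks its entry in this variant.

```python
REGION_NUM, MAP_ENTRY_NUM, IO_HANDLER_NUM = 12, 6, 3


def handler_for_region(region_index: int) -> int:
    """Contiguous blocks of regions go to the same IO handler thread."""
    return region_index // (REGION_NUM // IO_HANDLER_NUM)


def handler_for_map_entry(map_index: int) -> int:
    """Contiguous blocks of mapping table entries go to the same thread."""
    return map_index // (MAP_ENTRY_NUM // IO_HANDLER_NUM)


def map_entry_for_region(region_index: int) -> int:
    """Assumed rule: consecutive regions share one mapping table entry."""
    return region_index // (REGION_NUM // MAP_ENTRY_NUM)


region_owners = [handler_for_region(r) for r in range(REGION_NUM)]
entry_owners = [handler_for_map_entry(m) for m in range(MAP_ENTRY_NUM)]
```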
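A storage resource pool can support multiple virtual storage disks. In embodiments of the invention, IO handler threads are provided per storage resource pool. The IO handler threads of a storage resource pool can be shared among the virtual storage disks that the pool supports, so that one IO handler thread processes IO requests to different virtual storage disks. Furthermore, a storage system may provide multiple storage resource pools.
Preferably, the number of IO handler threads provided for the storage system or for a storage resource pool is adjustable but does not exceed the number of CPUs or CPU cores in the storage system.
Processing of an IO request to a virtual storage disk can be divided into two stages: in the first stage the IO request is received and dispatched to the corresponding IO handler thread; in the second stage the IO handler thread processes the request and sends it to the storage device or its driver. The two stages are described in turn below. FIG. 10 is a flowchart of dispatching IO requests to IO handler threads according to an embodiment of the invention and corresponds to the first stage; FIG. 11 is a flowchart of an IO handler thread processing an IO request according to an embodiment of the invention and corresponds to the second stage.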
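As shown in FIG. 10, a user accesses the virtual storage disk, so an IO request is received (1010). Based on the logical address of the IO request, it is determined whether the request spans multiple logical address regions (1020); this can be decided from the starting logical address and the access length of the request. If the IO request accesses only one logical address region, one IO command is created (1030); the response to that IO command can serve as the response to the IO request. The number of the logical address region accessed by the IO command is also computed, for example from the starting logical address of the IO request. If the logical address range of the IO request spans multiple logical address regions, multiple IO commands are created for the request, one for each region it accesses (1040). For example, if an IO request accesses regions Re#2 and Re#3 (see FIG. 7), IO command C1 is generated for Re#2 and IO command C2 for Re#3, and the responses to C1 and C2 can be combined into the response to the IO request.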
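Steps 1020 through 1040 can be sketched as a function that cuts a request at region boundaries. It assumes equally sized regions numbered from 0, as in FIG. 7; the `IOCommand` type, the function name, and the region size of 1024 addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOCommand:
    region_index: int
    start_lba: int
    length: int


def split_request(start_lba: int, length: int, region_size: int) -> list[IOCommand]:
    """Create one IO command per logical address region the request touches."""
    if length <= 0 or region_size <= 0:
        raise ValueError("length and region_size must be positive")
    commands = []
    lba, end = start_lba, start_lba + length
    while lba < end:
        region_index = lba // region_size
        region_end = (region_index + 1) * region_size
        span = min(end, region_end) - lba  # stop at the region boundary
        commands.append(IOCommand(region_index, lba, span))
        lba += span
    return commands


single = split_request(start_lba=100, length=50, region_size=1024)
spanning = split_request(start_lba=2000, length=200, region_size=1024)
```

In the second call the request starts 48 addresses before the end of region 1, so it yields one command of length 48 for region 1 and one of length 152 for region 2.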
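According to the logical address region number of an IO command, the IO handler thread that will process the command is determined, and the command is dispatched to that thread's command queue (1050). In embodiments of the invention, each IO handler thread has its own command queue, whose entries hold IO commands, and the thread takes IO commands out of its queue for processing. In step 1050, as an example, the region number of the IO command is hashed, or taken modulo the number of IO handler threads, to obtain the index of the IO handler thread that processes the command; the thread is obtained from that index, and the IO command is placed into its command queue. As another example, the same hash or modulo computation yields an index that is used to obtain the command queue directly, and the IO command is inserted into that queue.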
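Step 1050 can be sketched with one queue per IO handler thread. Two caveats apply: Python's `queue.Queue` uses a lock internally, so this sketch only shows the routing and the one-consumer-per-queue ownership, not a lock-free queue implementation; and the names `dispatch`, `io_handler`, and the `None` stop sentinel are hypothetical.

```python
import queue
import threading

IO_HANDLER_NUM = 2
command_queues = [queue.Queue() for _ in range(IO_HANDLER_NUM)]
handled = [[] for _ in range(IO_HANDLER_NUM)]  # each list is touched by one thread only


def dispatch(region_index: int, payload: str) -> None:
    """Route a command to the queue selected by region number modulo thread count."""
    command_queues[region_index % IO_HANDLER_NUM].put((region_index, payload))


def io_handler(index: int) -> None:
    """Consume only this handler's own queue until the stop sentinel arrives."""
    while True:
        command = command_queues[index].get()
        if command is None:
            break
        handled[index].append(command)


threads = [threading.Thread(target=io_handler, args=(i,)) for i in range(IO_HANDLER_NUM)]
for t in threads:
    t.start()
for region in range(6):
    dispatch(region, f"write to Re#{region}")
for q in command_queues:
    q.put(None)  # tell each handler to stop
for t in threads:
    t.join()
```

After the run, handler 0 has seen regions 0, 2, and 4 and handler 1 has seen regions 1, 3, and 5, each in submission order.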
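In the second stage, the IO handler thread processes the IO command and sends it to the storage device or its driver. As shown in FIG. 11, the IO handler thread takes an IO command out of its command queue (1110). Using the logical address of the IO command, it searches the storage object mapping table to determine whether a storage object corresponding to that address exists (1120). If the table indicates that the logical address accessed by the command is already mapped to a storage object, that storage object is accessed and is read or written according to the IO command (1150). When searching the table, the IO handler thread determines which mapping table entry to access from the number of the logical address region the command accesses, and then looks up the storage object to which the command's logical address is mapped. In embodiments of the invention, no two IO handler threads access the same mapping table entry, so an IO handler thread does not need to lock the storage object mapping table when accessing it, which improves processing efficiency.
If the table indicates that the logical address accessed by the IO command is not yet mapped to a storage object, it is further determined whether the command is a read or a write (1130). If it is a read command, reading a logical address that has not been allocated a storage object is illegal, and a specified value is returned as the response to the IO command (1160): for example, an all-zero result, or a return value indicating that the command accessed an illegal or unallocated logical address. If it is a write command to a logical address that has not been allocated a storage object, a storage object is first created and inserted into the storage object mapping table, so that an entry of the table records the mapping between the logical address and the created storage object (1140); the storage object is then accessed and the data is written to it according to the IO command (1150).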
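The decision flow of steps 1120 through 1160 can be sketched as one function. The sketch makes strong simplifications, all of them assumptions: a plain dict keyed by the exact logical address stands in for the mapping table entry, a fixed-size `bytearray` stands in for a storage object, and an unmapped read answers with zeros (one of the two responses the text allows). The function name and parameters are hypothetical.

```python
def process_command(map_entry: dict, op: str, lba: int, data: bytes = b"", size: int = 8):
    """Handle one IO command against the map entry owned by this handler thread."""
    storage_object = map_entry.get(lba)        # step 1120: is the address mapped?
    if storage_object is None:
        if op == "read":                       # steps 1130 and 1160
            return bytes(size)                 # unmapped read: all-zero result
        storage_object = bytearray(size)       # step 1140: create the storage object
        map_entry[lba] = storage_object        # and record it in the map entry
    if op == "read":                           # step 1150: access the storage object
        return bytes(storage_object)
    storage_object[:len(data)] = data
    return len(data)


entry_for_region = {}
unmapped_read = process_command(entry_for_region, "read", 64)
mapped_after_read = 64 in entry_for_region
written = process_command(entry_for_region, "write", 64, b"abc")
read_back = process_command(entry_for_region, "read", 64)
```

No lock guards `map_entry` here because, by construction, only the handler thread that owns the entry ever calls this function on it.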
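Those skilled in the art will recognize that the steps shown in FIG. 11 can be performed in other orders. For example, after taking an IO command from the command queue, the IO handler thread may first check whether it is a read or a write and then check whether its logical address has been allocated a storage object. For a read command, if the logical address has been allocated a storage object, the data is fetched from that storage object; otherwise a predetermined value is returned as the read result, or the response indicates that an illegal address was read. For a write command, if the logical address has been allocated a storage object, the data is written to it; otherwise a new storage object is allocated, a mapping between the logical address and the new storage object is established, and the data is written to the newly allocated storage object as the response to the IO command.
In summary, in embodiments of the invention the logical address space of a virtual storage disk is divided into multiple non-overlapping logical address regions, and the IO requests to a region are processed by one corresponding thread. This prevents two threads from processing IO requests for the same logical address region and, furthermore, from accessing the same mapping table entry at the same time. Resource accesses by different threads therefore never conflict, and no locks or other mechanisms are needed to synchronize critical resources between threads, which simplifies processing and increases parallelism among threads.
For a storage system with multiple solid state drives, the lock-free IO processing method proposed here ensures data reliability while using the concurrency of multiple CPU cores or CPUs to exploit the high performance of the drives. Performance scales linearly with the number of CPU cores or CPUs, which meets customers' requirements for data reliability and performance. The configuration can also be adjusted dynamically according to performance requirements and constraints on the use of resources such as CPU and memory.
Note that embodiments of the invention do not guarantee the absence of conflicts among multiple IO requests to the same stripe within a storage object. Within the same stripe of a storage object, read requests can execute concurrently with one another; but a write request and another write request, or a write request and a rebuild request, must be synchronized or executed serially to keep the data correct. Each IO handler thread handles the synchronization among multiple IO requests to the same storage object.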
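Embodiments of the invention also provide a program comprising program code which, when loaded into a CPU and executed in the CPU, causes the CPU to perform one of the methods according to embodiments of the invention provided above.
Embodiments of the invention also provide a program comprising program code which, when loaded into a host and executed on the host, causes the processor of the host to perform one of the methods according to embodiments of the invention provided above.
It should be understood that each block of the block diagrams and flowcharts, and combinations of blocks in the block diagrams and flowcharts, can each be implemented by various means comprising computer program instructions.
These computer program instructions may be loaded onto a general-purpose computer, a special-purpose computer, or other programmable data control device to produce a machine, such that the instructions executed on the computer or other programmable data control device create means for implementing the functions specified in one or more flowchart blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data control device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including computer-readable instructions for implementing the functions specified in one or more flowchart blocks. The computer program instructions may also be loaded onto a computer or other programmable data control device to cause a series of operations to be performed on it, producing a computer-implemented process such that the instructions executed on the computer or other programmable data control device provide operations for implementing the functions specified in one or more flowchart blocks.
Accordingly, the blocks of the block diagrams and flowcharts support combinations of means for performing the specified functions, combinations of operations for performing the specified functions, and combinations of program instruction means for performing the specified functions. It should also be understood that each block of the block diagrams and flowcharts, and combinations of blocks in the block diagrams and flowcharts, can be implemented by special-purpose hardware-based computer systems that perform the specified functions or operations, or by combinations of special-purpose hardware and computer instructions.
In this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, substitute, and vary the above embodiments within the scope of the invention.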

Claims (10)

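  1. An IO request processing method for a storage system, characterized in that the storage system comprises a plurality of virtual storage disks, a virtual storage disk comprises a plurality of logical address regions, and the method comprises:
    receiving a first IO request, wherein the first IO request accesses a first logical address region;
    determining a first thread according to the first logical address region, so that the first thread processes the first IO request.
  2. The method according to claim 1, characterized in that determining the first thread according to the first logical address region so that the first thread processes the first IO request comprises:
    generating a first IO command according to the first IO request, and placing the first IO command, according to the first logical address region, into a first queue corresponding to the first thread, the first thread taking the first IO command out of the first queue and processing it, wherein the first thread processes only commands of the first queue.
  3. The method according to any one of claims 1-2, characterized by further comprising:
    determining a first mapping table entry according to the first logical address region;
    the first thread accessing the first mapping table entry, obtaining a first storage object from the first mapping table entry according to a first logical address accessed by the first IO request, and accessing the first storage object to process the first IO request.
  4. The method according to claim 3, characterized by further comprising:
    if no storage object can be obtained from the first mapping table entry, then for a write request among the IO requests, creating a third storage object, recording the third storage object in the first mapping table entry, and writing data to the third storage object.
  5. An IO request processing method for a storage system, characterized in that the storage system comprises a plurality of virtual storage disks, a virtual storage disk comprises a plurality of logical address regions, and the method comprises:
    receiving a first IO request;
    generating a first IO command and a second IO command according to the first IO request, wherein the first IO command accesses a first logical address region and the second IO command accesses a second logical address region;
    determining a first thread according to the first logical address region, so that the first thread processes the first IO command; and determining a second thread according to the second logical address region, so that the second thread processes the second IO command.
  6. The method according to claim 5, characterized by further comprising:
    placing the first IO command, according to the first logical address region, into a first queue corresponding to the first thread, the first thread taking the first IO command out of the first queue and processing it, wherein the first thread processes only commands of the first queue; and
    placing the second IO command, according to the second logical address region, into a second queue corresponding to the second thread, the second thread taking the second IO command out of the second queue and processing it, wherein the second thread processes only commands of the second queue.
  7. The method according to any one of claims 5-6, characterized by further comprising:
    determining a first mapping table entry according to the first logical address region;
    the first thread accessing the first mapping table entry, obtaining a first storage object from the first mapping table entry according to a first logical address accessed by the first IO command, and accessing the first storage object to process the first IO command.
  8. The method according to claim 7, characterized by further comprising:
    if no storage object can be obtained from the first mapping table entry, then for a read request among the IO requests, returning a result indicating that the read request is abnormal.
  9. The method according to any one of claims 5-8, characterized in that
    the first thread is executed only by a first CPU, and the second thread is executed only by a second CPU.
  10. An IO request processing apparatus for a storage system, characterized in that the storage system comprises a plurality of virtual storage disks, a virtual storage disk comprises a plurality of logical address regions, and the apparatus comprises:
    a first receiving module configured to receive a first IO request, wherein the first IO request accesses a first logical address region;
    a first processing module configured to determine a first thread according to the first logical address region, so that the first thread processes the first IO request.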
PCT/CN2017/096152 2016-08-08 2017-08-07 Lock-free IO processing method and apparatus WO2018028529A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610644949.4A CN107704194B (zh) 2016-08-08 2016-08-08 Lock-free IO processing method and apparatus
CN201610644949.4 2016-08-08

Publications (1)

Publication Number Publication Date
WO2018028529A1 true WO2018028529A1 (zh) 2018-02-15

Family

ID=61161771

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/096152 WO2018028529A1 (zh) 2016-08-08 2017-08-07 Lock-free IO processing method and apparatus

Country Status (2)

Country Link
CN (2) CN107704194B (zh)
WO (1) WO2018028529A1 (zh)

Also Published As

Publication number Publication date
CN107704194B (zh) 2020-07-31
CN111679795B (zh) 2024-04-05
CN107704194A (zh) 2018-02-16
CN111679795A (zh) 2020-09-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17838659

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17838659

Country of ref document: EP

Kind code of ref document: A1