WO2020029749A1 - I/O request dispatching method and device - Google Patents

I/O request dispatching method and device

Info

Publication number
WO2020029749A1
Authority
WO
WIPO (PCT)
Prior art keywords
request
zone
write
lock
dispatching
Prior art date
Application number
PCT/CN2019/095922
Other languages
English (en)
French (fr)
Inventor
王鹏
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2020029749A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061: Improving I/O performance
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0653: Monitoring storage devices or systems
    • G06F3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671: In-line storage system
    • G06F3/0673: Single storage device
    • G06F3/0674: Disk device
    • G06F3/0676: Magnetic disk device

Definitions

  • the present application relates to the field of computer technology, and in particular, to an input / output (I / O) request dispatching method and device.
  • HDD Hard disk
  • SMR shingled magnetic recording
  • FIG. 1 illustrates the structure of a conventional hard disk and an SMR disk. As FIG. 1 shows, the SMR disk uses shingled recording: by overlapping the tracks within a region in sequence, it achieves a higher areal storage density, a larger disk capacity, and a lower price per unit of capacity.
  • While the SMR disk brings the above benefits, it also imposes a sequential constraint on the I/O behavior of upper-layer applications: it supports only sequential writes.
  • the SMR disk is logically divided into zones (Zones) of a fixed size, and each zone contains a contiguous range of logical block addresses (LBAs).
  • Zones are divided into two categories: traditional zones (C-Zone), which allow random writes, and sequential zones (S-Zone), which support only sequential writes.
  • C-Zone traditional zones
  • S-Zone sequential zones
  • the sequential constraints of read and write I / O on SMR disks are reflected in the fact that one S-Zone can only be written sequentially, and multiple S-Zones can be written concurrently.
  • the I / O scheduler manages the request queue of the block device, and reduces the number of disk seeks by merging I / O requests and sorting the I / O to improve disk performance.
  • When receiving an I/O request, the I/O scheduler first adds the request to the I/O scheduling queue, then transfers requests from the scheduling queue to the dispatch queue according to a specific scheduling algorithm, and finally submits them for execution.
  • each part of the I / O stack needs to support order preservation.
  • a lock is introduced for each S-Zone of the SMR disk to preserve the order of write I/O, called a zone lock (Zone Lock). The zone is locked when a write I/O is issued and unlocked when the write I/O returns. This ensures that at most one write request per S-Zone is outstanding on the SMR disk at any time, so write I/O cannot be reordered.
  • the I/O scheduler's scheduling of the SMR disk is governed by the zone lock status: write I/O requests blocked by a zone lock are skipped. That is, when write I/O requests are issued to the SMR disk, a write I/O may only be sent to an unlocked target S-Zone. If the target S-Zone is locked, the write request is not dispatched; it is returned to the dispatch queue, and the scheduler continues looking for the next write request that can be issued.
  • When multiple zones of the SMR disk are written sequentially and concurrently, the scheduler skips any write I/O request blocked by a zone lock and schedules the next write I/O request, which makes the disk head seek to another zone. In a multi-zone concurrent sequential write scenario where many write I/O requests are blocked by zone locks, the head swings back and forth repeatedly, producing a seek storm that adds a large amount of unnecessary random I/O and degrades performance. The farther apart the zones are, the greater the impact on SMR disk performance.
  • the embodiments of the present application provide an I / O request dispatching method and device, which avoids seek storms and improves the performance of SMR disks when sequentially writing and scheduling I / O requests.
  • an I/O request dispatch method may include: querying, for a write I/O request, the zone lock status of its target zone in an SMR disk; if the zone lock is in a locked state, monitoring the zone lock status; and, when the zone lock is observed to be unlocked, dispatching the write I/O request to the target zone.
  • with the I/O request dispatching method provided in this application, when the zone lock of the target zone is locked, the write I/O request is dispatched only after waiting for the lock to be released. Even if multiple write I/O requests are blocked by zone locks in a multi-zone concurrent sequential write scenario, waiting for unlocking before dispatching does not cause the magnetic head to swing back and forth, which avoids seek storms and preserves the performance of the SMR disk. Further, because SMR disks are written by sequential appending within an S-Zone, write I/O requests blocked by a zone lock are generally sequential appends to the same zone. Waiting for unlocking before dispatching makes the written I/O more sequential, minimizes head travel, reduces seek time, and improves I/O access performance on the SMR disk.
  • the write I / O request is the write I / O request to be dispatched in the scheduling queue with the highest priority. At least one I / O request is stored in the scheduling queue in order.
  • the order of the I / O requests in the scheduling queue can be arranged according to various scheduling algorithms, which is not specifically limited in this application.
  • dispatching the write I/O request to the target zone when the zone lock is observed to be unlocked may be specifically implemented as follows: when the zone lock is observed to be unlocked within a preset period of time, the write I/O request is dispatched to the target zone.
  • the I/O request dispatching method provided in this application may further include: if the zone lock remains locked throughout the preset period of time, skipping the write I/O request and dispatching other I/O requests.
  • the maximum waiting time is limited by a preset duration to avoid waiting too long. After skipping the write I / O request, the next I / O request to be dispatched is processed, and the specific process is the same, which is not described here again.
  • the I/O request dispatch method may further include: if the zone lock is in an unlocked state, dispatching the write I/O request to the target zone.
  • the I/O request dispatch method provided in this application may further include: locking the zone lock of the target zone, so that the zone lock preserves order during I/O request dispatch.
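  • The query, monitor, timed wait, and skip steps described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: the names `ZoneLock`, `acquire_if_unlocked_within`, and `dispatch_write`, the request dictionary layout, and the 50 ms default timeout are all assumptions made for the example. The sketch also takes the lock atomically with the wait, which is one possible way to keep a multithreaded version safe.

```python
import threading


class ZoneLock:
    """Per-zone write lock that a dispatcher can monitor (hypothetical helper)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._locked = False

    def is_locked(self):
        with self._cond:
            return self._locked

    def acquire_if_unlocked_within(self, timeout):
        """Wait up to `timeout` seconds for the zone to unlock, then lock it.

        Returns True if the lock was taken, False if the zone stayed locked
        for the whole preset period.
        """
        with self._cond:
            unlocked = self._cond.wait_for(lambda: not self._locked, timeout=timeout)
            if unlocked:
                self._locked = True
            return unlocked

    def release(self):
        """Called when the in-flight write I/O completes."""
        with self._cond:
            self._locked = False
            self._cond.notify_all()


def dispatch_write(request, zone_locks, issue, timeout=0.05):
    """Dispatch `request` once its target zone unlocks, or skip it on timeout.

    `request` is a dict with a "zone" key (assumed layout); `issue` is any
    callable that sends the request to the disk. Returns True if dispatched.
    """
    lock = zone_locks[request["zone"]]
    if lock.acquire_if_unlocked_within(timeout):
        issue(request)
        return True
    return False  # caller moves on to other I/O requests
```

  • A caller would invoke `dispatch_write` for the highest-priority pending write and, when it returns False, move on to the next request in the scheduling queue.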
  • an I / O request dispatching device may include a query unit, a monitoring unit, and a dispatch unit.
  • the query unit is used to query the zone lock status of the target zone in the SMR disk for the write I / O request;
  • the monitoring unit is used to monitor the zone lock status if the query unit finds that the zone lock is in a locked state;
  • the dispatch unit is used to dispatch the write I/O request to the target zone when the monitoring unit detects that the zone lock is unlocked.
  • when the zone lock of the target zone is locked, the I/O request dispatching device monitors the lock and dispatches the write I/O request only after the zone is unlocked. Even if multiple write I/O requests are blocked by zone locks in a multi-zone concurrent sequential write scenario, waiting for unlocking before dispatching does not cause the magnetic head to swing back and forth, which avoids seek storms and preserves the performance of the SMR disk. Further, because SMR disks are written by sequential appending within an S-Zone, write I/O requests blocked by a zone lock are generally sequential appends to the same zone. Waiting for unlocking before dispatching makes the written I/O more sequential, minimizes head travel, reduces seek time, and improves I/O access performance on the SMR disk.
  • the dispatching unit may be specifically configured to dispatch the write I/O request to the target zone when the monitoring unit observes that the zone lock is unlocked within a preset period of time. If the zone lock remains locked throughout the preset period of time, the dispatching unit skips the write I/O request and dispatches other I/O requests. After the write I/O request is skipped, the next I/O request to be dispatched is processed in the same way, which is not repeated here.
  • the dispatch unit may also be used: if the query unit queries that the zone lock is unlocked, dispatch the write I / O request to the target Zone.
  • the apparatus may further include a locking unit, configured to lock the zone lock of the target zone after the dispatch unit dispatches the write I/O request to the target zone.
  • the I/O request dispatching device provided in the second aspect is used to implement the I/O request dispatching method provided in the above first aspect or any possible implementation manner. For its specific implementation, refer to the first aspect or any possible implementation manner of the first aspect, which is not repeated here.
  • the present application provides an I / O request dispatching device.
  • the I / O request dispatching device may implement the functions in the foregoing method examples, and the functions may be implemented by hardware or by executing corresponding software by hardware.
  • the hardware or software includes one or more modules corresponding to the foregoing functions.
  • the structure of the I/O request dispatching device includes a processor and a transceiver, and the processor is configured to support the I/O request dispatching device in executing the corresponding functions in the foregoing method.
  • the transceiver is used to support communication between the I / O request dispatching device and other devices.
  • the I / O request dispatching device may further include a memory, which is used for coupling with the processor, and stores the program instructions and data necessary for the I / O request dispatching device.
  • the I/O request dispatching device provided in the third aspect is configured to implement the I/O request dispatching method provided in the first aspect or any possible implementation manner. For its specific implementation, refer to the first aspect or any possible implementation manner of the first aspect, which is not repeated here.
  • an I / O scheduler including the I / O request dispatching device provided in the third aspect.
  • an embodiment of the present application provides a computer storage medium including instructions that, when run on a computer, causes the computer to execute the program designed in the first aspect or any possible implementation manner of the first aspect.
  • an embodiment of the present application provides a computer program product that, when run on a computer, causes the computer to execute the program designed in the foregoing first aspect or any possible implementation manner of the first aspect.
  • FIG. 1 is a schematic structural diagram of a conventional hard disk and an SMR disk
  • Figure 2 is a schematic diagram of the division of the SMR disk zone and the WP indication
  • FIG. 3 is a schematic diagram of an I / O stack architecture of a system based on an SMR disk provided in the prior art
  • FIG. 4 is a schematic diagram of an internal structure of an I / O scheduler provided in the prior art
  • FIG. 5 is a schematic diagram of an internal structure of an I / O scheduler in a deadline scheduling algorithm provided by the prior art
  • FIG. 6 is a schematic diagram of an interaction scenario between an SMR disk zone lock and an I / O scheduler
  • FIG. 8 is a schematic diagram of a dispatching sequence of I / O requests in a scheduling queue provided by the prior art
  • FIG. 8a is a schematic architecture diagram of an I / O scheduler
  • FIG. 9 is a schematic flowchart of an I / O request processing process in a Linux general block layer
  • FIG. 10 is a schematic diagram of the overall process of the I / O scheduler
  • FIG. 11 is a schematic structural diagram of an I / O request dispatching device according to an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of an I / O request dispatching method according to an embodiment of the present application.
  • FIG. 13 is a schematic flowchart of another I / O request dispatching method according to an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a dispatching sequence of I / O requests in a dispatch queue according to an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of another I / O request dispatching apparatus according to an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of still another I / O request dispatching apparatus according to an embodiment of the present application.
  • FIG. 18 is a schematic structural diagram of another I / O request dispatching apparatus according to an embodiment of the present application.
  • SMR disks are mainly used for video, archiving, and backup. These applications typically write large files sequentially, and multi-zone concurrent sequential writing is a very common scenario on SMR disks. For example, in video surveillance, the video streams recorded by multiple cameras are written to different zones through the file system; and, to speed up recovery during data reconstruction, data restored from multiple backup disks is written to multiple zones concurrently.
  • SMR disks' sequential constraints on writing can be managed by the hard disk itself on the hardware side, or by software on the host side.
  • the former is called a "Drive Managed SMR disk", and the latter is called a "Host Managed SMR disk (HM SMR disk)".
  • the HM SMR disk exposes a new interface to the upper layers: upper-layer software may not write randomly and may only write sequentially. I/O that violates the SMR disk's write rules is rejected with an error.
  • the SMR disk is logically divided into zones of a fixed size (for example, 256MB), and each zone contains a contiguous range of LBAs.
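  • Because zones have a fixed size, the zone that holds a given LBA follows from integer division. The sketch below uses the 256MB example size from the text; the 512-byte logical block size and the function names are assumptions made for illustration.

```python
SECTOR_SIZE = 512                  # bytes per logical block (assumption)
ZONE_SIZE = 256 * 1024 * 1024      # 256MB zone, the example size in the text
SECTORS_PER_ZONE = ZONE_SIZE // SECTOR_SIZE


def zone_of(lba: int) -> int:
    """Return the index of the zone that contains the given LBA."""
    return lba // SECTORS_PER_ZONE


def zone_range(zone: int) -> range:
    """Return the contiguous LBA range covered by the given zone."""
    start = zone * SECTORS_PER_ZONE
    return range(start, start + SECTORS_PER_ZONE)
```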
  • C-Zone allows random writes, S-Zone only supports sequential writes.
  • Each S-Zone has a Write Pointer (WP), and WP locates where the next sequential write can occur.
  • WP Write Pointer
  • within an S-Zone, the SMR disk can only be written at the LBA where the WP is located; writing to any other LBA reports an error. A write that starts at the WP position automatically advances the write pointer.
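  • The write pointer rule can be modeled in a few lines. This is an illustrative model of the constraint only, not a device interface; the class name `SequentialZone` and the use of `ValueError` to stand in for the reported error are assumptions.

```python
class SequentialZone:
    """Toy model of an S-Zone and its write pointer (hypothetical class)."""

    def __init__(self, start_lba: int, length: int):
        self.start_lba = start_lba
        self.length = length
        self.wp = start_lba          # next LBA that may be written

    def write(self, lba: int, nblocks: int) -> int:
        """Accept a write only at the write pointer; return the new WP."""
        if lba != self.wp:
            raise ValueError(f"write at LBA {lba} rejected, WP is at {self.wp}")
        if self.wp + nblocks > self.start_lba + self.length:
            raise ValueError("write would cross the end of the zone")
        self.wp += nblocks           # the WP advances past the written blocks
        return self.wp
```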
  • the HM SMR disk can use the REPORT ZONES command to obtain information about Zone and WP.
  • Figure 2 illustrates the division of the SMR disk Zone and the WP indication.
  • FIG. 3 shows the architecture diagram of the system I / O stack based on the SMR disk.
  • the I/O stack includes a file system, a general block layer configured with an I/O scheduler, a small computer system interface (SCSI) subsystem, a host bus adapter (HBA) driver, and the SMR disk.
  • I / O scheduler (I / O Scheduler) is an important part of the general block layer. After the block I / O (Block I / O) generated by the file system enters the general block layer, it is dispatched by the I / O scheduler before being distributed to the underlying SMR disk.
  • the I / O scheduler manages the queue of block devices. By adjusting the order of the requests in the queue, it has a great impact on the I / O performance of the SMR disk.
  • the I/O scheduler manages the I/O request queue of the block device. When it receives I/O requests submitted by the file system, it merges and sorts them, adjusts the order of requests in the queue, and decides when to dispatch requests to the block device, in order to reduce the number of disk seeks and thereby improve disk performance.
  • After the I/O scheduler receives an I/O request from the file system, it builds a scheduling queue and then generates a dispatch queue according to the scheduling algorithm.
  • Merging refers to combining two or more I/O requests into a new request.
  • When the file system submits an I/O request to the request queue, if the queue already holds a request whose disk sectors are adjacent to those of the new request, the two requests can be merged into a single new request that operates on the two adjacent sector ranges.
  • Merging requests reduces system overhead and the number of disk addressing operations. Sorting orders all I/O requests that enter the request queue by increasing sector number (LBA), so that the magnetic head can complete the I/O operations in the disk's rotation order, reducing the number of disk seeks.
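  • Merging and sorting can be illustrated with a short sketch. It is deliberately simplified: requests are represented as `(start_sector, sector_count)` tuples, which is an assumed layout, and the request direction and size limits that a real block layer would check are ignored.

```python
def merge_and_sort(requests):
    """Sort requests by start sector and merge directly adjacent ones.

    `requests` is a list of (start_sector, sector_count) tuples
    (an assumed representation for this example).
    """
    merged = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # The new request begins where the previous one ends: merge them.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged
```

  • For example, `merge_and_sort([(100, 8), (0, 8), (8, 8), (116, 4)])` returns `[(0, 16), (100, 8), (116, 4)]`: the two requests covering sectors 0 to 15 are merged, while the request at sector 116 stays separate because sectors 108 to 115 lie between it and its predecessor.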
  • the algorithm of the scheduler is similar to the elevator operation strategy, so the I / O scheduler is also called the I / O elevator (I / O elevator).
  • FIG. 4 is a schematic diagram of an internal structure of an I / O scheduler.
  • the request queue is divided into two queues: one is the I / O scheduling queue, which is contained in the I / O scheduler; the other is the dispatch queue (dispatch queue), the I / O request queue to be dispatched to disk.
  • the I/O request submitted by the file system is first added to the scheduling queue; the I/O scheduler then transfers it to the dispatch queue according to a specific scheduling algorithm, and it is finally submitted for execution.
  • the elevator algorithm that sorts in the request queue according to the order of the requested disk sector (LBA address) is an important basic part of it.
  • the elevator algorithm focuses on maximizing I / O throughput.
  • Various schedulers additionally take into account factors such as read/write priority, I/O fairness, and avoidance of I/O starvation, and adjust the order of I/O requests further, which gives each scheduler its own characteristics and applicable scenarios.
  • the deadline scheduling algorithm, on the one hand, keeps requests ordered by the sector number of the I/O request's destination, forming a sorted queue. On the other hand, it also keeps requests ordered by arrival time, forming a first-in-first-out (First Input, First Output, FIFO) queue that is served first come, first served.
  • FIFO First Input, First Output
  • FIG. 5 illustrates the internal structure of the I / O scheduler in the deadline scheduling algorithm.
  • the deadline scheduling algorithm actually uses four queues, which are distinguished based on the type of request. That is, a write operation has a sort queue and a FIFO queue, and a read operation also has a sort queue and a FIFO queue.
  • each request has a timeout period. By default, the timeout for read requests is 500 milliseconds, and the timeout for write requests is 5 seconds. Read requests take precedence over write requests.
  • When a request arrives, the deadline scheduler adds it to the sort queue for merging and sorting, and at the same time adds it to the FIFO queue.
  • the read request is added to the read FIFO queue.
  • the write request is added to the write FIFO queue. In these two queues, the requests are completely arranged in FIFO order, that is, new requests are always put at the end of the queue.
  • Normally, requests are taken from the sort queue and pushed to the dispatch queue. If the request at the head of the write FIFO queue or read FIFO queue has timed out (that is, the current time exceeds the timeout specified for the request), the deadline I/O scheduler stops using the sort queue and instead takes requests from the FIFO queue and puts them into the dispatch queue until the queue holds no timed-out requests. In this way, the deadline I/O scheduler ensures that no request waits far beyond its timeout, which prevents request starvation.
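  • The interplay of the sort queue and the FIFO queue can be sketched for a single request direction as follows. This is a simplified illustration, not the Linux deadline scheduler: the class name `DeadlineQueue` is hypothetical, time is passed in explicitly so the behavior is deterministic, and the sketch always restarts from the lowest sector instead of continuing from the last dispatched position.

```python
import bisect
from collections import deque

READ_EXPIRE = 0.5    # seconds, default read timeout stated in the text
WRITE_EXPIRE = 5.0   # seconds, default write timeout stated in the text


class DeadlineQueue:
    """Sort queue plus FIFO queue for one direction (hypothetical class)."""

    def __init__(self, expire):
        self.expire = expire
        self.sorted = []       # (sector, deadline), kept ordered by sector
        self.fifo = deque()    # (sector, deadline), in arrival order

    def add(self, sector, now):
        entry = (sector, now + self.expire)
        bisect.insort(self.sorted, entry)
        self.fifo.append(entry)          # new requests go to the tail

    def pop_next(self, now):
        """Return the sector of the next request to dispatch, or None."""
        if not self.fifo:
            return None
        if self.fifo[0][1] < now:
            entry = self.fifo.popleft()  # head of FIFO has timed out
            self.sorted.remove(entry)
        else:
            entry = self.sorted.pop(0)   # otherwise serve in sector order
            self.fifo.remove(entry)
        return entry[0]
```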
  • each part of the I / O stack needs to support order preservation, which is achieved by adding a zone lock protection.
  • the I / O scheduler is above the zone lock, and its behavior is controlled by the zone lock status.
  • Figure 6 illustrates the interaction between the SMR disk zone lock and the I/O scheduler. As shown in Figure 6, Wi in the scheduling queue denotes a write I/O request whose destination is Zone i, and Ri denotes a read I/O request whose destination is Zone i.
  • Zone lock on Zone 0 and Zone 1 is locked; a read I / O on Zone 2 is in progress, and Zone 1 There is no I / O, and Zone 2 and Zone are unlocked.
  • the flag of the first write I / O request is W0, and the corresponding Zone 0 is locked, so it cannot be dispatched further down.
  • the fourth write I/O request is marked W2; the corresponding Zone 2 is unlocked, so it can be dispatched to the lower-layer driver.
  • the prior art scheduler implements the corresponding scheduling strategy according to the following principles: 1) dispatching any non-write request is always allowed; 2) any write to a C-Zone is allowed to be dispatched downward; 3) for a write to an S-Zone, the zone lock is checked first. If the zone is not locked, the target zone is locked and the write is allowed to proceed; if the zone is locked, the write request is skipped and an attempt is made to dispatch the next request in the queue.
  • when a write I/O request is issued to an SMR disk, write I/O is only allowed to be sent to an unlocked target S-Zone. If the target S-Zone is locked, the write request is not dispatched; it is returned to the dispatch queue, and the scheduler continues looking for the next write request that can be issued. If no write request in the write I/O scheduling queue of the same scheduling batch can be issued, a read request is allowed to be issued, even if the write scheduling batch containing the write request has not yet been fully processed.
  • The specific flow of the I/O request dispatching method provided in the prior art is shown in FIG. 7, and the flow may include the following steps:
  • the I / O scheduler determines whether the scheduling queue is empty.
  • the I / O scheduler selects an I / O request to be dispatched.
  • the I / O request to be dispatched is the first I / O request in the scheduling queue.
  • the I / O scheduler determines whether the I / O request to be dispatched is a write request.
  • the I / O scheduler determines whether the target area of the I / O request to be dispatched is S-Zone.
  • the I / O scheduler determines whether the zone lock of the target area of the I / O request to be dispatched is locked.
  • the I / O scheduler skips the I / O request to be dispatched and attempts to process the next I / O request in the scheduling queue.
  • the I / O scheduler locks a target area of the I / O request to be dispatched.
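  • The prior art flow above amounts to a scan for the first request that is allowed to go. The sketch below follows the three principles stated earlier; the function name `select_prior_art`, the request dictionaries, and the use of sets for locked zones and S-Zones are assumptions made for illustration.

```python
def select_prior_art(queue, locked_zones, sequential_zones):
    """Return the index of the first dispatchable request, or None.

    `queue` is a list of dicts with "op" ("read" or "write") and "zone"
    keys (assumed layout). `locked_zones` is updated in place when a
    write to an S-Zone is selected.
    """
    for index, request in enumerate(queue):
        if request["op"] != "write":
            return index                        # non-writes are always allowed
        if request["zone"] not in sequential_zones:
            return index                        # writes to a C-Zone are allowed
        if request["zone"] not in locked_zones:
            locked_zones.add(request["zone"])   # lock the target zone, then go
            return index
        # Target S-Zone is locked: skip this write and try the next request.
    return None
```

  • Note that a skipped write stays where it is in the queue, so the next selected request may target a distant zone.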
  • Figure 8 illustrates the dispatch sequence of I / O requests in a dispatch queue.
  • In a write scheduling batch, there are 19 write I/O requests in the scheduling queue waiting to be scheduled.
  • the numbers in the rectangles represent the dispatch order of the write I/O requests under the prior art scheduler scheme.
  • the prior art scheduler makes the head jump continuously when multiple zones are written concurrently: after issuing one write I/O to zone 1, it immediately seeks to zone 2 and issues one write I/O there, then skips to the next zone. The farther apart the zones are, the greater the performance impact.
  • the scheduler skips the write I/O requests blocked by the zone lock and schedules the next write I/O request, which makes the disk head seek to another zone. In a multi-zone concurrent sequential write scenario where many write I/O requests are blocked by zone locks, the head swings back and forth repeatedly, producing a seek storm that adds a large amount of unnecessary random I/O and degrades performance. The farther apart the zones are, the greater the impact on SMR disk performance.
  • this application proposes a method for dispatching I / O requests.
  • the basic principle is: when a write I / O request is blocked by a zone lock, it is dispatched after waiting for unlocking to avoid seek storms and improve performance.
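  • The difference between skipping and waiting can be seen in a toy model of dispatch order. The sketch assumes that the zone of the most recently dispatched write is still locked when the next request is chosen and that every earlier write has completed; this timing assumption, the function names, and the use of bare zone numbers as requests are illustrative only, and the counts are not performance measurements.

```python
def dispatch_order(queue, wait_for_unlock):
    """Return the order in which zone-tagged writes reach the disk.

    `queue` is a list of target zone numbers in scheduling order.
    Toy timing assumption: only the zone written last is still locked.
    """
    pending = list(queue)
    order = []
    locked = None
    while pending:
        if pending[0] == locked and not wait_for_unlock:
            # Skip policy: take the first request whose zone is not locked;
            # if every pending request is blocked, fall back to the head.
            index = next((i for i, z in enumerate(pending) if z != locked), 0)
        else:
            index = 0   # head is unlocked, or we wait until it unlocks
        zone = pending.pop(index)
        order.append(zone)
        locked = zone
    return order


def zone_switches(order):
    """Count how often consecutive writes target different zones."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)
```

  • Under these assumptions, the queue `[1, 1, 1, 2, 2, 2]` is dispatched as `[1, 2, 1, 2, 1, 2]` with skipping (5 zone switches) and as `[1, 1, 1, 2, 2, 2]` with waiting (1 zone switch).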
  • the I/O request dispatching method provided in this application is applied to the system I/O stack architecture of the SMR disk shown in FIG. 3. Specifically, it is applied while the I/O scheduler in that architecture dispatches I/O requests to the SMR disk.
  • FIG. 8a illustrates the architecture of the I / O scheduler involved in the solution provided in this application.
  • the architecture of the I/O scheduler involved in the solution provided by this application consists of several groups of scheduling queues, a dispatch queue, a zone lock table, and an elevator object.
  • the scheduling queues are a series of queues maintained by the I/O scheduler, used to sort requests or perform I/O merging; the dispatch queue temporarily stores the I/O requests to be issued to the disk driver.
  • the SMR disk processes the I/O in the dispatch queue in order; the elevator object is the core of the I/O scheduler, responsible for picking suitable I/O requests from the scheduling queues according to certain rules and inserting them at the tail of the dispatch queue.
  • the zone lock table is a component unique to the SMR disk I/O scheduler. A zone is locked when a write I/O is issued and unlocked when the write I/O completes, which ensures that at most one write I/O is executing in each S-Zone at any time.
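  • The bookkeeping role of the zone lock table can be sketched as a set of zones with a write in flight. The class and method names below are hypothetical, and the sketch is single-threaded; it only shows the lock-on-issue, unlock-on-completion rule.

```python
class ZoneLockTable:
    """Tracks which S-Zones have a write I/O in flight (hypothetical class)."""

    def __init__(self):
        self._in_flight = set()

    def is_locked(self, zone: int) -> bool:
        return zone in self._in_flight

    def try_lock(self, zone: int) -> bool:
        """Lock the zone when a write is issued; fail if one is in flight."""
        if zone in self._in_flight:
            return False
        self._in_flight.add(zone)
        return True

    def unlock(self, zone: int) -> None:
        """Unlock the zone when its write I/O completes."""
        self._in_flight.discard(zone)
```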
  • zone lock table can be configured inside the I / O scheduler or other positions in the system I / O stack architecture of the SMR disk shown in FIG. 3, which is not specifically described in the embodiment of the present application.
  • FIG. 8a is only schematic and does not constitute a specific limit.
  • the upper layer submits a general block layer request (bio) to the block I / O subsystem.
  • the block I / O subsystem provides an interface for receiving a general block layer request (bio) to an upper layer. After the upper layer (file system, etc.) constructs the bio, it is submitted to the general block layer for processing.
  • bio general block layer request
  • a request queue is constructed from the bio.
  • Merge combines multiple I / Os accessed to consecutive sectors of the disk into one I / O. Sorting means that the I / O is sorted according to the sector number accessed to the disk, so as to make the magnetic head move in one direction.
  • the lower layer processes the requests in the SCSI block device request queue one by one.
  • the block I / O subsystem dispatches the I / O requests in the request queue to the underlying SCSI subsystem.
  • the I / O scheduler plays an important role in S902 and S903.
  • the overall process of the I / O scheduler can specifically include:
  • the scheduler establishes an associated I / O scheduling queue for a request queue.
  • the SCSI block device request queue is allocated to establish an associated I / O scheduling queue.
  • the scheduler determines whether bio can be incorporated into the request.
  • after receiving the bio, the scheduler tries to select from the request queue a request into which this bio can be merged, attempting both backward and forward merges.
  • the scheduler adds the combined request to the I / O scheduling queue and sorts it.
  • the block layer adds it to the I / O scheduling queue, and at the same time sorts it according to the requirements of the elevator object to maintain the orderliness of the I / O scheduling queue.
  • the scheduler dispatches a request from the I / O scheduling queue to the dispatch queue.
  • solution provided in this application may be applied to the process of S1004, and the solution provided in this application is described in detail below.
  • first and second in the description and claims of the embodiments of the present application are used to distinguish different objects, rather than to describe a specific order of the objects.
  • first instruction information, the second instruction information, and the like are used to distinguish different instruction information, and are not used to describe a specific order of the information.
  • words such as “exemplary” or “for example” are used to present examples, illustrations, or explanations. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the use of words such as “exemplary” or “for example” is intended to present related concepts in a concrete manner to facilitate understanding.
  • an embodiment of the present application provides an I / O request dispatching device.
  • the I / O request dispatching device may be configured in the scheduler. Therefore, the I / O request dispatching device may be a functional unit in the scheduler that executes the I / O request dispatching method provided in this application.
  • the scheduler may also be called an I / O request dispatching device, which is not specifically limited in this embodiment of the present application.
  • FIG. 11 shows an I / O request dispatching device 110 related to the embodiments of the present application.
  • the I / O request dispatching device 110 may be part or all of a scheduler in the architecture shown in FIG. 3. As shown in FIG. 11, the I / O request dispatching device 110 may include a processor 1101, a memory 1102, and a transceiver 1103.
  • each component of the I / O request dispatching device 110 is specifically described below with reference to FIG. 11:
  • the memory 1102 may be a volatile memory, for example, random-access memory (RAM); or a non-volatile memory, for example, read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above types of memory. It is used to store program code and configuration files that can implement the method of this application.
  • the processor 1101 is the control center of the I / O request dispatching device 110. It may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, for example, one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
  • the processor 1101 may execute various functions of the I / O request dispatching device 110 by running or executing software programs and / or modules stored in the memory 1102 and calling data stored in the memory 1102.
  • the transceiver 1103 is used for the I / O request dispatching device 110 to interact with other units. Exemplarily, the transceiver 1103 may be a transmit / receive port of the I / O request dispatching device 110.
  • the processor 1101 executes the following functions by running or executing software programs and / or modules stored in the memory 1102 and calling data stored in the memory 1102:
  • query the zone lock status of the target zone of the write I / O request in the SMR disk; if the zone lock is locked, monitor the zone lock status; when the zone lock is observed to be unlocked, dispatch the write I / O request to the target zone.
  • an embodiment of the present application provides an I / O request dispatching method, which is applied to a process in which a system I / O stack of an SMR disk sends a write I / O request to the SMR disk.
  • the process of the I / O request dispatching method described in the embodiment of the present application is executed by the I / O request dispatching device, and the I / O request dispatching device may be deployed in the scheduler as a functional unit or a chip.
  • when this document states that the I / O request dispatching device performs an operation, the operation can be understood as being performed by the I / O request dispatching device itself, or as being performed by the scheduler to which the I / O request dispatching device belongs. This is not specifically limited; the description below simply states that the scheduler performs the operation.
  • the I / O request dispatching method may include:
  • the I / O request dispatching device queries the zone lock status of the target zone in the SMR disk for the write I / O request.
  • the write I / O request is the highest-priority write I / O request awaiting dispatch in the scheduling queue.
  • the write I / O request carries the identifier of the target zone accessed by the write I / O request.
  • the scheduling queue is generated by the scheduler in which the I / O request dispatching device is located: after receiving I / O requests from the file system, the scheduler configures the request queue and then generates the scheduling queue by merging, sorting, and other methods.
  • the embodiment of the present application does not limit the specific process when the I / O scheduler generates a scheduling queue.
  • the I / O scheduler uses the scheduling algorithm to send I / O requests in the scheduling queue to the dispatch queue, and the SMR disk executes the I / O requests in the dispatch queue in turn.
  • the scheduling algorithm used by the scheduler is not specifically limited.
  • the zone lock table stores the zone lock status of each zone in the SMR disk.
  • when a write I / O is issued to a zone, the corresponding zone lock is locked and enters the locked state.
  • when the write I / O returns, the corresponding zone lock is unlocked.
  • because the write I / O request carries the identifier of the target zone that it accesses, the zone lock table is queried according to this identifier to obtain the zone lock status of the target zone of the write I / O request in the SMR disk.
  • the zone identifier is the information that uniquely identifies the zone in the SMR disk. Any information that can be used to uniquely identify the zone can be used as the zone identifier.
  • the identifier may be a character string, a number, or the like, and the content of the zone identifier is not specifically limited in this embodiment of the present application.
  • Table 1 is only an example description of the zone lock table, and does not constitute a specific limitation on the form and content of the zone lock table. In actual applications, the form and specific content of the zone lock table can be configured according to requirements, which is not specifically limited in the embodiment of this application.
  • the zone lock table can be configured inside the scheduler.
  • the zone lock table can also be configured at other positions of the general block layer in the system I / O stack of the SMR disk, which is not specifically limited in this embodiment of the present application.
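  • as Table 1 itself is not reproduced here, the following is only a minimal sketch of what a zone lock table could look like, assuming string zone identifiers and a plain in-memory mapping. The class name `ZoneLockTable` and its methods are hypothetical and are not taken from the application.

```python
class ZoneLockTable:
    """Hypothetical zone lock table: zone identifier -> lock status."""

    def __init__(self, zone_ids):
        # Every zone starts in the unlocked state.
        self._locked = {zone_id: False for zone_id in zone_ids}

    def is_locked(self, zone_id):
        # Query the zone lock status by the identifier carried in the request.
        return self._locked[zone_id]

    def lock(self, zone_id):
        # A write I/O has been issued to the zone.
        self._locked[zone_id] = True

    def unlock(self, zone_id):
        # The write I/O to the zone has returned.
        self._locked[zone_id] = False


table = ZoneLockTable(["zone-0", "zone-1", "zone-2"])
table.lock("zone-1")                        # write issued to zone-1
state_during_write = table.is_locked("zone-1")
table.unlock("zone-1")                      # write returned
state_after_write = table.is_locked("zone-1")
```

  • a mapping keyed by zone identifier keeps the status query to a single lookup; a real implementation would also need to protect the table against concurrent access.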
  • the process of S1201 may be performed by the processor 1101 in the I / O request dispatching device 110 illustrated in FIG. 11. After S1201 is executed, if the query shows that the zone lock of the target zone of the write I / O request is locked, S1202 is executed.
  • the I / O request dispatch device monitors the state of the zone lock.
  • monitoring the status of the zone lock means querying the zone lock table in real time to obtain the status of the zone lock.
  • the I / O request dispatching device dispatches the write I / O request to the target zone.
  • when the zone lock is observed to be unlocked during the monitoring process of S1202, it indicates that the write operation on the zone has ended, and the write I / O request currently waiting can be dispatched to the target zone for the write operation.
  • the dispatching process may refer to the system I / O stack architecture of the SMR disk illustrated in FIG. 3.
  • the I / O scheduler sends the dispatched I / O request to the SCSI subsystem, which then dispatches it to the SMR disk through the HBA driver.
  • the embodiment of this application does not repeat the details of the dispatching process.
  • the I / O request dispatching device keeps monitoring the status of the zone lock until the zone lock is unlocked, and then executes S1203 to dispatch the write I / O request to the target zone.
  • the I / O request dispatching device monitors the status of the zone lock, and dispatches the write I / O request to the target zone when the zone lock is unlocked within a preset period of time.
  • S1203 is specifically implemented as follows: When the zone lock is unlocked within a preset period of time, the I / O request dispatching device dispatches the write I / O request to the target zone.
  • the length of the preset duration may be configured according to actual requirements, which is not specifically limited in this embodiment of the present application.
  • the I / O request dispatching device dispatches the write I / O request to the target zone.
  • the I / O request dispatching method provided in the embodiment of this application may further include: if the zone lock remains locked throughout the preset period of time, the I / O request dispatching device skips the write I / O request and dispatches other I / O requests.
  • the I / O request dispatching method of S1201 to S1203 describes the dispatching process for one write I / O request.
  • the I / O request dispatching device can execute the process from S1201 to S1203 for each write I / O request; the execution process is the same and is not repeated here.
  • with the I / O request dispatching method provided in this application, when the zone lock of the target zone is locked, the write I / O request is dispatched only after the lock is observed to be unlocked. Even if multiple write I / O requests are blocked by zone locks in a multi-zone concurrent sequential write scenario, waiting for unlocking before dispatching does not cause the magnetic head to swing back and forth, which avoids seek storms and thus ensures the performance of the SMR disk. Further, because data is appended sequentially within an S-Zone of the SMR disk, write I / O requests blocked by a zone lock are generally sequential appends to the same zone. Waiting for unlocking before dispatching makes the writes more sequential, minimizes the moving distance of the magnetic head, optimizes the seek time, and improves the I / O access performance of the SMR disk.
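  • the wait-then-dispatch behaviour with a preset period of time can be sketched as follows. This is only an illustration under stated assumptions: the class `ZoneLocks`, the `send` callback, and the use of a condition variable in place of real-time polling of the zone lock table are all hypothetical choices, and the zone is locked after dispatch while the internal mutex is still held so that two dispatchers cannot both see the zone as unlocked.

```python
import threading


class ZoneLocks:
    """Hypothetical zone lock state shared by the dispatcher and the completion path."""

    def __init__(self):
        self._locked = set()                 # identifiers of locked zones
        self._cond = threading.Condition()

    def lock(self, zone_id):
        with self._cond:
            self._locked.add(zone_id)

    def unlock(self, zone_id):
        # Called when the in-flight write to the zone returns.
        with self._cond:
            self._locked.discard(zone_id)
            self._cond.notify_all()

    def dispatch_write(self, zone_id, send, preset_duration):
        """Return True if the write was dispatched, False if it was skipped."""
        with self._cond:
            # Query the lock; if it is locked, monitor it until it is
            # unlocked or the preset duration elapses.
            unlocked = self._cond.wait_for(
                lambda: zone_id not in self._locked, timeout=preset_duration
            )
            if not unlocked:
                return False                 # still locked: skip this request
            send(zone_id)                    # dispatch to the target zone
            self._locked.add(zone_id)        # lock the target zone
            return True


locks = ZoneLocks()
sent = []
locks.lock(1)                                            # a write to zone 1 is in flight
threading.Timer(0.05, locks.unlock, args=(1,)).start()   # it returns after 50 ms
ok = locks.dispatch_write(1, sent.append, preset_duration=1.0)
# Zone 1 is now locked by the write just dispatched and is never unlocked,
# so a second write times out and is skipped.
timed_out = locks.dispatch_write(1, sent.append, preset_duration=0.05)
```

  • the first call waits about 50 ms and then dispatches; the second call returns `False` once the preset duration elapses, at which point the caller would move on to other I / O requests.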
  • after the I / O request dispatching device queries, in S1201, the zone lock status of the target zone of the write I / O request in the SMR disk, the I / O request dispatching method provided in the embodiment of the present application may further include S1204.
  • the I / O request dispatching device dispatches the write I / O request to the target zone.
  • when the query finds that the zone lock is unlocked, it indicates that no write operation is in progress on the zone, and the write I / O request currently waiting can be sent directly to the target zone for the write operation.
  • the I / O request dispatch method provided in the embodiment of the present application may further include S1205.
  • the I / O request dispatching device locks the zone lock of the target zone.
  • the I / O request dispatching method shown in FIG. 12 and FIG. 13 describes how the I / O request dispatching device processes one write I / O request in the dispatch queue. The I / O request dispatching device processes every write I / O request in the dispatch queue in the same way, so the process is not repeated for each request here.
  • the dispatch queue may also include read I / O requests and I / O requests whose target zone is a C-Zone. Therefore, in actual applications, other steps can be included before S1201 to implement the dispatch of all I / O requests.
  • the I / O request dispatching method provided in this application may further include other steps on the basis of FIG. 12 or FIG. 13, as illustrated in FIG. 14. These other steps may be performed by the I / O request dispatching device or by the I / O scheduler in which the I / O request dispatching device is located. This is not specifically limited; they are described herein only as being executed by the scheduler in which the I / O request dispatching device is located (abbreviated in FIG. 14 and the related description as being executed by the I / O scheduler).
  • the I / O scheduler determines whether the scheduling queue is empty.
  • the I / O scheduler selects an I / O request to be dispatched.
  • the I / O scheduler determines whether the I / O request to be dispatched is a write request.
  • the I / O scheduler determines whether the target zone of the I / O request to be dispatched is an S-Zone.
  • the I / O scheduler determines whether the zone lock of the target zone of the I / O request to be dispatched is locked.
  • the I / O scheduler monitors the status of the zone lock.
  • the I / O scheduler executes S1401 and executes S1401.
  • the I / O scheduler skips the write I / O request and executes S1401 to try the next pending I / O request.
  • the I / O scheduler dispatches the write I / O request to the target zone and locks the zone lock of the target zone.
  • S1401 is executed to process the next I / O request.
  • the I / O scheduler dispatches the I / O request to be dispatched downward.
  • S1401 is executed to process the next I / O request.
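  • the overall loop described by the steps above can be sketched as follows. This is a simplified illustration, not the flow of FIG. 14 itself: the function `dispatch_all`, the tuple layout of a request, and the callbacks (`is_locked`, `wait_for_unlock`, `lock`, `send`) are hypothetical, and waiting for the preset period of time is simulated rather than timed.

```python
from collections import deque


def dispatch_all(queue, is_locked, wait_for_unlock, lock, send):
    """Process (name, is_write, zone, is_s_zone) requests until the queue is empty.

    Returns the names of the write requests that were skipped.
    """
    skipped = []
    while queue:                                    # is the scheduling queue empty?
        name, is_write, zone, is_s_zone = queue.popleft()   # select a request
        if is_write and is_s_zone:                  # write request to an S-Zone
            if is_locked(zone) and not wait_for_unlock(zone):
                skipped.append(name)                # still locked after the preset time
                continue                            # try the next pending request
            send(name)                              # dispatch to the target zone
            lock(zone)                              # lock the zone lock of the target zone
        else:
            send(name)                              # reads and C-Zone writes go straight down
    return skipped


locked = {0, 1}             # zones 0 and 1 each have a write in flight
completes_in_time = {0}     # assumption: only zone 0 unlocks within the preset time


def wait_for_unlock(zone):
    # Simulated monitoring: report whether the zone unlocks in time.
    if zone in completes_in_time:
        locked.discard(zone)
        return True
    return False


sent = []
queue = deque([
    ("W0", True, 0, True),
    ("W1", True, 1, True),
    ("R2", False, 2, True),
    ("W2", True, 2, True),
])
skipped = dispatch_all(queue, locked.__contains__, wait_for_unlock, locked.add, sent.append)
```

  • in this run `W0` is dispatched after its zone unlocks, `W1` is skipped because zone 1 stays locked, and `R2` and `W2` are dispatched directly.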
  • when the scheduling queue shown in FIG. 8 is dispatched using the I / O request dispatching method provided in this application, the dispatching order is as shown in FIG. 15.
  • the numbers in the rectangle represent the prior art.
  • the I / O scheduler device includes a hardware structure and / or a software module corresponding to each function.
  • this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is performed by hardware or computer software-driven hardware depends on the specific application of the technical solution and design constraints. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
  • the functional part of the I / O scheduler that executes the I / O request dispatching method provided in this application is called an I / O request dispatching device.
  • the I / O request dispatching device may be an I / O request dispatching device.
  • the I / O request dispatching device can be equivalent to the I / O scheduler, or the I / O request dispatching device can also be deployed in the I / O scheduler to support the I / O scheduler in executing the I / O request dispatching method provided in this application.
  • the I / O scheduler may be divided into functional modules according to the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules may be implemented in the form of hardware or software functional modules. It should be noted that the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
  • dividing the functional modules of the I / O scheduler is equivalent to dividing the functional modules of the I / O request dispatching device; or, when the I / O request dispatching device is part or all of the I / O scheduler, dividing the functional modules of the I / O request dispatching device is equivalent to dividing the functional modules of the I / O scheduler.
  • FIG. 16 shows a possible structure diagram of the I / O request dispatching device in the I / O scheduler involved in the above embodiment.
  • the I / O request dispatching device 160 may include a query unit 1601, a monitoring unit 1602, and a dispatch unit 1603.
  • the query unit 1601 is configured to execute the process S1201 in FIG. 12 or FIG. 13;
  • the monitoring unit 1602 is configured to execute the process S1202 in FIG. 12 or FIG. 13;
  • the dispatch unit 1603 is configured to execute the processes S1203 and S1204 in FIG. 12 or FIG. 13.
  • all relevant content of each step involved in the above method embodiment can be referred to the functional description of the corresponding functional module, which will not be repeated here.
  • the I / O request dispatching device 160 may further include a locking unit 1604.
  • the locking unit 1604 is configured to perform process S1205 in FIG. 13.
  • FIG. 18 shows a possible structure diagram of the I / O request dispatching device in the I / O scheduler involved in the above embodiment.
  • the I / O request dispatching device 180 may include a processing module 1801 and a communication module 1802.
  • the processing module 1801 is configured to control and manage the operation of the I / O request dispatching device 180.
  • the processing module 1801 is configured to support, through the communication module 1802, the I / O request dispatching device 180 in executing the processes S1201 to S1205 in FIG. 12 or FIG. 13.
  • the I / O request dispatching device 180 may further include a storage module 1803 for storing the program code and data of the I / O request dispatching device 180.
  • the processing module 1801 may be the processor 1101 in the physical structure of the I / O request dispatching device 110 shown in FIG. 11, and may be a processor or a controller. For example, it can be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA, or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It may implement or execute various exemplary logical blocks, modules, and circuits described in connection with the present disclosure.
  • the processor 1801 may also be a combination that implements computing functions, such as a combination including one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • the communication module 1802 may be a transceiver 1103 in the physical structure of the I / O request dispatching device 110 shown in FIG. 11.
  • the communication module 1802 may be a communication port, or may be a transceiver, a transceiver circuit, or a communication interface. Alternatively, the communication interface may implement communication with other devices through the component having a transmitting and receiving function.
  • the above-mentioned components having a transmitting and receiving function may be implemented by an antenna and / or a radio frequency device.
  • the storage module 1803 may be the memory 1102 in the physical structure of the I / O request dispatching device 110 shown in FIG. 11.
  • when the processing module 1801 is a processor, the communication module 1802 is a transceiver, and the storage module 1803 is a memory, the I / O request dispatching device 180 shown in FIG. 18 in the embodiment of the present application may be the I / O request dispatching device 110 shown in FIG. 11.
  • the I / O request dispatching device 160 or the I / O request dispatching device 180 provided in the embodiment of the present application may be used to implement the function of the I / O scheduler in the method implemented by the embodiments of the present application.
  • Only the parts related to the embodiment of the present application are shown; for specific technical details that are not disclosed, refer to the embodiments of the present application.
  • the embodiment of the present application also provides an I / O scheduler, which includes the I / O request dispatching device provided by any of the foregoing embodiments.
  • the steps of the method or algorithm described in combination with the disclosure of this application may be implemented in a hardware manner, or may be implemented in a manner in which a processor executes software instructions.
  • Software instructions can be composed of corresponding software modules.
  • Software modules can be stored in RAM, flash memory, ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disks, mobile hard disks, CD-ROMs, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may also be an integral part of the processor.
  • the processor and the storage medium may reside in an ASIC.
  • the ASIC can be located in a core network interface device.
  • the processor and the storage medium can also exist as discrete components in the core network interface device.
  • the memory may be coupled to the processor, for example, the memory may exist independently and be connected to the processor through a bus.
  • the memory can also be integrated with the processor.
  • the memory may be configured to store application program code that executes the technical solution provided by the embodiment of the present application, and is controlled and executed by a processor.
  • the processor is configured to execute application program code stored in the memory, so as to implement the technical solution provided in the embodiment of the present application.
  • An embodiment of the present application further provides a chip system.
  • the chip system includes a processor, and is configured to implement a technical method of a communication device according to an embodiment of the present invention.
  • the chip system further includes a memory for storing program instructions and / or data necessary for the communication device in the embodiment of the present invention.
  • the chip system further includes a memory for the processor to call the application program code stored in the memory.
  • the chip system may be composed of one or more chips, and may also include chips and other discrete devices, which are not specifically limited in the embodiments of the present application.
  • the functions described in this application may be implemented by hardware, software, firmware, or any combination thereof.
  • when implemented in software, the functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
  • Computer-readable media includes computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
  • a storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may be physically included separately, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the above integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium.
  • the above software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute some steps of the method described in the embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, mobile hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks or compact discs, and other media that can store program code.


Abstract

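An I/O request dispatching method and device, relating to the field of computer technologies, which avoid seek storms when write I/O requests are scheduled sequentially and concurrently and thereby improve the performance of an SMR disk. The method includes: an I/O request dispatching device queries the zone lock status of the target zone, in the SMR disk, of a write I/O request (S1201); if the zone lock is in the locked state, the I/O request dispatching device monitors the status of the zone lock (S1202); and when the zone lock is observed to be unlocked, the I/O request dispatching device dispatches the write I/O request to the target zone (S1203). The method is used to dispatch I/O requests.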

Description

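I/O request dispatching method and device
Technical Field
This application relates to the field of computer technologies, and in particular, to an input/output (I/O) request dispatching method and device.
Background
In the big-data era, data volume grows exponentially and is expected to double every two years. As a cost-effective storage medium, the hard disk drive (HDD) remains a mainstay of today's digital storage world. However, conventional hard disks are approaching their storage density limit, and the storage industry has introduced shingled magnetic recording (SMR) disk technology to overcome this limit and increase capacity.
FIG. 1 illustrates the structures of a conventional hard disk and an SMR disk. As FIG. 1 shows, an SMR disk uses shingled recording: by partially overlapping the tracks within a given area one after another, it achieves higher areal density, larger disk capacity, and a lower price per unit of capacity. Along with these benefits, however, the SMR disk imposes a sequentiality constraint on the I/O behavior of upper-layer applications, that is, it supports only sequential writes. An SMR disk is logically divided into zones of a fixed size, and each zone contains a contiguous range of logical block addresses (LBAs). Zones fall into two types: conventional zones (C-Zones), which allow random writes, and sequential zones (S-Zones), which support only sequential writes. The sequentiality constraint on SMR disk I/O means that writes within a single S-Zone must be sequential, whereas multiple S-Zones can be written concurrently.
In the I/O stack, the I/O scheduler manages the request queue of a block device; by merging I/O requests and sorting I/O, it reduces the number of disk seeks and thus improves disk performance. When the I/O scheduler receives an I/O request, it first adds the request to the I/O scheduling queue and then moves the requests in the scheduling queue to the dispatch queue according to a specific scheduling algorithm, after which they are finally submitted for execution.
To support sequential writes within an S-Zone of an SMR disk, every part of the I/O stack must preserve ordering. To prevent I/O reordering, a lock used to preserve write I/O order, called a zone lock, is introduced for each S-Zone of the SMR disk. The lock is taken when a write I/O is issued and released when the write I/O returns, which ensures that at any moment at most one write request per S-Zone has been issued to the SMR disk and that write I/O within an S-Zone is never reordered. The scheduling behavior of the I/O scheduler for the SMR disk is therefore governed by the zone lock status: it schedules the SMR disk by skipping write I/O requests blocked by a zone lock. That is, when write I/O requests are issued to the SMR disk, write I/O may be issued only to an unlocked target S-Zone; if the target S-Zone is locked, the write request is not dispatched but is put back into the scheduling queue, and the scheduler continues to look for the next write request that can be issued.
When multiple zones of an SMR disk are written sequentially and concurrently and a write I/O request is blocked by a zone lock, the scheduler skips the blocked request and schedules the next write I/O request, which makes the disk head seek to another zone. In a multi-zone concurrent sequential write scenario in which multiple write I/O requests are blocked by zone locks, this causes the head to swing back and forth repeatedly, producing a seek storm, adding a large amount of unnecessary random I/O, and degrading performance. The farther apart the zones are, the more pronounced the impact on SMR disk performance.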
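Summary
Embodiments of this application provide an I/O request dispatching method and device that avoid seek storms when write I/O requests are scheduled sequentially and concurrently, and thereby improve the performance of an SMR disk.
To achieve the foregoing objective, the embodiments of this application adopt the following technical solutions:
According to a first aspect, an I/O request dispatching method is provided. The method may include: querying the zone lock status of the target zone, in an SMR disk, of a write I/O request; if the zone lock is in the locked state, monitoring the status of the zone lock; and when the zone lock is observed to be unlocked, dispatching the write I/O request to the target zone.
With the I/O request dispatching method provided in this application, when the zone lock of the target zone is locked, the write I/O request is dispatched only after the lock is observed to be unlocked. Even if multiple write I/O requests are blocked by zone locks in a multi-zone concurrent sequential write scenario, waiting for the unlock before dispatching does not cause the head to swing back and forth, so seek storms are avoided and SMR disk performance is preserved. Furthermore, because data is appended sequentially within an S-Zone of an SMR disk, write I/O requests blocked by a zone lock are generally sequential appends to the same zone. Waiting for the unlock before dispatching makes the writes more sequential, minimizes head movement, optimizes seek time, and improves I/O access performance for the SMR disk.
The write I/O request is the highest-priority write I/O request awaiting dispatch in the scheduling queue. The scheduling queue stores at least one I/O request in order; the order of the I/O requests in the scheduling queue may be determined by any of various scheduling algorithms, which is not specifically limited in this application.
With reference to the first aspect, in a possible implementation, dispatching the write I/O request to the target zone when the zone lock is observed to be unlocked may specifically be implemented as: dispatching the write I/O request to the target zone when the zone lock is observed to be unlocked within a preset duration. Correspondingly, the I/O request dispatching method provided in this application may further include: if the zone lock remains locked throughout the preset duration, skipping the write I/O request and dispatching other I/O requests. In this implementation, the preset duration bounds the maximum waiting time and avoids excessively long waits. After the write I/O request is skipped, the next I/O request awaiting dispatch is processed in the same way; details are not repeated here.
With reference to the first aspect or any of the foregoing possible implementations, in another possible implementation, after the zone lock status of the target zone, in the SMR disk, of the write I/O request is queried, the I/O request dispatching method provided in this application may further include: if the zone lock is in the unlocked state, dispatching the write I/O request to the target zone.
With reference to the first aspect or any of the foregoing possible implementations, in another possible implementation, after the write I/O request is dispatched to the target zone, the I/O request dispatching method provided in this application may further include: locking the zone lock of the target zone, so that the zone lock preserves ordering in the I/O request dispatching process.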
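According to a second aspect, an I/O request dispatching device is provided. The device may include a query unit, a monitoring unit, and a dispatch unit. The query unit is configured to query the zone lock status of the target zone, in an SMR disk, of a write I/O request; the monitoring unit is configured to monitor the status of the zone lock if the query unit finds that the zone lock is in the locked state; and the dispatch unit is configured to dispatch the write I/O request to the target zone when the monitoring unit observes that the zone lock is unlocked.
With the I/O request dispatching device provided in this application, when the zone lock of the target zone is locked, the write I/O request is dispatched only after the lock is observed to be unlocked. Even if multiple write I/O requests are blocked by zone locks in a multi-zone concurrent sequential write scenario, waiting for the unlock before dispatching does not cause the head to swing back and forth, so seek storms are avoided and SMR disk performance is preserved. Furthermore, because data is appended sequentially within an S-Zone of an SMR disk, write I/O requests blocked by a zone lock are generally sequential appends to the same zone. Waiting for the unlock before dispatching makes the writes more sequential, minimizes head movement, optimizes seek time, and improves I/O access performance for the SMR disk.
With reference to the second aspect, in a possible implementation, the dispatch unit may specifically be configured to: dispatch the write I/O request to the target zone when the monitoring unit observes within a preset duration that the zone lock is unlocked; and, if the zone lock remains locked throughout the preset duration, skip the write I/O request and dispatch other I/O requests. After the write I/O request is skipped, the next I/O request awaiting dispatch is processed in the same way; details are not repeated here.
With reference to the second aspect or any of the foregoing possible implementations, in another possible implementation, the dispatch unit may further be configured to: dispatch the write I/O request to the target zone if the query unit finds that the zone lock is in the unlocked state.
With reference to the second aspect or any of the foregoing possible implementations, in another possible implementation, the device may further include a locking unit configured to lock the zone lock of the target zone after the dispatch unit dispatches the write I/O request to the target zone.
It should be noted that the I/O request dispatching device provided in the second aspect is used to implement the I/O request dispatching method provided in the first aspect or any of its possible implementations. For its specific implementation, refer to the first aspect or any possible implementation of the first aspect; details are not repeated here.
According to a third aspect, this application provides an I/O request dispatching device that can implement the functions in the foregoing method examples. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing functions.
With reference to the third aspect, in a possible implementation, the structure of the I/O request dispatching device includes a processor and a transceiver. The processor is configured to support the I/O request dispatching device in performing the corresponding functions in the foregoing method. The transceiver is configured to support communication between the I/O request dispatching device and other devices. The I/O request dispatching device may further include a memory, which is coupled to the processor and stores the program instructions and data necessary for the I/O request dispatching device.
It should be noted that the I/O request dispatching device provided in the third aspect is used to implement the I/O request dispatching method provided in the first aspect or any of its possible implementations. For its specific implementation, refer to the first aspect or any possible implementation of the first aspect; details are not repeated here.
According to a fourth aspect, an I/O scheduler is provided, including the I/O request dispatching device provided in the third aspect.
According to a fifth aspect, an embodiment of this application provides a computer storage medium including instructions that, when run on a computer, cause the computer to execute the program designed for the first aspect or any possible implementation of the first aspect.
According to a sixth aspect, an embodiment of this application provides a computer program product that, when run on a computer, causes the computer to execute the program designed for the first aspect or any possible implementation of the first aspect.
The solutions provided in the fourth to sixth aspects are used to implement the I/O request dispatching method provided in the first aspect, and can therefore achieve the same beneficial effects as the first aspect; details are not repeated here.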
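Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a conventional hard disk and an SMR disk;
FIG. 2 is a schematic diagram of the zone division of an SMR disk and the WP indication;
FIG. 3 is a schematic diagram of an SMR-disk-based system I/O stack architecture provided in the prior art;
FIG. 4 is a schematic diagram of the internal structure of an I/O scheduler provided in the prior art;
FIG. 5 is a schematic diagram of the internal structure of an I/O scheduler under a deadline scheduling algorithm provided in the prior art;
FIG. 6 is a schematic diagram of an interaction scenario between SMR disk zone locks and an I/O scheduler;
FIG. 7 is a schematic flowchart of an I/O request dispatching method provided in the prior art;
FIG. 8 is a schematic diagram of the dispatch order of I/O requests in a scheduling queue provided in the prior art;
FIG. 8a is a schematic architecture diagram of an I/O scheduler;
FIG. 9 is a schematic flowchart of I/O request processing in the Linux generic block layer;
FIG. 10 is a schematic diagram of the overall workflow of an I/O scheduler;
FIG. 11 is a schematic structural diagram of an I/O request dispatching device provided in an embodiment of this application;
FIG. 12 is a schematic flowchart of an I/O request dispatching method provided in an embodiment of this application;
FIG. 13 is a schematic flowchart of another I/O request dispatching method provided in an embodiment of this application;
FIG. 14 is a schematic flowchart of still another I/O request dispatching method provided in an embodiment of this application;
FIG. 15 is a schematic diagram of the dispatch order of I/O requests in a scheduling queue provided in an embodiment of this application;
FIG. 16 is a schematic structural diagram of another I/O request dispatching device provided in an embodiment of this application;
FIG. 17 is a schematic structural diagram of still another I/O request dispatching device provided in an embodiment of this application;
FIG. 18 is a schematic structural diagram of yet another I/O request dispatching device provided in an embodiment of this application.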
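Detailed Description
SMR disks are mainly used for video, archiving, and backup, applications that are typically dominated by large files and sequential writes. Concurrent sequential writes to multiple zones are a very common scenario on SMR disks. For example, in video surveillance, the video streams recorded by multiple cameras are written through the file system into different zones; during data reconstruction, data from multiple backup disks is restored concurrently into multiple zones to speed up recovery.
The sequentiality constraint on SMR disk writes can be managed on the hardware side by the disk itself, or on the host side by software. The former is called a drive-managed SMR disk, and the latter a host-managed SMR disk (HM SMR disk). An HM SMR disk exposes a new interface to the upper layers: upper-layer software is not allowed to write randomly and may only append sequentially, and any I/O that violates the SMR disk's write rules returns an error.
As described above, an SMR disk is logically divided into zones of a fixed size (for example, 256 MB), and each zone contains a contiguous range of LBAs. A C-Zone allows random writes; an S-Zone supports only sequential writes. Each S-Zone has a write pointer (WP) that marks the position at which the next sequential write may take place. The SMR disk accepts a write only at the LBA where the WP is located; a write to any other LBA returns an error. A write that starts at the WP automatically advances the write pointer. For an HM SMR disk, information about zones and WPs can be obtained with the REPORT ZONES command. FIG. 2 illustrates the zone division of an SMR disk and the WP indication.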
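The write-pointer rule can be illustrated with a minimal sketch. It models one S-Zone in memory only; the class `SequentialZone`, its fields, and the error messages are hypothetical, and it does not reflect any real disk command set.

```python
class SequentialZone:
    """Hypothetical model of one S-Zone: a contiguous LBA range plus a write pointer."""

    def __init__(self, start_lba, num_blocks):
        self.start_lba = start_lba
        self.end_lba = start_lba + num_blocks   # exclusive upper bound
        self.write_pointer = start_lba          # next LBA that may be written

    def write(self, lba, num_blocks):
        # A write is accepted only if it starts exactly at the write pointer.
        if lba != self.write_pointer:
            raise OSError(f"write at LBA {lba} rejected, WP is {self.write_pointer}")
        if lba + num_blocks > self.end_lba:
            raise OSError("write crosses the zone boundary")
        # A successful write advances the write pointer.
        self.write_pointer += num_blocks


zone = SequentialZone(start_lba=0, num_blocks=1024)
zone.write(0, 8)        # accepted, WP moves to 8
zone.write(8, 8)        # accepted, WP moves to 16
try:
    zone.write(100, 8)  # does not start at the WP
    rejected = False
except OSError:
    rejected = True
```

Because only a write at the WP is accepted, two writes to the same S-Zone that reach the disk in the wrong order cause the later-numbered one to fail, which is why ordering has to be preserved higher up in the I/O stack.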
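FIG. 3 shows the architecture of an SMR-disk-based system I/O stack. As shown in FIG. 3, the I/O stack includes a file system, a generic block layer configured with an I/O scheduler, a Small Computer System Interface (SCSI) subsystem, a host bus adapter (HBA) driver, and the SMR disk. The I/O scheduler is an important part of the generic block layer. After block I/O generated by the file system enters the generic block layer, it is scheduled by the I/O scheduler before being dispatched to the underlying SMR disk. The I/O scheduler manages the queues of the block device and, by adjusting the order of the requests in the queues, has a major influence on the I/O performance of the SMR disk.
In the I/O stack, the I/O scheduler is an important component. It manages the I/O request queue of the block device. When it receives I/O requests submitted by the file system, it merges and sorts them, adjusts the order of the requests in the queue, and decides when to dispatch requests to the block device, so as to reduce the number of disk seeks and improve disk performance. When the I/O scheduler receives I/O requests from the file system, it generates a scheduling queue and then generates a dispatch queue according to the scheduling algorithm.
Merging means combining two or more I/O requests into one new request. When the file system submits an I/O request to the request queue, if the queue already contains a request whose disk sectors are adjacent to those accessed by the current request, the two requests can be merged into one new request that operates on the two adjacent sector ranges. Merging requests reduces system overhead and the number of disk seeks. Sorting arranges all I/O requests entering the request queue in ascending order of sector number (LBA), so that the head can complete the I/O operations in the order in which the disk rotates, which reduces the number of seeks. Because the scheduler's algorithm resembles the way an elevator operates, the I/O scheduler is also called an I/O elevator.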
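Merging and sorting can be sketched in a few lines. This is a simplified illustration that ignores the request type (read or write) and any size limit on merged requests; the `Request` class and `merge_and_sort` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start: int    # first sector (LBA) accessed
    length: int   # number of sectors

    @property
    def end(self):
        return self.start + self.length


def merge_and_sort(requests):
    """Sort by ascending start sector, then merge requests whose sectors are adjacent."""
    merged = []
    for req in sorted(requests, key=lambda r: r.start):
        if merged and merged[-1].end == req.start:
            # The request begins where the previous one ends: merge them.
            merged[-1] = Request(merged[-1].start, merged[-1].length + req.length)
        else:
            merged.append(Request(req.start, req.length))
    return merged


queue = merge_and_sort([Request(64, 8), Request(0, 8), Request(8, 8), Request(72, 8)])
```

Here four requests collapse into two, one covering sectors 0 to 15 and one covering sectors 64 to 79, so the head needs to position itself only twice.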
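FIG. 4 is a schematic diagram of the internal structure of an I/O scheduler. As shown in FIG. 4, once an I/O scheduler is introduced, the request queue is split into two kinds of queue: the I/O scheduling queue, which is internal to the I/O scheduler, and the dispatch queue, which holds the I/O requests about to be sent to the disk. An I/O request submitted by the file system is first added to the scheduling queue; the I/O scheduler then moves it to the dispatch queue according to a specific scheduling algorithm, and it is finally submitted for execution.
In the various I/O schedulers designed for disks, the elevator algorithm, which sorts the request queue by the order of the requested disk sectors (LBAs), is an important foundation. The elevator algorithm focuses on maximizing I/O throughput. On top of it, each scheduler takes further factors into account, such as read/write priority, I/O fairness, and avoidance of I/O starvation, and adjusts the order of I/O requests accordingly, which gives each scheduler its own characteristics and applicable scenarios.
The deadline scheduling algorithm, a scheduling algorithm based on the elevator algorithm, is described here. On the one hand, the deadline scheduling algorithm maintains requests in order of the sector number of their target area, forming a sorted queue; on the other hand, it also maintains requests in order of arrival time, forming a first in, first out (FIFO) queue used for first-come, first-served processing.
FIG. 5 illustrates the internal structure of the I/O scheduler under the deadline scheduling algorithm. As shown in FIG. 5, the deadline scheduling algorithm actually uses four queues, distinguished by request type: write operations have a sorted queue and a FIFO queue, and read operations likewise have a sorted queue and a FIFO queue. In the deadline scheduler, every request has an expiry time. By default, the expiry time is 500 milliseconds for read requests and 5 seconds for write requests, and read requests take priority over write requests. When a new request is submitted, the deadline scheduler adds it to the sorted queue for merging and sorting and also adds it to a FIFO queue: read requests go to the read FIFO queue and write requests to the write FIFO queue. In these two queues, requests are kept strictly in FIFO order, that is, a new request is always placed at the tail of the queue.
Normally, requests are taken from the sorted queue and pushed to the dispatch queue. If the request at the head of the write FIFO queue or the read FIFO queue has expired (that is, the current time is past the request's expiry time), the deadline I/O scheduler stops using the sorted queue and instead takes requests from the FIFO queue and places them in the dispatch queue until no expired request remains in the queue. In this way, the deadline I/O scheduler ensures that no request remains expired for long, and thus prevents request starvation.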
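The interplay of the sorted queue and the FIFO queue can be sketched for a single request direction. This is a simplified illustration: the class `DeadlineQueue` is hypothetical, time is passed in explicitly, and batching, merging, and read-over-write priority are left out.

```python
import heapq
from collections import deque
from dataclasses import dataclass
from itertools import count

READ_EXPIRE = 0.5    # seconds, default read expiry time given in the text
WRITE_EXPIRE = 5.0   # seconds, default write expiry time given in the text


@dataclass
class Request:
    sector: int
    deadline: float
    done: bool = False


class DeadlineQueue:
    """One direction (read or write): a sector-sorted heap plus an arrival-order FIFO."""

    def __init__(self, expire):
        self.expire = expire
        self.sorted = []       # heap of (sector, sequence number, request)
        self.fifo = deque()    # requests in arrival order
        self._seq = count()

    def add(self, sector, now):
        req = Request(sector, deadline=now + self.expire)
        heapq.heappush(self.sorted, (sector, next(self._seq), req))
        self.fifo.append(req)

    def pop_next(self, now):
        # Drop entries that were already dispatched through the other structure.
        while self.fifo and self.fifo[0].done:
            self.fifo.popleft()
        while self.sorted and self.sorted[0][2].done:
            heapq.heappop(self.sorted)
        if not self.fifo:
            return None
        if self.fifo[0].deadline <= now:
            req = self.fifo.popleft()             # FIFO head has expired: serve it first
        else:
            req = heapq.heappop(self.sorted)[2]   # normal case: lowest sector first
        req.done = True
        return req


writes = DeadlineQueue(WRITE_EXPIRE)
writes.add(sector=900, now=0.0)        # expires at 5.0
writes.add(sector=100, now=1.0)        # expires at 6.0
first = writes.pop_next(now=2.0)       # nothing expired: sector order applies
writes.add(sector=50, now=5.5)
second = writes.pop_next(now=6.0)      # the request for sector 900 has expired
third = writes.pop_next(now=6.0)
```

Without the FIFO check, the request for sector 900 would keep losing to newer requests at lower sectors; the expiry time bounds how long it can be passed over.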
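It should be noted that there are many types of scheduling algorithm, and the deadline scheduling algorithm is described above only as an example. This application can be applied with any scheduling algorithm, and the type of scheduling algorithm is not specifically limited in this application.
As described above, to support sequential writes within an S-Zone of an SMR disk, every part of the I/O stack must preserve ordering, which is achieved through Zone lock protection. On an SMR disk, the I/O scheduler sits above the Zone locks, and its behavior is controlled by the Zone lock state. FIG. 6 illustrates the interaction between the Zone locks of an SMR disk and the I/O scheduler. As shown in FIG. 6, Wi in the scheduling queue denotes a write I/O request destined for Zone i, and Ri denotes a read I/O request destined for Zone i. A write I/O is in progress on each of Zone 0 and Zone 1 of the SMR disk, so the Zone locks of Zone 0 and Zone 1 are locked. A read I/O is in progress on Zone 2 and there is no I/O on Zone n, so Zone 2 and Zone n are unlocked. In the scheduling queue of the I/O scheduler, the first write I/O request is marked W0; its target, Zone 0, is locked, so the request cannot be dispatched further. The fourth write I/O request is marked W2; its target, Zone 2, is unlocked, so the request can be dispatched to the lower-layer driver.
Specifically, for an I/O scheduling queue within one scheduling batch, when a write I/O request is blocked by a Zone lock, the current approach schedules the SMR disk by skipping the blocked write I/O request, putting it back in the scheduling queue, and processing the next I/O request to be dispatched.
For an SMR disk, after a candidate I/O request has been selected by the scheduler's algorithm (and before it is added to the dispatch queue), a prior-art scheduler applies the following policy: 1) any non-write request is always allowed to be dispatched downward; 2) any write to a C-Zone is allowed to be dispatched downward; 3) for a write to an S-Zone, the Zone lock is checked first. If the Zone is not locked, the write is allowed to proceed after the target Zone is locked; if the Zone is locked, the write request is skipped and the next request in the scheduling queue is tried. In other words, in the prior art a write I/O is dispatched to the SMR disk only if its target S-Zone is unlocked. If the target S-Zone is locked, the write request is not dispatched but is put back in the scheduling queue, and the scheduler continues to look for the next write request that can be dispatched. If no dispatchable write request can be found in the write I/O scheduling queue of the same scheduling batch, read requests are allowed to be dispatched, even though the write scheduling batch containing this write request has not yet been fully processed.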
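The flow of the I/O request dispatching method provided in the prior art is shown in FIG. 7 and may include the following steps:
S701. The I/O scheduler determines whether the scheduling queue is empty.
In S701, if the queue is empty, the dispatch flow ends; otherwise, S702 and S703 are performed.
S702. The I/O scheduler selects the I/O request to be dispatched.
The I/O request to be dispatched is the I/O request at the front of the scheduling queue.
S703. The I/O scheduler determines whether the I/O request to be dispatched is a write request.
In S703, if so, S704 is performed; otherwise, S708 is performed to dispatch the request downward, and then S701 is performed.
S704. The I/O scheduler determines whether the target area of the I/O request to be dispatched is an S-Zone.
In S704, if so, S705 is performed; otherwise, S708 is performed to dispatch the request downward, and then S701 is performed.
S705. The I/O scheduler determines whether the Zone lock of the target area of the I/O request to be dispatched is locked.
In S705, if so, S706 is performed and then S701; otherwise, S707 is performed, then S708 to dispatch the request downward, and then S701.
S706. The I/O scheduler skips the I/O request to be dispatched and tries the next I/O request in the scheduling queue.
S707. The I/O scheduler locks the target area of the I/O request to be dispatched.
S708. The I/O scheduler dispatches the I/O request to be dispatched downward.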
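Steps S701 to S708 can be sketched as a single pass over the scheduling queue. The `IORequest` type and the function name are hypothetical, and the sketch assumes that no write completes (and so no Zone lock is released) during the pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IORequest:
    op: str                  # "read" or "write"
    zone: int                # identifier of the target Zone
    sequential: bool = True  # True if the target is an S-Zone


def dispatch_pass_prior_art(queue, locked):
    """One pass of the prior-art flow; `locked` is the set of locked Zone ids.

    Returns (dispatched, skipped). Skipped requests go back to the scheduling queue.
    """
    dispatched, skipped = [], []
    for req in queue:                                # S701, S702
        if req.op != "write" or not req.sequential:  # S703, S704
            dispatched.append(req)                   # S708
        elif req.zone in locked:                     # S705
            skipped.append(req)                      # S706: skip, try the next one
        else:
            locked.add(req.zone)                     # S707
            dispatched.append(req)                   # S708
    return dispatched, skipped
```

Note how a second write to a Zone that was locked a moment earlier in the same pass is skipped, so the next dispatched write targets a different Zone.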
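FIG. 8 illustrates the dispatch order of the I/O requests in a scheduling queue. As shown in FIG. 8, within one write scheduling batch, 19 write I/O requests in the scheduling queue are waiting to be scheduled, and the number in each rectangle is the dispatch order of that write I/O request under the prior-art scheduler. When write I/Os are blocked by Zone locks, the prior-art scheduler makes the head jump continually during concurrent writes to multiple Zones: after dispatching one write I/O to Zone 1 it immediately seeks to Zone 2, dispatches another write I/O, and then jumps to the next Zone. The performance impact is even more pronounced if the Zones are far apart.
Therefore, when multiple Zones of an SMR disk are being written sequentially and concurrently, and a write I/O request is blocked by a Zone lock, the scheduler skips the blocked request and schedules the next write I/O request, which makes the disk head seek to another Zone. In a multi-Zone concurrent sequential-write scenario where multiple write I/O requests are blocked by Zone locks, the head swings back and forth repeatedly, producing a seek storm that adds a large amount of unnecessary random I/O and degrades performance. The farther apart the Zones are, the greater the impact on SMR disk performance.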
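The effect of dispatch order on head movement can be made concrete with a crude proxy. The function below is an illustrative assumption, not a measurement from the application: it treats the difference between Zone indices as the seek distance and ignores rotation and in-Zone offsets.

```python
def head_travel(zone_order):
    """Sum of Zone-index distances between consecutively dispatched writes."""
    return sum(abs(b - a) for a, b in zip(zone_order, zone_order[1:]))


# Writes to three Zones, interleaved as in the skip-on-lock behavior...
interleaved = [1, 2, 3, 1, 2, 3]
# ...versus the same writes dispatched Zone by Zone.
grouped = [1, 1, 2, 2, 3, 3]
```

Under this proxy the interleaved order travels 6 Zone widths and the grouped order travels 2, and the gap widens as the Zones move farther apart.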
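Based on this, this application proposes an I/O request dispatching method whose basic principle is as follows: when a write I/O request is blocked by a Zone lock, the request is dispatched after waiting for the lock to be released, which avoids seek storms and improves performance.
The I/O request dispatching method provided in this application is applied in the system I/O stack architecture of the SMR disk shown in FIG. 3, specifically in the process in which the I/O scheduler in that architecture dispatches I/O requests to the SMR disk.
Based on the system I/O stack architecture of the SMR disk shown in FIG. 3, FIG. 8a illustrates the architecture of the I/O scheduler involved in the solution of this application. As shown in FIG. 8a, this I/O scheduler consists of several groups of scheduling queues, a dispatch queue, a Zone lock table, and an elevator object.
The scheduling queues are a series of queues maintained by the I/O scheduler and are used to sort requests or to perform I/O merging. The dispatch queue temporarily holds the I/O requests that are sent to the disk driver, and the SMR disk processes the I/Os in the dispatch queue in order. The elevator object is the core of the I/O scheduler: it selects appropriate I/O requests from the scheduling queues according to certain rules and inserts them at the tail of the dispatch queue. The Zone lock table is a component specific to the I/O scheduler of an SMR disk: a Zone is locked when a write I/O is dispatched and unlocked when the write I/O completes, which ensures that at most one write I/O is executing in each S-Zone at any time.
It should be noted that the Zone lock table may be configured inside the I/O scheduler or elsewhere in the system I/O stack architecture of the SMR disk shown in FIG. 3. This is not specifically limited in the embodiments of this application; FIG. 8a is only an illustration and does not constitute a specific limitation.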
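Before the I/O request dispatching method provided in this application is described in detail, the processing of I/O requests in the Linux generic block layer is described here for ease of understanding.
In the Linux generic block layer, the processing of an I/O request is shown in FIG. 9 and may include:
S901. The upper layer submits a generic block layer request (bio) to the block I/O subsystem.
The block I/O subsystem provides the upper layer with an interface for receiving generic block layer requests (bios). After constructing a bio, the upper layer (for example, the file system) submits it to the generic block layer for processing.
S902. An I/O request queue (request queue) is constructed from the bios, with merging and sorting.
In S902, the request queue is constructed from the bios through merging and sorting. Merging combines multiple I/Os that access contiguous disk sectors into one I/O; sorting orders the I/Os by the sector numbers they access, so that the head moves in one direction as far as possible.
S903. The lower layer (the SCSI subsystem) processes the requests in the request queue of the SCSI block device one by one.
In S903, the block I/O subsystem dispatches the I/O requests in the request queue to the underlying SCSI subsystem.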
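In the processing shown in FIG. 9, the I/O scheduler plays an important role in S902 and S903. The overall workflow of the I/O scheduler is described below. As shown in FIG. 10, it may include:
S1001. The scheduler establishes an associated I/O scheduling queue for the request queue.
After the elevator object is determined, an associated I/O scheduling queue is allocated and established for the request queue of the SCSI block device.
S1002. The scheduler determines whether a bio can be merged into a request.
In S1002, after receiving a bio, the scheduler tries to select from the request queue a request into which the bio can be merged, and performs a back merge or a front merge.
S1003. The scheduler adds the merged request to the I/O scheduling queue and sorts the queue.
After the request is prepared, the block layer adds it to the I/O scheduling queue and sorts the queue as required by the elevator object, keeping the I/O scheduling queue ordered.
S1004. The scheduler dispatches requests from the I/O scheduling queue to the dispatch queue.
Specifically, the solution provided in this application can be applied in S1004. The solution is described in detail below.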
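The terms "first", "second", and so on in the specification and claims of the embodiments of this application are used to distinguish between different objects, not to describe a particular order of the objects. For example, first indication information and second indication information are used to distinguish between different pieces of indication information, not to describe a particular order of the information.
In the embodiments of this application, words such as "exemplary" or "for example" are used to present an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as preferred over, or more advantageous than, other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner for ease of understanding.
The embodiments of this application are described in detail below with reference to the accompanying drawings.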
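In one aspect, an embodiment of this application provides an I/O request dispatching apparatus. The apparatus may be configured in a scheduler, in which case it is the functional unit of the scheduler that performs the I/O request dispatching method provided in this application. The scheduler itself may also be referred to as the I/O request dispatching apparatus; this is not specifically limited in the embodiments of this application. FIG. 11 shows an I/O request dispatching apparatus 110 related to the embodiments of this application. The I/O request dispatching apparatus 110 may be part or all of the scheduler in the architecture shown in FIG. 3. As shown in FIG. 11, the I/O request dispatching apparatus 110 may include a processor 1101, a memory 1102, and a transceiver 1103.
The components of the I/O request dispatching apparatus 110 are described below with reference to FIG. 11:
The memory 1102 may be a volatile memory, for example a random-access memory (RAM); or a non-volatile memory, for example a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the foregoing types of memory. It is used to store the program code and configuration files that implement the method of this application.
The processor 1101 is the control center of the I/O request dispatching apparatus 110. It may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of this application, for example one or more digital signal processors (DSPs) or one or more field-programmable gate arrays (FPGAs). The processor 1101 performs the functions of the I/O request dispatching apparatus 110 by running or executing software programs and/or modules stored in the memory 1102 and by invoking data stored in the memory 1102.
The transceiver 1103 is used by the I/O request dispatching apparatus 110 to interact with other units. For example, the transceiver 1103 may be a transceiver port of the I/O request dispatching apparatus 110.
Specifically, by running or executing the software programs and/or modules stored in the memory 1102 and invoking the data stored in the memory 1102, the processor 1101 performs the following functions:
querying the Zone lock state of the target Zone, in the SMR disk, of a write I/O request; if the Zone lock is locked, monitoring the state of the Zone lock; and, on detecting that the Zone lock has been unlocked, dispatching the write I/O request to the target Zone.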
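In another aspect, an embodiment of this application provides an I/O request dispatching method, applied in the process in which the system I/O stack of an SMR disk dispatches write I/O requests to the SMR disk. The method is performed by the I/O request dispatching apparatus, which may be deployed in a scheduler as a functional unit or a chip. Where this document says that the I/O request dispatching apparatus performs an operation, this may be understood as the apparatus itself performing the operation or as the scheduler to which the apparatus belongs performing it; this is not specifically limited in the embodiments of this application. For brevity, the description below attributes each operation to only one of the two.
As shown in FIG. 12, the I/O request dispatching method provided in the embodiments of this application may include:
S1201. The I/O request dispatching apparatus queries the Zone lock state of the target Zone, in the SMR disk, of a write I/O request.
The write I/O request is the highest-priority write I/O request awaiting dispatch in the scheduling queue. The write I/O request carries the identifier of the target Zone that it accesses.
As described above, the scheduling queue is generated by the scheduler in which the I/O request dispatching apparatus resides: after receiving I/O requests from the file system, the scheduler sets up the request queue and then generates the scheduling queue by merging, sorting, and so on. The specific process by which the I/O scheduler generates the scheduling queue is not limited in the embodiments of this application. After generating the scheduling queue, the I/O scheduler uses a scheduling algorithm to send the I/O requests in the scheduling queue to the dispatch queue, and the SMR disk executes the I/O requests in the dispatch queue in order. The scheduling algorithm used by the I/O scheduler is not specifically limited in the embodiments of this application.
Specifically, the Zone lock table stores the Zone lock state of every Zone in the SMR disk. When a write I/O request is dispatched to a Zone for a write operation, the corresponding Zone lock is locked. When the write operation ends and the I/O completion response fed back by the SMR disk is received, the corresponding Zone lock is unlocked. The Zone lock table is looked up using the target-Zone identifier carried in the write I/O request to obtain the Zone lock state of the target Zone of the write I/O request in the SMR disk.
The identifier of a Zone is information that uniquely identifies the Zone in the SMR disk; any information that can uniquely identify a Zone may serve as its identifier. For example, the identifier may be a character string or a number. The content of the Zone identifier is not specifically limited in the embodiments of this application.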
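For example, Table 1 shows a Zone lock table.
Table 1
Zone identifier    Zone lock state
1                  Locked
2                  Unlocked
3                  Unlocked
4                  Unlocked
......             ......
It should be noted that Table 1 is only an example of a Zone lock table and does not limit its form or content. In practice, the form and specific content of the Zone lock table can be configured as required; this is not specifically limited in the embodiments of this application.
Optionally, the Zone lock table may be configured inside the scheduler, or at another position in the generic block layer of the system I/O stack of the SMR disk; this is not specifically limited in the embodiments of this application.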
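A Zone lock table of the kind shown in Table 1 can be sketched as a mapping from Zone identifier to lock state. The class and method names are hypothetical; as the text notes, the actual form of the table is left open.

```python
class ZoneLockTable:
    """Maps a Zone identifier to its lock state (True means locked)."""

    def __init__(self, zone_ids):
        self._state = {zone_id: False for zone_id in zone_ids}

    def is_locked(self, zone_id):
        # Looked up with the target-Zone identifier carried in the write request.
        return self._state[zone_id]

    def on_write_dispatched(self, zone_id):
        if self._state[zone_id]:
            raise RuntimeError(f"Zone {zone_id} already has a write in flight")
        self._state[zone_id] = True

    def on_write_completed(self, zone_id):
        # Called when the completion response for the write is received.
        self._state[zone_id] = False
```

The guard in `on_write_dispatched` is an addition for illustration: it makes a violation of the one-write-per-S-Zone rule fail loudly instead of passing silently.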
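Specifically, S1201 may be performed by the processor 1101 of the I/O request dispatching apparatus 110 shown in FIG. 11. After S1201, if the queried Zone lock of the target Zone of the write I/O request is locked, S1202 is performed.
S1202. If the Zone lock is locked, the I/O request dispatching apparatus monitors the state of the Zone lock.
Specifically, in S1202, monitoring the state of the Zone lock means querying the Zone lock table in real time for the state of the lock.
S1203. On detecting that the Zone lock has been unlocked, the I/O request dispatching apparatus dispatches the write I/O request to the target Zone.
Specifically, when the monitoring in S1202 detects that the Zone lock has been unlocked, the write operation on the Zone has ended, and the waiting write I/O request can be dispatched to the target area for writing.
For example, the dispatch process may follow the system I/O stack architecture of the SMR disk shown in FIG. 3: the I/O scheduler sends the dispatched I/O request to the SCSI subsystem, and the request is then dispatched to the SMR disk through the HBA driver. Dispatching may also be implemented in other ways, and the dispatch process is not described further in the embodiments of this application.
In one possible implementation, the I/O request dispatching apparatus keeps monitoring the state of the Zone lock until it detects that the Zone lock has been unlocked, and then performs S1203 to dispatch the write I/O request to the target Zone.
In another possible implementation, the I/O request dispatching apparatus monitors the state of the Zone lock and dispatches the write I/O request to the target Zone if it detects within a preset duration that the Zone lock has been unlocked. Correspondingly, S1203 is implemented as follows: on detecting within the preset duration that the Zone lock has been unlocked, the I/O request dispatching apparatus dispatches the write I/O request to the target Zone.
Optionally, the length of the preset duration can be configured as required; this is not specifically limited in the embodiments of this application.
Further optionally, if S1203 is implemented as dispatching the write I/O request to the target Zone on detecting within the preset duration that the Zone lock has been unlocked, the I/O request dispatching method provided in the embodiments of this application may further include: if the Zone lock remains locked throughout the preset duration, the I/O request dispatching apparatus skips the write I/O request and dispatches other I/O requests.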
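Waiting for a Zone lock with an optional preset duration can be sketched with a condition variable. This is one possible realization, not the one described: the text describes monitoring as querying the Zone lock table in real time, whereas the sketch blocks until notified. The names `ZoneLock`, `acquire_when_unlocked`, and `dispatch_write` are hypothetical, and the sketch takes the lock just before sending rather than after, so that waiting and locking happen atomically.

```python
import threading


class ZoneLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._locked = False

    def lock(self):
        with self._cond:
            self._locked = True

    def unlock(self):
        with self._cond:
            self._locked = False
            self._cond.notify_all()  # wake any writer waiting on this Zone

    def acquire_when_unlocked(self, timeout=None):
        """Wait until unlocked, then lock. Returns False if the timeout elapsed."""
        with self._cond:
            if not self._cond.wait_for(lambda: not self._locked, timeout):
                return False
            self._locked = True
            return True


def dispatch_write(zone_lock, send, timeout=None):
    """Dispatch one write (S1201 to S1203); returns False if it should be skipped."""
    if not zone_lock.acquire_when_unlocked(timeout):
        return False  # still locked after the preset duration
    send()            # hand the request to the lower layer
    return True
```

Passing `timeout=None` corresponds to the first implementation (wait until unlocked); passing a number of seconds corresponds to the second, where the caller moves on to other I/O requests when `False` is returned.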
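It should be noted that S1201 to S1203 describe the dispatching of a single write I/O request. In practice, the I/O request dispatching apparatus may perform S1201 to S1203 for every write I/O request in the same way, which is not repeated here.
With the I/O request dispatching method provided in this application, when the Zone lock of the target Zone is locked, the write I/O request is dispatched only after the lock is observed to be released. Even if multiple write I/O requests are blocked by Zone locks in a multi-Zone concurrent sequential-write scenario, waiting for the unlock before dispatching does not make the head swing back and forth, so seek storms are avoided and the performance of the SMR disk is preserved. Furthermore, because writes within an S-Zone of an SMR disk are appended sequentially, a write I/O request blocked by a Zone lock is generally a sequentially appended write I/O within the same Zone. Waiting for the unlock before dispatching makes the I/O requests more sequential, minimizes head travel, reduces seek time, and improves I/O access performance on the SMR disk.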
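Further optionally, as shown in FIG. 13, after the I/O request dispatching apparatus queries, in S1201, the Zone lock state of the target Zone of the write I/O request in the SMR disk, the I/O request dispatching method provided in the embodiments of this application may further include S1204.
S1204. If the Zone lock is unlocked, the I/O request dispatching apparatus dispatches the write I/O request to the target Zone.
Specifically, when the query shows that the Zone lock is unlocked, no write operation is in progress on the Zone, and the waiting write I/O request can be dispatched directly to the target area for writing.
Further, as shown in FIG. 13, after the write I/O request is dispatched to the target area for writing, the method may further include S1205 in order to preserve ordering.
S1205. The I/O request dispatching apparatus locks the Zone lock of the target Zone.
It should be noted that the method shown in FIG. 12 and FIG. 13 is the processing by the I/O request dispatching apparatus of one write I/O request in the scheduling queue. The apparatus processes every write I/O request in the scheduling queue in the same way, which is not repeated here.
Further, in practice the scheduling queue may also contain read I/O requests and I/O requests whose target Zone is a C-Zone, so other steps may be included before S1201 in order to dispatch all I/O requests. As shown in FIG. 14, the I/O request dispatching method provided in this application may include other steps in addition to those of FIG. 12 or FIG. 13. These other steps may be performed by the I/O request dispatching apparatus or by the I/O scheduler in which the apparatus resides; this is not specifically limited in the embodiments of this application. Here they are described as performed by the scheduler in which the apparatus resides (abbreviated to "the I/O scheduler" in FIG. 14 and the related description).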
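S1401. The I/O scheduler determines whether the scheduling queue is empty.
In S1401, if the queue is empty, the dispatch flow ends; otherwise, S1402 and S1403 are performed.
S1402. The I/O scheduler selects the I/O request to be dispatched.
S1403. The I/O scheduler determines whether the I/O request to be dispatched is a write request.
In S1403, if so, S1404 is performed; otherwise, S1408 is performed to dispatch the request downward.
S1404. The I/O scheduler determines whether the target area of the I/O request to be dispatched is an S-Zone.
In S1404, if so, S1405 is performed; otherwise, S1408 is performed to dispatch the request downward.
S1405. The I/O scheduler determines whether the Zone lock of the target area of the I/O request to be dispatched is locked.
In S1405, if so, S1406 is performed; otherwise, S1407 is performed to dispatch the request downward.
S1406. The I/O scheduler monitors the state of the Zone lock.
Optionally, in S1406, on detecting that the Zone lock has been unlocked, or on detecting this within the preset duration, the I/O scheduler performs S1407 and then S1401.
Optionally, in S1406, if the Zone lock is observed to remain locked, or to remain locked throughout the preset duration, the I/O scheduler skips the write I/O request and performs S1401 to try the next I/O request to be dispatched.
S1407. The I/O scheduler dispatches the write I/O request to the target area and locks the Zone lock of the target area.
After S1407, S1401 is performed to process the next I/O request.
S1408. The I/O scheduler dispatches the I/O request to be dispatched downward.
After S1408, S1401 is performed to process the next I/O request.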
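The flow S1401 to S1408 can be sketched as a loop over the scheduling queue. This is a single-threaded illustration under stated assumptions: requests are plain `(op, zone, is_szone)` tuples, and `wait_for_unlock` is a hypothetical callback that stands in for S1406.

```python
from collections import deque


def dispatch_all(queue, locked, wait_for_unlock):
    """Walk the scheduling queue following S1401 to S1408.

    queue: iterable of (op, zone, is_szone) tuples in scheduling order.
    locked: set of Zone ids whose Zone lock is currently held.
    wait_for_unlock(zone): blocks until the lock is released (returning True,
        after removing the Zone from `locked`) or until the preset duration
        elapses (returning False).
    Returns (dispatched, skipped).
    """
    pending = deque(queue)
    dispatched, skipped = [], []
    while pending:                                        # S1401
        req = pending.popleft()                           # S1402
        op, zone, is_szone = req
        if op != "write" or not is_szone:                 # S1403, S1404
            dispatched.append(req)                        # S1408
            continue
        if zone in locked and not wait_for_unlock(zone):  # S1405, S1406
            skipped.append(req)                           # timed out: skip it
            continue
        dispatched.append(req)                            # S1407: dispatch...
        locked.add(zone)                                  # ...and lock the Zone
    return dispatched, skipped
```

When every wait succeeds, the dispatch order equals the scheduling order, so consecutive writes to the same Zone stay together instead of being interleaved with writes to other Zones.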
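For example, if the I/O requests in the scheduling queue shown in FIG. 8 are dispatched using the I/O request dispatching method provided in this application, the dispatch order is as shown in FIG. 15, where the number in each rectangle is the dispatch order of that write I/O request under the solution of this application.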
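The foregoing describes the solution provided in the embodiments of this application mainly from the perspective of the working process of an I/O scheduler in which the I/O request dispatching apparatus provided in this application is deployed. It can be understood that, to implement the foregoing functions, the I/O scheduler includes hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
It should be noted that the functional part of the I/O scheduler that performs the I/O request dispatching method provided in this application is referred to as the I/O request dispatching apparatus. It can be understood that the I/O request dispatching apparatus may be part or all of the I/O scheduler. In other words, the apparatus may be equivalent to the I/O scheduler, or it may be deployed within the I/O scheduler to support the I/O scheduler in performing the method.
In the embodiments of this application, the I/O scheduler may be divided into functional modules according to the foregoing method examples. For example, each function may be assigned its own functional module, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation. When the I/O request dispatching apparatus is part or all of the I/O scheduler, dividing the I/O scheduler into functional modules is equivalent to dividing the I/O request dispatching apparatus into functional modules, and vice versa.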
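When each function is assigned its own functional module, FIG. 16 shows a possible structure of the I/O request dispatching apparatus in the I/O scheduler in the foregoing embodiments. The I/O request dispatching apparatus 160 may include a query unit 1601, a monitoring unit 1602, and a dispatch unit 1603. The query unit 1601 is configured to perform S1201 in FIG. 12 or FIG. 13; the monitoring unit 1602 is configured to perform S1202 in FIG. 12 or FIG. 13; and the dispatch unit 1603 is configured to perform S1203 and S1204 in FIG. 12 or FIG. 13. All related content of the steps in the foregoing method embodiments may be cited in the functional descriptions of the corresponding functional modules and is not repeated here.
Further, as shown in FIG. 17, the I/O request dispatching apparatus 160 may also include a locking unit 1604, which is configured to perform S1205 in FIG. 13.
When an integrated unit is used, FIG. 18 shows a possible structure of the I/O request dispatching apparatus in the I/O scheduler in the foregoing embodiments. The I/O request dispatching apparatus 180 may include a processing module 1801 and a communication module 1802. The processing module 1801 is configured to control and manage the actions of the I/O request dispatching apparatus 180. For example, the processing module 1801 is configured to support the I/O request dispatching apparatus 180, through the communication module 1802, in performing S1201 to S1205 in FIG. 12 or FIG. 13. The I/O request dispatching apparatus 180 may further include a storage module 1803 for storing the program code and data of the I/O request dispatching apparatus 180.
The processing module 1801 may be the processor 1101 in the physical structure of the I/O request dispatching apparatus 110 shown in FIG. 11, and may be a processor or a controller, for example a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the disclosure of this application. The processing module 1801 may also be a combination that implements a computing function, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 1802 may be the transceiver 1103 in the physical structure of the I/O request dispatching apparatus 110 shown in FIG. 11, and may be a communication port, a transceiver, a transceiver circuit, a communication interface, or the like. Alternatively, the communication interface may communicate with other devices through the foregoing element having a transceiver function, which may be implemented by an antenna and/or a radio frequency apparatus. The storage module 1803 may be the memory 1102 in the physical structure of the I/O request dispatching apparatus 110 shown in FIG. 11.
When the processing module 1801 is a processor, the communication module 1802 is a transceiver, and the storage module 1803 is a memory, the I/O request dispatching apparatus in FIG. 18 may be the I/O request dispatching apparatus 110 shown in FIG. 11.
As described above, the I/O request dispatching apparatus 160 or the I/O request dispatching apparatus 180 provided in the embodiments of this application can be used to implement the functions of the I/O scheduler in the methods of the foregoing embodiments. For ease of description, only the parts related to the embodiments of this application are shown; for technical details that are not disclosed, refer to the embodiments of this application.
In another aspect, an embodiment of this application further provides an I/O scheduler that includes the I/O request dispatching apparatus provided in any of the foregoing embodiments. For technical details, refer to the embodiments of this application.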
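The steps of the methods or algorithms described in connection with the disclosure of this application may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in RAM, flash memory, ROM, erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an ASIC, and the ASIC may be located in a core network interface device. The processor and the storage medium may also exist as discrete components in the core network interface device. Alternatively, the memory may be coupled to the processor; for example, the memory may exist independently and be connected to the processor through a bus. The memory may also be integrated with the processor. The memory may be used to store application program code for executing the technical solutions provided in the embodiments of this application, with execution controlled by the processor. The processor is configured to execute the application program code stored in the memory to implement the technical solutions provided in the embodiments of this application.
An embodiment of this application further provides a chip system. The chip system includes a processor, configured to implement the technical method of the communication device in the embodiments of the present invention. In one possible design, the chip system further includes a memory, configured to store the program instructions and/or data necessary for the communication device in the embodiments of the present invention. In one possible design, the chip system further includes a memory, and the processor invokes the application program code stored in the memory. The chip system may consist of one or more chips, or may include a chip and other discrete components; this is not specifically limited in the embodiments of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
A person skilled in the art should appreciate that, in one or more of the foregoing examples, the functions described in this application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium accessible to a general-purpose or special-purpose computer.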
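In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected as needed to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus a software functional unit.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.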

Claims (12)

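  1. An input/output (I/O) request dispatching method, characterized in that the method comprises:
    querying the Zone lock state of the target Zone, in an SMR disk, of a write I/O request;
    if the Zone lock is locked, monitoring the state of the Zone lock; and
    on detecting that the Zone lock has been unlocked, dispatching the write I/O request to the target Zone.
  2. The I/O request dispatching method according to claim 1, characterized in that
    the dispatching of the write I/O request to the target Zone on detecting that the Zone lock has been unlocked comprises: on detecting within a preset duration that the Zone lock has been unlocked, dispatching the write I/O request to the target Zone; and
    the method further comprises: if the Zone lock remains locked throughout the preset duration, skipping the write I/O request and dispatching other I/O requests.
  3. The I/O request dispatching method according to claim 1 or 2, characterized in that, after the querying of the Zone lock state of the target Zone, in the SMR disk, of the write I/O request, the method further comprises:
    if the Zone lock is unlocked, dispatching the write I/O request to the target Zone.
  4. The I/O request dispatching method according to any one of claims 1 to 3, characterized in that, after the dispatching of the write I/O request to the target Zone, the method further comprises:
    locking the Zone lock of the target Zone.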
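  5. An input/output (I/O) request dispatching apparatus, characterized in that the apparatus comprises:
    a query unit, configured to query the Zone lock state of the target Zone, in an SMR disk, of a write I/O request;
    a monitoring unit, configured to monitor the state of the Zone lock if the query unit finds that the Zone lock is locked; and
    a dispatch unit, configured to dispatch the write I/O request to the target Zone when the monitoring unit detects that the Zone lock has been unlocked.
  6. The I/O request dispatching apparatus according to claim 5, characterized in that the dispatch unit is specifically configured to:
    dispatch the write I/O request to the target Zone when the monitoring unit detects within a preset duration that the Zone lock has been unlocked; and
    if the Zone lock remains locked throughout the preset duration, skip the write I/O request and dispatch other I/O requests.
  7. The I/O request dispatching apparatus according to claim 5 or 6, characterized in that the dispatch unit is further configured to:
    dispatch the write I/O request to the target Zone if the query unit finds that the Zone lock is unlocked.
  8. The I/O request dispatching apparatus according to any one of claims 5 to 7, characterized in that the apparatus further comprises:
    a locking unit, configured to lock the Zone lock of the target Zone after the dispatch unit dispatches the write I/O request to the target Zone.
  9. An I/O request dispatching apparatus, characterized in that the apparatus comprises a processor, a memory, and instructions stored in the memory and executable on the processor, wherein, when the instructions are run, the apparatus performs the I/O request dispatching method according to any one of claims 1 to 4.
  10. An I/O scheduler, characterized by comprising the I/O request dispatching apparatus according to claim 9.
  11. A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the I/O request dispatching method according to any one of claims 1 to 4.
  12. A computer program product that, when run on a computer, causes the computer to perform the I/O request dispatching method according to any one of claims 1 to 4.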
PCT/CN2019/095922 2018-08-08 2019-07-15 一种i/o请求派发方法及装置 WO2020029749A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810897159.6 2018-08-08
CN201810897159.6A CN109324755A (zh) 2018-08-08 2018-08-08 一种i/o请求派发方法及装置

Publications (1)

Publication Number Publication Date
WO2020029749A1 true WO2020029749A1 (zh) 2020-02-13

Family

ID=65264111

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/095922 WO2020029749A1 (zh) 2018-08-08 2019-07-15 一种i/o请求派发方法及装置

Country Status (2)

Country Link
CN (1) CN109324755A (zh)
WO (1) WO2020029749A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109324755A (zh) * 2018-08-08 2019-02-12 成都华为技术有限公司 一种i/o请求派发方法及装置
CN113360077B (zh) * 2020-03-04 2023-03-03 华为技术有限公司 数据存储方法、计算节点及存储系统
CN115186300B (zh) * 2022-09-08 2023-01-06 粤港澳大湾区数字经济研究院(福田) 文件安全处理系统及文件安全处理方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106201355A (zh) * 2016-07-12 2016-12-07 腾讯科技(深圳)有限公司 数据处理方法和装置以及存储系统
US20170277591A1 (en) * 2015-02-06 2017-09-28 Western Digital Technologies, Inc. Indirection data structures to manage file system metadata
CN108021339A (zh) * 2017-11-03 2018-05-11 网宿科技股份有限公司 一种磁盘读写的方法、设备以及计算机可读存储介质
CN108255408A (zh) * 2016-12-28 2018-07-06 中国电信股份有限公司 数据存储方法以及系统
CN109324755A (zh) * 2018-08-08 2019-02-12 成都华为技术有限公司 一种i/o请求派发方法及装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107562648B (zh) * 2016-07-01 2021-04-06 北京忆恒创源科技有限公司 无锁ftl访问方法与装置
CN107957852B (zh) * 2017-10-13 2021-08-13 记忆科技(深圳)有限公司 一种提升固态硬盘性能一致性的方法

Also Published As

Publication number Publication date
CN109324755A (zh) 2019-02-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19846581

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19846581

Country of ref document: EP

Kind code of ref document: A1