WO2018126771A1 - Storage controller and IO request processing method - Google Patents

Storage controller and IO request processing method

Info

Publication number
WO2018126771A1
WO2018126771A1 (PCT/CN2017/108194)
Authority
WO
WIPO (PCT)
Prior art keywords
index
request
processing
sorting
sort index
Prior art date
Application number
PCT/CN2017/108194
Other languages
English (en)
French (fr)
Inventor
余思
龚骏辉
赵聪
王成
卢玥
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP17889889.6A (granted as EP3537281B1)
Publication of WO2018126771A1
Priority to US16/503,817 (granted as US10884667B2)

Classifications

    • G06F3/061 Improving I/O performance
    • G06F3/0613 Improving I/O performance in relation to throughput
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0637 Permissions
    • G06F3/0658 Controller construction arrangements
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G06F3/0662 Virtualisation aspects
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F2003/0697 Device management, e.g. handlers, drivers, I/O schedulers
    • G06F2209/5022 Workload threshold
    • G06F2209/503 Resource availability

Definitions

  • The present application relates to the field of storage technologies, and in particular to a storage controller and an input/output (IO) request processing method performed by the storage controller.
  • A storage array is commonly used in large-scale storage scenarios and includes multiple storage media and storage controllers. The storage media may include hard disk drives (HDD) and solid state drives (SSD).
  • The client sends IO requests to the storage controller through the communication network, and the storage controller processes each received IO request. For example, if an IO request is a read request, the storage controller determines which storage media the read request is directed to, reads the corresponding data from those one or more storage media, and returns it to the client.
  • The storage controller virtualizes the storage media of the storage array into a plurality of storage units, and each IO request received by the storage controller generally points to one storage unit. Depending on the storage type, the storage controller virtualizes the multiple storage media into different types of storage units: with block storage, it virtualizes the media into one or more logical unit numbers (LUN), and each client IO request points to a certain LUN; with file storage, each client IO request points to a file system; with object storage, each client IO request points to a certain bucket.
  • IOPS (input/output operations per second) parameters can be configured for storage units, but the IOPS parameter achievement rate of existing IO request scheduling methods is low.
  • The application therefore provides a storage controller to increase the achievement rate of IOPS parameters.
  • A first aspect of the present application provides a storage controller applicable to a storage system having a plurality of storage units. The storage controller includes a memory device and a plurality of cores, the plurality of cores including at least one distribution core, multiple sorting cores, and at least one request processing core.
  • The memory device also stores a plurality of IO requests, each directed to a storage unit, and a corresponding shared processing sort index is maintained in the memory device for each storage unit.
  • The distribution core executes code stored in the memory device to receive IO requests stored in the memory device and distribute the received IO requests to the plurality of sorting cores.
  • Each sorting core executes code stored in the memory device to perform the following actions: acquiring an IO request, distributed by the distribution core, for which a processing sort index is to be generated; determining the target storage unit to which that IO request points; obtaining the IOPS parameter of the target storage unit; generating a processing sort index for the IO request according to the value of the shared processing sort index corresponding to the target storage unit and the IOPS parameter of the target storage unit; updating the shared processing sort index corresponding to the target storage unit with the newly generated processing sort index; and storing the processing sort index in the index queue corresponding to the sorting core. The index queue corresponding to each sorting core is stored in the memory device and holds the processing sort indexes generated by that sorting core for IO requests directed to the plurality of storage units.
  • The request processing core executes code stored in the memory device to periodically process the IO request corresponding to the smallest processing sort index in the index queue corresponding to each sorting core.
  • the above distribution core, multiple sort cores and request processing cores can work in parallel.
  • the storage controller generates a processing sort index for each IO request, and determines a processing order according to the size of the processing sort index of each IO request, thereby effectively improving the achievement rate of the IOPS parameter.
  • When generating a processing sort index for an IO request, a sorting core does not need to access the other sorting cores to learn which processing sort indexes they have generated for their IO requests, which improves processing efficiency.
  • The phrase "each sorting core" in any aspect or implementation of the present application refers to any one of the plurality of sorting cores.
  • the target storage unit referred to in any aspect or any implementation of any aspect of the present application is a logical unit number LUN, a file system, or a bucket.
  • Each sorting core generates a processing sort index for an IO request as follows: before calculating the processing sort index, the sorting core obtains the current system time; it then takes, as the processing sort index of the IO request, the larger of (a) the sum of the value of the shared processing sort index corresponding to the target storage unit and the ratio of K to the IOPS parameter of the target storage unit, and (b) the current system time. Taking the system time into account in the calculation of the processing sort index improves the scheduling precision of IO requests.
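The index-generation rule described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation; the names (`shared_index`, `iops_param`, `generate_processing_sort_index`) and the single-LUN state are invented for the sketch.

```python
import time

K = 1.0  # the constant K; the text gives 1 as its common value

# Invented per-LUN state: one shared processing sort index and one IOPS
# parameter per storage unit.
shared_index = {1: 0.0}   # shared processing sort index per storage unit
iops_param = {1: 1000.0}  # IOPS parameter per storage unit

def generate_processing_sort_index(lun_id, now=None):
    """Generate a processing sort index for one IO request directed at lun_id.

    index = Max{shared index + K / IOPS parameter, current system time}
    """
    if now is None:
        now = time.monotonic()  # stand-in for "current system time"
    idx = max(shared_index[lun_id] + K / iops_param[lun_id], now)
    shared_index[lun_id] = idx  # update the shared index for the next request
    return idx
```

Successive requests to the same LUN are thus spaced at least K / IOPS apart in index space, which is what bounds how often that LUN's requests are selected for processing.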
  • Each sorting core is further configured to: at a moment after generating a processing sort index for an IO request, determine that no further IO request directed to the target storage unit has been distributed to the sorting core without yet receiving a processing sort index; compute, as a waiting processing sort index, the sum of the value of the shared processing sort index corresponding to the target storage unit at that moment and the ratio of K to the IOPS parameter of the target storage unit; and store the waiting processing sort index in its index queue. While the waiting processing sort index exists in the index queue, IO requests whose processing sort indexes are larger than the waiting processing sort index cannot be processed by the request processing core. After that moment, if an IO request directed to the target storage unit is distributed to the sorting core, or if the existence time of the waiting processing sort index in the index queue exceeds a preset threshold, the waiting processing sort index is eliminated from the index queue. The use of waiting processing sort indexes improves the scheduling accuracy of IO requests.
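The lifecycle of a waiting processing sort index can be sketched as follows. This is an illustration, not the patent's code; the class name, method names, and the `WAIT_TIMEOUT` value are all invented.

```python
WAIT_TIMEOUT = 0.5  # assumed "preset threshold" for a wait index's existence, in seconds

class SortCoreQueue:
    """Wait indexes kept by one sorting core alongside its index queue."""

    def __init__(self):
        self.wait_indexes = {}  # lun_id -> (wait_index, creation_time)

    def add_wait_index(self, lun_id, wait_index, now):
        # wait_index = shared index at this moment + K / IOPS(lun)
        self.wait_indexes[lun_id] = (wait_index, now)

    def on_request_distributed(self, lun_id):
        # A new IO request for this LUN arrived: eliminate its wait index.
        self.wait_indexes.pop(lun_id, None)

    def expire(self, now):
        # Eliminate wait indexes whose existence time exceeds the threshold.
        stale = [l for l, (_, t) in self.wait_indexes.items() if now - t > WAIT_TIMEOUT]
        for lun_id in stale:
            del self.wait_indexes[lun_id]

    def processable(self, sort_index):
        # An IO request is blocked while any wait index is smaller than its index.
        return all(sort_index <= w for w, _ in self.wait_indexes.values())
```

A wait index therefore acts as a placeholder for a request that may still arrive for a lightly loaded LUN, preventing other LUNs' later-indexed requests from jumping ahead of it.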
  • The request processing core periodically processes the IO request corresponding to the smallest processing sort index in the index queue corresponding to each sorting core as follows: it periodically accesses the index queue corresponding to each sorting core, and in each access processes the IO request corresponding to the smallest processing sort index in that queue.
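The request-processing core's scan can be sketched with a min-heap per sorting core, so the smallest processing sort index is always available cheaply. The names here are invented for the sketch; the patent does not prescribe a heap.

```python
import heapq

index_queues = [[], []]  # one heap of (processing sort index, IO request) per sorting core

def enqueue(core, sort_index, io_request):
    # The sorting core stores a newly generated index into its index queue.
    heapq.heappush(index_queues[core], (sort_index, io_request))

def process_one(core):
    """Pop and return the IO request with the smallest processing sort index."""
    if not index_queues[core]:
        return None
    _, io_request = heapq.heappop(index_queues[core])
    return io_request
```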
  • the second aspect of the present application provides an IO request processing method, which is executed when the storage controller provided by the foregoing first aspect is running.
  • The method includes: the distribution core receives IO requests and distributes them to the plurality of sorting cores; each sorting core acquires an IO request, distributed by the distribution core, for which a processing sort index is to be generated; the sorting core determines the target storage unit to which that IO request points; the sorting core acquires the IOPS parameter of the target storage unit; the sorting core generates a processing sort index for the IO request according to the shared processing sort index corresponding to the target storage unit and the IOPS parameter of the target storage unit; the sorting core updates the shared processing sort index corresponding to the target storage unit with the newly generated processing sort index; and the sorting core stores the processing sort index of the IO request in its index queue.
  • Generating the processing sort index according to the shared processing sort index corresponding to the target storage unit and the IOPS parameter of the target storage unit includes: computing the processing sort index of the IO request as the sum of the shared processing sort index corresponding to the target storage unit and the ratio of K to the IOPS parameter of the target storage unit, where K is a positive number.
  • In one implementation, the method further includes: each sorting core acquires the current system time; the sorting core then takes, as the processing sort index of the IO request, the larger of (a) the sum of the shared processing sort index corresponding to the target storage unit and the ratio of K to the IOPS parameter of the target storage unit, and (b) the current system time.
  • In one implementation, the method further includes: at a moment after generating a processing sort index for an IO request, each sorting core determines that no further IO request directed to the target storage unit has been distributed to it; the sorting core computes, as a waiting processing sort index, the sum of the value of the shared processing sort index corresponding to the target storage unit at that moment and the ratio of K to the IOPS parameter of the target storage unit; and the sorting core stores the waiting processing sort index in its index queue.
  • While the waiting processing sort index is included in the index queue corresponding to a sorting core, IO requests in that queue whose processing sort indexes are larger than the waiting processing sort index cannot be processed by the request processing core. After that moment, if an IO request directed to the target storage unit is distributed to the sorting core, or if the existence time of the waiting processing sort index in the index queue exceeds a preset threshold, the sorting core eliminates the waiting processing sort index from its index queue.
  • The request processing core periodically processes the IO request corresponding to the smallest processing sort index in the index queue corresponding to each sorting core as follows: it periodically accesses the index queue corresponding to each sorting core, and in each access processes the IO request corresponding to the smallest processing sort index in that queue.
  • A third aspect of the present application provides a storage medium storing program code; when the program code is run by a storage controller, the storage controller performs the IO request processing method provided by the foregoing second aspect or any implementation of the second aspect.
  • the storage medium includes, but is not limited to, a read only memory, a random access memory, a flash memory, an HDD, or an SSD.
  • A fourth aspect of the present application provides a computer program product comprising program code; when the computer program product is executed by a storage controller, the storage controller performs the IO request processing method provided by the foregoing second aspect or any implementation of the second aspect. The computer program product may be a software installation package: if the IO request processing method is required, the computer program product may be downloaded to the storage controller and run on the storage controller.
  • FIG. 1 is a schematic structural diagram of a storage system according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an organization structure of a storage controller
  • FIG. 3 is a schematic structural diagram of a memory device
  • FIG. 5 is a schematic diagram of a process of generating a processing sort index
  • FIG. 6 is a schematic diagram of another process of generating a processing sort index
  • Figures 7-1 through 7-5 are schematic diagrams of the process of generating another sort index.
  • The processor includes one or more central processing units (CPU), each central processing unit including one or more cores.
  • the storage unit may be a LUN, a file system, or a bucket, corresponding to a case where the storage system uses block storage, file storage, or object storage.
  • the storage system in this specification presents P LUNs to the client, and P is a positive integer greater than one.
  • IO requests include IO data and metadata. The IO data includes information such as the operation requested and the address of the data to be operated on. The metadata includes the target storage unit ID of the IO request; the target storage unit ID may be a LUN ID, a file system ID, or a bucket ID.
  • The function Max{x, y} returns the larger of x and y.
  • An IOPS parameter can be the IOPS of a certain storage unit, or the IOPS processing weight of a certain storage unit.
  • The IOPS processing weight refers to the proportion of the storage array's resources used to process IO requests directed to the respective storage unit. The IOPS parameter may therefore be set by the user according to service requirements: for example, the user determines the minimum IOPS of the storage unit related to a certain service, or determines what proportion of the storage array's resources the IO requests of that storage unit need to occupy.
  • The IOPS parameter can also be set according to the user's level; for example, an advanced user's IOPS parameter is set higher to ensure that user's experience.
  • the IOPS parameters of different storage units are stored in the storage controller.
  • The storage controller includes a processor with a plurality of cores, a memory device, and a communication interface. Each core establishes a communication connection with the memory device.
  • the storage controller communicates with the client and the storage medium through the communication interface.
  • the IO request obtained from the communication interface is stored in the IO storage space of the memory device.
  • The IO request dispatcher, the IO request sorter, and the IO request handler are all implemented by cores executing code stored in the memory device.
  • the core that runs the IO request dispatcher is called the distribution core.
  • the core that runs the IO request sorter is called the sort core.
  • the core that runs the IO request handler is called the request processing core.
  • Core 1 is used to execute the IO request dispatcher; cores 2 through n are used to execute IO request sorters; cores n+1 through n+m are used to execute IO request handlers; and core n+m+1 is used to run the operating system of the storage controller.
  • The IO request dispatcher distributes the IO requests in the IO storage space to the subspaces of the cores running the IO request sorters.
  • In this example the subspaces of core 2 to core n are located in the space of core 1; alternatively, the subspaces of core 2 to core n may be located outside the space of core 1, or within each core's own space.
  • During distribution, the dispatcher mainly considers load balancing across the IO request sorters; it does not consider whether the IO requests directed to a certain LUN are distributed to a particular core's subspace.
  • The IO request dispatcher can send received IO requests to each IO request sorter in turn, to ensure that the number of IO requests distributed to each IO request sorter is the same.
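The round-robin distribution just described can be sketched in a few lines; the function name and structure are invented for the illustration.

```python
from itertools import cycle

def distribute(io_requests, n_sorters):
    """Hand IO requests to n sorters in turn, so per-sorter counts differ by at most one."""
    subspaces = [[] for _ in range(n_sorters)]
    for req, target in zip(io_requests, cycle(range(n_sorters))):
        subspaces[target].append(req)
    return subspaces
```

Note that, as the text says, this balances load only; it makes no attempt to keep all requests for one LUN on one sorter, which is why the shared processing sort index per LUN is needed.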
  • each IO request sorter reads and sorts the IO requests distributed to itself, and the sort results are stored in an index queue in the space of the core where each IO request sorter is located.
  • The index queue can be implemented by different data structures, such as a heap or a first-in-first-out queue.
  • Each IO request sorter generates a processing sort index for each IO request in its own subspace and sorts the processing sort indexes in its index queue; IO requests with smaller processing sort indexes are preferentially processed by idle IO request handlers.
  • The IO request handler performs the write or read operation corresponding to an IO request according to the request's type; the IO request handler may also be used to arrange or delete the data carried by the IO request.
  • the storage array in which the storage controller is located uses block storage and the storage medium of the storage array is virtualized into 100 LUNs.
  • IOPS parameters need to be set for some LUNs.
  • Each IO request received by the storage controller points to a certain LUN.
  • the number of IO requests generated by different LUNs per second may vary greatly.
  • The sorting result of the IO request sorters affects whether the IOPS parameter of each LUN can be achieved. For example, suppose the IOPS parameter of LUN 1 is 1000 and the IOPS parameter of LUN 2 is 200.
  • The application provides an IO request processing method. This method is applicable to the storage controller shown in FIG. 2.
  • A shared processing sort index is maintained for each storage unit, and the P shared processing sort indexes can be read and written by each IO request sorter.
  • the initial value of each shared processing sort index is the same.
  • the initial value of each shared processing sort index may be zero.
  • each shared processing sort index can be combined into one table and set in the storage space of the memory device.
  • All of the shared processing sort indexes are established by the operating system before the storage controller begins distributing IO description information.
  • The communication interface receives multiple IO requests sent by the client and stores them in the IO storage space.
  • the IO request dispatcher generates IO description information for each IO request and establishes a mapping relationship between each IO request and the IO description information of the IO request.
  • the IO description information of each IO request includes the LUN ID carried in the metadata of the IO request.
  • The IO description information can be generated for each IO request during distribution, and the IO request sorters subsequently generate processing sort indexes from the IO description information, reducing the read and write burden on the memory device.
  • the IO request dispatcher distributes a plurality of IO description information to the subspace of the core where each IO request sorter is located.
  • The IO request dispatcher can construct a queue for each LUN in the subspace of the core where each IO request sorter is located, and store each piece of IO description information in the queue of the corresponding LUN within that subspace, so that in subsequent steps the IO request sorter can identify the LUN pointed to by each piece of IO description information.
  • The following describes, as an example, how the IO request sorter running on core 2 generates a processing sort index for each piece of IO description information; each IO request sorter uses the same method to generate processing sort indexes for the IO description information distributed to it.
  • The IO description information A-B-C denotes the C-th piece of IO description information directed to LUN B that is distributed to the IO request sorter running on core A; the processing sort index A-B-C denotes the processing sort index of IO description information A-B-C.
  • Suppose the IO request sorter running on core 2 is currently generating a processing sort index for IO description information 2-1-3; at this point, the processing sort indexes of IO description information 2-1-1 and 2-1-2 have already been stored in the index queue of core 2 by the IO request sorter.
  • the IO request sorting program running on the core 2 obtains the IO description information 2-1-3 from its own subspace, and obtains the LUN ID corresponding to the IO description information 2-1-3.
  • the IO request sorting program running on the core 2 acquires the IOPS parameter of the LUN 1 according to the LUN ID.
  • the IO request sorting program running on the core 2 obtains the value of the shared processing sort index corresponding to the LUN 1 according to the LUN ID, that is, the value of the shared processing sort index 1.
  • the IO request sorter running on Core 2 obtains the current system time through the operating system interface.
  • the current system time may specifically refer to the number of nanoseconds that the storage controller has passed during the period from startup to call of the operating system interface.
  • the IO request sorting program running on the core 2 calculates the processing sort index 2-1-3 of the IO description information 2-1-3.
  • Processing sort index 2-1-3 = Max{value of shared processing sort index 1 + K / IOPS parameter of LUN 1, current system time}.
  • K is a positive number, and the common value of K is 1.
  • The IO request sorting program running on core 2 updates the shared processing sort index corresponding to LUN 1 with processing sort index 2-1-3.
  • After generating a processing sort index for IO description information directed to a LUN, every IO request sorting program updates that LUN's shared processing sort index with the generated index. The P shared indexes therefore record the most recently generated processing sort index for IO requests directed to each LUN. Consequently, when any IO request sorter calculates a processing sort index for IO description information directed to LUN p, the value of shared processing sort index p equals the processing sort index of the previous piece of IO description information directed to LUN p.
  • LUN p is any of the P LUNs presented by the storage system.
  • If the IO request sorting program running on core 2 were currently generating the processing sort index for IO description information 2-1-1, then processing sort index 2-1-1 = Max{initial processing sort index + K / IOPS parameter of LUN 1, current system time}. The initial processing sort index can be zero.
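The index-generation rule above can be sketched as follows. This is a minimal illustration: the function name, the dictionary standing in for the P shared processing sort indexes, and the sample values are assumptions for the example, not the patent's implementation.

```python
K = 1.0  # a common value of K

shared_index = {}  # LUN ID -> shared processing sort index (stands in for the P shared indexes)

def processing_sort_index(lun_id, iops_param, current_time):
    """Generate a processing sort index for one IO request directed to lun_id
    and update that LUN's shared processing sort index with it."""
    index = max(shared_index.get(lun_id, 0.0) + K / iops_param, current_time)
    shared_index[lun_id] = index
    return index

# LUN 1 has IOPS parameter 1000; with a negligible system time the indexes
# advance by K / 1000 per request.
print(processing_sort_index(1, 1000, 0.0))  # 0.001
print(processing_sort_index(1, 1000, 0.0))  # 0.002
# A large current system time lifts the index (the anti-starvation case):
print(processing_sort_index(1, 1000, 5.0))  # 5.0
```

Because each LUN's shared index only ever moves forward, the per-LUN spacing K / IOPS is preserved no matter which sorting core generates the next index.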
  • The IO request sorter running on core 2 stores processing sort index 2-1-3 into the index queue of core 2.
  • A correspondence is established between IO description information 2-1-3 and processing sort index 2-1-3, or between processing sort index 2-1-3 and the IO request from which IO description information 2-1-3 was generated, so that the IO request corresponding to processing sort index 2-1-3 can be determined in subsequent steps.
  • Through the above steps, each IO request sorting program generates a processing sort index for every piece of IO description information distributed to it and stores the indexes in its own index queue. The index queues of core 2 through core n therefore each hold the processing sort indexes of one or more unprocessed IO requests.
  • After the IO request handler running on any core finishes processing an IO request, the operating system learns that that IO request handler has entered an idle state.
  • The operating system records a processing order for each IO request handler, that is, after an IO request handler becomes idle, which index queue's smallest processing sort index the idle handler processes next. To ensure the IOPS parameters are achieved, this processing order must make each IO request handler process the processing sort indexes of every index queue at the same or a similar frequency; that is, an IO request handler periodically processes the IO request corresponding to the smallest processing sort index in each index queue.
  • This processing order may be: each IO request handler polls the index queues in the order core 2 to core n and processes, on each visit, the IO request corresponding to the smallest processing sort index in the visited index queue. After processing the IO request corresponding to the smallest processing sort index in one index queue, the idle IO request handler processes the IO request corresponding to the smallest processing sort index in the next index queue.
  • Alternatively, if m = n - 1, that is, the number of IO request handlers equals the number of IO request sorting programs, the operating system binds the IO request handlers to the index queues one to one. When an IO request handler becomes idle, the operating system determines that the idle handler next processes the IO request corresponding to the smallest processing sort index in the index queue bound to it.
  • After the operating system determines which index queue the idle IO request handler takes its next IO request from, either the operating system selects the smallest processing sort index from that index queue and notifies the idle IO request handler to process the corresponding IO request, or the operating system instructs the idle IO request handler to access the index queue and process the IO request corresponding to the smallest processing sort index in it.
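The polling behavior described above can be sketched as follows. This is a minimal, hypothetical illustration: the queue contents, core numbers, and request identifiers are invented for the example.

```python
import heapq

# Hypothetical index queues for cores 2..4, each a min-heap of
# (processing sort index, IO request id) pairs.
index_queues = {
    2: [(3.0, "2-1-1"), (2.2, "2-2-1")],
    3: [(2.5, "3-1-1")],
    4: [(1.8, "4-2-1"), (4.0, "4-1-1")],
}
for q in index_queues.values():
    heapq.heapify(q)

def poll_once(order=(2, 3, 4)):
    """One polling round: process the smallest-index request of each queue in core order."""
    processed = []
    for core in order:
        if index_queues[core]:
            _, request = heapq.heappop(index_queues[core])
            processed.append(request)  # stand-in for actually handling the IO request
    return processed

print(poll_once())  # ['2-2-1', '3-1-1', '4-2-1']
```

Each round takes exactly one request (the smallest processing sort index) from each non-empty queue, which is what keeps the per-queue processing frequency even.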
  • the execution order of some of the above steps may be adjusted.
  • The steps of obtaining the IOPS parameter of LUN 1 and obtaining the current system time may each be performed at any time before processing sort index 2-1-3 is generated.
  • The foregoing step of obtaining the system time is optional. When it is not performed, the generated processing sort index 2-1-3 = value of shared processing sort index 1 + K / IOPS parameter of LUN 1.
  • An idle IO request handler preferentially processes the IO request corresponding to the smallest processing sort index in each index queue. Suppose that, for a period of time, the IO request sorter running on core 2 is distributed no IO requests directed to a certain LUN while being continuously distributed IO requests directed to other LUNs. When IO requests directed to that LUN are subsequently distributed to the sorter running on core 2, their processing sort indexes may be smaller than those of all the IO requests directed to the other LUNs, so the IO requests directed to that LUN would keep being processed first by idle IO request handlers, starving the IO requests directed to the other LUNs. Taking the system time into account in the calculation of the processing sort index prevents this: when a long-idle LUN later has IO requests distributed to the subspace of the sorter running on core 2, those requests do not block the IO requests directed to other LUNs, which improves the scheduling precision of IO requests.
  • For example, at time 1, the processing sort indexes recorded in the index queue of core 2 include:
  • Processing sort index 2-1-1 = 3    Processing sort index 2-2-1 = 2.2
  • Processing sort index 2-1-2 = 3.5  Processing sort index 2-2-2 = 2.8
  • Processing sort index 2-1-3 = 5.5  Processing sort index 2-2-3 = 3.0
  • At time 2, the IO requests corresponding to the processing sort indexes stored in the queue at time 1 have all been processed, and the new processing sort indexes generated between time 1 and time 2 include:
  • Processing sort index 2-1-4 = 6
  • Processing sort index 2-1-5 = 7.5
  • Processing sort index 2-1-6 = 9.5
  • Processing sort index 2-1-7 = 10.5
  • Processing sort index 2-1-8 = 12
  • That is, between time 1 and time 2, the IO request sorting program running on core 2 was distributed no new IO description information directed to LUN 2. After time 2, if the system time were not taken into account in the calculation and the sorter on core 2 were then distributed new IO description information directed to LUN 2, the processing sort indexes of that newly distributed information would be much smaller than those of the IO description information directed to LUN 1, and an idle IO request handler visiting the index queue of core 2 would keep processing the newly distributed IO requests directed to LUN 2. With the system time in the calculation, the processing sort indexes of the newly distributed IO description information directed to LUN 2 may instead equal the current system time and thus are not much smaller than those of the IO description information directed to LUN 1.
  • During the above steps, the IO request sorting program running on core 2 may determine at some moment that all IO description information directed to LUN 1 in the index queue of core 2 has been processed. As shown in FIG. 6, after processing sort index 2-1-5 has been generated, if none of the IO description information distributed to the sorter running on core 2 both lacks a processing sort index and is directed to LUN 1, the sorter on core 2 generates a wait sort index and stores it in the index queue of core 2.
  • The wait sort index = value of the shared processing sort index corresponding to LUN 1 at that moment + K / IOPS parameter of LUN 1.
  • The wait sort index is eliminated in either of two cases: first, the IO request sorting program running on core 2 is distributed new IO description information directed to LUN 1; second, the wait sort index has existed longer than a preset threshold.
  • Generating the wait sort index is an optional step.
  • The processing sort indexes in each index queue are sorted together with any wait sort index. If, while selecting the next IO request for an idle IO request handler, the operating system determines that the smallest processing sort index in the current index queue is a wait sort index, the idle handler cannot process the IO requests corresponding to the processing sort indexes in that queue, because a wait sort index corresponds to no IO request. The operating system can then reselect an index queue for the idle IO request handler.
  • For example, suppose an idle IO request handler polls the index queues of the cores in the order core 2 to core n. If, when polling the index queue of core 3, it finds that the smallest processing sort index there is a wait sort index, the idle handler skips the index queue of core 3 and accesses the index queue of core 4.
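The skip-on-wait behavior can be sketched as follows. `WAIT` is an invented marker for a wait sort index, which corresponds to no IO request; the queue contents are illustrative.

```python
import heapq

WAIT = "WAIT"  # marker: a wait sort index, corresponding to no IO request

# Hypothetical index queues of cores 2..4; core 3's smallest entry is a wait sort index.
index_queues = {
    2: [(3.4, "2-2-4"), (6.0, "2-1-4")],
    3: [(2.0, WAIT), (5.0, "3-1-2")],
    4: [(2.8, "4-1-1")],
}
for q in index_queues.values():
    heapq.heapify(q)

def next_queue_for_idle_handler(start=2, cores=(2, 3, 4)):
    """Return the first core, at or after `start`, whose queue's smallest
    entry is a real IO request; queues topped by a wait sort index are skipped."""
    for core in cores:
        if core < start:
            continue
        queue = index_queues[core]
        if queue and queue[0][1] != WAIT:
            return core
        # smallest entry is a wait sort index (or the queue is empty): skip it
    return None

print(next_queue_for_idle_handler(start=3))  # 4 -- core 3 is skipped
```

Because every index larger than the wait sort index sits behind it in the heap, skipping the queue blocks all of LUN 1's later requests in that queue until the wait sort index is eliminated.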
  • Each IO request sorter is distributed different amounts of IO description information directed to different LUNs, and the speed at which an IO request sorter generates processing sort indexes is generally much higher than the speed at which an IO request handler processes IO requests. Therefore, if no wait sort index were generated and the sorter running on core 2 were distributed very little IO description information directed to LUN 2, the IO description information with the smallest processing sort index in the index queue of core 2 might always point to LUN 1, idle IO request handlers would continuously process IO requests directed to LUN 1, and the IOPS parameter of LUN 2 would ultimately be difficult to achieve.
  • For example, at time 1, the processing sort indexes stored in the index queue of core 2 include:
  • Processing sort index 2-1-1 = 3    Processing sort index 2-2-1 = 2.2
  • Processing sort index 2-1-2 = 3.5  Processing sort index 2-2-2 = 2.8
  • Processing sort index 2-1-3 = 5.5  Processing sort index 2-2-3 = 3.0
  • At time 2, some of the IO requests whose indexes had been generated by time 1 have been processed, and new processing sort indexes have been generated between time 1 and time 2; the index queue of core 2 now records:
  • Processing sort index 2-1-4 = 6    Processing sort index 2-2-4 = 3.4
  • Processing sort index 2-1-5 = 7.5
  • Processing sort index 2-1-6 = 9.5
  • Suppose that, during the following period, no IO description information directed to LUN 2 is distributed to the sorter running on core 2 while IO description information directed to LUN 1 keeps being distributed to it. After the IO request corresponding to processing sort index 2-2-4 is processed, an idle IO request handler visiting the index queue of core 2 would, absent a wait sort index, continuously process IO requests directed to LUN 1, and the IOPS parameter of LUN 2 could not be achieved. In contrast, if wait sort index 2-2-6 is generated, then until it is eliminated, the processing sort indexes in the index queue of core 2 that are larger than wait sort index 2-2-6 cannot be processed by idle IO request handlers, which forces the idle handlers to access other index queues. Using the wait sort index therefore improves the scheduling precision of IO requests and the achievement rate of the IOPS parameters.
  • Take a processor containing three sorting cores as an example. FIG. 7-1 shows the initial state; the initial values of shared processing sort index 1 and shared processing sort index 2 are both 0.
  • In FIG. 7-1, IO description information a-b-c denotes the cth piece of IO description information, directed to LUN b, that is distributed to core a.
  • The initial processing sort index for IO description information directed to LUN 1 and for IO description information directed to LUN 2 is 0.
  • The IOPS parameter of LUN 1 is 1000, the IOPS parameter of LUN 2 is 500, and K = 1.
  • At time T1, the IO request sorting program running on core 2 calculates the processing sort index for IO description information 2-1-1; processing sort index 2-1-1 is 0.001. The value of shared processing sort index 1 is then updated to 0.001, as shown in FIG. 7-2.
  • At time T2, the IO request sorting program running on core 3 calculates the processing sort index for IO description information 3-1-1; processing sort index 3-1-1 is 0.002. The value of shared processing sort index 1 is then updated to 0.002, as shown in FIG. 7-3.
  • At time T3, the IO request sorting program running on core 4 calculates the processing sort index for IO description information 4-1-1; processing sort index 4-1-1 is 0.003, and the value of shared processing sort index 1 is then updated to 0.003.
  • At time T4, the IO request sorting program running on core 4 calculates the processing sort index for IO description information 4-2-1; processing sort index 4-2-1 is 0.002, and the value of shared processing sort index 2 is then updated to 0.002, as shown in FIG. 7-4.
  • At time T5, the IO request sorting program running on core 2 calculates the processing sort index for IO description information 2-1-2; processing sort index 2-1-2 is 0.004, and the value of shared processing sort index 1 is then updated to 0.004.
  • At time T6, the IO request sorting program running on core 2 calculates the processing sort index for IO description information 2-2-1; processing sort index 2-2-1 is 0.004, and the value of shared processing sort index 2 is then updated to 0.004, as shown in FIG. 7-5. Core 2, core 3, and core 4 generate subsequent processing sort indexes in the same way. The examples corresponding to FIG. 7-1 to FIG. 7-5 do not take the effect of the system time on the generation of processing sort indexes into account.
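The values at T1 through T6 can be reproduced with the index-generation rule. This sketch ignores the system time, as the FIG. 7-1 to 7-5 examples do; the names are illustrative rather than the patent's implementation.

```python
K = 1.0
iops = {1: 1000, 2: 500}   # IOPS parameters of LUN 1 and LUN 2
shared = {1: 0.0, 2: 0.0}  # shared processing sort indexes, initially 0

def generate(lun):
    """Generate the next processing sort index for `lun` and update the shared index."""
    shared[lun] = shared[lun] + K / iops[lun]  # system time omitted, as in FIGS. 7-1 to 7-5
    return shared[lun]

# T1..T6 in order; the (core, IO description information) labels follow the text.
print(round(generate(1), 3))  # T1, info 2-1-1 -> 0.001
print(round(generate(1), 3))  # T2, info 3-1-1 -> 0.002
print(round(generate(1), 3))  # T3, info 4-1-1 -> 0.003
print(round(generate(2), 3))  # T4, info 4-2-1 -> 0.002
print(round(generate(1), 3))  # T5, info 2-1-2 -> 0.004
print(round(generate(2), 3))  # T6, info 2-2-1 -> 0.004
```

Note that the sequence matches the figures because the shared indexes serialize LUN 1's and LUN 2's indexes independently, regardless of which sorting core generates each one.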
  • As shown in FIG. 2, the storage controller to which this application applies includes a bus, a processor, a memory device, and a communication interface.
  • the processor, the memory device, and the communication interface communicate via a bus.
  • The memory device may include a volatile memory, for example a random access memory (RAM).
  • the communication interface includes a network interface and a storage medium access interface, which are respectively used for acquiring an IO request sent by the client and accessing the storage medium.
  • the memory device stores the code required to execute the IO request dispatcher, the IO request sorter, the IO request handler, and the operating system.
  • each core in the processor calls the code stored in the memory device to perform the IO request processing method provided above.
  • the methods described in connection with the present disclosure can be implemented by a processor executing software instructions.
  • The software instructions may consist of corresponding software modules, which can be stored in a RAM, a flash memory, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a hard disk, an optical disc, or any other form of storage medium known in the art.
  • The functions described herein may be implemented in hardware or software.
  • When implemented in software, the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • a storage medium may be any available media that can be accessed by a general purpose or special purpose computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Multi Processors (AREA)
  • Computer And Data Communications (AREA)

Abstract

A storage controller is provided, including a dispatching core, multiple sorting cores, and a request processing core. The three kinds of cores are respectively used to dispatch received input/output (IO) requests to the different sorting cores, to generate a processing sort index for each IO request, and to process the IO requests according to the magnitude of each IO request's processing sort index, so as to flexibly schedule the IO requests received by the storage controller.

Description

一种存储控制器及IO请求处理方法 技术领域
本申请涉及存储技术领域,尤其涉及一种存储控制器以及该存储控制器执行的输入输出(英文全称:input output,缩写:IO)请求处理方法。
背景技术
如图1,存储阵列常用于大规模存储场景中,包括多个存储介质和存储控制器,存储介质可以包括硬盘(英文全称:hard disk drive,缩写:HDD)和固态硬盘(英文全称:solid state drive,缩写:SSD)。客户端通过通信网络,将IO请求发送至存储控制器,存储控制器对接收的IO请求进行处理,例如IO请求为读请求的情况下,存储控制器确定该读请求指向于哪一个存储介质,然后存储控制器从该一个或多个存储介质中读取对应的数据并返回给客户端。
存储控制器将存储阵列的存储介质虚拟化为多个存储单元,存储控制器接收的IO请求一般指向某一存储单元。采用不同的存储类型的情况下,存储控制器将这多个存储介质虚拟化为不同类型的存储单元(英文:storage unit)。例如采用块存储的情况下,存储控制器将这多个存储介质虚拟成一个或多个逻辑单元号(英文全称:logical unit number,缩写:LUN),客户端的每个IO请求指向某一个LUN;采用文件存储的情况下,客户端的每个IO请求指向某一个文件系统;采用对象(英文:object)存储的情况下,客户端的每个IO请求指向某一个桶(英文:bucket)。
出于业务需要,用户常需要为不同存储单元设置IO每秒(英文:input output per second,缩写:IOPS)参数。如果客户端发送的IO请求数量较高,这些IO请求指向不同的存储单元,而由于存储控制器处理IO请求的速度有限,因此存储控制器需要对接收的进行调度来尽量达成该多个存储单元的QOS参数。
现有的IO请求的调度方法的IOPS参数达成率较低。
发明内容
本申请提供了一种存储控制器,以提升IOPS的达成率。
本申请的第一方面,提供了一种存储控制器,该存储控制器适用于有多个存储单元的存储系统,该存储控制器包括:内存设备和多个核心,这多个核心中包括至少一个分发核心,多个排序核心和至少一个请求处理核心。该内存设备内还存储有多个IO请求,每个IO请求指向一个存储单元,该内存设备内还为每个存储单元设置有对应的共享处理排序索引。
该分发核心工作时执行该内存设备中存储的代码以执行以接收存储于该内存设备中的IO请求,并将该接收的IO请求分发至该多个排序核心。
各个排序核心工作时执行该内存设备中存储的代码以执行以下动作:获取由该分发核心分发的待生成处理排序索引的IO请求;确定该待生成处理排序索引的IO请求指向的目标存储单元;获取该目标存储单元的IO每秒参数;根据该目标存储单元对应的共享处理排序索引的值和该目标存储单元的IO每秒参数,为该待生成处理排序索引的IO请求生成处理排序索引;用该待生成处理排序索引的IO请求的处理排序索引,更新该目标存储单元对应的共享处理排序索引;将该待生成处理排序索引的IO请求的处理排序索引存入该各个排序核心对应的索引队列中,该各个排序核心对应的索引队列存储于该内存设备且包含了该各个排序核心为指向该多个存储单元的IO请求生成的处理排序索引。
该请求处理核心工作时执行该内存设备中存储的代码以周期性的处理该各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
上述分发核心,多个排序核心和请求处理核心可以并行工作。
该存储控制器为每个IO请求生成处理排序索引,并根据各个IO请求的处理排序索引的大小确定处理顺序,有效提升了IOPS参数的达成率。
同时,每个排序核心在为IO请求生成处理排序索引时,无须访问其他排序核心以获取其他排序核心为IO请求生成处理排序索引的情况,提升了处理效率。
本申请的任一方面或任一方面的任一实现方式中提及的该各个排序核心,指代该多个排序核心中的任一个排序核心。
本申请的任一方面或任一方面的任一实现方式中提及的该目标存储单元为逻辑单元号LUN、文件系统或桶。
结合第一方面,在第一方面的第一种实现方式中,该各个排序核心通过以下操作为该待生成处理排序索引的IO请求生成处理排序索引:
根据该目标存储单元对应的共享处理排序索引的值与K和该目标存储单元的IO每秒参数之比的和,计算该待生成处理排序索引的IO请求的处理排序索引,K为正数。
结合第一方面的第一种实现方式中,在第一方面的第二种实现方式中,该各个排序核心在计算该待生成处理排序索引的IO请求的处理排序索引前,还用于获取当前系统时间。
因此,该各个排序核心通过以下操作计算该待生成处理排序索引的IO请求的处理排序索引:
将根据该目标存储单元对应的共享处理排序索引的值与K和该目标存储单元的IO每秒参数之比的和,与该当前系统时间之间的较大者,作为该待生成处理排序索引的IO请求的处理排序索引。
将系统时间考虑入处理排序索引的计算中,提升了IO请求的调度精度。
结合第一方面的第一种实现方式或第二种实现方式,在第一方面的第三种实现方式中,该各个排序核心,还用于:在为该待生成处理排序索引的IO请求生成处理排序索引后的时刻,确定没有被分发至该各个排序核心的指向该目标存储单元的还未被生成索引的IO请求。
随后，计算该时刻下的该目标存储单元对应的共享处理排序索引的值，与K和该目标存储单元的IO每秒参数之比的和，以作为等待处理排序索引。并且，将该等待排序索引存入该各个排序核心对应的索引队列中。该各个排序核心运行过程中，一旦确定被分发给该各个排序核心的还未被生成索引的IO请求中，已经没有指向该目标存储单元的IO请求，则生成该等待处理排序索引。
结合第一方面的第三种实现方式,在第一方面的第四种实现方式中,该等待处理排序索引在该各个排序核心对应的索引队列的存在期间,该各个排序核心对应的索引队列包含的大于该等待处理排序索引的处理排序索引对应的IO请求不能被该请求处理核心处理。
该各个排序核心还用于,在该时刻后该各个排序核心被分发了指向该目标存储单元的IO请求或该等待处理排序索引在该各个排序核心对应的索引队列的存在时间超过预设的阈值的情况下,从该各个排序核心对应的索引队列中消除该等待处理排序索引。
等待处理排序索引的运用可以提升IO请求的调度精度。
结合第一方面或第一方面的任一种实现方式,在第一方面的第六种实现方式中,该请求处理核心通过以下操作周期性的处理该各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求:周期性的访问该各个排序核心对应的索引队列;处理每次访问中,该各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
本申请第二方面提供了一种IO请求处理方法,前述第一方面提供的存储控制器运行时,执行该方法。该方法包括:该分发核心接收IO请求,并将该接收的IO请求分发至该多个排序核心;各个排序核心获取由该分发核心分发的待生成处理排序索引的IO请求;该各个排序核心获取确定该待生成处理排序索引的IO请求指向的目标存储单元;该各个排序核心获取该目标存储单元的IO每秒参数;该各个排序核心根据该目标存储单元对应的共享处理排序索引和该目标存储单元的IO每秒参数,为该待生成处理排序索引的IO请求生成处理排序索引;该各个排序核心用该待生成处理排序索引的IO请求的处理排序索引,更新该目标存储单元对应的共享处理排序索引;该各个排序核心将该待生成处理排序索引的IO请求的处理排序索引存入该各个排序核心对应的索引队列中,该各个排序核心对应的索引队列存储于该内存设备且包含了该各个排序核心为指向该多个存储单元的IO请求生成的处理排序索引;该请求处理核心周期性的处理该各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
结合第二方面,在第二方面的第一种实现方式中,该各个排序核心根据该目标存储单元对应的共享处理排序索引和该目标存储单元的IO每秒参数,为该待生成处理排序索引的IO请求生成处理排序索引包括:根据该目标存储单元对应的共享处理排序索引与K和该目标存储单元的IO每秒参数之比的和,计算该待生成处理排序索引的IO请求的处理排序索引,K为正数。
结合第二方面的第一种实现方式,在第二方面的第二种实现方式中,在该各个排序核心计算该待生成处理排序索引的IO请求的处理排序索引前,该方法还包括:该各个排序核心获取当前系统时间;则该各个排序核心根据该目标存储单元对应的共享处 理排序索引与K和该目标存储单元的IO每秒参数之比的和,计算该待生成处理排序索引的IO请求的处理排序索引包括:该各个排序核心将根据该目标存储单元对应的共享处理排序索引与K和该目标存储单元的IO每秒参数之比的和,与该当前系统时间之间的较大者,作为该待生成处理排序索引的IO请求的处理排序索引。
结合第二方面的第一种实现方式和第二种实现方式,在第二方面的第三种实现方式中,该方法还包括:该各个排序核心在为该待生成处理排序索引的IO请求生成处理排序索引后的时刻,确定没有被分发至该各个排序核心的指向该目标存储单元的还未被生成索引的IO请求;该各个排序核心计算该时刻下的该目标存储单元对应的共享处理排序索引的值,与K和该目标存储单元的IO每秒参数之比的和,以作为等待处理排序索引;该各个排序核心将该等待处理排序索引存入该各个排序核心对应的索引队列中。
结合第二方面的第三种实现方式,在第二方面的第四种实现方式中,该等待处理排序索引在该各个排序核心对应的索引队列的存在期间,该各个排序核心对应的索引队列包含的大于该等待处理排序索引的处理排序索引对应的IO请求不能被该请求处理核心处理;该各个排序核心,在该时刻后该各个排序核心被分发了指向该目标存储单元的IO请求或该等待处理排序索引在该各个排序核心对应的索引队列的存在时间超过预设的阈值的情况下,从该各个排序核心对应的索引队列中消除该等待处理排序索引。
结合第二方面或第二方面的任一种实现方式,在第二方面的第五种实现方式中,该请求处理核心周期性的处理该各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求包括:
周期性的访问该各个排序核心对应的索引队列;
处理每次访问中,该各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
本申请第三方面提供了一种存储介质,该存储介质中存储了程序代码,该程序代码被存储控制器运行时,该存储控制器执行前述第二方面或第二方面的任一实现方式提供的IO请求处理方法。该存储介质包括但不限于只读存储器,随机访问存储器,快闪存储器、HDD或SSD。
本申请第四方面提供了一种计算机程序产品,该计算机程序产品包括程序代码,当该计算机程序产品被存储控制器执行时,该存储控制器执行前述第二方面或第二方面的任一实现方式提供的IO请求处理方法。该计算机程序产品可以为一个软件安装包,在需要使用前述第二方面或第二方面的任一实现方式提供的IO请求处理方法的情况下,可以下载该计算机程序产品至存储控制器并在该存储控制器上运行该计算机程序产品。
附图说明
图1为本申请实施例提供的存储系统的组织结构示意图;
图2为存储控制器的组织结构示意图;
图3为内存设备的结构示意图;
图4为另一内存设备的结构示意图;
图5为生成处理排序索引的一个过程示意图;
图6为生成处理排序索引的另一个过程示意图;
图7-1至7-5为生成处理排序索引另一个的过程示意图。
具体实施方式
下面结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述。
贯穿本说明书,处理器包括一个或多个中央处理单元(英文全称:central processing unit),每个中央处理单元包括一个或多个核心(英文:core)。
贯穿本说明书,存储单元可以为LUN,文件系统或桶,分别对应存储系统采用块存储,文件存储或对象存储的情况。示例性的,本说明书中的存储系统对客户端呈现P个LUN,P为大于1的正整数。
贯穿本说明书,IO请求包括IO数据和元数据。其中,IO数据包括了该IO请求待操作的数据、待操作的数据的地址等信息,元数据包括了IO请求的目标存储单元ID,该目标存储单元ID可以为LUN ID,文件系统ID或桶ID。
贯穿本说明书,函数Max{x,y}的功能为:返回x和y中较大值。
贯穿本说明书,IOPS参数可以为某一存储单元的IOPS,或某一存储单元的IOPS处理权重。其中,IOPS处理权重指代存储阵列用于处理指向各个存储单元的IO请求的资源的比例。因此,IOPS参数可以是用户根据业务需求设置的,例如,用户根据业务需求判断与某一业务相关的存储单元的IOPS最低为多少,或用户判断与某一业务相关的存储单元的IO请求需要占用存储阵列多大权重的资源。IOPS参数还可以是根据用户的等级设置的,例如高级用户的IOPS参数较高,以保证高级用户的体验。存储控制器内存储有不同存储单元的IOPS参数。
本申请实施例所应用的存储控制器架构
如图2所示,存储控制器的处理器包括多个核心、内存设备和通信接口。每个核心与该内存设备建立通信连接。存储控制器通过该通信接口与客户端和存储介质进行通信。从通信接口获取的IO请求被存入内存设备的IO存储空间。
对IO请求进行调度和处理的过程中主要有三类程序发挥作用,即IO请求分发程序、IO请求排序程序和IO请求处理程序。IO请求分发程序、IO请求排序程序和IO请求处理程序均由核心运行内存设备中的代码实现。运行IO请求分发程序的核心被称为分发核心,运行IO请求排序程序的核心被称为排序核心,运行IO请求处理程序的核心被称为请求处理核心。
分别分配多少个核心用于这三类程序可以根据这三类程序的运行压力进行调度,各个程序运行于哪个核心上也可以根据每个核心的负载状况进行迁移。在图2中示例性的,核心1用于执行IO请求分发程序,核心2至核心n用于执行IO请求排序程序,核心n+1至核心n+m用于执行IO请求处理程序,核心n+m+1用于执行存储控制器的操 作系统。
IO请求分发程序对IO存储空间中的IO请求进行分发,将IO请求分发至各个运行了IO请求排序程序的核心的子空间内。在图2中示例性的,核心2至核心n的子空间位于核心1的空间内,实际中,核心2至核心n的子空间也可以位于核心1的空间外,或者分别位于每个核心的空间内。
IO请求分发程序分发IO请求的过程中,主要考虑每个IO请求排序程序后续的负载均衡,并不考虑将指向某一LUN的IO请求全部分发至某个核心的空间内。例如,IO请求分发程序可以将接收到的多个IO请求轮流发给每个IO请求排序程序,以保证为每个IO请求排序程序分发的IO请求数量相同。
随后,各个IO请求排序程序读取分发给自己的IO请求并对其进行排序,排序结果存储于各个IO请求排序程序所在的核心的空间内的索引队列中。该索引队列可以通过不同数据结构实现,例如堆(英文:pile)、先进先出的队列等。各个IO请求排序程序为自己的子空间内的每个IO请求生成一个处理排序索引,然后对其索引队列中的各个IO请求的处理排序索引进行排序,排序小的IO请求会优先被空闲的IO请求处理程序处理。
IO请求处理程序具体可以根据IO请求的类型,执行IO请求对应的写操作或读操作,IO请求处理程序也可以用于对IO请求携带的数据进行排布或重删等。
以该存储控制器所在的存储阵列采用块存储且该存储阵列的存储介质被虚拟化成100个LUN为例。由于业务需求,需要为有些LUN设置IOPS参数。存储控制器接到的每个IO请求指向于某一个LUN,由于客户端生成IO请求的速度不定,每秒产生的指向于不同LUN的IO请求的数量可能会有较大差异。由于IO请求处理程序的处理效率有限,因此IO请求排序程序的排序结果会影响各个LUN的IOPS参数能否达成。例如,LUN1的IOPS参数为1000,LUN2的IOPS参数为200,但由于一段时间内生成的指向LUN 2 IO请求较多,导致某一时刻内存设备的IO存储空间中存储了指向LUN 1的1000个IO请求和指向LUN 2的2000个IO请求。这3000个IO请求被分发至核心2至核心n+1上的IO请求排序程序进行排序。如果每个IO请求排序程序仅根据LUN 1的IOPS参数和LUN 2的IOPS参数对IO请求进行调度,则最终这n个IO请求排序程序一般难以达成LUN 1的IOPS参数和LUN 2的IOPS参数。如果各个IO请求排序程序在生成处理排序索引的过程中互相通信,虽然有助于达成各个LUN的IOPS的下限值,但各个IO请求排序程序之间的通信开销将会很高。
本申请提供了一种IO请求处理方法。该方法适用于图2所示的存储控制器。
图2所示的存储控制器的内存设备中为每个存储单元维护了一个共享处理排序索引,这P个共享处理排序索引可以被每个IO调度程序读写。每个共享处理排序索引的初始值相同,示例性的,每个共享处理排序索引的初始值可以为0。
实际中,共享处理排序索引实现方式可以有多种实现方式。例如,可以将各个共享处理排序索引合并为一张表设置于内存设备的存储空间内。全部的共享处理排序索引在该存储控制器开始分发IO描述信息之前,由操作系统建立。
如图3所示,通信接口接收客户端发送的多个IO请求,并将该多个IO请求存入 IO存储空间。
IO请求分发程序为每个IO请求生成IO描述信息,并建立每个IO请求和该IO请求的IO描述信息的映射关系。每个IO请求的IO描述信息包括了该IO请求的元数据中携带的LUN ID。
由于IO请求占用的空间较大,因此在对IO请求进行排序的过程中可以为每个IO请求生成IO描述信息,后续IO请求排序程序根据IO描述信息来生成处理排序索引,以降低内存设备的读写负担。
如图4所示,IO请求分发程序将多个IO描述信息分发至各个IO请求排序程序所在的核心的子空间。
IO请求分发程序可以在各个IO请求排序程序所在的核心的子空间内,为每个LUN构建一个队列,并将分配至一个IO请求排序程序所在的核心的子空间内的IO描述信息分别存入各个LUN的队列中,以便后续步骤中,IO请求排序程序识别每个IO描述信息指向的LUN。
以下通过图5,介绍核心2上与运行的IO请求排序程序如何为一个IO描述信息生成处理排序索引,每个IO请求排序程序在运行过程中均用相同的方法为每个IO描述信息生成处理排序索引。
图5中,IO描述信息A-B-C指示了核心A上运行的IO请求排序程序被分发的第C个指向LUN B的IO描述信息。相应的,处理排序索引A-B-C指示IO描述信息A-B-C的处理排序索引。
以核心2上运行的IO请求排序程序当前为IO描述信息2-1-3生成处理排序索引为例。因此,IO描述信息2-1-1和IO描述信息2-1-2的处理排序索引已经被IO请求排序程序存入核心2的索引队列中。
核心2上运行的IO请求排序程序从自己子空间内获取IO描述信息2-1-3,获取IO描述信息2-1-3对应的LUN ID。
核心2上运行的IO请求排序程序根据该LUN ID,获取LUN 1的IOPS参数。
核心2上运行的IO请求排序程序根据该LUN ID,获取LUN 1对应的共享处理排序索引的值,也即共享处理排序索引1的值。
核心2上运行的IO请求排序程序通过操作系统接口,获取当前系统时间。
该当前系统时间具体可以指代存储控制器从启动至调用操作系统接口期间经过的纳秒数。
核心2上运行的IO请求排序程序计算IO描述信息2-1-3的处理排序索引2-1-3。
其中,处理排序索引2-1-3=Max{该共享处理排序索引1的值+K/LUN 1的IOPS参数,当前系统时间}。K为正数,常见的K的取值为1。
核心2上运行的IO请求排序程序用处理排序索引2-1-3更新LUN 1对应的共享处理排序索引。
每个IO请求排序程序生成对应于某一LUN的IO描述信息的处理排序索引后,均用生成的处理排序索引更新该LUN对应的共享处理排序索引。因此,这P个共享计数器分别记录了最新处理的指向各个LUN的IO请求的处理排序索引。因此,任一IO请求排序程序计算指向LUN p的IO描述信息的处理排序索引时,共享处理排序索引p 的值等于上一个指向LUN p的IO描述信息的处理排序索引。LUN p为存储系统呈现的P个LUN之任一。
如果核心2上运行的IO请求排序程序当前为处理排序索引为2-1-1生成处理排序索引。则处理排序索引2-1-1=Max{初始处理排序索引+K/LUN 1的IOPS参数,当前系统时间}。该初始处理排序索引可以为0。
核心2上运行的IO请求排序程序将处理排序索引2-1-3存入核心2的索引队列。
IO描述信息2-1-3与处理排序索引2-1-3之间建立有对应关系,或者处理排序索引2-1-3与生成IO描述信息2-1-3的IO请求之间建立有对应关系,以便后续步骤中能够确定处理排序索引2-1-3对应的IO请求。
通过以上步骤,各个IO请求排序程序为分发给自己的每个IO描述信息生成处理排序索引并存入各自的索引队列中。因此,核心2至核心n的索引队列中,存有一个或多个未被处理的IO请求的处理排序索引。
任一核心上运行的IO请求处理程序处理完毕一个IO请求后,操作系统得知该IO请求处理程序进入空闲状态。
操作系统内记录了每个IO请求处理程序的处理顺序,即一个IO请求处理程序进入空闲后,该空闲的IO请求处理程序后续处理哪个索引队列中最小的处理排序索引对应的IO请求。为了保证IOPS参数的达成,该处理顺序需要使得一个IO请求处理程序处理各个索引队列内的处理排序索引的频率相同或者接近,也即一个IO请求处理程序周期性的处理每个索引队列中最小的处理排序索引对应的IO请求。
该处理顺序可以为每个IO请求处理程序按照核心2至核心n的轮询每个索引队列,并处理每次访问的索引队列中最小的处理排序索引对应的IO请求。每处理完一个索引队列中最小的处理排序索引对应的IO请求后,空闲的IO请求处理程序处理下一个索引队列中最小的处理排序索引对应的IO请求。
或者,如果m=n-1,也即IO请求排序程序和IO请求处理程序的数量相同,则操作系统将IO请求处理程序和索引队列一一绑定,则一个IO请求处理程序进入空闲后,操作系统确定该空闲的IO请求处理程序接下来处理该空闲的IO请求处理程序对应的索引队列中最小的处理排序索引对应的IO请求。
操作系统确定了该空闲的IO请求处理程序处理哪个索引队列中最小的处理排序索引对应的IO请求后,由该操作系统从该索引队列中选取最小的处理排序索引,并通知该空闲的IO请求处理程序处理最小的处理排序索引对应的IO请求,或者该操作系统指示该空闲的IO请求处理程序访问该索引队列,并处理该索引队列中最小的处理排序索引对应的IO请求。
部分以上步骤的执行顺序可以调整,获取LUN 1的IOPS参数和获取当前系统时间的步骤,都可以在生成处理排序索引2-1-3前任意时刻执行。
前述获取系统时间的步骤为可选步骤,当不执行该步骤时,生成的处理排序索引2-1-3=该共享处理排序索引1的值+K/LUN 1的IOPS参数。
由于空闲的IO请求处理程序会优先处理各个索引队列内最小的处理排序索引对应的IO请求。因此,对于核心2上运行的IO请求排序程序而言,如果在一段时间内没有被分发指向某一LUN的IO请求,而在该段时间内不断被分发指向其他LUN的IO 请求。那么,接下来的时间内,指向该LUN的IO请求一旦分发至核心2上运行的IO请求排序程序,这些指向该LUN的IO请求的处理排序索引可能将会比指向其他LUN的IO请求的处理排序索引都要小,导致指向该LUN的IO请求会持续优先的被空闲的IO请求处理程序处理,使得指向其他LUN的IO请求饥饿。因此,将系统时间考虑入处理排序索引的计算中,避免了存在长期闲置的LUN的情况下,后续指向该LUN的IO请求被分发到核心2上运行的IO请求排序程序的子空间后,对指向其他LUN的IO请求的阻塞,提升了IO请求的调度精度。
例如,在时刻1,核心2的索引队列内记录的处理排序索引包括:
处理排序索引2-1-1=3    处理排序索引2-2-1=2.2
处理排序索引2-1-2=3.5  处理排序索引2-2-2=2.8
处理排序索引2-1-3=5.5  处理排序索引2-2-3=3.0
在时刻2,时刻1时已经存储于索引队列中的处理排序索引对应的IO请求都已经被处理了,时刻1和时刻2之间生成的新的处理排序索引包括:
处理排序索引2-1-4=6
处理排序索引2-1-5=7.5
处理排序索引2-1-6=9.5
处理排序索引2-1-7=10.5
处理排序索引2-1-8=12
也即,时刻1至时刻2期间,核心2上运行的IO请求排序程序没有被分发新的指向LUN 2的IO描述信息。因此,时刻2后,如果未将系统时间考虑入处理排序索引的计算,则如果核心2上运行的IO请求排序程序被分发了新的指向LUN 2的IO描述信息,则这些新分发的指向LUN 2的IO描述信息的处理排序索引将会比指向LUN 1的IO描述信息的处理排序索引小很多,导致空闲的IO请求处理程序访问核心2的索引队列时,将持续处理新分发的指向LUN 2的IO请求。而如果将系统时间考虑入处理排序索引的计算,则这些新分发的指向LUN 2的IO描述信息的处理排序索引可能等于当前系统时间,将不会比指向LUN 1的IO描述信息的处理排序索引小很多。
因此,将系统时间考虑入处理排序索引的计算中,避免了存在一个时间段内闲置的LUN的情况下,后续指向该闲置的LUN的IO请求来到后,对指向其他LUN的IO请求的阻塞,提升了IO请求的调度精度。
以上步骤执行的过程中,如果核心2上运行的IO请求排序程序在某一时刻确定核心2的索引队列中,指向LUN 1的IO描述信息已经被处理完毕,如图6,在处理排序索引2-1-5生成完毕后,如果被分发给核心2上运行的IO请求排序程序的IO描述信息中,没有未被生成处理排序索引且指向LUN 1的IO描述信息,则核心2上运行的IO请求生成等待处理排序索引并将该等待处理排序索引存入核心2的索引队列中。
该等待处理排序索引=该时刻下的LUN 1对应的共享处理排序索引的值+K/LUN 1的IOPS参数。
该等待处理排序索引在以下两种情况之一会被消除,其一,核心2上运行的IO请求排序程序被分发了新的指向LUN 1的IO描述信息,其二,该等待处理排序索引的 存在时间超过预设阈值。
等待处理排序索引的生成为可选步骤。每个索引队列中的处理排序索引与等待处理排序索引一同排序,如果操作系统在为空闲IO请求处理程序选取接下来处理的IO请求的过程中,确定当前某一索引队列内最小的处理排序索引为等待处理排序索引,由于等待处理排序索引并不对应于任何一个IO请求,因此该空闲IO请求处理程序无法处理该索引队列内的处理排序索引对应的IO请求。操作系统可以为该空闲的IO请求处理程序重新选择一个索引队列。
例如,空闲的IO请求处理程序按照核心2至核心n的顺序轮询各个核心的索引队列,则空闲的IO请求处理程序如果当前轮询到核心3的索引队列,但发现核心3的索引队列中最小的处理排序索引为等待处理排序索引,则该空闲的IO请求处理程序跳过核心3的索引队列,访问核心4的索引队列。
由于每个IO请求排序程序被分发的指向不同LUN的IO描述信息的数量不同,并且IO请求排序程序为IO请求生成处理排序索引的速度一般远高于IO请求处理程序处理IO请求的速度。因此,如果不生成等待处理排序索引且核心2上运行的IO请求排序程序被分发的指向LUN 2的IO描述信息很少,那么可能导致核心2的索引队列内,处理排序索引最小的IO描述信息始终指向LUN 1,导致空闲的IO请求处理程序不断处理指向LUN 1的IO请求,最终引起LUN 2的IOPS参数难以达成。
例如,在时刻1,核心2的索引队列内存储的处理排序索引包括:
处理排序索引2-1-1=3   处理排序索引2-2-1=2.2
处理排序索引2-1-2=3.5  处理排序索引2-2-2=2.8
处理排序索引2-1-3=5.5  处理排序索引2-2-3=3.0
在时刻2,部分时刻1已经生成的处理排序索引对应的IO请求已经被处理了,同时,时刻1至时刻2期间又有新的处理排序索引被生成,核心2的索引队列内记录的处理排序索引包括:
处理排序索引2-1-4=6   处理排序索引2-2-4=3.4
处理排序索引2-1-5=7.5
处理排序索引2-1-6=9.5
如果接下来一段时间内,没有指示LUN 2的IO描述信息被分发到核心2上运行的IO请求排序程序,而指向LUN 1的IO描述信息不断被分发到核心2上运行的IO请求排序程序。则处理排序索引2-2-4对应的IO请求被处理后,如果不生成等待处理排序索引,则空闲的IO请求处理程序一旦访问核心2的索引队列,将不断处理指向LUN 1的IO请求,导致LUN 2的IOPS参数无法达成。与之相对的,生成等待处理排序索引2-2-6,在等待处理排序索引2-2-6被消除之前,核心2的索引队列中比等待处理排序索引2-2-6大的处理排序索引将无法被空闲的IO请求处理程序处理,导致空闲的IO请求处理程序需要访问其他索引队列。因此,采用等待处理排序索引可以提升IO请求的调度精度,提升IOPS参数的达成率。
以处理器包含3个排序核心为例,图7-1为初始状态,共享处理排序索引1和共享处理排序索引2的初始值均为0。图7-1中,IO描述信息a-b-c指示被分发至核心 a的指向LUN b的第c个IO描述信息。指向LUN 1的IO描述信息和指向LUN 2的IO描述信息的初始处理排序索引均为0。LUN 1的IOPS参数为1000,LUN 2的IOPS参数为500,K=1。
T1时刻,核心2上运行的IO请求排序程序为IO描述信息2-1-1计算处理排序索引,IO描述信息2-1-1的处理排序索引为0.001。随后将共享处理排序索引1的值更新为0.001,如图7-2。
T2时刻,核心3上运行的IO请求排序程序为IO描述信息3-1-1计算处理排序索引,处理排序索引3-1-1为0.002。随后将共享处理排序索引1的值更新为0.002,如图7-3。
T3时刻,核心4上运行的IO请求排序程序为IO描述信息4-1-1计算处理排序索引,处理排序索引4-1-1为0.003,随后将共享处理排序索引1的值更新为0.003。
T4时刻,核心4上运行的IO请求排序程序为IO描述信息4-2-1计算处理排序索引,IO描述信息4-2-1的处理排序索引为0.002,随后将共享处理排序索引2的值更新为0.002,如图7-4。
T5时刻,核心2上运行的IO请求排序程序为IO描述信息2-1-2计算处理排序索引,处理排序索引2-1-2为0.004,随后将共享处理排序索引1的值更新为0.004。
T6时刻,核心2上运行的IO请求排序程序为IO描述信息2-2-1计算处理排序索引,处理排序索引2-2-1为0.004,随后将共享处理排序索引2的值更新为0.004,如图7-5。
核心2、核心3、核心4后续生产处理排序索引的过程以此类推。
图7-1至图7-5对应的示例中未将系统时间对处理排序索引的生成过程的影响考虑在内。
如图2所示,本申请所应用的存储控制器包括总线、处理器、内存设备和通信接口。处理器、内存设备和通信接口之间通过总线通信。
内存设备可以包括易失性存储器(英文:volatilememory),例如随机存取存储器(英文:random access memory,缩写:RAM)。
通信接口包括网络接口和存储介质访问接口,分别用于获取客户端发来的IO请求和访问存储介质。
内存设备中存储有执行IO请求分发程序、IO请求排序程序、IO请求处理程序和操作系统所需的代码。存储控制器运行时,处理器中的各个核心调用内存设备中存储的代码,以执行前文提供的IO请求处理方法。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
结合本申请公开内容所描述的方法可以由处理器执行软件指令的方式来实现。软件指令可以由相应的软件模块组成,软件模块可以被存放于RAM、快闪存储器、只读存储器(英文:read only memory,缩写:ROM)、可擦除可编程只读存储器(英文:erasable programmable read only memory,缩写:EPROM)、电可擦可编程只读存储器(英文:electrically erasable programmable read only memory,缩写:EEPROM)、 硬盘、光盘或者本领域熟知的任何其它形式的存储介质中。
本领域技术人员应该可以意识到,在上述一个或多个示例中,本申请所描述的功能可以用硬件或软件来实现。当使用软件实现时,可以将这些功能存储在计算机可读介质中或者作为计算机可读介质上的一个或多个指令或代码进行传输。存储介质可以是通用或专用计算机能够存取的任何可用介质。
以上该的具体实施方式,对本申请的目的、技术方案和有益效果进行了进一步详细说明,所应理解的是,以上该仅为本申请的具体实施方式而已,并不用于限定本申请的保护范围,凡在本申请的技术方案的基础之上,所做的任何修改、改进等,均应包括在本申请的保护范围之内。

Claims (12)

  1. 一种存储控制器,其特征在于,所述存储控制器适用于有多个存储单元的存储系统,包括至少一个分发核心、多个排序核心、至少一个请求处理核心和内存设备,所述内存设备内存储有多个输入输出IO请求,每个IO请求指向一个存储单元,所述内存设备内还为每个存储单元设置有对应的共享处理排序索引;
    各个分发核心,用于接收IO请求,并将所述接收的IO请求分发至所述多个排序核心;
    各个排序核心,用于:
    获取由所述各个分发核心分发的待生成处理排序索引的IO请求;
    确定所述待生成处理排序索引的IO请求指向的目标存储单元;
    获取所述目标存储单元的IO每秒参数;
    根据所述目标存储单元对应的共享处理排序索引的值和所述目标存储单元的IO每秒参数,为所述待生成处理排序索引的IO请求生成处理排序索引;
    用所述待生成处理排序索引的IO请求的处理排序索引,更新所述目标存储单元对应的共享处理排序索引;
    将所述待生成处理排序索引的IO请求的处理排序索引存入所述各个排序核心对应的索引队列中,所述各个排序核心对应的索引队列存储于所述内存设备且包含了所述各个排序核心为指向所述多个存储单元的IO请求生成的处理排序索引;
    各个请求处理核心,用于周期性的处理所述各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
  2. 如权利要求1所述的存储控制器,其特征在于,所述各个排序核心通过以下操作为所述待生成处理排序索引的IO请求生成处理排序索引:
    根据所述目标存储单元对应的共享处理排序索引的值与K和所述目标存储单元的IO每秒参数之比的和,计算所述待生成处理排序索引的IO请求的处理排序索引,K为正数。
  3. 如权利要求2所述的存储控制器,其特征在于,所述各个排序核心在计算所述待生成处理排序索引的IO请求的处理排序索引前,还用于获取当前系统时间;则
    所述各个排序核心通过以下操作计算所述待生成处理排序索引的IO请求的处理排序索引:
    将根据所述目标存储单元对应的共享处理排序索引的值与K和所述目标存储单元的IO每秒参数之比的和,与所述当前系统时间之间的较大者,作为所述待生成处理排序索引的IO请求的处理排序索引。
  4. 如权利要求2或3所述的存储控制器,其特征在于,所述各个排序核心,还用于:
    在为所述待生成处理排序索引的IO请求生成处理排序索引后的时刻,确定没有被分发至所述各个排序核心的指向所述目标存储单元的还未被生成索引的IO请求;
    计算所述时刻下的所述目标存储单元对应的共享处理排序索引的值,与K和所述目标存储单元的IO每秒参数之比的和,以作为等待处理排序索引;
    将所述等待处理排序索引存入所述各个排序核心对应的索引队列中。
  5. 如权利要求4所述的存储控制器,其特征在于,所述等待处理排序索引在所述各个排序核心对应的索引队列的存在期间,所述各个排序核心对应的索引队列包含的大于所述等待处理排序索引的处理排序索引对应的IO请求不能被所述各个请求处理核心处理;
    所述各个排序核心还用于,在所述时刻后所述各个排序核心被分发了指向所述目标存储单元的IO请求或所述等待处理排序索引在所述各个排序核心对应的索引队列的存在时间超过预设的阈值的情况下,从所述各个排序核心对应的索引队列中消除所述等待处理排序索引。
  6. 如权利要求1至5任一所述的存储控制器,其特征在于,所述各个请求处理核心通过以下操作周期性的处理所述各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求:
    周期性的访问所述各个排序核心对应的索引队列;
    处理每次访问中,所述各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
  7. 一种输入输出IO请求处理方法,其特征在于,所述方法由包含多个存储单元的存储系统的存储控制器执行,所述存储控制器包括内存设备、至少一个分发核心、多个排序核心和至少一个请求处理核心,所述内存设备内存储有多个IO请求,每个IO请求指向一个存储单元,所述内存设备内还为每个存储单元设置有对应的共享处理排序索引,所述方法包括:
    各个分发核心接收IO请求,并将所述接收的IO请求分发至所述多个排序核心;
    各个排序核心获取由所述各个分发核心分发的待生成处理排序索引的IO请求;
    所述各个排序核心获取确定所述待生成处理排序索引的IO请求指向的目标存储单元;
    所述各个排序核心获取所述目标存储单元的IO每秒参数;
    所述各个排序核心根据所述目标存储单元对应的共享处理排序索引和所述目标存储单元的IO每秒参数,为所述待生成处理排序索引的IO请求生成处理排序索引;
    所述各个排序核心用所述待生成处理排序索引的IO请求的处理排序索引,更新所述目标存储单元对应的共享处理排序索引;
    所述各个排序核心将所述待生成处理排序索引的IO请求的处理排序索引存入所述各个排序核心对应的索引队列中,所述各个排序核心对应的索引队列存储于所述内存设备且包含了所述各个排序核心为指向所述多个存储单元的IO请求生成的处理排序索引;
    各个请求处理核心周期性的处理所述各个排序核心对应的索引队列中最小的处理排序索引对应的IO请求。
  8. 如权利要求7所述的方法,其特征在于,所述各个排序核心根据所述目标存储单元对应的共享处理排序索引和所述目标存储单元的IO每秒参数,为所述待生成处理排序索引的IO请求生成处理排序索引包括:
    根据所述目标存储单元对应的共享处理排序索引与K和所述目标存储单元的IO每秒参数之比的和,计算所述待生成处理排序索引的IO请求的处理排序索引,K为正数。
  9. 如权利要求8所述的方法,其特征在于,在所述各个排序核心计算所述待生成处理排序索引的IO请求的处理排序索引前,所述方法还包括:所述各个排序核心获取当前系统时间;则
    所述各个排序核心根据所述目标存储单元对应的共享处理排序索引与K和所述目标存储单元的IO每秒参数之比的和,计算所述待生成处理排序索引的IO请求的处理排序索引包括:
    所述各个排序核心将根据所述目标存储单元对应的共享处理排序索引与K和所述目标存储单元的IO每秒参数之比的和,与所述当前系统时间之间的较大者,作为所述待生成处理排序索引的IO请求的处理排序索引。
  10. The method according to claim 8 or 9, wherein the method further comprises:
    determining, by the sorting core, at a moment after generating the processing sort index for the IO request for which a processing sort index is to be generated, that no IO request directed to the target storage unit for which no index has yet been generated has been distributed to the sorting core;
    calculating, by the sorting core, the sum of the value, at that moment, of the shared processing sort index corresponding to the target storage unit and the ratio of K to the IO-per-second parameter of the target storage unit, as a waiting processing sort index; and
    storing, by the sorting core, the waiting processing sort index into the index queue corresponding to the sorting core.
  11. The method according to claim 10, wherein, while the waiting processing sort index exists in the index queue corresponding to the sorting core, an IO request corresponding to a processing sort index in that index queue that is greater than the waiting processing sort index cannot be processed by the request processing cores; and
    the sorting core removes the waiting processing sort index from the index queue corresponding to the sorting core when, after said moment, an IO request directed to the target storage unit is distributed to the sorting core, or when the time for which the waiting processing sort index has existed in the index queue corresponding to the sorting core exceeds a preset threshold.
  12. The method according to any one of claims 7 to 11, wherein the periodically processing, by each request processing core, of the IO request corresponding to the smallest processing sort index in the index queue corresponding to each sorting core comprises:
    periodically accessing the index queue corresponding to each sorting core; and
    in each access, processing the IO request corresponding to the smallest processing sort index in the index queue corresponding to each sorting core.
PCT/CN2017/108194 2017-01-05 2017-10-28 Storage controller and IO request processing method WO2018126771A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17889889.6A EP3537281B1 (en) 2017-01-05 2017-10-28 Storage controller and io request processing method
US16/503,817 US10884667B2 (en) 2017-01-05 2019-07-05 Storage controller and IO request processing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710008824.7 2017-01-05
CN201710008824.7A CN106775493B (zh) 2017-01-05 2017-01-05 Storage controller and IO request processing method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/503,817 Continuation US10884667B2 (en) 2017-01-05 2019-07-05 Storage controller and IO request processing method

Publications (1)

Publication Number Publication Date
WO2018126771A1 (zh)

Family

ID=58949747

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/108194 WO2018126771A1 (zh) 2017-01-05 2017-10-28 Storage controller and IO request processing method

Country Status (4)

Country Link
US (1) US10884667B2 (zh)
EP (1) EP3537281B1 (zh)
CN (2) CN109799956B (zh)
WO (1) WO2018126771A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6773229B2 (ja) 2016-12-29 2020-10-21 Huawei Technologies Co., Ltd. Storage controller and IO request processing method
CN109799956B (zh) * 2017-01-05 2023-11-17 Huawei Technologies Co., Ltd. Storage controller and IO request processing method
US11194735B2 (en) * 2017-09-29 2021-12-07 Intel Corporation Technologies for flexible virtual function queue assignment
US11005970B2 (en) * 2019-07-24 2021-05-11 EMC IP Holding Company LLC Data storage system with processor scheduling using distributed peek-poller threads
CN111708491B (zh) 2020-05-29 2022-11-04 Suzhou Inspur Intelligent Technology Co., Ltd. Random write method and apparatus
US20220374149A1 (en) * 2021-05-21 2022-11-24 Samsung Electronics Co., Ltd. Low latency multiple storage device system
US12067254B2 (en) 2021-05-21 2024-08-20 Samsung Electronics Co., Ltd. Low latency SSD read architecture with multi-level error correction codes (ECC)

Citations (5)

Publication number Priority date Publication date Assignee Title
CN102073461A (zh) * 2010-12-07 2011-05-25 Chengdu Huawei Symantec Technologies Co., Ltd. Input/output request scheduling method, storage controller, and storage array
CN105589829A (zh) * 2014-09-15 2016-05-18 Huawei Technologies Co., Ltd. Data processing method, apparatus, and system based on a multi-core processor chip
CN105892955A (zh) * 2016-04-29 2016-08-24 Huawei Technologies Co., Ltd. Method and device for managing a storage system
US20160313943A1 (en) * 2015-04-24 2016-10-27 Kabushiki Kaisha Toshiba Storage device that secures a block for a stream or namespace and system having the storage device
CN106775493A (zh) * 2017-01-05 2017-05-31 Huawei Technologies Co., Ltd. Storage controller and IO request processing method

Family Cites Families (31)

Publication number Priority date Publication date Assignee Title
US7917903B2 (en) 2003-03-27 2011-03-29 Hewlett-Packard Development Company, L.P. Quality of service controller and method for a data storage system
US7277984B2 (en) * 2004-06-23 2007-10-02 International Business Machines Corporation Methods, apparatus and computer programs for scheduling storage requests
US7646779B2 (en) 2004-12-23 2010-01-12 Intel Corporation Hierarchical packet scheduler using hole-filling and multiple packet buffering
US7823154B2 (en) 2005-09-16 2010-10-26 Hewlett-Packard Development Company, L.P. System and method for providing, by a plurality of schedulers, differentiated service to consumers of distributed resources
JP2007257180A (ja) 2006-03-22 2007-10-04 Hitachi Ltd Network node, switch, and network failure recovery method
US11010076B2 (en) * 2007-03-29 2021-05-18 Violin Systems Llc Memory system with multiple striping of raid groups and method for performing the same
CN100553331C (zh) * 2007-12-21 2009-10-21 北京天天宽广网络科技有限公司 Content distribution and storage system and method in a P2P-based video network
CN101272334B (zh) 2008-03-19 2010-11-10 Hangzhou H3C Technologies Co., Ltd. Method, apparatus, and device for processing QoS services using a multi-core CPU
CN101299181A (zh) 2008-07-08 2008-11-05 Hangzhou H3C Technologies Co., Ltd. Method and apparatus for disk-based I/O request caching, and SAN storage device
US20100030931A1 (en) 2008-08-04 2010-02-04 Sridhar Balasubramanian Scheduling proportional storage share for storage systems
CN101354664B (zh) * 2008-08-19 2011-12-28 ZTE Corporation Interrupt load balancing method and apparatus for a multi-core processor
US7912951B2 (en) * 2008-10-28 2011-03-22 Vmware, Inc. Quality of service management
US8037219B2 (en) * 2009-04-14 2011-10-11 Lsi Corporation System for handling parallel input/output threads with cache coherency in a multi-core based storage array
CN103299271B (zh) 2011-01-11 2016-04-13 惠普发展公司,有限责任合伙企业 并发请求调度
US8793463B2 (en) 2011-09-12 2014-07-29 Microsoft Corporation Allocation strategies for storage device sets
CN103577115B (zh) 2012-07-31 2016-09-14 Huawei Technologies Co., Ltd. Data arrangement processing method, apparatus, and server
US8943505B2 (en) * 2012-08-24 2015-01-27 National Instruments Corporation Hardware assisted real-time scheduler using memory monitoring
US8984243B1 (en) 2013-02-22 2015-03-17 Amazon Technologies, Inc. Managing operational parameters for electronic resources
CN103338252B (zh) * 2013-06-27 2017-05-24 Nanjing University of Posts and Telecommunications Implementation method of a concurrent storage virtual request mechanism for distributed databases
CN103412790B (zh) * 2013-08-07 2016-07-06 Nanjing Normal University Multi-core concurrent scheduling method and system for mobile security middleware
US9170943B2 (en) * 2013-08-29 2015-10-27 Globalfoundries U.S. 2 Llc Selectively enabling write caching in a storage system based on performance metrics
US9983801B1 (en) * 2013-09-21 2018-05-29 Avago Technologies General Ip (Singapore) Pte. Ltd. Priority queueing for low latency storage networks
CN104679575B (zh) 2013-11-28 2018-08-24 Alibaba Group Holding Limited Control system and method for input/output streams
WO2015127642A1 (en) * 2014-02-28 2015-09-03 Huawei Technologies Co., Ltd. Method for debugging computer program
JP2017512350A (ja) * 2014-03-08 2017-05-18 Diamanti Inc. Method and system for converged networking and storage
US9483187B2 (en) * 2014-09-30 2016-11-01 Nimble Storage, Inc. Quality of service implementation in a networked storage system with hierarchical schedulers
CN105934793A (zh) 2014-12-27 2016-09-07 Huawei Technologies Co., Ltd. Method for data distribution in a storage system, distribution apparatus, and storage system
CN104571960B (zh) * 2014-12-30 2018-01-16 Huawei Technologies Co., Ltd. IO request distribution apparatus and method, host, storage array, and computer system
US9575664B2 (en) * 2015-04-08 2017-02-21 Prophetstor Data Services, Inc. Workload-aware I/O scheduler in software-defined hybrid storage system
CN106155764A (zh) 2015-04-23 2016-11-23 Alibaba Group Holding Limited Method and apparatus for scheduling virtual machine input/output resources
CN105183375B (zh) 2015-08-31 2019-04-23 Chengdu Huawei Technologies Co., Ltd. Method and apparatus for controlling quality of service of hot-spot data


Non-Patent Citations (1)

Title
See also references of EP3537281A4 *

Also Published As

Publication number Publication date
CN109799956A (zh) 2019-05-24
EP3537281A1 (en) 2019-09-11
CN106775493A (zh) 2017-05-31
CN106775493B (zh) 2019-01-25
EP3537281A4 (en) 2019-11-27
EP3537281B1 (en) 2022-08-17
CN109799956B (zh) 2023-11-17
US20190332328A1 (en) 2019-10-31
US10884667B2 (en) 2021-01-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17889889

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017889889

Country of ref document: EP

Effective date: 20190604

NENP Non-entry into the national phase

Ref country code: DE