WO2020024207A1 - Service request processing method, device and storage system - Google Patents
- Publication number
- WO2020024207A1 (PCT/CN2018/098277, CN2018098277W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- processor cores
- request
- processor
- core
- cores
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Definitions
- the present application relates to the field of information technology, and more particularly, to a method, an apparatus, and a processor for processing a service request.
- the central processing unit (CPU) of the array controller is a key factor affecting system performance.
- a method for processing a service request in a storage system is provided, where the storage system includes multiple processor cores. The method includes: receiving a request of a current stage of a service request, where the request of the current stage is one of requests of multiple stages of the service request; determining a first processor core set for executing the request of the current stage, where the first processor core set is a set of processor cores among the multiple processor cores; and sending the request of the current stage to the lightest-loaded processor core in the first processor core set.
- compared with directly sending a service request to the lightest-loaded processor core among all processor cores in the storage system, the method of this application determines a processor core set for the request of each stage of a service request and schedules the request of the current stage within the scope of that set, which ensures load balancing between processor cores, takes into account the correlation between the requests of the stages and the factors that affect the delay of processing the requests by the processor cores, and thereby reduces the delay of processing service requests.
- the determining a first processor core set for executing the request of the current stage includes: querying a core-binding relationship to determine the first processor core set for executing the request of the current stage, where the core-binding relationship indicates an association between the request of the current stage and the first processor core set.
- the method further includes: re-determining, according to the first processor core set, the number of processor cores for executing the request of the current stage; allocating, among the multiple processor cores and according to the re-determined number, a second processor core set that satisfies the number to the request of the current stage; and generating a new core-binding relationship according to the second processor core set, where the new core-binding relationship indicates an association between the request of the current stage and the second processor core set.
- the re-determining, according to the first processor core set, the number of processor cores for executing the request of the current stage includes: determining a total utilization of the processor cores in the first processor core set and an average utilization of the multiple processor cores; and re-determining, according to the total utilization and the average utilization, the number of processor cores for executing the request of the current stage.
- by periodically monitoring the utilization of the processor cores in the storage system and, according to changes in the utilization of the processor cores allocated to the request of any stage, reallocating processor cores to the request of the corresponding stage, the processor cores allocated to the requests of the corresponding stages can be periodically adjusted, thereby improving load imbalance between processor cores.
- the re-determining includes: re-determining, according to the total utilization of the processor cores in the first processor core set and the average utilization of the multiple processor cores, the number of processor cores executing the request of the current stage based on the following relationship:
- N = U_P / U_ave
- where N is the re-determined number of processor cores executing the request of the current stage, U_P is the total utilization of the processor cores in the first processor core set, and U_ave is the average utilization of the multiple processor cores.
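The relation N = U_P / U_ave above can be sketched in a few lines of Python. This is illustrative only: the function name is invented, and rounding up while keeping at least one core is an assumption, since the text gives only the ratio and no rounding rule.

```python
from math import ceil

def redetermine_core_count(set_utilizations, all_utilizations):
    """Re-determine N for the current stage from N = U_P / U_ave.

    set_utilizations: per-core utilization of the first processor core set.
    all_utilizations: utilization of every processor core in the system.
    """
    u_p = sum(set_utilizations)                            # total utilization U_P
    u_ave = sum(all_utilizations) / len(all_utilizations)  # average U_ave
    # Rounding up and keeping at least one core are assumptions; the text
    # only states the ratio N = U_P / U_ave.
    return max(1, ceil(u_p / u_ave))
```

For example, a stage whose two cores run at 0.8 and 0.6 utilization, in a system averaging 0.5, would be re-assigned three cores.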
- the allocating, among the multiple processor cores, a second processor core set that satisfies the number to the request of the current stage includes: generating multiple sets of allocation results, where each set of allocation results includes, for the request of each stage, a processor core set that satisfies the corresponding re-determined number.
- multiple path lengths are determined for the multiple sets of allocation results, where each set of allocation results corresponds to one path length, and the path length L satisfies:
- L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}
- where c_{i,i+1} represents the communication volume generated by the interaction between the processor cores executing the requests of adjacent stages i and i+1, d_{i,i+1} represents the average topological distance between the processor cores executing the requests of the adjacent stages, and M is the number of stages of the service request; according to the set of allocation results corresponding to the shortest path length among the multiple path lengths, a second processor core set that satisfies the number is allocated to the request of the current stage.
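The shortest-path selection above can be sketched as follows. Only the rule that L sums c_{i,i+1}·d_{i,i+1} over adjacent stages and that the candidate with the shortest L wins comes from the text; the data layout (per-candidate lists of c and d values) is a hypothetical illustration.

```python
def path_length(c, d):
    """L = sum over adjacent stages i of c[i] * d[i], where c[i] is the
    communication volume between the cores executing the requests of
    stage i and stage i+1, and d[i] is the average topological distance
    between those cores (M stages give M-1 terms)."""
    return sum(ci * di for ci, di in zip(c, d))

def choose_allocation(candidates):
    """candidates: list of (core_sets, c, d) tuples, one per set of
    allocation results; return the core sets of the candidate whose
    path length is shortest."""
    return min(candidates, key=lambda cand: path_length(cand[1], cand[2]))[0]
```

A candidate with heavy inter-stage traffic placed on topologically distant cores accumulates a long path and loses to one that keeps chatty stages close together.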
- according to the determined number of processor cores allocated to the request of each stage, multiple sets of processor core allocation results are generated and multiple path lengths are determined for them; taking into account the topological distance between the processor cores allocated to the requests of the stages, the allocation result corresponding to the shortest path length among the multiple path lengths is determined as the final processor core allocation result, thereby ensuring load balancing between processor cores and reducing the delay of processing service requests.
- the first processor core set includes K processor cores, where K is an integer greater than or equal to 3, and the sending the request of the current stage to the lightest-loaded processor core in the first processor core set includes: determining, according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage among the K processor cores, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and sending the request of the current stage to the lightest-loaded processor core among the w processor cores.
- in this way, the search range for the lightest-loaded processor core is narrowed, so that the lightest-loaded processor core in the scheduling sub-region executes the request of the corresponding stage, which ensures load balancing between processor cores and further reduces the delay of processing service requests.
- d and K are relatively prime (coprime) to each other.
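A sliding-window scheduler over the ordered core set can be sketched as below. Treating the window as contiguous over the ordered set, and the function names, are assumptions; the window length w, the step d, and picking the lightest-loaded core in the window come from the text, and choosing d coprime with K lets successive window starts visit every core.

```python
def scheduling_subregion(cores, start, w, d):
    """Pick a scheduling sub-region of w cores from the ordered set of
    K cores, starting at index `start`, then advance the window start by
    the sliding step d for the next request (wrapping around)."""
    K = len(cores)
    window = [cores[(start + i) % K] for i in range(w)]
    return window, (start + d) % K

def dispatch(loads, cores, start, w, d):
    """Send the current-stage request to the lightest-loaded core in the
    current scheduling sub-region; `loads` maps core id -> load."""
    window, next_start = scheduling_subregion(cores, start, w, d)
    target = min(window, key=lambda core: loads[core])
    return target, next_start
```

With the ordered core set [1, 3, 4, 5, 8, 9, 10], w = 3, and d = 2, the first window is {1, 3, 4} and, after one slide, the next is {4, 5, 8}.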
- a configuration method for processing a service request is provided, including: configuring a first processor core set for a request of a first stage of a service request, where the first processor core set is used to execute the request of the first stage; and configuring a first rule, where the first rule instructs to send the request of the first stage to the lightest-loaded processor core in the first processor core set.
- the configuration method for processing a service request of this application ensures load balancing between processor cores when processing a service request, takes into account the correlation between the requests of the stages and the factors that affect the delay of processing the requests of the stages by the processor cores, and reduces the delay of processing service requests.
- the method further includes: configuring a second processor core set for a request of a second stage of the service request, where the second processor core set is used to execute the request of the second stage; and configuring a second rule, where the second rule instructs to send the request of the second stage to the lightest-loaded processor core in the second processor core set.
- an apparatus for processing a service request is provided.
- the apparatus is configured in a storage system, and the apparatus is configured to execute the method in any one of the possible implementation manners of the first aspect or the second aspect.
- the apparatus may include a module for executing a method in any possible implementation manner of the first aspect or the second aspect.
- a storage system is provided, including multiple processor cores and a memory, where the memory is configured to store computer instructions, and one or more of the multiple processor cores are configured to execute the computer instructions stored in the memory; when the computer instructions in the memory are executed, the one or more processor cores are configured to execute the method in any possible implementation of the first aspect or the second aspect.
- a computer-readable storage medium is provided, storing computer instructions; when the computer instructions are run on a computer, the computer is caused to execute the method in any possible implementation of the first aspect or the second aspect.
- a computer program product including computer instructions is provided, and when the computer instructions are run on a computer, the computer is caused to execute the method in any possible implementation manner of the first aspect or the second aspect.
- FIG. 1 is a schematic diagram of a storage array architecture according to an embodiment of the present invention.
- FIG. 2 is a schematic diagram of a controller of a storage array according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of a distributed block storage system according to an embodiment of the present invention.
- FIG. 4 is a schematic structural block diagram of a server of a distributed block storage system.
- FIG. 5 is a schematic block diagram of a processor according to an embodiment of the present invention.
- FIG. 6 is a schematic flowchart of a method for processing a service request in a storage system according to an embodiment of the present invention.
- FIG. 7 is a schematic diagram of scheduling a processor core based on a sliding window mechanism according to an embodiment of the present invention.
- FIG. 8 is a schematic diagram of a topology distance between logical cores sharing different levels of memory or cache under a NUMA architecture according to an embodiment of the present invention.
- FIG. 9 is a schematic flowchart of a configuration method for processing a service request according to an embodiment of the present invention.
- FIG. 10 is a schematic block diagram of an apparatus for processing a service request according to an embodiment of the present invention.
- FIG. 11 is a schematic block diagram of a storage system according to an embodiment of the present invention.
- the storage system in the embodiment of the present invention may be a storage array (such as the Huawei OceanStor 18000 series or V3 series).
- the storage array includes a storage controller 101 and a plurality of hard disks.
- the hard disks include solid state drives (SSD), mechanical hard disks (for example, HDDs (hard disk drives)), or hybrid hard disks.
- the controller 101 includes a central processing unit (CPU) 201, a memory 202, and an interface 203.
- the memory 202 stores computer instructions
- the CPU 201 includes multiple processor cores (not shown in FIG. 2).
- the CPU 201 executes computer instructions in the memory 202 to perform management and data access operations on the storage system.
- a field programmable gate array (FPGA) or other hardware may also be used to perform all operations of the CPU 201 in the embodiment of the present invention, or the FPGA or other hardware and the CPU 201 may each perform a part of the operations of the CPU 201 in the embodiment of the present invention.
- the CPU 201 and the memory 202 are referred to as a processor, or the FPGA and other hardware replacing the CPU 201 are referred to as a processor, or the combination of the FPGA and other hardware replacing the CPU 201 and the CPU 201 are collectively referred to as a processor.
- the processor is in communication with the interface 203.
- the interface 203 may be a network interface card (NIC), a host bus adaptor (HBA), or the like.
- the CPU 201 is configured to process a service request, such as receiving a service request sent by a host or a client, and use the method for processing a service request provided by an embodiment of the present invention to process the service request.
- the storage system in the embodiment of the present invention may also be a distributed file storage system (such as the Huawei 9000 series), a distributed block storage system, or the like; the following description uses a distributed block storage system as an example.
- a distributed block storage system includes multiple servers, such as server 1, server 2, server 3, server 4, server 5, and server 6, and the servers communicate with each other using InfiniBand, Ethernet, or the like.
- the number of servers in the distributed block storage system can be increased according to actual needs, which is not limited in the embodiment of the present invention.
- the server of the distributed block storage system includes a structure as shown in FIG. 4.
- each server in the distributed block storage system includes a central processing unit (CPU) 401, a memory 402, an interface 403, a hard disk 1, a hard disk 2, and a hard disk 3.
- the memory 402 stores computer instructions
- the CPU 401 includes multiple processor cores (not shown in FIG. 4), and the CPU 401 executes computer instructions in the memory 402 to perform corresponding operations.
- the interface 403 may be a hardware interface, such as a network interface card (NIC) or a host bus adapter (HBA), or a program interface module.
- the hard disks include solid state drives (SSD), mechanical hard disks (for example, HDDs), or hybrid hard disks.
- a field programmable gate array (FPGA) or other hardware may also perform the corresponding operations in place of the CPU 401, or the FPGA or other hardware may perform the corresponding operations in conjunction with the CPU 401.
- the CPU 401 and the memory 402 are referred to as a processor, or the FPGA and other hardware replacing the CPU 401 are referred to as a processor, or the combination of the FPGA and other hardware replacing the CPU 401 and the CPU 401 are collectively referred to as a processor.
- the interface 403 may be a network interface card (NIC), a host bus adapter (HBA), or the like.
- the CPU 401 is configured to process a service request, such as receiving a service request sent by a host or a client, and use the method for processing a service request provided by an embodiment of the present invention to process the service request.
- in a storage system containing multiple processor cores, the load of a processor core is generally estimated based on the number of service requests to be processed on each processor core, and a service request is finally sent to the lightest-loaded processor core in the storage system (for example, the one with the fewest pending service requests).
- an embodiment of the present invention proposes a method for processing a service request.
- the service request to be processed can be divided into requests of multiple stages for execution, a certain number of processor cores (for example, a processor core set) is allocated to the request of each stage, and the request of each stage is sent to the lightest-loaded processor core in the processor core set allocated to the request of that stage, as opposed to sending the service request to the lightest-loaded processor core among all processor cores in the storage system.
- factors that affect the delay, such as the access delay of each level of memory or cache accessed by the CPU (for example, by a processor core), the access distance, the connection relationship between processors, or the bus type, are considered for the request of each stage.
- the method for processing service requests in the embodiments of the present invention can ensure load balancing among processor cores, and schedule requests at the current stage within the scope of the processor core set.
- the access request can be divided into two phases: a resource waiting phase and a resource using phase.
- a request in the resource waiting phase generally requires specific resources, such as disks, memory, and files; when a resource is occupied by a previous request and not yet released, the request in the resource waiting phase is blocked until the resource becomes available.
- a request in the resource using phase is a request that actually performs data access.
- the SCSI subsystem is a layered architecture, which is divided into three layers.
- the top layer, called the upper layer, represents the operating system kernel's highest-level interface to SCSI protocol devices and the drivers for the main device types.
- the middle layer, also known as the common layer or unified layer, contains public services shared by the upper and lower layers of the SCSI stack.
- the lower layer represents the actual drivers for the physical interfaces of the devices that use the SCSI protocol.
- SCSI-based access requests are also divided into three stages of requests.
- the processor (for example, the CPU 201 in FIG. 2 or the CPU 401 in FIG. 4) provided by the embodiment of the present invention is first introduced.
- the processor in the embodiment of the present invention includes multiple processor cores (for example, processor core 0 to processor core S, S ≥ 2); one of the multiple processor cores includes a load balancing module 501 and a core-binding relationship calculation module 502, and the other processor cores each include a scheduling module 503.
- the load balancing module 501 is used to calculate the number of processor cores to be bound to the request of each stage of a service request.
- the core-binding relationship calculation module 502 is used to allocate, to the request of each stage of a service request, processor cores that satisfy the corresponding number, and in turn to generate a core-binding relationship, which indicates a correspondence between the request of a stage of a service request and the processor core set that processes the request of that stage.
- the scheduling module 503 is configured to save the core-binding relationship and, on receiving a request of a certain stage of a service request, to query the core-binding relationship, determine the processor core set used to execute the request of that stage, and send the request of that stage to the lightest-loaded processor core in the processor core set, which then executes the request of that stage.
- at least one processor core is provided with a listening module 504; the listening module 504 is configured to listen for service requests from a host or a client, and when a service request arrives, to send the service request to the scheduling module 503 in a processor core.
- the processor in the embodiment of the present invention is described above by taking as an example the deployment of the load balancing module 501 and the core-binding relationship calculation module 502 in the processor core S, but the embodiment of the present invention is not limited thereto: the load balancing module 501 and the core-binding relationship calculation module 502 may be deployed in any one of the processor cores 0 to S, and the two modules may be deployed in the same processor core or in different processor cores.
- FIG. 6 shows a schematic flowchart of a method for processing a service request in a storage system, including steps 601 to 603.
- the listening module 504 in a processor core listens for a service request from the host or the client, where the request of the current stage is one of the requests of the multiple stages of the service request, and the listening module 504 in processor core 1 sends the request of the current stage to the scheduling module 503 in processor core 1.
- the scheduling module 503 in the processor core 1 determines a set of processor cores (for example, a first set of processor cores) that executes the request of the current phase for the received request of the current phase.
- the scheduling module 503 may determine, according to the specific type of the request of the current stage, a first processor core set that executes the request of the current stage, where the first processor core set is a set of processor cores among the multiple processor cores in the storage system.
- determining the first processor core set that executes the request of the current stage includes: querying the core-binding relationship to determine the first processor core set used to execute the request of the current stage, where the core-binding relationship indicates an association between the request of the current stage and the first processor core set.
- the scheduling module 503 in processor core 1 may query a core-binding relationship, where the core-binding relationship indicates the processor core set allocated to the request of each stage of the service request and each processor core set includes multiple processor cores, and the scheduling module 503 in processor core 1 determines, according to the core-binding relationship, the first processor core set that executes the request of the current stage.
- for example, the scheduling module 503 in processor core 1 queries the core-binding relationship and determines that the processor core set including processor core 1, processor core 2, processor core 4, processor core 7, and processor core 9 is associated with the request of the current stage, and then determines this processor core set as the first processor core set that executes the request of the current stage.
- the scheduling module 503 in processor core 1 sends the service request to the lightest-loaded processor core in the first processor core set, which executes the request of the current stage; for example, the scheduling module 503 in processor core 1 determines that the lightest-loaded processor core among processor core 1, processor core 2, processor core 4, processor core 7, and processor core 9 in the first processor core set is processor core 7, sends the service request to processor core 7, and processor core 7 executes the request of the current stage.
- after processor core 7 executes the request of the current stage, the scheduling module 503 in processor core 7 determines the processor core set for the request of the next stage, sends the request of the next stage to the lightest-loaded processor core in that processor core set, and that processor core executes the request of the next stage.
- in the embodiment of the present invention, a certain number of processor cores are allocated to the request of each stage, and the request of each stage is sent to the lightest-loaded processor core in the processor core set allocated to that request; compared with sending a service request to the lightest-loaded processor core among the multiple processor cores in a storage system, the method of the embodiment of the present invention can ensure load balancing among processor cores, determine the processor core set for the request of each stage of a service request, and schedule the request of the current stage within the scope of that set, taking into account the correlation between the requests of the stages and the factors that affect the delay of processing the requests by the processor cores, thereby reducing the delay of processing service requests.
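Steps 601 to 603 can be condensed into a small sketch. The dictionary-based binding lookup and the use of a per-core load value (for example, a pending-request count) are assumptions; the query-then-pick-lightest logic comes from the text.

```python
def handle_stage_request(stage, core_binding, loads):
    """Steps 601-603 in miniature: query the core-binding relationship
    for the processor core set associated with this stage, then pick the
    lightest-loaded core in that set as the target.

    core_binding: stage -> list of core ids (the bound processor core set).
    loads: core id -> current load (e.g. pending request count).
    """
    core_set = core_binding[stage]                       # query the binding
    return min(core_set, key=lambda core: loads[core])   # lightest-loaded core
```

With the binding {1, 2, 4, 7, 9} from the example above and core 7 carrying the smallest load, the request is dispatched to core 7.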
- the first processor core set includes K processor cores, where K is an integer greater than or equal to 3, and the sending the request of the current stage to the lightest-loaded processor core in the first processor core set includes: determining, according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage among the K processor cores, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and sending the request of the current stage to the lightest-loaded processor core among the w processor cores.
- the scheduling module 503 may send the request of the current stage to the lightest-loaded processor core in the first processor core set, and that processor core executes the request of the current stage; alternatively, the processor core that executes the request of the current stage may be determined based on a sliding window mechanism: the scheduling module 503 determines, using the sliding window length w and the sliding step d, a scheduling sub-region for the request of the current stage, determines the lightest-loaded processor core among the processor cores included in the scheduling sub-region, and sends the service request to that processor core.
- for example, the scheduling sub-region determined by the scheduling module 503 for the request of the current stage is shown in FIG. 7; as can be seen from FIG. 7, the processor cores included in the scheduling sub-region are processor core 1, processor core 3, and processor core 4, and the scheduling module 503 sends the request of the current stage to the lightest-loaded processor core among processor core 1, processor core 3, and processor core 4, which executes the request of this stage.
- when the processor core set including processor core 1, processor core 3, processor core 4, processor core 5, processor core 8, processor core 9, and processor core 10 is also used to process a request of a certain stage of another service request, the scheduling sub-region for that request is obtained by sliding the sliding window backwards by two processor cores; the scheduling module 503 then sends the request of that stage of the other service request to the lightest-loaded processor core among processor core 4, processor core 5, and processor core 8, and that processor core executes the request of that stage of the other service request.
- in this way, the search range for the lightest-loaded processor core is narrowed, so that the lightest-loaded processor core in the scheduling sub-region executes the request of the corresponding stage.
- the method for processing service requests in the embodiment of the present invention can ensure load balancing among processor cores, determine a processor core set for the request of each stage of a service request, and schedule the request of the current stage within the scope of that set; compared with directly selecting the lightest-loaded processor core in the storage system, the correlation between the requests of the stages and the factors that affect the delay of processing the requests by the processor cores is considered, which further reduces the delay of processing service requests.
- the core-binding relationship may be pre-configured, and the core-binding relationship calculation module 502 in the processor core then updates the core-binding relationship, that is, generates a new core-binding relationship.
- the method further includes: re-determining, according to the first processor core set, the number of processor cores that execute the request of the current stage; allocating, among the multiple processor cores and according to the re-determined number, a second processor core set that satisfies the number to the request of the current stage; and generating a new core-binding relationship according to the second processor core set, where the new core-binding relationship indicates an association between the request of the current stage and the second processor core set.
- the load balancing module 501 in the processor core S periodically re-determines, for the requests of the multiple stages of the service request, the number of processor cores in the processor core set used to execute the request of each stage, and provides the determined numbers to the core-binding relationship calculation module 502; the core-binding relationship calculation module 502 reallocates, according to the numbers re-determined by the load balancing module 501, processor cores that satisfy the corresponding number to the request of each stage, and periodically generates a new core-binding relationship according to the reallocated processor cores.
- the following uses the load balancing module 501 to re-determine the number of processor cores used to execute the request of the current stage as an example, and describes a method to re-determine the number of processor cores used to execute the request of each stage.
- the re-determining, according to the first processor core set, the number of processor cores executing the request of the current stage includes: determining a total utilization of the processor cores in the first processor core set and an average utilization of the multiple processor cores; and re-determining, according to the total utilization of the processor cores in the first processor core set and the average utilization of the multiple processor cores, the number of processor cores executing the request of the current stage.
- the load balancing module 501 monitors the utilization of each processor core in the storage system in real time, where the utilization of a processor core is the ratio of the running time of the processor core to the sum of its running time and idle time, and re-determines, according to changes in the utilization of the processor cores, the number of processor cores in the processor core set used to execute the request of the current stage.
- the first processor core set bound to the request of the current stage is denoted P, and its utilization is denoted U_P; U_P equals the total utilization of the processor cores in the first processor core set in the current cycle, which is expressed as: U_P = Σ_{j ∈ P} U_j, where U_j represents the utilization of any processor core j in the first processor core set in the current cycle.
- the average utilization of the plurality of processor cores in the storage system in the current cycle is denoted U_ave; the scheduling module 503 re-determines, according to U_P and U_ave, the number of processor cores in the set of processor cores executing the request of the current stage.
- re-determining, according to the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores, the number of processor cores executing the request of the current stage includes re-determining that number based on the following relationship:

  N = U_P / U_ave

  where N is the re-determined number of processor cores executing the request of the current stage, U_P is the total utilization of the processor cores in the first processor core set, and U_ave is the average utilization of the plurality of processor cores.
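The re-determination step above can be sketched as follows. This is a minimal illustration: the relationship in the text only fixes the ratio N = U_P / U_ave, so the rounding up to an integer core count (and the floor of one core) is an assumption of this sketch.

```python
import math

def redetermine_core_count(bound_core_utils, all_core_utils):
    """Re-determine how many cores the current stage's requests need.

    bound_core_utils: utilizations (0..1) of the cores in the first
                      processor core set P bound to the current stage.
    all_core_utils:   utilizations of every processor core in the system.
    """
    u_p = sum(bound_core_utils)                        # total utilization of P
    u_ave = sum(all_core_utils) / len(all_core_utils)  # system-wide average
    # N = U_P / U_ave; rounding up so the stage keeps enough capacity
    # is an assumption, as is keeping at least one core.
    return max(1, math.ceil(u_p / u_ave))
```

For example, eight bound cores each at 0.25 utilization (U_P = 2.0) in a system whose average utilization is 0.5 yield N = 4, so the stage's set would shrink at the next cycle.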
- after the load balancing module 501 re-determines, in the current cycle, the number N of processor cores used to execute the request of the current stage, it provides this number to the core-binding relationship calculation module 502, and the core-binding relationship calculation module 502 reallocates, at the beginning of the next cycle, a set of processor cores (for example, a second processor core set) satisfying the number N for the request of the current stage.
- for example, the number of processor cores used to execute the request of the current stage in the current cycle is 8, and the number re-determined by the load balancing module 501 in the current cycle is 6; the load balancing module 501 provides the re-determined number 6 to the core-binding relationship calculation module 502.
- the core-binding relationship calculation module 502 may then, at the beginning of the next cycle, delete two processor cores from the eight processor cores recorded in the binding relationship as executing the request of the current stage, that is, generate a new core-binding relationship.
- alternatively, the load balancing module 501 provides the re-determined number 6 to the core-binding relationship calculation module 502, and instead of deleting two processor cores from the eight processor cores executing the request of the current stage, the core-binding relationship calculation module 502 reallocates six processor cores in the storage system for the request of the current stage, and at the beginning of the next cycle replaces, in the binding relationship, the 8 processor cores originally allocated for the request of the current stage with the 6 reallocated processor cores, generating a new core-binding relationship.
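Either rebinding strategy described above (shrinking the existing set, or reallocating a fresh set) yields a new core-binding relationship at the start of the next cycle. A minimal sketch, holding the binding as a stage-to-core-list mapping; the shrink policy of dropping the trailing cores is an assumption of the sketch, not stated in the text:

```python
def shrink_binding(binding, stage, new_n):
    """Keep only the first new_n of the currently bound cores
    (which cores to drop is an assumed policy)."""
    binding[stage] = binding[stage][:new_n]
    return binding

def rebind(binding, stage, new_cores):
    """Replace the stage's bound cores with a freshly allocated set."""
    binding[stage] = list(new_cores)
    return binding
```

With an initial binding of eight cores for the current stage, `shrink_binding(b, "current", 6)` leaves six of them, while `rebind(b, "current", fresh_six)` swaps in an entirely new set.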
- by periodically monitoring the utilization of the processor cores in the storage system, and reallocating processor cores for the requests of a given stage according to changes in the utilization of the processor cores allocated to that stage, the processor cores allocated to the requests of each stage can be adjusted periodically as utilization changes, thereby improving the load imbalance between processor cores.
- the following takes the method by which the core-binding relationship calculation module 502 allocates, in the storage system, processor cores satisfying the determined number for the request of the current stage as an example, and describes in detail the method of allocating the corresponding number of processor cores for the request of each stage.
- multiple processor cores usually share different levels of memory or cache.
- the different levels of memory or cache can include the L1 cache, L2 cache, L3 cache, and local memory.
- when processor cores share different levels of memory or cache, the topological distance between the processor cores also differs.
- each processor core can access local memory in a remote node (hereinafter referred to as "remote memory").
- each processor core can be abstracted into multiple logical cores. For example, each processor core is abstracted into two logical cores, which are respectively logical core 0 and logical core 1, as shown in FIG. 8.
- Figure 8 shows a schematic diagram of the topological distance between logical cores sharing different levels of memory or cache under the NUMA architecture. It can be seen that under the NUMA architecture there are node 0 and node 1, and the logical cores in node 0 can share the local memory in node 1 with the logical cores in node 1; the local memory in node 1 is remote memory for node 0.
- the topological distance between two logical cores in node 0 sharing the L1 cache is D1; the topological distance between two logical cores sharing the L2 cache is D2; the topological distance between two logical cores sharing the L3 cache is D3; the topological distance between two logical cores sharing local memory is D4; and when a logical core in node 0 and a logical core in node 1 share the local memory in node 1, the topological distance between the two logical cores is D5.
- the access latency ratio of local memory to remote memory is approximately 8:12, so the topological distance between logical cores of different nodes that share remote memory can be calculated as 64.
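A topological-distance lookup between logical cores can be sketched as follows. The concrete distance values and the same-node fallback are illustrative placeholders standing in for D1..D5 (the text gives only the 8:12 local/remote latency ratio, not the absolute values); a real system would consult its actual cache hierarchy.

```python
# Hypothetical topological distances keyed by the closest resource two
# logical cores share; the numbers are placeholders, with the local/remote
# pair chosen to respect the 8:12 latency ratio from the text.
DISTANCE = {
    "L1": 1,           # D1: same physical core (shared L1 cache)
    "L2": 2,           # D2: shared L2 cache
    "L3": 4,           # D3: shared L3 cache
    "local_mem": 8,    # D4: same NUMA node, shared local memory
    "remote_mem": 12,  # D5: cross-node, remote memory
}

def topo_distance(core_a, core_b):
    """Topological distance between two logical cores, each described
    as a (node, physical_core, logical_index) tuple."""
    node_a, phys_a, _ = core_a
    node_b, phys_b, _ = core_b
    if node_a != node_b:
        return DISTANCE["remote_mem"]
    if phys_a == phys_b:
        return DISTANCE["L1"]   # two logical cores of one physical core
    # Assumption: distinct physical cores on the same node share the
    # L3 cache; a finer model would distinguish L2 sharing as well.
    return DISTANCE["L3"]
```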
- the following describes in detail the method by which the core-binding relationship calculation module 502 of the embodiment of the present invention allocates, for the request of each stage in the storage system, processor cores satisfying the corresponding number.
- node 0 and node 1 in FIG. 8 are in a NUMA architecture and communicate with each other through hyper-threading.
- allocating, among the multiple processor cores, a second processor core set satisfying the number for the request of the current stage includes: generating multiple sets of allocation results, each set of allocation results including a set of processor cores satisfying the corresponding number allocated for the request of each stage; and determining multiple path lengths for the multiple sets of allocation results, each set of allocation results corresponding to one path length, where the path length L satisfies:

  L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}

  where c_{i,i+1} represents the communication volume generated by the interaction between the processor cores executing the requests of adjacent stages, d_{i,i+1} represents the average topological distance between the processor cores executing the requests of the adjacent stages, and M is the number of stages of requests of the service request; the communication volume can represent the number of interactions between processor cores.
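The path-length computation and the shortest-path selection over candidate allocation results can be sketched as below; the communication volumes and average distances are hypothetical inputs here (in the embodiment they come from measured interaction counts and the core topology).

```python
def path_length(comm, avg_dist):
    """L = sum over adjacent stages i of c_{i,i+1} * d_{i,i+1}.

    comm[i]     - communication volume between the cores executing
                  the requests of stage i and stage i+1
    avg_dist[i] - average topological distance between those cores
    """
    return sum(c * d for c, d in zip(comm, avg_dist))

def pick_allocation(candidates):
    """Return the allocation result whose path length L is shortest.

    candidates: list of (allocation, comm, avg_dist) tuples.
    """
    return min(candidates, key=lambda t: path_length(t[1], t[2]))[0]
```

For two candidate allocations with the same communication volumes, the one whose cores sit at smaller average topological distances yields the smaller L and is selected.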
- according to the set of allocation results corresponding to the shortest path length among the multiple path lengths, the request of the current stage is allocated the processor cores satisfying the number.
- each processor core is abstracted into logical core 0 and logical core 1, so that 16 processor cores are abstracted into 32 logical cores.
- the three stages of requests are denoted M0, M1, and M2 respectively.
- the number of logical cores used to execute the request of each stage is determined by the foregoing method of determining the number of processor cores used to execute the request of the current stage: in the current cycle, it is determined that the number of logical cores used to execute M0 is 8, the number of logical cores used to execute M1 is 8, and the number of logical cores used to execute M2 is 16.
- the core-binding relationship calculation module 502 generates multiple sets of allocation results according to the numbers of logical cores determined for M0, M1, and M2; each set of allocation results includes logical cores satisfying the corresponding number allocated for the request of each stage.
- allocation result 1 is: logical cores 0 to 7 in node 0 are assigned to M0, logical cores 8 to 15 in node 0 are assigned to M1, and logical cores 0 to 15 in node 1 are assigned to M2;
- allocation result 2 is: logical cores 0 to 3 in node 0 and logical cores 0 to 3 in node 1 are assigned to M0, logical cores 4 to 7 in node 0 and logical cores 4 to 7 in node 1 are assigned to M1, and logical cores 8 to 15 in node 0 and logical cores 8 to 15 in node 1 are assigned to M2.
- for example, if allocation result 2 corresponds to the shortest path length, the core-binding relationship calculation module 502 assigns logical cores 0 to 3 in node 0 and logical cores 0 to 3 in node 1 to M0, assigns logical cores 4 to 7 in node 0 and logical cores 4 to 7 in node 1 to M1, and assigns logical cores 8 to 15 in node 0 and logical cores 8 to 15 in node 1 to M2, and at the beginning of the next cycle replaces, in the binding relationship, the processor cores originally allocated for the request of each stage of the service request with the reallocated processor cores.
- multiple path lengths are determined for the multiple sets of allocation results.
- in this way, when processor cores are allocated for the request of each stage, the topological distance between processor cores is considered, and the allocation result corresponding to the shortest path length among the multiple path lengths is determined as the final processor core allocation result. This ensures load balancing among the processor cores; by determining a processor core set for each stage of the service request and scheduling the request of the current stage within the scope of that processor set, the correlation between the requests of each stage and the delay affecting the processor cores' processing of those requests is taken into account, reducing the delay of processing service requests.
- FIG. 9 shows a schematic flowchart of a configuration method for processing a service request.
- the processing of the service request is divided into multiple stages, and the multiple stages correspond to the multiple stage requests.
- the multiple stages of requests include the requests of the first stage, and a processor core set (for example, a first processor core set) is configured for the requests of the first stage; the requests of the first stage are processed through the first processor core set.
- a first rule may be configured, and the first rule may indicate that the lightest-loaded processor core in the first set of processor cores configured for the request of the first stage executes the request of the first stage.
- the method further includes:
- the service request also includes a request in a second phase
- the request in the second phase may be a request in a phase subsequent to the request in the first phase
- a processor core set (for example, a second processor core set) is configured for the requests of the second stage; the requests of the second stage are processed through the second processor core set.
- a second rule may be configured, and the second rule may indicate that the lightest-loaded processor core in the second set of processor cores configured for the request of the second stage executes the request of the second stage.
- by allocating a certain number of processor cores (for example, a set of processor cores) for each stage of a service request, and sending the request of each stage to the lightest-loaded processor core in the set of processor cores allocated for that stage's request rather than to the lightest-loaded processor core among all processor cores in the storage system, the configuration method for processing a service request according to the embodiment of the present invention ensures load balancing among processor cores when processing a service request. By determining a processor core set for each stage of a service request and scheduling the request of the current stage within the scope of that processor set, the correlation between the requests of each stage and the delay affecting the processor cores' processing of the requests is considered, reducing the delay of processing service requests.
- the embodiment of the present invention is described by taking the case where the service request includes the request of the first stage and the request of the second stage as an example, which does not specifically limit the embodiment of the present invention; the service request may also include requests of other stages.
- FIG. 10 is a schematic block diagram of an apparatus 800 for processing a service request according to an embodiment of the present invention.
- the apparatus 800 is configured in a storage system and includes a transceiver module 801 and a processing module 802.
- the transceiver module 801 is configured to receive a request in a current stage of a service request, where the request in the current stage is a request in one of a plurality of stages in the service request.
- the processing module 802 is configured to determine a first set of processor cores that executes the request at the current stage, where the first set of processor cores is a subset of the plurality of processor cores.
- the transceiver module 801 is further configured to send the request of the current stage to the processor core with the lightest load in the first processor core set.
- the processing module 802 is further configured to query a core-binding relationship and determine the first processor core set used to execute the request of the current stage, where the core-binding relationship is used to indicate the association between the request of the current stage and the first processor core set.
- the processing module 802 is further configured to re-determine, according to the first processor core set, the number of processor cores executing the request of the current stage; to allocate, among the plurality of processor cores and according to the re-determined number, a second processor core set satisfying the number for the request of the current stage; and to generate a new core-binding relationship according to the second processor core set, where the new core-binding relationship is used to indicate the association between the request of the current stage and the second processor core set.
- the processing module 802 is further configured to determine the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores, and to re-determine the number of processor cores executing the request of the current stage according to the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores.
- the processing module 802 is further configured to re-determine, based on the following relationship, the number of processor cores executing the request of the current stage according to the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores:
- N = U_P / U_ave
- where N is the re-determined number of processor cores executing the request of the current stage, U_P is the total utilization of the processor cores in the first processor core set, and U_ave is the average utilization of the plurality of processor cores.
- the processing module 802 is further configured to generate multiple sets of allocation results, each set of allocation results including a set of processor cores satisfying the corresponding number reallocated for the request of each stage, and to determine multiple path lengths for the multiple sets of allocation results, each set of allocation results corresponding to one path length, where the path length L satisfies:

  L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}
- where c_{i,i+1} represents the communication volume generated by the interaction between the processor cores executing the requests of adjacent stages, and d_{i,i+1} represents the average topological distance between the processor cores executing the requests of the adjacent stages
- M is the number of requests in multiple stages of the service request; according to a set of allocation results corresponding to the shortest path length among the multiple path lengths, the request for the current stage is allocated a second set of processor cores that meets the number.
- the first processor core set includes K processor cores, where K is an integer greater than or equal to 3, and the processing module 802 is further configured to determine, according to the sliding window length w and the sliding step size d, a scheduling sub-region for the request of the current stage among the K processor cores.
- the scheduling sub-region includes w processor cores, where w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K.
- the transceiver module 801 is further configured to send the request of the current stage to the lightest-loaded processor core among the w processor cores.
- d and K are coprime (relatively prime to each other).
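The sliding-window scheduling sub-region can be sketched as below. Because d and K are coprime, successive window starts visit every one of the K cores before repeating, so every core in the first set eventually gets scheduling opportunities; the request is then sent to the lightest-loaded core inside the current window of w cores.

```python
def window_start_positions(K, d):
    """Starting offsets of the first K successive sliding windows over
    K cores with step d; with gcd(K, d) == 1 every offset 0..K-1 appears."""
    pos, seen = 0, []
    for _ in range(K):
        seen.append(pos)
        pos = (pos + d) % K
    return seen

def schedule_subregion(cores, start, w, core_load):
    """Pick the lightest-loaded core inside the current window of w cores
    (the window wraps around the end of the core list)."""
    K = len(cores)
    window = [cores[(start + i) % K] for i in range(w)]
    return min(window, key=lambda c: core_load[c])
```

For K = 5 and d = 2 the window starts are 0, 2, 4, 1, 3: all five positions, as the coprimality condition guarantees.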
- the apparatus 800 for processing a service request may correspond to executing the method 600 or the method 700 described in the embodiments of the present invention, and the above and other operations and/or functions of the modules in the apparatus 800 are respectively for implementing the corresponding procedures of the method 600 or the method 700; for brevity, details are not described herein again.
- the specific implementation of the apparatus 800 for processing a service request in the embodiment of the present invention may be a processor, or a software module, or a combination of a processor and a software module, which is not limited in the embodiment of the present invention.
- FIG. 11 is a schematic block diagram of a storage system 900 according to an embodiment of the present invention.
- the storage system includes a processor 901 and a memory 902, and the processor 901 includes multiple processor cores.
- One or more processor cores in the plurality of processor cores are used to execute computer instructions stored in the memory 902.
- the one or more processor cores are used to perform the following operations: receiving a request of a current stage of a service request, the request of the current stage being a request of one of multiple stages of the service request; determining a first processor core set to execute the request of the current stage, the first processor core set being a subset of the plurality of processor cores; and sending the request of the current stage to the processor core with the lightest load in the first processor core set.
- the one or more processor cores are further configured to query a core-binding relationship and determine the first processor core set used to execute the request of the current stage, where the core-binding relationship is used to indicate the association between the request of the current stage and the first processor core set.
- the one or more processor cores are further configured to re-determine, according to the first processor core set, the number of processor cores executing the request of the current stage; to allocate, among the multiple processor cores and according to the re-determined number, a second processor core set satisfying the number for the request of the current stage; and to generate a new core-binding relationship according to the second processor core set, where the new core-binding relationship is used to indicate the association between the request of the current stage and the second processor core set.
- the one or more processor cores are further configured to determine the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores, and to re-determine the number of processor cores executing the request of the current stage according to the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores.
- the one or more processor cores are further configured to re-determine, based on the following relationship, the number of processor cores executing the request of the current stage according to the total utilization of the processor cores in the first processor core set and the average utilization of the plurality of processor cores:
- N = U_P / U_ave
- where N is the re-determined number of processor cores executing the request of the current stage, U_P is the total utilization of the processor cores in the first processor core set, and U_ave is the average utilization of the plurality of processor cores.
- the one or more processor cores are further configured to generate multiple sets of allocation results, each set of allocation results including a set of processor cores satisfying the corresponding number reallocated for the request of each stage, and to determine multiple path lengths for the multiple sets of allocation results, each set of allocation results corresponding to one path length, where the path length L satisfies:

  L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}
- where c_{i,i+1} represents the communication volume generated by the interaction between the processor cores executing the requests of adjacent stages, and d_{i,i+1} represents the average topological distance between the processor cores executing the requests of the adjacent stages
- M is the number of requests in multiple stages of the service request; according to a set of allocation results corresponding to the shortest path length among the multiple path lengths, the request for the current stage is allocated a second set of processor cores that meets the number.
- the first processor core set includes K processor cores, where K is an integer greater than or equal to 3, and the one or more processor cores are further configured to determine, according to the sliding window length w and the sliding step size d, a scheduling sub-region for the request of the current stage among the K processor cores; the scheduling sub-region includes w processor cores, where w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and to send the request of the current stage to the lightest-loaded processor core among the w processor cores.
- the d and the K are coprime.
- each module shown in FIG. 5 in the embodiment of the present invention may be hardware logic in the processor core, or may be computer instructions executed by the processor core, or a combination of hardware logic and computer instructions, which is not limited in the embodiment of the present invention.
- Each module of the apparatus 800 for processing a service request may be implemented by a processor, may be implemented by a processor and a memory together, or may be implemented by a software module. Accordingly, each module shown in FIG. 5 may correspond to one or more modules shown in FIG. 8, and the module shown in FIG. 8 includes corresponding functions of the module shown in FIG. 5.
- an embodiment of the present invention provides a computer-readable storage medium storing computer instructions; when the computer instructions are run on a computer, the computer is caused to execute the method for processing a service request or the configuration method for processing a service request in the embodiments of the present invention.
- embodiments of the present invention provide a computer program product containing computer instructions; when the computer instructions are run on a computer, the computer is caused to execute the method for processing a service request or the configuration method for processing a service request in the embodiments of the present invention.
- the processor mentioned in the embodiments of the present invention may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
- the memory mentioned in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory may be a random access memory (RAM), which is used as an external cache.
- by way of example and not limitation, many forms of RAM are available, such as dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus RAM (DR RAM).
- it should be noted that when the processor is a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, the memory (memory module) may be integrated into the processor.
- the memory described herein is intended to include, but is not limited to, these and any other suitable types of memory.
- the disclosed systems, devices, and methods may be implemented in other manners.
- the device embodiments described above are only schematic.
- the division of units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit.
- when the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium; based on such understanding, the technical solution of the embodiments of the present invention, essentially, or the part contributing to the existing technology, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium.
- the foregoing storage media include: USB flash drives, removable hard disks, read-only memories (ROM), random access memories (RAM), magnetic disks, optical discs, and other media that can store computer instructions.
Abstract
The present application provides a method for processing a service request in a storage system, wherein the storage system contains a plurality of processor cores, and is characterized by comprising: Receiving a request of the current stage of a service request, wherein the request of the current stage is a request of one stage of a plurality of stages of the service request; Determining a first processor core set for performing the request of the current stage, wherein the first processor core set is a processor core subset of a plurality of processor cores; Sending the request of the current stage to the processor core with least load in the first processor core set. The method can ensure load balance between the processor cores and reduce the time delay of processing the service request.
Description
The present application relates to the field of information technology, and more particularly, to a method, an apparatus, and a processor for processing a service request.
In storage systems, the central processing unit (CPU) of the array controller is a key factor affecting system performance. Generally, the more processor cores a CPU includes, the higher the performance of the storage system.
However, in a storage system where the array controller includes multiple processor cores, as the number of processor cores increases, the problem of load imbalance among processor cores arises when scheduling processor cores to process service requests.
In the prior art, the load of a processor core is estimated based on the number of pending service requests on the processor core, and the service request is finally sent to the processor core with the smallest load. Although this method can improve the load imbalance between processor cores, the time complexity of processing service requests grows linearly as the number of processor cores increases, making the delay of processing service requests uncontrollable.
Summary of the Invention
In a first aspect, a method for processing a service request in a storage system is provided, where the storage system includes multiple processor cores. The method includes: receiving a request of a current stage of a service request, where the request of the current stage is a request of one of multiple stages of the service request; determining a first processor core set to execute the request of the current stage, where the first processor core set is a subset of the multiple processor cores; and sending the request of the current stage to the processor core with the lightest load in the first processor core set.
By dividing the pending service request into requests of multiple stages for execution, allocating a certain number of processor cores (for example, a set of processor cores) to the request of each stage, and sending the request of each stage to the lightest-loaded processor core in the set of processor cores allocated for that stage's request, rather than to the lightest-loaded processor core among all processor cores in the storage system, the method of the present application for processing a service request ensures load balancing among processor cores. By determining a processor core set for the request of each stage of the service request and scheduling the request of the current stage within the scope of that processor set, rather than directly selecting the lightest-loaded processor core in the storage system, the correlation between the requests of each stage and the delay affecting the processor cores' processing of those requests is considered, reducing the delay of processing service requests.
Optionally, determining the first set of processor cores for executing the current-stage request includes: querying a core-binding relationship to determine the first set of processor cores for executing the current-stage request, where the core-binding relationship indicates an association between the current-stage request and the first set of processor cores.
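As a minimal sketch of this optional design, the core-binding relationship can be modeled as a lookup table; the stage names, core IDs, and dict layout below are illustrative assumptions rather than structures prescribed by the text:

```python
# Hypothetical core-binding table: each stage of a service request maps to
# the subset of processor cores bound to it.
core_binding = {
    "wait_resource": {1, 2, 4, 7, 9},   # e.g. the first set of processor cores
    "use_resource":  {3, 5, 6},
}

def lookup_core_set(stage: str) -> set:
    """Query the core-binding relationship: return the set of cores that
    should execute the given stage's request."""
    return core_binding[stage]

def rebind(stage: str, second_set: set) -> None:
    """Generate a new core-binding relationship that associates the stage
    with a re-allocated second set of processor cores."""
    core_binding[stage] = second_set
```

The `rebind` helper mirrors the re-allocation described in the next implementation, where a second set of processor cores replaces the first in the binding relationship.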
With reference to the first aspect, in some implementations of the first aspect, the method further includes: re-determining, according to the first set of processor cores, the number of processor cores for executing the current-stage request; allocating, among the multiple processor cores and according to the re-determined number, a second set of processor cores satisfying the number to the current-stage request; and generating a new core-binding relationship according to the second set of processor cores, where the new core-binding relationship indicates an association between the current-stage request and the second set of processor cores.
Optionally, re-determining the number of processor cores for executing the current-stage request according to the first set of processor cores includes: determining the sum of the utilizations of the processor cores in the first set and the average utilization of the multiple processor cores; and re-determining the number of processor cores for executing the current-stage request according to the sum of the utilizations of the processor cores in the first set and the average utilization of the multiple processor cores.
By periodically monitoring the utilization of the processor cores in the storage system and reallocating processor cores to the request of a given stage according to changes in the utilization of the processor cores allocated to that stage, the processor cores allocated to each stage can be periodically adjusted as utilization changes, thereby mitigating load imbalance among processor cores.
Optionally, re-determining the number of processor cores for executing the current-stage request according to the sum of the utilizations of the processor cores in the first set of processor cores and the average utilization of the multiple processor cores includes: re-determining the number of processor cores for executing the current-stage request based on the following relationship:

N = U_P / U_ave

where N is the re-determined number of processor cores for executing the current-stage request, U_P is the sum of the utilizations of the processor cores in the first set of processor cores, and U_ave is the average utilization of the multiple processor cores.
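Under the stated relationship, the new core count can be computed as follows; the utilization figures are illustrative, and rounding the quotient up to a whole core is an added assumption, since the text gives the ratio without a rounding rule:

```python
import math

def recompute_core_count(first_set_utils, u_ave):
    """N = U_P / U_ave, with U_P the summed utilization of the cores in the
    first set and U_ave the system-wide average per-core utilization.
    Rounding up to a whole core is an assumption made here."""
    u_p = sum(first_set_utils)
    return math.ceil(u_p / u_ave)

# A stage whose three bound cores sum to 150% utilization, on a system
# averaging 50% per core, is re-assigned 3 cores.
```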
With reference to the first aspect, in some implementations of the first aspect, allocating, among the multiple processor cores, a second set of processor cores satisfying the number to the current-stage request includes: generating multiple groups of allocation results, where each group of allocation results includes, for each stage's request, a reallocated set of processor cores satisfying the corresponding number; determining multiple path lengths for the multiple groups of allocation results, where each group of allocation results corresponds to one path length and the path length L satisfies:

L = Σ_{i=1}^{M−1} c_{i,i+1} · d_{i,i+1}

where c_{i,i+1} denotes the communication volume generated by interaction between the processor cores executing the requests of adjacent stages i and i+1, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of stages of the service request; and allocating, to the current-stage request and according to the group of allocation results corresponding to the shortest of the multiple path lengths, a second set of processor cores satisfying the number.
According to the determined number of processor cores allocated to the request of each stage, multiple groups of processor-core allocation results are generated, and multiple path lengths are determined for these groups. By taking the topological distance between processor cores into account when allocating processor cores to each stage's request, the allocation result corresponding to the shortest of the multiple path lengths is chosen as the final processor-core allocation, thereby ensuring load balancing among processor cores and reducing the latency of processing the service request.
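A minimal sketch of this selection, assuming the per-stage-pair traffic and distance values have already been measured; the allocation labels and numbers are illustrative:

```python
def path_length(comm, dist):
    """L = sum over adjacent stage pairs (i, i+1) of c_{i,i+1} * d_{i,i+1}:
    comm[i] is the traffic between the cores of stages i and i+1, dist[i]
    their average topological distance."""
    return sum(c * d for c, d in zip(comm, dist))

def best_allocation(candidates):
    """candidates maps an allocation label to its (comm, dist) pair; the
    allocation with the shortest path length L is kept."""
    return min(candidates, key=lambda k: path_length(*candidates[k]))

# Two hypothetical allocations for a three-stage request: "B" places the
# heavy-traffic stage pair on topologically closer cores, so it wins.
candidates = {
    "A": ([10, 2], [4, 1]),   # L = 10*4 + 2*1 = 42
    "B": ([10, 2], [1, 4]),   # L = 10*1 + 2*4 = 18
}
```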
With reference to the first aspect, in some implementations of the first aspect, the first set of processor cores includes K processor cores, where K is an integer greater than or equal to 3, and sending the current-stage request to the processor core with the lightest load in the first set includes: determining, according to a sliding window length w and a sliding step d, a scheduling sub-region for the current-stage request among the K processor cores, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and sending the current-stage request to the processor core with the lightest load among the w processor cores.
When determining the processor core that executes the request of any stage, introducing a sliding window mechanism narrows the search range for the lightest-loaded processor core, so that the lightest-loaded processor core within the scheduling sub-region executes the corresponding stage's request, ensuring load balancing among processor cores and further reducing the latency of processing service requests.
With reference to the first aspect, in some implementations of the first aspect, d and K are coprime (relatively prime to each other).
After the sliding window mechanism is introduced, requests of multiple stages may be bound to the same set of processor cores, and every processor core in that set may carry the same load. When the requests of these stages are processed in turn, ensuring load balancing among the processor cores requires that cores with equal load (that is, with the same number of pending requests in their queues) have an equal probability of being selected to execute a request; in other words, each processor core must have an equal probability of serving as the search starting point within the sliding window. When the number K of processor cores in the set and the sliding step d are coprime, each processor core is guaranteed to serve as the search starting point within the sliding window with equal probability.
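The effect of choosing d and K coprime can be illustrated with a small sketch of the window start positions; the core numbering and step values are illustrative:

```python
def window_starts(K, d, rounds):
    """Successive start positions of the sliding window over K cores with
    step d. When K and d are coprime, every core appears as a start point
    equally often across a full cycle."""
    start, starts = 0, []
    for _ in range(rounds):
        starts.append(start)
        start = (start + d) % K
    return starts

# Coprime K=5, d=3: starts cycle through all five cores (0, 3, 1, 4, 2).
# Non-coprime K=6, d=3: only cores 0 and 3 ever start the window.
```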
In a second aspect, a configuration method for processing a service request is provided, including: configuring a first set of processor cores for a first-stage request of the service request, where the first set of processor cores is used to execute the first-stage request; and configuring a first rule, where the first rule indicates that the first-stage request is to be sent to the processor core with the lightest load in the first set of processor cores.
By allocating a certain number of processor cores (for example, a set of processor cores) to each stage of the service request and sending each stage's request to the lightest-loaded processor core in the set allocated to that stage, rather than sending the service request to the lightest-loaded core among all processor cores in the storage system, the configuration method of this application ensures load balancing among processor cores when service requests are processed, takes into account the correlation between each stage's request and the factors affecting the latency with which processor cores process it, and reduces the latency of processing service requests.
With reference to the second aspect, in some implementations of the second aspect, the method further includes: configuring a second set of processor cores for a second-stage request of the service request, where the second set of processor cores is used to execute the second-stage request; and configuring a second rule, where the second rule indicates that the second-stage request is to be sent to the processor core with the lightest load in the second set of processor cores.
In a third aspect, an apparatus for processing a service request is provided. The apparatus is configured in a storage system and is configured to execute the method in any possible implementation of the first aspect or the second aspect. Specifically, the apparatus may include modules for executing the method in any possible implementation of the first aspect or the second aspect.
In a fourth aspect, a storage system is provided. The storage system includes multiple processor cores and a memory. The memory is configured to store computer instructions; one or more of the multiple processor cores are configured to execute the computer instructions stored in the memory, and when the computer instructions in the memory are executed, the one or more processor cores are configured to execute the method in any possible implementation of the first aspect or the second aspect.
In a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to execute the method in any possible implementation of the first aspect or the second aspect.
In a sixth aspect, a computer program product including computer instructions is provided. When the computer instructions are run on a computer, the computer is caused to execute the method in any possible implementation of the first aspect or the second aspect.
FIG. 1 is a schematic diagram of a storage array architecture according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a controller of a storage array according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a distributed block storage system according to an embodiment of the present invention.
FIG. 4 is a schematic structural block diagram of a server of the distributed block storage system.
FIG. 5 is a schematic block diagram of a processor according to an embodiment of the present invention.
FIG. 6 is a schematic flowchart of a method for processing a service request in a storage system according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of the principle of scheduling processor cores based on a sliding window mechanism according to an embodiment of the present invention.
FIG. 8 is a schematic diagram of topological distances between logical cores sharing different levels of memory or cache under a NUMA architecture according to an embodiment of the present invention.
FIG. 9 is a schematic flowchart of a configuration method for processing a service request according to an embodiment of the present invention.
FIG. 10 is a schematic block diagram of an apparatus for processing a service request according to an embodiment of the present invention.
FIG. 11 is a schematic block diagram of a storage system according to an embodiment of the present invention.
The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings.
First, storage systems to which the embodiments of the present invention are applicable are introduced.
As shown in FIG. 1, the storage system in the embodiments of the present invention may be a storage array (for example, the Huawei Oceanstor 18000 series or V3 series). The storage array includes a storage controller 101 and multiple hard disks, where the hard disks include solid state disks (SSD), mechanical hard disks such as HDDs (hard disk drives), or hybrid hard disks. As shown in FIG. 2, the controller 101 includes a central processing unit (CPU) 201, a memory 202, and an interface 203. The memory 202 stores computer instructions, and the CPU 201 includes multiple processor cores (not shown in FIG. 2). The CPU 201 executes the computer instructions in the memory 202 to perform management and data-access operations on the storage system. In addition, to save the computing resources of the CPU 201, a field programmable gate array (FPGA) or other hardware may also be used to perform all of the operations of the CPU 201 in the embodiments of the present invention, or the FPGA or other hardware and the CPU 201 may each perform part of those operations. For ease of description, the embodiments of the present invention refer to the combination of the CPU 201 and the memory 202 as a processor, or refer to the FPGA or other hardware replacing the CPU 201 as a processor, or collectively refer to the combination of the FPGA or other hardware and the CPU 201 as a processor. The processor communicates with the interface 203. The interface 203 may be a network interface card (NIC), a host bus adapter (HBA), or the like.
In the storage array described in FIG. 1 and FIG. 2, the CPU 201 is configured to process service requests, for example, to receive a service request sent by a host or a client and process it using the method for processing a service request provided by the embodiments of the present invention.
Further, the storage system in the embodiments of the present invention may also be a distributed file storage system (such as the Huawei ... 9000 series) or a distributed block storage system (such as the Huawei ... series). Taking the Huawei ... series as an example, and as shown in FIG. 3, a distributed block storage system includes multiple servers, such as server 1, server 2, server 3, server 4, server 5, and server 6, which communicate with one another through InfiniBand technology, an Ethernet network, or the like. In practical applications, the number of servers in the distributed block storage system may be increased according to actual requirements, which is not limited in the embodiments of the present invention.
The servers of the distributed block storage system have the structure shown in FIG. 4. As shown in FIG. 4, each server in the distributed block storage system includes a central processing unit (CPU) 401, a memory 402, an interface 403, hard disk 1, hard disk 2, and hard disk 3. The memory 402 stores computer instructions; the CPU 401 includes multiple processor cores (not shown in FIG. 4) and executes the computer instructions in the memory 402 to perform the corresponding operations. The interface 403 may be a hardware interface, such as a network interface card (NIC) or a host bus adapter (HBA), or may be a program interface module or the like. The hard disks include solid state disks (SSD), mechanical hard disks such as HDDs (hard disk drives), or hybrid hard disks. In addition, to save the computing resources of the CPU 401, a field programmable gate array (FPGA) or other hardware may also perform the above corresponding operations in place of the CPU 401, or the FPGA or other hardware may perform them together with the CPU 401. For ease of description, the embodiments of the present invention refer to the combination of the CPU 401 and the memory 402 as a processor, or refer to the FPGA or other hardware replacing the CPU 401 as a processor, or collectively refer to the combination of the FPGA or other hardware and the CPU 401 as a processor.
In the distributed block storage system described in FIG. 3 and FIG. 4, the CPU 401 is configured to process service requests, for example, to receive a service request sent by a host or a client and process it using the method for processing a service request provided by the embodiments of the present invention.
A general method of processing service requests is briefly introduced below:
When a service request is processed, the load of each processor core is estimated from the number of pending service requests on each processor core in a storage system containing multiple processor cores, and the service request is finally sent to the processor core with the lightest load (for example, the smallest number of pending service requests) in the storage system.
Although this method can mitigate load imbalance among processor cores, the time complexity of processing a service request scales linearly with the number of processor cores, making the latency of processing service requests uncontrollable.
In view of the above problems, an embodiment of the present invention proposes a method for processing a service request. A pending service request can be divided into requests of multiple stages for execution, a certain number of processor cores (for example, a set of processor cores) is allocated to each stage's request, and each stage's request is sent to the lightest-loaded processor core in the set of processor cores allocated to that stage, rather than to the lightest-loaded core among all processor cores in the storage system. In the embodiments of the present invention, the set of processor cores for each stage's request is allocated based on factors that affect latency, such as the latency with which a CPU (for example, a processor core) accesses each level of memory or cache, the access distance, the connection relationships between processors, or the bus type. The method for processing a service request in the embodiments of the present invention can ensure load balancing among processor cores and schedules the current-stage request within the scope of a processor-core set; compared with directly selecting the lightest-loaded processor core in the storage system, it takes into account the correlation between each stage's request and the latency with which processor cores process requests of each stage, reducing the latency of processing the service request. Exemplarily, an access request can be divided into two stages: a wait-for-resource stage and a use-resource stage. A request in the wait-for-resource stage generally needs to request a particular resource, such as a disk, memory, or file; when the resource is occupied by a previous request and has not yet been released, the request in the wait-for-resource stage is blocked until the resource becomes available. A request in the use-resource stage is the request that actually performs data access. As another example, consider the small computer system interface (SCSI) subsystem. The SCSI subsystem is a layered architecture with three layers. The top layer, called the higher layer, represents the highest interface through which the operating system kernel accesses drivers for SCSI devices and the main device types. Next is the middle layer, also called the common layer or unified layer, which contains common services shared by the higher and lower layers of the SCSI stack. Finally, the lower layer represents the actual drivers for the physical interfaces of SCSI devices. A SCSI-based access request is correspondingly divided into requests of three stages.
Before the method for processing a service request in a storage system provided by the embodiments of the present invention is introduced, the processor provided by the embodiments of the present invention (for example, the CPU 201 in FIG. 2 and the CPU 401 in FIG. 4) is first described.
As shown in FIG. 5, the processor in the embodiments of the present invention includes multiple processor cores (for example, processor core 0 to processor core S, with S ≥ 2). One of the processor cores includes a load balancing module 501 and a core-binding relationship calculation module 502, and the other processor cores include a scheduling module 503. The load balancing module 501 is configured to calculate, for each stage of a service request, the number of processor cores to be bound. The core-binding relationship calculation module 502 is configured to allocate a set of processor cores satisfying the corresponding number to each stage's request and thereby generate a core-binding relationship, where the core-binding relationship indicates the correspondence between a stage of the service request and the set of processor cores that processes that stage's request. The scheduling module 503 is configured to store the core-binding relationship and, upon receiving a request of a certain stage, query the core-binding relationship, determine the set of processor cores for executing that stage's request, and send the request to the lightest-loaded processor core in that set, which then executes it.
In addition, among the processor cores on which the scheduling module 503 is deployed, at least one processor core also has a listening module 504 deployed. The listening module 504 is configured to listen for service requests from a host or a client and, upon detecting one, forward it to the scheduling module 503 on the processor core.
It should be noted that the above description uses the deployment of the load balancing module 501 and the core-binding relationship calculation module 502 on processor core S merely as an example; the embodiments of the present invention are not limited thereto. The load balancing module 501 and the core-binding relationship calculation module 502 may be deployed on any one of processor core 0 to processor core S, and they may be deployed on the same processor core or on different processor cores.
The method 600 for processing a service request in a storage system provided by an embodiment of the present invention is described in detail below. FIG. 6 shows a schematic flowchart of the method, including steps 601 to 603.
601: Receive a current-stage request of a service request, where the current-stage request is a request of one stage among the multiple stages of the service request. It should be noted that, in the embodiments of the present invention, the processing of a service request is divided into multiple stages, each stage is allocated a set of processor cores, and the lightest-loaded processor core in the corresponding set processes the request of that stage. The request of the stage of the service request currently awaiting processing is called the current-stage request.
Specifically, for example, when the listening module 504 in a processor core (for example, the listening module 504 in processor core 1) detects the service request from a host or a client, the current-stage request is the first-stage request among the multiple stages of the service request.
The listening module 504 in processor core 1 sends the current-stage request to the scheduling module 503 in processor core 1.
602: Determine a first set of processor cores for executing the current-stage request, where the first set of processor cores is a subset of the multiple processor cores.
Specifically, the scheduling module 503 in processor core 1 determines, for the received current-stage request, the set of processor cores (for example, the first set of processor cores) that will execute it.
For example, the scheduling module 503 may determine, according to the specific type of the current-stage request, the first set of processor cores for executing it, where the first set of processor cores is a subset of the multiple processor cores in the storage system.
For another example, determining the first set of processor cores for executing the current-stage request includes: querying a core-binding relationship to determine the first set of processor cores for executing the current-stage request, where the core-binding relationship indicates the association between the current-stage request and the first set of processor cores.
具体地,处理器核1中的调度模块503可以查询绑核关系,该绑核关系中指示了为该 业务请求的每一阶段的请求分配的处理器核集合,每个处理器核集合中包括多个处理器核,处理器核1中的调度模块503根据该绑核关系,确定执行当前阶段的请求的第一处理器核集合。Specifically, the scheduling module 503 in the processor core 1 may query a binding core relationship, where the binding core relationship indicates a set of processor cores allocated for each stage of the service request request, and each processor core set includes A plurality of processor cores, and the scheduling module 503 in the processor core 1 determines a first set of processor cores that executes the request in the current stage according to the core-binding relationship.
例如,处理器核1中的调度模块503查询该绑核关系,确定包含处理器核1、处理器核2、处理器核4、处理器核7与处理器核9的处理器核集合与当前阶段的请求之间存在关联关系,进而将该处理器核集合确定为执行当前阶段的请求的第一处理器核集合。For example, the scheduling module 503 in the processor core 1 queries the binding core relationship to determine the processor core set including the processor core 1, the processor core 2, the processor core 4, the processor core 7 and the processor core 9 and the current core. There is an association relationship between the requests of the phases, and then the processor core set is determined as the first processor core set that executes the request of the current phase.
603. Send the current-stage request to the lightest-loaded processor core in the first set of processor cores.
Specifically, after determining the first set of processor cores for executing the current-stage request, the scheduling module 503 in processor core 1 sends the request to the lightest-loaded processor core in the first set, which then executes the current-stage request. For example, if the scheduling module 503 in processor core 1 determines that, among processor cores 1, 2, 4, 7, and 9 in the first set, processor core 7 carries the lightest load, it sends the request to processor core 7, and processor core 7 executes the current-stage request.
After processor core 7 finishes executing the current-stage request, the service request enters the next execution stage. The scheduling module 503 in processor core 7 determines, according to the saved core-binding relationship, the set of processor cores for executing the next-stage request of the service request, and sends that request to the lightest-loaded processor core in that set, which executes it. These operations repeat in turn until processing of the service request is finally complete.
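The dispatch loop of steps 602 and 603 can be sketched as follows. This is a minimal illustration, not the patented implementation: the binding table, core IDs, and load figures are invented for the example, and queue length stands in for "load".

```python
# Hedged sketch of steps 602-603: for each stage of a service request,
# look up the bound core set and dispatch to its lightest-loaded core.
# The binding table and load values below are illustrative only.

# Core-binding relationship: stage index -> set of bound processor cores
binding = {
    0: [1, 2, 4, 7, 9],   # cores bound to first-stage requests
    1: [3, 5, 8],         # cores bound to second-stage requests
}

# Pending-queue length per core stands in for the core's load
load = {1: 4, 2: 2, 3: 5, 4: 3, 5: 1, 7: 0, 8: 2, 9: 6}

def dispatch(stage):
    """Return the lightest-loaded core in the set bound to `stage`."""
    cores = binding[stage]                       # step 602: query binding
    target = min(cores, key=lambda c: load[c])   # step 603: lightest load
    load[target] += 1                            # request enqueued there
    return target

# As in the example above, the first-stage request goes to core 7 (load 0).
print(dispatch(0))  # -> 7
```

When the stage completes, the executing core repeats the same lookup for the next stage's binding, so the request hops between core sets until all stages are done.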
By dividing a pending service request into multiple stages for execution, allocating a number of processor cores (for example, a set of processor cores) to each stage's request, and sending each stage's request to the lightest-loaded core within the set allocated to that stage, rather than to the lightest-loaded core among all processor cores in the storage system, the method for processing service requests in this embodiment of the present invention ensures load balancing among processor cores. Because a set of processor cores is determined for each stage of the service request and the current-stage request is scheduled within that set, rather than by directly selecting the lightest-loaded core in the storage system, the method takes into account the correlation between each stage's request and the factors affecting the latency with which processor cores handle that stage, thereby reducing the latency of processing service requests.
Optionally, the first set of processor cores includes K processor cores, K being an integer greater than or equal to 3, and sending the current-stage request to the lightest-loaded processor core in the first set includes: determining, according to a sliding-window length w and a sliding step d, a scheduling sub-region for the current-stage request within the K processor cores, the sub-region including w processor cores, where w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and sending the current-stage request to the lightest-loaded core among those w processor cores.
Specifically, after determining the first set of processor cores for executing the current-stage request, the scheduling module 503 may send the request to the lightest-loaded core in the first set, to be executed there; alternatively, the core that executes the current-stage request may be determined based on a sliding-window mechanism. In the latter case, the scheduling module 503 determines, according to the sliding-window length w and the sliding step d, a scheduling sub-region for the current-stage request within the first set of processor cores determined from the core-binding relationship, identifies the lightest-loaded core among the cores in that sub-region, and sends the request to that core.
For example, suppose the first set of processor cores determined by the scheduling module 503 from the core-binding relationship consists of processor cores 1, 3, 4, 5, 8, 9, and 10 (that is, K = 7), with w = 3 and d = 2. The scheduling sub-region determined by the scheduling module 503 for the current-stage request is shown in Figure 7; as can be seen, it includes processor cores 1, 3, and 4. The scheduling module 503 therefore sends the current-stage request to whichever of processor cores 1, 3, and 4 carries the lightest load, and that core executes the current-stage request.
When the set consisting of processor cores 1, 3, 4, 5, 8, 9, and 10 is also used to process some stage of another service request arriving after the current-stage request, the scheduling sub-region for that stage is obtained by sliding the window backward by two processor cores, giving the sub-region formed by processor cores 4, 5, and 8. The scheduling module 503 sends that stage's request to whichever of processor cores 4, 5, and 8 carries the lightest load, and that core executes that stage's request.
When determining the processor core to execute the request of any stage, introducing the sliding-window mechanism narrows the range to be searched for the lightest-loaded core, so that the lightest-loaded core within the scheduling sub-region executes the corresponding stage's request. The method for processing service requests in this embodiment of the present invention thus ensures load balancing among processor cores and, by determining a set of processor cores for each stage and scheduling the current-stage request within that set rather than directly selecting the lightest-loaded core in the storage system, takes into account the correlation between each stage's request and the factors affecting the latency with which processor cores handle it, further reducing the latency of processing service requests.
After the sliding-window mechanism is introduced, consider the case where requests of multiple stages are bound to the same set of processor cores and every core in that set carries the same load. When those stages' requests are processed in turn, maintaining load balance among the cores requires that cores with equal load (that is, with pending request queues of equal length) be selected to execute requests with equal probability; in other words, each core must serve as the search starting point of the sliding window with equal probability. When the number K of processor cores in the set and the sliding step d are coprime, every core is guaranteed to serve as the window's search starting point with equal probability.
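A minimal sketch of the sliding-window selection follows. The window state and names are assumptions for illustration; the core set, w, and d match the Figure 7 example above, and the final assertion checks the coprimality condition just described.

```python
from math import gcd

# Hedged sketch of the sliding-window mechanism: K bound cores, window
# length w, sliding step d. The per-set window state is illustrative.
cores = [1, 3, 4, 5, 8, 9, 10]   # first set of processor cores (K = 7)
K, w, d = len(cores), 3, 2
start = 0                        # current window start within the set

def window():
    """Return the w cores in the current scheduling sub-region, then
    slide the window backward by d cores for the next request."""
    global start
    sub = [cores[(start + i) % K] for i in range(w)]
    start = (start + d) % K
    return sub

print(window())  # first sub-region: cores 1, 3, 4 (as in Figure 7)
print(window())  # window slid by two cores: sub-region 4, 5, 8
# gcd(K, d) == 1 means the start position cycles through all K cores,
# so equally loaded cores start a window with equal probability.
assert gcd(K, d) == 1
```

The lightest-loaded core of each returned sub-region would then receive the request, exactly as in step 603 but over w cores instead of K.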
It should be noted that when the storage system first starts running, the core-binding relationship may be preconfigured; afterwards, the core-binding-relationship calculation module 502 in a processor core updates it, that is, generates a new core-binding relationship. The method for generating a new core-binding relationship provided in this embodiment of the present invention is described in detail below.
By way of example and not limitation, the method further includes: re-determining, according to the first set of processor cores, the number of processor cores to execute the current-stage request; allocating to the current-stage request, from the multiple processor cores and according to the re-determined number, a second set of processor cores satisfying that number; and generating, according to the second set of processor cores, a new core-binding relationship indicating an association between the current-stage request and the second set of processor cores.
Specifically, as the storage system runs, the load-balancing module 501 in processor core S periodically determines, for each of the multiple stages of the service request, the number of processor cores in the set used to execute that stage's request, and provides the determined numbers to the core-binding-relationship calculation module 502. According to the re-determined numbers provided by the load-balancing module 501, the calculation module 502 reallocates to each stage's request a set of processor cores satisfying the corresponding number, and periodically generates a new core-binding relationship accordingly.
The method of re-determining the number of processor cores used to execute each stage's request is described below, taking as an example the load-balancing module 501 re-determining the number of processor cores used to execute the current-stage request.
By way of example and not limitation, re-determining the number of processor cores to execute the current-stage request according to the first set of processor cores includes: determining the sum of the utilizations of the processor cores in the first set and the average utilization of the multiple processor cores; and re-determining the number of processor cores to execute the current-stage request according to that sum and that average.
Specifically, the load-balancing module 501 monitors the utilization of every processor core in the storage system in real time, where a core's utilization is the ratio of its running time to the sum of its running time and idle time, and re-determines the number of processor cores in the set used to execute the current-stage request according to changes in core utilization.
For example, in the current monitoring period, the first set of processor cores bound to the current-stage request is denoted P and its utilization is denoted U_P. The utilization U_P of the first set equals the sum of the utilizations, during the current period, of the processor cores in the first set:

U_P = Σ_{j∈P} U_j    (1)

where U_j denotes the utilization of any processor core in the first set during the current period.
Denoting the average utilization of the multiple processor cores in the storage system during the current period as U_ave, the scheduling module 503 re-determines the number of processor cores in the set used to execute the current-stage request according to U_P and U_ave.
By way of example and not limitation, re-determining the number of processor cores to execute the current-stage request according to the sum of the utilizations of the processor cores in the first set and the average utilization of the multiple processor cores includes re-determining that number based on the following relation:

N = U_P / U_ave    (2)

where N is the re-determined number of processor cores to execute the current-stage request, U_P is the sum of the utilizations of the processor cores in the first set, and U_ave is the average utilization of the multiple processor cores.
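Relations (1) and (2) can be sketched directly. The utilization figures below are invented for illustration, and rounding N to a whole core count is an assumption: the text states only the ratio.

```python
# Hedged sketch of relations (1) and (2). Rounding to an integer core
# count is an assumption not stated in the text.
def recompute_core_count(set_utils, all_utils):
    """Re-determine how many cores a stage's bound set should have."""
    u_p = sum(set_utils)                     # relation (1): U_P = sum U_j
    u_ave = sum(all_utils) / len(all_utils)  # average over all cores
    return max(1, round(u_p / u_ave))        # relation (2), rounded

# A bound set of 8 cores whose combined utilization only justifies 6:
bound = [0.30] * 8      # U_P = 2.4
system = [0.40] * 16    # U_ave = 0.40
print(recompute_core_count(bound, system))  # -> 6
```

This matches the worked example that follows, where a stage bound to 8 cores is re-determined to need only 6.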
After the load-balancing module 501 re-determines, in the current period, the number N of processor cores to execute the current-stage request, it provides that number to the core-binding-relationship calculation module 502, which, at the start of the next period, reallocates to the current-stage request a set of processor cores satisfying the number N (for example, the second set of processor cores).
For example, suppose the number of processor cores used to execute the current-stage request in the current period is 8, and the load-balancing module 501 re-determines that number in the current period to be 6. The load-balancing module 501 provides the re-determined number 6 to the core-binding-relationship calculation module 502, which may then, at the start of the next period, delete two processor cores from the eight saved in the core-binding relationship for executing the current-stage request, thereby generating a new core-binding relationship.
As another example, after the load-balancing module 501 provides the re-determined number 6 to the core-binding-relationship calculation module 502, the calculation module 502 may, instead of deleting two of the eight cores saved in the core-binding relationship for executing the current-stage request, reallocate six processor cores in the storage system to the current-stage request and, at the start of the next period, replace the eight cores originally allocated in the core-binding relationship with the six reallocated cores, thereby generating a new core-binding relationship.
By periodically monitoring the utilization of the processor cores in the storage system and reallocating processor cores to each stage's request according to changes in the utilization of the cores allocated to that stage, the cores allocated to each stage can be adjusted periodically as utilization changes, thereby alleviating load imbalance among processor cores.
The method by which the core-binding-relationship calculation module 502 allocates, in the storage system, a set of processor cores satisfying the corresponding number to each stage's request is described in detail below, taking as an example its allocation of the required number of processor cores to the current-stage request.
In a storage system, multiple processor cores usually share memory or caches at different levels, which may include the L1 cache, L2 cache, L3 cache, and local memory. When processor cores share memory or caches at different levels, the topological distances between the cores also differ.
In a non-uniform memory access (NUMA) architecture, each processor core can access the local memory of a remote node (hereinafter "remote memory"). When hyper-threading is used, each processor core can be abstracted as multiple logical cores; for example, each processor core is abstracted as two logical cores, logical core 0 and logical core 1, as shown in Figure 8.
Figure 8 is a schematic diagram of the topological distances between logical cores sharing different levels of memory or cache under the NUMA architecture. As can be seen, under the NUMA architecture there are node 0 and node 1; a logical core in node 0 can share the local memory of node 1 with a logical core in node 1, and the local memory of node 1 is remote memory from the perspective of node 0.
As can be seen from Figure 8, the topological distance between two logical cores in node 0 sharing the L1 cache is D_1; between two logical cores sharing the L2 cache, D_2; between two logical cores sharing the L3 cache, D_3; and between two logical cores sharing local memory, D_4. When a logical core in node 0 and a logical core in node 1 share the local memory of node 1, the topological distance between the two logical cores is D_5.
According to the CPU manuals of the various versions released by Intel, the latencies with which the CPU accesses each level of memory or cache can be obtained. Taking the Xeon E5-2658 v2 CPU as an example, the access latencies are shown in Table 1.
Table 1

Shared memory or cache | Access latency
L1 cache | 1.3 ns
L2 cache | 3.7 ns
L3 cache | 12.8 ns
Local memory | 56.5 ns
By referring to the proportional relationship between the CPU's access latencies to the different levels of memory or cache, the topological distance between two logical cores sharing a given level can be quantified. Assuming that the topological distance between two logical cores sharing the L1 cache is D_1 = 1, the topological distances between two logical cores sharing the other levels of memory or cache can be obtained from the CPU's access latencies, as shown in Table 2. In the NUMA architecture, the ratio of local-memory access latency to remote-memory access latency is approximately 8:12; therefore, the topological distance between logical cores of different nodes sharing remote memory can be calculated as 64.
Table 2

Shared memory or cache | Topological distance between two logical cores
L1 cache | 1
L2 cache | 3
L3 cache | 10
Local memory | 43
Remote memory | 64
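The derivation of Table 2 from Table 1 can be reproduced in a few lines. The exact rounding rule is an assumption; the text states only that the distances follow the proportional relationship of the latencies, normalized so that the L1-cache distance is 1.

```python
# Hedged sketch: derive the topological distances of Table 2 from the
# access latencies of Table 1 by normalizing to the L1-cache latency.
latency_ns = {            # Table 1 (Xeon E5-2658 v2)
    "L1 cache": 1.3,
    "L2 cache": 3.7,
    "L3 cache": 12.8,
    "local memory": 56.5,
}

distance = {level: round(t / latency_ns["L1 cache"])
            for level, t in latency_ns.items()}
# Remote memory: local/remote access-latency ratio of about 8:12 in
# NUMA, so scale the local-memory distance by 12/8 (integer division
# used here as the assumed rounding rule).
distance["remote memory"] = distance["local memory"] * 12 // 8

print(distance)
```

Running this reproduces the tabulated values 1, 3, 10, 43, and 64.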
The method by which the core-binding-relationship calculation module 502 of this embodiment of the present invention allocates, in the storage system, the required number of processor cores to each stage's request is described in detail below, assuming that the CPUs in the storage system follow the topology shown in Figure 8 and taking the allocation of the required number of logical cores to the current-stage request as an example. Node 0 and node 1 in Figure 8 are in a NUMA architecture and communicate with each other through hyper-threading.
By way of example and not limitation, allocating to the current-stage request, from the multiple processors, a second set of processor cores satisfying the number includes: generating multiple groups of allocation results, each group including a set of processor cores, satisfying the corresponding number, allocated to each stage's request; and determining multiple path lengths for the multiple groups of allocation results, each group corresponding to one path length, where the path length L satisfies:

L = Σ_{i=0}^{M-2} c_{i,i+1} × d_{i,i+1}    (3)

where c_{i,i+1} denotes the communication volume generated by interaction between the processor cores executing the requests of adjacent stages, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of stages of requests of the service request; the communication volume may represent the number of interactions between processor cores. The current-stage request is then allocated processor cores satisfying the number according to the group of allocation results corresponding to the shortest of the multiple path lengths.
Specifically, in the CPU topology shown in Figure 8, when hyper-threading is used, each processor core is abstracted as logical core 0 and logical core 1, so 16 processor cores are abstracted as 32 logical cores.
Assume the service request is processed as requests of three stages, denoted M_0, M_1, and M_2. Using, for example, the method described above for determining the number of processor cores to execute the current-stage request, the numbers of logical cores used to execute M_0, M_1, and M_2 are determined in the current period as 8, 8, and 16, respectively.
The core-binding-relationship calculation module 502 generates, according to the numbers of logical cores determined for M_0, M_1, and M_2, multiple groups of allocation results, each group including logical cores, satisfying the corresponding number, allocated to each stage's request.
For example, allocation result 1 is: logical cores 0 to 7 of node 0 are allocated to M_0, logical cores 8 to 15 of node 0 to M_1, and logical cores 0 to 15 of node 1 to M_2.

Allocation result 2 is: logical cores 0 to 3 of node 0 and logical cores 0 to 3 of node 1 are allocated to M_0; logical cores 4 to 7 of node 0 and logical cores 4 to 7 of node 1 to M_1; and logical cores 8 to 15 of node 0 and logical cores 8 to 15 of node 1 to M_2.
For allocation result 1, the path length is calculated using equation (3). Denote the average topological distance between the logical cores executing M_0 and M_1 as d_{0,1} and that between the logical cores executing M_1 and M_2 as d_{1,2}; then d_{0,1} = D_4 and d_{1,2} = D_5. Denote the communication volume generated by interaction between the logical cores executing M_0 and M_1 as c_{0,1} and that between the logical cores executing M_1 and M_2 as c_{1,2}. The path length L_1 corresponding to allocation result 1 then satisfies:

L_1 = c_{0,1} × D_4 + c_{1,2} × D_5    (4)

From Table 2, D_3 = 10, D_4 = 43, and D_5 = 64, so L_1 = c_{0,1} × 43 + c_{1,2} × 64.
For allocation result 2, the path length is likewise calculated using equation (3). Here the average topological distance between the logical cores executing M_0 and M_1 is d_{0,1} = D_3 × 0.5 + D_5 × 0.5, and that between the logical cores executing M_1 and M_2 is d_{1,2} = D_4 × 0.5 + D_5 × 0.5. With the communication volumes c_{0,1} and c_{1,2} denoted as above, the path length L_2 corresponding to allocation result 2 satisfies:

L_2 = c_{0,1} × (D_3 × 0.5 + D_5 × 0.5) + c_{1,2} × (D_4 × 0.5 + D_5 × 0.5)    (5)

From Table 2, D_3 = 10, D_4 = 43, and D_5 = 64, so L_2 = c_{0,1} × 37 + c_{1,2} × 53.5.
It can be seen that the path length corresponding to allocation result 2 is shorter than that of allocation result 1. The core-binding-relationship calculation module 502 therefore allocates logical cores 0 to 3 of node 0 and logical cores 0 to 3 of node 1 to M_0, logical cores 4 to 7 of node 0 and logical cores 4 to 7 of node 1 to M_1, and logical cores 8 to 15 of node 0 and logical cores 8 to 15 of node 1 to M_2, and at the start of the next period replaces, in the core-binding relationship, the processor cores originally allocated to the requests of the respective stages of the service request with the reallocated cores.
According to the multiple sets of processor core allocation results generated in the above embodiment, multiple path lengths are determined for the multiple sets of allocation results. By taking the topological distance between processor cores into account when allocating processor cores to service modules, and determining the allocation result corresponding to the shortest of the multiple path lengths as the final processor core allocation result, load balancing among the processor cores is ensured. A processor core set is determined for the request of each stage of the service request, and the request of the current stage is scheduled within that set. Compared with directly selecting the most lightly loaded processor core in the storage system, this takes into account the correlation between the requests of the stages and the factors affecting the latency with which the processor cores process them, reducing the latency of processing the service request.
It should be noted that listing only two logical core allocation results above is merely an illustrative example and does not limit the embodiments of the present invention in any way. In practical applications, multiple allocation results may be generated at random, and logical cores are allocated to the requests of each stage according to the allocation result, among the multiple sets, that corresponds to the shortest path length. Illustratively, in the embodiments of the present invention, a processor core set may also be allocated to the request of each stage based on other factors that affect latency, such as the connection relationship between processors or the bus type. The embodiments of the present invention do not limit this.
The following describes in detail a configuration method 700 for processing a service request according to an embodiment of the present invention. FIG. 9 shows a schematic flowchart of the configuration method for processing a service request.
701. Configure a first processor core set for the request of the first stage of a service request, where the first processor core set is used to execute the request of the first stage.
Specifically, the processing of a service request is divided into multiple stages, and the multiple stages correspond to requests of multiple stages. For example, the requests of the multiple stages include a request of a first stage; a processor core set (for example, a first processor core set) is configured for the request of the first stage, and the request of the first stage is processed by the first processor core set.
702. Configure a first rule, where the first rule instructs sending the request of the first stage to the most lightly loaded processor core in the first processor core set.
Specifically, a first rule may be configured, and the first rule may instruct the most lightly loaded processor core in the first processor core set configured for the request of the first stage to execute the request of the first stage.
Optionally, the method further includes:
703. Configure a second processor core set for the request of the second stage of the service request, where the second processor core set is used to execute the request of the second stage.
Specifically, for example, the service request further includes a request of a second stage, which may be the request of a stage following the request of the first stage. A processor core set (for example, a second processor core set) is configured for the request of the second stage, and requests within the second stage are processed by the second processor core set.
704. Configure a second rule, where the second rule instructs sending the request of the second stage to the most lightly loaded processor core in the second processor core set.
Specifically, a second rule may be configured, and the second rule may instruct the most lightly loaded processor core in the second processor core set configured for the request of the second stage to execute the request of the second stage.
Regarding how to configure the corresponding processor core sets for the request of the first stage and the request of the second stage, refer to the related description in method 600; for brevity, details are not repeated here.
By allocating a certain number of processor cores (for example, a processor core set) to the request of each stage of a service request, and sending the request of each stage to the most lightly loaded processor core in the set allocated to that stage, rather than sending the service request to the most lightly loaded core among all the processor cores of the storage system, the configuration method for processing a service request in the embodiments of the present invention ensures load balancing among the processor cores when a service request is processed: a processor core set is determined for the request of each stage of the service request, and the request of the current stage is scheduled within that set. Compared with directly selecting the most lightly loaded processor core in the storage system, this takes into account the correlation between the requests of the stages and the factors affecting the latency with which the processor cores process them, reducing the latency of processing the service request.
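The per-stage dispatch described in steps 701 to 704 can be sketched in a few lines. This is a minimal illustration; the core sets, load values, and all names are hypothetical, not taken from the patent:

```python
# Each stage of a service request has its own configured processor core set;
# a request is dispatched only to the lightest-loaded core within that set.
stage_core_sets = {          # stage id -> configured core set (one rule per stage)
    "stage1": [0, 1, 2, 3],
    "stage2": [4, 5, 6, 7],
}

core_load = {c: 0.0 for c in range(8)}  # illustrative per-core load figures

def dispatch(stage, request_cost):
    # Send a request of `stage` to the lightest-loaded core in its own set.
    cores = stage_core_sets[stage]
    target = min(cores, key=lambda c: core_load[c])
    core_load[target] += request_cost
    return target

print(dispatch("stage1", 1.0))  # 0 (all loads equal; min picks core 0)
print(dispatch("stage1", 1.0))  # 1
```

Note that a "stage2" request can never land on cores 0 to 3 even if they are idle, which is exactly the restriction that distinguishes this scheme from picking the lightest-loaded core system-wide.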
It should be noted that the above description uses, as an example, a service request that includes a request of a first stage and a request of a second stage; this does not specifically limit the embodiments of the present invention. For example, the service request may further include requests of other stages.
Further, for the method for determining a processor core set in the foregoing configuration method embodiment, refer to the description in the relevant parts of the foregoing embodiments of the present invention; details are not repeated here.
The method for processing a service request in a storage system and the configuration method for processing a service request according to the embodiments of the present invention have been described above with reference to FIG. 6 to FIG. 9. The following describes the apparatus for processing a service request and the storage system according to the embodiments of the present invention with reference to FIG. 10 and FIG. 11.
FIG. 10 is a schematic block diagram of an apparatus 800 for processing a service request according to an embodiment of the present invention. The apparatus 800 is configured in a storage system and includes a transceiver module 801 and a processing module 802.
The transceiver module 801 is configured to receive the request of the current stage of a service request, where the request of the current stage is the request of one stage among the requests of the multiple stages of the service request.
The processing module 802 is configured to determine a first processor core set for executing the request of the current stage, where the first processor core set is a subset of the multiple processor cores.
The transceiver module 801 is further configured to send the request of the current stage to the most lightly loaded processor core in the first processor core set.
Optionally, the processing module 802 is further configured to query a core-binding relationship to determine the first processor core set used to execute the request of the current stage, where the core-binding relationship indicates the association between the request of the current stage and the first processor core set.
Optionally, the processing module 802 is further configured to: re-determine, according to the first processor core set, the number of processor cores for executing the request of the current stage; allocate, among the multiple processor cores and according to the re-determined number, a second processor core set satisfying that number to the request of the current stage; and generate a new core-binding relationship according to the second processor core set, where the new core-binding relationship indicates the association between the request of the current stage and the second processor core set.
Optionally, the processing module 802 is further configured to determine the sum of the utilizations of the processor cores in the first processor core set and the average utilization of the multiple processor cores, and to re-determine, according to that sum and that average utilization, the number of processor cores for executing the request of the current stage.
Optionally, the processing module 802 is further configured to re-determine, according to the sum of the utilizations of the processor cores in the first processor core set and the average utilization of the multiple processor cores, the number of processor cores for executing the request of the current stage based on the following relationship:
N = U_P / U_ave
where N is the re-determined number of processor cores for executing the request of the current stage, U_P is the sum of the utilizations of the processor cores in the first processor core set, and U_ave is the average utilization of the multiple processor cores.
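The recalculation N = U_P / U_ave can be sketched as follows. Rounding N up to a whole number of cores is our assumption; the text gives only the ratio:

```python
import math

def recompute_core_count(set_utilizations, all_utilizations):
    # N = U_P / U_ave: U_P is the utilization sum of the stage's current
    # core set, U_ave the average utilization over every core in the system.
    u_p = sum(set_utilizations)
    u_ave = sum(all_utilizations) / len(all_utilizations)
    return math.ceil(u_p / u_ave)  # rounding up is assumed, not stated

# A 3-core set doing the work of 4 average cores in an 8-core system:
print(recompute_core_count([1.0, 0.5, 0.5], [0.5] * 8))  # 4
```

Intuitively, N is how many average-load cores the stage's current workload is worth, so a hot stage is granted more cores at the next rebinding and an idle stage fewer.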
Optionally, the processing module 802 is further configured to: generate multiple sets of allocation results, where each set of allocation results includes, for the request of each stage, a reallocated processor core set satisfying the corresponding number; and determine multiple path lengths for the multiple sets of allocation results, where each set of allocation results corresponds to one path length, and the path length L satisfies:
L = Σ_{i=0}^{M-2} c_{i,i+1} × d_{i,i+1}
where c_{i,i+1} denotes the traffic generated by the interaction between the processor cores executing the requests of adjacent stages, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of the requests of the multiple stages of the service request; and allocate, according to the set of allocation results corresponding to the shortest of the multiple path lengths, a second processor core set satisfying the number to the request of the current stage.
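The selection among candidate allocation results by shortest path length can be sketched as follows; the function names and candidate values are illustrative only:

```python
def path_length(traffic, avg_distance):
    # traffic[i] = c_{i,i+1}, avg_distance[i] = d_{i,i+1} for one allocation result.
    return sum(c * d for c, d in zip(traffic, avg_distance))

def pick_allocation(candidates, traffic):
    # Return the index of the candidate allocation with the shortest path length.
    lengths = [path_length(traffic, d) for d in candidates]
    return lengths.index(min(lengths))

# The two allocation results of the worked example (Table 2 distances):
print(pick_allocation([[43, 64], [37, 53.5]], [1.0, 1.0]))  # 1
```

In practice each candidate would be one randomly generated set of per-stage core sets, with its average inter-stage distances computed from the core topology before the comparison.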
Optionally, the first processor core set includes K processor cores, where K is an integer greater than or equal to 3. The processing module 802 is further configured to determine, according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage among the K processor cores, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K.
The transceiver module 801 is further configured to send the request of the current stage to the most lightly loaded processor core among the w processor cores.
Optionally, d and K are coprime.
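The effect of choosing the sliding step d coprime with the set size K can be illustrated with a short sketch (hypothetical names throughout): the window start offset then cycles through all K positions before repeating, so every core in the set eventually falls inside the scheduling sub-region.

```python
from math import gcd

def window_starts(K, d):
    # Successive window start offsets over K scheduling periods.
    # When gcd(d, K) == 1 the starts visit every offset 0..K-1 before repeating.
    start, seen = 0, []
    for _ in range(K):
        seen.append(start)
        start = (start + d) % K
    return seen

def window(cores, start, w):
    # The w cores eligible for scheduling in the current period (wraps around).
    K = len(cores)
    return [cores[(start + i) % K] for i in range(w)]

K, d, w = 7, 3, 2
assert gcd(d, K) == 1                # d and K coprime
cores = list(range(K))
starts = window_starts(K, d)
print(starts)                        # [0, 3, 6, 2, 5, 1, 4] -- all offsets visited
print(window(cores, starts[1], w))   # [3, 4] -- the sub-region in period 1
```

By contrast, K = 8 with d = 4 would shuttle the start between offsets 0 and 4 forever, leaving some cores permanently outside the sub-region.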
The apparatus 800 for processing a service request according to the embodiments of the present invention may correspond to executing the method 600 or the method 700 described in the embodiments of the present invention, and the above and other operations and/or functions of the modules in the apparatus 800 respectively implement the corresponding procedures of the method 600 in FIG. 6 or the method 700 in FIG. 9. Correspondingly, the modules shown in FIG. 5 may correspond to one or more modules shown in FIG. 8. For brevity, details are not repeated here.
Further, the apparatus 800 for processing a service request in the embodiments of the present invention may be specifically implemented as a processor, a software module, or a combination of a processor and a software module, which is not limited in the embodiments of the present invention.
FIG. 11 is a schematic block diagram of a storage system 900 according to an embodiment of the present invention. The storage system includes a processor 901 and a memory 902, and the processor 901 includes multiple processor cores.
The memory 902 is configured to store computer instructions.
One or more of the multiple processor cores are configured to execute the computer instructions stored in the memory 902. When the computer instructions in the memory 902 are executed, the one or more processor cores are configured to perform the following operations: receiving the request of the current stage of a service request, where the request of the current stage is the request of one stage among the requests of the multiple stages of the service request; determining a first processor core set for executing the request of the current stage, where the first processor core set is a subset of the multiple processor cores; and sending the request of the current stage to the most lightly loaded processor core in the first processor core set.
Optionally, the one or more processor cores are further configured to query a core-binding relationship to determine the first processor core set used to execute the request of the current stage, where the core-binding relationship indicates the association between the request of the current stage and the first processor core set.
Optionally, the one or more processor cores are further configured to: re-determine, according to the first processor core set, the number of processor cores for executing the request of the current stage; allocate, among the multiple processor cores and according to the re-determined number, a second processor core set satisfying that number to the request of the current stage; and generate a new core-binding relationship according to the second processor core set, where the new core-binding relationship indicates the association between the request of the current stage and the second processor core set.
Optionally, the one or more processor cores are further configured to determine the sum of the utilizations of the processor cores in the first processor core set and the average utilization of the multiple processor cores, and to re-determine, according to that sum and that average utilization, the number of processor cores for executing the request of the current stage.
Optionally, the one or more processor cores are further configured to re-determine, according to the sum of the utilizations of the processor cores in the first processor core set and the average utilization of the multiple processor cores, the number of processor cores for executing the request of the current stage based on the following relationship:
N = U_P / U_ave
where N is the re-determined number of processor cores for executing the request of the current stage, U_P is the sum of the utilizations of the processor cores in the first processor core set, and U_ave is the average utilization of the multiple processor cores.
Optionally, the one or more processor cores are further configured to: generate multiple sets of allocation results, where each set of allocation results includes, for the request of each stage, a reallocated processor core set satisfying the corresponding number; and determine multiple path lengths for the multiple sets of allocation results, where each set of allocation results corresponds to one path length, and the path length L satisfies:
L = Σ_{i=0}^{M-2} c_{i,i+1} × d_{i,i+1}
where c_{i,i+1} denotes the traffic generated by the interaction between the processor cores executing the requests of adjacent stages, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of the requests of the multiple stages of the service request; and allocate, according to the set of allocation results corresponding to the shortest of the multiple path lengths, a second processor core set satisfying the number to the request of the current stage.
Optionally, the first processor core set includes K processor cores, where K is an integer greater than or equal to 3. The one or more processor cores are further configured to: determine, according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage among the K processor cores, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and send the request of the current stage to the most lightly loaded processor core among the w processor cores.
Optionally, d and K are coprime.
The modules shown in FIG. 5 in the embodiments of the present invention may be hardware logic in a processor core, computer instructions executed by a processor core, or a combination of hardware logic and computer instructions, which is not limited in the embodiments of the present invention.
The modules of the apparatus 800 for processing a service request according to the embodiments of the present invention may be implemented by a processor, by a processor together with a memory, or by software modules. Correspondingly, the modules shown in FIG. 5 may correspond to one or more modules shown in FIG. 8, and the modules shown in FIG. 8 include the corresponding functions of the modules shown in FIG. 5.
An embodiment of the present invention provides a computer-readable storage medium storing computer instructions. When the computer instructions are run on a computer, the computer is caused to execute the method for processing a service request or the configuration method for processing a service request in the embodiments of the present invention.
An embodiment of the present invention provides a computer program product containing computer instructions. When the computer instructions are run on a computer, the computer is caused to execute the method for processing a service request or the configuration method for processing a service request in the embodiments of the present invention.
It should be understood that the processor mentioned in the embodiments of the present invention may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It should also be understood that the memory mentioned in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
It should be noted that when the processor is a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, the memory (storage module) is integrated into the processor.
It should be noted that the memory described herein is intended to include, but is not limited to, these and any other suitable types of memory.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the embodiments of the present invention.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in the embodiments of the present invention, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a logical function division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
When the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several computer instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store computer instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the embodiments of the present invention, but the protection scope of the embodiments of the present invention is not limited thereto. Any change or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the embodiments of the present invention shall fall within the protection scope of the embodiments of the present invention. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.
Claims (24)
- 一种存储系统中处理业务请求的方法,所述存储系统包含多个处理器核,其特征在于,包括:A method for processing a service request in a storage system. The storage system includes multiple processor cores, and is characterized in that it includes:接收业务请求的当前阶段的请求,所述当前阶段的请求为所述业务请求的多个阶段的请求中的一个阶段的请求;Receiving a request of a current stage of a service request, where the current stage request is one of a plurality of stages of the service request;确定执行所述当前阶段的请求的第一处理器核集合,所述第一处理器核集合为所述多个处理器核的一个处理器核子集;Determining a first set of processor cores that executes the request in the current phase, the first set of processor cores being a subset of the plurality of processor cores;向所述第一处理器核集合负载最轻的处理器核发送所述当前阶段的请求。And sending the request of the current stage to the processor core with the lightest load on the first processor core set.
- The method according to claim 1, wherein the determining a first set of processor cores for executing the request of the current stage comprises: querying a core binding relationship to determine the first set of processor cores for executing the request of the current stage, where the core binding relationship indicates the association between the request of the current stage and the first set of processor cores.
- The method according to claim 2, further comprising: re-determining, based on the first set of processor cores, the number of processor cores for executing the request of the current stage; allocating, among the multiple processor cores and according to the re-determined number, a second set of processor cores satisfying that number to the request of the current stage; and generating a new core binding relationship based on the second set of processor cores, where the new core binding relationship indicates the association between the request of the current stage and the second set of processor cores.
- The method according to claim 3, wherein the re-determining, based on the first set of processor cores, the number of processor cores for executing the request of the current stage comprises: determining the sum of the utilization rates of the processor cores in the first set of processor cores and the average utilization rate of the multiple processor cores; and re-determining, based on that sum and that average utilization rate, the number of processor cores for executing the request of the current stage.
- The method according to claim 4, wherein the re-determining, based on the sum of the utilization rates of the processor cores in the first set of processor cores and the average utilization rate of the multiple processor cores, the number of processor cores for executing the request of the current stage comprises: re-determining that number according to the following relationship: N = U_P / U_ave, where N is the re-determined number of processor cores for executing the request of the current stage, U_P is the sum of the utilization rates of the processor cores in the first set of processor cores, and U_ave is the average utilization rate of the multiple processor cores.
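As a worked example of the relationship in claim 5, N = U_P / U_ave can be computed directly. The rounding to a whole number of cores below is an assumption for the sketch, since the claim does not specify how a fractional result is handled, and the function name is illustrative.

```python
def redetermine_core_count(bound_core_utils, all_core_utils):
    """N = U_P / U_ave: U_P is the summed utilization of the cores currently
    bound to the stage; U_ave is the average utilization over all cores."""
    u_p = sum(bound_core_utils)
    u_ave = sum(all_core_utils) / len(all_core_utils)
    return max(1, round(u_p / u_ave))   # assumed: round, and keep at least one core

# Two bound cores at 0.8 each give U_P = 1.6; eight cores averaging 0.4 give
# U_ave = 0.4, so the stage is re-assigned N = 1.6 / 0.4 = 4 cores.
all_utils = [0.8, 0.8, 0.4, 0.4, 0.2, 0.2, 0.2, 0.2]
n = redetermine_core_count([0.8, 0.8], all_utils)   # → 4
```

Intuitively, a stage whose bound cores are busier than the system average is granted proportionally more cores, and an underloaded stage is shrunk.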
- The method according to any one of claims 3 to 5, wherein the allocating, among the multiple processor cores, a second set of processor cores satisfying the number to the request of the current stage comprises: generating multiple groups of allocation results, each group of allocation results including, for the request of each stage, a re-allocated set of processor cores satisfying the corresponding number; determining multiple path lengths for the multiple groups of allocation results, each group of allocation results corresponding to one path length, where the path length L satisfies: L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}, where c_{i,i+1} denotes the traffic generated by the interaction between the processor cores executing the requests of adjacent stages i and i+1, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of stages of the service request; and allocating, to the request of the current stage, a second set of processor cores satisfying the number according to the group of allocation results corresponding to the shortest of the multiple path lengths.
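The path-length criterion of claim 6 — sum, over adjacent stage pairs, the inter-stage traffic weighted by the average topological distance between the cores serving those stages, and keep the allocation with the shortest total — can be sketched as below. The formula L = Σ c_{i,i+1}·d_{i,i+1} is reconstructed from the claim's definitions, and the candidate allocations and their numbers are invented purely for illustration.

```python
def path_length(traffic, distance):
    """L = sum over adjacent stage pairs (i, i+1) of c_{i,i+1} * d_{i,i+1}."""
    return sum(c * d for c, d in zip(traffic, distance))

def pick_allocation(candidates):
    """candidates: (allocation_id, traffic per adjacent pair, distance per adjacent pair).
    Return the id of the allocation whose path length L is shortest."""
    return min(candidates, key=lambda t: path_length(t[1], t[2]))[0]

# Three-stage pipeline (M = 3), so two adjacent pairs. Allocation "B" places the
# chatty stages on topologically nearer cores and wins despite identical traffic.
candidates = [
    ("A", [10, 5], [2, 4]),   # L = 10*2 + 5*4 = 40
    ("B", [10, 5], [1, 3]),   # L = 10*1 + 5*3 = 25
]
best = pick_allocation(candidates)   # → "B"
```

The design intent is that heavy inter-stage communication should travel the shortest topological distance (e.g. stay within one NUMA node), which the weighted sum captures.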
- The method according to any one of claims 1 to 6, wherein the first set of processor cores includes K processor cores, K being an integer greater than or equal to 3, and the sending the request of the current stage to the processor core with the lightest load in the first set of processor cores comprises: determining, among the K processor cores and according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and sending the request of the current stage to the processor core with the lightest load among the w processor cores.
- The method according to claim 7, wherein d and K are coprime.
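Claims 7 and 8 describe a sliding scheduling window of w cores that advances by step d over the K bound cores, with d and K coprime. A sketch with illustrative numbers follows (the function names and the load vector are assumptions); the coprimality condition makes the window start cycle through every one of the K offsets before repeating, so no core is systematically skipped.

```python
from math import gcd

def window_starts(K, d):
    """Successive start offsets of the sliding window over K cores with step d."""
    starts, pos = [], 0
    for _ in range(K):
        starts.append(pos)
        pos = (pos + d) % K
    return starts

def schedule_in_window(loads, w, d, step_index):
    """Pick the lightest-loaded core inside the current w-core scheduling sub-region."""
    K = len(loads)
    start = (step_index * d) % K
    window = [(start + j) % K for j in range(w)]
    return min(window, key=lambda i: loads[i])

# K = 5, d = 2: gcd(2, 5) == 1, so the window start visits every offset once.
assert gcd(2, 5) == 1
starts = window_starts(5, 2)                                        # → [0, 2, 4, 1, 3]
core = schedule_in_window([5, 1, 4, 3, 2], w=2, d=2, step_index=0)  # → 1
```

Restricting each scheduling decision to a w-core sub-region bounds the cost of the "find the lightest core" comparison, while the coprime step keeps the sub-regions rotating fairly over the whole set.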
- An apparatus for processing a service request, where the apparatus is configured in a storage system, comprising: a transceiver module, configured to receive a request of a current stage of a service request, where the request of the current stage is a request of one stage among requests of multiple stages of the service request; and a processing module, configured to determine a first set of processor cores for executing the request of the current stage, where the first set of processor cores is a subset of the multiple processor cores; where the transceiver module is further configured to send the request of the current stage to the processor core with the lightest load in the first set of processor cores.
- The apparatus according to claim 9, wherein the processing module is further configured to query a core binding relationship to determine the first set of processor cores for executing the request of the current stage, where the core binding relationship indicates the association between the request of the current stage and the first set of processor cores.
- The apparatus according to claim 10, wherein the processing module is further configured to: re-determine, based on the first set of processor cores, the number of processor cores for executing the request of the current stage; allocate, among the multiple processor cores and according to the re-determined number, a second set of processor cores satisfying that number to the request of the current stage; and generate a new core binding relationship based on the second set of processor cores, where the new core binding relationship indicates the association between the request of the current stage and the second set of processor cores.
- The apparatus according to claim 11, wherein the processing module is further configured to: determine the sum of the utilization rates of the processor cores in the first set of processor cores and the average utilization rate of the multiple processor cores; and re-determine, based on that sum and that average utilization rate, the number of processor cores for executing the request of the current stage.
- The apparatus according to claim 12, wherein the processing module is further configured to re-determine the number of processor cores for executing the request of the current stage according to the following relationship: N = U_P / U_ave, where N is the re-determined number of processor cores for executing the request of the current stage, U_P is the sum of the utilization rates of the processor cores in the first set of processor cores, and U_ave is the average utilization rate of the multiple processor cores.
- The apparatus according to any one of claims 11 to 13, wherein the processing module is further configured to: generate multiple groups of allocation results, each group of allocation results including, for the request of each stage, a re-allocated set of processor cores satisfying the corresponding number; determine multiple path lengths for the multiple groups of allocation results, each group of allocation results corresponding to one path length, where the path length L satisfies: L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}, where c_{i,i+1} denotes the traffic generated by the interaction between the processor cores executing the requests of adjacent stages i and i+1, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of stages of the service request; and allocate, to the request of the current stage, a second set of processor cores satisfying the number according to the group of allocation results corresponding to the shortest of the multiple path lengths.
- The apparatus according to any one of claims 9 to 14, wherein the first set of processor cores includes K processor cores, K being an integer greater than or equal to 3, and the processing module is further configured to determine, among the K processor cores and according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and the transceiver module is further configured to send the request of the current stage to the processor core with the lightest load among the w processor cores.
- The apparatus according to claim 15, wherein d and K are coprime.
- A storage system, where the storage system includes multiple processor cores and a memory; the memory is configured to store computer instructions; one or more of the multiple processor cores are configured to execute the computer instructions stored in the memory, and when the computer instructions in the memory are executed, the one or more processor cores are configured to: receive a request of a current stage of a service request, where the request of the current stage is a request of one stage among requests of multiple stages of the service request; determine a first set of processor cores for executing the request of the current stage, where the first set of processor cores is a subset of the multiple processor cores; and send the request of the current stage to the processor core with the lightest load in the first set of processor cores.
- The storage system according to claim 17, wherein the one or more processor cores are further configured to query a core binding relationship to determine the first set of processor cores for executing the request of the current stage, where the core binding relationship indicates the association between the request of the current stage and the first set of processor cores.
- The storage system according to claim 18, wherein the one or more processor cores are further configured to: re-determine, based on the first set of processor cores, the number of processor cores for executing the request of the current stage; allocate, among the multiple processor cores and according to the re-determined number, a second set of processor cores satisfying that number to the request of the current stage; and generate a new core binding relationship based on the second set of processor cores, where the new core binding relationship indicates the association between the request of the current stage and the second set of processor cores.
- The storage system according to claim 19, wherein the one or more processor cores are further configured to: determine the sum of the utilization rates of the processor cores in the first set of processor cores and the average utilization rate of the multiple processor cores; and re-determine, based on that sum and that average utilization rate, the number of processor cores for executing the request of the current stage.
- The storage system according to claim 20, wherein the one or more processor cores are further configured to re-determine the number of processor cores for executing the request of the current stage according to the following relationship: N = U_P / U_ave, where N is the re-determined number of processor cores for executing the request of the current stage, U_P is the sum of the utilization rates of the processor cores in the first set of processor cores, and U_ave is the average utilization rate of the multiple processor cores.
- The storage system according to any one of claims 19 to 21, wherein the one or more processor cores are further configured to: generate multiple groups of allocation results, each group of allocation results including, for the request of each stage, a re-allocated set of processor cores satisfying the corresponding number; determine multiple path lengths for the multiple groups of allocation results, each group of allocation results corresponding to one path length, where the path length L satisfies: L = Σ_{i=1}^{M-1} c_{i,i+1} × d_{i,i+1}, where c_{i,i+1} denotes the traffic generated by the interaction between the processor cores executing the requests of adjacent stages i and i+1, d_{i,i+1} denotes the average topological distance between the processor cores executing the requests of those adjacent stages, and M is the number of stages of the service request; and allocate, to the request of the current stage, a second set of processor cores satisfying the number according to the group of allocation results corresponding to the shortest of the multiple path lengths.
- The storage system according to any one of claims 17 to 22, wherein the first set of processor cores includes K processor cores, K being an integer greater than or equal to 3, and the one or more processor cores are further configured to: determine, among the K processor cores and according to a sliding window length w and a sliding step d, a scheduling sub-region for the request of the current stage, where the scheduling sub-region includes w processor cores, w is an integer greater than or equal to 2 and less than K, and d is an integer greater than or equal to 1 and less than K; and send the request of the current stage to the processor core with the lightest load among the w processor cores.
- The storage system according to claim 23, wherein d and K are coprime.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/098277 WO2020024207A1 (en) | 2018-08-02 | 2018-08-02 | Service request processing method, device and storage system |
CN201880005605.6A CN110178119B (en) | 2018-08-02 | 2018-08-02 | Method, device and storage system for processing service request |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/098277 WO2020024207A1 (en) | 2018-08-02 | 2018-08-02 | Service request processing method, device and storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020024207A1 (en) | 2020-02-06 |
Family
ID=67689271
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/098277 WO2020024207A1 (en) | 2018-08-02 | 2018-08-02 | Service request processing method, device and storage system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110178119B (en) |
WO (1) | WO2020024207A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118069374A (en) * | 2024-04-18 | 2024-05-24 | 清华大学 | Method, device, equipment and medium for accelerating intelligent training simulation transaction of data center |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112231099B (en) * | 2020-10-14 | 2024-07-05 | 北京中科网威信息技术有限公司 | Memory access method and device for processor |
CN114924866A (en) * | 2021-04-30 | 2022-08-19 | 华为技术有限公司 | Data processing method and related equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090064167A1 (en) * | 2007-08-28 | 2009-03-05 | Arimilli Lakshminarayana B | System and Method for Performing Setup Operations for Receiving Different Amounts of Data While Processors are Performing Message Passing Interface Tasks |
CN102411510A (en) * | 2011-09-16 | 2012-04-11 | 华为技术有限公司 | Method and device for mapping service data streams on virtual machines of multi-core processor |
CN102681902A (en) * | 2012-05-15 | 2012-09-19 | 浙江大学 | Load balancing method based on task distribution of multicore system |
CN102855218A (en) * | 2012-05-14 | 2013-01-02 | 中兴通讯股份有限公司 | Data processing system, method and device |
CN104391747A (en) * | 2014-11-18 | 2015-03-04 | 北京锐安科技有限公司 | Parallel computation method and parallel computation system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8015392B2 (en) * | 2004-09-29 | 2011-09-06 | Intel Corporation | Updating instructions to free core in multi-core processor with core sequence table indicating linking of thread sequences for processing queued packets |
CN102306139A (en) * | 2011-08-23 | 2012-01-04 | 北京科技大学 | Heterogeneous multi-core digital signal processor for orthogonal frequency division multiplexing (OFDM) wireless communication system |
CN103473120A (en) * | 2012-12-25 | 2013-12-25 | 北京航空航天大学 | Acceleration-factor-based multi-core real-time system task partitioning method |
US10467120B2 (en) * | 2016-11-11 | 2019-11-05 | Silexica GmbH | Software optimization for multicore systems |
- 2018-08-02 WO PCT/CN2018/098277 patent/WO2020024207A1/en active Application Filing
- 2018-08-02 CN CN201880005605.6A patent/CN110178119B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110178119A (en) | 2019-08-27 |
CN110178119B (en) | 2022-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10534542B2 (en) | Dynamic core allocation for consistent performance in a non-preemptive scheduling environment | |
JP5514041B2 (en) | Identifier assignment method and program | |
JP5510556B2 (en) | Method and system for managing virtual machine storage space and physical hosts | |
US9866450B2 (en) | Methods and apparatus related to management of unit-based virtual resources within a data center environment | |
EP3281359B1 (en) | Application driven and adaptive unified resource management for data centers with multi-resource schedulable unit (mrsu) | |
WO2018120991A1 (en) | Resource scheduling method and device | |
US10394606B2 (en) | Dynamic weight accumulation for fair allocation of resources in a scheduler hierarchy | |
WO2021008197A1 (en) | Resource allocation method, storage device, and storage system | |
US11496413B2 (en) | Allocating cloud computing resources in a cloud computing environment based on user predictability | |
US20220156115A1 (en) | Resource Allocation Method And Resource Borrowing Method | |
WO2016041446A1 (en) | Resource allocation method, apparatus and device | |
JP2014522036A (en) | Method and apparatus for allocating virtual resources in a cloud environment | |
WO2020024207A1 (en) | Service request processing method, device and storage system | |
JP7506096B2 (en) | Dynamic allocation of computing resources | |
WO2020224531A1 (en) | Method and device for assigning tokens in storage system | |
CN112506650A (en) | Resource allocation method, system, computer device and storage medium | |
WO2022063273A1 (en) | Resource allocation method and apparatus based on numa attribute | |
WO2024022142A1 (en) | Resource use method and apparatus | |
CN116483740B (en) | Memory data migration method and device, storage medium and electronic device | |
JP2018190355A (en) | Resource management method | |
CN109298949B (en) | Resource scheduling system of distributed file system | |
US20120151175A1 (en) | Memory apparatus for collective volume memory and method for managing metadata thereof | |
US20150189013A1 (en) | Adaptive and prioritized replication scheduling in storage clusters | |
WO2017133421A1 (en) | Method and device for sharing resources among multiple tenants | |
JP7127155B2 (en) | cellular telecommunications network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18928883; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 18928883; Country of ref document: EP; Kind code of ref document: A1 |