WO2020124867A1 - Data processing method, controller, storage device and storage system
- Publication number
- WO2020124867A1 (PCT/CN2019/081221)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- read
- request
- write
- storage area
- storage
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
Definitions
- the present application relates to the storage field, and in particular, to a method, controller, storage device, and storage system for processing data stored in a storage device.
- for IO requests that access the same die in an SSD, a solid-state drive executes the requests in the order in which they reach the die. If a block in the die needs to be erased, the erase request for that block is added to the pending-request queue corresponding to the die. For example, if a write request and an erase request are queued ahead of a read request, the SSD first performs the write operation corresponding to the write request, then the erase operation corresponding to the erase request, and finally the read operation corresponding to the read request.
- the time consumed by the write operation and the erase operation is much longer than that of the read operation.
- the invention provides a data processing method, a controller, a storage device and a storage system.
- when executing IO requests, the storage device follows the execution time carried in each IO request, so that urgent IO requests are processed in a timely manner.
- a first aspect of the present invention provides a data processing method, which is executed by a controller, and the controller communicates with a storage device.
- the controller and the storage device may be the controller and the storage device in a storage array, or the controller and the storage device in a server.
- the controller adds the execution time of the IO request to the IO request, and then sends the IO request to the storage device.
- the execution time is used to indicate that the storage device should finish processing the IO request before the execution time arrives.
- when executing the IO request, the storage device follows the execution time carried in it, so that urgent IO requests are processed in time.
- the controller may also add a timeout indicator to the IO request.
- the timeout indication flag is used to indicate whether the storage device returns error information when the IO request has still not been processed after its execution time has passed; the error information indicates that execution of the IO request failed.
- in this way, the controller is notified in time and can promptly determine a new processing strategy for the IO request, such as re-reading the data or writing it to a new location.
- the controller may determine the type of the IO request, for example, an IO request generated externally, an IO request corresponding to an internal key service, or an IO request corresponding to an array background service, and then determine the execution time of the IO request according to the determined type.
- a different execution duration is preset in the controller for each type of IO request; when setting the execution time for a received IO request, the controller adds the duration preset for that request's type to the current time, which allows the storage device to execute the IO request according to its required completion time.
- a second aspect of the present invention provides a data processing method, which is executed by a storage device. After the storage device obtains an IO request, since the IO request includes an execution time indicating that the storage device should finish processing it before that time arrives, the storage device can execute the IO request according to its execution time.
- when executing the IO request, the storage device follows the execution time carried in it, so that urgent IO requests are processed in time.
- the storage device includes multiple storage blocks; after the storage device obtains the IO request, it determines the storage block accessed by the IO request, places the IO request in the queue of pending requests corresponding to that storage block according to the execution time, and then executes the IO requests in the queue according to their execution times.
- the storage array can manage pending requests according to the execution time of IO requests.
- the storage device includes multiple storage areas, each composed of at least one storage block, and the IO request is a write request; when determining the storage block accessed by the IO request, the storage device first selects, from the multiple storage areas, a storage area in which no erase operation is being performed, and then determines the storage block accessed by the IO request within the selected storage area.
- the write request and the erase request can be separately executed in different storage areas in the above manner, so that the probability of urgent IO requests being processed in time can be increased.
- each storage block includes a plurality of sub-blocks, where a sub-block is the minimum unit of an erase operation. Each storage area has two modes, a read+write mode and a read+erase mode: when the storage area is used to write data, it is set to the read+write mode; when it is used to perform an erase operation, it is set to the read+erase mode.
- when selecting, from the multiple storage areas, a storage area in which no erase operation is being performed, the storage device first selects a storage area in the read+write mode and determines whether the number of free sub-blocks in it is below a threshold. When the number of free sub-blocks in the selected storage area is below the threshold, the mode of that storage area is set to the read+erase mode, and the device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if such a storage area exists, it is used as the storage area in which no erase operation is performed. When the number of free sub-blocks in the selected storage area is not below the threshold, the selected read+write-mode storage area is used as the storage area in which no erase operation is performed.
- when the storage device selects a storage area in which no erase operation is being performed, if the number of free sub-blocks in the selected storage area is below the threshold, too little space remains for writing data and write efficiency would suffer; that storage area can therefore be erased for later use, and a new storage area is selected for the IO request.
- when the storage device selects, from the multiple storage areas, a storage area in which no erase operation is being performed, it first selects a storage area in the read+write mode and determines whether the read/write pressure borne by the selected storage area exceeds a threshold. When the pressure exceeds the threshold, the device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if such a storage area exists, it is used as the storage area in which no erase operation is performed.
- when the pressure does not exceed the threshold, the selected storage area is used as the storage area in which no erase operation is performed.
- by judging whether the read/write pressure of the selected storage area exceeds the threshold, a storage area under low read/write pressure can be chosen for writing data, avoiding writes to a heavily loaded storage area that would reduce write efficiency.
- the storage area further supports a read+write+erase mode, in which read operations, write operations, and erase operations can all be performed in the storage area.
- the storage device converts the modes of all of the multiple storage areas to the read+write+erase mode.
- each storage block includes multiple sub-blocks, where a sub-block is the minimum unit of an erase operation. Each storage area has two modes, a read+write mode and a read+erase mode: when the storage area is used to write data, it is set to the read+write mode; when it is used to perform an erase operation, it is set to the read+erase mode.
- the storage device performs the following method to select a storage area for the data to be written: first select a storage area in the read+write mode and determine whether the number of blank sub-blocks in it is below a threshold; if it is below the threshold, convert the mode of that storage area to the read+erase mode, and determine whether the SSD has a storage area that is in neither the read+write mode nor the read+erase mode; if such a storage area exists, use it as the storage area in which no erase operation is performed.
- if there is no such storage area, the disk processor 1021 switches all storage areas in the SSD to the read+write+erase mode. If the number of blank sub-blocks in the selected storage area is not below the threshold, the processor determines whether the read/write pressure of the storage area currently in the read+write mode is too high; if the pressure is not high, that storage area is used as the storage area for the write request, and if the pressure is too high, the disk processor 1021 switches all storage areas in the SSD to the read+write+erase mode.
- the queue of pending requests corresponding to each storage block is a linked list group.
- each storage block corresponds to a read linked list group and a write linked list group.
- the read linked list group is used to mount read requests according to the execution times of the read requests among the IO requests.
- the write linked list group is used to mount write requests according to the execution times of the write requests among the IO requests.
- the read linked list group and the write linked list group each include multiple linked lists; each linked list represents a time range, and the time ranges of two adjacent linked lists are contiguous. The storage device determines the time range to which the execution time of a read request or write request belongs and mounts the request to the linked list corresponding to that time range.
- the speed of IO request lookup can be increased, thereby improving the efficiency of IO request scheduling and execution.
- the storage device further includes at least one disk-level linked list group, and the linked list group corresponding to a storage block is a block-level linked list group. Each disk-level linked list group includes multiple disk-level linked lists, each representing a time range equal to the current time plus a preset duration, and the time ranges of two adjacent disk-level linked lists are contiguous. The at least one disk-level linked list group and the block-level linked list group constitute different levels.
- the block-level linked list group is the lowest level; the sum of the time ranges represented by all the linked lists of a lower-level linked list group equals the time range represented by the first linked list of the next-higher-level linked list group.
- in this way, the overall length of each linked list can be reduced, the time precision represented by the linked lists can be improved, and the efficiency of IO request scheduling and execution can be improved.
- the storage device divides the write request or erase request to be executed into multiple shards; after executing each shard, it determines whether there is an urgent read request to be processed.
- an urgent read request is a read request whose execution time is earlier than the execution time of the write request or erase request; if there is an urgent read request to be processed, execution of the write request or erase request is suspended and the urgent read request is executed.
- alternatively, the storage device divides the write request or erase request to be executed into multiple shards; after executing each shard, it determines whether there is an urgent read request to be processed.
- here, an urgent read request is a read request whose execution time is earlier than the current time plus the execution duration of the next shard plus the execution duration of the read request; if there is an urgent read request to be processed, execution of the write request or erase request is suspended and the urgent read request is executed.
- alternatively, the storage device divides the write request or erase request to be executed into multiple shards; after executing each shard, it determines whether there is an urgent read request to be processed.
- here, an urgent read request is a read request whose execution time is earlier than the current time plus the execution duration of the next shard plus the execution durations of x serially executed read requests, where x is the maximum number of read requests allowed to execute serially while a write request or erase request is suspended; if there is an urgent read request to be processed, execution of the write request or erase request is suspended and the urgent read request is executed.
- a third aspect of the invention provides a data processing method applied to a storage device, the storage device including a plurality of storage areas.
- the storage device receives a write request carrying data to be written, selects from the multiple storage areas a storage area in which no erase operation is being performed, and writes the data to be written into the selected storage area.
- the write request and the erase request can be separately executed in different storage areas in the above manner, so that the probability of urgent IO requests being processed in time can be increased.
- the storage device includes multiple storage areas, each composed of at least one storage block, and the IO request is a write request; when determining the storage block accessed by the IO request, the storage device first selects, from the multiple storage areas, a storage area in which no erase operation is being performed, and then determines the storage block accessed by the IO request within the selected storage area.
- the write request and the erase request can be separately executed in different storage areas in the above manner, so that the probability of urgent IO requests being processed in time can be increased.
- each storage block includes multiple sub-blocks, where a sub-block is the minimum unit of an erase operation. Each storage area has two modes, a read+write mode and a read+erase mode: when the storage area is used to write data, it is set to the read+write mode; when it is used to perform an erase operation, it is set to the read+erase mode.
- when selecting, from the multiple storage areas, a storage area in which no erase operation is being performed, the storage device first selects a storage area in the read+write mode and determines whether the number of free sub-blocks in it is below a threshold. When the number of free sub-blocks in the selected storage area is below the threshold, the mode of that storage area is set to the read+erase mode, and the device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if such a storage area exists, it is used as the storage area in which no erase operation is performed. When the number of free sub-blocks in the selected storage area is not below the threshold, the selected read+write-mode storage area is used as the storage area in which no erase operation is performed.
- when the storage device selects a storage area in which no erase operation is being performed, if the number of free sub-blocks in the selected storage area is below the threshold, too little space remains for writing data and write efficiency would suffer; that storage area can therefore be erased for later use, and a new storage area is selected for the IO request.
- when the storage device selects, from the multiple storage areas, a storage area in which no erase operation is being performed, it first selects a storage area in the read+write mode and determines whether the read/write pressure borne by the selected storage area exceeds a threshold. When the pressure exceeds the threshold, the device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if such a storage area exists, it is used as the storage area in which no erase operation is performed.
- when the pressure does not exceed the threshold, the selected storage area is used as the storage area in which no erase operation is performed.
- by judging whether the read/write pressure of the selected storage area exceeds the threshold, a storage area under low read/write pressure can be chosen for writing data, avoiding writes to a heavily loaded storage area that would reduce write efficiency.
- the storage area further supports a read+write+erase mode, in which read operations, write operations, and erase operations can all be performed in the storage area.
- the storage device converts the modes of all of the multiple storage areas to the read+write+erase mode.
- the storage device further receives a read request; both the read request and the write request include an execution time, which indicates that the storage device should process the read request or write request before the execution time arrives. The storage device executes the read request or write request according to its execution time.
- a fourth aspect of the present invention provides a controller.
- the controller includes a plurality of functional modules, and the functions performed by the functional modules are the same as the functions performed by the steps in the data processing method provided in the first aspect.
- a fifth aspect of the present invention provides a storage device.
- the storage device includes a plurality of functional modules, and the functions performed by the functional modules are the same as the functions performed by the steps in the data processing method provided in the second aspect.
- a sixth aspect of the present invention provides a storage device.
- the storage device includes multiple functional modules, and the functions performed by the functional modules are the same as the functions performed by the steps in the data processing method provided in the third aspect.
- a seventh aspect of the present invention provides a data processing system, including the controller provided in the first aspect and the storage device provided in the second aspect.
- An eighth aspect of the present invention provides a controller.
- the controller includes a processor and a storage unit.
- the storage unit stores program instructions.
- the processor executes the program instructions in the storage unit to perform the data processing method provided in the first aspect.
- A ninth aspect of the present invention provides a storage device.
- the storage device includes a processor and a storage unit.
- the storage unit stores program instructions.
- the processor executes the program instructions in the storage unit to perform the data processing method provided in the second aspect or the third aspect.
- the present application further provides a storage medium in which computer program code is stored; when the computer program code runs on a computer, the computer is caused to perform the method of the first, second, or third aspect.
- FIG. 1 is a hardware structure diagram of a server provided by an embodiment of the present invention.
- FIG. 2 is a hardware structure diagram of a storage device provided by an embodiment of the present invention.
- FIG. 3 is a schematic diagram of constructing a storage area with a storage block in a storage device as a granularity in an embodiment of the present invention.
- FIG. 4 is a flowchart of a method for a controller in a server to process IO requests in an embodiment of the present invention.
- FIG. 5 is a flowchart of the storage device processing the received write request in the embodiment of the present invention.
- FIG. 6 is a flowchart of a method for determining a storage area of data to be written of a write request in an embodiment of the present invention.
- FIG. 7 is a schematic diagram of a first type of linked list group for mounting IO requests in an embodiment of the present invention.
- FIG. 8 is a schematic diagram of a second type of linked list group for mounting IO requests in an embodiment of the present invention.
- FIG. 9 is a flowchart of the storage device processing the received read request in the embodiment of the present invention.
- FIG. 10 is a functional block diagram of a controller in an embodiment of the present invention.
- FIG. 11 is a functional block diagram of a storage device in an embodiment of the present invention.
- FIG. 12 is a schematic diagram of a read IO request in an embodiment of the present invention.
- FIG. 1 is a hardware structure diagram of a server 100 in an embodiment of the present invention.
- the server 100 includes a controller 101, multiple storage devices 102, a memory 103, an interface 104, and a bus 105.
- the controller 101, the storage device 102, and the interface 104 are connected to the bus 105, and the controller 101 accesses the storage device 102 through the bus 105.
- the interface 104 is used to connect to a host (not shown), and transmits the IO request received from the host to the controller 101 for processing.
- the memory 103 stores an application program (not shown) run by the controller 101; by running the application program, the controller 101 can manage the storage device 102 or enable the server 100 to provide services externally.
- the server 100 may be a storage array, and the storage device 102 may be an SSD.
- FIG. 2 is a structural diagram of the storage device 102 in the server 100.
- Each storage device 102 in the server 100 has the same structure, and one of them will be described below as an example.
- Each storage device 102 includes a disk processor 1021, a cache 1022, and a physical storage space composed of a plurality of Nand flash 1023.
- the disk processor 1021 is configured to receive the IO request sent by the controller 101 and execute the IO request to access data from the physical storage space.
- the cache 1022 is used to store an application program run by the disk processor 1021; by running the application program, the disk processor 1021 can access and manage the data in the Nand flash 1023.
- the physical storage space is divided into multiple storage areas, and each storage area is composed of at least one storage block, where the storage block is a Die that constitutes a Nand flash.
- FIG. 3 is a schematic diagram of a RAID in which a storage area is composed of multiple dies of the Nand flash.
- each Nand flash includes multiple dies 1024.
- for example, each storage device 102 includes 16 Nand flash chips, and each chip includes 4 dies; if every 16 dies constitute a disk array (Redundant Array of Independent Drives, RAID) 1025, 4 RAIDs 1025 can be constructed in the storage device 102.
- Each Die 1024 includes multiple sub-blocks 1026, and each sub-block 1026 is a minimum unit for performing an erase operation.
- each Die forms a storage area.
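- To make the example geometry concrete, the following minimal sketch (in C, with assumed names, using the example values above: 16 chips, 4 dies per chip, 16 dies per RAID) derives the resulting layout:

    #include <stdio.h>

    /* Illustrative geometry from the example above (assumed names/values):
     * 16 Nand flash chips per storage device, 4 dies per chip,
     * every 16 dies form one RAID (storage area). */
    #define CHIPS_PER_DEVICE 16
    #define DIES_PER_CHIP     4
    #define DIES_PER_RAID    16

    int main(void) {
        int total_dies = CHIPS_PER_DEVICE * DIES_PER_CHIP;   /* 64 dies  */
        int raids      = total_dies / DIES_PER_RAID;         /* 4 RAIDs  */
        printf("%d dies -> %d RAID storage areas\n", total_dies, raids);
        return 0;
    }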
- when receiving an IO request sent by the host, the controller 101 sends it to the storage device 102 that the request accesses. After the storage device 102 receives the IO request, it further determines the storage block accessed by the request. For IO requests accessing the same storage block, the disk processor 1021 of the storage device 102 executes them in the order in which they reach the die; if a block in the die needs to be erased, the erase request for that block is added to the pending-request queue corresponding to the die.
- if the write request and the erase request are queued ahead of the read request, the disk processor 1021 first performs the write operation corresponding to the write request, then the erase operation corresponding to the erase request, and finally the read operation corresponding to the read request.
- the time consumed by write operations and erase operations is much greater than that of read operations.
- a read operation generally takes about 80 μs, a write operation generally takes 1-3 ms, and an erase operation generally takes 3 ms-15 ms; an urgent read request must therefore wait until the preceding write request or erase request has been executed, which easily delays the read operation.
- for IO requests generated by background operations such as garbage collection and inspection, the corresponding operations take even longer and have a greater impact on the delay of other urgent IO requests, such as read requests.
- in the embodiments of the present invention, the controller 101 sets an execution time for each received IO request and sends the request, with its execution time set, to the storage device 102; the storage device 102 adjusts the execution order of IO requests according to their execution times, so that urgent IO requests are processed in time and timeouts of urgent IO requests are avoided.
- FIG. 4 is a flowchart of a method for the controller 101 to set the execution time for an IO request in an embodiment of the present invention.
- Step S401: the controller 101 determines the type of the IO request.
- the types of IO requests generally include three categories.
- the first type is the IO request generated outside the server 100, for example, the IO request sent by the external host to the server.
- the IO requests generated outside the server 100 include two types; one is generated by the host in response to a user's operation.
- the second type is IO corresponding to key services inside the server 100, such as metadata reading and writing;
- the third type is IO requests corresponding to array background tasks, such as cache flushing, hard disk reconstruction, and garbage collection.
- These three types of IO requests include two kinds of requests, namely read requests and write requests.
- Step S402: the controller 101 sets an execution time for the IO request according to the type of the IO request.
- the controller 101 synchronizes the system time in the server 100 to each storage device 102 periodically (for example, 1 minute/1 hour/day), so that the time in the server 100 and each storage device 102 can be kept synchronized.
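- As an illustrative sketch, the periodic synchronization might look as follows; the helper functions and the microsecond clock are assumptions, not an interface defined by this application:

    #include <stdint.h>

    extern uint64_t system_time_us(void);                   /* controller clock, assumed      */
    extern void send_time_sync(int device_id, uint64_t t);  /* vendor sync command, assumed   */

    /* Periodically push the server's system time to every storage device so
     * that deadlines computed by the controller are meaningful on the drive. */
    void sync_clocks(int n_devices) {
        uint64_t now = system_time_us();
        for (int id = 0; id < n_devices; id++)
            send_time_sync(id, now);
    }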
- an execution duration can be preset for each type of IO request. For example, the execution durations set for the read request and the write request generated by the host in response to a user's operation are 200 μs and 400 μs respectively; the execution durations set for the read request and the write request generated for metadata reading and writing are 500 μs and 2 ms respectively; the execution durations set for the read request and the write request generated by cache flushing are 10 ms and 1 s respectively; and the execution durations set for the read request and the write request generated when performing garbage collection are 10 ms and 2 s respectively.
- execution durations can also be set separately for the read requests and write requests of value-added services such as snapshot, clone, copy, active-active, and backup.
- the execution durations of the various types of IO requests listed above are only examples and are not intended to limit the present invention. In actual applications, different execution durations can be set for different IO requests according to actual conditions.
- when the controller 101 receives an IO request and recognizes its type, it obtains the execution duration preset for that type and sets the execution time for the received IO request.
- the execution time equals the current system time of the server 100 plus the obtained execution duration for this type of IO request.
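- As a minimal sketch of this deadline calculation, using the example durations listed above (the type names and function signature are hypothetical):

    #include <stdint.h>

    /* Hypothetical IO types matching the categories described above. */
    typedef enum { IO_HOST_READ, IO_HOST_WRITE, IO_META_READ, IO_META_WRITE,
                   IO_GC_READ, IO_GC_WRITE } io_type_t;

    /* Preset execution durations in microseconds (example values from the text). */
    static const uint64_t exec_duration_us[] = {
        [IO_HOST_READ]  = 200,       /* 200 us */
        [IO_HOST_WRITE] = 400,       /* 400 us */
        [IO_META_READ]  = 500,       /* 500 us */
        [IO_META_WRITE] = 2000,      /* 2 ms   */
        [IO_GC_READ]    = 10000,     /* 10 ms  */
        [IO_GC_WRITE]   = 2000000,   /* 2 s    */
    };

    /* Execution time (deadline) = current system time + preset duration. */
    uint64_t set_execution_time(uint64_t now_us, io_type_t type) {
        return now_us + exec_duration_us[type];
    }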
- in this way, the storage device can execute each IO request according to its execution time, so that urgent IO requests are processed in time; the specific process by which the storage device executes IO requests according to execution time is described below.
- for a read request of the first type, if the data to be read hits in the memory 103, the data read from the memory is returned directly to the host.
- for a write request of the first type, the data to be written is written into the memory 103, and a write-completion feedback instruction is then returned to the host.
- when the data in the memory 103 is later flushed to the storage device, a new write request is generated; the new IO request belongs to the cache-flushing category of the third type, and an execution time is then set for this new IO request.
- a field is added to the IO request to carry the execution time.
- FIG. 12 is a schematic diagram of a read request under the NVMe protocol.
- the read request includes 64 bytes, where Opcode is the command identifier used to identify the command as a read request; the request also includes other parameters related to the read, such as the namespace identifier (Namespace Identifier), the metadata pointer (Metadata Pointer), the memory address (Data Pointer) to which data read from the disk is returned, the starting logical address (Starting LBA), etc.
- since some of the 64 bytes of the read request are blank, bytes capable of carrying the execution time can be selected from among these blank bytes; for example, four bytes of the second command line are selected to carry the execution time.
- in addition, a field is added to the IO request to carry a timeout indication flag, which indicates whether the request should return immediately when the time spent executing the IO request exceeds its execution time.
- since there are still blank bits in the read request, the timeout indication flag can be carried in one such bit.
- when the timeout indication flag is 0, the request need not return immediately when the time spent executing the IO request exceeds its execution time; when the flag is 1, the request must return immediately when its execution time is exceeded.
- the position carrying the timeout indication flag may also be any blank position in the existing read command, and the flag may be defined differently.
- FIG. 12 is only an example; in actual applications, a suitable position and a suitable flag can be chosen according to the actual situation.
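- The following sketch shows one possible way to pack the execution time and the timeout indication flag into otherwise blank bytes of a 64-byte command; the offsets are illustrative assumptions, not an NVMe-defined layout:

    #include <stdint.h>
    #include <string.h>

    /* A 64-byte submission-queue entry, viewed as raw bytes. */
    typedef struct { uint8_t raw[64]; } sqe_t;

    /* Assumed offsets in reserved space (illustrative only):
     * 4 bytes of the second 16-byte command line hold the execution time,
     * and one spare bit holds the timeout-indication flag. */
    #define EXEC_TIME_OFFSET  16   /* start of the second command line */
    #define TIMEOUT_FLAG_BYTE 20
    #define TIMEOUT_FLAG_BIT   0

    void set_deadline(sqe_t *cmd, uint32_t exec_time) {
        memcpy(&cmd->raw[EXEC_TIME_OFFSET], &exec_time, sizeof(exec_time));
    }

    /* flag = 1: return an error immediately once the deadline is missed;
     * flag = 0: keep the request queued even past its deadline. */
    void set_timeout_flag(sqe_t *cmd, int flag) {
        if (flag)
            cmd->raw[TIMEOUT_FLAG_BYTE] |=  (1u << TIMEOUT_FLAG_BIT);
        else
            cmd->raw[TIMEOUT_FLAG_BYTE] &= ~(1u << TIMEOUT_FLAG_BIT);
    }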
- Step S403: the controller 101 sends the IO request to the storage device 102 accessed by the IO request.
- the IO request carries the LUN ID and logical address where the data to be accessed is located, and the storage device 102 where the data to be accessed is located can be confirmed according to the LUN ID and the logical address.
- after the storage device 102 receives the IO request, it processes read requests and write requests differently; the two processing methods are described below.
- FIG. 5 is a flowchart of a method for the storage device 102 to process a write request.
- Step S501: the disk processor 1021 of the storage device 102 receives the write request sent by the controller 101.
- Step S502: determine whether the write mode of the storage device 102 is the write-back mode.
- Two storage modes are generally provided in the storage device 102, one is a write-back mode, and the other is a write-through mode.
- in the write-back mode, the storage device 102 first writes the data to be written in the write request into the cache 1022, returns a write-completion feedback instruction once the write finishes, and later writes the data in the cache 1022 into the Nand flash 1023 by cache flushing.
- in the write-through mode, the storage device 102 writes the data to be written in the write request to the cache 1022 and the Nand flash 1023 at the same time.
- the manufacturer generally sets the write mode of the storage device at the factory, with the write-back mode as the default.
- the user can modify the write mode with a preset command, and can query whether the write mode is write-back or write-through by querying the parameters of the storage device.
- Step S503: if the write mode of the storage device 102 is the write-back mode, the disk processor 1021 writes the data to be written in the IO request into the cache 1022 and then returns a write-completion feedback instruction to the controller 101.
- Step S504: the disk processor 1021 determines the storage block into which the data of the write request is written.
- the write request here may be a write request for which step S502 determined that the write mode of the storage device is write-through rather than write-back, so that the data to be written must be written to the Nand flash; or it may be a write request generated by a storage device background operation such as cache flushing or garbage collection.
- the disk processor 1021 also sets an execution time for write requests generated by background operations.
- for this execution time, refer to the setting of the execution time for the third type of IO request.
- the disk processor 1021 first determines the storage area for the data to be written in the IO request, and then determines the storage block into which the data is written; when the storage area is a single storage block, the target storage block can be determined directly.
- in the embodiment of the present invention, write operations and erase operations are processed in different storage areas. Specifically, when selecting a storage area for a write request, a storage area in which an erase operation is being performed is not selected; whether an erase operation is in progress is recorded in the metadata of the storage area. This ensures that write operations and erase operations do not occur in the same storage area, and thus never coexist in the same storage block, avoiding the situation in which an urgent IO request, such as a read request queued behind a write request and an erase request, must wait for both the write and the erase to complete and is therefore delayed.
- the storage area has two operation modes, a read+write mode and a read+erase mode; however, when the current operating state of the storage device 102 cannot ensure that write operations and erase operations are performed in two different storage areas, the operation mode of the storage areas can be switched back to the traditional read+write+erase mode.
- the operation mode of the storage area can be recorded in the metadata of the storage area.
- Step S601: the disk processor 1021 selects a storage area in the read+write mode.
- the disk processor 1021 may select a storage area in the read+write mode by querying the operation mode recorded in the metadata of each storage area.
- Step S602: the disk processor 1021 determines whether the number of blank sub-blocks in the selected storage area is below a threshold. Step S603: if the number of blank sub-blocks in the selected storage area is below the threshold, the disk processor 1021 converts the mode of the storage area to the read+erase mode.
- as described above, each storage area includes multiple sub-blocks, and a sub-block is the smallest unit of an erase operation.
- a minimum threshold of blank sub-blocks is set for each storage area; when the number of blank sub-blocks in a storage area falls below this minimum threshold, no further data may be written to it, and its mode is converted to the read+erase mode.
- after the mode of the storage area is converted to the read+erase mode, sub-blocks containing a larger amount of invalid data are selected within the storage area for erasing, so as to free up blank sub-blocks.
- after the erase operation is performed, if the number of blank sub-blocks in the storage area exceeds a certain value, the read+erase-mode mark of the storage area may be cleared so that subsequent write requests can again be received.
- Step S604: the disk processor 1021 determines whether the storage device has a storage area that is in neither the read+write mode nor the read+erase mode.
- Step S605: if there is a storage area in neither the read+write mode nor the read+erase mode, the disk processor 1021 uses that storage area as the storage area in which no erase operation is performed.
- when the number of blank sub-blocks in the originally selected storage area is below the threshold, the write performance of that storage area is relatively poor and it cannot be used to write the data carried in the write request, so a new storage area must be selected for the data.
- when the new storage area is selected, a storage area in neither the read+write mode nor the read+erase mode is chosen; this lets multiple storage areas share the write pressure.
- Step S606: if there is no storage area that is in neither the read+write mode nor the read+erase mode, the disk processor 1021 switches all storage areas in the storage device to the read+write+erase mode.
- specifically, the disk processor removes the read+write-mode or read+erase-mode mark recorded in the metadata of each storage area, restoring every storage area to the conventional read+write+erase mode, in which read, write, and erase operations can all be performed.
- Step S607: if step S602 determines that the number of blank sub-blocks in the selected storage area is not below the threshold, the disk processor 1021 determines whether the read/write pressure of the storage area currently in the read+write mode is too high.
- whether the read/write pressure of the storage area is too high may be determined by whether the read latency of the storage area exceeds a first preset value or the write latency exceeds a second preset value.
- if the read/write pressure of the storage area currently in the read+write mode is too high, step S604 is executed.
- in one embodiment, the first preset value is twice the latency of the disk processor 1021 processing a single read request with no interference from other requests, and the second preset value is twice the latency of the disk processor 1021 processing a single write request with no interference from other requests.
- Step S608: if the pressure is not too high, the disk processor 1021 determines that the storage area is the storage area in which no erase operation is performed.
- that is, when the read latency of the storage area does not exceed the first preset value and the write latency does not exceed the second preset value, the read/write pressure of the storage area is considered not too high.
- after the storage area is determined, the data to be written can be written into the storage blocks constituting the storage area; when the storage area consists of a single storage block, the data is written directly into that storage block.
- in another embodiment, when it is determined in step S602 that the number of blank sub-blocks is not below the threshold, step S608 may be executed directly, that is, the storage area is used as the storage area for the write request, without performing step S607 to judge whether the pressure of the storage area is too high.
- in yet another embodiment, after step S601 is executed, that is, after a storage area currently in the read+write mode is selected, step S607 may be executed directly to judge whether the pressure of the storage area is too high, without performing step S602 and step S603.
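- A compact sketch of the selection flow of steps S601 to S608 might look as follows; the structures, thresholds, and helper predicates are hypothetical stand-ins for the metadata queries described above:

    /* read+write, read+erase, read+write+erase */
    typedef enum { MODE_RW, MODE_RE, MODE_RWE } area_mode_t;

    typedef struct {
        area_mode_t mode;
        int free_subblocks;
        int read_latency_us;   /* recent average read latency  */
        int write_latency_us;  /* recent average write latency */
    } area_t;

    /* Thresholds assumed from the text: pressure is "too high" when latency
     * exceeds twice the uncontended single-request latency. */
    #define FREE_SUBBLOCK_MIN   8      /* hypothetical minimum of blank sub-blocks */
    #define READ_BASE_US       80
    #define WRITE_BASE_US    2000

    static int too_much_pressure(const area_t *a) {
        return a->read_latency_us  > 2 * READ_BASE_US ||
               a->write_latency_us > 2 * WRITE_BASE_US;
    }

    /* Returns the area that receives the write, or NULL after falling back
     * to switching every area to read+write+erase (step S606). */
    area_t *select_write_area(area_t *areas, int n) {
        for (int i = 0; i < n; i++) {                       /* S601 */
            area_t *a = &areas[i];
            if (a->mode != MODE_RW)
                continue;
            if (a->free_subblocks < FREE_SUBBLOCK_MIN) {    /* S602 */
                a->mode = MODE_RE;                          /* S603 */
            } else if (!too_much_pressure(a)) {             /* S607 */
                return a;                                   /* S608 */
            }
            break;  /* low on space or overloaded: look for a fresh area */
        }
        for (int i = 0; i < n; i++)                          /* S604 */
            if (areas[i].mode != MODE_RW && areas[i].mode != MODE_RE) {
                areas[i].mode = MODE_RW;
                return &areas[i];                            /* S605 */
            }
        for (int i = 0; i < n; i++)                          /* S606 */
            areas[i].mode = MODE_RWE;
        return NULL;
    }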
- Step S505: place the write request in the queue of pending requests corresponding to the storage block according to the execution time of the write request.
- the queue of the request to be processed is a linked list group, but the linked list group is only an example, and any other queue that can sort IO requests according to execution time falls within the scope of the present invention.
- the following uses a linked list group as an example to explain how to sort the IO requests according to execution time.
- the linked list group corresponding to each storage block includes a write linked list group and a read linked list group.
- the linked list group corresponding to each storage block does not distinguish between the write linked list group and the read linked list group, but mounts the read request and the write request in the same linked list group.
- the structure of the write-linked list group and the read-linked list group is the same.
- the following uses only the write-linked list group as an example for description.
- the write linked list group includes multiple write linked lists.
- the linked list header of each write linked list represents a time range.
- the linked list headers of two adjacent linked lists have a certain time interval.
- the time interval may be the same or different. In the embodiments, the same time interval is taken as an example for description.
- FIG. 7 is a schematic diagram of a write linked list group in an embodiment of the present invention.
- the header of the first linked list in the write linked list group is T+5ms, and the time range it represents is T to T+5ms, where T is the current system time.
- the header of the second linked list in the group is T+10ms, and the time range it represents is T+5ms to T+10ms; and so on, the time ranges identified by two adjacent linked lists differ by 5 ms. In this way, the time range into which the execution time of a write request falls can be determined, and the write request can then be mounted under the linked list corresponding to that time range.
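- As a sketch, mapping an execution time to its 5 ms bucket in such a linked list group reduces to integer arithmetic (the names and the bucket count are assumed):

    #include <stdint.h>

    #define BUCKET_SPAN_US 5000          /* each list covers a 5 ms range */
    #define NUM_BUCKETS      64          /* hypothetical list count       */

    /* Index of the linked list whose range [T + i*5ms, T + (i+1)*5ms)
     * contains the request's execution time; T is the current time. */
    static inline int bucket_index(uint64_t now_us, uint64_t exec_time_us) {
        uint64_t delta = exec_time_us > now_us ? exec_time_us - now_us : 0;
        int idx = (int)(delta / BUCKET_SPAN_US);
        return idx < NUM_BUCKETS ? idx : NUM_BUCKETS - 1;  /* clamp overdue/far tails */
    }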
- a multi-level linked list group may be established.
- the first-level linked list group and the second-level linked list group are disk-level linked list groups
- the third level is a Die-level linked list group.
- the first-level linked list group includes x linked lists, and the interval of the time range represented by each linked list header in the group is 1 s; the number of linked lists x is determined by the maximum execution time of a write request.
- the time range indicated by the header T+1s of the first linked list is T to T+1s, where T is the current system time.
- the time range indicated by the linked list with header T+2s is T+1s to T+2s, and so on; the time range indicated by the linked list with header T+xs is T+(x-1)s to T+xs.
- the second-level linked list group further divides the time range indicated by the linked list header of the first linked list in the first-level linked list group into multiple linked lists. For example, as shown in FIG. 8, in the embodiment of the present invention, that time range is divided in the second-level linked list group at a granularity of 5 ms.
- the third-level linked list group is the linked list group corresponding to each die.
- the third-level linked list group further divides the time range represented by the linked list header of the first linked list in the second-level linked list group, for example at a granularity of 200 μs.
- when the disk processor 1021 needs to write the data of a write request to the Nand flash 1023, it first mounts the write request, according to the execution time carried in the request, to the linked list of the first-level linked list group corresponding to that execution time. For example, if the current system time T is 11h:25m:30s and the execution time of the write request is 11h:25m:32s:66ms, then since the execution time is T+2s66ms, which belongs to the time range T+2s to T+3s, the write request is mounted to the third linked list, whose header is T+3s.
- two seconds later, the header of the third linked list in the first-level linked list group becomes T+1s, making it the first linked list in the group, and the relative execution time of the write request mounted in it becomes T+66ms. The disk processor 1021 then mounts the write requests in this first linked list to the linked lists of the second-level linked list group according to their execution times; for example, a write request with an execution time of T+66ms is mounted to the linked list whose header is T+70ms.
- in this way, the disk processor 1021 can execute the write requests in the die-level linked list group in order of their execution times.
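- A sketch of the demotion step, which moves requests from an expiring upper-level linked list into the finer-grained buckets of the level below (the structures are hypothetical):

    #include <stdint.h>
    #include <stddef.h>

    typedef struct io_req {
        uint64_t exec_time_us;   /* absolute deadline */
        struct io_req *next;
    } io_req_t;

    typedef struct {
        io_req_t *head;          /* singly linked pending list */
    } bucket_t;

    /* When the first upper-level list's window arrives, redistribute its
     * requests into the finer buckets of the lower level (e.g. 1 s -> 5 ms). */
    void demote(bucket_t *expiring, bucket_t *lower, int n_lower,
                uint64_t now_us, uint64_t lower_span_us) {
        io_req_t *r = expiring->head;
        expiring->head = NULL;
        while (r) {
            io_req_t *next = r->next;
            uint64_t delta = r->exec_time_us > now_us ? r->exec_time_us - now_us : 0;
            int idx = (int)(delta / lower_span_us);
            if (idx >= n_lower) idx = n_lower - 1;         /* clamp */
            r->next = lower[idx].head;                     /* push front */
            lower[idx].head = r;
            r = next;
        }
    }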
- the write linked list and read linked list are only examples; in a specific implementation, it is not necessary to distinguish between them. Instead, read and write requests may be mounted in a single linked list, and the disk processor can distinguish read requests from write requests by the type of each IO request in the list.
- Step S506: when it is determined that the write request cannot be completed before its execution time arrives, and the timeout indication flag carried in the write request indicates that the request must return immediately when the time spent executing it exceeds its execution time, feedback information of the execution failure is returned to the controller 101.
- Step S507: when the write request is to be executed before its execution time arrives, the disk processor 1021 divides the write request into multiple fragments.
- here, "before the execution time arrives" means that the current time is earlier than the execution time of the write request minus the duration needed to execute it; this ensures that the write request can be completed before its execution time arrives.
- specifically, the disk processor 1021 slices the write request at a granularity of 100 μs before executing it.
- Step S508: each time the disk processor 1021 executes one fragment of the write request, it checks the read linked list corresponding to the storage block to confirm whether there is an urgent read request to be processed.
- an urgent read request is a read request whose execution time is earlier than the execution time of the write request, or earlier than the current time plus the execution duration of the next fragment of the write request plus the execution duration of the read request, or earlier than the current system time plus the execution duration of the next fragment plus the execution durations of x read requests, where x is the maximum number of read requests allowed to execute serially while a write request is suspended.
- Step S509: if there is, execution of the write request is suspended, the read request is executed, and after the read request completes, execution of the next fragment of the write request continues.
- Step S510: if not, the next fragment of the write request is executed directly.
- Step S511: if the write request is executed successfully, the execution result is returned to the controller 101.
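- The fragment loop of steps S507 to S510 can be sketched as follows; the queue helpers are hypothetical, and the 100 μs slice granularity comes from the example above:

    #include <stdint.h>

    /* Hypothetical helpers provided by the drive firmware. */
    extern int  write_fragment(void *wr, int i);            /* execute the i-th 100 us slice */
    extern void *pop_urgent_read(uint64_t before_deadline); /* earliest read due before the deadline, or NULL */
    extern void execute_read(void *rd);

    /* Executes a write request split into n_frag slices, pausing between
     * slices whenever an urgent read request is pending (steps S507-S510). */
    int execute_write_sharded(void *wr, int n_frag, uint64_t write_deadline_us) {
        for (int i = 0; i < n_frag; i++) {
            if (write_fragment(wr, i) != 0)
                return -1;                                   /* slice failed */
            /* S508: after each slice, check for reads due before the write's
             * own deadline; S509: run them before the next slice. */
            void *rd;
            while ((rd = pop_urgent_read(write_deadline_us)) != NULL)
                execute_read(rd);
        }
        return 0;                                            /* S511: report success */
    }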
- FIG. 9 is a flowchart of a method for the storage device 102 to process a read request.
- Step S901: the disk processor 1021 receives a read request.
- Step S902: the disk processor 1021 judges whether the data to be read by the read request hits in the cache 1022.
- the cache 1022 holds frequently accessed hot data and newly written data, so when the disk processor receives a read request, it first checks, according to the logical address carried in the request, whether the data to be read is stored in the cache, that is, whether the read request hits in the cache 1022.
- Step S903: if it hits, the data to be read is returned to the controller 101.
- Step S904: if there is no hit, the storage area to which the data of the read request belongs is determined, and the storage block accessed by the read request is further determined.
- if there is no hit, the data to be read must be read from the Nand flash 1023; the storage area where the data is located is first determined according to the logical address of the read request. If the storage area is a RAID composed of multiple storage blocks, the storage block accessed by the read request is further determined; if the storage area consists of a single storage block, the determined storage area is the storage block.
- Step S905: place the read request into the queue of pending requests corresponding to the storage block.
- the queue of pending requests is implemented as a linked list group, and each storage block corresponds to a read linked list group and a write linked list group; in other embodiments, the read linked list group and the write linked list group may also be combined into a single linked list group.
- the structure of the read linked list group is the same as that of the write linked list group; for details, refer to the related descriptions of FIG. 7 and FIG. 8.
- the queue of pending requests is exemplified here by a linked list group, but in practical applications other ways of sorting IO requests by execution time also fall within the protection scope of the present invention.
- Step S906: if the read request is executed before its execution time arrives, the read data is returned to the controller after execution completes.
- Step S907: if the read request has not been executed when its execution time arrives, and the timeout indication flag carried in the read request indicates that the request must return immediately when the time spent executing it exceeds its execution time, feedback information of the execution failure is returned to the controller 101.
- after receiving the feedback, the controller 101 may reissue the read request. If the timeout indication flag carried in the IO request indicates that no error flag is returned when the execution time is exceeded, the storage device 102 leaves the read request unprocessed, and the host resends the read request when it determines that the IO has timed out.
- in order to allow read requests to be processed as soon as possible, if the operation mode of the storage area where the data to be read is located is the read+write mode, the disk processor schedules the read request during execution of the write request; for details, refer to the description of steps S507 and S508 in FIG. 5.
- if the operation mode of the storage area where the data to be read is located is the read+erase mode, the disk processor schedules the read request during execution of the erase operation. The specific method is as follows: assuming that a read operation takes 80 μs and an erase operation takes 10 ms, the erase operation is divided into 50 slices, each corresponding to 200 μs. At intervals of 200 μs, the disk processor determines whether the erase operation needs to be suspended in response to an urgent read request; if there is an urgent read request, the read request is executed, and after it completes, the next slice is executed.
- here, an urgent read request is a read request whose execution time is earlier than the execution time of the erase request, or earlier than the current time plus the execution duration of the next slice of the erase request, or earlier than the current system time plus the execution duration of the next slice of the erase request plus the execution durations of x read requests, where x is the maximum number of read requests that may be executed serially while an erase request is suspended.
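- The urgency test described here can be expressed as a small predicate; the parameter names are assumptions, and the three alternative criteria above are folded into the most general deadline bound:

    #include <stdbool.h>
    #include <stdint.h>

    /* A read is urgent if it would miss its deadline while we finish the next
     * slice and up to x already-queued serial reads (e.g. slice = 200 us,
     * read = 80 us, x = maximum reads run while the erase is suspended). */
    bool read_is_urgent(uint64_t read_deadline_us, uint64_t now_us,
                        uint64_t slice_us, uint64_t read_cost_us, int x)
    {
        return read_deadline_us < now_us + slice_us + (uint64_t)x * read_cost_us;
    }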
- FIG. 10 is a schematic structural diagram of a controller provided by an embodiment of the present application.
- the controller may be the controller 101 in FIG. 1 described above.
- the controller 101 includes a type determination module 1001, a setting module 1002, and a sending module 1003.
- the type determination module 1001 is used to determine the type of IO request.
- for the manner in which the controller determines the type of an IO request and the classification of IO requests, refer to the description of step S401 in FIG. 4.
- the setting module 1002 is configured to set an execution time for the IO request according to the type of the IO request; for the way in which the setting module 1002 sets the execution time for each IO request, refer to the relevant description of step S402.
- the setting module 1002 adds a field to the IO request to carry the execution time.
- the setting module 1002 also adds a field to the IO request to carry a timeout indication flag, which indicates whether the request should return immediately when the time spent executing the IO request exceeds its execution time.
- the sending module 1003 is configured to send the IO request to the storage device 102 accessed by the IO request.
- the function performed by the sending module 1003 corresponds to step S403 in FIG. 4; refer to the related description of step S403 in FIG. 4.
- the controller 101 may include only the setting module 1002 and the sending module 1003.
- FIG. 11 is a schematic structural diagram of a storage device provided by an embodiment of the present application.
- the storage device may be the storage device 102 in FIGS. 1 and 2 described above.
- the storage device includes an acquisition module 1101 and an execution module 1103.
- the obtaining module 1101 is used to obtain an IO request.
- the IO request may be an IO request received from the controller 101, or an IO request generated by a background operation of the storage device, such as cache flushing or garbage collection.
- When the IO request is a write request, the acquisition module 1101 also determines whether the write mode of the storage device 102 is the write-back mode. If it is, after writing the data to be written of the IO request into the cache 1022, the acquisition module 1101 returns a write-complete feedback instruction to the controller 101.
- When the IO request is a read request, the acquisition module determines whether the read request hits in the cache 1022, and if it hits, returns a read-hit feedback instruction to the controller 101.
- For the functions performed by the acquisition module 1101, refer to the description of steps S501 to S503 in FIG. 5 and the description of steps S901 to S903 in FIG. 9.
- the execution module 1104 is configured to execute the IO request according to the execution time of the IO request.
- When the IO request is a write request and the execution module 1104 determines that the write request cannot be completed after its execution time arrives, and the timeout indication flag carried in the write request indicates that an immediate return is required when the time to execute the write request exceeds the execution time of the IO request, the execution module 1104 returns execution-failure feedback to the controller 101. When the write request is executed before the execution time arrives, the write request is divided into multiple slices; each time the execution module 1104 finishes executing a slice of the write request, it checks the read linked list corresponding to the storage block to confirm whether there is an urgent read request to be processed. If there is, it suspends the write request, executes the read request, and continues with the next slice of the write request after the read request completes.
- When the IO request is a read request and the execution module 1104 finishes executing the read request before its execution time arrives, it returns the read data to the controller 101. If the read request has not finished when the execution time arrives, and the timeout indication flag carried in the IO request indicates that an error is to be returned when the time to execute the IO request exceeds its execution time, an execution error is returned to the controller 101.
- After receiving the execution-error feedback, the controller 101 may re-issue the read request. If the timeout indication flag carried in the IO request indicates that no error is to be returned when the time to execute the IO request exceeds its execution time, the storage device 102 takes no action on the read request; when the host determines that the IO has timed out, it resends the read request. For details, refer to the description of steps S907 and S908 in FIG. 9.
- When the execution module 1104 executes an erase request, it divides the erase request into multiple slices. After each slice is executed, the execution module 1104 determines whether the erase operation needs to be suspended to respond to an urgent read request; if there is an urgent read request, it executes the read request and executes the next slice only after the urgent read request has completed.
- An urgent read request is a read request whose execution time is earlier than the execution time of the erase request, or earlier than the current time plus the execution duration of the next slice of the erase request, or earlier than the current system time plus the execution duration of the next slice of the erase request plus the execution duration of x read requests, where x is the maximum number of read requests that may be executed serially while a write request is suspended.
- the storage device further includes a selection module 1102.
- the selection module 1102 is used to determine the memory block operated by the IO request.
- When the IO request is a write request, refer to step S504 in FIG. 5 and the related description of FIG. 6.
- When the IO request is a read request, refer to the related description of step S904 in FIG. 9.
- In another embodiment, the storage device 102 further includes a sorting module 1103, which is configured to insert the IO request, after the storage block operated on by the IO request is determined, into the IO queue corresponding to the storage block according to the execution time of the IO request.
- the IO queue is a linked list group.
- The function performed by the sorting module 1103 is the same as the function performed by step S505 in FIG. 5 and step S905 in FIG. 9; refer to the related descriptions of those steps.
- the present invention provides another embodiment of a storage device.
- the storage device includes an acquisition module 1101, a selection module 1102, and an execution module 1104.
- The acquisition module 1101 is used to receive a write request, and the write request carries data to be written.
- The selection module 1102 is used to select, from the plurality of storage areas, a storage area on which no erase operation is being performed.
- In this embodiment, the IO request is a write request.
- The execution module 1104 is used to write the data to be written into the selected storage area.
- the controller 101 further includes a processor and a memory.
- The memory may be a cache, and the cache is used to store control instructions.
- The processor is used to obtain an IO request and add the execution time of the IO request to the IO request, where the execution time is used to instruct the storage device to finish processing the IO request before the execution time arrives, and to send the IO request to the storage device.
- In another implementation, the control instructions may be stored in the memory 103 of the server; the processor may read the control instructions in the memory 103 and then perform the above processing.
- the modules in the foregoing embodiments may be implemented by software, hardware, or a combination of both.
- The software exists in the form of computer program instructions and is stored in a memory; a processor may be used to execute the program instructions to implement the above method flows.
- The processor may include, but is not limited to, at least one of the following computing devices that run software: a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a microcontroller unit (MCU), or an artificial intelligence processor. Each computing device may include one or more cores for executing software instructions to perform operations or processing.
- The processor can be built into a system on chip (SoC) or an application-specific integrated circuit (ASIC), or it can be an independent semiconductor chip. Besides the cores used to execute software instructions for computation or processing, the processor may further include necessary hardware accelerators, such as a field programmable gate array (FPGA), a programmable logic device (PLD), or a logic circuit that implements dedicated logic operations.
- When the above modules or units are implemented in hardware, the hardware may be any one or any combination of a CPU, a microprocessor, a DSP, an MCU, an artificial intelligence processor, an ASIC, an SoC, an FPGA, a PLD, a dedicated digital circuit, a hardware accelerator, or a non-integrated discrete device, which may run the necessary software or operate without depending on software to perform the above method flows.
Abstract
Embodiments of the present invention provide a data processing method, a controller, a storage device, and a storage system. In this solution, the controller adds an execution time to an IO request, where the execution time is used to instruct the storage device to finish processing the IO request before the execution time arrives, and the controller sends the IO request with the execution time added to the storage device. Upon receiving the IO request, the storage device can execute the IO request according to its execution time. With the method of the embodiments of the present invention, urgent IO requests can be processed in time and delays are avoided.
Description
The present application relates to the storage field, and in particular to a method, controller, storage device, and storage system for processing data stored in a storage device.

In the prior art, for IO requests that access the same die in a solid state storage device (SSD), the SSD executes the IO requests in the order in which they reach the die. If a block in the die is to be erased, an erase request for the block is added to the queue of pending requests corresponding to the die. For example, if a write request and an erase request are queued ahead of a read request, the SSD first performs the write operation corresponding to the write request, then the erase operation corresponding to the erase request, and finally the read operation corresponding to the read request. However, the time consumed by a write operation or an erase operation is far longer than that of a read operation, so even an urgent read request must wait until the preceding write or erase request finishes, which easily delays the read operation. Moreover, for IO requests generated by operations inside the SSD, such as garbage collection and inspection, the corresponding operations take even longer, so the latency impact on other, more urgent IO requests such as read requests is even greater.
Summary of the Invention
The present invention provides a data processing method, a controller, a storage device, and a storage system. By adding an execution time to an IO request, the storage device executes the IO request according to the execution time carried in it, so that urgent IO requests are processed in time.

A first aspect of the present invention provides a data processing method executed by a controller that communicates with a storage device. The controller and the storage device may be the memory and storage device in a storage array, or the memory and storage device in a server. When executing the method, the controller adds the execution time of an IO request to the IO request and then sends the IO request to the storage device, where the execution time is used to instruct the storage device to finish processing the IO request before the execution time arrives.

By adding the execution time to the IO request, the storage device executes the IO request according to the execution time carried in it, so that urgent IO requests are processed in time.

In an optional embodiment of the first aspect, the controller also adds a timeout indication flag to the IO request. The timeout indication flag is used to indicate whether the storage device returns error information when the execution time has been exceeded and the IO request has not yet been processed by the storage device, where the error information indicates that execution of the IO request failed.

In this way, when the IO request is not completed within its execution time, the controller can be notified in time, and the controller can promptly determine a new processing strategy for the IO request, such as re-reading or writing to a new location.

In an optional embodiment of the first aspect, the controller may determine the type of the IO request, for example, an externally generated IO request, an IO request corresponding to an internal key service, or an IO request corresponding to an array background service, and then determine the execution time of the IO request according to the determined type of the IO request.

By setting different execution times for different types of IO requests, IO requests that need urgent handling can be processed first.

In a first optional embodiment of the first aspect, a different execution duration is preset in the controller for each type of IO request; when setting the execution time for a received IO request, the current time plus the execution duration set for the type to which the IO request belongs is used, so that the storage device can execute the IO request according to its completion time.
A second aspect of the present invention provides a data processing method executed by a storage device. After the storage device obtains an IO request, since the IO request includes an execution time that instructs the storage device to finish processing the IO request before the execution time arrives, the storage device can execute the IO request according to its execution time.

By adding the execution time to the IO request, the storage device executes the IO request according to the execution time carried in it, so that urgent IO requests are processed in time.

In an optional embodiment of the second aspect, the storage device includes multiple storage blocks. After obtaining the IO request, the storage device determines the storage block accessed by the IO request, places the IO request into the queue of pending requests corresponding to the storage block according to the execution time, and then executes the IO request according to the execution times of the IO requests in the queue.

By maintaining a queue of pending requests for each storage block, the storage array can conveniently manage pending requests according to their execution times.

In an optional embodiment of the second aspect, the storage device includes multiple storage areas, each composed of at least one storage block, and the IO request is a write request. When determining the storage block accessed by the IO request, the storage device first selects, from the multiple storage areas, a storage area on which no erase operation is being performed, and then determines the storage block accessed by the IO request according to the selected storage area.

Since both write requests and erase requests take a relatively long time to execute, this approach separates write requests and erase requests into different storage areas, increasing the probability that urgent IO requests are processed in time.

In an optional embodiment of the second aspect, each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed, and each storage area has two modes, a read+write mode and a read+erase mode. When a storage area is used for writing data, it is set to the read+write mode; when it is used for performing erase operations, it is set to the read+erase mode. When selecting a storage area on which no erase operation is currently being performed, the storage device first selects, from the multiple storage areas, a storage area in the read+write mode, and determines whether the number of free sub-blocks in the selected storage area is below a threshold. If the number of free sub-blocks is below the threshold, the mode of the storage area is set to the read+erase mode, and the storage device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if such a storage area exists, it is taken as the storage area on which no erase operation is being performed. If the number of free sub-blocks in the selected storage area is not below the threshold, the selected storage area in the read+write mode is taken as the storage area on which no erase operation is being performed.

In this way, when the storage device selects a storage area on which no erase operation is currently being performed, if the number of free sub-blocks of the selected storage area is below the threshold, too little space is available for writing data and write efficiency would suffer; the storage area can then be erased for later use, and a new storage area is selected for the IO request.

In another optional embodiment of the second aspect, when selecting a storage area on which no erase operation is currently being performed, the storage device first selects, from the multiple storage areas, a storage area in the read+write mode, and determines whether the read/write pressure borne by the selected storage area exceeds a threshold. If it does, the storage device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if such a storage area exists, it is selected as the storage area on which no erase operation is being performed. If the read/write pressure borne by the selected storage area does not exceed the threshold, the selected storage area is taken as the storage area on which no erase operation is being performed.

By checking whether the read/write pressure exceeds a threshold when selecting a storage area, data can be written to a storage area under low pressure, avoiding writes into a storage area under heavy read/write pressure that would hurt write efficiency.

In an optional embodiment of the second aspect, the storage areas further have a read+write+erase mode, in which read, write, and erase operations can all be performed in the storage area. When there is no storage area in neither the read+write mode nor the read+erase mode, the storage device converts the modes of all the storage areas to the read+write+erase mode.

By converting the mode of the storage areas so that reads, writes, and erases can all proceed, the situation in which data cannot be written normally because the storage device cannot find a suitable storage area for the IO is avoided.

In an optional embodiment of the second aspect of the present invention, each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed, and each storage area has two modes, the read+write mode and the read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when used for performing erase operations. The storage device selects the storage area into which the data to be written is written as follows: first select a storage area in the read+write mode and determine whether the number of blank sub-blocks in the selected storage area is below a threshold; if so, convert the mode of the storage area to the read+erase mode, and determine whether the SSD has a storage area in neither the read+write mode nor the read+erase mode; if there is such a storage area, take it as the storage area on which no erase operation is being performed; if there is not, the disk processor 1021 switches all storage areas in the SSD to the read+write+erase mode. If the number of blank sub-blocks in the selected storage area is not below the threshold, determine whether the read/write pressure on the storage area currently in the read+write mode is too high; if it is not, determine the storage area as the one into which the write request is written; if it is too high, the disk processor 1021 switches all storage areas in the SSD to the read+write+erase mode.

In an optional embodiment of the second aspect, the queue of pending requests corresponding to each storage block is a linked list group, and each storage block corresponds to one read linked list group and one write linked list group; the read linked list group is used to mount read requests according to the execution times of the read requests among the IO requests, and the write linked list group is used to mount write requests according to the execution times of the write requests.

Mounting read requests and write requests in different linked lists improves the efficiency of scheduling and executing IO requests.

In an optional embodiment of the second aspect, the read linked list group and the write linked list group each include multiple linked lists, each linked list represents a time range, and the time ranges of two adjacent linked lists are contiguous. The storage device determines the time range to which the execution time of a read or write request belongs and mounts the read or write request under the linked list corresponding to that time range.

By dividing time into multiple ranges and mounting an IO request under the linked list corresponding to the range into which its execution time falls, the speed of looking up IO requests is improved, which in turn improves the efficiency of scheduling and executing IO requests.

In an optional embodiment of the second aspect, the storage device further includes at least one disk-level linked list group, and the linked list group corresponding to a storage block is a block-level linked list group. Each disk-level linked list group includes multiple disk-level linked lists, each representing a time range equal to the current time plus a preset duration, with the time ranges of adjacent disk-level linked lists being contiguous. The at least one disk-level linked list group and the block-level linked list group form different levels, with the block-level linked list group at the lowest level, where the sum of the time ranges represented by all linked lists of a lower-level group equals the time range represented by the first linked list of the group one level above.

Dividing the linked lists into multiple levels shortens the overall list length, improves the time precision the lists represent, and improves the efficiency of scheduling and executing IO requests.

In an optional embodiment of the second aspect of the present invention, the storage device divides a write request or erase request to be executed into multiple slices; after executing each slice, it determines whether there is an urgent read request to be processed, the urgent read request being a read request whose execution time is earlier than the execution time of the write request or erase request; if there is an urgent read request to be processed, it suspends the write request or erase request and executes the urgent read request.

In an optional embodiment of the second aspect of the present invention, the storage device divides a write request or erase request to be executed into multiple slices; after executing each slice, it determines whether there is an urgent read request to be processed, the urgent read request being a read request whose execution time is earlier than the current time plus the execution duration of the next slice plus the execution duration of a read request; if there is an urgent read request to be processed, it suspends the write request or erase request and executes the urgent read request.

In an optional embodiment of the second aspect of the present invention, the storage device divides a write request or erase request to be executed into multiple slices; after executing each slice, it determines whether there is an urgent read request to be processed, where the execution time of the read request is earlier than the execution time of the next slice plus the execution duration of x serially executed read requests, x being the maximum number of read requests allowed to be executed serially while a write request or erase request is suspended once; if there is an urgent read request to be processed, it suspends the write request or erase request and executes the urgent read request.

By slicing write or erase requests and checking, after each slice, whether an urgent IO request needs handling and executing the urgent read IO request if so, urgent read IO requests are kept from being starved while a time-consuming write or erase request is executing.
A third aspect of the present invention provides a data processing method applied to a storage device that includes multiple storage areas. The storage device receives a write request carrying data to be written, selects, from the multiple storage areas, a storage area on which no erase operation is being performed, and writes the data to be written into the selected storage area.

Since both write requests and erase requests take a relatively long time to execute, this approach separates write requests and erase requests into different storage areas, increasing the probability that urgent IO requests are processed in time.

In an optional embodiment of the third aspect, the storage device includes multiple storage areas, each composed of at least one storage block, and the IO request is a write request. When determining the storage block accessed by the IO request, the storage device first selects, from the multiple storage areas, a storage area on which no erase operation is being performed, and then determines the storage block accessed by the IO request according to the selected storage area.

Since both write requests and erase requests take a relatively long time to execute, this approach separates write requests and erase requests into different storage areas, increasing the probability that urgent IO requests are processed in time.

In an optional embodiment of the third aspect, each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed, and each storage area has two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when used for performing erase operations. When selecting a storage area on which no erase operation is currently being performed, the storage device first selects a storage area in the read+write mode and determines whether the number of free sub-blocks in it is below a threshold. If the number is below the threshold, the mode of the storage area is set to the read+erase mode, and the storage device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if there is, that storage area is taken as the storage area on which no erase operation is being performed. If the number of free sub-blocks in the selected storage area is not below the threshold, the selected storage area in the read+write mode is taken as the storage area on which no erase operation is being performed.

In this way, when the storage device selects a storage area on which no erase operation is currently being performed, if the number of free sub-blocks of the selected storage area is below the threshold, too little space is available for writing data and write efficiency would suffer; the storage area can then be erased for later use, and a new storage area is selected for the IO request.

In another optional embodiment of the third aspect, when selecting a storage area on which no erase operation is currently being performed, the storage device first selects a storage area in the read+write mode and determines whether the read/write pressure borne by it exceeds a threshold. If it does, the storage device determines whether there is a storage area in neither the read+write mode nor the read+erase mode; if there is, that storage area is selected as the storage area on which no erase operation is being performed. If the read/write pressure does not exceed the threshold, the selected storage area is taken as the storage area on which no erase operation is being performed.

By checking whether the read/write pressure exceeds a threshold when selecting a storage area, data can be written to a storage area under low pressure, avoiding writes into a storage area under heavy read/write pressure that would hurt write efficiency.

In an optional embodiment of the third aspect, the storage areas further have a read+write+erase mode, in which read, write, and erase operations can all be performed in the storage area; when there is no storage area in neither the read+write mode nor the read+erase mode, the storage device converts the modes of all the storage areas to the read+write+erase mode.

By converting the mode of the storage areas so that reads, writes, and erases can all proceed, the situation in which data cannot be written normally because the storage device cannot find a suitable storage area for the IO is avoided.

In an optional embodiment of the third aspect, the storage device further receives a read request, where the read request and the write request include execution times used to instruct the storage device to process the read request or write request before the execution time arrives, and executes the read request or write request according to its execution time.
A fourth aspect of the present invention provides a controller that includes multiple functional modules, the functions performed by the modules being the same as the functions performed by the steps of the data processing method provided in the first aspect.

A fifth aspect of the present invention provides a storage device that includes multiple functional modules, the functions performed by the modules being the same as the functions performed by the steps of the data processing method provided in the second aspect.

A sixth aspect of the present invention provides a storage device that includes multiple functional modules, the functions performed by the modules being the same as the functions performed by the steps of the data processing method provided in the third aspect.

A seventh aspect of the present invention provides a data processing system, including the controllers provided in the first aspect and the storage devices provided in the second aspect.

An eighth aspect of the present invention provides a controller including a processor and a storage unit, where the storage unit stores program instructions, and the processor executes the program instructions in the storage unit to perform the data processing method provided in the first aspect.

The eighth aspect of the present invention further provides a storage device including a processor and a storage unit, where the storage unit stores program instructions, and the processor executes the program instructions in the storage unit to perform the data processing method provided in the second or third aspect.

A ninth aspect provides a storage medium storing computer program code which, when run on a computer, causes the computer to perform the method of the first, second, or third aspect.
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required in the description of the embodiments or the prior art are briefly introduced below.

FIG. 1 is a hardware structural diagram of a server provided by an embodiment of the present invention.
FIG. 2 is a hardware structural diagram of a storage device provided by an embodiment of the present invention.
FIG. 3 is a schematic diagram of building storage areas at the granularity of the storage blocks in a storage device in an embodiment of the present invention.
FIG. 4 is a flowchart of a method by which the controller in the server processes IO requests in an embodiment of the present invention.
FIG. 5 is a flowchart of the storage device processing a received write request in an embodiment of the present invention.
FIG. 6 is a flowchart of a method for determining the storage area into which the data to be written of a write request is written in an embodiment of the present invention.
FIG. 7 is a schematic diagram of a first type of linked list group used to mount IO requests in an embodiment of the present invention.
FIG. 8 is a schematic diagram of a second type of linked list group used to mount IO requests in an embodiment of the present invention.
FIG. 9 is a flowchart of the storage device processing a received read request in an embodiment of the present invention.
FIG. 10 is a functional module diagram of the controller in an embodiment of the present invention.
FIG. 11 is a functional module diagram of the storage device in an embodiment of the present invention.
FIG. 12 is a schematic diagram of a read IO request in an embodiment of the present invention.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention.

As shown in FIG. 1, which is a hardware structural diagram of the server 100 in an embodiment of the present invention, the server 100 includes a controller 101, multiple storage devices 102, a memory 103, an interface 104, and a bus 105. The controller 101, the storage devices 102, and the interface 104 are connected to the bus 105, and the controller 101 accesses the storage devices 102 through the bus 105. The interface 104 is used to connect to a host (not shown) and to transfer IO requests received from the host to the controller 101 for processing. The memory 103 holds an application program (not shown) run by the controller 101; by running the application program, the controller 101 can manage the storage devices 102 or enable the server 100 to provide services externally. In this embodiment of the present invention, the server 100 may be a storage array, and the storage devices 102 may be SSDs.

As shown in FIG. 2, which is a structural diagram of a storage device 102 in the server 100, the storage devices 102 in the server 100 have the same structure, and one of them is described below as an example.

Each storage device 102 includes a disk processor 1021, a cache 1022, and a physical storage space composed of multiple NAND flash chips 1023. The disk processor 1021 is used to receive the IO requests sent by the controller 101 and execute them to access data in the physical storage space. The cache 1022 stores the application program run by the disk processor 1021; by running it, the disk processor 1021 can access and manage the data in the storage areas 1023.

In this embodiment of the present invention, the physical storage space is divided into multiple storage areas, each composed of at least one storage block, where a storage block is a die of the NAND flash.

As shown in FIG. 3, which is a schematic diagram of a storage area formed as a RAID from multiple dies of the NAND flash, each NAND flash chip generally includes multiple dies 1024. To ensure data reliability in the storage device 102, a redundant array of independent drives (RAID) is built at die granularity in the storage device 102. For example, if each storage device 102 includes 16 NAND flash chips, each containing 4 dies, and every 16 dies form one RAID group 1025, then 4 RAID groups 1025 can be built in the storage device 102. The number of flash chips in the storage device 102, the number of dies per chip, and the number of dies forming a RAID group above are all examples and do not limit the present invention. Each die 1024 includes multiple sub-blocks 1026, each sub-block 1026 being the smallest unit on which an erase operation is performed.
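As an illustration of the die-granularity grouping just described, the following sketch enumerates dies and groups them into RAID groups; the function name, defaults, and the (package, die) tuple representation are assumptions for illustration, not part of the patent.

```python
# Illustrative sketch of die-granularity RAID grouping: 16 NAND packages x
# 4 dies per package, 16 dies per RAID group, as in the example above.

def build_raid_groups(num_packages=16, dies_per_package=4, dies_per_group=16):
    dies = [(pkg, die) for pkg in range(num_packages)
                       for die in range(dies_per_package)]
    assert len(dies) % dies_per_group == 0
    return [dies[i:i + dies_per_group]
            for i in range(0, len(dies), dies_per_group)]

print(len(build_raid_groups()))  # 4 RAID groups, matching the example
```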
For storage devices 102 with lower reliability requirements, a RAID need not be formed, in which case each die constitutes one storage area.

In the prior art, when the controller 101 receives an IO request sent by the host, it sends the IO request to the storage device 102 that the IO request accesses for processing. After receiving the IO request, the storage device 102 further determines the storage block it accesses. For IO requests accessing the same storage block, the disk processor 1021 of the storage device 102 executes them in the order in which they access the die. If a block in the die is to be erased, an erase request for the block is added to the queue of pending requests corresponding to the die. For example, if the first request in the queue is a write request, the second an erase request, and the third a read request, the disk processor 1021 first performs the write operation, then the erase operation, and finally the read operation. However, write and erase operations take far longer than read operations: a read operation typically takes 80 μs, a write operation 1-3 ms, and an erase operation 3-15 ms. An urgent read request must therefore wait until the preceding write or erase request finishes, which easily delays the read operation. Moreover, for IO requests generated by operations inside the storage device, such as garbage collection and inspection, the corresponding operations take even longer, with an even greater latency impact on other, more urgent IO requests such as read requests.

In the embodiments of the present invention, the controller 101 sets an execution time for each received IO request and delivers the IO request with the execution time to the storage device 102; the storage device 102 adjusts the execution order of IO requests according to their execution times, so that urgent IO requests are processed in time and timeouts of urgent IO requests are avoided. The solution provided by the embodiments of the present invention is described in detail below with reference to FIGS. 4-9.
As shown in FIG. 4, which is a flowchart of the method by which the controller 101 adds an execution time to an IO request in an embodiment of the present invention:

Step S401: the controller 101 determines the type of the IO request.

IO request types generally fall into three categories. The first is IO requests generated outside the server 100, for example IO requests sent to the server by an external host; these in turn include two kinds: IO requests generated by the host in response to user operations, and IO requests corresponding to value-added services of the server 100, such as snapshot, clone, replication, active-active, and backup. The second category is IO corresponding to key services inside the server 100, such as metadata reads and writes. The third category is IO requests corresponding to array background tasks, such as cache flushing, disk reconstruction, and garbage collection. All three categories include both read requests and write requests.

Step S402: the controller 101 sets an execution time for the IO request according to the type of the IO request.

The controller 101 periodically (for example, every minute, hour, or day) synchronizes the system time of the server 100 to each storage device 102, which keeps the time of the server 100 and of each storage device 102 in sync.

In the server 100, an execution duration can be preset for each type of IO request. For example, the execution durations set for read and write requests generated by the host in response to user operations are 200 μs and 400 μs respectively; for read and write requests generated by metadata reads and writes, 200 μs and 500 μs respectively; for read and write requests on metadata, 500 μs and 2 ms respectively; for read and write requests generated during disk reconstruction, 10 ms and 1 s respectively; and for read and write requests generated during garbage collection, 10 ms and 2 s respectively. Likewise, execution durations can be set for the read and write requests of the value-added services such as snapshot, clone, replication, active-active, and backup. The execution durations listed above are only examples and do not limit the present invention; in practice, different execution durations can be set for different IO requests as the situation requires.

After the controller 101 receives an IO request and identifies its category, it obtains the execution duration set for that type of IO request and then sets an execution time for the received IO request equal to the current system time of the server 100 plus the obtained execution duration for that type. By setting an execution time for each IO request, the storage device can execute the IO request according to its execution time, so that urgent IO requests are handled in time; the specific process by which the storage device executes IO requests according to the execution time is described below.
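As a hedged sketch of step S402, the controller might keep the per-type execution durations in a table and add the synchronized current time. The table values repeat the examples above; the key scheme and function name are illustrative assumptions, not the patent's interface.

```python
import time

# Example durations from the text above, in microseconds (illustrative only)
EXEC_DURATION_US = {
    ("host", "read"): 200,       ("host", "write"): 400,
    ("metadata", "read"): 200,   ("metadata", "write"): 500,
    ("rebuild", "read"): 10_000, ("rebuild", "write"): 1_000_000,
    ("gc", "read"): 10_000,      ("gc", "write"): 2_000_000,
}

def set_execution_time(io_type: str, op: str) -> int:
    """Execution time = controller's current system time + preset duration."""
    now_us = time.time_ns() // 1_000
    return now_us + EXEC_DURATION_US[(io_type, op)]
```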
It should be noted here that, for a read request of the first category, if the data to be read hits in the memory 103, the data read from the memory is returned directly to the host. For a write request of the first category, the data to be written is written into the memory 103, and a write-complete feedback instruction is returned to the host. When the data in the memory 103 is subsequently written into the storage device 102, a new write request is generated; the newly generated IO request belongs to the cache flushing kind of the third category above, and a new execution time is then set for it.

In this embodiment of the present invention, a field is added to the IO request to carry the execution time. FIG. 12 is a schematic diagram of a read request under the NVMe protocol. The read request consists of 64 bytes, where the Opcode is a command identifier marking the command as a read request; the request also includes other parameters related to it, such as the Namespace Identifier, the Metadata Pointer, the Data Pointer to the memory address for the data returned by the disk, and the Starting LBA. These parameters are all defined in the existing read command and are not described again here. Besides these parameters, some positions in the read request remain undefined, for example the four bytes of the second command dword, the four bytes of the third command dword, and the four bytes of the thirteenth command dword; so in this embodiment of the present invention, any of these blank bytes capable of carrying the execution time may be chosen, for example the four bytes of the second command dword.

Optionally, another field is added to the IO request to carry a timeout indication flag, which indicates whether to return immediately when the time at which the IO request finishes executing exceeds the execution time of the IO request. Taking FIG. 12 as an example again, in byte 1 of the read command, that is, bit 13 of command dword 0, the bit is blank and not yet defined, so this bit can be used to carry the timeout indication flag. For example, when the flag is 0, no immediate return is needed when the time at which the IO request finishes executing exceeds its execution time; when the flag is 1, an immediate return is needed in that case. The position carrying the timeout indication flag may be any blank position in the existing read command, and the flag may also be defined differently; FIG. 12 is only an example, and in practice a suitable position and a suitable flag can be chosen as required.
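The following is a sketch of how the two fields might be packed into the 64-byte NVMe read command of FIG. 12, assuming the execution time occupies the four spare bytes of command dword 2 and the timeout indication flag occupies bit 13 of command dword 0. The 32-bit width of the time field and the helper name are assumptions for illustration.

```python
import struct

def embed_deadline(cmd: bytearray, exec_time_us: int, return_on_timeout: bool):
    """Pack the execution time into command dword 2 and the timeout flag into
    bit 13 of command dword 0 of a 64-byte NVMe command (offsets per FIG. 12)."""
    assert len(cmd) == 64
    struct.pack_into("<I", cmd, 8, exec_time_us & 0xFFFFFFFF)   # dword 2
    dw0, = struct.unpack_from("<I", cmd, 0)
    dw0 = dw0 | (1 << 13) if return_on_timeout else dw0 & ~(1 << 13)
    struct.pack_into("<I", cmd, 0, dw0)                         # flag bit 13
    return cmd
```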
Step S403: the controller 101 sends the IO request to the storage device 102 accessed by the IO request.

The IO request carries the LUN ID and the logical address of the data to be accessed, from which the storage device 102 where the data resides can be identified.

After the storage device 102 receives the IO request, it handles read requests and write requests differently; the storage device's methods of processing write requests and read requests are described separately below.
As shown in FIG. 5, which is a flowchart of the method by which the storage device 102 processes a write request:

Step S501: the disk processor 1021 of the storage device 102 receives a write request sent by the controller 101.

Step S502: determine whether the write mode of the storage device 102 is the write-back mode.

A storage device 102 generally provides two write modes: write-back and write-through. In the write-back mode, the storage device 102 first writes the data to be written of a write request into the cache 1022 and returns a write-complete feedback instruction once written; the data in the cache 1022 is later written into the NAND flash 1023 by cache flushing. In the write-through mode, the storage device 102 writes the data to be written of a write request into the cache 1022 and the NAND flash 1023 at the same time. The write mode is usually configured by the manufacturer at the factory, with write-back as the default; if the user later needs to change the write mode to write-through, this can be done by entering a preset command. Whether the write mode of the storage device is write-back or write-through can be found by querying the storage device's parameters.

Step S503: if the write mode of the storage device 102 is the write-back mode, the disk processor 1021 writes the data to be written of the IO request into the cache 1022 and then returns a write-complete feedback instruction to the controller 101.

When the data in the cache 1022 is subsequently written into the storage area, a new write request is generated and a new execution time is set for it; the setting of the execution time may refer to the setting of the execution time of the third category of IO requests above.

Step S504: the disk processor 1021 determines the storage block into which the data to be written of the write request is written.

The write request here may be the write request that must write the data to be written into the NAND flash when step S501 determines that the write mode of the storage device is not write-back but write-through, or it may be a write request generated by a background operation of the storage device, such as cache flushing or garbage collection. When a write request based on a background operation is generated inside the storage device, the disk processor 1021 also sets an execution duration for it; the setting may refer to that of the third category of IO requests above. When the storage area is the one shown in FIG. 3, the disk processor 1021 first determines the storage area into which the data to be written of the IO request is written, and then determines the storage block into which the data is written; when the storage area is a single storage block, the storage block into which the data is written can be determined directly.

Optionally, in this embodiment of the present invention, since write operations and erase operations take a relatively long time, they are handled in different storage areas. Specifically, when selecting a storage area for the write request, a storage area on which an erase operation is being performed is not selected. Whether an erase operation is being performed in a storage area is recorded in the storage area's metadata. This ensures that write operations and erase operations do not occur in the same storage area, avoiding the situation in which write and erase operations coexist in one storage block and an urgent IO request, for example a read request queued behind write and erase requests, can only complete after the write and erase operations complete, delaying the read request.

Since a read operation can be executed both in a storage area performing write operations and in a storage area performing erase operations, in this embodiment of the present invention a storage area has two operation modes: the read+write mode and the read+erase mode. However, when the storage device 102's current operation modes cannot keep write operations and erase operations in two separate storage areas, the operation mode of the storage areas can also be switched back to the traditional read+write+erase mode. The operation mode of a storage area can be recorded in its metadata.

How a storage area on which no erase operation is being performed is selected for the write request is described below with reference to FIG. 6.
Step S601: the disk processor 1021 selects a storage area in the read+write mode.

Since the metadata of each storage area records its operation mode, the disk processor 1021 can select a storage area in the read+write mode by querying the operation mode recorded in the metadata of each storage area.

Step S602: the disk processor 1021 determines whether the number of blank sub-blocks in the selected storage area is below a threshold. Step S603: if the number of blank sub-blocks in the selected storage area is below the threshold, the disk processor 1021 converts the mode of the storage area to the read+erase mode.

As shown in FIG. 3, each storage area includes multiple sub-blocks, a sub-block being the smallest unit of the erase operation. As operations on a storage area continue, its blank sub-blocks become fewer and fewer, and the fewer the blank sub-blocks, the worse the write performance of the storage area. Therefore, in this embodiment of the present invention, a minimum threshold of blank sub-blocks is set for each storage area. When the number of blank sub-blocks in a storage area falls below the minimum threshold, further writes are not allowed, and the mode of the storage area is converted to the read+erase mode. After the conversion, some sub-blocks with relatively more invalid data are selected in the storage block for erasure, to free blank sub-blocks. After the erase operations, when the number of blank sub-blocks in the storage area exceeds a certain value, the read+erase mode flag of the storage area can be cleared so that it is ready for subsequently received write requests.

Step S604: the disk processor 1021 determines whether the storage device has a storage area that is in neither the read+write mode nor the read+erase mode.

Step S605: if there is a storage area in neither the read+write mode nor the read+erase mode, the disk processor 1021 takes it as the storage area on which no erase operation is being performed.

If the number of blank sub-blocks in the selected storage area is below the threshold, its write performance is relatively poor and it cannot be used to write the data to be written of the write request, so a new storage area must be selected for the data; when selecting the new storage area, one in neither the read+write mode nor the read+erase mode is chosen. This lets multiple storage areas share the write pressure.

Step S606: if there is no storage area in neither the read+write mode nor the read+erase mode, the disk processor 1021 switches all storage areas of the storage device to the read+write+erase mode.

If there is no such storage area, the write pressure on the whole storage device is currently relatively high, so the disk processor clears the read+write or read+erase mode flags in the metadata of each storage area, restoring the storage areas to the existing read+write+erase mode, in which every storage area can perform read, write, and erase operations.

Step S607: if step S602 determines that the number of blank sub-blocks in the selected storage area is not below the threshold, the disk processor 1021 determines whether the read/write pressure on the storage area currently in the read+write mode is too high.

In this embodiment of the present invention, whether the read/write pressure on a storage area is too high can be judged by whether its read latency exceeds a first preset value or its write latency exceeds a second preset value.

Step S608: if the read/write pressure on the storage area currently in the read+write mode is too high, step S604 is executed.

If the read latency of the storage area exceeds the first preset value, or the write latency exceeds the second preset value, the read/write pressure on the storage area is considered too high. The first preset value is twice the latency of the disk processor 1021 processing a single read request alone (without interference from other requests), and the second preset value is twice the latency of the disk processor 1021 processing a single write request alone (without interference from other requests).

If the read/write pressure on the storage area currently in the read+write mode is not too high, the disk processor 1021 determines the storage area as the storage area on which no erase operation is being performed.

If the read latency of the storage area does not exceed the first preset value and the write latency does not exceed the second preset value, the read/write pressure on the storage area is considered not too high.

Once the storage area on which no erase operation is being performed is determined, the data to be written can be written into the storage blocks composing the storage area. When the storage area has only one storage block, the data is written directly into that storage block.

It should be noted that, in other embodiments of the present invention, when step S602 determines that the number of blank sub-blocks is not below the threshold, step S608 can be executed directly, that is, the storage area is taken as the one into which the write request is written, without executing step S607, i.e., without judging whether the pressure on the storage area is too high.
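The FIG. 6 flow can be condensed into the following sketch, assuming area objects with a mode attribute, a blank_subblocks count, and an overloaded() latency check; the threshold value and the final fallback choice are placeholders, not values from the patent.

```python
RW, RE, RWE = "read+write", "read+erase", "read+write+erase"

def pick_area(areas, min_blank=8):
    cand = next((a for a in areas if a.mode == RW), None)            # S601
    if cand and cand.blank_subblocks >= min_blank and not cand.overloaded():
        return cand                              # S602/S607 pass: write here
    if cand and cand.blank_subblocks < min_blank:
        cand.mode = RE                           # S603: hand it off to erase
    idle = next((a for a in areas if a.mode not in (RW, RE)), None)  # S604
    if idle:
        idle.mode = RW                           # S605: a fresh area
        return idle
    for a in areas:                              # S606: relax to legacy mode
        a.mode = RWE
    return areas[0]
```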
In another embodiment of the present invention, after step S601 is executed, that is, after a storage area currently in the read+write mode is selected, step S607 is executed directly, i.e., whether the pressure on the storage area is too high is judged, without executing steps S602 and S603.

Step S505: place the write request into the queue of pending requests corresponding to the storage block according to the execution time of the write request.

In this embodiment of the present invention, the queue of pending requests is a linked list group, but the linked list group is only an example; any other queue that can sort IO requests by execution time falls within the protection scope of the present invention. How the IO requests are sorted by execution time is described below, taking a linked list group as an example.

In this embodiment of the present invention, the linked list group corresponding to each storage block includes a write linked list group and a read linked list group. In other embodiments, the linked list group corresponding to each storage block does not distinguish between the two, and read requests and write requests are mounted in the same linked list group.

The write linked list group and the read linked list group have the same structure; only the write linked list group is described below as an example.

The write linked list group includes multiple write linked lists, the head of each representing a time range, with a certain time interval between the heads of two adjacent write linked lists; the intervals may be equal or unequal. In this embodiment, equal intervals are taken as an example.

As shown in FIG. 7, which is a schematic diagram of a write linked list group in an embodiment of the present invention, the head of the first linked list of the group is T+5 ms, representing the time range T to T+5 ms, where T is the current system time. The head of the second linked list is T+10 ms, representing the time range T+5 ms to T+10 ms, and so on; the time ranges identified by two adjacent linked lists differ by 5 ms. In this way, the time range into which the execution time of a write request falls can be determined from the execution time, and the write request can be mounted under the linked list corresponding to that range.

Optionally, to reduce the number of linked lists in the write and/or read linked list group of each die, multi-level linked list groups can be built in this embodiment of the present invention. As shown in FIG. 8, taking three levels as an example, the first-level and second-level groups are disk-level linked list groups, and the third level is a die-level linked list group.

As shown in FIG. 8, the first-level group includes x linked lists, and the interval of the time ranges represented by the heads of the lists in this group is 1 s; the number of lists x is determined by the maximum execution time of write requests. The head of the first list, T+1 s, represents the time range T to T+1 s, where T is the current system time. The list with head T+2 s represents the range T+1 s to T+2 s, and so on; the list with head T+x s represents the range T+(x-1) s to T+x s.

The second-level group further divides the time range represented by the head of the first list of the first-level group into multiple lists. For example, as shown in FIG. 8, in this embodiment of the present invention the time range represented by the first list of the second-level group is divided at a granularity of 5 ms.

The third-level lists are the linked list group corresponding to each die; the third-level group further divides the time range represented by the head of the first list of the second-level group, for example at a granularity of 200 μs.

When the disk processor 1021 needs to write the data to be written of a write request into the storage area 1023, it first mounts the write request, according to the execution time in the write request, into the linked list of the first-level group corresponding to that execution time. For example, if the current system time T is 11h:25m:30s and the execution time of the write request is 11h:25m:32s:66ms, then since the execution time of the write request is T+2s66ms and falls within the range T+2s to T+3s, the write request is mounted into the third linked list, whose head is T+3s. As time passes, when the current system time becomes 11h:25m:32s, the head of the third list of the first-level group becomes T+1s, that is, it becomes the first list of the first-level group, and the execution time of the write request mounted in the third list becomes T+66ms; the disk processor 1021 then mounts the write requests of this first list into the lists of the second-level group according to their execution times, for example mounting the write request with execution time T+66ms into the list whose head is T+70ms. Similarly, as time passes, when the current system time T is 11h:25m:32s:65ms, the head of the list that was previously T+70ms becomes T+5ms, and the execution time of the write request becomes T+1ms; the disk processor then mounts the write requests under the head T+5ms into the third-level group according to their execution times, for example mounting the write request with execution time T+1ms into the list whose head is T+1ms. In this way, the disk processor 1021 can execute the write requests in the lists in the order of the execution times of the write requests in the linked list group corresponding to the die.
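One way to realize the three-level linked-list scheme is as a hierarchical timing wheel with 1 s, 5 ms, and 200 μs granularities: buckets are keyed by the absolute slot (deadline divided by the level's granularity), and requests cascade down a level when their coarse slot comes due, mirroring the re-mounting described above. This is a minimal sketch under assumed request objects carrying deadline_us; the class and method names are assumptions.

```python
from collections import defaultdict

LEVELS_US = (1_000_000, 5_000, 200)   # disk level, disk level, die level

class TimingWheel:
    def __init__(self):
        self.buckets = [defaultdict(list) for _ in LEVELS_US]

    def insert(self, req, now_us, level=0):
        # nearer deadlines sink to finer-grained levels, as in FIG. 8
        while (level + 1 < len(LEVELS_US)
               and req.deadline_us - now_us < LEVELS_US[level]):
            level += 1
        self.buckets[level][req.deadline_us // LEVELS_US[level]].append(req)

    def pop_due(self, now_us):
        # cascade requests whose coarse slot has arrived down one level,
        # then hand back the finest level's current 200 us slot for execution
        for level in range(len(LEVELS_US) - 1):
            for req in self.buckets[level].pop(now_us // LEVELS_US[level], []):
                self.insert(req, now_us, level + 1)
        return self.buckets[-1].pop(now_us // LEVELS_US[-1], [])
```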
The above write and read linked lists are only examples; in a specific implementation, the read and write lists need not be distinguished, and read and write requests can all be mounted in one linked list, with the disk processor distinguishing read requests from write requests by the type of the IO requests in the list.

Step S506: when it is determined that the write request cannot be completed after its execution time arrives, and the timeout indication flag carried in the write request indicates an immediate return when the time to finish executing the write request exceeds the execution time of the IO request, execution-failure feedback is returned to the controller 101.

Step S507: when the write request is executed before its execution time arrives, the disk processor 1021 divides the write request into multiple slices. In this embodiment of the present invention, "before the execution time arrives" means that the current time is earlier than the execution time of the write request minus the execution duration of the write request; this ensures that the write request can be completed before the execution time arrives.

Assuming that executing the write request takes 2 ms while executing a read request generally takes 80 μs, the disk processor 1021 slices the write request at a granularity of 100 μs before executing it.

Step S508: each time the disk processor 1021 finishes executing a slice of the write request, it checks the read linked list corresponding to the storage block to confirm whether there is an urgent read request to be processed.

In this embodiment of the present invention, an urgent read request is a read request whose execution time is earlier than the execution time of the write request, or earlier than the current time plus the execution duration of the next slice of the write request plus the execution duration of a read request, or earlier than the current system time plus the execution duration of the next slice plus the execution duration of x read requests, where x is the maximum number of read requests allowed to be executed serially while a write request is suspended once.

Step S509: if there is, the write request is suspended and the read request is executed; when the read request finishes, the next slice of the write request continues.

Step S510: if there is not, the next slice of the write request is executed directly.

Step S511: if the write request finishes executing successfully, the execution result is returned to the controller 101.
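Steps S507-S511 can be sketched as follows, assuming a per-block read list kept sorted by execution time and caller-supplied helpers; the slice granularity and x come from the examples above, while the helper names and return convention are assumptions.

```python
SLICE_US = 100   # write slice granularity from the example above
READ_US = 80     # assumed duration of one read operation

def urgent(read, write, t_us, x=4):
    """Simplified urgency tests of step S508."""
    return (read.deadline_us < write.deadline_us
            or read.deadline_us < t_us + SLICE_US + x * READ_US)

def run_write(write, read_list, do_read, do_write_slice, now_us):
    for _ in range(write.duration_us // SLICE_US):        # S507: slicing
        while read_list and urgent(read_list[0], write, now_us()):
            do_read(read_list.pop(0))                     # S509: preempt
        do_write_slice(write)                             # S510: next slice
    return "ok"                                           # S511: report back
```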
As shown in FIG. 9, which is a flowchart of the method by which the storage device 102 processes a read request:

Step S901: the disk processor 1021 receives a read request.

Step S902: the disk processor 1021 determines whether the data to be read of the read request hits in the cache 1022.

To improve read/write efficiency, the cache 1022 holds frequently accessed hot data and newly written data; so upon receiving a read request, the disk processor first checks, according to the logical address carried in the read request, whether the data to be read is stored in the cache, that is, determines whether the data to be read of the read request hits in the cache 1022.

Step S903: if it hits, the data to be read is returned to the controller 101.

If the data to be read is in the cache, that is, a cache hit, the data to be read in the cache is returned to the controller 101. Step S904: if there is no hit, the storage area to which the data read by the read request belongs is determined, and the storage block where the read request is located is further determined.

If there is no hit in the cache 1022, the data to be read must be read from the physical storage area 1023. When reading it, the storage area where the data to be read resides is first determined from the logical address of the read request; if the storage area is a RAID composed of multiple storage blocks, the storage block where the read request is located can be further determined. If the storage area consists of one storage block, the determined storage area is that storage block.

Step S905: the read request is mounted into the queue of pending requests corresponding to the storage block.

In this embodiment of the present invention, the queue of pending requests is implemented in the form of linked list groups, and each storage block corresponds to one read linked list group and one write linked list group; in other implementations, the read linked list group and the write linked list group may also be a single linked list group. The structure of the read linked list group is the same as that of the write linked list group; refer to the related descriptions of FIG. 7 and FIG. 8. In the embodiments of the present invention, the linked list group is only an example of the queue of pending requests; in practical applications, other ways of sorting IO requests by execution time also fall within the protection scope of the present invention.

Step S906: if the read request is executed before its execution time arrives, the read data is returned to the controller after the read request finishes executing.

Step S907: if the read request has not finished executing when its execution time arrives, and the timeout indication flag carried in the read request indicates an immediate return when the current time exceeds the execution time of the read request and the read request has not yet been scheduled for execution, execution-failure feedback is returned to the controller 101. After receiving the execution-error feedback, the controller 101 may re-issue the read request. If the timeout indication flag carried in the IO request indicates that no error is to be returned when the time to execute the IO request exceeds its execution time, the storage device 102 takes no action on the read request; when the host determines that the IO has timed out, it resends the read request.
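Steps S906 and S907 reduce to a small completion policy; the return conventions below are illustrative assumptions rather than the patent's interface.

```python
def complete_read(read, data, now_us):
    if now_us <= read.deadline_us:
        return ("ok", data)            # S906: finished before the deadline
    if read.return_on_timeout:         # S907: flag says report the error
        return ("error", None)         # controller may re-issue the read
    return None                        # flag clear: stay silent, host retries
```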
In this embodiment of the present invention, to allow the read request to be processed as soon as possible, if the operation mode of the storage area where the data to be read of the read request is located is the read+write mode, the disk processor schedules the read request during execution of the write request; for details, refer to the descriptions of steps S507 and S508 in FIG. 5.

If the operation mode of the storage area where the data to be read is located is the read+erase mode, the disk processor schedules the read request during execution of the erase operation. The specific method is as follows: assuming one read operation takes 80 μs and one erase operation takes 10 ms, the erase operation is divided into 50 slices, each corresponding to 200 μs. Every 200 μs, the disk processor determines whether the erase operation needs to be suspended to respond to an urgent read request; if there is an urgent read request, the read request is executed, and the next slice is executed after the urgent read request finishes. An urgent read request is a read request whose execution time is earlier than the execution time of the erase request, or earlier than the current time plus the execution duration of the next slice of the erase request, or earlier than the current system time plus the execution duration of the next slice of the erase request plus the execution duration of x read requests, where x is the maximum number of serial read requests allowed while a write request is suspended once.
As shown in FIG. 10, which is a schematic structural diagram of a controller provided by an embodiment of the present application, in a specific example the controller may be the controller 101 in FIG. 1 above.

In a specific embodiment, the controller 101 includes a type determination module 1001, a setting module 1002, and a sending module 1003. The type determination module 1001 is used to determine the type of an IO request; for the way the controller determines the type of an IO request and for the classification of IO requests, refer to the description of step S401 in FIG. 4.

The setting module 1002 is used to set an execution time for the IO request according to the type of the IO request; for the way the setting module 1002 sets the execution time for each IO request, refer to the related description of step S402.

Optionally, the setting module 1002 adds a field to the IO request to carry the execution time. In addition, the setting module 1002 also adds another field to the IO request to carry a timeout indication flag, which indicates whether to return immediately when the time to execute the IO request exceeds the execution time of the IO request.

The sending module 1003 is used to send the IO request to the storage device 102 accessed by the IO request. The function performed by the sending module 1003 corresponds to step S403 in FIG. 4; refer to the related description of step S403 in FIG. 4.

In other embodiments of the present invention, the controller 101 may also include only the setting module 1002 and the sending module 1003.
As shown in FIG. 11, which is a schematic structural diagram of a storage device provided by an embodiment of the present application, in a specific example the storage device may be the storage device 102 in FIGS. 1 and 2 above.

In a specific embodiment, the storage device includes an acquisition module 1101 and an execution module 1104.

The acquisition module 1101 is used to obtain an IO request, which may be an IO request received from the controller 101 or an IO request generated by a background operation of the storage device, such as cache flushing or garbage collection. When the IO request is a write request, the acquisition module 1101 also determines whether the write mode of the storage device 102 is the write-back mode; if it is, after writing the data to be written of the IO request into the cache 1022, the acquisition module 1101 returns a write-complete feedback instruction to the controller 101. When the IO request is a read request, the acquisition module determines whether the read request hits in the cache 1022 and, if it hits, returns a read-hit feedback instruction to the controller 101. For the functions performed by the acquisition module 1101, refer to the descriptions of steps S501 to S503 in FIG. 5 and steps S901 to S903 in FIG. 9.

The execution module 1104 is used to execute the IO request according to the execution time of the IO request. When the IO request is a write request and the execution module 1104 determines that the write request cannot be completed after its execution time arrives, and the timeout indication flag carried in the write request indicates an immediate return when the time to execute the write request exceeds the execution time of the IO request, execution-failure feedback is returned to the controller 101. When the write request is executed before the execution time arrives, the write request is divided into multiple slices; each time the execution module 1104 finishes executing a slice of the write request, it checks the read linked list corresponding to the storage block to confirm whether there is an urgent read request to be processed; if there is, it suspends the write request, executes the read request, and continues with the next slice of the write request after the read request finishes; if there is not, it directly executes the next slice of the write request; if the write request finishes successfully, the execution result is returned to the controller 101. The functions performed by the execution module 1104 are the same as those performed by steps S506-S511 in FIG. 5; for details, refer to steps S506 to S511 in FIG. 5. When the IO request is a read request and the read request finishes executing before its execution time arrives, the execution module 1104 returns the read data to the controller 101; if the execution time arrives and the read request has not finished executing, and the timeout indication flag carried in the IO request indicates that an error is to be returned when the time to execute the IO request exceeds its execution time, an execution error is returned to the controller 101. After receiving the execution-error feedback, the controller 101 may re-issue the read request. If the timeout indication flag carried in the IO request indicates that no error is to be returned when the time to execute the IO request exceeds its execution time, the storage device 102 takes no action on the read request; when the host determines that the IO has timed out, it resends the read request. For details, refer to the descriptions of steps S907 and S908 in FIG. 9.

When the execution module 1104 executes an erase request, it divides the erase request into multiple slices; after each slice is executed, the execution module 1104 determines whether the erase operation needs to be suspended to respond to an urgent read request; if there is an urgent read request, it executes the read request and executes the next slice after the urgent read request finishes. An urgent read request is a read request whose execution time is earlier than the execution time of the erase request, or earlier than the current time plus the execution duration of the next slice of the erase request, or earlier than the current system time plus the execution duration of the next slice of the erase request plus the execution duration of x read requests, where x is the maximum number of serial read requests allowed while a write request is suspended once.

In another embodiment, the storage device further includes a selection module 1102, which is used to determine the storage block operated on by the IO request; for how the selection module 1102 determines the storage block when the IO request is a write request, refer to step S504 in FIG. 5 and the related description of FIG. 6; when the IO request is a read request, refer to the related description of step S904 in FIG. 9.

In another embodiment, building on the above embodiments, the storage device 102 further includes a sorting module 1103, which is used to insert the IO request, after the storage block operated on by the IO request is determined, into the IO queue corresponding to the storage block according to the execution time of the IO request. In this embodiment of the present invention, the IO queue is a linked list group; for the form of the linked list group, refer to the descriptions of FIG. 7 and FIG. 8. The functions performed by the sorting module 1103 are the same as those performed by step S505 in FIG. 5 and step S905 in FIG. 9; for details, refer to the related descriptions of step S505 in FIG. 5 and step S905 in FIG. 9.
Continuing with FIG. 11, the present invention provides another embodiment of a storage device, in which the storage device includes an acquisition module 1101, a selection module 1102, and an execution module 1104.

The acquisition module 1101 is used to receive a write request carrying data to be written. The selection module 1102 is used to select, from the multiple storage areas, a storage area on which no erase operation is being performed; for how the selection module 1102 determines the storage block to operate on when the IO request is a write request, refer to step S504 in FIG. 5 and the related description of FIG. 6; when the IO request is a read request, refer to the related description of step S904 in FIG. 9. The execution module 1104 is used to write the data to be written into the selected storage area; for details, refer to steps S506-S511 in FIG. 5 and the related descriptions of steps S907 and S908 in FIG. 9.
In an embodiment of the present invention, the controller 101 further includes a processor and a memory; the memory may be a cache, and the cache is used to store control instructions.

The processor is used to obtain an IO request and add the execution time of the IO request to the IO request, where the execution time is used to instruct the storage device to finish processing the IO request before the execution time arrives, and to send the IO request to the storage device. For the specific way the processor performs the above processing, refer to the detailed description of the embodiment corresponding to FIG. 4. It can be understood that, in another implementation, the control instructions may be stored in the memory 103 of the server, and the processor may read the control instructions in the memory 103 and then perform the above processing.
It should be noted that the modules in the foregoing embodiments may be implemented by software, hardware, or a combination of the two. When any of the above modules or units is implemented in software, the software exists in the form of computer program instructions and is stored in a memory, and a processor may be used to execute the program instructions to implement the above method flows. The processor may include, but is not limited to, at least one of the following computing devices that run software: a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a microcontroller unit (MCU), or an artificial intelligence processor; each computing device may include one or more cores for executing software instructions to perform operations or processing. The processor may be built into an SoC (system on chip) or an application-specific integrated circuit (ASIC), or may be an independent semiconductor chip. Besides the cores used to execute software instructions for computation or processing, the processor may further include necessary hardware accelerators, such as a field programmable gate array (FPGA), a programmable logic device (PLD), or a logic circuit that implements dedicated logic operations.

When the above modules or units are implemented in hardware, the hardware may be any one or any combination of a CPU, a microprocessor, a DSP, an MCU, an artificial intelligence processor, an ASIC, an SoC, an FPGA, a PLD, a dedicated digital circuit, a hardware accelerator, or a non-integrated discrete device, which may run the necessary software or operate without depending on software to perform the above method flows.
The foregoing are only specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (43)
- A data processing method, executed by a controller that communicates with a storage device, the method comprising: adding, to an IO request, an execution time of the IO request, where the execution time is used to instruct the storage device to finish processing the IO request before the execution time arrives; and sending the IO request to the storage device.
- The method according to claim 1, further comprising: adding a timeout indication flag to the IO request, where the timeout indication flag is used to indicate whether the storage device returns error information when the execution time has been exceeded and the IO request has not yet been processed by the storage device, the error information indicating that execution of the IO request failed.
- The method according to claim 1 or 2, further comprising: determining the type of the IO request, and determining the execution time of the IO request according to the type of the IO request.
- The method according to claim 3, wherein determining the execution time of the IO request according to the type of the IO request comprises: determining an execution duration of the IO request according to the type of the IO request; and adding the execution duration of the IO request to the current time of the controller to obtain the execution time of the IO request.
- A data processing method, executed by a storage device, the method comprising: obtaining an IO request, where the IO request includes an execution time used to instruct the storage device to finish processing the IO request before the execution time arrives; and executing the IO request according to the execution time of the IO request.
- The method according to claim 5, wherein the storage device includes multiple storage blocks, and the method further comprises: after the IO request is obtained, determining the storage block accessed by the IO request; and placing the IO request, according to the execution time, into a queue of pending requests corresponding to the storage block; and wherein executing the IO request according to the execution time of the IO request comprises: executing the IO request according to the execution times of the IO requests in the queue.
- The method according to claim 6, wherein the storage device includes multiple storage areas, each storage area is composed of at least one storage block, the IO request is a write request, and determining the storage block accessed by the IO request comprises: selecting, from the multiple storage areas, a storage area on which no erase operation is being performed; and determining, according to the selected storage area, the storage block accessed by the IO request.
- The method according to claim 7, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area supports two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when used for performing erase operations; and selecting, from the multiple storage areas, a storage area on which no erase operation is being performed comprises: selecting, from the multiple storage areas, a storage area in the read+write mode; determining whether the number of free sub-blocks in the selected storage area is below a threshold; and when the number of free sub-blocks in the selected storage area is not below the threshold, taking the selected storage area in the read+write mode as the storage area on which no erase operation is being performed.
- The method according to claim 8, wherein, when the number of free sub-blocks in the selected storage area is below the threshold, the mode of the storage area is set to the read+erase mode; whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is taken as the storage area on which no erase operation is being performed.
- The method according to claim 7, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area supports two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when an erase operation is being performed on it; and determining the storage block accessed by the IO request comprises: selecting, from the multiple storage areas, a storage area in the read+write mode; determining whether the read/write pressure borne by the selected storage area exceeds a threshold; and when the read/write pressure borne by the selected storage area does not exceed the threshold, taking the selected storage area as the storage area on which no erase operation is being performed.
- The method according to claim 10, wherein, when the read/write pressure borne by the selected storage area exceeds the threshold, whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is selected as the storage area on which no erase operation is being performed.
- The method according to claim 9 or 11, wherein the storage areas further have a read+write+erase mode, in which read, write, and erase operations can be performed in the storage area; and the method further comprises: when there is no storage area in neither the read+write mode nor the read+erase mode, converting the modes of all of the multiple storage areas to the read+write+erase mode.
- The method according to any one of claims 5 to 12, further comprising: dividing a write request or erase request to be executed into multiple slices; after each slice is executed, determining whether there is an urgent read request to be processed, where the urgent read request is a read request whose execution time is earlier than the execution time of the write request or erase request; and if there is an urgent read request to be processed, suspending the write request or erase request and executing the urgent read request.
- The method according to any one of claims 5 to 12, wherein a write request or erase request to be executed is divided into multiple slices; after each slice is executed, whether there is an urgent read request to be processed is determined, where the urgent read request is a read request whose execution time is earlier than the execution time of the next slice plus the execution duration of x serially executed read requests, x being the maximum number of read requests allowed to be executed serially while a write request or erase request is suspended once; and if there is an urgent read request to be processed, the write request or erase request is suspended and the urgent read request is executed.
- A data processing system, comprising: a controller, configured to add, to an IO request, an execution time of the IO request, where the execution time is used to instruct a storage device to finish processing the IO request before the execution time arrives, and to send the IO request to the storage device; and the storage device, configured to receive the IO request and execute the IO request according to the execution time of the IO request.
- A data processing method, applied to a storage device, wherein the storage device includes multiple storage areas, and the method comprises: receiving a write request, where the write request carries data to be written; selecting, from the multiple storage areas, a storage area on which no erase operation is being performed; and writing the data to be written into the selected storage area.
- The method according to claim 16, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area supports two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when used for performing erase operations; and selecting, from the multiple storage areas, a storage area on which no erase operation is currently being performed comprises: selecting, from the multiple storage areas, a storage area in the read+write mode; determining whether the number of free sub-blocks in the selected storage area is below a threshold; and when the number of free sub-blocks in the selected storage area is not below the threshold, taking the selected storage area in the read+write mode as the storage area on which no erase operation is being performed.
- The method according to claim 17, wherein, when the number of free sub-blocks in the selected storage area is below the threshold, the mode of the storage area is set to the read+erase mode; whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is taken as the storage area on which no erase operation is being performed.
- The method according to claim 16, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area supports two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when an erase operation is being performed on it; and selecting, from the multiple storage areas, a storage area on which no erase operation is currently being performed comprises: selecting, from the multiple storage areas, a storage area in the read+write mode; determining whether the read/write pressure borne by the selected storage area exceeds a threshold; and when the read/write pressure borne by the selected storage area does not exceed the threshold, taking the selected storage area as the storage area on which no erase operation is being performed.
- The method according to claim 19, wherein, when the read/write pressure borne by the selected storage area exceeds the threshold, whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is selected as the storage area on which no erase operation is being performed.
- The method according to claim 18 or 20, wherein the storage areas further have a read+write+erase mode, in which read, write, and erase operations can be performed in the storage area; and the method further comprises: when there is no storage area in neither the read+write mode nor the read+erase mode, converting the modes of all of the multiple storage areas to the read+write+erase mode.
- The method according to any one of claims 16 to 21, further comprising: receiving a read request, where the read request and the write request include execution times used to instruct the storage device to process the read request or write request before the execution time arrives; and executing the read request or write request according to the execution time of the read request or write request.
- A controller that communicates with a storage device, comprising: a setting module, configured to add, to an IO request, an execution time of the IO request, where the execution time is used to instruct the storage device to finish processing the IO request before the execution time arrives; and a sending module, configured to send the IO request to the storage device.
- The controller according to claim 23, wherein the setting module is further configured to: add a timeout indication flag to the IO request, where the timeout indication flag is used to indicate whether the storage device returns error information when the execution time has been exceeded and the IO request has not yet been processed by the storage device, the error information indicating that execution of the IO request failed.
- The controller according to claim 23 or 24, further comprising: a type determination module, configured to determine the type of the IO request and determine the execution time of the IO request according to the type of the IO request.
- The controller according to claim 25, wherein, when determining the execution time of the IO request according to the type of the IO request, the type determination module is specifically configured to: determine the execution duration of the IO request according to the type of the IO request; and add the execution duration of the IO request to the current time to obtain the execution time of the IO request.
- A storage device, comprising: an acquisition module, configured to obtain an IO request, where the IO request includes an execution time used to instruct the storage device to finish processing the IO request before the execution time arrives; and an execution module, configured to execute the IO request according to the execution time of the IO request.
- The storage device according to claim 27, further comprising multiple storage blocks, a selection module, and a sorting module, wherein the selection module is configured to determine, after the IO request is obtained, the storage block accessed by the IO request; the sorting module is configured to place the IO request, according to the execution time, into a queue of pending requests corresponding to the storage block; and when executing the IO request according to the execution time of the IO request, the execution module is specifically configured to execute the IO request according to the execution times of the IO requests in the queue.
- The storage device according to claim 28, wherein the storage device includes multiple storage areas, each storage area is composed of at least one storage block, the IO request is a write request, and when determining the storage block accessed by the IO request, the selection module is specifically configured to: select, from the multiple storage areas, a storage area on which no erase operation is being performed; and determine, according to the selected storage area, the storage block accessed by the IO request.
- The storage device according to claim 29, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area has two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when used for performing erase operations; and when selecting, from the multiple storage areas, a storage area on which no erase operation is currently being performed, the selection module is specifically configured to: select, from the multiple storage areas, a storage area in the read+write mode; determine whether the number of free sub-blocks in the selected storage area is below a threshold; and when the number of free sub-blocks in the storage area selected by the selection module is not below the threshold, take the selected storage area in the read+write mode as the storage area on which no erase operation is being performed.
- The storage device according to claim 30, wherein, when the number of free sub-blocks in the selected storage area is below the threshold, the mode of the storage area is set to the read+erase mode; whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is taken as the storage area on which no erase operation is being performed.
- The storage device according to claim 29, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area supports two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when an erase operation is being performed on it; and when selecting, from the multiple storage areas, a storage area on which no erase operation is currently being performed, the selection module is specifically configured to: select, from the multiple storage areas, a storage area in the read+write mode; determine whether the read/write pressure borne by the selected storage area exceeds a threshold; and when the read/write pressure borne by the storage area selected by the selection module does not exceed the threshold, take the selected storage area as the storage area on which no erase operation is being performed.
- The storage device according to claim 32, wherein, when the read/write pressure borne by the selected storage area exceeds the threshold, whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is selected as the storage area on which no erase operation is being performed.
- The storage device according to claim 31 or 33, wherein the storage areas further have a read+write+erase mode, in which read, write, and erase operations can be performed in the storage area; and the selection module is further configured to: when there is no storage area in neither the read+write mode nor the read+erase mode, convert the modes of all of the multiple storage areas to the read+write+erase mode.
- The storage device according to any one of claims 27 to 34, wherein, when executing a write request or erase request, the execution module is specifically configured to: divide the write request or erase request to be executed into multiple slices; after each slice is executed, determine whether there is an urgent read request to be processed, where the urgent read request is a read request whose execution time is earlier than the current time plus the execution duration of the next slice; and if there is an urgent read request to be processed, suspend the write request or erase request and execute the urgent read request.
- The storage device according to any one of claims 27 to 34, wherein, when executing a write request or erase request, the execution module is specifically configured to: divide the write request or erase request to be executed into multiple slices; after each slice is executed, determine whether there is an urgent read request to be processed, where the urgent read request is a read request whose execution time is earlier than the current time plus the execution duration of the next slice plus the execution duration of x serially executed read requests, x being the maximum number of read requests allowed to be executed serially while a write request or erase request is suspended once; and if there is an urgent read request to be processed, suspend the write request or erase request and execute the urgent read request.
- A storage device, wherein the storage device includes multiple storage areas and further comprises: an acquisition module, configured to receive a write request, where the write request carries data to be written; a selection module, configured to select, from the multiple storage areas, a storage area on which no erase operation is being performed; and an execution module, configured to write the data to be written into the selected storage area.
- The storage device according to claim 37, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area supports two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when used for performing erase operations; and when selecting, from the multiple storage areas, a storage area on which no erase operation is being performed, the selection module is specifically configured to: select, from the multiple storage areas, a storage area in the read+write mode; determine whether the number of free sub-blocks in the selected storage area is below a threshold; and when the number of free sub-blocks in the storage area selected by the selection module is not below the threshold, take the selected storage area in the read+write mode as the storage area on which no erase operation is being performed.
- The storage device according to claim 38, wherein, when the number of free sub-blocks in the selected storage area is below the threshold, the mode of the storage area is set to the read+erase mode; whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is taken as the storage area on which no erase operation is being performed.
- The storage device according to claim 37, wherein each storage block includes multiple sub-blocks, a sub-block being the smallest unit on which the erase operation is performed; each storage area has two modes, a read+write mode and a read+erase mode; a storage area is set to the read+write mode when used for writing data and to the read+erase mode when an erase operation is being performed on it; and when selecting, from the multiple storage areas, a storage area on which no erase operation is being performed, the selection module is specifically configured to: select, from the multiple storage areas, a storage area in the read+write mode; determine whether the read/write pressure borne by the selected storage area exceeds a threshold; and when the read/write pressure borne by the storage area selected by the selection module does not exceed the threshold, take the selected storage area as the storage area on which no erase operation is being performed.
- The storage device according to claim 40, wherein, when the read/write pressure borne by the selected storage area exceeds the threshold, whether there is a storage area in neither the read+write mode nor the read+erase mode is determined; and when there is a storage area in neither the read+write mode nor the read+erase mode, that storage area is selected as the storage area on which no erase operation is being performed.
- The storage device according to claim 39 or 41, wherein the storage areas further have a read+write+erase mode, in which read, write, and erase operations can be performed in the storage area; and the selection module is further configured to: when there is no storage area in neither the read+write mode nor the read+erase mode, convert the modes of all of the multiple storage areas to the read+write+erase mode.
- The storage device according to any one of claims 37 to 42, wherein the acquisition module is further configured to receive a read request, where the read request and the write request include execution times used to instruct the storage device to process the read request or write request before the execution time arrives; and the execution module is configured to execute the read request or write request according to the execution time of the read request or write request.
Priority Applications (2)

| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| EP19899326.3A | EP3879393A4 (en) | 2018-12-16 | 2019-04-03 | Data Processing Method, Control Unit, Storage Device and Storage System |
| US17/347,041 | US11954332B2 (en) | 2018-12-16 | 2021-06-14 | Data processing method, controller, storage device, and storage system |
Applications Claiming Priority (4)

| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN201811538076 | | 2018-12-16 | | |
| CN201811538076.4 | | 2018-12-16 | | |
| CN201811571773.XA | CN111324296B (zh) | 2018-12-16 | 2018-12-21 | Data processing method, controller, storage device and storage system |
| CN201811571773.X | | 2018-12-21 | | |
Related Child Applications (1)

| Application Number | Relation | Publication | Priority Date | Filing Date |
|---|---|---|---|---|
| US17/347,041 | Continuation | US11954332B2 (en) | 2018-12-16 | 2021-06-14 |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2020124867A1 (zh) | 2020-06-25 |
Family ID: 71102035
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2019/081221 (WO2020124867A1) | Data processing method, controller, storage device and storage system | | 2019-04-03 |
Patent Citations (6)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102843366A (zh) * | 2012-08-13 | 2012-12-26 | 北京百度网讯科技有限公司 | Method and device for controlling access permissions to network resources |
| US20140351747A1 (en) * | 2013-05-24 | 2014-11-27 | Canon Anelva Corporation | Information processing apparatus for processing plural event data generated by processing apparatus |
| CN105677744A (zh) * | 2015-12-28 | 2016-06-15 | 曙光信息产业股份有限公司 | Method and apparatus for improving quality of service in a file system |
| CN106998317A (zh) * | 2016-01-22 | 2017-08-01 | 高德信息技术有限公司 | Method and apparatus for identifying abnormal access requests |
| CN107305473A (zh) * | 2016-04-21 | 2017-10-31 | 华为技术有限公司 | Method and apparatus for scheduling IO requests |
| CN106598878A (zh) * | 2016-12-27 | 2017-04-26 | 湖南国科微电子股份有限公司 | Method for separating cold and hot data of a solid state disk |
Cited By (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20210081770A1 (en) * | 2019-09-17 | 2021-03-18 | GOWN Semiconductor Corporation | System architecture based on SoC FPGA for edge artificial intelligence computing |
| US11544544B2 (en) * | 2019-09-17 | 2023-01-03 | Gowin Semiconductor Corporation | System architecture based on SoC FPGA for edge artificial intelligence computing |
| US20220197862A1 (en) * | 2020-12-17 | 2022-06-23 | SK Hynix Inc. | Journaling apparatus and method in a non-volatile memory system |
| US11704281B2 (en) * | 2020-12-17 | 2023-07-18 | SK Hynix Inc. | Journaling apparatus and method in a non-volatile memory system |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19899326; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2019899326; Country of ref document: EP; Effective date: 20210607 |
| | NENP | Non-entry into the national phase | Ref country code: DE |