CN110764710A - Low-latency, high-IOPS data access method and storage system - Google Patents

Low-latency, high-IOPS data access method and storage system

Info

Publication number: CN110764710A (application CN201911036827.7A); granted as CN110764710B
Authority: CN (China)
Original language: Chinese (zh)
Inventor: 王田
Applicant and assignee: Beijing Memblaze Technology Co Ltd
Legal status: Granted; Active

Classifications

    • G06F3/0611 — Improving I/O performance in relation to response time
    • G06F3/0656 — Data buffering arrangements
    • G06F3/0679 — Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]


Abstract

The invention discloses a data access method in a storage system, where the storage system comprises a first group of CPU cores and a second group of CPU cores: each of the first group of CPU cores is dedicated to running one of a plurality of first threads, and the second group of CPU cores runs a plurality of second threads. The method comprises the following steps: obtaining a write request through a first thread and writing its data to a buffer; and writing, by a second thread, the data in the buffer to a non-volatile storage device. The invention balances the latency, IOPS, and bandwidth of the storage system; in particular, for new flash-based storage systems, the proposed architecture offers clear latency and IOPS advantages over the conventional aggregation scheme.

Description

Low-latency, high-IOPS data access method and storage system
Technical Field
The present invention relates to a storage system software architecture, and in particular, to a method and an apparatus for accessing data in a storage system.
Background
Aggregation is often used in conventional IO software to provide high throughput. In this design, multiple IO requests are aggregated and processed in bulk. The IO processing path can generally be divided into relatively independent parts, such as compression and deduplication, each of which may be handled in a pipelined manner as a subtask, and each subtask can itself be aggregated to improve throughput. Traditionally, aggregation has also helped improve the efficiency of networks, disks, and similar devices.
The use of an IO scheduler or a read-ahead strategy can likewise be regarded as a form of aggregation.
Disclosure of Invention
Although the aggregation approach can achieve high bandwidth, it cannot achieve high IOPS (Input/Output Operations Per Second) at low latency and low queue depth. An individual IO among the aggregated IOs must wait until the whole batch completes before its completion can be reported to the upper layer, which significantly increases latency. At the same time, the task switching introduced by the multitasking CPUs in the storage system further introduces unpredictable latency. Newer high-speed storage media such as SSDs can deliver very high bandwidth and IOPS even for random accesses, especially for read requests, so the throughput advantage of existing aggregation methods disappears. Meanwhile, networks have developed rapidly and now provide low latency along with high bandwidth. The existing aggregated IO mode thus becomes an obstacle to raising IOPS and reducing latency.
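The latency cost of batched completion reporting can be illustrated with a toy model (the function name and all numbers below are assumed for illustration only, not taken from the patent): if IOs are served sequentially at a fixed per-IO service time, reporting each completion as soon as it finishes yields a much lower mean latency than reporting the whole batch at once.

```python
def mean_completion_latency_us(service_us, batch_size, report_per_io):
    # Toy model: IO i finishes device service at (i + 1) * service_us.
    finish_times = [(i + 1) * service_us for i in range(batch_size)]
    if report_per_io:
        # Completion reported as soon as each IO finishes.
        return sum(finish_times) / batch_size
    # Aggregated mode: every IO waits for the whole batch to complete.
    return float(finish_times[-1])

# With an assumed 10 us per IO and a batch of 8:
per_io_mean = mean_completion_latency_us(10, 8, True)    # 45.0 us
batched_mean = mean_completion_latency_us(10, 8, False)  # 80.0 us
```

Under this model the aggregated mode nearly doubles the mean reported latency, which matches the qualitative argument above.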
The invention aims to overcome the defect that the prior art cannot achieve both high IOPS and low latency on new high-speed storage media, and ultimately to balance the three metrics of high IOPS, low latency, and high bandwidth.
According to an aspect of the present invention, there is provided a data access method in a storage system, wherein the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores being dedicated to running one of a plurality of first threads and the second group of CPU cores being configured to run a plurality of second threads, the method including: obtaining a write request through one of the plurality of first threads and writing its data to a buffer; and writing, by one of the second threads, the data in the buffer to a non-volatile storage device.
According to an embodiment of the present invention, the method further comprises: sending completion information for the write request to the write request issuer through the first thread.
According to one embodiment of the invention, the data in the buffer is aggregated by the second thread and then written to the storage device.
According to one embodiment of the invention, the number of write requests per fetch by the first thread is a small integer close to 1.
According to one embodiment of the invention, wherein some of the plurality of second threads are dedicated to writing data to the first storage device and others of the plurality of second threads are dedicated to writing data to the second storage device.
According to one embodiment of the invention, the second thread writes the data in the buffer to a storage device after compressing the data.
According to one embodiment of the present invention, wherein the CPU cores are divided into a first group of CPU cores and a second group of CPU cores using cgroup.
According to one embodiment of the invention, taskset is used to assign each first thread a CPU core dedicated to it.
According to one embodiment of the invention, write requests are obtained by polling or by interrupt.
According to one embodiment of the present invention, the write request issuing party is a network card.
According to a second aspect of the present invention, there is also provided a data access apparatus in a storage system, wherein the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores is dedicated to run one of a plurality of first threads, the second group of CPU cores is configured to run a plurality of second threads, the apparatus includes: means for obtaining a write request by one of the plurality of first threads and writing data to a buffer; means for writing data in the buffer to a non-volatile storage device by the second thread.
According to a third aspect of the present invention, there is also provided a data access method in a storage system, wherein the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores being dedicated to running one of a plurality of first threads and the second group of CPU cores being configured to run a plurality of second threads, the method including: obtaining a read request through one of the first threads and sending the read request to a storage device; and, in response to the storage device completing the read request, returning data read from the storage device to the read request issuer through one of the first threads.
According to one embodiment of the third aspect of the present invention, the number of read requests fetched at a time by one of the first threads is 1 or a small integer close to 1.
According to one embodiment of the third aspect of the present invention, the number of read requests whose data is returned from the storage device at a time by one of the first threads is 1 or a small integer close to 1.
According to one embodiment of the third aspect of the present invention, the read request issuer is a network card.
According to one embodiment of the third aspect of the present invention, the CPU cores are divided into the first group of CPU cores and the second group of CPU cores by using cgroup.
According to one embodiment of the third aspect of the present invention, taskset is used to assign each first thread a CPU core dedicated to it.
According to one embodiment of the third aspect of the present invention, the read request is obtained by polling or by interrupt.
According to a fourth aspect of the present invention, there is also provided a data access apparatus in a storage system, wherein the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores being dedicated to running one of a plurality of first threads and the second group of CPU cores being configured to run a plurality of second threads, the apparatus including: means for obtaining a read request through one of the plurality of first threads and sending the read request to a storage device; and means for returning data read from the storage device to the read request issuer, through one of the first threads, in response to the storage device completing the read request.
According to a fifth aspect of the present invention, there is also provided a data access method in a storage system, wherein the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores being dedicated to running one of a plurality of first threads and the second group of CPU cores being configured to run a plurality of second threads, the method including: obtaining a write request through one of the plurality of first threads and writing its data to a buffer; writing the data in the buffer to a non-volatile storage device through one of the plurality of second threads; obtaining a read request through one of the first threads and sending the read request to a storage device; and, in response to the storage device completing the read request, returning data read from the storage device to the read request issuer through one of the first threads.
According to the fifth aspect of the present invention, there is also provided a data access apparatus in a storage system, wherein the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores being dedicated to running one of a plurality of first threads and the second group of CPU cores being configured to run a plurality of second threads, the apparatus including: means for obtaining a write request through one of the plurality of first threads and writing its data to a buffer; means for writing the data in the buffer to a non-volatile storage device through one of the plurality of second threads; means for obtaining a read request through one of the plurality of first threads and sending the read request to a storage device; and means for returning data read from the storage device to the read request issuer, through one of the first threads, in response to the storage device completing the read request.
According to a sixth aspect of the present invention there is provided a computer program comprising computer program code which, when loaded into a computer system and executed thereon, causes the computer system to perform a method of data access in a storage system provided in accordance with an aspect of the present invention or a method of data access in a storage system provided in accordance with a third aspect of the present invention or a method of data access in a storage system provided in accordance with a fifth aspect of the present invention.
According to a seventh aspect of the present invention, there is provided a program comprising program code which, when loaded into a storage device and executed thereon, causes the storage device to perform a method of data access in a storage system provided according to an aspect of the present invention or a method of data access in a storage system provided according to a third aspect of the present invention or a method of data access in a storage system provided according to a fifth aspect of the present invention.
The invention balances the latency, IOPS, and bandwidth of the storage system; in particular, for new flash-based storage systems, the proposed architecture offers clear latency and IOPS advantages over the conventional aggregation scheme. The invention is applicable not only to flash-based storage systems but also to storage systems using storage media such as magnetic disks, XPoint, PCRAM, MRAM, RRAM, and FeRAM.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Throughout the drawings, like reference numerals designate like parts. In the drawings:
FIG. 1 shows a schematic diagram of a CPU core and thread settings according to one embodiment of the invention;
FIG. 2A illustrates a flow diagram of a method of data access in a storage system according to one embodiment of the invention;
FIG. 2B illustrates a flow diagram of a method of data access in a storage system according to one embodiment of the invention;
FIG. 3 is a schematic structural diagram illustrating a data access method in a storage system according to an embodiment of the present invention;
FIG. 4 illustrates a schematic structural diagram of a data access device in a storage system according to another aspect of the present invention;
FIG. 5 illustrates a flow diagram of a method of data access in a storage system according to one embodiment of the invention;
FIG. 6 is a schematic structural diagram illustrating a data access method in a storage system according to an embodiment of the present invention;
FIG. 7 illustrates a schematic structural diagram of a data access device in a storage system, in accordance with another aspect of the present invention;
FIG. 8 illustrates a flow diagram of a method of data access in a storage system according to one embodiment of the invention;
FIG. 9 illustrates a block diagram of a data access apparatus in a storage system in accordance with another aspect of the present invention.
In the drawings, the same or similar reference numbers are used to refer to the same or similar elements.
Detailed Description
The invention is further described with reference to the following figures and detailed description of embodiments.
CPU resources in the storage system are re-partitioned and isolated, the IO processing flow in the storage system is divided into multiple stages, and different stages can be executed by different threads. The IO processing flow is divided into a synchronous stage and an asynchronous stage, and the worker threads are correspondingly divided into first threads and second threads.
FIG. 1 shows a schematic diagram of a CPU core and thread settings according to one embodiment of the present invention.
As shown in FIG. 1, a non-aggregated thread group in the storage system includes a plurality of first threads, each bound to its own dedicated CPU core. The first threads perform the synchronous stage of IO processing, such as the front end of a write request and the processing of a read request. The aggregated thread group includes a plurality of second threads, which share a separate set of CPU cores; the CPU cores used by the second threads do not overlap with those used by the first threads. The second threads perform the asynchronous stage of IO processing, such as the back end of a write request. Preferably, a dedicated CPU core is bound to each first thread, so that a first thread's task is not preempted before it completes; this reduces thread-switching overhead and ensures that the IO processing stage executed by the first thread finishes quickly. The second threads, by contrast, may be scheduled or preempted during execution, which helps the IO processing stage they execute complete efficiently while reducing waste of processor and other resources.
According to one embodiment of the invention, specific implementations of the CPU core and thread settings include, but are not limited to, the use of cgroup and/or taskset. cgroup is used to partition the CPU cores into a first group and a second group, where each of the first group of CPU cores is dedicated to running one of a plurality of first threads and the second group of CPU cores runs a plurality of second threads. The number of CPU cores in the first group equals the number of first threads. The number of second threads is typically greater than the number of CPU cores in the second group, and the second threads are allowed to be scheduled among those cores.
taskset is used to allocate a dedicated CPU core to each first thread; taskset can specify, at fine granularity, that a given first thread may run only on a given CPU core. The setting of CPU cores and threads is usually performed at storage system initialization, and may also be adjusted dynamically during operation as needed.
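As a minimal sketch of this partitioning (the helper names and core counts are hypothetical, not from the patent), the same effect can be expressed with the Linux affinity syscall that taskset itself uses:

```python
import os

def partition_cores(n_cores, n_first):
    """Split core IDs into a first group (one dedicated core per first
    thread) and a second group (shared by all second threads)."""
    first = list(range(n_first))
    second = list(range(n_first, n_cores))
    return first, second

def pin_calling_thread(cores):
    # On Linux, pid 0 applies sched_setaffinity to the calling thread,
    # mirroring what `taskset -c` does when launching a process.
    os.sched_setaffinity(0, set(cores))

# e.g. an assumed 8-core machine running 3 first threads:
first_group, second_group = partition_cores(8, 3)
```

Each first thread would call `pin_calling_thread([its_core])` at startup, while every second thread pins itself to the whole `second_group`, leaving the kernel free to schedule second threads among those shared cores.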
FIG. 2A illustrates a flow diagram of a method for data access in a storage system according to one embodiment of the invention.
Fig. 3 shows a schematic diagram of a data access method in a storage system according to an embodiment of the invention.
As shown in fig. 2A, the data access method in the storage system includes: step S210: acquiring a write request through one of a plurality of first threads and writing data to a buffer; step S220: the data in the buffer is written to the non-volatile storage device by the second thread.
Referring also to FIG. 3, in step S210 a write request is obtained from a write request issuer, which may be a network card, an FC (Fibre Channel) adapter, an InfiniBand card, or the like. In the embodiment of FIG. 3, the write request issuer is a network card. Ways of obtaining write requests include, but are not limited to, polling and interrupts. By way of example, the first thread fetches one write request at a time. In another example, several write requests (e.g., 2 or 3) have already arrived when the first thread polls, and the first thread fetches and processes all of them. In other words, the number of write requests the first thread fetches and processes at a time is 1 or a small integer close to 1 (e.g., 2 or 3); since the goal of the first thread is to speed up write-request processing, it never waits for additional write requests to arrive merely to process several at once. Each first thread runs on its own dedicated CPU core, ensuring that scheduling does not add latency and improving running efficiency.
With continued reference to FIG. 2A and FIG. 3, in step S220 the second thread writes the data in the buffer to the non-volatile storage device. In the embodiment of FIG. 3, the non-volatile storage device is a disk device, and the data in the buffer is written to the disk device by the second thread. This can be organized in at least two ways: each disk device corresponds to one second thread, and each second thread processes the write requests for its own disk device and writes data to it; or a number of second threads process write requests for any disk device and write data to those devices. Disk devices include, but are not limited to, hard disks, solid state drives (SSDs), and the like. In a further embodiment, the second thread aggregates the data of multiple write requests in the buffer before writing it to the disk device. In still further embodiments, additional operations may be performed after aggregation and before writing to disk, including but not limited to deduplication and compression.
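The split write path above can be sketched as a producer/consumer pair (all names, the batch size, and the in-memory stand-ins are illustrative assumptions: a `Queue` stands in for the non-volatile buffer and a list for the disk device):

```python
import queue
import threading

buffer = queue.Queue()   # stands in for the NVDIMM/NVRAM-backed buffer
disk = []                # stands in for the disk device
BATCH = 4                # illustrative aggregation limit

def first_thread_handle(req):
    # Synchronous stage: stage the data, then ack immediately --
    # completion is reported without waiting for the disk write.
    buffer.put(req)
    return "done"

def second_thread_loop(stop):
    # Asynchronous stage: drain the buffer, aggregate up to BATCH
    # requests, and issue one device write per batch.
    while not (stop.is_set() and buffer.empty()):
        try:
            batch = [buffer.get(timeout=0.05)]
        except queue.Empty:
            continue
        while len(batch) < BATCH and not buffer.empty():
            batch.append(buffer.get_nowait())
        disk.append(batch)   # one aggregated write to the device

stop = threading.Event()
writer = threading.Thread(target=second_thread_loop, args=(stop,))
writer.start()
acks = [first_thread_handle(i) for i in range(10)]
stop.set()
writer.join()
```

Note how the second thread aggregates opportunistically: it takes whatever is already in the buffer, up to the batch limit, rather than waiting for a full batch, which is what keeps write-completion latency independent of the disk write.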
FIG. 2B illustrates a flow diagram of a method for data access in a storage system according to one embodiment of the invention.
As shown in FIG. 2B, the data access method in the storage system includes: step S210: obtaining a write request through one of a plurality of first threads and writing its data to a buffer; step S212: sending completion information for the write request to the write request issuer through the first thread; step S220: writing the data in the buffer to the non-volatile storage device through the second thread.
Referring also to FIG. 3, after the first thread writes the data of the write request to the buffer, it sends a message to the write request issuer indicating that processing of the write request is complete (step S212). The write request issuer includes, but is not limited to, a network card, and may also be, for example, an FC (Fibre Channel) adapter or an InfiniBand card.
Those skilled in the art will appreciate that there is no dependency between step S212, performed by the first thread, and step S220, performed by the second thread; they may occur simultaneously. Although step S212 precedes step S220 in FIG. 2B, step S220 may also occur before step S212 in various embodiments.
According to one embodiment of the present invention, the buffer configuration deserves attention during implementation. The buffer size is chosen so that the user's write IOPS is not affected when the write bandwidth of the disk fluctuates. For scenarios with high data-reliability requirements, the buffer must be recoverable after a power failure. To this end, the buffer may be placed on a non-volatile medium, including but not limited to NVDIMM (Non-Volatile Dual In-line Memory Module) or NVRAM (Non-Volatile RAM). The buffer may also be provided by DRAM.
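One way to make the sizing rule concrete (the bandwidth and duration figures below are purely illustrative, not from the patent): the buffer must absorb the shortfall between the incoming user write bandwidth and the disk's worst-case sustained bandwidth for the longest expected dip.

```python
MB = 1024 * 1024

def min_buffer_bytes(user_bw_mb_s, worst_disk_bw_mb_s, dip_seconds):
    # Data accumulates in the buffer at the shortfall rate while the
    # disk underperforms; size for the longest expected dip.
    shortfall = max(0, user_bw_mb_s - worst_disk_bw_mb_s)
    return shortfall * dip_seconds * MB

# e.g. 2000 MB/s of user writes while the disk dips to 1500 MB/s
# for 2 seconds requires at least 1000 MB of buffer.
need = min_buffer_bytes(2000, 1500, 2)
```

If the disk's worst-case bandwidth already exceeds the user write bandwidth, the formula degenerates to zero and the buffer is sized by other concerns (aggregation batch size, power-fail recovery).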
With continued reference to FIG. 3, the stages of the write request processing flow according to an embodiment of the present invention are numbered in FIG. 3. (1) The network card receives a write request from a user or a server; (2) one of the first threads writes the data of the write request received by the network card into the buffer; (3) one of the first threads returns a write-request-complete message to the network card; (4) the network card returns the write-request-complete message to the user or server. The first threads in stages (2) and (3) may be the same thread or different first threads. The data in the buffer is written to the disk device by a second thread: (5) the second thread fetches the data from the buffer and writes it to the disk device. For the same write request, stages (5) and (3) can proceed simultaneously; in the storage system, the first threads' writing of write requests into the buffer and the second threads' fetching of data from the buffer and writing it to the disk device can occur at the same time. In an embodiment according to the present invention, a plurality of first threads execute stages (2) and (3) of the write request processing flow in parallel, and a plurality of second threads execute stage (5) in parallel.
According to another aspect of the present invention, the present invention further provides a data access device in a storage system, where the storage system includes a first group of CPU cores and a second group of CPU cores, where each of the first group of CPU cores is dedicated to run one of a plurality of first threads, and the second group of CPU cores is configured to run a plurality of second threads, as shown in fig. 4, the device includes: means 410 for obtaining a write request by one of the first plurality of threads and writing data to the buffer; means 420 for writing the data in the buffer to the non-volatile storage device by the second thread.
FIG. 5 shows a flow diagram of a method of data access in a storage system according to an embodiment of the invention.
FIG. 6 shows a schematic diagram of a data access method in a storage system according to an embodiment of the invention.
As shown in FIG. 5, the data access method in the storage system includes: step S510: obtaining a read request through one of a plurality of first threads and sending the read request to a storage device; step S520: in response to the storage device completing the read request, returning the data read from the storage device to the read request issuer through one of the plurality of first threads.
Referring to FIG. 6, in step S510 the first thread obtains a read request from a read request issuer and sends it to the storage device. The read request issuer may be a network card, an FC (Fibre Channel) adapter, an InfiniBand card, or the like. In this embodiment, the storage device is a magnetic disk. Ways of obtaining read requests include, but are not limited to, polling and interrupts. The number of read requests the first thread fetches at a time may be 1, or a small integer close to 1 (e.g., 2 or 3). It will be appreciated that the goal of the first thread is to speed up read-request processing, so it does not wait for additional read requests to arrive merely to process several at once.
Operations on the disk device are asynchronous: one of the first threads first sends the read request to the disk device, and a first thread receives a notification after the disk device completes the operation. Sending the read request to the disk device is performed by one of the first threads; receiving the completion notification from the disk device and obtaining the result of the read request (step S520) is also performed by one of the first threads. In further embodiments, additional processing logic may be added to the operations of obtaining the read request and sending it to the disk device, including but not limited to handling cases such as a read request hitting a cache.
Referring to FIG. 6, in step S520 the first thread returns the data obtained from the disk device to the read request issuer. The read request issuer includes, but is not limited to, a network card, and may also be, for example, an FC (Fibre Channel) adapter or an InfiniBand card. In further embodiments, additional processing of the data may be performed in this operation, including but not limited to decompression and decryption. The amount of data returned by this operation may correspond to one read request or to a small number of read requests. Mechanisms for obtaining data from the disk device include, but are not limited to, interrupts and polling. The thread that obtains the data in step S520 and the thread that sends the read request in step S510 may be the same or different. When they are the same thread, that first thread may temporarily yield its CPU core while waiting for the disk device to return the read result.
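The asynchronous read path can be sketched as follows (all names are illustrative assumptions; a thread pool stands in for the disk device's asynchronous completion, and a dict stands in for blocks on the disk):

```python
from concurrent.futures import ThreadPoolExecutor

DISK = {7: b"payload"}        # stands in for blocks on the disk device

def device_read(lba):
    return DISK[lba]          # the device services the request

executor = ThreadPoolExecutor(max_workers=2)  # models the device queue

def handle_read(lba, reply):
    # First thread: submit the read asynchronously; once the device
    # signals completion, return the data to the issuer (e.g. the NIC).
    fut = executor.submit(device_read, lba)
    fut.add_done_callback(lambda f: reply(f.result()))

results = []
handle_read(7, results.append)
executor.shutdown(wait=True)  # wait for the completion to be delivered
```

The submitting call returns immediately, so the first thread is free to fetch the next request; the completion callback plays the role of the notification received from the disk device in step S520.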
With continued reference to FIG. 6, the stages of the read request processing flow according to an embodiment of the present invention are numbered in FIG. 6. (1) The network card receives a read request from a user or a server; (2) one of the first threads sends a read request to the disk device based on the read request received by the network card; (3) one of the first threads receives the read result returned by the disk device and sends it to the network card; (4) the network card returns the result of the read request to the user or server. The first threads in stages (2) and (3) may be the same thread or different first threads. In an embodiment according to the invention, a plurality of first threads execute stages (2) and (3) of the read request processing flow in parallel.
According to another aspect of the present invention, there is also provided a data access apparatus in a storage system, where the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores is dedicated to running one of a plurality of first threads, and the second group of CPU cores is configured to run a plurality of second threads. As shown in FIG. 7, the apparatus includes: means 710 for obtaining a read request by one of the plurality of first threads and sending the read request to a storage device; and means 720 for returning data read from the storage device to the read request issuer by one of the first threads in response to the storage device completing the read request.
FIG. 8 shows a flow diagram of a method of data access in a storage system according to an embodiment of the invention.
As shown in FIG. 8, the data access method in the storage system includes the following steps. Step S810: obtaining a write request through one of a plurality of first threads and writing data to a buffer. Step S820: writing the data in the buffer to a non-volatile storage device through one of a plurality of second threads. Step S830: obtaining a read request through one of the plurality of first threads and sending the read request to a storage device. Step S840: in response to the storage device completing the read request, returning data read from the storage device to the read request issuer through one of the first threads.
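The write path of steps S810 and S820 splits work between the two thread groups: first threads buffer incoming write data, while second threads aggregate buffered data and commit it to the non-volatile storage device. The sketch below illustrates that division of labor only; it is not the claimed implementation. Python threads stand in for the two thread groups, a queue stands in for the buffer, a list stands in for the non-volatile storage device, and the aggregation granularity `AGGREGATE` is an invented parameter.

```python
import queue
import threading

buffer_q = queue.Queue()   # the write buffer shared between the thread groups
nvm = []                   # stand-in for the non-volatile storage device
completions = []           # acknowledgements returned to write-request issuers
AGGREGATE = 2              # flush granularity (illustrative)

def first_thread(requests):
    """Step S810: take write requests and append their data to the buffer."""
    for data in requests:
        buffer_q.put(data)
        completions.append(data)   # the write can be acked once it is buffered
    buffer_q.put(None)             # shutdown marker

def second_thread():
    """Step S820: aggregate buffered data and write it to the NVM."""
    batch = []
    while True:
        data = buffer_q.get()
        if data is None:
            if batch:
                nvm.append(tuple(batch))  # final partial flush
            return
        batch.append(data)
        if len(batch) == AGGREGATE:
            nvm.append(tuple(batch))      # one aggregated device write
            batch = []

t1 = threading.Thread(target=first_thread, args=([b"a", b"b", b"c"],))
t2 = threading.Thread(target=second_thread)
t1.start()
t2.start()
t1.join()
t2.join()
print(nvm)  # [(b'a', b'b'), (b'c',)]
```

Acknowledging the write once the data reaches the buffer, while a second thread aggregates and flushes in the background, is what lets the first threads stay responsive to new requests.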
According to another aspect of the present invention, there is also provided a data access apparatus in a storage system, where the storage system includes a first group of CPU cores and a second group of CPU cores, each of the first group of CPU cores is dedicated to running one of a plurality of first threads, and the second group of CPU cores is configured to run a plurality of second threads. As shown in FIG. 9, the apparatus includes: means 910 for obtaining a write request by one of the plurality of first threads and writing data to a buffer; means 920 for writing data in the buffer to a non-volatile storage device through one of the plurality of second threads; means 930 for obtaining a read request by one of the plurality of first threads and sending the read request to a storage device; and means 940 for returning data read from the storage device to the read request issuer by one of the first threads in response to the storage device completing the read request.
According to another aspect of the invention, there is also provided a computer program comprising computer program code which, when loaded into a computer system and executed thereon, causes said computer system to perform the method described above.
According to another aspect of the present invention, there is also provided a program comprising program code which, when loaded into a storage device and executed thereon, causes the storage device to carry out the method described above.
The invention balances the latency, IOPS, and bandwidth of the storage system. In particular, for a novel flash-based storage system, the architecture provided by the invention has obvious advantages in latency and IOPS over the traditional IO aggregation scheme. The invention is applicable not only to flash-based storage systems but also to storage systems using storage media such as magnetic disks, XPoint, PCRAM, MRAM, RRAM, and FeRAM.
It will be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, respectively, can be implemented by various means including computer program instructions. These computer program instructions may be loaded onto a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the computer or other programmable data processing apparatus create means for implementing the functions specified in the flowchart block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including computer-readable instructions for implementing the function specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
Accordingly, blocks of the block diagrams and flowchart illustrations support combinations of means for performing the specified functions, combinations of steps for performing the specified functions and program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based computer systems that perform the specified functions or steps, or combinations of special purpose hardware and computer instructions.
At least a portion of the various blocks, operations, and techniques described above may be performed using hardware, by a control device executing firmware instructions, by a control device executing software instructions, or any combination thereof. When implemented using a control device executing firmware and software instructions, the software or firmware instructions may be stored on any computer-readable storage medium, such as RAM, ROM, flash memory, a hard disk, an optical disk, or a magnetic disk. Likewise, the software and firmware instructions may be delivered to a user or a system via any known or desired delivery means, including, for example, a computer-readable disk or other portable computer storage mechanism, or via a communication medium. Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism. By way of example, and not limitation, communication media include wired media, such as a wired network or direct-wired connection, and wireless media, such as acoustic, radio frequency, infrared, and other wireless media. Thus, the software and firmware instructions may be transmitted to a user or a system via a communication channel such as a telephone line, a DSL line, a cable television line, a fiber optic cable, a wireless channel, or the Internet (software delivered via a portable storage medium is regarded as the same or interchangeable). The software or firmware instructions may include machine-readable instructions that, when executed by the control device, cause the control device to perform various actions.
When implemented in hardware, the hardware may include one or more discrete components, integrated circuits, Application Specific Integrated Circuits (ASICs), and the like.
It is to be understood that the present invention may be implemented in software, hardware, firmware, or a combination thereof. The hardware may be, for example, a control device, an application specific integrated circuit, a large scale integrated circuit, or the like.
Although the present invention has been described with reference to examples, which are intended to be illustrative only and not to be limiting of the invention, changes, additions and/or deletions may be made to the embodiments without departing from the scope of the invention.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these embodiments pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (6)

1. A data access method in a storage system, wherein the storage system comprises a first group of CPU cores and a second group of CPU cores, wherein each of the first group of CPU cores is exclusively used for running one of a plurality of first threads; a second set of CPU cores is for running a plurality of second threads, the method comprising:
obtaining a write request through one of the plurality of first threads and writing data to a buffer;
writing data in the buffer to a non-volatile storage device by one of the plurality of second threads;
obtaining a read request through one of the plurality of first threads and sending the read request to a storage device;
in response to the storage device completing the read request, returning data read from the storage device to a read request issuer through one of the first threads.
2. The method of claim 1, further comprising: sending completion information of the write request to a write request issuer through the first thread.
3. The method of any of claims 1-2, wherein the data in the buffer is aggregated by the second thread and written to the storage device.
4. The method of any of claims 1-3, wherein some of the plurality of second threads are dedicated to writing data to the first storage device and others of the plurality of second threads are dedicated to writing data to the second storage device.
5. A data access device in a storage system, wherein the storage system comprises a first group of CPU cores and a second group of CPU cores, wherein each of the first group of CPU cores is dedicated to run one of a plurality of first threads; the second set of CPU cores is for running a plurality of second threads, the apparatus comprising:
means for obtaining a write request by one of the plurality of first threads and writing data to a buffer;
means for writing data in the buffer to a non-volatile storage device through one of the plurality of second threads;
means for obtaining a read request by one of the plurality of first threads and sending the read request to a storage device;
means for returning data read from the storage device to a read request issuer by one of the first threads in response to the storage device completing the read request.
6. A computer program comprising computer program code which, when loaded into a computer system and executed thereon, causes said computer system to perform the data access method in a storage system of any one of claims 1 to 4.
CN201911036827.7A 2016-01-30 2016-01-30 Low-delay high-IOPS data access method and storage system Active CN110764710B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911036827.7A CN110764710B (en) 2016-01-30 2016-01-30 Low-delay high-IOPS data access method and storage system
CN201610067814.6A CN107025064B (en) 2016-01-30 2016-01-30 A kind of data access method of the high IOPS of low latency

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201610067814.6A Division CN107025064B (en) 2016-01-30 2016-01-30 A kind of data access method of the high IOPS of low latency

Publications (2)

Publication Number Publication Date
CN110764710A true CN110764710A (en) 2020-02-07
CN110764710B CN110764710B (en) 2023-08-11

Family

ID=59524724





Also Published As

Publication number Publication date
CN107025064A (en) 2017-08-08
CN110764710B (en) 2023-08-11
CN107025064B (en) 2019-12-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100192 room A302, building B-2, Dongsheng Science Park, Zhongguancun, 66 xixiaokou Road, Haidian District, Beijing

Applicant after: Beijing yihengchuangyuan Technology Co.,Ltd.

Address before: 100192 room A302, building B-2, Dongsheng Science Park, Zhongguancun, 66 xixiaokou Road, Haidian District, Beijing

Applicant before: BEIJING MEMBLAZE TECHNOLOGY Co.,Ltd.

GR01 Patent grant