CN112749111A - Method, computing device and computer system for accessing data

Method, computing device and computer system for accessing data

Info

Publication number
CN112749111A
CN112749111A (application CN201911053658.8A)
Authority
CN
China
Prior art keywords
command
processor
data
computing device
memory
Prior art date
Legal status
Pending
Application number
CN201911053658.8A
Other languages
Chinese (zh)
Inventor
李涛
阙鸣健
郭中天
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201911053658.8A
Publication of CN112749111A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Abstract

The application provides a method, a computing device and a computer system for accessing data, which can reduce system overhead. The computer system comprises a first processor, a second processor and a computing device, wherein the computing device is connected to the first processor and the second processor respectively. The method comprises the following steps: the computing device obtains a first IO command sent by the first processor, wherein the first IO command is a write operation command or a read operation command; the computing device sends the first IO command to the second processor; the second processor sends an instruction command to the computing device according to the first IO command, wherein the instruction command is used for instructing the computing device to move data; and, based on the instruction command, the computing device moves the data to be written from the memory of the first processor to the memory of the second processor in a DMA mode, or moves the data to be read from the memory of the second processor to the memory of the first processor in a DMA mode.

Description

Method, computing device and computer system for accessing data
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, a computing device, and a computer system for accessing data.
Background
Cloud computing refers to the unified management and scheduling of a large number of computing resources and storage resources connected by a network, providing on-demand services for end users. Cloud service providers offer computing resources and data storage resources to remote end users through collections of servers deployed in data centers. The computing resources and storage resources that cloud computing provides to a user can scale elastically with actual changes in the user's business, and this elastic computing capacity and on-demand allocation effectively reduce the user's operation and maintenance costs. Cloud computing can provide various types of services for users, such as infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). IaaS provides users with computing infrastructure services, PaaS provides users with complete or partial application development services, and SaaS provides users with complete, directly usable applications. Virtualization is the basis of cloud computing, and end users typically purchase cloud services by renting virtual machines (VMs). A virtual machine is a complete computer system with full hardware system functionality that is emulated by software and runs in a completely isolated environment.
For cloud service providers, the cost of cloud computing services is related to the utilization of cloud servers. The larger the proportion of a cloud server's computing resources consumed by cloud computing system overhead, the fewer computing resources can be provided to users and the lower the utilization of the cloud server, which increases the cost of the cloud computing service. Available data show that the system overhead of current cloud computing services is as high as 13.6%, and as network and storage performance further improve, this overhead may exceed 25%. As a result, users need to rent more virtual machines to meet their workload requirements, which increases rental costs and reduces business competitiveness.
In order to improve cloud server utilization, a cloud service provider usually chooses to offload storage management functions to hardware other than the central processing unit (CPU) of the cloud server, so as to reduce the system overhead of the cloud server. For example, the storage processing function of the cloud server's CPU may be transferred to a reduced instruction set computer (RISC) processor; a typical RISC processor is a microprocessor such as an advanced RISC machine (ARM), connected to the cloud server through a PCIe port. However, in this implementation, the input/output (IO) performance of data is limited by the clock frequency and instruction set of the RISC processor, and multiple RISC processors are required to implement the storage offload function in parallel. The RISC processor implements the storage management function in software, which involves an erasure code (EC) algorithm, a data integrity field (DIF) algorithm, and encryption and decryption algorithms, and therefore includes a large amount of computation. Processing IO data in parallel across multiple RISC processors may cause IO delay and unstable throughput, and the implementation cost is high.
Disclosure of Invention
The application provides a method for accessing data, a computing device and a computer system, which can reduce the system overhead of the computer system and improve the IO performance of the data.
In a first aspect, a method for accessing data is provided, which is applied to a computer system, and the computer system includes: a first processor, a second processor, and a computing device, the computing device being connected to the first processor and the second processor respectively, the second processor being configured to connect to a storage pool, the method comprising: the computing device obtains a first input/output (IO) command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool; the computing device sends the first IO command to the second processor; the second processor sends an instruction command to the computing device according to the first IO command, where the instruction command is used to instruct the computing device to move data; and, based on the instruction command, the computing device moves the data to be written from the memory of the first processor to the memory of the second processor in a direct memory access (DMA) manner, or moves the data to be read from the memory of the second processor to the memory of the first processor in a DMA manner.
The embodiment of the application provides a method for accessing data that is applied to a computer system. The computer system includes a computing device that can use hardware to implement part of the storage management functions offloaded from the first processor, including moving data between the memory of the first processor and the memory of the second processor in a DMA manner, so that the system overhead of both the first processor and the second processor can be reduced. Because the computing device implements the storage management function in hardware, software signaling interaction can be reduced, thereby reducing the IO latency of data access, increasing IO speed, and providing stable IO performance.
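For illustration only, the following C sketch outlines the control flow described above as it might run on the computing device; every type and function name here (io_cmd_t, forward_to_second_processor, wait_for_instruction, dma_move, notify_first_processor_complete) is a hypothetical stand-in for hardware behavior and is not defined by this application.

    /* Hypothetical sketch of the offload flow on the computing device.
     * All names are illustrative; real hardware exposes its own interfaces. */
    #include <stdint.h>

    typedef enum { IO_WRITE, IO_READ } io_op_t;

    typedef struct {
        io_op_t  op;        /* write operation command or read operation command   */
        uint64_t host_buf;  /* buffer address in the memory of the first processor */
        uint64_t lba;       /* logical address in the storage pool                 */
        uint32_t len;       /* length of the data to move                          */
    } io_cmd_t;

    typedef struct {        /* instruction command returned by the second processor */
        uint64_t p1_mem;    /* address in the memory of the first processor  */
        uint64_t p2_mem;    /* address in the memory of the second processor */
        uint32_t len;
    } instr_cmd_t;

    /* Stand-ins for hardware operations of the computing device. */
    extern void        forward_to_second_processor(const io_cmd_t *cmd);
    extern instr_cmd_t wait_for_instruction(void);
    extern void        dma_move(uint64_t src, uint64_t dst, uint32_t len);
    extern void        notify_first_processor_complete(const io_cmd_t *cmd);

    void handle_io(const io_cmd_t *cmd)
    {
        forward_to_second_processor(cmd);          /* pass the IO command through   */
        instr_cmd_t ind = wait_for_instruction();  /* second processor replies with
                                                      the memory addresses to use   */
        if (cmd->op == IO_WRITE)
            dma_move(ind.p1_mem, ind.p2_mem, ind.len);  /* host memory -> P2 memory */
        else
            dma_move(ind.p2_mem, ind.p1_mem, ind.len);  /* P2 memory -> host memory */
        notify_first_processor_complete(cmd);      /* report IO completion          */
    }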
With reference to the first aspect, in a possible implementation manner of the first aspect, the sending, by the computing device, the first IO command to the second processor includes: the computing device allocates the first IO command to a volume queue corresponding to the first IO command, wherein IO commands in different volume queues correspond to different logical hard disks in the storage pool; the computing device selects an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command; and the second processor acquires the first IO command from the sub-command queue.
In this embodiment, the computing device may perform QoS management on the IO commands received from the first processor, for example, distributing the IO commands into different volume queues, or selecting IO commands from the volume queues and allocating them to a sub-command queue. Part of the storage management functions of the first processor or the second processor can therefore be offloaded, which reduces the system overhead of the first processor and the second processor and improves the performance and efficiency of data-access IO management.
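As a concrete but purely illustrative picture of this QoS management, the following C sketch distributes incoming IO commands into per-volume queues and then selects commands round-robin into a shared sub-command queue. The queue depth, the number of volumes, the scheduling policy and all identifiers are assumptions, not part of this application.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_VOLUMES   8    /* one volume queue per logical hard disk (assumed) */
    #define QUEUE_DEPTH  64

    typedef struct { uint32_t volume_id; uint64_t lba; uint32_t len; bool is_write; } io_cmd_t;

    typedef struct {
        io_cmd_t slot[QUEUE_DEPTH];
        uint32_t head, tail;            /* free-running FIFO indices */
    } queue_t;

    static queue_t volume_q[NUM_VOLUMES];    /* per-volume queues                       */
    static queue_t sub_cmd_q;                /* waiting queue for the second processor  */

    static bool q_push(queue_t *q, io_cmd_t c) {
        if (q->tail - q->head == QUEUE_DEPTH) return false;      /* full  */
        q->slot[q->tail++ % QUEUE_DEPTH] = c; return true;
    }
    static bool q_pop(queue_t *q, io_cmd_t *c) {
        if (q->head == q->tail) return false;                    /* empty */
        *c = q->slot[q->head++ % QUEUE_DEPTH]; return true;
    }

    /* Step 1: place each IO command into the volume queue of its logical hard disk. */
    void enqueue_io(io_cmd_t cmd) { (void)q_push(&volume_q[cmd.volume_id % NUM_VOLUMES], cmd); }

    /* Step 2: select commands from the volume queues (round robin here) into the
     * sub-command queue from which the second processor fetches work. */
    void schedule_round(void) {
        io_cmd_t c;
        for (int v = 0; v < NUM_VOLUMES; v++)
            if (q_pop(&volume_q[v], &c))
                (void)q_push(&sub_cmd_q, c);
    }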
With reference to the first aspect, in a possible implementation manner of the first aspect, the sending, by the second processor, an instruction command to the computing device according to the first IO command includes: the second processor generates a first instruction command according to the first IO command, where the first instruction command is used to instruct to move the data to be written from the memory of the first processor to the memory of the second processor; and the second processor sends the first instruction command to the computing device.
With reference to the first aspect, in a possible implementation manner of the first aspect, the moving, by the computing device, of the data to be written from the memory of the first processor to the memory of the second processor based on the instruction command includes: the computing device reads a data block from the memory of the first processor in a DMA manner; the computing device computes a check data block of the read data block; and the computing device moves the read data block and the check data block to the memory of the second processor in a DMA manner.
In this embodiment of the present application, the check calculation may be performed while the data block is being moved from the first processor to the second processor, that is, the check calculation is completed in the computing device. Therefore, the data does not need to be read out of the memory of the second processor again for the check calculation, which eliminates one memory copy, saves memory resources, and simplifies the data access flow.
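This in-flight check computation can be pictured with the hedged sketch below, which uses simple XOR parity as a stand-in for the EC/DIF calculation actually used; the block size, function names and DMA hooks are assumptions. The computing device reads data blocks from the first processor's memory, accumulates the check block while the data passes through, and writes both to the second processor's memory without a second read.

    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE 4096u   /* assumed block size */

    /* Stand-ins for the DMA read/write performed by the computing device. */
    extern void dma_read_block(uint64_t p1_addr, uint8_t *buf);          /* P1 memory -> device */
    extern void dma_write_block(const uint8_t *buf, uint64_t p2_addr);   /* device -> P2 memory */

    /* Move n data blocks from the first processor's memory to the second processor's
     * memory and produce one check block on the fly (XOR parity as a placeholder for
     * the erasure-code computation), so the data never has to be re-read from the
     * second processor's memory for the check calculation. */
    void move_with_check(const uint64_t *p1_addrs, const uint64_t *p2_addrs,
                         uint64_t p2_check_addr, int n)
    {
        uint8_t data[BLOCK_SIZE], check[BLOCK_SIZE];
        memset(check, 0, sizeof(check));

        for (int i = 0; i < n; i++) {
            dma_read_block(p1_addrs[i], data);          /* read block i in DMA mode   */
            for (unsigned j = 0; j < BLOCK_SIZE; j++)
                check[j] ^= data[j];                    /* accumulate the check block */
            dma_write_block(data, p2_addrs[i]);         /* forward block i            */
        }
        dma_write_block(check, p2_check_addr);          /* store the check block too  */
    }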
With reference to the first aspect, in a possible implementation manner of the first aspect, the first IO command is a write operation command, and the method further includes: the second processor decomposes the first IO command to obtain a plurality of first sub IO commands, where data blocks requested to be written by different sub IO commands correspond to different physical addresses in the storage pool; the second processor determines a first stripe, where the first stripe includes at least one of the plurality of first sub IO commands, and the first stripe further includes at least one sub IO command decomposed from another IO command, where the data blocks requested to be written by the sub IO commands included in the first stripe correspond to the same storage device in the storage pool; and the second processor sends the data corresponding to the first stripe to the same storage device in the storage pool.
In this embodiment, during a write operation, the second processor may aggregate multiple sub IO commands that correspond to the same storage device in the storage pool into one stripe and access that device in a single operation, so as to improve access efficiency.
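A hedged sketch of this aggregation: sub IO commands, possibly decomposed from different parent IO commands, are grouped by the storage device their physical address maps to, and a stripe is dispatched once enough sub-commands for one device have accumulated. The address-to-device mapping, the stripe width and all names below are assumptions.

    #include <stdint.h>

    #define NUM_DEVICES  16   /* storage devices in the storage pool (assumed)   */
    #define STRIPE_WIDTH 4    /* sub IO commands aggregated per stripe (assumed) */

    typedef struct {
        uint32_t parent_io;   /* IO command this sub-command was decomposed from */
        uint64_t phys_addr;   /* physical address in the storage pool            */
        uint32_t len;
    } sub_io_t;

    typedef struct {
        sub_io_t cmd[STRIPE_WIDTH];
        int      count;
    } stripe_t;

    static stripe_t pending[NUM_DEVICES];   /* one open stripe per storage device */

    /* Stand-in for sending one stripe's worth of data to a single storage device. */
    extern void send_stripe_to_device(int device, const stripe_t *s);

    /* Assumed placement rule: the physical address determines the storage device. */
    static int device_of(uint64_t phys_addr) { return (int)(phys_addr % NUM_DEVICES); }

    void add_sub_io(sub_io_t s)
    {
        int dev = device_of(s.phys_addr);
        stripe_t *st = &pending[dev];
        st->cmd[st->count++] = s;           /* sub-commands from different parent IO
                                               commands may share the same stripe   */
        if (st->count == STRIPE_WIDTH) {
            send_stripe_to_device(dev, st); /* one access covers the whole stripe   */
            st->count = 0;
        }
    }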
With reference to the first aspect, in a possible implementation manner of the first aspect, the sending, by the second processor, an instruction command to the computing device according to the first IO command includes: the second processor generates a second instruction command according to the first IO command, where the second instruction command is used to instruct the computing device to move the data to be read from the memory of the second processor to the memory of the first processor; and the second processor sends the second instruction command to the computing device.
With reference to the first aspect, in a possible implementation manner of the first aspect, the first IO command is a read operation command, and the method further includes: the second processor generates a read data request command according to the first IO command, where the read data request command is used to request the data to be read from the storage pool; the second processor sends the read data request command to the storage pool; and the second processor acquires the data to be read from the storage pool.
With reference to the first aspect, in a possible implementation manner of the first aspect, the generating, by the second processor, a read data request command according to the first IO command includes: the second processor decomposes the first IO command to obtain a plurality of second sub IO commands, where data blocks requested to be acquired by different sub IO commands correspond to different physical addresses in the storage pool; the second processor determines a second stripe, where the second stripe includes at least one of the second sub IO commands, and the second stripe further includes at least one sub IO command decomposed from another IO command, where the data blocks requested by the sub IO commands in the second stripe correspond to the same storage device in the storage pool; and the second processor generates the read data request command according to the second stripe, where the read data request command is used to request the storage pool to acquire the data corresponding to the second stripe.
In the embodiment of the application, during a read operation, multiple sub IO commands corresponding to the same storage device may likewise be aggregated into one stripe for data access, so as to improve access efficiency.
With reference to the first aspect, in a possible implementation manner of the first aspect, the method further includes: the second processor writes a completion entry corresponding to the first IO command into a completion queue when it is determined that the first IO command is completed; and the computing device sends IO completion information to the first processor according to the completion queue, where the IO completion information is used to indicate that the first IO command is completed.
With reference to the first aspect, in a possible implementation manner of the first aspect, the computing device stores the IO command using a data cache system, where the data cache system includes: a cache space comprising K address ranges, where the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry includes an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier indicates that an IO command is stored in the corresponding entry, and the idle identifier indicates that no IO command is stored in the corresponding entry; the cache space is configured to: in a case that an IO command is received in a first address range indicated by a producer pointer, update a first owner flag bit in the first address range to the storage identifier; the cache space is further configured to: in a case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
Optionally, the producer pointer is used to point to an address range of a next write IO command in the K address ranges.
Optionally, the consumer pointer is used to point to an address range of a next read IO command in the K address ranges.
Optionally, the cache space is further configured to: in a case that the second owner flag bit in the second address range indicated by the consumer pointer is recorded as the idle identifier, no IO command is stored in the second address range.
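The cache space described above can be pictured as a ring of fixed-size entries, each carrying an owner flag bit: the producer marks an entry with the storage identifier when it writes an IO command, and the consumer reads an entry only when the flag says a command is stored there. The following single-producer/single-consumer C sketch uses assumed sizes and names; in this simple variant the consumer resets the flag after reading, whereas the ring-parity scheme of the fifth aspect below avoids that reset.

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    #define K            16     /* number of address ranges / entries (assumed)   */
    #define CMD_SIZE     64     /* payload bytes per entry (assumed)              */
    #define OWNER_FREE    0     /* idle identifier: entry holds no IO command     */
    #define OWNER_STORED  1     /* storage identifier: entry holds an IO command  */

    typedef struct {
        uint8_t owner;               /* owner flag bit         */
        uint8_t cmd[CMD_SIZE];       /* the IO command itself  */
    } entry_t;

    static entry_t  cache[K];        /* cache space: K address ranges          */
    static unsigned producer;        /* producer pointer: next entry to write  */
    static unsigned consumer;        /* consumer pointer: next entry to read   */

    /* Producer side: write an IO command, then mark the entry as stored. */
    bool cache_write(const uint8_t *cmd)
    {
        entry_t *e = &cache[producer];
        if (e->owner == OWNER_STORED) return false;     /* ring is full          */
        memcpy(e->cmd, cmd, CMD_SIZE);
        e->owner = OWNER_STORED;                        /* update owner flag bit */
        producer = (producer + 1) % K;
        return true;
    }

    /* Consumer side: read only when the owner flag bit records the storage
     * identifier, then hand the entry back by marking it idle. */
    bool cache_read(uint8_t *cmd)
    {
        entry_t *e = &cache[consumer];
        if (e->owner != OWNER_STORED) return false;     /* nothing stored here   */
        memcpy(cmd, e->cmd, CMD_SIZE);
        e->owner = OWNER_FREE;
        consumer = (consumer + 1) % K;
        return true;
    }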
In a second aspect, a method for accessing data is provided, where the method is performed by a computing device, the computing device is connected to a first processor and a second processor respectively, and the second processor is configured to connect to a storage pool. The method includes: the computing device obtains a first input/output (IO) command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool; the computing device sends the first IO command to the second processor; the computing device receives an instruction command sent by the second processor, where the instruction command is used to instruct the computing device to move data; and, based on the instruction command, the computing device moves the data to be written from the memory of the first processor to the memory of the second processor in a direct memory access (DMA) manner, or moves the data to be read from the memory of the second processor to the memory of the first processor in a DMA manner.
With reference to the second aspect, in a possible implementation manner of the second aspect, the sending, by the computing device, the first IO command to the second processor includes: the computing device allocates the first IO command to a volume queue corresponding to the first IO command, wherein IO commands in different volume queues correspond to different logical hard disks in the storage pool; and the computing device selects an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command.
With reference to the second aspect, in a possible implementation manner of the second aspect, the receiving, by the computing device, the instruction command sent by the second processor includes: the computing device receives a first instruction command sent by the second processor, the first instruction command is generated according to the first IO command, and the first instruction command is used for instructing to move data to be written from the memory of the first processor to the memory of the second processor.
With reference to the second aspect, in a possible implementation manner of the second aspect, the first IO command is a write operation command, and the moving, by the computing device, of the data to be written from the memory of the first processor to the memory of the second processor based on the instruction command includes: the computing device reads a data block from the memory of the first processor in a DMA manner; the computing device computes a check data block of the read data block; and the computing device moves the read data block and the check data block to the memory of the second processor in a DMA manner.
With reference to the second aspect, in a possible implementation manner of the second aspect, the receiving, by the computing device, the instruction command sent by the second processor includes: and the computing device receives a second instruction command sent by the second processor, wherein the second instruction command is generated according to the first IO command, and the second instruction command is used for instructing the computing device to move the data to be read from the memory of the second processor to the memory of the first processor.
With reference to the second aspect, in a possible implementation manner of the second aspect, the method further includes: the computing device sends IO completion information to the first processor when it is determined that the first IO command is completed, where the IO completion information is used to indicate that the first IO command is completed.
With reference to the second aspect, in a possible implementation manner of the second aspect, the computing device stores the IO command using a data cache system, where the data cache system includes: a cache space comprising K address ranges, where the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry includes an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier indicates that an IO command is stored in the corresponding entry, and the idle identifier indicates that no IO command is stored in the corresponding entry; the cache space is configured to: in a case that an IO command is received in a first address range indicated by a producer pointer, update a first owner flag bit in the first address range to the storage identifier; the cache space is further configured to: in a case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
Optionally, the producer pointer is used to point to an address range of a next write IO command in the K address ranges.
Optionally, the consumer pointer is used to point to an address range of a next read IO command in the K address ranges.
Optionally, the cache space is further configured to: in a case that the second owner flag bit in the second address range indicated by the consumer pointer is recorded as the idle identifier, no IO command is stored in the second address range.
In a third aspect, a computer system is provided, where the computer system includes a first processor, a second processor, and a computing device, where the computing device is connected to the first processor and the second processor, and the second processor is configured to be connected to a storage pool. The computing device is configured to obtain a first input/output (IO) command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data to the storage pool, and the read operation command is used to request to read data from the storage pool; the computing device is configured to send the first IO command to the second processor; the second processor is configured to send an instruction command to the computing device according to the first IO command, where the instruction command is used to instruct the computing device to move data; and the computing device is further configured to move, based on the instruction command, the data to be written from the memory of the first processor to the memory of the second processor in a direct memory access (DMA) manner, or move the data to be read from the memory of the second processor to the memory of the first processor in a DMA manner.
The embodiment of the application provides a computer system that includes a computing device. The computing device can use hardware to implement part of the storage management functions offloaded from the first processor, including moving data between the memory of the first processor and the memory of the second processor in a DMA manner, so that the system overhead of both the first processor and the second processor can be reduced. Because the computing device implements the storage management function in hardware, software signaling interaction can be reduced, thereby reducing the IO latency of data access, increasing IO speed, and providing stable IO performance.
With reference to the third aspect, in a possible implementation manner of the third aspect, the computing device is specifically configured to: allocate the first IO command to a volume queue corresponding to the first IO command, where IO commands in different volume queues correspond to different logical hard disks in the storage pool; and select an IO command from the volume queue to join a sub-command queue, where the selected IO command includes the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command; and the second processor is specifically configured to obtain the first IO command from the sub-command queue.
With reference to the third aspect, in a possible implementation manner of the third aspect, the first IO command is a write operation command, and the second processor is specifically configured to: generate a first instruction command according to the first IO command, where the first instruction command is used to instruct to move the data to be written from the memory of the first processor to the memory of the second processor; and send the first instruction command to the computing device.
With reference to the third aspect, in a possible implementation manner of the third aspect, the first IO command is a write operation command, and the computing device is specifically configured to: read a data block from the memory of the first processor in a DMA manner; compute a check data block of the read data block; and move the read data block and the check data block to the memory of the second processor in a DMA manner.
With reference to the third aspect, in a possible implementation manner of the third aspect, the first IO command is a write operation command, and the second processor is further configured to: decompose the first IO command to obtain a plurality of first sub IO commands, where data blocks requested to be written by different sub IO commands correspond to different physical addresses in the storage pool; determine a first stripe, where the first stripe includes at least one of the plurality of first sub IO commands, and the first stripe further includes at least one sub IO command decomposed from another IO command, where the data blocks requested to be written by the sub IO commands included in the first stripe correspond to the same storage device in the storage pool; and send the data corresponding to the first stripe to the same storage device in the storage pool.
With reference to the third aspect, in a possible implementation manner of the third aspect, the first IO command is a read operation command, and the second processor is specifically configured to: generate a second instruction command according to the first IO command, where the second instruction command is used to instruct the computing device to move the data to be read from the memory of the second processor to the memory of the first processor; and send the second instruction command to the computing device.
With reference to the third aspect, in a possible implementation manner of the third aspect, the first IO command is a read operation command, and the second processor is further configured to: generate a read data request command according to the first IO command, where the read data request command is used to request the data to be read from the storage pool; send the read data request command to the storage pool; and acquire the data to be read from the storage pool.
With reference to the third aspect, in a possible implementation manner of the third aspect, the second processor is specifically configured to: decompose the first IO command to obtain a plurality of second sub IO commands, where the data blocks requested to be acquired by different sub IO commands correspond to different physical addresses in the storage pool; determine a second stripe, where the second stripe includes at least one of the second sub IO commands, and the second stripe further includes at least one sub IO command decomposed from another IO command, where the data blocks requested by the sub IO commands in the second stripe correspond to the same storage device in the storage pool; and generate the read data request command according to the second stripe, where the read data request command is used to request the storage pool to acquire the data corresponding to the second stripe.
With reference to the third aspect, in a possible implementation manner of the third aspect, the second processor is further configured to, when it is determined that the first IO command is completed, write a completion entry corresponding to the first IO command into a completion queue; the computing device is further configured to send IO completion information to the first processor according to the completion queue, where the IO completion information is used to indicate that the first IO command is completed.
With reference to the third aspect, in a possible implementation manner of the third aspect, the computing device stores the IO command using a data cache system, where the data cache system includes: a cache space comprising K address ranges, where the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry includes an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier indicates that an IO command is stored in the corresponding entry, and the idle identifier indicates that no IO command is stored in the corresponding entry; the cache space is configured to: in a case that an IO command is received in a first address range indicated by a producer pointer, update a first owner flag bit in the first address range to the storage identifier; the cache space is further configured to: in a case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
Optionally, the producer pointer is used to point to an address range of a next write IO command in the K address ranges.
Optionally, the consumer pointer is used to point to an address range of a next read IO command in the K address ranges.
Optionally, the cache space is further configured to: in a case that the second owner flag bit in the second address range indicated by the consumer pointer is recorded as the idle identifier, no IO command is stored in the second address range.
In a fourth aspect, a computing device is provided, where the computing device is connected to a first processor and a second processor respectively, and the second processor is used to connect to a storage pool. The computing device includes: an IO processing module, configured to obtain a first IO command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool; a quality of service (QoS) module, configured to send the first IO command to the second processor; the IO processing module is further configured to receive an instruction command sent by the second processor, where the instruction command is used to instruct the computing device to move data; and a direct memory access (DMA) module, configured to move, based on the instruction command, the data to be written from the memory of the first processor to the memory of the second processor in a DMA manner, or move the data to be read from the memory of the second processor to the memory of the first processor in a DMA manner.
With reference to the fourth aspect, in a possible implementation manner of the fourth aspect, the QoS module is specifically configured to: allocate the first IO command to a volume queue corresponding to the first IO command, where IO commands in different volume queues correspond to different logical hard disks in the storage pool; and select an IO command from the volume queue to join a sub-command queue, where the selected IO command includes the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command.
With reference to the fourth aspect, in a possible implementation manner of the fourth aspect, the first IO command is a write operation command, the IO processing module is specifically configured to receive a first instruction command sent by the second processor, the first instruction command is generated according to the first IO command, and the first instruction command is used to instruct to move data to be written from a memory of the first processor to a memory of the second processor.
With reference to the fourth aspect, in a possible implementation manner of the fourth aspect, the first IO command is a write operation command, and the DMA module is configured to read a data block from the memory of the first processor in a DMA manner; the computing device further includes an algorithm engine module configured to compute a check data block of the read data block; and the DMA module is further configured to move the read data block and the check data block to the memory of the second processor in a DMA manner.
With reference to the fourth aspect, in a possible implementation manner of the fourth aspect, the first IO command is a read operation command, the IO processing module is specifically configured to receive a second instruction command sent by the second processor, where the second instruction command is generated according to the first IO command, and the second instruction command is used to instruct the computing device to move data to be read from a memory of the second processor to a memory of the first processor.
With reference to the fourth aspect, in a possible implementation manner of the fourth aspect, the IO processing module is further configured to send IO completion information to the first processor under the condition that the first IO command is determined to be completed, where the IO completion information is used to indicate that the first IO command is completed.
With reference to the fourth aspect, in a possible implementation manner of the fourth aspect, the computing device stores the IO command using a data cache system, where the data cache system includes: a cache space comprising K address ranges, where the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry includes an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier indicates that an IO command is stored in the corresponding entry, and the idle identifier indicates that no IO command is stored in the corresponding entry; the cache space is configured to: in a case that an IO command is received in a first address range indicated by a producer pointer, update a first owner flag bit in the first address range to the storage identifier; the cache space is further configured to: in a case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
Optionally, the producer pointer is used to point to an address range of a next write IO command in the K address ranges.
Optionally, the consumer pointer is used to point to an address range of a next read IO command in the K address ranges.
Optionally, the cache space is further configured to: in a case that the second owner flag bit in the second address range indicated by the consumer pointer is recorded as the idle identifier, no IO command is stored in the second address range.
In a fifth aspect, a data caching system is provided, including: a cache space comprising K address ranges, where the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry includes an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier indicates that data is stored in the corresponding entry, and the idle identifier indicates that no data is stored in the corresponding entry; a producer control unit configured to perform the following operations: writing data into a first address range indicated by a producer pointer, and updating a first owner flag bit in the first address range to the storage identifier; and a consumer control unit configured to perform the following operations: determining whether to read data from a second address range indicated by a consumer pointer based on a second owner flag bit in the second address range.
Optionally, the producer pointer is used to point to the address range of the next write data in the K address ranges.
Optionally, the consumer pointer is used to point to an address range of the next read data in the K address ranges.
In the embodiment of the present application, an owner flag bit is configured in an entry stored in each address range in the data cache system to indicate whether data is stored in the corresponding entry. The data cache system can reduce the interaction between the data writing thread and the data reading thread and even allow independent operation of a producer thread and a consumer thread, thereby simplifying the management flow of the data cache system.
With reference to the fifth aspect, in a possible implementation manner of the fifth aspect, the data caching system further includes: a consumer control unit configured to perform the following operations: determining a second address range indicated by a consumer pointer, the consumer pointer being used to point to an address range of a next read IO command in the K address ranges; reading data from the second address range if a second owner flag bit in the second address range is recorded as the storage identifier; and determining that no data is stored in the second address range if the second owner flag bit is recorded as the idle identifier.
With reference to the fifth aspect, in a possible implementation manner of the fifth aspect, the data caching system further includes: a producer ring register for storing a producer ring flag bit for recording whether the cycle number of the producer pointer in the K address ranges is odd or even, wherein the assignment of the owner flag bit is determined according to the assignment of the producer ring flag bit; the producer control unit is configured to specifically perform the following operations: determining the cycle times indicated by the producer ring flag bit to be odd times; writing data in the ninth address range and updating a ninth owner flag bit in the ninth address range from a second value to a first value; alternatively, the producer control unit is configured to specifically perform the following operations: determining the cycle times indicated by the producer ring flag bit to be even number times; writing data in the ninth address range and updating a ninth owner flag bit in the ninth address range from the first value to the second value.
In the embodiment of the application, the upper-layer software determines the value of the owner flag bit that corresponds to the storage identifier according to the parity of the number of cycles of the producer pointer, so that when the upper-layer software reads data, it can determine whether an entry stores data according to the parity of the number of cycles of the consumer pointer and the value of the owner flag bit.
With reference to the fifth aspect, in a possible implementation manner of the fifth aspect, the data caching system further includes: the consumer ring register is used for storing a consumer ring flag bit so as to record the cycle times of the consumer pointer in the K address ranges as odd times or even times; the consumer control unit is configured to specifically perform the following operations: determining the number of cycles indicated by the consumer ring flag bit to be odd; reading data from the second address range if the second owner flag bit is recorded as the first value; alternatively, the consumer control unit is configured to specifically perform the following operations: determining the number of cycles indicated by the consumer ring flag bit to be an even number; reading data from the second address range if the second owner flag bit assignment is the second value.
In the embodiment of the application, according to the parity of the number of cycles of the consumer pointer recorded by the consumer ring register, the upper-layer software determines whether the value of the owner flag bit in the entry indicated by the consumer pointer represents the storage identifier or the idle identifier, and thereby judges whether the entry stores a command.
With reference to the fifth aspect, in one possible implementation manner of the fifth aspect, the consumer control unit is configured to specifically perform the following operations: determining the number of cycles indicated by the consumer ring flag bit to be odd; determining that no data is stored in the second address range if the value assigned to the second owner flag bit is the second value; determining the number of cycles indicated by the consumer ring flag bit to be an even number; and determining that no data is stored in the second address range when the value assigned to the second owner flag bit is the first value.
With reference to the fifth aspect, in a possible implementation manner of the fifth aspect, the first value is 1, and the second value is 0; alternatively, the first value is 0 and the second value is 1.
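A hedged sketch of the ring-parity scheme described above: the value that means "stored" alternates with each lap of the ring (here the first value is assumed to be 1 and the second value 0, with K = 16), so the consumer never has to clear entries and the producer and consumer can run without interacting. Overflow handling is omitted for brevity, and all names are assumptions.

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    #define K         16            /* entries in the cache space (assumed) */
    #define CMD_SIZE  64

    typedef struct { uint8_t owner; uint8_t cmd[CMD_SIZE]; } entry_t;

    static entry_t cache[K];         /* owner flag bits start at 0 (second value)    */
    static unsigned prod, cons;      /* producer / consumer pointers                 */
    static uint8_t prod_ring = 1;    /* producer ring flag: "stored" value for the
                                        current lap (odd lap -> first value, 1)      */
    static uint8_t cons_ring = 1;    /* consumer ring flag, tracked the same way     */

    void ring_write(const uint8_t *cmd)          /* producer control unit */
    {
        entry_t *e = &cache[prod];
        memcpy(e->cmd, cmd, CMD_SIZE);
        e->owner = prod_ring;                    /* mark as stored for this lap       */
        if (++prod == K) { prod = 0; prod_ring ^= 1; }   /* lap parity flips on wrap  */
    }

    bool ring_read(uint8_t *cmd)                 /* consumer control unit */
    {
        entry_t *e = &cache[cons];
        if (e->owner != cons_ring) return false; /* entry not yet written on this lap */
        memcpy(cmd, e->cmd, CMD_SIZE);           /* no need to reset the owner flag   */
        if (++cons == K) { cons = 0; cons_ring ^= 1; }
        return true;
    }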
In a sixth aspect, a chip is provided, on which the computing device described in the fourth aspect or any one of its possible implementations is disposed.
In a seventh aspect, a computer-readable storage medium is provided, which is used for storing program code comprising instructions for performing the method of the first aspect or any one of the possible implementations of the first aspect.
In an eighth aspect, a computer-readable storage medium is provided for storing program code comprising instructions for performing the method of the second aspect or any one of the possible implementations of the second aspect.
Drawings
Fig. 1 is a schematic architecture diagram of a cloud computing system 100 according to an embodiment of the present application.
Fig. 2 is a schematic diagram of an architecture of a computer system 200 according to an embodiment of the present application.
Fig. 3 is a system diagram of a computer system 300 according to yet another embodiment of the present application.
Fig. 4 is a schematic diagram of an architecture of a computer system 400 according to yet another embodiment of the present application.
Fig. 5 is a schematic diagram of an architecture of a computer system 400 according to yet another embodiment of the present application.
FIG. 6 is a logic flow diagram of a method of accessing data in accordance with an embodiment of the present application.
Fig. 7 is a flow chart illustrating a method 700 of accessing data according to an embodiment of the present application.
FIG. 8 is a diagram illustrating a computer system 400 performing a write operation according to an embodiment of the present application.
FIG. 9 is a diagram illustrating a computer system 400 performing a read operation according to an embodiment of the present application.
Fig. 10 is a schematic diagram of the encoding and decoding flow of the EC algorithm according to the embodiment of the present application.
FIG. 11 is a schematic diagram of an EC algorithm in a conventional storage offload scheme.
FIG. 12 is a schematic diagram of an EC algorithm in a storage offload scheme according to an embodiment of the present application.
Fig. 13 is a signaling processing flow diagram of a method 1300 of accessing data according to an embodiment of the present application.
FIG. 14 is a flowchart illustrating a write operation method 1400 according to an embodiment of the disclosure.
FIG. 15 is a block diagram illustrating a hash of a sub IO command according to an embodiment of the present application.
FIG. 16 is a block diagram of a hash stripe of a data block according to an embodiment of the present application.
Fig. 17 is a flowchart illustrating a read operation method 1700 according to an embodiment of the disclosure.
Fig. 18 is a schematic structural diagram of a data cache system according to an embodiment of the present application.
Fig. 19 is a schematic diagram illustrating an operating state of a data caching system according to an embodiment of the present application.
Fig. 20 is a schematic structural diagram of an entry in the data cache system according to an embodiment of the present application.
Fig. 21 is a schematic diagram of an operating state of a data caching system according to an embodiment of the present application.
Fig. 22 is a schematic diagram illustrating an operating state of a data caching system according to another embodiment of the present application.
Fig. 23 is a schematic diagram of an application scenario of a further embodiment of the present application.
Detailed Description
The technical solution in the present application will be described below with reference to the accompanying drawings.
To facilitate understanding, the following first presents several terms and concepts related to embodiments of the present application.
IaaS: that is, infrastructure as a service, means that information technology (IT) infrastructure is provided as a service over a network. In this service model, users do not build a data center themselves, but lease infrastructure services including servers, storage, networks, and the like. IaaS may deliver computing resources in the form of virtualized operating systems, workload management software, hardware, network, and storage services. IaaS can provide computing power and storage services on demand: instead of purchasing and installing the required resources in a conventional data center, the required resources are leased according to company needs. IaaS is generally classified into public, private, and hybrid clouds.
Virtual machine: virtualization is often the basis for cloud computing. A virtual machine may refer to a complete computer system with complete hardware system functionality, emulated by software, running in a completely isolated environment, in such a way as to create multiple virtual systems within a single physical system.
Reduced instruction set computer (RISC): a microprocessor architecture that executes fewer types of computer instructions; a microprocessor built on RISC may be referred to as a RISC processor. RISC has a relatively simple instruction system, requiring the hardware to execute only a limited set of the most frequently used instructions, while most complex operations are composed of simple instructions by sophisticated compilation techniques. RISC stands in contrast to the complex instruction set computer (CISC). A complex instruction set computer relies on an increasingly elaborate hardware architecture to meet the growing performance demands placed on the computer. Because a computer requires additional transistors and circuit elements for each type of instruction it can execute, a larger instruction set makes the microprocessor more complex and slower; RISC can therefore execute operations at a higher speed. RISC processors include, for example, microprocessors such as the advanced RISC machine (ARM).
X86 architecture: a family of instruction set architectures executed by microprocessors, defining a set of general-purpose computer instructions. The name is commonly used as an abbreviation derived from the standard numbering of Intel Corporation's general-purpose processor series.
Direct memory access (DMA): also referred to as direct memory operation or block data transfer, is a data interaction mode in which data is accessed directly from memory without passing through the CPU. In DMA mode, the CPU only needs to issue a command to the DMA controller; the DMA controller then controls the data transfer and reports back to the CPU when the transfer is finished, which reduces CPU occupancy and saves system resources. In other words, a DMA transfer copies data from one address space to another: the CPU initiates the transfer, but the transfer itself is carried out and completed by the DMA controller. A typical example is moving a block of external memory to a memory area inside a chip. The DMA method does not need to save or restore the CPU context during data transfer. Because the CPU does not participate in the transfer at all, its instruction-fetch and data-transfer operations are not involved; memory address updates and counting of the number of transferred words are implemented directly in hardware circuits rather than in software. Therefore, the DMA mode can meet the requirements of high-speed I/O devices and helps the CPU operate efficiently.
DMA may include remote DMA (RDMA) and local DMA. RDMA refers to transferring data directly from the memory of one computer to another computer over a network, without involving either operating system. Local DMA refers to DMA data transfers that do not pass through a network.
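To make the division of work concrete, the following C sketch shows the kind of interaction described above: the CPU only fills in a transfer descriptor and starts the transfer, while the DMA controller carries out the copy and raises a completion flag. The register layout, field names and polling-based completion are purely illustrative assumptions, not any real device's interface.

    #include <stdint.h>

    /* Illustrative register layout of a DMA controller (assumed, not a real device):
     * the CPU programs source, destination and length, starts the transfer, and is
     * free to do other work until completion is signalled (polling stands in for an
     * interrupt here; memory barriers are omitted for brevity). */
    typedef struct {
        volatile uint64_t src;      /* source address (e.g. external memory)     */
        volatile uint64_t dst;      /* destination address (e.g. on-chip memory) */
        volatile uint32_t len;      /* number of bytes to move                   */
        volatile uint32_t start;    /* write 1 to kick off the transfer          */
        volatile uint32_t done;     /* set by the controller when finished       */
    } dma_regs_t;

    void dma_copy(dma_regs_t *dma, uint64_t src, uint64_t dst, uint32_t len)
    {
        dma->src = src;             /* CPU only issues the command ...           */
        dma->dst = dst;
        dma->len = len;
        dma->start = 1;
        while (!dma->done) {        /* ... the controller moves the data         */
        }
    }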
Peripheral component interconnect express (PCIe) bus: a high-speed serial computer expansion bus standard. The PCIe bus may be used to support active power management, error reporting, end-to-end reliable transmission, hot plugging, and other functions.
Ultra path interconnect (UPI): a computer expansion bus standard for the X86 architecture.
Network card: which may also be referred to as a Network Interface Controller (NIC), a network adapter, or a local area network receiver, is a type of computer hardware designed to allow computers to communicate over a computer network. Each network card corresponds to a unique Media Access Control (MAC) address.
Memory (memory): the memory may be referred to as an internal memory or a main memory, and is used for temporarily storing arithmetic data in the CPU and data exchanged with an external memory such as a hard disk. The CPU transfers the data to be operated to the memory for operation, and transmits the result from the memory after the operation is finished.
Non-volatile memory express (NVMe): a logical device interface standard that accesses non-volatile storage media attached through a bus, based on a bus transmission protocol specification at the device logical interface.
Virtio IO protocol: a virtual IO interface framework, referred to as the Virtio protocol or Virtio for short. It is a para-virtualized IO interface framework that supports various types of IO devices; compared with fully virtualized IO, it offers good extensibility and compatibility, is widely used in virtualization scenarios, and has become a de facto standard. As the number of virtual machine tenants on a single host grows and the demand for network bandwidth increases, the cost of carrying the Virtio IO protocol on the host machine becomes larger and larger, so the Virtio IO back end can be offloaded to a network card or other hardware to improve the efficiency of the host machine.
Host machine: refers to a computer, or physical machine, on which a virtual machine is installed.
Erasure code (EC) algorithm: a coding fault-tolerance technique. Its basic principle is to segment the transmitted data, add check codes, and establish relationships among the segments, so that even if some segments are lost during transmission, the receiving end can still reconstruct the complete information through the algorithm. According to their error-control function, erasure codes can be classified into error detection, error correction, and erasure correction.
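As a minimal illustration of the erasure-coding idea (real EC schemes such as Reed-Solomon codes are more general), the sketch below adds one XOR check segment to k data segments, so that any single lost segment can be reconstructed by the receiving end. The segment size and function names are assumptions.

    #include <stdint.h>
    #include <string.h>

    #define SEG_SIZE 1024u   /* bytes per segment (assumed) */

    /* Encode: compute one parity segment over k data segments. */
    void ec_encode(uint8_t data[][SEG_SIZE], int k, uint8_t parity[SEG_SIZE])
    {
        memset(parity, 0, SEG_SIZE);
        for (int i = 0; i < k; i++)
            for (unsigned j = 0; j < SEG_SIZE; j++)
                parity[j] ^= data[i][j];
    }

    /* Recover a single lost data segment from the surviving segments plus parity. */
    void ec_recover(uint8_t data[][SEG_SIZE], int k, const uint8_t parity[SEG_SIZE], int lost)
    {
        memcpy(data[lost], parity, SEG_SIZE);
        for (int i = 0; i < k; i++) {
            if (i == lost) continue;
            for (unsigned j = 0; j < SEG_SIZE; j++)
                data[lost][j] ^= data[i][j];
        }
    }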
Fig. 1 is a schematic architecture diagram of a cloud computing system 100 according to an embodiment of the present application. As shown in fig. 1, cloud computing system 100 includes servers 101, storage pools 102, and clients 103. The client 103 is connected to the server 101 via a network, and the client 103 accesses the server 101 via the network. The server 101 may be connected to the storage pool 102 via a network. For example, the server 101 may communicate with the storage pool 102 via a network card. The server 101 may write data to the storage pool 102 or read data from the storage pool 102 according to an input command of the client 103. It should be noted that, for ease of description, only one server, one storage pool, and one client are shown in FIG. 1. Those skilled in the art will appreciate that the server 101 in FIG. 1 may comprise a collection of servers; the storage pool 102 may comprise an aggregate of storage devices, e.g., the storage pool 102 may be a distributed storage system; and the client 103 may include a plurality of clients.
It should be understood that the cloud computing system 100 in fig. 1 is only an example and not a limitation, and the cloud computing system 100 may also include other types of architectures or variants, which are suitable for the application environment of the embodiments of the present application.
Fig. 2 is a schematic diagram of an architecture of a computer system 200 according to an embodiment of the present application. As shown in fig. 2, the computer system 200 includes a first processor 10 and a network card 70. The first processor 10 is used to run Virtual Machines (VMs) and Virtual Machine Monitors (VMM). The functions of the respective modules are described below.
Virtual machine: refers to a complete computer system with full hardware system functionality that is emulated by software and runs in a completely isolated environment. Part of the virtual machine's instruction set may be processed directly on the host machine (here, the first processor described above), and the remaining instructions may be executed in an emulated manner. A processor may support one or more virtual machines, and the virtual machines and the virtual machine monitor may communicate according to a communication protocol specification. By way of example, an NVMe/Virtio front-end module is disposed in the virtual machine, and the NVMe/Virtio front-end module is used for executing the front-end portion of the NVMe protocol and/or the front-end portion of the Virtio protocol.
Virtual machine monitor: also known as a hypervisor, is system software for maintaining multiple isolated virtual machines. The virtual machine monitor is used to manage the real resources of the computer system and to provide an interface for the virtual machines. The virtual machine monitor includes an NVMe/Virtio back-end module, which is used to execute the back-end portion of the NVMe protocol and/or the back-end portion of the Virtio protocol. The hypervisor also includes a virtual block system (VBS) process module and an RDMA module. The virtual block system process module may be configured to process data written into or read from the storage pool, for example, it may perform operations such as striping, erasure code (EC) calculation, data integrity field (DIF) calculation, and encryption and decryption on the data blocks. The RDMA module is used to perform remote DMA data movement.
The first processor 10 and the network card 70 may be connected via a bus, which may be, for example, a PCIe protocol bus or a UPI protocol bus. The first processor 10 may be connected to a network through the network card 70. For example, the first processor 10 may be connected to the storage pool 102 via the network card 70 and may communicate and transfer data with the storage pool 102 through the network card. As an example, the first processor 10 may be a processor of the X86 architecture from Intel Corporation. The computer system 200 may be applied to the server 101 of the cloud computing system 100 in fig. 1.
In the computer system 200, the first processor 10 also needs to perform the storage management function for data, so its overhead is large. A cloud service provider then needs to purchase more servers to reach the computing resource capacity required by its customers, which raises cost. To solve this problem, the storage management function of the first processor 10 is usually offloaded to other lower-cost processors, such as RISC processors, to reduce the overhead of the first processor 10 and thereby save cost.
Fig. 3 is a system diagram of a computer system 300 according to yet another embodiment of the present application. As shown in fig. 3, the computer system 300 includes a first processor 10 and a second processor 20. By way of example, the first processor 10 is a CISC processor and the second processor 20 is a RISC processor. In contrast to the computer system 200 in fig. 2, the computer system 300 offloads the storage management function of the first processor 10 to the second processor 20. By way of example, the first processor 10 is an X86 architecture processor and the second processor 20 is an ARM processor.
In the computer system 300, the IO speed of data is limited by the clock frequency and instruction set of the second processor 20. For example, the computer system 300 may require 16 ARM processors to reach an IO processing speed of 1 million input/output operations per second (MIOPS). In addition, the storage management function involves various algorithms, such as an Erasure Code (EC) algorithm, a Data Integrity Field (DIF) algorithm, encryption, and decryption, and therefore involves a large amount of computation. Since the second processor 20 implements these computational tasks in software, the IO latency is large; for example, the best achievable latency is only around 150 microseconds (μs). Therefore, although using the second processor 20 to offload the storage management function of the first processor 10 can reduce the system overhead of the first processor 10, the IO performance is not ideal and may affect the optimization and upgrade of the cloud computing system.
In order to solve the above problem, an embodiment of the present application provides a computing device, which may be disposed between the first processor 10 and the second processor 20 and uses its hardware functions to execute part of the storage management function offloaded from the first processor 10, so as to improve the IO performance of data storage and reduce the IO latency. The computer system 400 and the computing device provided by the embodiment of the present application will be described in detail with reference to the accompanying drawings.
Fig. 4 is a schematic diagram of an architecture of a computer system 400 according to yet another embodiment of the present application. As shown in fig. 4, the computer system 400 includes a first processor 10, a second processor 20, a computing device 50, and a network card 70. The computing device 50 is disposed between the first processor 10 and the second processor 20, and the second processor 20 is connected to a network card 70. The second processor 20 may be connected to the network via a network card 70, for example, the second processor 20 may be connected to the storage pool 102 via the network card 70. Storage pool 102 is used to store data in the cloud computing system. For example, the storage pool 102 may be a distributed storage system. The first processor 10 is connected to its corresponding memory, and the second processor 20 is also connected to its corresponding memory.
In some examples, the first processor 10 comprises a CISC processor and the second processor 20 comprises a RISC processor. For example, the first processor 10 may be a processor of the X86 architecture and the second processor 20 may be an ARM processor. The computer system 400 may be applied to the server 101 in the cloud computing system 100 in fig. 1.
Optionally, the computing device 50 includes a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC).
In the embodiment of the present application, the computing device 50 may be configured to offload part of the storage management function of the first processor 10 and may implement that offloaded part in hardware. For example, the computing device 50 may move data between the memory of the first processor 10 and the memory of the second processor 20 via DMA; the computing device 50 may manage the quality of service (QoS) of IO commands in hardware; or the computing device 50 may implement data-related algorithms, such as EC, DIF, encryption and decryption, in hardware. The computer system and the method for accessing data according to the embodiment of the present application will be described with reference to the accompanying drawings.
Fig. 5 is a schematic diagram of an architecture of a computer system 400 according to yet another embodiment of the present application. As shown in fig. 5, the first processor 10 includes therein a virtual machine, the virtual machine includes an NVMe/Virtio front-end module, and the computing device 50 may include therein an NVMe/Virtio back-end module. The NVMe/Virtio backend module is used to interface with the NVMe/Virtio frontend module in the first processor 10. The computing device may also include an End Point (EP) module to process an interface specification under the PCIe standard to interface with the second processor 20.
In some examples, an IO processing module is also included in the computing device 50 for receiving and analyzing signaling received from the first processor or the second processor. For example, the IO processing module may be configured to analyze IO commands received from the first processor 10, which may include a write operation command and a read operation command.
In some examples, a quality of service (QoS) module is also included in the computing device 50 and is configured to implement traffic management of IO commands.
In some examples, the computing device further comprises a DMA module, and the DMA module is configured to implement data transfer between the memory of the first processor 10 and the memory of the second processor 20 by means of DMA.
In some examples, an algorithm Engine (ENG) module is further included in the computing device, and the algorithm engine module is configured to implement, in hardware, an associated algorithm for storing data, such as an EC, DIF, encryption, decryption, and the like.
FIG. 6 is a logic flow diagram of a method of accessing data in accordance with an embodiment of the present application. As shown in FIG. 6, the logical flow of accessing data involves a volume queue, a sub-command queue, and a completion queue. The queues may be disposed in a storage area of the computing device 50 or may be disposed in the memory of the second processor 20. The functions of the above-described respective queues are described as follows.
Volume queue (volume queue): the storage pool includes a plurality of logical hard disks, different logical hard disks correspond to different volume Identifiers (IDs), and each volume identifier may correspond to a volume queue. The IO command may include a volume Identifier (ID). The first processor may send the IO command generated by the virtual machine to the computing device 50, and the computing device 50 allocates the IO command to the corresponding volume queue according to the volume identifier.
Sub-command queue (sub-command queue): may refer to a queue of IO commands to be executed. The computing device 50 may select an IO command to be executed from the volume queue and assign the selected command to the sub-command queue. The sub-command queue may be regarded as the wait queue from which the second processor 20 processes IO commands; the second processor 20 may read the IO command to be processed from the sub-command queue. For example, the computing device 50 may assign IO commands to the sub-command queue based on their priority information or other information.
Completion queue (completion queue): the second processor 20 may be configured to process a write/read operation corresponding to the IO command and write a completion entry into the completion queue after completing the corresponding operation. After reading the completion entry in the completion queue, computing device 50 may feed back to the virtual machine that the IO command is complete.
Optionally, the flow of accessing the data also involves an IO control information block, which may be disposed in a storage area of the computing device 50, or may also be disposed in the memory of the second processor 20. The function of the IO control information block is described below.
IO control information block (IO control block): after receiving the IO command from the virtual machine, the computing device 50 may write information of the IO command into the IO control information block and allocate a corresponding IO command identifier. After obtaining the identification of the IO command, the second processor 20 reads the complete information of the IO command from the IO control information block by using the identification as an index. In other words, the indexes of the IO commands are transmitted in the volume queue and the sub-command queue, instead of the IO commands themselves, so that the storage burden of the system is reduced, and the signaling interaction efficiency is improved.
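By way of a non-limiting illustration only, the following C sketch shows one possible organization of the IO control information block and the identifier-only queues described above; every field name, size and type here is an assumption rather than part of the embodiment.

```c
#include <stdint.h>

#define MAX_IO_CMDS 1024          /* assumed table size */

/* One slot per outstanding IO command; holds the complete command
 * information, while the queues carry only an identifier.           */
struct io_control_block {
    uint32_t volume_id;           /* logical hard disk (volume) identifier   */
    uint8_t  is_write;            /* 1 = write operation, 0 = read operation */
    uint64_t host_addr;           /* address of the source data block in the
                                     memory of the first processor           */
    uint32_t length;              /* effective length of the data block      */
    uint8_t  need_crypto;         /* whether encryption/decryption is needed */
};

static struct io_control_block io_ctrl_table[MAX_IO_CMDS];

/* The volume queue and sub-command queue transmit only this identifier,
 * which doubles as an index into io_ctrl_table, so the command body is
 * never copied between queues.                                          */
typedef uint32_t io_cmd_id_t;

static inline struct io_control_block *lookup_io_cmd(io_cmd_id_t id)
{
    return &io_ctrl_table[id % MAX_IO_CMDS];
}
```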
Referring to fig. 6, a logic flow for accessing data in an embodiment of the present application includes the following steps.
S1, the first processor 10 generates an IO command, which includes a write operation command or a read operation command.
For example, the virtual machine of the first processor 10 may generate an IO command according to a write data request or a read data request input by a client. In the embodiment of the present application, the communication between the virtual machine and the other device may also be understood as the communication between the first processor 10 and the other device.
S2, the computing device 50 receives the IO command, writes information of the IO command into the IO control information block, and allocates a corresponding IO command identifier.
The volume queue and the sub-command queue transmit the IO command identifier, not the IO command itself. As an example, an IO processing module in the computing device 50 may be used to perform the operations in S2.
S3, computing device 50 allocates the IO command to the volume queue.
As an example, the volume identification is included in the IO command, and the computing device 50 may allocate the IO command to the corresponding volume queue according to the volume identification. Optionally, a QoS module in the computing device 50 may be used to perform the operations in S3.
S4, the computing device 50 selects an IO command to be processed from the volume queue and adds the IO command to the sub-command queue.
The sub-command queue is a waiting queue for IO commands to be processed. The computing device 50 may be used to perform QoS traffic management of IO commands, such as allocating IO commands into volume queues or allocating IO commands into sub-command queues. As an example, the QoS module in the computing device 50 may be configured to perform the operation in S4.
S5, after determining that the IO command is completed, the second processor 20 writes a completion entry corresponding to the IO command into the completion queue.
In executing the IO commands, the computing device 50 may be used to handle QoS traffic management, algorithms, and DMA data movement. The computing device 50 may implement the related algorithms for accessing data in hardware, while the second processor 20 may be configured to implement IO command parsing and splitting in software; this combination of software and hardware facilitates subsequent version upgrades and operation and maintenance. Hereinafter, the IO command parsing and splitting in the present application will be described in detail with reference to the accompanying drawings.
S6, after the computing device 50 reads a completion entry in the completion queue, it sends an IO completion command to the first processor 10 to notify the first processor 10 that the IO command is completed.
Fig. 7 is a flow chart illustrating a method 700 of accessing data according to an embodiment of the present application. The method describes a process of accessing data in a storage pool (writing data to it or reading data from it) and may be applied to the computer system 400 in fig. 4 or fig. 5. As shown in fig. 7, the method 700 includes:
S701, the computing device obtains a first IO command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool.
Alternatively, the storage pool may be the storage pool 102 of fig. 1, to which the second processor is connected via a network card, and which may include distributed storage devices.
Optionally, the generating, by the first processor, of the first IO command includes: a virtual machine in the first processor generates the first IO command. The first processor may generate the first IO command according to an input operation of a client. For example, for a write operation, the first processor may generate a write operation command and the data to be written according to input information of the client, where the data to be written is stored in the memory of the first processor. For a read operation, the first processor may generate a read operation command based on input information of the client.
For example, fig. 8 is a diagram illustrating a computer system 400 performing a write operation according to an embodiment of the present application. As shown in fig. 8, the write operation process includes two phases, in the first phase, the computing device 50 moves the data to be written from the memory of the first processor 10 to the memory of the second processor 20, and in the second phase, the second processor 20 sends the data to be written to the storage pool 102 through the network card 70.
For example, fig. 9 is a schematic diagram of a computer system 400 according to an embodiment of the present application performing a read operation. As shown in fig. 9, the read operation process includes two phases, in the first phase, the second processor 20 reads data from the storage pool 102 through the network card 70 and stores the data in the memory of the second processor 20. In the second phase, the computing device 50 moves the read data from the memory of the second processor 20 to the memory of the first processor 10.
In some examples, the first processor writes the first IO command to a submission queue, which may be disposed in the memory of the first processor 10, and the computing device 50 obtains the IO command from the submission queue. In particular, the first processor 10 may notify the computing device 50 to obtain the IO command. For example, the first processor 10 may write a corresponding entry in a doorbell register (DB), and the computing device learns, by reading the DB, that an IO command needs to be read from the submission queue. The DB may be provided in a storage area of the computing device 50.
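A minimal sketch of this notification path is given below, assuming a ring-shaped submission queue and a memory-mapped DB register; the structure layout, the register access and all names are illustrative assumptions only, not the interface of the embodiment.

```c
#include <stdint.h>

struct io_command {               /* simplified; a real NVMe/Virtio command differs */
    uint32_t volume_id;
    uint8_t  opcode;              /* e.g. write operation or read operation */
    uint64_t buf_addr;
    uint32_t length;
};

struct submission_queue {
    struct io_command *entries;   /* ring of commands in first processor memory  */
    uint32_t tail;                /* next free slot, maintained by the producer  */
    uint32_t depth;
    volatile uint32_t *doorbell;  /* DB register exposed by the computing device */
};

/* First processor side: enqueue the command, then ring the doorbell so
 * the computing device learns that a new entry waits in the queue.      */
static void submit_io_command(struct submission_queue *sq,
                              const struct io_command *cmd)
{
    sq->entries[sq->tail] = *cmd;
    sq->tail = (sq->tail + 1) % sq->depth;
    *sq->doorbell = sq->tail;     /* write the new tail into the DB register */
}
```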
S702, the computing device sends the first IO command to the second processor.
In some examples, after the computing device obtains the first IO command, the method 700 further includes: and the computing equipment adds the first IO command into a corresponding volume queue, wherein different volume queues correspond to different logical hard disks in the storage pool. The computing device selects an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command. The second processor may retrieve the first IO command from the sub-command queue.
S703, the second processor sends an instruction command to the computing device according to the first IO command, where the instruction command is used to instruct the computing device to move data.
Optionally, if the first IO command is a write operation command, the sending, by the second processor, of an instruction command to the computing device according to the first IO command includes: the second processor generates a first instruction command according to the first IO command, where the first instruction command is used to instruct the moving of the data to be written from the memory of the first processor to the memory of the second processor; and the second processor sends the first instruction command to the computing device.
Optionally, the first instruction command includes a start address and a data block length of the data to be written in the memory of the first processor 10, and further includes a start address reserved for the data to be written in the memory of the second processor.
Optionally, the second processor may send a plurality of first instruction commands to the computing device, where each first instruction command is used to move part of the data corresponding to the first IO command and may also instruct the moving of part of the data corresponding to other IO commands. The computing device may thus move the data corresponding to the first IO command from the memory of the first processor to the memory of the second processor through multiple operations.
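One possible layout of such a first instruction command (essentially a DMA descriptor) is sketched below; every field name and width is an assumption for illustration, not a definition taken from the embodiment.

```c
#include <stdint.h>

/* Sketch of a first instruction command handed to the DMA module. */
struct move_descriptor {
    uint64_t src_addr;    /* start address of the data to be written, in the
                             memory of the first processor                    */
    uint32_t block_len;   /* data block length                                */
    uint64_t dst_addr;    /* start address reserved for the data (and, in the
                             write flow, the check data) in the memory of the
                             second processor                                 */
    uint32_t io_cmd_id;   /* IO command this partial move belongs to          */
};
```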
Optionally, if the first IO command is a read operation command, the sending, by the second processor, of an instruction command to the computing device according to the first IO command includes: the second processor generates a second instruction command according to the first IO command, where the second instruction command is used to instruct the computing device to move the data to be read from the memory of the second processor to the memory of the first processor; and the second processor sends the second instruction command to the computing device.
Optionally, the second instruction command includes a start address and a data length of data in the memory of the second processor, and may further include a start address and a data length of data stored in the memory of the first processor.
Optionally, the second processor may send a plurality of second instruction commands to the computing device, where each second instruction command indicates that the moved data includes a part of data corresponding to the IO command, and the computing device may move the data from the memory of the second processor to the memory of the first processor through multiple operations.
S704, based on the instruction command, the computing device moves the data to be written from the memory of the first processor to the memory of the second processor in a DMA manner, or moves the data to be read from the memory of the second processor to the memory of the first processor in a DMA manner.
The embodiment of the present application provides a computing device, which may be disposed between a first processor and a second processor and may use hardware to implement part of the storage management function offloaded from the first processor 10, including moving data between the memory of the first processor and the memory of the second processor by DMA, so as to reduce the system overhead of the first processor and the second processor. Moreover, because the computing device implements the storage management function in hardware, signaling interaction in software can be reduced, thereby reducing the IO latency of accessing data, increasing the IO speed, and providing stable IO performance.
In the embodiment of the application, in the data writing process, the computing device obtains data from the memory of the first processor and caches it in the memory of the second processor; the bandwidth between the first processor and the computing device does not become a system bottleneck, and the computing device does not require an externally attached memory, which saves memory cost and reduces hardware area.
In the embodiment of the application, in the data reading process, the computing device obtains data from the memory of the second processor and caches it in the memory of the first processor; likewise, the bandwidth between the first processor and the computing device does not become a system bottleneck, and the computing device does not require an externally attached memory, which saves memory cost and reduces hardware area.
In the embodiment of the present application, the computing device 50 performs QoS management on the obtained IO command. Optionally, the QoS management includes allocating the IO command to the corresponding volume queue, and the QoS management further includes selecting an IO command to be processed from the volume queue and writing the selected IO command into the sub-command queue, so that the second processor 20 processes the IO command in the sub-command queue. As a specific example, the IO command may include a volume identifier, and the computing device allocates the IO command to a corresponding volume queue according to the volume identifier of the IO command, organizes information of the IO command into an IO control information block, and stores the IO control information block in the memory of the second processor 20. Optionally, the information of the IO command may include at least one of the following information: volume identification, effective length of data block, read indication information or write indication information, address information of source data block, and information whether encryption or decryption is required.
In this embodiment, the computing device may perform QoS management on the IO commands received from the first processor, for example, distribute the IO commands into different volume queues, or select IO commands from the volume queues and allocate them to a sub-command queue. In this way, part of the storage management function of the first processor or the second processor can be offloaded, so that the system overhead of the first processor and the second processor is reduced, and the storage management performance and efficiency of data access are improved.
Optionally, the moving, by the computing device, data to be written from the memory of the first processor to the memory of the second processor by a DMA method based on the instruction command includes: the computing equipment reads a data block from the memory of the first processor in a DMA mode according to the first indication command; the computing device computing a check data block of the read data block; and the computing equipment stores the read data block and the check data block into the memory of the second processor in a DMA mode.
In a specific example, the first indication command may include a start address and a data block length of a data block to be written in the first processor 10, and further include a start address of the data block and a check data block in a memory of the second processor.
In this embodiment of the present application, the check calculation may be performed while the data block is being moved from the first processor to the second processor, that is, the check calculation is completed inside the computing device. Therefore, the data does not need to be read out of the memory of the second processor again for check calculation, which eliminates one memory copy, saves memory resources, and simplifies the data access flow.
As an example, the check algorithm may employ an N + M EC algorithm, where N and M are integers greater than or equal to 1. An N + M EC algorithm means that, in the check calculation, every N data blocks correspond to M check data blocks.
Fig. 10 is a schematic diagram of the encoding and decoding flow of the EC algorithm according to the embodiment of the present application. As an example, N = 3 and M = 2, i.e., every 3 data blocks correspond to 2 check data blocks during the encoding process. In the decoding process, if no more than M data blocks and/or check blocks are lost or damaged, the remaining data blocks and check data blocks may be used to recover the data.
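To make the principle concrete, the following C sketch shows only the simplest degenerate case, a single XOR check block (N = 3, M = 1); a real N + M code with M > 1, such as the 3 + 2 layout of fig. 10, would use, for example, Reed-Solomon arithmetic, and the 8 KB block size here is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 8192   /* assumed 8 KB data block */

/* Encode: the single check block is the XOR of the three data blocks. */
static void ec_encode_xor(const uint8_t *d0, const uint8_t *d1,
                          const uint8_t *d2, uint8_t *check)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        check[i] = d0[i] ^ d1[i] ^ d2[i];
}

/* Repair: one lost data block is the XOR of the two surviving data
 * blocks and the check block.                                         */
static void ec_repair_xor(const uint8_t *a, const uint8_t *b,
                          const uint8_t *check, uint8_t *lost)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        lost[i] = a[i] ^ b[i] ^ check[i];
}
```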
FIG. 11 is a schematic diagram of an EC algorithm in a conventional storage offload scheme. As shown in fig. 11, in a conventional storage offload scheme, a data block is first moved from the memory of the first processor to the memory of the second processor. After the data has been moved, the second processor performs the EC calculation on the data block in software; at this point the data must be read out of the memory of the second processor again before the EC calculation, which requires one more memory copy and occupies memory resources.
FIG. 12 is a schematic diagram of an EC algorithm in a storage offload scheme according to an embodiment of the present application. As shown in fig. 12, in the embodiment of the present application, in the process of moving a data block from the memory of the first processor to the memory of the second processor, EC calculation may be implemented, that is, EC calculation is completed in the computing device, so that it is not necessary to read data from the memory of the second processor again in EC calculation, one memory copy is reduced, memory resources are saved, and a flow of accessing data is simplified.
Optionally, the computing device may be implemented by using an FPGA or an ASIC with a lower cost, and may implement a related algorithm in a storage function by using hardware instead of software, so that efficiency of the algorithm may be improved, IO delay of the computer system may be reduced, and IO performance of the computer system may be improved.
Optionally, if the first IO command is a write operation command, after the first IO command is acquired, the method 700 further includes: the second processor decomposes the first IO command to obtain a plurality of first sub IO commands, wherein data blocks requested to be written by different sub IO commands correspond to different physical addresses in the storage pool; the second processor determines a first stripe, where the first stripe includes at least one sub IO command in the first sub IO commands, and the first stripe further includes at least one sub IO command decomposed based on other IO commands, where a data block requested to be written by the sub IO command included in the first stripe corresponds to a same storage device in the storage pool; and the second processor sends the data corresponding to the first stripe to the same storage device in the storage pool.
Each sub IO command may be used to request to write a data block, where each data block corresponds to a physical address in the storage pool. As an example, the size of one data block may be 8 Kilobytes (KB). In a distributed storage architecture, a storage pool includes a plurality of storage devices (alternatively referred to as servers), each of which may include one or more physical hard disks, each of which includes a plurality of physical addresses for storing corresponding data blocks. The physical addresses of the data blocks corresponding to the multiple sub IO commands decomposed based on the same IO command may be located in the same storage device or may be located in different storage devices. And the physical addresses of the data blocks corresponding to different sub IO commands decomposed based on different IO commands may be located in the same storage device. Therefore, the second processor may group together a plurality of sub IO commands corresponding to the same storage device to obtain a stripe, and send data to the same storage device in the storage pool on the basis of the stripe, so as to improve efficiency of data access.
The above manner of aggregating, into one stripe, a plurality of sub IO commands corresponding to the same storage device and performing data access on them together may be referred to as hash striping. The striping manner of the method of accessing data in the embodiment of the present application will be described further below with reference to fig. 13.
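Before that, a rough, non-authoritative C sketch of the grouping idea is given below; the stripe width of 32, the structures and the address-to-device mapping are all assumptions introduced for illustration.

```c
#include <stdint.h>

#define STRIPE_WIDTH 32            /* e.g. 32 sub IO commands per stripe */

struct sub_io_cmd {
    uint64_t physical_addr;        /* physical address of the data block in the
                                      storage pool                              */
    uint32_t parent_io_id;         /* IO command this sub-command came from     */
};

struct stripe {
    uint32_t device_id;            /* all members target the same storage device */
    struct sub_io_cmd cmds[STRIPE_WIDTH];
    uint32_t count;
};

/* Placeholder mapping from a physical address to a storage device. */
static uint32_t addr_to_device(uint64_t physical_addr)
{
    return (uint32_t)(physical_addr >> 30);   /* illustrative rule only */
}

/* Append a sub IO command to the open stripe of its target device.
 * Returns 1 when the stripe is full; the caller then sends the stripe to
 * that storage device as a single message and resets the stripe.          */
static int stripe_add(struct stripe *open_stripes, uint32_t num_devices,
                      const struct sub_io_cmd *cmd)
{
    uint32_t dev = addr_to_device(cmd->physical_addr);
    struct stripe *s = &open_stripes[dev % num_devices];

    s->device_id = dev;
    s->cmds[s->count++] = *cmd;
    return s->count == STRIPE_WIDTH;
}
```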
Optionally, if the first IO command is a read operation command, the method 700 further includes: the second processor generates a read data request command according to the first IO command, wherein the read data request command is used for requesting data to be read from the storage pool; the second processor sending the read data request command to the memory pool; and the second processor acquires the data to be read from the storage pool.
Optionally, the generating, by the second processor, a read data request command according to the first IO command includes: the second processor decomposes the first IO command to obtain a plurality of second IO sub-commands, wherein data blocks requested to be acquired by different IO sub-commands correspond to different physical addresses in the storage pool; the second processor determines a second stripe, where the second stripe includes at least one second sub IO command of the second sub IO commands, and the second stripe further includes at least one sub IO command obtained by decomposition based on other IO commands, where a data block requested to be obtained by the sub IO command in the second stripe corresponds to a same storage device in the storage pool; and the second processor generates the read data request command according to the second stripe, wherein the read data request command is used for requesting the storage pool for data to be read corresponding to the second stripe.
Similar to the write data flow, each sub IO command may be used to read one data block, where each data block corresponds to one physical address in the storage pool. In a distributed storage architecture, a storage pool includes a plurality of storage devices (alternatively referred to as servers), each of which may include one or more physical hard disks, each of which includes a plurality of physical addresses for storing corresponding data blocks. The physical addresses of the data blocks corresponding to the multiple sub IO commands decomposed based on the same IO command may be located in the same memory device or may be located in different memory devices. And the physical addresses of the data blocks corresponding to different sub IO commands decomposed based on different IO commands may be located in the same storage device. Therefore, the second processor may group together a plurality of sub IO commands corresponding to the same storage device to obtain a stripe, and read data from the storage pool based on the stripe, so as to improve efficiency of accessing the data.
Fig. 13 is a signaling processing flow diagram of a method 1300 of accessing data according to an embodiment of the present application. As shown in fig. 13, the method 1300 includes:
S1301, the computing device receives an IO command sent by the first processor.
For example, a first processor writes IO commands in a commit queue and a computing device reads IO commands from the commit queue.
S1302, the computing device analyzes the IO command and distributes the IO command to the volume queue.
For example, the computing device allocates the IO command to the corresponding volume queue according to the volume identifier of the IO command. And storing the key information corresponding to the IO command in an IO control information block, and using the identification (or index) of the IO command in the volume queue and the sub-command queue.
S1303, the computing device selects the IO command to be processed from the volume queue and distributes the IO command to the sub-command queue.
For example, the IO commands in the sub-command queue are for invocation by a second processor.
S1304, the second processor obtains the IO command from the sub-command queue and analyzes the IO command.
Optionally, parsing the IO command may include the following operations: decomposing the IO command into a plurality of sub IO commands; and performing hash striping on the sub IO commands.
In some examples, after the second processor obtains the IO command, the IO command may be decomposed into a plurality of sub IO commands. Each sub-IO command corresponds to a data block, each data block corresponds to a physical address in the storage pool, and different data blocks correspond to different physical addresses in the storage pool. The storage pool may include a plurality of storage devices, each storage device includes one or more physical hard disks, and each physical hard disk includes a plurality of physical addresses for storing corresponding data blocks.
In some examples, the second processor performs hash striping on the sub IO commands, i.e., aggregates a plurality of sub IO commands corresponding to the same storage device into one stripe. The sub IO commands in a stripe may come from different IO commands, and one stripe may correspond to one message or packet. As an example, 32 sub IO commands may be included in one stripe. In the write operation process, the second processor may send the data blocks corresponding to the sub IO commands in one stripe to the storage pool, as one message, through the network card. In the read operation process, the second processor may take the sub IO commands in one stripe as one aggregate and send a read data request command to the storage pool to request to read the data blocks corresponding to those sub IO commands.
S1305, the second processor sends an instruction command to the computing device to instruct the computing device to move the data.
For example, in a write operation flow, the second processor generates a first instruction command according to the first IO command, where the first instruction command is used to instruct to move data from the memory of the first processor to the memory of the second processor; the second processor sends the first indication command to the computing device.
For another example, in a read operation flow, the second processor generates a read data request command according to the first IO command, where the read data request command is used to request the storage pool for data to be read; the second processor sends the read data request command to the storage pool through a network card; the second processor acquires the data to be read from the storage pool through a network card; after the data to be read is acquired, the second processor sends a second instruction command to the computing device, where the second instruction command is used to instruct the computing device to move the data to be read from the memory of the second processor to the memory of the first processor.
And S1306, the computing equipment executes data moving in a DMA mode according to the instruction command.
For example, a DMA module in the computing device may be used to perform the data movement.
For example, in a write operation flow, the computing device reads a data block from the memory of the first processor in a DMA manner according to the first instruction command; the computing device computes a check data block of the read data block; and the computing device stores the read data block and the check data block into the memory of the second processor in a DMA manner. The second processor then sends the data block and the check data block to the storage pool through the network card, and the write operation process is completed.
For another example, in the read operation flow, the computing device moves the data from the memory of the second processor to the memory of the first processor in a DMA manner according to the second instruction command, and the read operation process is completed.
In the embodiment of the application, a matched signaling flow is provided for the computer system, so that the computing device can realize the unloaded partial storage management function of the first processor, and the data access efficiency of the computer system can be improved.
The following will continue to describe a specific flow of the computing device executing the write operation method and the read operation method in conjunction with fig. 14 to 17.
FIG. 14 is a flowchart illustrating a write operation method 1400 according to an embodiment of the disclosure. The write operation flow includes two phases. In the first phase, the computing device fetches the data blocks from the memory of the first processor, calculates the check data, and caches the data blocks and check data blocks in the memory of the second processor. In the second phase, the second processor performs fragmentation, organizes the message headers, and sends the messages to the storage pool through the network card to complete the data writing operation. S1401 to S1410 describe the processing flow of the first phase, and S1411 to S1413 describe the processing flow of the second phase. The method 1400 is described as follows.
S1401, the first processor generates an IO command, and the IO command is a write operation command.
S1402, the computing device obtains the IO command.
S1403, the computing device allocates the IO command to the volume queue.
S1404, the computing device selects an IO command from the volume queue and adds it to the sub-command queue.
S1405, the second processor acquires the IO command from the sub-command queue.
For brevity, specific contents in S1401-S1405 may refer to specific descriptions in fig. 7 to fig. 12, and are not described herein again.
S1406, the second processor generates a first instruction command according to the IO command, where the first instruction command is used to instruct to move the data block from the memory of the first processor to the memory of the second processor.
Optionally, in the write operation flow, the second processor may perform hash striping on the sub IO commands and generate the first instruction command according to the result of the hash striping. One stripe corresponds to one message or packet, and the data blocks corresponding to the sub IO commands in one stripe are used for EC calculation.
FIG. 15 is a schematic diagram illustrating hash striping of sub IO commands according to an embodiment of the present application. The computing device may perform the N + M EC computation according to the hash striping result of the sub IO commands. As an example, in fig. 15, N is 3 and M is 2. Each row in the EC calculation corresponds to N sub IO commands, and the data blocks corresponding to these N sub IO commands can be used for one EC calculation and generate M check data blocks. For the N + M EC algorithm, the data blocks may be filled into N queues according to a certain rule, where the N queues may be N0, N1, and N2 in fig. 15. The EC algorithm is then used to calculate the check data blocks of the M queues, which may be M0 and M1 in FIG. 15.
In the longitudinal direction, each of the N queues may be referred to as a stripe: a plurality of sub IO commands are combined into one stripe, and each queue corresponds to one message or packet. As an example, each stripe includes 32 sub IO commands; in other words, the data blocks corresponding to the 32 sub IO commands are packed together to form one message and sent. After the corresponding check data blocks have been calculated, each of the M queues may also form a message and be sent.
S1407, the computing device receives the first instruction command from the second processor.
As a specific example, the first instruction command may be transmitted between the computing device and the second processor by way of a command queue. The second processor writes the first instruction command into the command queue, and the computing device takes the command out of the command queue and parses it.
Optionally, the first instruction command includes address information of the data block in the memory of the first processor, and this address information may include a start address and a data block length. The first instruction command further includes address information of the data block and the check data block in the memory of the second processor, which may include a start address and a data length.
Optionally, the second processor may generate the first instruction command according to the result of the hash striping. The data blocks that the first instruction command instructs to move may include the data blocks corresponding to the N sub IO commands of each stripe. After reading the N data blocks, the computing device may perform the EC calculation to obtain the M check data blocks.
Optionally, the second processor may send a plurality of first instruction commands to the computing device, where each first instruction command is used to move a part of data corresponding to the first IO command, and the first instruction command may also be used to instruct to move data corresponding to other IO commands. The computing device may move data corresponding to the first IO command from the memory of the first processor to the memory of the second processor through multiple operations.
And S1408, reading the data block from the memory of the first processor by the computing equipment according to the first instruction command.
As a specific example, the first indication command includes address information of the data block in a memory of the first processor, and the computing device may read the data block from the memory of the first processor according to the first indication command.
And S1409, the computing equipment carries out check computation on the read data block to obtain a check data block.
As a specific example, the computing device may compute the check data block according to the hash striping result in fig. 15.
And S1410, the computing device stores the data block and the check data block in the memory of the second processor.
As a specific example, the first indication command further includes address information of the data block and the check data block in the memory of the second processor, and the computing device may store the data block and the check data block in specified locations in the memory of the second processor.
S1411, the second processor generates a message according to the result of the hash striping.
For example, according to the hash striping result in S1406, the second processor may compose a message once the data blocks in one stripe are complete, and the message may be sent to the storage pool through the network card.
FIG. 16 is a schematic diagram of hash striping of data blocks according to an embodiment of the present application. As shown in fig. 16, after acquiring the data blocks and the check data blocks, the second processor may assemble the data blocks or the check data blocks corresponding to each stripe into a message and add a message header, which may include control information, to the message. For example, the messages may be denoted as MSG 0, MSG 1, MSG 2, MSG 3, and MSG 4. The second processor may send the messages to the storage pool via the network card. In a distributed storage system, the message corresponding to each stripe may be sent to a different destination in the storage pool.
And S1412, the second processor instructs the network card to send the message.
As a specific example, the second processor may write a Work Queue Entry (WQE) corresponding to the packet in the submission queue, and notify the network card to read the work queue entry. The commit queue may be located in memory of the second processor. Optionally, the WQE is configured to indicate address information of data corresponding to the packet in a memory of the second processor, where the address information may include a start address, a data length, and other information.
S1413, the network card reads the message from the memory of the second processor and sends the message to the storage pool.
As a specific example, after reading the corresponding work queue entry, the network card may read a message from the memory of the second processor according to the address information in the work queue entry, and send the message to the storage pool.
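As a non-authoritative sketch of the WQE handshake described in S1412 and S1413, the following C fragment shows one possible shape of the entry and the notification; all field names, the queue layout and the doorbell mechanism are assumptions for illustration only.

```c
#include <stdint.h>

/* Work queue entry describing one stripe's message in the memory
 * of the second processor.                                         */
struct wqe {
    uint64_t msg_addr;            /* start address of the message          */
    uint32_t msg_len;             /* header plus data/check blocks         */
    uint32_t stripe_id;           /* which stripe this message carries     */
};

struct nic_send_queue {
    struct wqe *entries;
    uint32_t tail;
    uint32_t depth;
    volatile uint32_t *doorbell;  /* register the network card watches     */
};

/* Second processor side: post a WQE and notify the network card, which
 * then reads the message from the indicated address and sends it to the
 * storage pool.                                                           */
static void post_send(struct nic_send_queue *q, const struct wqe *e)
{
    q->entries[q->tail] = *e;
    q->tail = (q->tail + 1) % q->depth;
    *q->doorbell = q->tail;
}
```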
FIG. 17 is a flowchart illustrating a method 1700 of reading data operations according to an embodiment of the present application. As shown in FIG. 17, a read data operation typically includes two phases. In the first stage, the second processor receives the data blocks sent by the storage pool through the network card and stores the data blocks in the memory of the second processor. In the second stage, the second processor sends a second instruction command to the computing device, and the computing device moves the data block from the memory of the second processor to the memory of the first processor. Among them, S1701 to S1709 describe a first stage, and S1710 to S1714 describe a second stage.
S1701, the first processor generates an IO command which is a read data command.
S1702, the computing device obtains the IO command.
S1703, the computing device allocates the IO command to the volume queue.
S1704, the computing device selects an IO command from the volume queue to add into the sub-command queue.
S1705, the second processor acquires the IO command from the sub-command queue.
For brevity, the specific contents in S1701-S1705 can be referred to the specific description in fig. 7, and are not described herein again.
And S1706, the second processor generates a read data request command according to the IO command, wherein the read data request command is used for requesting the data to be read from the storage pool.
Optionally, in the process of reading data, after the second processor acquires the IO command, it may decompose the IO command into a plurality of sub IO commands, perform hash striping on the sub IO commands, and generate the read data request command according to the result of the hash striping. The plurality of sub IO commands in one stripe may correspond to data in the same storage device in the storage pool. The read data request command may be used to request reading of the data corresponding to one stripe.
S1707, the second processor sends a read data request command to the network card.
S1708, the network card receives data from the storage pool and stores the data into the memory of the second processor.
Optionally, the second processor may send, to the network card, address information of a memory space prepared in a memory of the second processor for the data, where the address information may include a start address and a data length of the data. After receiving the data, the network card may move the data to a corresponding space in the memory of the second processor.
S1709, the network card informs the second processor that the data sent by the storage pool has been received.
As a specific example, the network card notifies the second processor that the reception of data is complete by adding a completion entry to the first completion queue. The network card sends an interrupt signal to the second processor after adding the completion entry. After receiving the interrupt signal, the second processor may determine, by querying the first completion queue, that the memory of the second processor has received the data to be read. The first completion queue may refer to a queue for recording completed read data request commands.
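A minimal sketch, assuming a valid flag in each completion entry, of how the second processor might drain the first completion queue after the interrupt is shown below; none of these structures, fields or names are taken from the embodiment itself.

```c
#include <stdint.h>

struct completion_entry {
    uint32_t request_id;          /* which read data request command finished */
    uint8_t  valid;               /* set by the network card when written     */
};

struct completion_queue {
    struct completion_entry *entries;
    uint32_t head;                /* next entry the second processor inspects */
    uint32_t depth;
};

/* Called from the interrupt handler on the second processor: consume every
 * entry the network card has marked valid, so the data now sitting in the
 * memory of the second processor can be handed to the computing device.    */
static void drain_completions(struct completion_queue *cq,
                              void (*on_done)(uint32_t request_id))
{
    while (cq->entries[cq->head].valid) {
        on_done(cq->entries[cq->head].request_id);
        cq->entries[cq->head].valid = 0;      /* return the slot */
        cq->head = (cq->head + 1) % cq->depth;
    }
}
```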
And S1710, the computing device acquires a second instruction command generated by the second processor, wherein the second instruction command is used for instructing the computing device to move data from the memory of the second processor to the memory of the first processor.
As a specific example, the second instruction command includes address information of data in the memory of the second processor, such as a start address and a data length. The second instruction command further includes address information of the data stored in the memory of the first processor, for example, a start address and a data length.
And S1711, the computing equipment moves the data from the memory of the second processor to the memory of the first processor according to the second instruction command.
Optionally, the second processor may send a plurality of second instruction commands to the computing device, where the data moved by each second instruction command includes a part of data corresponding to the IO command, and the computing device may move the data from the memory of the second processor to the memory of the first processor through multiple operations.
S1712, the computing device informs the second processor that the second instruction command indicates the moved data.
Optionally, after the computing device completes the second instruction command, a completion entry may be added in the second completion queue to notify the second processor of completion of the data movement corresponding to the second instruction command. The computing device sends an interrupt signal to the second processor after adding the completion entry. The second processor, upon receiving the interrupt signal, may determine that the second instruction command is complete by querying a second completion queue.
And S1713, the second processor sends a read operation completion command to the computing device to indicate that the computing device has completed the read operation corresponding to the IO command.
As a specific example, the second processor may determine whether the computing device completes moving of all data blocks corresponding to the IO command according to a completion entry written by the computing device into the second completion queue. And if the read operation is finished, determining that the read operation corresponding to the IO command is finished.
As a specific example, the read operation completion command includes a completion entry corresponding to the IO completion command in the third completion queue. The third completion queue may be located in the memory of the second processor. After determining that the read operation corresponding to the data corresponding to the IO command is completed, the second processor writes a completion entry corresponding to the IO command into a third completion queue, and the computing device reads the third completion queue and determines that the read operation corresponding to the IO command is completed.
S1714, the computing device sends an IO completion command to the first processor, where the IO completion command is used to indicate that the read operation corresponding to the IO command has been completed.
As a specific example, the IO completion command may include a completion entry corresponding to the IO completion command in a fourth completion queue, where the fourth completion queue may be located in the memory of the first processor. After the computing device determines that the read operation corresponding to the IO command is completed, the completion entry corresponding to the IO command may be written in a fourth completion queue, and the first processor reads the corresponding completion entry from the fourth completion queue, thereby determining that the read operation corresponding to the IO command is completed.
Next, with reference to the drawings, a data caching system in the embodiment of the present application will be described continuously, and in the embodiment of the present application, the data caching system may also be referred to as a queue ring cache structure or an Extreme Simple Ring Interface (ESRI). By way of example and not limitation, each command queue referred to in this application embodiment may use the data cache system described in this application embodiment to store IO commands, for example, each command queue described above may include: volume queue, subcommand queue, completion queue in FIG. 6; or the commit queue of FIG. 7, the commit queue of FIG. 13; or the first through fourth completion queues in fig. 17, etc.
Fig. 18 is a schematic structural diagram of a data cache system according to an embodiment of the present application. As shown in fig. 18, the data caching system includes a cache space, a producer control unit, and a consumer control unit. Wherein the cache space is used for storing data, and the producer control unit and the consumer control unit can both access the cache space. The producer control unit is configured to execute a producer thread and the consumer control unit is configured to execute a consumer thread. The functions of the producer control unit and the consumer control unit may be implemented by software. In the embodiment of the present application, the upper layer software for writing commands to the data cache system may be referred to as a producer or a producer thread, and the upper layer software for reading commands from the data cache system may be referred to as a consumer or a consumer thread.
In some examples, the producer control unit and the consumer control unit may be located in different processing devices if a producer thread and a consumer thread are executed by different processing devices. For example, the functions of the producer control unit may be performed by the first processor 10 in FIG. 5, and the functions of the consumer control unit may be performed by the computing device 50 in FIG. 5. Alternatively, if the producer thread and the consumer thread are executed by the same processor, the functions of the producer control unit and the consumer control unit may be executed by the same processing device.
Fig. 19 is a schematic diagram illustrating an operating state of a data caching system according to an embodiment of the present application. As shown in fig. 19, the data caching system includes a cache space, where the cache space includes K address ranges, where the K address ranges are respectively used to store K entries, and each entry includes an owner flag bit, where the owner flag bit is used to indicate whether data is stored in each entry. K is an integer greater than 0, and by way of example, K in fig. 19 is 16. The owner zone bit is used for recording a storage identifier or an idle identifier, the storage identifier is used for indicating that data are stored in the corresponding entry, and the idle identifier is used for indicating that no data are stored in the corresponding entry.
It should be noted that, in the embodiment of the present application, the data stored in each entry may be a command, such as an IO command, or may also be other types of data besides a command.
By way of example, fig. 20 is a schematic structural diagram of an entry in the data cache system according to an embodiment of the present application. As shown in fig. 20, the entry includes two field segments: an owner (ownership) flag bit and a command field. The owner flag bit may serve as an indication passed between the producer and the consumer. For example, each time the upper-layer software writes a command into an entry of the data cache system, the owner flag bit of that entry may be set to 1, where 1 indicates that a command is stored in the entry; conversely, the value of the owner flag bit may be set to 0, where 0 indicates that no command is stored in the entry. When reading a command, the upper-layer software can judge, from the value of the owner flag bit, whether a command is stored in the entry. Alternatively, the command field segment in fig. 20 may also be replaced with a data field segment.
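Purely as an illustrative sketch, the entry of fig. 20 might be declared in C as follows; the field widths and names are assumptions, only the owner-flag idea comes from the description above.

```c
#include <stdint.h>

/* One entry of the queue ring cache: an owner flag bit plus the
 * command (or data) field segment of fig. 20.                      */
struct esri_entry {
    uint8_t  ownership;   /* 1 = a command is stored, 0 = the slot is free */
    uint64_t command;     /* the command itself (or other data)            */
};
```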
With continued reference to FIG. 19, the data caching system maintains the cache space using a producer pointer (PI) and a consumer pointer (CI), where the producer pointer points to the address range, among the K address ranges, into which data is to be written next, and the consumer pointer points to the address range from which data is to be read next. In the embodiment of the present application, writing data may refer to writing data into an entry, and reading data may refer to reading data from an entry.
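For illustration only, the entry layout of fig. 20 and the queue state of fig. 19 might be modelled in C roughly as follows; the field widths, the field name command, and K = 16 are assumptions taken from the figures rather than requirements of the design, and the PLoop/CLoop ring flag bits included here are described further below.

```c
#include <stdint.h>
#include <string.h>

#define K 16                      /* number of entries; fig. 19 uses 16 */

/* One entry: an ownership flag bit plus a command (or data) field, as in fig. 20. */
struct cache_entry {
    uint8_t ownership;            /* storage identifier vs. idle identifier   */
    uint8_t command[63];          /* command/data payload; width is assumed   */
};

/* The cache space plus the pointers/registers maintained around it (fig. 19). */
struct data_cache {
    struct cache_entry slots[K];  /* K address ranges, one entry each         */
    uint32_t pi;                  /* producer pointer: next slot to write     */
    uint32_t ci;                  /* consumer pointer: next slot to read      */
    uint32_t ploop;               /* producer ring flag bit (described below) */
    uint32_t cloop;               /* consumer ring flag bit (described below) */
};

/* Initial state as described for fig. 21: everything zeroed. */
static void cache_init(struct data_cache *q)
{
    memset(q, 0, sizeof(*q));
}
```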
Wherein the data caching system may maintain a register corresponding to the producer pointer and a register corresponding to the consumer pointer, respectively. In some examples, the producer pointer and the consumer pointer may be maintained by different processing devices if a producer thread and a consumer thread are executed by different processing devices. For example, for the commit queue in FIG. 7, the producer pointer may be maintained by the first processor 10 in FIG. 5 and the consumer pointer may be maintained by the computing device 50. Or for the sub-command queue in fig. 6, the producer pointer may be maintained by the computing device 50 in fig. 5 and the consumer pointer may be maintained by the second processor 20. Alternatively, if the producer and consumer threads are executed by the same processing device, the producer and consumer pointers may be maintained by the same processing device. For example, for the volume queue in FIG. 6, its producer and consumer pointers may be maintained by the computing device 50.
Optionally, the producer control unit is configured to perform the following operations: determining a first address range indicated by the producer pointer; and writing data in the first address range, and updating a first owner flag bit in the first address range to the storage identifier.
Optionally, the consumer control unit is configured to perform the following operations: determining a second address range indicated by the consumer pointer; and determining, based on a second owner flag bit in the second address range, whether to read data from the second address range.
For example, in the case that the second owner flag bit in the second address range is recorded as the storage identifier, data is read from the second address range; and in the case that the second owner flag bit is recorded as the idle identifier, it is determined that no data is stored in the second address range.
Wherein the producer control unit is configured to execute a producer thread and the consumer control unit is configured to execute a consumer thread. The functions of the producer control unit and the consumer control unit may be implemented by software. In some examples, the producer control unit and the consumer control unit may be located in different processing devices if a producer thread and a consumer thread are executed by different processing devices. For example, for the commit queue in FIG. 7, the producer thread may be executed by first processor 10 in FIG. 5 and the consumer thread may be executed by computing device 50. For the sub-command queue in FIG. 6, the producer thread may be executed by computing device 50 in FIG. 5 and the consumer thread may be executed by second processor 20. Alternatively, if the producer thread and the consumer thread are executed by the same processor, the functions of the producer control unit and the consumer control unit may be executed by the same processing device. For example, for the volume queue in FIG. 6, its producer and consumer threads may be executed by the computing device 50.
In the embodiment of the application, the owner flag bit in the entry stored in each address range of the data cache system is configured to indicate whether data is stored in the corresponding entry, so that upper-layer software can determine whether data is stored in an entry by reading the storage identifier or the idle identifier recorded in the owner flag bit. The data cache system can therefore reduce interaction between the data writing thread and the data reading thread, and even allow them to run independently, thereby simplifying the management process of the data cache system.
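A minimal sketch of this basic mechanism, assuming 1 is used as the storage identifier and 0 as the idle identifier. The reset of the flag in basic_consume is an extra assumption needed only because this fragment omits the ring flag bits introduced below; with the ring flags, the consumer never writes the flag at all.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define K 16
struct cache_entry { uint8_t ownership; uint8_t command[63]; };
struct data_cache  { struct cache_entry slots[K]; uint32_t pi, ci; };

/* Producer: write a command into the slot indicated by PI and mark it stored. */
static void basic_produce(struct data_cache *q, const uint8_t *cmd, size_t len)
{
    struct cache_entry *e = &q->slots[q->pi];
    memcpy(e->command, cmd, len);
    e->ownership = 1;                   /* storage identifier */
    q->pi = (q->pi + 1) % K;
}

/* Consumer: read the slot indicated by CI only if it is marked as stored. */
static bool basic_consume(struct data_cache *q, uint8_t *out, size_t len)
{
    struct cache_entry *e = &q->slots[q->ci];
    if (e->ownership != 1)              /* idle identifier: nothing to read */
        return false;
    memcpy(out, e->command, len);
    e->ownership = 0;                   /* assumed reset; not needed once the
                                           ring flags described below are used */
    q->ci = (q->ci + 1) % K;
    return true;
}
```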
Optionally, with continued reference to fig. 19, the data cache system further includes a producer loop (PLoop) register for storing a producer ring flag bit, which records whether the number of cycles of the producer pointer over the K address ranges is odd or even. For example, PLoop = 0 may represent an odd number of cycles and PLoop = 1 an even number of cycles; alternatively, PLoop = 1 may represent an odd number and PLoop = 0 an even number.
The producer control unit is configured to specifically perform the following operations: determining that the number of cycles indicated by the producer ring flag bit is odd; and writing data in the first address range and updating a first owner flag bit in the first address range from a second value to a first value. Alternatively, the producer control unit is configured to specifically perform the following operations: determining that the number of cycles indicated by the producer ring flag bit is even; and writing data in the first address range and updating the first owner flag bit in the first address range from the first value to the second value.
It is to be understood that, for an owner flag bit in the address range indicated by the producer pointer, the first value represents the storage identifier and the second value represents the idle identifier when the number of cycles indicated by the producer ring flag bit is odd. When the number of cycles indicated by the producer ring flag bit is even, the second value represents the storage identifier and the first value represents the idle identifier.
Optionally, in this embodiment of the application, specific values of the first value and the second value are not limited as long as the specific values can respectively represent the storage identifier and the idle identifier. Alternatively, the first value and the second value may each be represented by a binary number. For example, the first value is 1, and the second value is 0; alternatively, the first value is 0 and the second value is 1.
The value of the owner flag bit is thus determined based on the value of the producer ring flag bit, and the meaning of that value differs according to whether the number of cycles of the producer pointer is odd or even. Therefore, when the upper layer software writes data into an entry in the data cache system, the value of the owner flag bit of the entry can be determined according to the current value of PLoop. This allows the upper layer software, when reading data, to determine whether the owner flag bit value indicates that data is stored or not stored.
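A sketch of the producer side with the producer ring flag bit, assuming PLoop = 0 denotes an odd-numbered pass, the first value is 1 and the second value is 0; the structure definitions are restated so the fragment stands alone, and overflow avoidance is left to upper-layer software, as in the text.

```c
#include <stdint.h>
#include <string.h>

#define K 16
struct cache_entry { uint8_t ownership; uint8_t command[63]; };
struct data_cache  { struct cache_entry slots[K]; uint32_t pi, ci, ploop, cloop; };

/* Value that means "stored" for the current producer pass:
 * 1 on odd-numbered passes (PLoop == 0), 0 on even-numbered passes (PLoop == 1). */
static uint8_t produced_value(uint32_t ploop)
{
    return ploop == 0 ? 1u : 0u;
}

/* Producer: write a command into the slot indicated by PI, set its ownership
 * flag according to the current PLoop parity, then advance PI and toggle PLoop
 * on wrap-around. Keeping at most K commands outstanding is assumed to be
 * guaranteed by the upper-layer software. */
static void produce(struct data_cache *q, const uint8_t *cmd, size_t len)
{
    struct cache_entry *e = &q->slots[q->pi];
    memcpy(e->command, cmd, len);
    e->ownership = produced_value(q->ploop);

    if (++q->pi == K) {          /* finished one full pass over the K slots */
        q->pi = 0;
        q->ploop ^= 1u;          /* odd pass <-> even pass                  */
    }
}
```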
In the embodiment of the application, the upper layer software determines the owner flag bit value corresponding to the storage identifier according to the parity of the cycle count of the producer pointer, so that when reading data, the upper layer software can determine whether an entry stores data according to the parity of the cycle count of the consumer pointer and the value of the owner flag bit.
With continued reference to FIG. 19, the data caching system further includes a consumer loop (CLoop) register for storing a consumer ring flag bit, which records whether the number of cycles of the consumer pointer over the K address ranges is odd or even. For example, CLoop = 0 may represent an odd number of cycles and CLoop = 1 an even number of cycles; alternatively, CLoop = 1 may represent an odd number and CLoop = 0 an even number.
The consumer control unit is configured to specifically perform the following operations: reading data from the second address range in the case that the second owner flag bit in the second address range is recorded as the storage identifier, which includes: determining that the number of cycles indicated by the consumer ring flag bit is odd, and reading data from the second address range if the second owner flag bit is recorded as the first value. Alternatively, the consumer control unit is configured to specifically perform the following operations: determining that the number of cycles indicated by the consumer ring flag bit is even, and reading data from the second address range if the second owner flag bit is recorded as the second value.
Optionally, the first value is 1 and the second value is 0; alternatively, the first value is 0 and the second value is 1.
In the embodiment of the application, the upper layer software determines, according to the parity of the cycle count of the consumer pointer recorded in the consumer ring register, whether the value of the owner flag bit in the entry indicated by the consumer pointer represents the storage identifier or the idle identifier, and thereby judges whether data is stored in the entry.
For example, when the number of cycles of the producer pointer is odd, PLoop = 0, and when the data cache system writes a command in the address range pointed to by the producer pointer, the owner flag bit is set to 1; in this case 1 indicates that a command is stored in the entry corresponding to the owner flag bit, and 0 indicates that no command is stored in that entry. When the number of cycles of the producer pointer is even, PLoop = 1, and when the data cache system writes a command in the address range pointed to by the producer pointer, the owner flag bit is set to 0; in this case 0 indicates that a command is stored in the entry corresponding to the owner flag bit, and 1 indicates that no command is stored in that entry.
Optionally, when reading a command, the consumer determines the meaning of the current owner flag bit value based on the value of the consumer ring flag bit. For example, when the number of cycles of the consumer pointer is odd, CLoop = 0; when a command is read according to the consumer pointer, an owner flag bit value of 1 indicates that a command is stored in the entry and the upper layer software may read it, while a value of 0 indicates that no command is stored in the entry and the upper layer software does not read it. When the number of cycles of the consumer pointer is even, CLoop = 1; when a command is read according to the consumer pointer, an owner flag bit value of 0 indicates that a command is stored in the entry and the upper layer software may read it, while a value of 1 indicates that no command is stored in the entry and the upper layer software does not read it.
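The matching consumer side under the same assumptions (CLoop = 0 for an odd-numbered pass, first value 1, second value 0); note that the consumer only compares the flag against the value expected for its current pass and never writes it back.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define K 16
struct cache_entry { uint8_t ownership; uint8_t command[63]; };
struct data_cache  { struct cache_entry slots[K]; uint32_t pi, ci, ploop, cloop; };

/* Ownership value that means "stored" for the consumer's current pass. */
static uint8_t expected_value(uint32_t cloop)
{
    return cloop == 0 ? 1u : 0u;
}

/* Consumer: if the slot indicated by CI carries the ownership value expected
 * for the current CLoop parity, the command has been produced and can be read;
 * otherwise nothing is stored there yet. CLoop toggles when CI wraps. */
static bool consume(struct data_cache *q, uint8_t *out, size_t len)
{
    struct cache_entry *e = &q->slots[q->ci];
    if (e->ownership != expected_value(q->cloop))
        return false;                     /* entry not produced yet      */

    memcpy(out, e->command, len);
    if (++q->ci == K) {
        q->ci = 0;
        q->cloop ^= 1u;                   /* finished consuming one pass */
    }
    return true;
}
```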
Optionally, the number K of address ranges included in the data cache system in this embodiment of the present application may be 2^x, where x is a positive integer greater than or equal to 1.
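One plausible reason for requiring K to be a power of two, not stated explicitly in the text and therefore only an assumption, is that pointer wrap-around can then be done with a bit mask rather than a comparison or modulo; a small illustration:

```c
#include <stdint.h>

#define K 16u                        /* must be a power of two for the mask trick */

/* Advance an index over K slots without a modulo; toggle the ring flag on wrap. */
static uint32_t advance(uint32_t idx, uint32_t *loop_flag)
{
    uint32_t next = (idx + 1u) & (K - 1u);
    if (next == 0u)                  /* wrapped back to slot 0: one pass finished */
        *loop_flag ^= 1u;
    return next;
}
```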
In some examples, the data caching system may be managed by a producer and a consumer. The producer may refer to an upper-layer software module that writes entries, and the consumer may refer to an upper-layer software module that reads entries. The producer maintains a producer pointer indicating the position it has produced up to, and a producer ring flag bit indicating that the entries of one round have been produced. The consumer maintains a consumer pointer indicating where to read the next entry, and a consumer ring flag bit indicating that the entries of one round have been consumed. In the embodiments of the present application, producing an entry refers to writing a command into an entry of the data caching system, and consuming an entry refers to reading a command from an entry of the data caching system.
In the embodiment of the present application, the upper layer software can ensure that the data cache system does not overflow; for example, the producer can ensure that at most K commands are outstanding in the data cache system, so the producer pointer never overtakes the consumer pointer and overflow does not occur. A register indicating the full state and a register indicating the empty state therefore need not be provided in the data cache system, where the full state means that the number of unread entries in the data cache system has reached K, and the empty state means that the number of entries holding written commands is 0. This way of defining the data cache system can simplify its management process.
Fig. 21 is a schematic diagram of an operating state of a data caching system according to an embodiment of the present application. FIG. 21 shows the process of the producer pointer cycling from the first round to the second round. Writing a command into the data caching system may be referred to as producing a command or producing an entry, and reading a command from the data caching system may be referred to as consuming a command or consuming an entry. As shown in fig. 21, when the system is initialized, PI = 0, CI = 0, PLoop = 0, CLoop = 0, and the ownership of all entries is 0, which means that no command has been produced in any entry. When the first entry is produced, with PI = 0 as the index address, the ownership field segment in the entry is set to 1 and the command field segment is set to command 0 (command 0), which indicates that the entry with PI = 0 has been produced. The value of PI is then updated to PI + 1, i.e., PI = 1, to point to the next entry to be produced. The producer continues to produce entries in this manner; after PI = K - 1, the next value of PI is 0. At this time, the value of PLoop is changed from 0 to 1, indicating that the producer pointer has completed the first round of production and PI has returned to the initial value of 0.
Fig. 22 is a schematic diagram illustrating an operating state of a data caching system according to another embodiment of the present application. FIG. 22 shows the process of the consumer pointer cycling from the first round to the second round. As shown in fig. 22, in the first round, when CLoop = 0, the consumer checks the ownership value in the entry indexed by the CI value: if ownership = 1, the command in the entry has been produced, and the consumer reads the corresponding command from the entry indexed by CI; if ownership = 0, the command in the entry has not been produced.
As the consumer continuously consumes the commands in the entries, the CI value keeps increasing, the number of commands to be processed in the data cache system falls below the maximum value K, and the producer can continue producing entries into the data cache system. At this point the producer starts the second round of production, with PLoop = 1. After the producer produces a new command, the value of the corresponding ownership is changed, i.e., the value of ownership is changed from 1 to 0, which indicates that the entry of the second round has been produced.
After the consumer has consumed the K entries of the first round, the value of CI returns to the initial value of 0, and the value of CLoop is changed from 0 to 1, which indicates that the consumer has consumed the entries of the first round. In the second round, the consumer needs to determine whether the ownership value is 0: ownership = 0 indicates that the entry has been produced, and ownership = 1 indicates that the entry has not been produced.
It can be seen that the producer and the consumer each own a ring register (PLoop and CLoop, respectively) that indicates the meaning of the ownership value. For example, when the value of PLoop (or CLoop) is 0, the number of cycles of the producer (or consumer) is odd, and ownership = 1 indicates that the entry has been produced and can be consumed. When the value of PLoop (or CLoop) is 1, the number of cycles of the producer (or consumer) is even, and ownership = 0 indicates that the entry has been produced and can be consumed.
In the embodiment of the application, the cycle-count states of the producer pointer and the consumer pointer are recorded in the producer ring register and the consumer ring register, so that upper-layer software can determine the meaning of the owner flag bit value in an entry by reading the values of the producer ring register and the consumer ring register, and can thereby judge whether the entry stores a command.
With continued reference to fig. 21, in the case that PLoop = 1 and CLoop = 0, the data cache system is full of K commands, the consumer has not consumed any entry, and the number of commands in the data cache system has reached the upper command limit of the system. In the embodiment of the application, in order to avoid overflow of the data cache system, the upper-layer software can be designed to ensure that the producer does not continue to write commands once the number of commands reaches the upper limit. This upper-layer software design is described with reference to fig. 23.
Fig. 23 is a schematic diagram of an application scenario of a further embodiment of the present application. FIG. 23 illustrates a computer system that includes a host processor 101 and an accelerator 102. The host processor 101 and the accelerator 102 may be the same type of processor or may be different types of processors. By way of example, the host processor 101 may be a CPU, and the accelerator 102 may comprise an FPGA, an ASIC, or the like. For example, the host processor 101 may be the first processor 10 in FIG. 5 and the accelerator 102 may be the computing device 50 in FIG. 5.
During the processing of IO commands, two data caching systems may be maintained between the host processor 101 and the accelerator 102, one for caching the command queue and the other for caching the completion queue. For the command queue, the host processor 101 is the producer and the accelerator 102 is the consumer: the host processor 101 writes commands into the command queue and the accelerator 102 reads commands from it. For the completion queue, the accelerator 102 is the producer and the host processor 101 is the consumer. Specifically, the accelerator 102 may process a command after reading it and, upon completing the command, write a completion entry into the completion queue; the host processor 101 may read from the completion queue to learn which commands have been completed. The host processor 101 can thus control the number of commands buffered in the command queue: after receiving the completion message of one command through the completion queue, it sends one new command to the command queue, ensuring that the number of outstanding commands does not exceed the upper limit.
For example, assume that the upper limit on the number of commands stored in the command queue is 256. The host processor 101 may first write 256 commands to the command queue. The host processor 101 is then constrained not to send new commands to the command queue. When the accelerator 102 completes a command, it may write a completion entry for that command into the completion queue. The host processor 101 may learn that the command is complete by reading the completion entry in the completion queue, where the completion entry for a command may be understood as the completion message for that command. The host processor 101 may then send a new command to the command queue. In subsequent processing, the host processor 101 issues a new command to the command queue each time it receives a command completion message, until no new commands are generated. This upper-layer software design ensures that the number of commands in the command queue is less than or equal to 256, thereby avoiding command overflow in the data cache system.
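A host-side flow-control sketch in the spirit of this example, assuming an upper limit of 256 outstanding commands; the helpers submit_to_command_queue(), poll_completion_queue() and next_command() are hypothetical placeholders, not functions defined in the text.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define CMD_QUEUE_DEPTH 256u   /* assumed upper limit on commands in the command queue */

/* Hypothetical helpers standing in for the real queue accessors. */
extern bool submit_to_command_queue(const void *cmd);      /* producer side  */
extern bool poll_completion_queue(uint32_t *completed_id); /* consumer side  */
extern const void *next_command(void);                     /* NULL when done */

/* Host-side loop: fill the command queue up to its depth, then issue one new
 * command per completion entry read back, so the number of in-flight commands
 * never exceeds CMD_QUEUE_DEPTH and the data cache system cannot overflow. */
static void host_flow_control(void)
{
    uint32_t in_flight = 0;
    const void *cmd = next_command();

    while (cmd != NULL || in_flight > 0) {
        /* Issue as long as we are below the limit and commands remain. */
        while (cmd != NULL && in_flight < CMD_QUEUE_DEPTH &&
               submit_to_command_queue(cmd)) {
            in_flight++;
            cmd = next_command();
        }
        /* Reap one completion before issuing more. */
        uint32_t done_id;
        if (poll_completion_queue(&done_id))
            in_flight--;
    }
}
```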
In the embodiment of the present application, the computer system can ensure through upper-layer software that the data cache system does not overflow, so the data cache system need not be provided with a register indicating the full state or a register indicating the empty state. This way of defining the data cache system can simplify its management process.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (34)

1. A method for accessing data, applied to a computer system, the computer system comprising: a first processor, a second processor, and a computing device, the computing device being connected to the first processor and the second processor, respectively, the second processor being configured to connect to a storage pool, the method comprising:
the computing device obtains a first input/output (IO) command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool;
the computing device sending the first IO command to the second processor;
the second processor sends an instruction command to the computing device according to the first IO command, wherein the instruction command is used for instructing the computing device to move data;
and the computing device moves the data to be written from the memory of the first processor to the memory of the second processor in a Direct Memory Access (DMA) mode, or moves the data to be read from the memory of the second processor to the memory of the first processor in a DMA mode, based on the instruction command.
2. The method of claim 1, wherein the computing device sending the first IO command to the second processor comprises:
the computing device allocates the first IO command to a volume queue corresponding to the first IO command, wherein IO commands in different volume queues correspond to different logical hard disks in the storage pool;
the computing device selects an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command;
and the second processor acquires the first IO command from the sub-command queue.
3. The method of claim 1 or 2, wherein the first IO command is a write operation command, and the sending, by the second processor, an instruction command to the computing device according to the first IO command comprises:
the second processor generates a first instruction command according to the first IO command, where the first instruction command is used to instruct to move data to be written from the memory of the first processor to the memory of the second processor;
the second processor sends the first indication command to the computing device.
4. The method of any of claims 1 to 3, wherein the first IO command is a write operation command, and wherein the moving of the data to be written from the memory of the first processor to the memory of the second processor by the computing device based on the instruction command comprises:
the computing device reads a data block from the memory of the first processor in a DMA mode;
the computing device computes a check data block of the read data block;
and the computing device moves the read data block and the check data block to the memory of the second processor in a DMA mode.
5. The method of any of claims 1 to 4, wherein the first IO command is a write operation command, the method further comprising:
the second processor decomposes the first IO command to obtain a plurality of first sub IO commands, wherein data blocks requested to be written by different sub IO commands correspond to different physical addresses in the storage pool;
the second processor determines a first stripe, where the first stripe includes at least one sub IO command in the first sub IO commands, and the first stripe further includes at least one sub IO command decomposed based on other IO commands, where a data block requested to be written by the sub IO command included in the first stripe corresponds to a same storage device in the storage pool;
and the second processor sends the data corresponding to the first stripe to the same storage device in the storage pool.
6. The method of claim 1 or 2, wherein the first IO command is a read operation command, and the sending, by the second processor, an instruction command to the computing device according to the first IO command comprises:
the second processor generates a second instruction command according to the first IO command, where the second instruction command is used to instruct the computing device to move data to be read from the memory of the second processor to the memory of the first processor;
the second processor sends the second indication command to the computing device.
7. The method of claim 1, 2 or 6, wherein the first IO command is a read operation command, the method further comprising:
the second processor generates a read data request command according to the first IO command, wherein the read data request command is used for requesting data to be read from the storage pool;
the second processor sending the read data request command to the storage pool;
and the second processor acquires the data to be read from the storage pool.
8. The method of claim 7, wherein the second processor generating a read data request command from the first IO command comprises:
the second processor decomposes the first IO command to obtain a plurality of second IO sub-commands, wherein data blocks requested to be acquired by different IO sub-commands correspond to different physical addresses in the storage pool;
the second processor determines a second stripe, where the second stripe includes at least one second sub IO command of the second sub IO commands, and the second stripe further includes at least one sub IO command obtained by decomposition based on other IO commands, where a data block requested to be obtained by the sub IO command in the second stripe corresponds to a same storage device in the storage pool;
and the second processor generates the read data request command according to the second stripe, wherein the read data request command is used for requesting the storage pool to acquire data corresponding to the second stripe.
9. The method according to any one of claims 1 to 8, further comprising:
the second processor writes a completion entry corresponding to the first IO command into a completion queue under the condition that the first IO command is determined to be completed;
and the computing device sends IO completion information to the first processor according to the completion queue, wherein the IO completion information is used for indicating that the first IO command is completed.
10. The method of any of claims 1 to 9, wherein the computing device stores IO commands using a data cache system comprising:
the cache space comprises K address ranges, the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry comprises an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier is used for indicating that an IO command is stored in the corresponding entry, and the idle identifier is used for indicating that no IO command is stored in the corresponding entry;
the cache space is configured to: in the case that an IO command is received by a first address range indicated by a producer pointer, a first owner flag bit in the first address range is updated to the storage identifier;
the cache space is further configured to: in the case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
11. A method for accessing data, the method being performed by a computing device, the computing device being connected to a first processor and a second processor, respectively, the second processor being configured to connect to a storage pool, the method comprising:
the computing device obtains a first input/output (IO) command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool;
the computing device sending the first IO command to the second processor;
the computing device receives an instruction command sent by the second processor, wherein the instruction command is used for instructing the computing device to move data;
and the computing device moves the data to be written from the memory of the first processor to the memory of the second processor in a Direct Memory Access (DMA) mode, or moves the data to be read from the memory of the second processor to the memory of the first processor in a DMA mode, based on the instruction command.
12. The method of claim 11, wherein the computing device sending the first IO command to the second processor comprises:
the computing device allocates the first IO command to a volume queue corresponding to the first IO command, wherein IO commands in different volume queues correspond to different logical hard disks in the storage pool;
and the computing equipment selects an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command.
13. The method of claim 11 or 12, wherein the first IO command is a write operation command, and wherein the receiving, by the computing device, an indication command sent by the second processor comprises:
the computing device receives a first instruction command sent by the second processor, the first instruction command is generated according to the first IO command, and the first instruction command is used for instructing to move data to be written from the memory of the first processor to the memory of the second processor.
14. The method of any of claims 11 to 13, wherein the first IO command is a write operation command, and wherein the moving, by the computing device, of data to be written from the memory of the first processor to the memory of the second processor based on the instruction command comprises:
the computing device reads a data block from the memory of the first processor in a DMA mode;
the computing device computes a check data block of the read data block;
and the computing device moves the read data block and the check data block to the memory of the second processor in a DMA mode.
15. The method of claim 11 or 12, wherein the first IO command is a read operation command, and wherein receiving, by the computing device, an indication command sent by the second processor comprises:
and the computing device receives a second instruction command sent by the second processor, wherein the second instruction command is generated according to the first IO command, and the second instruction command is used for instructing the computing device to move the data to be read from the memory of the second processor to the memory of the first processor.
16. The method according to any one of claims 11 to 15, further comprising:
and the computing device sends IO completion information to the first processor under the condition that the first IO command is determined to be completed, wherein the IO completion information is used for indicating that the first IO command is completed.
17. The method of any of claims 11 to 16, wherein the computing device stores IO commands using a data cache system comprising:
the cache space comprises K address ranges, the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry comprises an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier is used for indicating that an IO command is stored in the corresponding entry, and the idle identifier is used for indicating that no IO command is stored in the corresponding entry;
the cache space is configured to: in the case that an IO command is received by a first address range indicated by a producer pointer, a first owner flag bit in the first address range is updated to the storage identifier;
the cache space is further configured to: in the case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
18. A computer system comprising a first processor, a second processor, and a computing device, the computing device coupled to the first processor and the second processor, respectively, the second processor configured to couple to a storage pool,
the computing device is configured to obtain a first input/output (IO) command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool;
the computing device is to send the first IO command to the second processor;
the second processor is configured to send an instruction command to the computing device according to the first IO command, wherein the instruction command is used for instructing the computing device to move data;
the computing device is further configured to move, based on the instruction command, data to be written from the memory of the first processor to the memory of the second processor in a Direct Memory Access (DMA) manner, or move data to be read from the memory of the second processor to the memory of the first processor in a DMA manner.
19. The computer system of claim 18, wherein the computing device is specifically configured to: allocating the first IO command to a volume queue corresponding to the first IO command, wherein IO commands in different volume queues correspond to different logical hard disks in the storage pool; selecting an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command; the second processor is specifically configured to obtain the first IO command from the sub-command queue.
20. The computer system of claim 18 or 19, wherein the first IO command is a write operation command, and the second processor is specifically configured to: generating a first instruction command according to the first IO command, where the first instruction command is used to instruct to move data to be written from the memory of the first processor to the memory of the second processor; and sending the first instruction command to the computing device.
21. The computer system of any of claims 18 to 20, wherein the first IO command is a write operation command, and the computing device is specifically configured to: reading a data block from the memory of the first processor in a DMA mode; calculating a check data block of the read data block; and transferring the read data block and the check data block to the memory of the second processor in a DMA mode.
22. The computer system of any of claims 18 to 21, wherein the first IO command is a write operation command, the second processor further to: decomposing the first IO command to obtain a plurality of first sub IO commands, wherein data blocks requested to be written by different sub IO commands correspond to different physical addresses in the storage pool; determining a first stripe, where the first stripe includes at least one sub IO command in the plurality of first sub IO commands, and the first stripe further includes at least one sub IO command decomposed based on other IO commands, where a data block requested to be written by the sub IO command included in the first stripe corresponds to a same storage device in the storage pool; and sending the data corresponding to the first stripe to the same storage device in the storage pool.
23. The computer system of claim 18 or 19, wherein the first IO command is a read operation command, and the second processor is specifically configured to: generating a second instruction command according to the first IO command, where the second instruction command is used to instruct the computing device to move data to be read from the memory of the second processor to the memory of the first processor; and sending the second instruction command to the computing device.
24. The computer system of claim 18, 19 or 23, wherein the first IO command is a read operation command, the second processor further to: generating a read data request command according to the first IO command, wherein the read data request command is used for requesting data to be read from the storage pool; sending the read data request command to the storage pool; and acquiring the data to be read from the storage pool.
25. The computer system of claim 24, wherein the second processor is specifically configured to: decomposing the first IO command to obtain a plurality of second IO sub commands, wherein the data blocks requested to be obtained by different IO sub commands correspond to different physical addresses in the storage pool; determining a second stripe, where the second stripe includes at least one second sub IO command of the second sub IO commands, and the second stripe further includes at least one sub IO command decomposed based on other IO commands, where a data block requested to be obtained by the sub IO command in the second stripe corresponds to a same storage device in the storage pool; and generating the read data request command according to the second stripe, wherein the read data request command is used for requesting the storage pool to acquire data corresponding to the second stripe.
26. The computer system of any one of claims 18 to 25, wherein the second processor is further configured to, in the event that determination is made that the first IO command is completed, write a completion entry corresponding to the first IO command to a completion queue; the computing device is further configured to send IO completion information to the first processor according to the completion queue, where the IO completion information is used to indicate that the first IO command is completed.
27. The computer system of any of claims 18 to 26, wherein the computing device stores IO commands using a data cache system, the data cache system comprising:
the cache space comprises K address ranges, the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry comprises an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier is used for indicating that an IO command is stored in the corresponding entry, and the idle identifier is used for indicating that no IO command is stored in the corresponding entry;
the cache space is configured to: in the case that an IO command is received by a first address range indicated by a producer pointer, a first owner flag bit in the first address range is updated to the storage identifier;
the cache space is further configured to: in the case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
28. A computing device, wherein the computing device is respectively connected with a first processor and a second processor, and wherein the second processor is used for connecting with a storage pool, the computing device comprising:
an input/output (IO) processing module, configured to acquire a first IO command sent by the first processor, where the first IO command is a write operation command or a read operation command, the write operation command is used to request to write data into the storage pool, and the read operation command is used to request to read data from the storage pool;
a quality of service (QoS) module, configured to send the first IO command to the second processor;
the IO processing module is further configured to receive an instruction command sent by the second processor, where the instruction command is used to instruct the computing device to move data;
and a direct memory access (DMA) module, configured to move the data to be written from the memory of the first processor to the memory of the second processor in a DMA mode, or move the data to be read from the memory of the second processor to the memory of the first processor in a DMA mode, based on the instruction command.
29. The computing device of claim 28, wherein the QoS module is specifically configured to: allocating the first IO command to a volume queue corresponding to the first IO command, wherein IO commands in different volume queues correspond to different logical hard disks in the storage pool; and selecting an IO command from the volume queue to join a sub-command queue, wherein the selected IO command comprises the first IO command, and the sub-command queue is a waiting queue for the second processor to process the IO command.
30. The computing device according to claim 28 or 29, wherein the first IO command is a write operation command, the IO processing module is specifically configured to receive a first instruction command sent by the second processor, the first instruction command is generated according to the first IO command, and the first instruction command is used to instruct to move data to be written from a memory of the first processor to a memory of the second processor.
31. The computing device of any of claims 28 to 30, wherein the first IO command is a write operation command, and the DMA module is configured to read a data block from the memory of the first processor in a DMA mode;
the computing device further includes an algorithm engine module to: calculating a check data block of the read data block;
the DMA module is further used for moving the read data block and the check data block to the memory of the second processor in a DMA mode.
32. The computing device according to claim 28 or 29, wherein the first IO command is a read operation command, the IO processing module is specifically configured to receive a second instruction command sent by the second processor, the second instruction command is generated according to the first IO command, and the second instruction command is used to instruct the computing device to move data to be read from a memory of the second processor to a memory of the first processor.
33. The computing device of any one of claims 28 to 32, wherein the IO processing module is further configured to send IO completion information to the first processor if it is determined that the first IO command is completed, where the IO completion information is used to indicate that the first IO command is completed.
34. The computing device of any of claims 28 to 33, wherein the computing device stores IO commands using a data cache system, the data cache system comprising:
the cache space comprises K address ranges, the K address ranges are respectively used for storing K entries, K is an integer greater than 0, each entry comprises an owner flag bit, the owner flag bit is used for recording a storage identifier or an idle identifier, the storage identifier is used for indicating that an IO command is stored in the corresponding entry, and the idle identifier is used for indicating that no IO command is stored in the corresponding entry;
the cache space is configured to: in the case that an IO command is received by a first address range indicated by a producer pointer, a first owner flag bit in the first address range is updated to the storage identifier;
the cache space is further configured to: in the case that a second owner flag bit in a second address range indicated by a consumer pointer is recorded as the storage identifier, an IO command in the second address range is read.
CN201911053658.8A 2019-10-31 2019-10-31 Method, computing device and computer system for accessing data Pending CN112749111A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911053658.8A CN112749111A (en) 2019-10-31 2019-10-31 Method, computing device and computer system for accessing data

Publications (1)

Publication Number Publication Date
CN112749111A true CN112749111A (en) 2021-05-04

Family

ID=75644768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911053658.8A Pending CN112749111A (en) 2019-10-31 2019-10-31 Method, computing device and computer system for accessing data

Country Status (1)

Country Link
CN (1) CN112749111A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102681912A (en) * 2011-03-01 2012-09-19 株式会社日立制作所 Improving network efficiency for continuous remote copy
CN106537340A (en) * 2014-07-16 2017-03-22 戴尔产品有限公司 Input/output acceleration device and method for virtualized information handling systems
CN106575271A (en) * 2014-06-23 2017-04-19 谷歌公司 Managing storage devices
CN107077426A (en) * 2016-12-05 2017-08-18 华为技术有限公司 Control method, equipment and the system of reading and writing data order in NVMe over Fabric frameworks
US20180089099A1 (en) * 2016-09-29 2018-03-29 Intel Corporation Offload data transfer engine for a block data transfer interface
CN109328341A (en) * 2016-07-01 2019-02-12 英特尔公司 Processor, the method and system for the storage that identification causes remote transaction execution to stop
WO2019061014A1 (en) * 2017-09-26 2019-04-04 Intel Corporation Methods and apparatus to process commands from virtual machines
CN110300960A (en) * 2017-02-28 2019-10-01 株式会社日立制作所 The program change method of information system, management program and information system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117234431A (en) * 2023-11-14 2023-12-15 苏州元脑智能科技有限公司 Cache management method and device, electronic equipment and storage medium
CN117234431B (en) * 2023-11-14 2024-02-06 苏州元脑智能科技有限公司 Cache management method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US10325343B1 (en) Topology aware grouping and provisioning of GPU resources in GPU-as-a-Service platform
EP3754498B1 (en) Architecture for offload of linked work assignments
US20200322287A1 (en) Switch-managed resource allocation and software execution
US10348830B1 (en) Virtual non-volatile memory express drive
US9575689B2 (en) Data storage system having segregated control plane and/or segregated data plane architecture
US20220335563A1 (en) Graphics processing unit with network interfaces
US9565269B2 (en) Non-volatile memory express over ethernet
US20210092069A1 (en) Accelerating multi-node performance of machine learning workloads
US8799917B2 (en) Balancing a data processing load among a plurality of compute nodes in a parallel computer
US20220103530A1 (en) Transport and cryptography offload to a network interface device
US8572407B1 (en) GPU assist for storage systems
US20230127141A1 (en) Microservice scheduling
CN110908600B (en) Data access method and device and first computing equipment
WO2020163327A1 (en) System-based ai processing interface framework
WO2022169519A1 (en) Transport and crysptography offload to a network interface device
CN109857545B (en) Data transmission method and device
US20210329354A1 (en) Telemetry collection technologies
US20220210097A1 (en) Data access technologies
Abbasi et al. A performance comparison of container networking alternatives
CN116257471A (en) Service processing method and device
CN109729110B (en) Method, apparatus and computer readable medium for managing dedicated processing resources
CN112749111A (en) Method, computing device and computer system for accessing data
US20210141535A1 (en) Accelerating memory compression of a physically scattered buffer
CN115686836A (en) Unloading card provided with accelerator
EP4030284A1 (en) Virtual device portability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination