CN117667761A - Data access device, method, system, data processing unit and network card


Info

Publication number: CN117667761A
Authority: CN (China)
Prior art keywords: data, data access, storage server, address, processing unit
Legal status: Pending
Application number: CN202211054647.3A
Other languages: Chinese (zh)
Inventors: 钟刊, 崔文林
Current Assignee: Chengdu Huawei Technology Co Ltd
Original Assignee: Chengdu Huawei Technology Co Ltd
Application filed by Chengdu Huawei Technology Co Ltd
Priority: CN202211054647.3A
PCT application: PCT/CN2023/089442 (WO2024045643A1)
Publication: CN117667761A (pending)

Classifications

    • G06F12/0815 Cache consistency protocols
    • G06F12/0808 Multiuser, multiprocessor or multiprocessing cache systems with cache invalidating means
    • G06F12/0842 Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
    • G06F12/0882 Cache access modes; Page mode
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G06F12/1072 Decentralised address translation, e.g. in distributed shared memory systems
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/526 Mutual exclusion algorithms
    • G06F2212/1016 Performance improvement

Landscapes: Engineering & Computer Science; Theoretical Computer Science; Physics & Mathematics; General Engineering & Computer Science; General Physics & Mathematics; Software Systems; Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

A data access device, a method, a system, a data processing unit, and a network card are disclosed, relating to the field of data storage. While the data access device writes data into a storage server, the storage space in the storage server to which the data is written cannot be accessed by other data access devices; that is, only one data access device can access the storage space at a time. This avoids the problem that data read by each data access device from a given storage space in the storage server is inconsistent because multiple data access devices have written data into that storage space. In addition, the data access device does not need to wait for the controller in the storage server to interact with the disk backing the storage space before writing the data from its memory into the storage server, which shortens the IO path for writing data to the storage server and improves data access efficiency between the data access device and the storage server.

Description

Data access device, method, system, data processing unit and network card
Technical Field
The present disclosure relates to the field of data storage, and in particular, to a data access device, a method, a system, a data processing unit, and a network card.
Background
Shuffle describes the process in which data is scattered and then aggregated onto different nodes. Take an application in a distributed system running storage-intensive tasks as an example: in a Map/Reduce application, the shuffle is the bridge connecting mapping nodes (Mappers) and reducing nodes (Reducers); for example, the shuffle transfers data between the mapping nodes and the reducing nodes based on remote procedure call (RPC) requests. Data transmission based on RPC requests can only be achieved through cooperation between the local node (e.g., a reducing node) and the remote node (e.g., a mapping node). This consumes a large amount of network resources and memory, and generates heavy disk IO on both the local and remote nodes, which affects data access efficiency.
Disclosure of Invention
The present application provides a data access device, a method, a system, a data processing unit, and a network card, which address the problem of low data access efficiency between a local node and a remote node.
In a first aspect, there is provided a data access device comprising: a processor, a memory, and a data processing unit (DPU). By way of example, the DPU may be a removable accelerator card, such as a DPU card; the data access device may include a server (or host) and the DPU card, and the server may include the aforementioned processor and memory. In the data access device provided in this embodiment, the processor is configured to: write data into the memory and send a data synchronization request to the DPU, where the data synchronization request is used to instruct that the data be stored to a storage server. The DPU is configured to: send a locking request and a data write command to the storage server based on the data synchronization request. The data write command is used to instruct the storage server to write the data into the storage space corresponding to a first address, and the locking request is used to instruct the storage server to set the storage space corresponding to the first address to be inaccessible to other data access devices while the data write command is being executed.
In this embodiment, while the data access device writes data into the storage server, the storage space to be written (the storage space corresponding to the first address) cannot be accessed by other data access devices. That is, for a given storage space of the storage server, the data access devices writing to it are mutually exclusive (multi-write mutual exclusion): only one data access device can access the storage space at a time. This avoids the problem that, when multiple data access devices write data into the same storage space, the data becomes inconsistent across those devices and each device reads different data from that storage space. In addition, since the storage space cannot be accessed by other data access devices, the data access device provided in this embodiment can bypass the controller (or processor) of the storage server when writing: it does not need to wait for the controller in the storage server to interact with the disk backing the storage space before writing the data from its memory into the storage server. This shortens the IO path for writing data to the storage server and improves data access efficiency between the data access device and the storage server.
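As an illustration of the write path described above, the following C sketch models the lock-write-unlock ordering between the DPU and the storage server. It is a minimal, self-contained simulation under stated assumptions: the names (server_page, server_lock, dpu_sync, and so on) are hypothetical, and the one-sided write that bypasses the server's controller is modeled as a plain memcpy, since the embodiment prescribes behavior rather than a wire format.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy model of one storage-server page and its lock state. */
struct server_page {
    uint32_t owner;       /* 0 = unlocked, else id of the locking device */
    char     data[4096];  /* storage space at the "first address"        */
};

/* Locking request handling: the space becomes inaccessible to every
 * device except the requester (multi-write mutual exclusion). */
static int server_lock(struct server_page *p, uint32_t dev) {
    if (p->owner != 0 && p->owner != dev)
        return -1;        /* held by another data access device */
    p->owner = dev;
    return 0;
}

static void server_unlock(struct server_page *p) { p->owner = 0; }

/* DPU side of a data synchronization request: lock the target space,
 * write the data directly (bypassing the server CPU, modeled here as
 * a memcpy), then unlock. */
static int dpu_sync(struct server_page *p, uint32_t dev,
                    const void *buf, size_t len) {
    if (server_lock(p, dev) != 0)
        return -1;
    memcpy(p->data, buf, len);  /* one-sided write of the new data */
    server_unlock(p);
    return 0;
}

int main(void) {
    struct server_page p1 = {0};
    const char new_data[] = "new data";
    if (dpu_sync(&p1, 1 /* device id */, new_data, sizeof new_data) == 0)
        printf("P1 now holds: %s\n", p1.data);
    return 0;
}
```

The property the sketch captures is that server_lock refuses access while another device holds the page, which realizes the multi-write mutual exclusion described above.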
In an alternative implementation, the DPU is further configured to: send an unlocking request to the storage server after the data has been successfully written to the first address. The unlocking request is used to instruct the storage server to set the first address of the data to be accessible to other data access devices. Thus, after data or a file stored in the storage server is updated by one data access device (e.g., new data is written, or content is deleted or modified), the storage server can unlock its access state so that the updated data or file can be accessed or modified by other data access devices. This avoids the problem that data or a file in the storage server can be used by only a single data access device, which would reduce the data access efficiency of the other data access devices for that data or file.
In another alternative implementation, the DPU is further configured to: send an invalidation indication message to other data access devices. The invalidation indication message is used to instruct the other data access devices to invalidate the old data they have stored for the first address. If the data access device did not send such a message, another data access device that has stored or mapped the old data of the storage space corresponding to the first address would execute its task based on that old data, which may cause the task to execute incorrectly. In contrast, in this embodiment, after the data access device writes data into the storage space corresponding to the first address in the storage server, it sends an invalidation indication message to the other data access devices, which then invalidate their stored old data for the first address. When another data access device needs the new data written at the first address, it re-reads the data in the storage space of the first address from the storage server. This avoids data access errors, or reduced data access efficiency, caused by multiple data access devices reading inconsistent data from the same storage space of the storage server.
In another alternative implementation, the DPU is further configured to: invalidate, according to invalidation information sent by another data access device, the data stored in the memory at the second address indicated by that invalidation information.
In one case, the second address is different from the first address described above. If the DPU receives invalidation information sent by another data access device, it invalidates the old data stored at the second address. This avoids access errors caused by the old data being inconsistent with the new data stored in the storage space of the second address in the storage server, that is, by different data access devices caching inconsistent data for the same storage space, which would reduce the data access efficiency of the data access device toward the storage server.
In another case, the second address is the same as the first address described above. It should be understood that, for a section of storage space in the storage server, multiple data access devices can modify the data in it at different times. This avoids the problem that the data in a section of storage space can be modified by only a single data access device during data access to the storage server, and improves the performance of the data access service provided by the storage server.
As an alternative implementation, the processor is further configured to: send a read request to the DPU when the data misses in the memory. The DPU is further configured to: read the data stored at the first address from the storage server based on the first address carried in the read request. It should be understood that, for data stored in the storage server by a data access device, other data access devices can read the newly written data (new data), and the data access device that wrote it can read the new data as well, so the new data is consistent across the data access devices, which improves their data access performance toward the storage server.
In a second aspect, a data access method is provided. The data access method is performed by a data access system comprising a data access device and a storage server, where the data access device comprises a processor, a memory, and a DPU. The data access method provided in this embodiment includes: the processor writes data into the memory and sends a data synchronization request to the DPU, where the data synchronization request is used to instruct that the data be stored to the storage server; and the DPU sends a locking request and a data write command to the storage server based on the data synchronization request. The data write command is used to instruct the storage server to write the data into the storage space corresponding to a first address, and the locking request is used to instruct the storage server to set the storage space corresponding to the first address to be inaccessible to other data access devices while the data write command is being executed.
In an optional implementation, the data access method provided in this embodiment further includes: after the data is successfully written to the first address, the DPU sends an unlocking request to the storage server. The unlocking request is used to instruct the storage server to set the first address of the data to be accessible to other data access devices.
In an optional implementation, the data access method provided in this embodiment further includes: the DPU sends an invalidation indication message to other data access devices. The invalidation indication message is used to instruct the other data access devices to invalidate the old data stored for the first address.
In an optional implementation, the data access method provided in this embodiment further includes: the DPU invalidates, according to invalidation information sent by another data access device, the data stored in the memory at the second address indicated by that invalidation information.
In an optional implementation, the data access method provided in this embodiment further includes: when the data misses in the memory, the processor sends a read request to the DPU, and the DPU reads the data stored at the first address from the storage server based on the first address carried in the read request.
In a third aspect, there is provided a data access system comprising: a storage server and the data access device according to any one of the implementations of the first aspect. The storage server is used for storing the data to be synchronized by the data access device, and for setting the storage space corresponding to the first address to which the data is written to be inaccessible to other data access devices.
In a fourth aspect, there is provided a DPU comprising: control circuitry and interface circuitry. The interface circuitry is used for receiving data from devices other than the DPU and transmitting the data to the control circuitry, or for sending data from the control circuitry to devices other than the DPU. The control circuitry, by means of logic circuits or by executing code instructions, cooperates with the interface circuitry to perform the functions of the DPU in any one of the possible implementations of the second aspect.
In a fifth aspect, a network card is provided, comprising: the DPU provided in the fourth aspect and a communication interface. The communication interface is used, for example, for transmitting data sent by the DPU or for receiving data sent by other devices to the DPU.
In a sixth aspect, a computer readable storage medium is provided, in which a computer program or instructions are stored which, when executed by a data access device or DPU, perform the operational steps of the method according to any one of the implementations of the second aspect.
In a seventh aspect, a computer program product is provided, which when run on a computer causes the computer to perform the operational steps of the method according to any one of the implementations of the second aspect. By way of example, the computer may be a data access device, a host, a DPU or DPU card, or the like.
For the advantageous effects of the second aspect through the seventh aspect, refer to the description of any implementation of the first aspect; details are not repeated here. The implementations provided in the above aspects may be further combined to provide further implementations.
Drawings
FIG. 1 is a schematic structural diagram of a data access system provided in the present application;
FIG. 2 is a first schematic diagram of memory mapping provided in the present application;
FIG. 3 is a second schematic diagram of memory mapping provided in the present application;
FIG. 4 is a first schematic flowchart of a data access method provided in the present application;
FIG. 5 is a second schematic flowchart of the data access method provided in the present application;
FIG. 6 is a schematic diagram of data invalidation provided in the present application.
Detailed Description
The present application provides a data access method. While a data access device writes data into a storage server, the storage space in the storage server to which the data is written cannot be accessed by other data access devices. That is, for a section of storage space of the storage server, the data access devices writing to it are mutually exclusive (multi-write mutual exclusion for short): only one data access device can access the storage space at a time. This avoids the problem that, when multiple data access devices write data into the same section of storage space in the storage server, the data becomes inconsistent across the devices and each device reads different data from that storage space. In addition, since that section of storage space cannot be accessed by other data access devices, the data access device provided in this embodiment can bypass the controller (or processor) of the storage server when writing: it does not need to wait for the controller in the storage server to interact with the disk backing the storage space before writing the data from its memory into the storage server. This shortens the IO path for writing data to the storage server and improves data access efficiency between the data access device and the storage server.
The data access device and the corresponding data access method provided in the present application are described below with reference to the accompanying drawings; the related art is described first.
Fig. 1 is a schematic structural diagram of a data access system provided in the present application, where the data access system 100 includes: a storage system 110 and a plurality of data access devices accessing the storage system 110, such as the data access device 1 and the data access device 2 shown in fig. 1. One or more servers in the storage system 110 may also have computing devices connected thereto, which may be used to provide more computing resources to the servers, or the computing functionality on the servers may be offloaded to an external acceleration device in order to improve the data access performance of the storage system 110.
The data access device may access the servers in the storage system 110 over a network to access data; the communication functions of the network may be implemented by a switch or a router. In one possible example, the data access device may also communicate with the server over a wired connection, such as a peripheral component interconnect express (PCIe) bus, a compute express link (CXL), a universal serial bus (USB) protocol bus, or a bus of another protocol.
The data access device comprises a host and a computing apparatus: the data access device 1 comprises the host 1 and a computing apparatus 131, and the data access device 2 comprises the host 2 and a computing apparatus 132. In fig. 1 the computing apparatus is represented by a DPU card, but this should not be construed as limiting the application. The computing apparatus may include one or more processing units, which may be not only DPUs but also central processing units (CPU), other general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA) or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. A general-purpose processor may be a microprocessor or, alternatively, any conventional processor. The computing apparatus may also be a special-purpose processor oriented toward artificial intelligence (AI), such as a neural processing unit (NPU) or a graphics processing unit (GPU). In physical form, the one or more processing units included in the computing apparatus may be packaged as a card, such as the DPU card in fig. 1, which may access the host through a PCIe interface, a CXL interface, a unified bus (UB) interface, an NVLink interface, or another communication interface, and the host may offload part of its data processing functions to the DPU card.
Illustratively, the host is a computer running an application. For example, if the computer running the application is a physical computing device, the physical computing device may be a server or a Terminal (Terminal). The terminal may also be referred to as a terminal device, a User Equipment (UE), a Mobile Station (MS), a Mobile Terminal (MT), or the like. The terminal may be a cell phone, tablet, notebook, desktop, personal communication service (personal communication service, PCS) phone, desktop computer, wireless terminal in smart city, wireless terminal in smart home, etc. The embodiment of the application does not limit the specific technology and the specific equipment form adopted by the host. In some alternative implementations, the host shown in FIG. 1 may also be referred to as a client (client).
The storage system provided by the embodiment of the application can be a distributed storage system or a centralized storage system.
In one possible scenario, the storage system 110 shown in fig. 1 may be a distributed storage system. As shown in fig. 1, the distributed storage system provided in this embodiment includes a storage cluster in which computing and storage are integrated. The storage cluster includes one or more servers (e.g., the server 110A and the server 110B shown in fig. 1), which may communicate with each other.
In some alternative implementations, the servers included in the storage system 110 are also referred to as storage servers. The server 110A illustrated in fig. 1 is described here as a device having both computing and storage capabilities, such as a server or a desktop computer. By way of example, an advanced RISC machines (ARM) server or an X86 server may be used as the server 110A. In hardware, as shown in fig. 1, the server 110A includes at least a processor 112, a memory 113, a network card 114, and a hard disk 105, which are connected by a bus. The processor 112 and the memory 113 are used to provide computing resources. Specifically, the processor 112 is a CPU for processing data access requests (e.g., write data requests or read data requests) from outside the server 110A (an application server or another server), as well as requests generated inside the server 110A. Illustratively, when the processor 112 receives a write request, it temporarily stores the data in the write request in the memory 113. When the total amount of data in the memory 113 reaches a certain threshold, the processor 112 sends the data stored in the memory 113 to the hard disk 105 for persistent storage. In addition, the processor 112 is also used for computing or processing data. Only one processor 112 is shown in fig. 1; in practical applications there are often multiple processors 112, and one processor 112 may in turn have one or more CPU cores. This embodiment does not limit the number of CPUs or the number of CPU cores.
The memory 113 is an internal memory that exchanges data directly with the processor: it can read and write data at any time, it is fast, and it serves as temporary storage for the operating system or other running programs. The memory includes at least two types. For example, the memory may be a random access memory, such as a dynamic random access memory (DRAM), or a storage class memory (SCM). DRAM is a semiconductor memory and, like most random access memory (RAM), is a volatile memory device. SCM is a composite storage technology that combines the characteristics of both traditional storage devices and memory; it provides faster read and write speeds than a hard disk but is slower to access than DRAM, and it is cheaper than DRAM. However, DRAM and SCM are only examples in this embodiment; the memory may also include other random access memories, such as static random access memory (SRAM).
In addition, the memory 113 may be a dual in-line memory module (DIMM), i.e., a module composed of DRAM, or may be a solid state disk (SSD). In practical applications, multiple memories 113, possibly of different types, may be configured in the storage server 110A; this embodiment does not limit the number or types of the memories 113. Furthermore, the memory 113 may be configured to have a power-failure protection function, which means that the data stored in the memory 113 is not lost when the system is powered down and powered up again. A memory with a power-failure protection function is called a non-volatile memory.
The hard disk 105 is used to provide storage resources, for example to store data and information on the data access status of each data access device (or host). For example, data may be stored in the form of objects or files in the hard disk 105 or the memory 113. The hard disk may be a magnetic disk or another type of storage medium, such as a solid state disk or a shingled magnetic recording hard disk. By way of example, the hard disk 105 may be a solid state disk based on the non-volatile memory express (NVMe) interface specification, such as an NVMe SSD.
The network card 114 in the server 110A is used to communicate with a host or other application server (such as the server 110B shown in fig. 1).
In one embodiment, the functions of the processor 112 may be offloaded onto the network card 114. In other words, in such an embodiment, the processor 112 does not perform the processing operations on the service data; instead, the network card 114 performs the processing, address translation, and other computing functions on the service data.
In some application scenarios, the network card 114 may also have a persistent storage medium, such as persistent memory (PM), non-volatile random access memory (NVRAM), or phase change memory (PCM). The CPU in the network card is used to perform address translation, log reading and writing, and other operations. The memory in the network card is used to temporarily store data to be written to the hard disk 105, or data read from the hard disk 105 that is to be sent to the controller. The network card may also be a programmable electronic component, such as a data processing unit (DPU). The DPU has the generality and programmability of a CPU, but is more specialized and can run efficiently on network packets, storage requests, or analysis requests. The DPU is distinguished from the CPU by its greater degree of parallelism (it needs to handle a large number of requests). Alternatively, the DPU may be replaced by a processing chip such as a GPU or an NPU. There is no ownership relationship between the network card 114 and the hard disks 105; the network card 114 can access any hard disk 105 in the server where it is located, so hard disks can be conveniently expanded when storage space is insufficient.
Fig. 1 is only an example provided in the embodiment of the present application, and the storage system 110 may further include more servers, memories, or hard disks, which are not limited by the number and specific form of the servers, memories, and hard disks.
In another possible scenario, the storage system provided by an embodiment of the present application may also be a storage cluster in which computing and storage are separated. The cluster includes a computing device cluster and a storage device cluster, where the computing device cluster includes one or more computing devices that may communicate with each other. A computing device may be a server, a desktop computer, a controller of a storage array, or the like. In hardware, the computing device may include a processor, a memory, a network card, and so on. The processor is a CPU for processing data access requests from outside the computing device or requests generated inside the computing device. For example, when the processor receives a write request sent by a user, it temporarily stores the data carried in the write request in the memory. When the total amount of data in the memory reaches a certain threshold, the processor sends the data stored in the memory to the storage device for persistent storage. In addition, the processor is used for computing or processing data, such as metadata management, data deduplication, data compression, storage space virtualization, and address translation.
As an alternative implementation, the storage system provided in the embodiments of the present application may also be a centralized storage system. The centralized storage system is characterized by a unified portal through which all data from external devices passes; this portal is the engine of the centralized storage system. The engine is the most central component of the centralized storage system, and many of the advanced functions of the storage system are implemented in it.
For example, there may be one or more controllers in the engine. In one possible example, if the engine has multiple controllers, a mirror channel may exist between any two controllers so that the two controllers back each other up, which prevents a hardware failure from making the centralized storage system unavailable. The engine also includes a front-end interface and a back-end interface: the front-end interface communicates with the computing devices in the centralized storage system to provide storage services for them, and the back-end interface communicates with hard disks to expand the capacity of the centralized storage system. Through the back-end interface, the engine can be connected to more hard disks, forming a very large storage resource pool (memory pool for short).
The following provides a memory-mapping implementation based on the memory pool provided by the hosts and the storage system 110 shown in fig. 1. As shown in fig. 2, fig. 2 is a first schematic diagram of memory mapping provided in the present application. The storage system 110 may further include a server 110C; for details of the hardware implementation of a server, refer to fig. 1, which are not repeated here.
A DPU card is inserted in the motherboard of each host: DPU 1 is inserted in the motherboard of host 1, DPU 2 in the motherboard of host 2, and DPU 3 in the motherboard of host 3. In this embodiment, a host with a DPU card inserted in its motherboard is referred to as a data access device of the memory pool.
The hardware implementation of a host is briefly described below, taking the host 1 as an example. The host 1 includes a processor 11 and a memory 12; this embodiment does not limit the specific connection medium between the processor 11 and the memory 12. Fig. 2 illustrates a bus connection between the processor 11 and the memory 12; the bus may be classified into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one line is shown in fig. 2, but this does not mean there is only one bus or one type of bus. The host 1 may also include a communication interface for communicating with other devices via a transmission medium, so that the apparatus used in the host 1 can communicate with other devices.
Memory 12 is used to store program instructions and/or data and processor 11 is coupled to memory 12. The coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules, which may be in electrical, mechanical, or other forms for information interaction between the devices, units, or modules. The processor 11 may cooperate with the memory 12. The processor 11 may execute program instructions stored in the memory 12.
In the embodiments of the present application, the processor may be a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution.
In the embodiment of the present application, the memory may be a nonvolatile memory, such as a hard disk (HDD) or SSD, or may be a volatile memory (RAM). The memory is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to such. The memory in the embodiments of the present application may also be circuitry or any other device capable of implementing a memory function for storing program instructions and/or data.
As shown in fig. 2, the servers 110A through 110C together provide a memory pool for storing objects/files. The memory pool holds the objects or files in the storage system 110, where a file refers to a set of associated data, such as the file 1 shown in fig. 2; in other words, the storage space occupied by file 1 is allocated from the memory pool. For example, the memory (e.g., DRAM, PMEM) or hard disks included in each server of the storage system 110 provide a global address space: the storage system 110 externally presents a distributed physical address (DPA) space, and DPAs are mapped to distributed virtual addresses (DVA) through a distributed page table (DPT). User files or objects are built on DVAs. An application in a host can map files/objects into the address space of a local process by means of a distributed memory map, such as file cache 1, whose memory space is provided by the memory 12, and access them by load/store.
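As a rough analogy to the distributed memory map just described, the following sketch uses the POSIX mmap interface to map a file into a process address space and touch it via load/store. The assumption here is that the distributed mmap of this embodiment behaves similarly from the application's point of view, with the page fault path filling the local page cache from the memory pool rather than from a local disk.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map "file1" into the local address space and access it via
 * load/store, analogous to building file cache 1 from file 1. */
int main(void) {
    int fd = open("file1", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    char *cache = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (cache == MAP_FAILED) { perror("mmap"); return 1; }

    /* Load/store access: the first touch of a page triggers a
     * page fault that fills the local page cache. */
    char first = cache[0];  /* load  */
    cache[0] = first;       /* store */

    munmap(cache, st.st_size);
    close(fd);
    return 0;
}
```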
In a possible scenario where the storage resource used by the file is a page resource, such as the data pages shown in fig. 2, the file cache 1 may also be referred to as a page cache. In different business scenarios or under different workloads, the storage resources required by a file differ, and multiple files may be distinguished by file type.
File cache 1 includes the metadata of file 1 and part of the data in file 1. The metadata indicates the addresses, in the storage server, of the data included in file 1; an address may be the aforementioned DVA or DPA. Once the data access device or host holds the address, it can read the data directly from the storage server according to that address.
It should be appreciated that other data access devices and hosts may also map file 1 into a local cache. For example, the host 2 maps file 1 to obtain the file cache 2, which includes the metadata of file 1 and part of the data in file 1; the cached part of the data may be the same as or different from the data in file 1. As another example, the host 3 maps file 1 to obtain the file cache 3, which includes the metadata of file 1 and part of the data in file 1; again, the cached part may be the same as or different from the data in file 1.
It should be noted that the memory pool shown in fig. 2 is virtualized from one or more memories or hard disks in the storage system 110; in some possible examples, the memory pool may also be implemented by other storage media in the storage system 110, which is not limited in this application.
The mapping of files by a host may be implemented by an application program, such as mapping management software. Memory map (mmap) management software is described below as an example, as shown in fig. 3. Fig. 3 is a second schematic diagram of memory mapping provided in the present application; the data access system shown in fig. 3 includes a data access device 31 and a storage server 32. The data access device 31 may implement the functions of the host 1 and the computing apparatus 131 shown in fig. 1, or the data access device 31 may implement the functions of the host 1 and the DPU 1 shown in fig. 2. The storage server 32 may be any one or a combination of the servers in the storage system 110, which is not described in detail here.
The file 1 in the storage server 32 includes data stored in a plurality of data pages, such as the data pages P1 to P6 shown in fig. 3.
Taking the case where the data access device 31 includes the host 1 and the DPU 1 as an example, the data access device 31 maps a global file or object (e.g., file 1) into the address space of the host 1 and, through the page fault process, loads the data mapped into that address space into a local page cache (e.g., file cache 1).
After file 1 has been mapped from the storage server 32 to the data access device 31, the storage server 32 may add the host 1 to a host list, which indicates the hosts that have mapped file 1. These hosts can read the data of file 1 from the storage server 32 by querying the metadata of file 1, without waiting for the controller in the storage server to interact with the disk backing the storage space before the data is read from the storage server into a host's memory. This shortens the IO path over which a data access device reads data from the storage server and improves data access efficiency between the data access device and the storage server.
In one possible example, if the storage server 32 provides the storage space of file 1 in the form of virtual addresses (VA), then when the data access device 31 maps the file into the address space of the host 1, a mapping between the VAs and the physical addresses (PA) of the host 1 is required; this mapping may be maintained in the form of a page table. The page table stores the mapping between the VAs provided for file 1 in the storage server 32 and the physical addresses that the data access device 31 provides for file cache 1. Taking fig. 3 as an example, the VAs of P1 to P6 in file 1 of the storage server 32 are Virt1 to Virt6, respectively; after the data access device 31 establishes the address mapping between file 1 in the storage server 32 and file cache 1 in the memory 312, the physical addresses of P1 to P6 are Phys1 to Phys6, respectively.
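The page table described above can be pictured as a small array of VA-to-PA entries with a validity state. The following sketch is illustrative only; the structure and function names are hypothetical, and the fixed six-entry table mirrors the P1 to P6 example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of the page table held by the data access device: a VA
 * of a data page of file 1 (e.g., Virt1) mapped to the physical
 * address backing file cache 1 (e.g., Phys1). */
struct dpt_entry {
    uint64_t va;     /* virtual address provided by the server      */
    uint64_t pa;     /* physical address of the local file cache    */
    bool     valid;  /* cleared when the cached page becomes stale  */
};

/* Page table for file 1: entries Virt1-Phys1 .. Virt6-Phys6. */
static struct dpt_entry page_table_1[6];

/* Translate a server-side VA to a local physical address; a miss
 * (or an invalid entry) means the page must be (re)read from the
 * storage server. */
static struct dpt_entry *dpt_lookup(uint64_t va) {
    for (size_t i = 0; i < 6; i++)
        if (page_table_1[i].valid && page_table_1[i].va == va)
            return &page_table_1[i];
    return NULL;
}
```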
In a case where the data access device 31 writes to an address in file cache 1 whose mapping has become invalid, the mapping management software may update the address mapping maintained in the page table, so that the data access device 31 can modify the data in file cache 1 according to the updated page table and then synchronize the modified content to the storage server 32 through the address mapping indicated by the page table, thereby achieving cache coherence between the storage server 32 and the data access device 31.
After mmap, the data access device reads and writes the data in file cache 1 through load/store: a load operation reads data from file cache 1, and a store operation writes data into file cache 1. In this embodiment of the present application, a store operation triggers write protection; for example, the mapping management software saves the old data in file cache 1 into a copy-on-write log (CoW log). If the data access device 31 fails to synchronize the data written to file cache 1 to the storage server 32, it can roll back the unsynchronized data based on the content stored in the CoW log, which avoids data loss after a failed synchronization and improves data security.
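A minimal sketch of the copy-on-write log behavior described above, assuming the log simply snapshots a page's old contents when a store first triggers write protection and restores them if synchronization fails; all names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* One CoW log record: the old contents of a page, captured when a
 * store first triggers write protection on that page. */
struct cow_record {
    uint64_t page_va;
    char     old_data[PAGE_SIZE];
};

/* Capture the old data before the page is modified. */
static void cow_capture(struct cow_record *rec, uint64_t va,
                        const char *page) {
    rec->page_va = va;
    memcpy(rec->old_data, page, PAGE_SIZE);
}

/* If synchronizing the page to the storage server fails, roll the
 * local file cache back to the logged contents. */
static void cow_rollback(const struct cow_record *rec, char *page) {
    memcpy(page, rec->old_data, PAGE_SIZE);
}
```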
Possible implementations of the process by which the data access device 31 synchronizes data to the storage server 32, and of the process by which the data access device 31 reads data from the storage server 32, are given below with reference to fig. 4 and fig. 5.
As shown in fig. 4, fig. 4 is a first schematic flowchart of the data access method provided in the present application, which may be applied to the data access system shown in fig. 3. The data access device 31 includes a processor 311, a memory 312, and a DPU 313; for the hardware implementation of the processor 311, the memory 312, and the DPU 313, refer to the description of the host 1 and the DPU 1 in fig. 2, which is not repeated here.
The data access method includes the following steps S410 to S440.
S410, the processor 311 writes the data to the memory 312.
Illustratively, when an address mapping relationship has been established between storage space in the memory 312 and the storage space provided by the memory pool, the data access device 31 may write data into the storage space in the memory 312 and synchronize the data to the memory pool provided by the storage server 32 according to the address mapping relationship. For the content of the address mapping relationship, refer to the description of fig. 3, which is not repeated here.
In one possible case, an address mapping relationship has already been established between file cache 1 and the first address in the storage server 32, and the processor 311 writes the data (referred to below as the new data) into the storage space in file cache 1 corresponding to the first address.
In another possible case, no address mapping relationship has been established between file cache 1 and the first address in the storage server 32. The processor 311 establishes an address mapping between the storage space of the first address and storage space in file cache 1, reads the old data stored in the storage space of the first address in the storage server 32 through that mapping, and then writes the new data into the storage space in file cache 1 corresponding to the first address. The processor 311 may write the data to the memory 312 by overwriting or by appending.
In one possible example, the storage space corresponding to the first address may refer to one or more data pages, such as P1 through P6 shown in fig. 4. Below, the storage space corresponding to the first address is taken to be P1 as an example.
For example, if the processor 311 writes the new data to P1 in file cache 1 by overwriting, the new data overwrites the old data: once the processor 311 has written the new data into P1, the old data stored in P1 in file cache 1 is invalidated.
For another example, if the old data does not fill the storage space (P1) provided by one page and the new data fits within the remaining storage space of P1, the processor 311 writes the new data into P1 in file cache 1 by appending. For instance, if P1 provides 4 KB of storage space and the old data occupies 2 KB, the new data occupies the remaining 2 KB.
S420, after writing the data to the memory 312, the processor 311 sends a data synchronization request to the DPU 313.
The data synchronization request is used to instruct that the data be stored to the storage server 32. For example, the data synchronization request is implemented by a "Memory sync" command.
Taking P1 in fig. 4 as an example, after the processor 311 writes data to P1 in file cache 1, it sends a "Memory sync" command to the DPU 313 so that the DPU 313 stores the data to the storage server 32.
S430, the DPU 313 transmits a lock request and a data write command to the storage server 32 based on the data synchronization request.
The data write command is used to instruct the storage server 32 to write the data into the storage space corresponding to the first address.
The locking request is used to instruct the storage server 32 to set the storage space corresponding to the first address to be inaccessible to other data access devices while the data write command is being executed. Illustratively, the storage space corresponding to the first address is set to be accessible only by the data access device 31, where accessing includes writing data, reading data, and so on.
As shown in fig. 4, the aforementioned other data access devices are the hosts recorded in the host list of the storage server 32 other than the host 1 of the data access device 31.
While the data access device writes data into the storage server, the storage space being written (the storage space corresponding to the first address) cannot be accessed by other data access devices. That is, for a section of storage space of the storage server, the data access devices writing to it are mutually exclusive (multi-write mutual exclusion for short): only one data access device can access the storage space at a time. This avoids the problem that, when multiple data access devices write data into the same section of storage space, the data becomes inconsistent across the devices and each device reads different data from that storage space.
S440, the processor 311 writes the new data stored in the memory 312 into the storage space corresponding to the first address in the storage server 32.
Since the aforementioned P1 cannot be accessed by other data access devices, the data access device provided in this embodiment can bypass the controller (or processor) of the storage server when writing data into it: the data access device does not need to wait for the controller in the storage server to interact with the disk backing the storage space before writing the data from its memory into the storage server. This shortens the IO path for writing data to the storage server and improves data access efficiency between the data access device and the storage server.
Notably, if the data to be read misses in the memory 312, the processor 311 may also send a read request to the DPU 313, and the DPU 313 reads the data stored at the first address from the storage server 32 based on the first address carried in the read request.
For data stored in the storage server by a data access device, other data access devices can read the newly written data (new data), and the data access device that wrote it can read the new data as well, so the new data is consistent across the data access devices, which improves their data access performance toward the storage server.
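A short sketch of this read path under the same assumptions as the earlier examples: on a miss in the memory 312, the DPU fetches the bytes at the first address directly from the storage server. The one-sided read is modeled as a memcpy from a buffer standing in for the server's storage space, and the names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* On a miss in the local file cache, the processor hands the first
 * address to the DPU, which reads the data straight from the
 * storage server (one-sided read, modeled as a memcpy). */
static void dpu_read_on_miss(const char *server_space,
                             size_t first_addr,
                             char *local_cache, size_t len) {
    memcpy(local_cache, server_space + first_addr, len);
}
```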
It should be understood that, during long-term use, the storage server 32 may also be accessed by other data access devices. An implementation of locking and unlocking a storage space in the storage server is provided below. As shown in fig. 5, fig. 5 is a second schematic flowchart of the data access method provided in the present application. The data access system to which the method applies further includes a data access device 33, which comprises a DPU 333 (which may be implemented by the DPU 3 shown in fig. 2) and the host 3 shown in fig. 2.
Referring to fig. 5, the data access method provided in the present embodiment includes the following four stages.
Stage (1): DPU 313 sends a lock request to storage server 32 that instructs storage server 32 to lock the storage space to which new data is to be written.
As one possible example, the mapping management software in the data access device 31 may identify the data pages in file cache 1 whose data has been modified (dirty pages), thereby determining a dirty page list for file cache 1. The locking request may carry this dirty page list so that the storage server 32 can determine the storage space to be locked from it. For example, if the new data to be written is the data stored in P1, P3, and P5 of file cache 1, then P1, P3, and P5 in file cache 1 may be called the dirty pages of the data access device 31, and the storage space in the storage server 32 into which the new data is to be written is the storage space corresponding to P1, P3, and P5 in the memory pool (the gray parts in fig. 5).
In the storage server 32 shown in fig. 5, the storage server 32 maintains the lock states requested by multiple data access devices by means of a queue. For example, the lock states of one or more data pages in the memory pool are maintained in a lock request queue: during a first period, P1, P3, and P5 are accessible only to the data access device 31, and P2 is accessible only to the data access device 33.
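One way to picture the lock request queue described above: each page of the memory pool tracks which data access device, if any, currently holds exclusive access, and a locking request carrying a dirty page list is granted only if every requested page is free. A minimal sketch with hypothetical names (a production system would queue or retry refused requests rather than simply fail):

```c
#include <stdint.h>

#define NUM_PAGES 6  /* P1..P6 in the memory pool */

/* 0 means unlocked; otherwise the id of the only data access device
 * allowed to access that page while the lock is held. */
static uint32_t lock_owner[NUM_PAGES];

/* Handle a locking request carrying a dirty page list: grant the
 * lock only if every requested page is free or already ours. */
static int server_lock_pages(uint32_t dev, const int *dirty,
                             int ndirty) {
    for (int i = 0; i < ndirty; i++)
        if (lock_owner[dirty[i]] != 0 && lock_owner[dirty[i]] != dev)
            return -1;               /* some page held by another device */
    for (int i = 0; i < ndirty; i++)
        lock_owner[dirty[i]] = dev;  /* e.g., P1, P3, P5 -> device 31    */
    return 0;
}
```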
Stage (2): the data access device 31 writes the data stored in the dirty pages into the storage space in the memory pool corresponding to the addresses of the dirty pages (e.g., the storage space of the first address).
Illustratively, after the storage server 32 successfully locks the storage space in the memory pool corresponding to the addresses of the dirty pages, the DPU 313 writes the data stored in the dirty pages back to the memory pool by one-sided writes, according to the memory pool addresses recorded for the dirty pages of file cache 1.
Stage (3): the DPU 313 reads, by one-sided reads, the host list of the hosts that have mapped file 1 from the storage server 32, and sends an invalidation indication message to the other hosts or data access devices in the host list, so as to achieve cache coherence across the data access devices.
Illustratively, the invalidation indication message is used to instruct the other data access devices to invalidate the old data stored for the first address.
As shown in fig. 5, the file cache 3 of the data access device 33 maps the storage spaces, and caches the data, corresponding to P1, P2, P3, P4, and P6 in the memory pool. Because P1 and P3 have already been overwritten by the new data written by the data access device 31, after receiving the invalidation indication message sent by the DPU 313, the DPU 333 may invalidate the corresponding data in the file cache 3, namely the old data stored at P1 and P3 in the file cache 3, whose addresses are consistent with the addresses of the dirty pages in the file cache 1.
As a possible example, based on the invalidation indication message sent by the DPU 313, the DPU 333 may query a page table maintained by the data access device 33 to determine the physical addresses of the data pages to be invalidated in the file cache 3. For example, the invalidation indication message carries the virtual addresses (VAs) of P1, P3, and P5. After querying the page table, the DPU 333 determines that the physical addresses to be invalidated in the file cache 3 include those of P1 and P3 (P5 is not mapped in the file cache 3). The DPU 333 then modifies the states of Virt1-Phys1 and Virt3-Phys3 in the page table to invalid, so that when the data access device 33 modifies P1 and P3 in the file cache 3, the modified data will not be synchronized to the memory pool. When another data access device needs to use the new data written to the first address, it reads the new data in the storage space of the first address from the storage server again. This avoids data access errors, or reduced data access efficiency, caused by multiple data access devices reading inconsistent data from the same storage space of the storage server.
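A sketch of how the receiving DPU might apply such an invalidation indication message is shown below; the page-table representation, a mapping from virtual address to a physical address plus a valid bit, is an assumption made for illustration.

```python
# Sketch of invalidation on the receiving side. The page-table layout is an
# assumption; only the valid-bit flipping mirrors the text above.

page_table_3 = {                  # page table for file 1 on the host 3
    "Virt1": {"pa": "Phys1", "valid": True},
    "Virt2": {"pa": "Phys2", "valid": True},
    "Virt3": {"pa": "Phys3", "valid": True},
}

def handle_invalidation(page_table, virtual_addresses):
    invalidated = []
    for va in virtual_addresses:
        entry = page_table.get(va)
        if entry is None:
            continue              # e.g. P5 is not mapped in file cache 3
        entry["valid"] = False    # later local writes will not be synced back
        invalidated.append(entry["pa"])
    return invalidated

# The message carries the VAs of P1, P3 and P5; only P1 and P3 are mapped.
assert handle_invalidation(page_table_3,
                           ["Virt1", "Virt3", "Virt5"]) == ["Phys1", "Phys3"]
```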
For the above process in which the DPU 333 invalidates old data, this embodiment provides a possible implementation. As shown in fig. 6, fig. 6 is a schematic diagram of data invalidation provided in this application. The data access device 33 includes the host 3 and the DPU 333; for the hardware implementation of the host 3, reference may be made to the description of the host 1 in fig. 2, and details are not repeated here. The DPU 333 includes a processor 333A and a memory 333B, where the processor 333A may be a CPU (the DPU CPU shown in fig. 6) and the memory 333B may be a DRAM.
A plurality of page tables are maintained in the host 3, with one page table corresponding to each file in a storage server; for example, page table 1 corresponds to file 1. The DPU 333 locally maintains a table recording the start positions of all page tables of the node. When the DPU 333 receives an invalidation indication message sent by another DPU, it obtains the start address of the corresponding page table according to the file identifier (obj ID) carried in the message. For example, the page table physical address determined from the file identifier "1" is "0x34adf", and the data length of page table 1 corresponding to file 1 is 64 B.
Thus, the DPU 333 reads page table 1 from the cache into local memory (the memory 333B) through a compute express link (Compute Express Link, CXL), and the processor 333A modifies the page table entries corresponding to page table 1, such as Virt1-Phys1 and Virt3-Phys3 in fig. 6, rendering both entries invalid. Because these page table entries are associated with P1 and P3 in the file cache 3, once they are invalid the data access device 33 will access the data corresponding to P1 and P3 in file 1 by reading from the storage server, thereby completing the invalidation of the cache.
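The lookup from the file identifier to the start position of the corresponding page table might be sketched as follows, reusing the illustrative values from the text ("0x34adf" and 64 B); the directory structure itself is an assumption.

```python
# Sketch of the per-node directory from file identifier (obj ID) to the page
# table's start address and length; values follow the example in the text.

page_table_directory = {
    "1": {"base": 0x34adf, "length": 64},   # page table 1 for file 1
}

def locate_page_table(obj_id: str):
    # Returns (start address, length) so the DPU can read the page table
    # into its local memory (e.g. over CXL) and flip the valid bits there.
    entry = page_table_directory[obj_id]
    return entry["base"], entry["length"]

assert locate_page_table("1") == (0x34adf, 64)
```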
It should be noted that the above embodiment describes the process of invalidating data in the file cache by taking the DPU 333 as an example. The DPU 313 may likewise receive invalidation information from other data access devices, and accordingly invalidate the data stored in the memory 312 at the second address indicated by the invalidation information.
For example, the second address is different from the first address described above. If the DPU 313 receives invalidation information sent by another data access device, it invalidates the old data stored in the second address (e.g., P2), where the old data is inconsistent with the new data stored in the storage space of the second address in the storage server. This avoids the data access device executing tasks with the old data, which would produce access errors caused by multiple data access devices caching inconsistent data for the same storage space; it also avoids the storage server having to synchronize, after interacting with the data access devices, the data stored at the second address across the multiple devices, which would reduce the data access efficiency of the data access devices with respect to the storage server.
As another example, the second address is the same as the first address described above. It should be understood that, for a given section of storage space in the storage server, multiple data access devices can modify the data in that section at different times. This avoids the restriction that, during data access to the storage server, the data in a section of storage space can be modified by only a single data access device, and thereby improves the performance of the data access service provided by the storage server.
With continued reference to fig. 5, the data access method provided in the present embodiment further includes the following stage (4).
Stage (4): after the new data is successfully written to the first address, DPU 313 sends an unlock request to storage server 32. The unlock request is used to instruct the storage server 32 to set the first address of the data to be accessible by other data access devices.
For data or a file stored in the storage server, after the data or file has been updated by a data access device (e.g., new data written, deleted, or modified), the storage server can unlock the access state of the data or file, so that the updated data or file can be accessed or modified by other data access devices. This avoids the situation in which data or a file in the storage server can be used by only a single data access device, which would reduce the data access efficiency of other data access devices for that data or file.
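Putting the four stages together, a writer-side driver might look like the following sketch; every callable is a placeholder for the corresponding DPU or storage-server operation rather than a real interface.

```python
# Writer-side sketch tying the four stages together: lock, one-sided write,
# invalidation broadcast, unlock. All callables are placeholders.

def synchronize(lock, rdma_write, send_invalidation, unlock,
                dirty_pages, page_addresses, peer_hosts):
    pages = list(dirty_pages)
    if not lock(pages):                              # stage (1)
        raise RuntimeError("lock not granted; caller may retry or queue")
    for page_id, data in dirty_pages.items():        # stage (2)
        rdma_write(page_addresses[page_id], data)
    for host in peer_hosts:                          # stage (3)
        send_invalidation(host, pages)
    unlock(pages)                                    # stage (4)

# Usage with trivial stand-ins that just record the calls:
log = []
synchronize(lambda ps: True,
            lambda addr, d: log.append(("write", hex(addr))),
            lambda h, ps: log.append(("invalidate", h, tuple(ps))),
            lambda ps: log.append(("unlock", tuple(ps))),
            {"P1": b"v2", "P3": b"v2"},
            {"P1": 0x1000, "P3": 0x3000},
            ["host3"])
assert log[-1] == ("unlock", ("P1", "P3"))
```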
It will be appreciated that, in order to implement the functions of the above embodiments, the data access device includes corresponding hardware structures and/or software modules that perform the respective functions. Those of skill in the art will readily appreciate that the elements and method steps of the examples described in connection with the embodiments disclosed herein may be implemented as hardware, or as a combination of hardware and computer software. Whether a function is implemented as hardware or as computer software driving hardware depends on the particular application scenario and the design constraints imposed on the solution.
The data access method and the data access device provided in this embodiment are described in detail above with reference to fig. 1 to fig. 6. The data access functions provided in the embodiments of the present application may also be implemented by software units. For example, a data access apparatus may be applied to the data access device described above, and the data access apparatus may include: a communication module, a storage module, and a lock module. The storage module is configured to write data into the memory. The communication module is configured to send a data synchronization request to the DPU, where the data synchronization request is used to instruct storing the data to the storage server. The lock module, applied to the DPU, sends a lock request to the storage server based on the data synchronization request; the communication module is further configured to send a data writing command to the storage server. The data writing command instructs the storage server to write the data into the storage space corresponding to the first address, and the lock request instructs the storage server to set the storage space corresponding to the first address to be inaccessible to other data access devices while the data writing command is being executed.
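A minimal sketch of this module decomposition, with assumed method signatures, is given below; the application describes these modules only functionally.

```python
# Minimal sketch of the software-unit decomposition. Signatures and the
# stub server are assumptions made for illustration.

class CommunicationModule:
    def __init__(self, server):
        self.server = server
    def send(self, command: str, payload):
        # Carries both the data synchronization request and, on the DPU
        # side, the data writing command to the storage server.
        return self.server.handle(command, payload)

class StorageModule:
    def __init__(self, memory: dict):
        self.memory = memory
    def write(self, address: int, data: bytes):
        self.memory[address] = data   # write data into the local memory

class LockModule:
    def __init__(self, comm: CommunicationModule):
        self.comm = comm
    def request_lock(self, first_address: int):
        # Ask the storage server to make the storage space of first_address
        # inaccessible to other devices while the write command executes.
        return self.comm.send("LOCK", first_address)

class _StubServer:                    # stand-in for the storage server
    def handle(self, command, payload):
        return ("OK", command, payload)

comm = CommunicationModule(_StubServer())
StorageModule({}).write(0x1000, b"data")
assert LockModule(comm).request_lock(0x1000)[0] == "OK"
```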
It should be understood that the data access device of the embodiments of the present application may be implemented by a DPU. The data access device according to the embodiments of the present application may correspondingly perform the methods described in the embodiments of the present application, and the foregoing and other operations and/or functions of the units and modules in the data access device respectively implement the corresponding flows of the methods in the foregoing figures; for brevity, details are not repeated here.
For example, the DPU includes a control circuit and an interface circuit. The interface circuit is configured to receive data from devices other than the DPU and transmit the data to the control circuit, or to send data from the control circuit to devices other than the DPU. The control circuit, together with the interface circuit, performs the functions of the DPU in the data access method described above through logic circuits or by executing code instructions.
The embodiment of the application further provides a network card, including the DPU of the preceding embodiments and a communication interface. The communication interface is configured to transmit data sent by the DPU, or to receive data sent by other devices to the DPU. The DPU thereby implements the operational steps of the data access method provided herein.
The method steps in this embodiment may be implemented by hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (random access memory, RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a computing device. The processor and the storage medium may also reside as discrete components in a network device or terminal device.
The application also provides a chip system, which includes a processor and is configured to implement the functions of the data processing unit in the foregoing method. In one possible design, the chip system further includes a memory for holding program instructions and/or data. The chip system may be composed of chips, or may include a chip and other discrete devices.
In the above embodiments, the functions may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device, or another programmable apparatus. The computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means. The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or data center that integrates one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a digital video disc (digital video disc, DVD)), or a semiconductor medium (e.g., a solid state drive (solid state drive, SSD)).
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A data access device, comprising: a processor, a memory, and a data processing unit;
the processor is configured to: writing data into the memory, and sending a data synchronization request to the data processing unit; the data synchronization request is used for indicating to store the data to a storage server;
the data processing unit is used for: based on the data synchronization request, sending a locking request and a data writing command to the storage server;
the data writing command is used for indicating the storage server to write the data into the storage space corresponding to the first address, and the locking request is used for indicating the storage server to set the storage space corresponding to the first address to be inaccessible by other data access devices in the process of executing the data writing command.
2. The data access device of claim 1, wherein the data processing unit is further configured to: sending an unlock request to the storage server after the data is successfully written into the first address;
the unlock request is to instruct the storage server to set a first address of the data to be accessible by other data access devices.
3. The data access device according to claim 1 or 2, wherein the data processing unit is further configured to: sending an invalidation indication message to the other data access devices;
the invalidation indication message is used for indicating the other data access devices to invalidate the old data stored in the first address.
4. A data access device according to any of claims 1-3, characterized in that the data processing unit is further configured to: according to the invalidation information sent by the other data access devices, invalidating the data stored in the memory at the second address indicated by the invalidation information.
5. The data access device of any of claims 1-4, wherein the processor is further configured to: sending a read request to the data processing unit in the event of a miss for the data in the memory;
the data processing unit is further configured to: reading the data stored in the first address from the storage server based on the first address carried in the read request.
6. A data access method, the data access method being performed by a data access system comprising a data access device and a storage server, the data access device comprising a processor, a memory and a data processing unit, the method comprising:
the processor writes data into the memory and sends a data synchronization request to the data processing unit; the data synchronization request is used for indicating to store the data to a storage server;
the data processing unit sends a locking request and a data writing command to the storage server based on the data synchronization request;
the data writing command is used for indicating the storage server to write the data into the storage space corresponding to the first address, and the locking request is used for indicating the storage server to set the storage space corresponding to the first address to be inaccessible by other data access devices in the process of executing the data writing command.
7. The method of claim 6, wherein the method further comprises:
after the data is successfully written into the first address, the data processing unit sends an unlocking request to the storage server;
the unlock request is to instruct the storage server to set a first address of the data to be accessible by other data access devices.
8. The method according to claim 6 or 7, characterized in that the method further comprises:
the data processing unit sends an invalidation indication message to the other data access devices;
the invalidation indication message is used for indicating the other data access devices to invalidate the old data stored in the first address.
9. The method according to any one of claims 6-8, further comprising:
the data processing unit invalidates, according to the invalidation information sent by the other data access devices, the data stored in the memory at the second address indicated by the invalidation information.
10. The method according to any one of claims 6-9, further comprising:
in the event of a miss for the data in the memory, the processor sends a read request to the data processing unit;
the data processing unit reads the data stored in the first address from the storage server based on the first address carried in the read request.
11. A data access system, comprising: a storage server and the data access device of any one of claims 1 to 5;
the storage server is used for storing the data to be synchronized by the data access device, and setting the storage space corresponding to the first address where the data is to be written to be inaccessible by other data access devices.
12. A data processing unit, comprising: a control circuit and an interface circuit;
the interface circuit is used for receiving data from other devices except the data processing unit and transmitting the data to the control circuit, or sending the data from the control circuit to the other devices except the data processing unit;
the control circuit performs the functions of the data processing unit of any one of claims 6 to 10 by logic circuits or executing code instructions, and the interface circuit.
13. A network card, comprising: the data processing unit and communication interface of claim 12;
The communication interface is used for sending the data sent by the data processing unit, or the communication interface is used for receiving the data sent by other devices to the data processing unit.
CN202211054647.3A 2022-08-31 2022-08-31 Data access device, method, system, data processing unit and network card Pending CN117667761A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211054647.3A CN117667761A (en) 2022-08-31 2022-08-31 Data access device, method, system, data processing unit and network card
PCT/CN2023/089442 WO2024045643A1 (en) 2022-08-31 2023-04-20 Data access device, method and system, data processing unit, and network interface card

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211054647.3A CN117667761A (en) 2022-08-31 2022-08-31 Data access device, method, system, data processing unit and network card

Publications (1)

Publication Number Publication Date
CN117667761A (en) 2024-03-08

Family

ID=90085005

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211054647.3A Pending CN117667761A (en) 2022-08-31 2022-08-31 Data access device, method, system, data processing unit and network card

Country Status (2)

Country Link
CN (1) CN117667761A (en)
WO (1) WO2024045643A1 (en)


Also Published As

Publication number Publication date
WO2024045643A1 (en) 2024-03-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination