CN115525443A - Data storage method and related equipment - Google Patents

Data storage method and related equipment

Info

Publication number
CN115525443A
CN115525443A
Authority
CN
China
Prior art keywords
data
interface
shared
target
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110821703.0A
Other languages
Chinese (zh)
Inventor
孙宏伟
游俊
李光成
苏磊
刘勇
包小明
唐鲲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XFusion Digital Technologies Co Ltd
Original Assignee
XFusion Digital Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XFusion Digital Technologies Co Ltd
Publication of CN115525443A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/0223 User address space allocation, e.g. contiguous or non-contiguous base addressing
    • G06F 12/023 Free address space management
    • G06F 12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/544 Buffers; Shared memory; Pipes

Abstract

According to the data storage method and the related device, after the network device obtains target data needing to be stored by a target application, the network device can store the target data in a target memory pool according to the type of the target data. The target memory pool comprises a local memory area and a shared memory area, wherein the local memory area is used for indicating local memory resources of the network equipment, and the shared memory area is used for indicating a logic memory formed by the memory resources of the network equipment and/or a plurality of other network equipment. The network equipment can store the target data into different memory areas in the target memory pool according to the type of the target data, so that the storage efficiency of the target data is improved.

Description

Data storage method and related equipment
The present application claims priority to the Chinese patent application entitled "A memory system", filed with the Chinese Patent Office on 25 June 2021 with application number 202110713213.9, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of communications, and in particular, to a data storage method and related device.
Background
At present, compute pooling and storage pooling have become mainstream technologies at the infrastructure level. To further improve application performance, memory pooling has begun to receive attention in the industry. In a conventional memory pooling scheme, viewed from the architecture level, a network device generally accesses remote memory by one of several means when it determines that local memory is insufficient or that data needs to be shared and exchanged, thereby implementing sharing and pooling of a global memory.
The conventional pooling scheme realizes the sharing and pooling of remote memory at the system-architecture level; its essence is to expand the local memory of a network device, so that an efficient and transparent global memory pool service can be provided for a network device running a standalone application. However, for a plurality of network devices running a non-standalone application, the conventional pooling scheme cannot effectively determine the number of network devices occupied by swap (SWAP) space or the memory capacity it occupies, and all data is stored indiscriminately in the same memory area, which makes storage inefficient.
Disclosure of Invention
The embodiment of the application provides a data storage method, and a network device can store target data into different memory areas in a target memory pool according to the type of the target data, so that the storage efficiency of the target data is improved.
A first aspect of the present application provides a data storage method, including: the method comprises the steps that network equipment acquires target data, target applications run on the network equipment, and the target data are used for indicating data needing to be stored by the target applications; the network device stores the target data in a target memory pool according to the type of the target data, wherein the target memory pool comprises a local memory area and a shared memory area, the local memory area is used for indicating local memory resources of the network device, and the shared memory area is used for indicating a logic memory formed by the memory resources of the network device and/or a plurality of other network devices.
In the application, after the network device obtains the target data required to be stored by the target application, the network device can store the target data in the target memory pool according to the type of the target data. The target memory pool comprises a local memory area and a shared memory area, wherein the local memory area is used for indicating local memory resources of the network equipment, and the shared memory area is used for indicating a logic memory formed by the memory resources of the network equipment and/or a plurality of other network equipment. The network equipment can store the target data into different memory areas in the target memory pool according to the type of the target data, so that the storage efficiency of the target data is improved.
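As a rough sketch only, and not the patent's actual implementation, the placement rule of the first aspect can be illustrated in Python; `MemoryPool`, `DataType`, and the dictionary-backed areas below are hypothetical stand-ins for the local memory area and the shared memory area:

```python
from enum import Enum, auto

class DataType(Enum):
    NON_SHARED = auto()  # data accessed only by the local network device
    SHARED = auto()      # data that other network devices may access

class MemoryPool:
    """Hypothetical target memory pool with a local and a shared region."""
    def __init__(self):
        self.local_area = {}   # stands in for local memory resources
        self.shared_area = {}  # stands in for the pooled logical memory

    def store(self, key, data, data_type):
        # The core rule of the first aspect: placement is decided by data type.
        if data_type is DataType.NON_SHARED:
            self.local_area[key] = data
        else:
            self.shared_area[key] = data

pool = MemoryPool()
pool.store("t1", b"\x00", DataType.NON_SHARED)
pool.store("t2", b"\x01", DataType.SHARED)
```

The possible implementation manners that follow subdivide each region further by data type (cold/hot, persistent/non-persistent).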
In a possible implementation manner of the first aspect, the type of the target data includes non-shared data, where the non-shared data is used to indicate data that does not need to be accessed by other network devices, and the network device stores the target data in a target memory pool according to the type of the target data, including: the network equipment confirms that the target data is the non-shared data; and the network equipment stores the non-shared data in the local memory area according to the type of the non-shared data.
In this possible implementation manner, the network device determines whether the target data is shared data by determining whether the target data needs to be exchanged or shared with other network devices by the target application, and if the target application determines that the target data does not need to be exchanged or shared with other network devices, the target application may determine that the target data is non-shared data. The network device can store the non-shared data in different positions in the local memory area according to the type of the non-shared data.
In a possible implementation manner of the first aspect, the type of the non-shared data includes cold data, where the cold data is used to indicate data that the network device accesses with a low frequency, and the storing, by the network device, of the non-shared data in the local memory area according to the type of the non-shared data includes: the network device confirms that the non-shared data is the cold data; and the network device stores the cold data in a disk of the local memory area.
In this possible implementation manner, the network device determines whether the target data is cold data by having the target application check whether the access frequency of the target data reaches a preset threshold; if the target application determines that the access frequency of the target data does not reach the preset threshold, the target application may determine that the target data is cold data. Since cold data is accessed at a low frequency and hot data is accessed at a high frequency, cold data can be preferentially stored in memory resources with a lower response speed, such as a disk. Optionally, the medium of the disk may be a Hard Disk Drive (HDD), a Solid State Drive (SSD), or another type of medium, which is not limited herein.
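The cold/hot decision described above can be sketched as a simple classifier; the threshold value here is an assumption for illustration, as the patent only refers to "a preset threshold" without fixing one:

```python
ACCESS_THRESHOLD = 10  # assumed preset threshold; the patent does not fix a value

def classify_by_frequency(access_count: int) -> str:
    # Below the preset threshold -> cold; at or above it -> hot.
    return "cold" if access_count < ACCESS_THRESHOLD else "hot"

def place_non_shared(access_count: int) -> str:
    # Cold data tolerates a slower medium (HDD or SSD), so it goes to disk;
    # hot data is preferentially kept in the faster cache tier.
    return "disk" if classify_by_frequency(access_count) == "cold" else "cache"
```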
In a possible implementation manner of the first aspect, the type of the non-shared data includes hot data, where the hot data is used to indicate data that the network device accesses with a high frequency, and the storing, by the network device, of the non-shared data in the local memory area according to the type of the non-shared data includes: the network device confirms that the non-shared data is the hot data; and the network device stores the hot data in a cache of the local memory area.
In this possible implementation manner, the network device determines whether the target data is hot data by having the target application check whether the access frequency of the target data reaches the preset threshold; if the target application determines that the access frequency of the target data reaches the preset threshold, the target application may determine that the target data is hot data. Because cold data is accessed at a low frequency and hot data at a high frequency, hot data can be preferentially stored in memory resources with a high response speed, such as a cache, to further reduce the time the network device spends reading the hot data. Optionally, the medium of the cache may be a Dynamic Random Access Memory (DRAM) or another type of medium, which is not limited herein.
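A minimal sketch, with illustrative names only, of why the local memory area benefits from this split: hot data kept in a DRAM-like cache tier is served without touching the slower disk tier.

```python
class TieredLocalArea:
    """Hypothetical local memory area: a fast cache tier (DRAM-like)
    in front of a slow disk tier (HDD/SSD-like)."""
    def __init__(self):
        self.cache = {}  # fast tier for hot (frequently accessed) data
        self.disk = {}   # slow tier for cold (rarely accessed) data

    def store(self, key, value, hot: bool):
        # Hot data is preferentially kept in the cache tier to shorten reads.
        (self.cache if hot else self.disk)[key] = value

    def read(self, key):
        # A cache hit avoids the slower disk path entirely.
        if key in self.cache:
            return self.cache[key]
        return self.disk.get(key)
```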
In a possible implementation manner of the first aspect, the type of the target data includes shared data, where the shared data is used to indicate data that needs to be accessed by another network device, and the network device stores the target data in a target memory pool according to the type of the target data, including: the network equipment confirms that the target data is the shared data; and the network equipment stores the shared data in the shared memory area according to the type of the shared data.
In this possible implementation manner, after the network device determines that the type of the target data is the shared data, the network device may store the shared data in different locations in the shared memory area according to the type of the shared data. The types of the shared data at least include persistent data and non-persistent data, where the persistent data is used to indicate data that the network device needs to read and use many times, and the non-persistent data is used to indicate data that is discarded after the network device reads it a few times. After the network device stores the shared data in the shared memory area, other network devices can directly access the shared data from the shared memory area through Remote Direct Memory Access (RDMA) and/or Data Streaming Accelerator (DSA) protocols, which shortens the time consumed in accessing the target data and improves operation efficiency.
In a possible implementation manner of the first aspect, the type of the shared data includes persistent data, and the storing, by the network device, the shared data in the shared memory area according to the type of the shared data includes: the network equipment confirms that the shared data is the persistent data; the network device stores the persistent data in a non-volatile medium of the shared memory area.
In this possible implementation manner, the network device determines through the target application whether the target data is persistent data. Optionally, the network device may also confirm whether the target data is persistent data in other ways, which is not limited herein. Since persistent data needs to be read many times, it needs to be kept in memory for a long time, and the network device may preferentially store the persistent data in a nonvolatile medium to prevent it from being lost. Optionally, the nonvolatile medium may be a Phase Change Memory (PCM) or another type of medium, which is not limited herein.
In a possible implementation manner of the first aspect, the type of the shared data includes non-persistent data, and the storing, by the network device, the shared data in the shared memory area according to the type of the shared data includes: the network equipment confirms that the shared data is the non-persistent data; the network device stores the non-persistent data in a volatile medium of the shared memory area.
In this possible implementation, the network device determines whether the target data is non-persistent data through the target application. Since the non-persistent data need not be read multiple times, the non-persistent data need not be stored in memory for a long period of time, and the network device may preferentially store the non-persistent data in the volatile medium. Alternatively, the volatile medium may be DRAM, and the volatile medium may also be other types of media, which are not limited herein.
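The two placements for shared data can be sketched together; `SharedArea` and its dictionary-backed media are hypothetical stand-ins for the volatile (DRAM-like) and nonvolatile (PCM-like) media of the shared memory region:

```python
class SharedArea:
    """Hypothetical shared memory region with a volatile and a
    nonvolatile medium."""
    def __init__(self):
        self.volatile = {}     # lost on power-off; for non-persistent data
        self.nonvolatile = {}  # survives power-off; for persistent data

    def store(self, key, value, persistent: bool):
        # Persistent data is read many times and must not be lost, so it
        # is placed on the nonvolatile medium; short-lived data stays in
        # the cheaper-to-access volatile medium.
        target = self.nonvolatile if persistent else self.volatile
        target[key] = value
```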
In a possible implementation manner of the first aspect, the target memory pool includes a northbound interface, where the northbound interface includes a first level interface, a second level interface, and a third level interface, where the first level interface includes a memory semantic interface, the second level interface includes a distributed data structure interface, and the third level interface includes an application semantic interface, a file semantic interface, and/or a programming model interface.
In this possible implementation, the target memory pool may further include a plurality of northbound interfaces, and different developers may use the target memory pool through an appropriate interface according to the application type and performance requirements.
In a possible implementation manner of the first aspect, the target application invokes the third-level interface by selecting a specific interface object in the third-level interface, the third-level interface invokes the second-level interface by selecting a specific interface object in the second-level interface, the second-level interface invokes the first-level interface by selecting a specific interface object in the first-level interface, and the first-level interface is used to access the target memory pool.
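The three-level delegation can be sketched as layered classes; all class and method names here are illustrative assumptions, not the patent's actual API: a third-level Cache interface is built on a second-level Key-Value structure, which is built on the first-level memory-semantic interface that touches the pool.

```python
class MemorySemantic:
    """First level: memory-semantic access to the target memory pool."""
    def __init__(self):
        self.pool = {}  # stand-in for pooled memory, keyed by address
    def mem_put(self, addr, value):
        self.pool[addr] = value
    def mem_get(self, addr):
        return self.pool.get(addr)

class KVStore:
    """Second level: a distributed data structure (KV) built on level one."""
    def __init__(self, mem: MemorySemantic):
        self.mem = mem
    def put(self, key, value):
        self.mem.mem_put(hash(key), value)
    def get(self, key):
        return self.mem.mem_get(hash(key))

class CacheInterface:
    """Third level: an application-semantic Cache built on level two."""
    def __init__(self, kv: KVStore):
        self.kv = kv
    def cache_set(self, key, value):
        self.kv.put(key, value)
    def cache_get(self, key):
        return self.kv.get(key)
```

Each level only calls the level directly below it, which matches the selection chain described above.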
In a possible implementation manner of the first aspect, the memory semantic interface includes a Big Memory interface, a persistent multi-copy log (Plog), and/or a Memory object interface.
In a possible implementation manner of the first aspect, the distributed data structure interface includes a Key-Value (KV) store, a Hash, a B+ Tree, an Array, and/or a Matrix.
In a possible implementation manner of the first aspect, the application semantic interface includes Shuffle, Cache, Write-Ahead Logging (WAL), Message Queue (MQ), and/or AI Parameter Servers; the file semantic interface includes the Portable Operating System Interface (POSIX), the Hadoop Distributed File System (HDFS), MPI-IO, and/or Java File; the programming model interfaces include UPC, UPC++, OpenSHMEM, and/or X10.
In a possible implementation manner of the first aspect, the target memory pool includes the local memory area and the shared memory area, the local memory area includes a disk and a cache, and the shared memory area includes a volatile medium and a nonvolatile medium.
A second aspect of the present application provides a network device that includes at least one processor, memory, and a communication interface. The processor is coupled with the memory and the communication interface. The memory is for storing instructions, the processor is for executing the instructions, and the communication interface is for communicating with other network devices under control of the processor. The instructions, when executed by the processor, cause the network device to perform the method of the first aspect or any possible implementation manner of the first aspect.
A third aspect of the present application provides a computer-readable storage medium storing a program for causing a network device to perform the method of the first aspect or any possible implementation manner of the first aspect.
A fourth aspect of the present application provides a computer program product storing one or more computer-executable instructions that, when executed by a processor, perform the method of the first aspect or any one of the possible implementations of the first aspect.
A fifth aspect of the present application provides a chip, which includes a processor and a communication interface, where the processor is coupled to the communication interface, and the processor is configured to read an instruction to execute the method according to the first aspect or any one of the possible implementation manners of the first aspect.
A sixth aspect of the present application provides a communication system, which includes the network device described in the first aspect or any one of the possible implementation manners of the first aspect.
A seventh aspect of the present application provides a converged memory system, where the converged memory system includes a plurality of network devices, and the network devices store data using a target memory pool;
the target memory pool comprises a local memory area and a shared memory area, the local memory area comprises a disk and a cache, and the shared memory area comprises a volatile medium and a nonvolatile medium.
An eighth aspect of the present application provides a converged memory system, where the converged memory system includes a first network device and a plurality of second network devices, and the first network device stores data using a target memory pool;
the target memory pool comprises a local memory area and a shared memory area, the local memory area is used for indicating a local memory of the first network device, the local memory area comprises a disk and a cache, the shared memory area is used for indicating a logic memory constructed after memory resources in a plurality of second network devices are pooled, and the shared memory area comprises a volatile medium and a nonvolatile medium.
According to the technical scheme, the embodiment of the application has the following advantages:
in the application, after the network device obtains the target data required to be stored by the target application, the network device can store the target data in the target memory pool according to the type of the target data. The target memory pool comprises a local memory area and a shared memory area, wherein the local memory area is used for indicating local memory resources of the network equipment, and the shared memory area is used for indicating a logic memory formed by the memory resources of the network equipment and/or a plurality of other network equipment. The network equipment can store the target data into different memory areas in the target memory pool according to the characteristics of the target data, so that the storage efficiency of the target data is improved.
Drawings
Fig. 1 is a schematic view of an application scenario of a communication system provided in the present application;
fig. 2 is a schematic view of an application scenario of another communication system provided in the present application;
FIG. 3 is a schematic diagram of an application of a data storage method provided in the present application;
fig. 4 is a schematic structural diagram of a network device provided in the present application;
FIG. 5 is a schematic structural diagram of a northbound interface provided herein;
FIG. 6 is a schematic diagram of an application of a northbound interface provided in the present application;
fig. 7 is a schematic structural diagram of a network device provided in the present application;
fig. 8 is a schematic structural diagram of a network device according to the present application.
Detailed Description
The embodiments of the present application are described below with reference to the drawings, and it can be known by those skilled in the art that with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
The terms "first," "second," and the like in the description and claims of this application and in the foregoing drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein.
In the present application, "and/or" merely describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B may be singular or plural. Also, in the description of the present application, "a plurality" means two or more unless otherwise specified. "At least one of the following" or a similar expression refers to any combination of these items, including any combination of singular or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may be single or multiple.
At present, compute pooling and storage pooling have become mainstream technologies at the infrastructure level. To further improve application performance, memory pooling has begun to receive attention in the industry. In a conventional memory pooling scheme, viewed from the architecture level, a network device generally accesses remote memory by one of several means when it determines that local memory is insufficient or that data needs to be shared and exchanged, thereby implementing sharing and pooling of a global memory.
The conventional pooling scheme realizes the sharing and pooling of remote memory at the system-architecture level; its essence is to expand the local memory of a network device, so that an efficient and transparent global memory pool service can be provided for a network device running a standalone application. However, for a plurality of network devices running a non-standalone application, the conventional pooling scheme cannot effectively determine the number of network devices occupied by swap (SWAP) space or the memory capacity it occupies, and all data is stored indiscriminately in the same memory area, which makes storage inefficient.
To address the problems of the conventional pooling scheme described above, the present application provides a data storage method, a communication system, and a network device. The network device can store the target data in different memory areas in the target memory pool according to the characteristics of the target data, so that the storage efficiency of the target data is improved.
The following examples respectively describe the data storage method, the communication system and the network device provided by the present application with reference to the accompanying drawings. The communication system provided by the present application is first introduced.
The communication system provided by the present application may also be referred to as a converged memory system (large memory system), and the converged memory system may have two deployment manners, which are described in detail below.
Mode 1: converged deployment.
Fig. 1 is a schematic application scenario diagram of a communication system provided in the present application.
In the present application, it is assumed that the network device in the converged memory system is a compute node. The target memory pool in the compute node comprises a local memory area and a shared memory area, where the local memory area is used to indicate local memory resources of the compute node, and the shared memory area is used to indicate a logical memory formed by the memory resources of the network device and a plurality of other network devices. If the converged memory system is deployed in the converged deployment mode, all memory resources in the shared memory region of the target memory pool are provided by the compute nodes in the compute node cluster, and part or all of the nodes in the cluster contribute part of their own memory as the shared memory region.
Mode 2: separated deployment.
Fig. 2 is a schematic view of an application scenario of another communication system provided in the present application.
In the present application, it is assumed that the network device in the converged memory system is a compute node. The target memory pool in the computing node comprises a local memory area and a shared memory area, wherein the local memory area is used for indicating local memory resources of the computing node, and the shared memory area is used for indicating a logic memory formed by the memory resources of a plurality of other network devices. If the converged memory system is deployed in a separate deployment manner, all memory resources in the shared memory region in the target memory pool are provided by the storage nodes in the separate memory cluster, and the storage nodes are only used for providing the memory resources in the shared memory region, and do not process other computing services.
In the present application, the converged memory system may be deployed in a converged deployment manner, the converged memory system may also be deployed in a separated deployment manner, and the converged memory system may also be deployed in other manners, which is not limited herein.
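The two deployment modes can be summarized in a small illustrative helper; the function name and mode strings are assumptions for this sketch, not terms from the patent:

```python
def shared_area_providers(deployment, compute_nodes, storage_nodes):
    """Decide which nodes contribute memory to the shared region
    under the two deployment modes described above."""
    if deployment == "converged":
        # Converged: compute nodes themselves contribute part of their memory.
        return list(compute_nodes)
    if deployment == "separated":
        # Separated: dedicated storage nodes provide all shared memory and
        # run no other computing services.
        return list(storage_nodes)
    raise ValueError("unknown deployment mode: " + deployment)
```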
The above examples describe two deployment manners of the converged memory system, and the following examples describe specific implementation manners of the network device in the converged memory system provided by the present application.
In this application, optionally, the network device may be a compute node, and the converged memory system may include one or more compute nodes. A plurality of compute nodes may form a compute node cluster, and all the compute nodes may be interconnected. A compute node may be a server, a desktop computer, a controller of a storage array, a hard disk enclosure, or the like.
Functionally, a compute node is mainly used to compute or process data. In hardware, a compute node includes at least a processor, a memory, and a control unit. The processor is a Central Processing Unit (CPU) configured to process data from outside the network device or data generated inside the compute node. The storage is a device for storing data, and may be a memory or a hard disk. The memory is an internal storage that exchanges data directly with the processor; it can read and write data at any time, is fast, and serves as temporary data storage for an operating system or other running programs. The memory includes at least two types; for example, it may be a Random Access Memory (RAM) or a Read-Only Memory (ROM). For example, the random access memory may be a Dynamic Random Access Memory (DRAM) or a Storage Class Memory (SCM). DRAM is a semiconductor memory and, like most Random Access Memories (RAMs), is a volatile memory device. SCM is a hybrid storage technology that combines the characteristics of traditional storage devices and memory: it provides faster read and write speeds than a hard disk, but is slower in access speed and lower in cost than DRAM.
The structure of the communication system (converged memory system) provided by the present application is described in the above example, and the following example will describe the data storage method provided by the present application in detail with reference to the communication system described in the above example.
Fig. 3 is a schematic diagram of an application of a data storage method provided in the present application.
As shown in fig. 3, the data storage method provided in the present application includes at least steps 201 to 202.
201. The network device obtains target data.
In the application, the target application is run on the network device, and the target data is used for indicating data which needs to be stored by the target application. Optionally, the target application running on the network device may be a big data type application, the target application may be a High Performance Computing (HPC) type application, the target application may be an Artificial Intelligence (AI) type application, the target application may be a database type application, the target application may also be a World Wide Web (Web) type application, the target application may also be another type of application, and the specific details are not limited herein.
In this application, the target data may be data that needs to be stored by a process of the target application, and the process may be a local process of a single-process application or a local process of a distributed application, which is not limited herein.
202. And the network equipment stores the target data in the target memory pool according to the type of the target data.
In the present application, the target memory pool includes a local memory area and a shared memory area, where the local memory area is used to indicate a local memory resource of the network device, and the shared memory area is used to indicate a logical memory formed by memory resources of the network device and/or multiple other network devices.
In the application, after the network device obtains the target data required to be stored by the target application, the network device can store the target data in the target memory pool according to the type of the target data. The target memory pool comprises a local memory area and a shared memory area, wherein the local memory area is used for indicating local memory resources of the network equipment, and the shared memory area is used for indicating a logic memory formed by the memory resources of the network equipment and/or a plurality of other network equipment. The network equipment can store the target data into different memory areas in the target memory pool according to the type of the target data, so that the storage efficiency of the target data is improved.
In this application, steps 201 to 202 of the above method example illustrate the data storage method provided by this application. Step 202, in which the network device stores the target data in the target memory pool according to the type of the target data, has specific implementations that are described in the following method examples.
In this application, the type of the target data may include non-shared data and shared data, the non-shared data is used to indicate data that does not need to be accessed by other network devices, and the shared data is used to indicate data that needs to be accessed by other network devices.
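The shared / non-shared routing described above can be sketched as a simple dispatch. This is a hypothetical illustration, not code from this application; the function and area names are invented for clarity:

```python
# Hypothetical sketch of the placement decision: shared data goes to the shared
# memory area, non-shared data to the local memory area. `shared` stands for the
# target application's judgement of whether the data needs to be exchanged or
# shared with other network devices.

def store_target_data(data: bytes, shared: bool, pool: dict) -> str:
    """Route target data into the local or shared area of the target memory pool."""
    area = "shared_area" if shared else "local_area"
    pool[area].append(data)
    return area
```

A caller would model the target memory pool as `{"local_area": [...], "shared_area": [...]}` and let the target application supply the `shared` flag.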
Scene 1: the target data is non-shared data.
The network device confirms that the target data is non-shared data.
In this application, the network device determines, through the target application, whether the target data needs to be exchanged or shared with other network devices. If the target application determines that the target data does not need to be exchanged or shared with other network devices, it may determine that the target data is non-shared data.
Optionally, the network device may determine whether the target data is the non-shared data through the target application, and the network device may also determine whether the target data is the non-shared data through a manner other than the target application, which is not limited herein.
The network device stores the non-shared data in the local memory area according to the type of the non-shared data.
In this application, after the network device determines that the type of the target data is non-shared data, the network device may store the non-shared data in different locations in the local memory area according to the type of the non-shared data. The types of non-shared data include at least cold data and hot data, where cold data indicates data that the network device accesses with low frequency, and hot data indicates data that the network device accesses with high frequency.
Case 1: the unshared data is cold data.
The network device confirms the unshared data as cold data.
In this application, optionally, the network device determines, through the target application, whether the access frequency of the target data reaches a preset threshold. If the target application determines that the access frequency of the target data does not reach the preset threshold, it may determine that the target data is cold data. Optionally, the network device may also confirm whether the target data is cold data in other ways, which is not limited herein.
The network device stores the cold data to a disk in a local memory area.
In this application, because cold data is accessed with low frequency and hot data with high frequency, cold data can preferentially be stored in memory resources with a lower response speed, such as a disk. Optionally, the medium of the disk may be an HDD, an SSD, or another type of medium, which is not limited herein.
Case 2: the unshared data is hot data.
The network device confirms the unshared data as hot data.
In this application, optionally, the network device determines, through the target application, whether the access frequency of the target data reaches a preset threshold. If the target application determines that the access frequency of the target data reaches the preset threshold, it may determine that the target data is hot data. Optionally, the network device may also confirm whether the target data is hot data in other ways, which is not limited herein.
The network device stores the hot data in a cache of the local memory area.
In this application, because cold data is accessed with low frequency and hot data with high frequency, hot data can preferentially be stored in memory resources with a higher response speed, such as a cache. Optionally, the cache medium may be DRAM or another type of medium, which is not limited herein.
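The two cases above amount to a threshold test on access frequency. The sketch below is illustrative only: the application does not fix the threshold value or the counting mechanism, so both are assumptions here:

```python
# Illustrative cold/hot split for non-shared data. Data whose access frequency
# does not reach the preset threshold is cold and goes to the disk; data at or
# above the threshold is hot and goes to the cache. The threshold value is an
# assumed tuning parameter, not specified by this application.

PRESET_THRESHOLD = 10  # accesses per observation window (assumed value)

def place_non_shared(access_frequency: int) -> str:
    """Return the local-memory-area tier for a piece of non-shared data."""
    if access_frequency >= PRESET_THRESHOLD:
        return "cache"  # hot data: high-response-speed medium (e.g. DRAM)
    return "disk"       # cold data: low-response-speed medium (e.g. HDD/SSD)
```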
Scene 2: the target data is shared data.
The network device confirms the target data as shared data.
In this application, the network device determines, through the target application, whether the target data needs to be exchanged or shared with other network devices. If the target application determines that the target data needs to be exchanged or shared with other network devices, it may determine that the target data is shared data.
Optionally, the network device may determine whether the target data is shared data through the target application, and the network device may also determine whether the target data is shared data through another method other than the target application, which is not limited herein.
The network device stores the shared data in the shared memory area according to the type of the shared data.
In this application, after the network device determines that the type of the target data is shared data, the network device may store the shared data in different locations in the shared memory area according to the type of the shared data. The types of shared data include at least persistent data and non-persistent data, where persistent data indicates data that the network device needs to read and use multiple times, and non-persistent data indicates data that the network device reads a small number of times and then discards.
Case 1: the shared data is persistent data.
The network device confirms the shared data as persistent data.
In this application, optionally, the network device determines whether the target data is persistent data through the target application, and optionally, the network device may also determine whether the target data is persistent data through other manners, which is not limited herein.
The network device stores the persistent data to a non-volatile medium in the shared memory area.
In the application, since the persistent data needs to be read for multiple times, the persistent data needs to be stored in the memory for a long time, and the network device can preferentially store the persistent data in the nonvolatile medium to prevent the loss of the persistent data. Optionally, the non-volatile medium may be PCM, and the non-volatile medium may also be other types of media, which is not limited herein.
Case 2: the shared data is non-persistent data.
The network device confirms the shared data as non-persistent data.
In this application, optionally, the network device determines whether the target data is non-persistent data through the target application. Optionally, the network device may also determine whether the target data is non-persistent data by other ways, which is not limited herein.
The network device stores the non-persistent data in a volatile medium in the shared memory region.
In this application, because non-persistent data does not need to be read multiple times, it does not need to be kept in memory for a long time, and the network device can preferentially store the non-persistent data in a volatile medium. Optionally, the volatile medium may be DRAM or another type of medium, which is not limited herein.
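Mirroring the cold/hot split, the two shared-data cases reduce to a persistence test. A minimal sketch, with an invented function name and with the medium examples (PCM, DRAM) taken from the optional examples in the text:

```python
# Illustrative persistent/non-persistent split for shared data: persistent data
# is placed in a non-volatile medium of the shared memory area to prevent loss,
# non-persistent data in a volatile medium.

def place_shared(persistent: bool) -> str:
    """Return the shared-memory-area medium for a piece of shared data."""
    if persistent:
        return "non_volatile_medium"  # e.g. PCM: read and used multiple times
    return "volatile_medium"          # e.g. DRAM: read a few times, then discarded
```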
Fig. 4 is a schematic structural diagram of a network device provided in the present application.
Fig. 4 is an example of a process in which the network device stores target data in different areas of the target memory pool according to the type of the target data.
Referring to fig. 4, assume that node 1 is the network device. The target memory pool on node 1 includes a local memory area and a shared memory area; it is a memory pool formed by fusing a single-node multi-level memory (the local memory area) with a distributed memory pool (the shared memory area), and the volatile media and non-volatile media in the distributed memory pool can be allocated independently.
In this application, the local memory area has multiple tiers. Its media include a cache and a disk, and DRAM, PCM, SSD, and the like form the different tiers of the local memory area, thereby realizing multi-tier memory expansion across DRAM, PCM, and SSD. The shared memory area includes a global memory pool (Memory Pool) composed of volatile media and a persistent memory pool (Persistent Memory Pool) composed of non-volatile media; the global memory pool composed of volatile media provides a globally accessible memory space, and the memory pool composed of non-volatile media provides a memory space with global persistence capability.
If the target application (app) needs to store the target data, the target application first confirms the characteristics of the target data. If the target data is non-shared data, the application further confirms the characteristics of the non-shared data: cold data is stored in the disk, and hot data is stored in the cache. If the target data is shared data, persistent data is stored in the memory pool composed of non-volatile media, and non-persistent data is stored in the global memory pool composed of volatile media.
The above methods illustrate the process by which the network device stores the target data in the target memory pool according to the type of the target data. The target memory pool provided in this application may further include multiple northbound interfaces, so that different developers can use the target memory pool through an appropriate interface according to the type and performance requirements of their applications.
Fig. 5 is a schematic structural diagram of a northbound interface provided in the present application.
Referring to fig. 5, the target memory pool includes a northbound interface, the northbound interface includes a first-level interface, a second-level interface, and a third-level interface, the first-level interface includes a memory semantic interface, the second-level interface includes a distributed data structure interface, and the third-level interface includes an application semantic interface, a file semantic interface, and/or a programming model interface.
In this application, optionally, the memory semantic interface may include a large memory (Big Memory) interface, a persistent multi-copy (plog) interface, and/or a memory object (Memory) interface, and may further include other interfaces, which is not limited herein.
In this application, optionally, the distributed data structure interface may include a key-value (KV) interface, a hash (Hash) interface, a B+ tree (B+ Tree) interface, an array (Array) interface, and/or a matrix (matrix) interface, and may further include other interfaces, which is not limited herein.
In this application, optionally, the application semantic interface may include Shuffle, Cache, WAL, a message queue (MQ), and/or a parameter server (AI Parameter Servers), and may further include other interfaces, which is not limited herein.
In this application, optionally, the file semantic interface may include a Portable Operating System Interface (Posix), a distributed file system (HDFS) interface, MPI-IO, and/or a Java File interface.
In this application, optionally, the programming semantic interface may include UPC++, OpenSHMEM, and/or X10, and may further include other interfaces, which is not limited herein.
An interface object refers to an element in an interface; for example, the Big Memory mentioned above may be understood as a specific interface object in the memory semantic interface.
Fig. 6 is an application diagram of a northbound interface provided in the present application.
Illustratively, taking an AI application as the target application, the process of using the target memory pool through the northbound interface is described below.
Optionally, the northbound interface may further include a set communication semantic interface and a memory pool operator interface shown in fig. 6.
In this application, the AI application schedules a gather interface through the PS interface; the gather interface is used to query the target data. The gather interface selects a target matrix for the data lookup through the matrix interface (matrix), and the matrix interface, after selecting the Big Memory interface, passes the target matrix into the target memory pool to look up the target data. After the target data is found through the Big Memory interface, it is returned through the Big Memory interface and the matrix interface to the scatter interface, and the scatter interface can use the target data to update the data in the AI application in the reverse direction through the PS interface.
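The call chain above, in which each interface level schedules the next by selecting a specific interface object, can be sketched as follows. All class and method names here are invented for illustration; the application only specifies the three-level scheduling relationship, not any concrete API:

```python
# Hypothetical sketch of the layered northbound call chain: the AI application's
# PS interface schedules gather (third level), gather selects a matrix object
# (second level), and the matrix object selects Big Memory (first level), which
# reaches the target memory pool.

class BigMemory:
    """First-level (memory semantic) interface over the target memory pool."""
    def __init__(self, pool):
        self.pool = pool

    def lookup(self, key):
        return self.pool.get(key)

class Matrix:
    """Second-level (distributed data structure) interface."""
    def __init__(self, big_memory):
        self.big_memory = big_memory

    def find(self, key):
        return self.big_memory.lookup(key)

class Gather:
    """Third-level (application semantic) interface, scheduled via PS."""
    def __init__(self, matrix):
        self.matrix = matrix

    def query(self, key):
        return self.matrix.find(key)

# Each level schedules the next by selecting a specific interface object.
target_pool = {"weights_0": [0.1, 0.2]}  # stand-in for the target memory pool
gather = Gather(Matrix(BigMemory(target_pool)))
```

Here `gather.query("weights_0")` walks the three levels down to the pool and returns the stored value; the reverse scatter/update path would traverse the same chain in the opposite direction.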
The foregoing examples provide different embodiments of the data storage method. A network device 30 is provided below. As shown in fig. 7, the network device 30 is configured to execute the steps executed by the network device (computing node) in the foregoing examples; those steps and their corresponding beneficial effects can be understood with reference to the corresponding examples above and are not described here again. The network device 30 includes:
an obtaining unit 301, configured to obtain target data, where the network device runs a target application, and the target data is used to indicate data that the target application needs to store;
a storage unit 302, configured to store the target data in a target memory pool according to the type of the target data, where the target memory pool includes a local memory area and a shared memory area, the local memory area is used to indicate a memory resource local to the network device, and the shared memory area is used to indicate a logic memory formed by memory resources of the network device and/or multiple other network devices.
In one possible implementation, the type of the target data includes non-shared data indicating data that does not need to be accessed by other network devices,
a confirming unit 303, configured to confirm that the target data is the non-shared data;
the storage unit 302 is configured to store the non-shared data in the local memory area according to the type of the non-shared data.
In one possible implementation, the type of the non-shared data includes cold data, which indicates data that the network device accesses with low frequency,
the confirming unit 303 is configured to confirm that the non-shared data is the cold data;
the storage unit 302 is configured to store the cold data in a disk of the local memory area.
In one possible implementation, the type of the non-shared data includes hot data, which indicates data that the network device accesses with high frequency,
the confirming unit 303 is configured to confirm that the non-shared data is the hot data;
the storage unit 302 is configured to store the hot data in a cache of the local memory area.
In one possible implementation, the type of the target data includes shared data, the shared data is used for indicating data needing to be accessed by other network devices,
a confirming unit 303, configured to confirm that the target data is the shared data;
the storage unit 302 is configured to store the shared data in the shared memory area according to the type of the shared data.
In one possible implementation, the type of shared data includes persistent data,
the confirming unit 303 is configured to confirm that the shared data is the persistent data;
the storage unit 302 is configured to store the persistent data in a nonvolatile medium of the shared memory area.
In one possible implementation, the type of shared data includes non-persistent data,
the confirming unit 303, configured to confirm that the shared data is the non-persistent data;
the storage unit 302 is configured to store the non-persistent data in a volatile medium of the shared memory area.
In one possible implementation, the target memory pool includes a northbound interface, the northbound interface includes a first level interface, a second level interface, and a third level interface, the first level interface includes a memory semantic interface, the second level interface includes a distributed data structure interface, and the third level interface includes an application semantic interface, a file semantic interface, and/or a programming model interface.
In one possible implementation, the Memory semantic interface includes Big Memory, persistent multi-copy plog, and/or Memory object Memory.
In a possible implementation manner, the distributed data structure interface includes a key value KV, a Hash, a B + Tree, an Array, and/or a matrix.
In one possible implementation, the application semantic interface includes Shuffle, Cache, WAL, a message queue (MQ), and/or a parameter server (AI Parameter Servers); the file semantic interface includes Posix, HDFS, MPI-IO, and/or Java File; and the programming semantic interface includes UPC, UPC++, OpenSHMEM, and/or X10.
It should be noted that, for the information interaction, the execution process, and other contents between the modules of the network device 30, the execution steps are consistent with the details of the above method steps since the method examples are based on the same concept, and reference may be made to the description in the above method examples.
The above examples provide different embodiments of the network device 30, and a network device 40 is provided below, as shown in fig. 8, the network device 40 is configured to execute steps executed by the network device (computing node) in the above examples, and the executing steps and corresponding beneficial effects are specifically understood with reference to the above corresponding examples, and are not described herein again.
Referring to fig. 8, a schematic structural diagram of a network device 40 is provided for this application. The network device 40 includes: a processor 402, a communication interface 403, and a memory 401. Optionally, a bus 404 may be included, through which the communication interface 403, the processor 402, and the memory 401 may be connected to each other. The bus 404 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 8, but this does not mean that there is only one bus or one type of bus. The network device 40 may implement the functionality of the network device 30 in the example shown in fig. 7. The processor 402 and the communication interface 403 may perform the operations corresponding to the network device in the above method examples.
The following describes each component of the network device in detail with reference to fig. 8:
the memory 401 may be a volatile memory (volatile memory), such as a random-access memory (RAM); or a non-volatile memory (non-volatile memory), such as a read-only memory (ROM), a flash memory (flash memory), a Hard Disk Drive (HDD) or a solid-state drive (SSD); or a combination of the above types of memories, for storing program code, configuration files, or other content that may implement the methods of the present application.
The processor 402 is the control center of the network device 40, and may be a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the examples provided in this application, such as one or more Digital Signal Processors (DSPs) or one or more Field Programmable Gate Arrays (FPGAs).
The communication interface 403 is used for communication with other devices.
The processor 402 may perform the operations performed by the network device 30 in the example shown in fig. 7, which are not described herein again.
It should be noted that, for the information interaction, execution process, and other contents between the modules of the network device 40, since the method examples are based on the same concept, the execution steps are consistent with the details of the above method steps, and reference may be made to the description in the above method examples.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the present application, which are essential or part of the technical solutions contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.

Claims (29)

1. A method of storing data, comprising:
a network device acquires target data, where a target application runs on the network device and the target data is used for indicating data that the target application needs to store;
the network device stores the target data in a target memory pool according to the type of the target data, wherein the target memory pool comprises a local memory area and a shared memory area, the local memory area is used for indicating local memory resources of the network device, and the shared memory area is used for indicating a logic memory formed by the memory resources of the network device and/or a plurality of other network devices.
2. The data storage method according to claim 1, wherein the type of the target data includes non-shared data, the non-shared data is used to indicate data that does not need to be accessed by other network devices, and the network device stores the target data in a target memory pool according to the type of the target data, including:
the network equipment confirms that the target data is the non-shared data;
and the network equipment stores the non-shared data in the local memory area according to the type of the non-shared data.
3. The data storage method according to claim 2, wherein the type of the non-shared data includes cold data, the cold data is used to indicate data that the network device accesses with low frequency, and the network device stores the non-shared data in the local memory area according to the type of the non-shared data, including:
the network equipment confirms that the non-shared data is the cold data;
and the network equipment stores the cold data into a disk of the local memory area.
4. The data storage method according to claim 2, wherein the type of the non-shared data includes hot data, the hot data is used to indicate data that the network device accesses with high frequency, and the network device stores the non-shared data in the local memory area according to the type of the non-shared data, including:
the network equipment confirms that the non-shared data is the hot data;
and the network equipment stores the hot data into a cache of the local memory area.
5. The data storage method according to claim 1, wherein the type of the target data includes shared data, the shared data is used to indicate data that needs to be accessed by other network devices, and the network device stores the target data in a target memory pool according to the type of the target data, including:
the network equipment confirms that the target data is the shared data;
and the network equipment stores the shared data in the shared memory area according to the type of the shared data.
6. The data storage method according to claim 5, wherein the type of the shared data includes persistent data, and the network device stores the shared data in the shared memory area according to the type of the shared data, including:
the network equipment confirms that the shared data is the persistent data;
the network device stores the persistent data in a non-volatile medium of the shared memory area.
7. The data storage method of claim 5, wherein the type of the shared data comprises non-persistent data, and wherein the network device stores the shared data in the shared memory area according to the type of the shared data, comprising:
the network equipment confirms that the shared data is the non-persistent data;
the network device stores the non-persistent data in a volatile medium of the shared memory region.
8. The data storage method according to any one of claims 1 to 7, wherein the target memory pool comprises a northbound interface, the northbound interface comprising a primary interface, a secondary interface and a tertiary interface, the primary interface comprising a memory semantic interface, the secondary interface comprising a distributed data structure interface, the tertiary interface comprising an application semantic interface, a file semantic interface and/or a programming model interface.
9. The data storage method according to claim 8, wherein the target application schedules the tertiary interface by selecting a specific interface object in the tertiary interface, the tertiary interface schedules the secondary interface by selecting a specific interface object in the secondary interface, the secondary interface schedules the primary interface by selecting a specific interface object in the primary interface, and the primary interface is configured to schedule the target memory pool.
10. The data storage method according to claim 9, wherein the Memory semantic interface comprises Big Memory, persistent multi-copy plog and/or Memory object Memory.
11. The data storage method according to any one of claims 8 to 10, wherein the distributed data structure interface comprises a key value KV, a Hash, a B + Tree, an Array, and/or a matrix.
12. The data storage method according to any one of claims 8 to 11,
the application semantic interface comprises a Shuffle, a Cache, a WAL, a message queue MQ and/or a Parameter server AI Parameter Servers;
the File semantic interface comprises Posix, HDFS, MPI-IO and/or Java files;
the programming semantic interfaces include UPC, UPC + +, openSHMEM, and/or X10.
13. A network device, comprising:
the network device comprises an acquisition unit, a storage unit and a processing unit, wherein the acquisition unit is used for acquiring target data, a target application runs on the network device, and the target data is used for indicating data which needs to be stored by the target application;
a storage unit, configured to store the target data in a target memory pool according to a type of the target data, where the target memory pool includes a local memory area and a shared memory area, the local memory area is used to indicate a memory resource local to the network device, and the shared memory area is used to indicate a logic memory formed by memory resources of the network device and/or multiple other network devices.
14. The network device of claim 13, wherein the type of the target data comprises non-shared data indicating data that does not need to be accessed by other network devices,
a confirming unit configured to confirm that the target data is the non-shared data;
the storage unit is configured to store the non-shared data in the local memory area according to the type of the non-shared data.
15. The network device of claim 14, wherein the type of the non-shared data comprises cold data, which indicates data that the network device accesses with low frequency,
the confirming unit is used for confirming that the non-shared data is the cold data;
the storage unit is configured to store the cold data in a disk of the local memory area.
16. The network device of claim 14, wherein the type of the non-shared data comprises hot data, which indicates data that the network device accesses with high frequency,
the confirming unit is used for confirming that the non-shared data is the hot data;
the storage unit is configured to store the hot data in a cache of the local memory area.
17. The network device of claim 13, wherein the type of the target data comprises shared data, the shared data indicating data that needs to be accessed by other network devices, and the network device further comprises:
a confirming unit, configured to confirm that the target data is the shared data; wherein
the storage unit is configured to store the shared data in the shared memory area according to the type of the shared data.
18. The network device of claim 17, wherein the type of the shared data comprises persistent data,
the confirming unit is configured to confirm that the shared data is the persistent data; and
the storage unit is configured to store the persistent data in a nonvolatile medium of the shared memory area.
19. The network device of claim 17, wherein the type of the shared data comprises non-persistent data,
the confirming unit is configured to confirm that the shared data is the non-persistent data; and
the storage unit is configured to store the non-persistent data in a volatile medium of the shared memory area.
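Within the shared memory area, claims 18 and 19 select the medium by persistence. A minimal sketch of that selection (function name is illustrative):

```python
def place_shared(persistent: bool) -> str:
    # Persistent shared data -> nonvolatile medium (claim 18);
    # non-persistent shared data -> volatile medium (claim 19).
    return "nonvolatile" if persistent else "volatile"

assert place_shared(True) == "nonvolatile"
assert place_shared(False) == "volatile"
```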
20. The network device of any of claims 13 to 19, wherein the target memory pool comprises a northbound interface, the northbound interface comprising a primary interface comprising a memory semantic interface, a secondary interface comprising a distributed data structure interface, and a tertiary interface comprising an application semantic interface, a file semantic interface, and/or a programming model interface.
21. The network device of claim 20, wherein the target application schedules the tertiary interface by selecting a specific interface object in the tertiary interface, wherein the tertiary interface schedules the secondary interface by selecting a specific interface object in the secondary interface, wherein the secondary interface schedules the primary interface by selecting a specific interface object in the primary interface, and wherein the primary interface is configured to schedule the target memory pool.
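The layered scheduling of claim 21 is a chain of delegation: the application calls the tertiary interface, each level selects an interface object at the level below, and only the primary (memory semantic) interface touches the target memory pool. A hypothetical sketch, with class names and the operation tags invented for illustration:

```python
class PrimaryInterface:
    """Memory semantic interface (claim 20); the only level that
    schedules the target memory pool directly."""
    def schedule(self, op: str) -> str:
        return f"pool<-{op}"

class SecondaryInterface:
    """Distributed data structure interface; delegates to a selected
    primary interface object."""
    def __init__(self):
        self.primary = PrimaryInterface()
    def schedule(self, op: str) -> str:
        return self.primary.schedule(f"kv:{op}")

class TertiaryInterface:
    """Application/file semantic interface; delegates to a selected
    secondary interface object."""
    def __init__(self):
        self.secondary = SecondaryInterface()
    def schedule(self, op: str) -> str:
        return self.secondary.schedule(f"app:{op}")

# A target application only ever calls the tertiary interface.
assert TertiaryInterface().schedule("put") == "pool<-kv:app:put"
```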
22. The network device of claim 21, wherein the memory semantic interface comprises Big Memory, persistent multi-copy Plog, and/or Memory object.
23. The network device of any one of claims 20 to 22, wherein the distributed data structure interface comprises a key-value (KV) store, a Hash, a B+ Tree, an Array, and/or a Matrix.
24. The network device of any one of claims 20 to 23, wherein:
the application semantic interface comprises a Shuffle, a Cache, a write-ahead log (WAL) system, a message queue (MQ), and/or AI Parameter Servers;
the file semantic interface comprises the portable operating system interface (POSIX), the Hadoop distributed file system (HDFS), the message passing interface I/O (MPI-IO), and/or Java File;
the programming model interface comprises UPC, UPC++, OpenSHMEM, and/or X10.
25. A network device, comprising:
a processor, a memory, and a communication interface;
the processor is connected to the memory and the communication interface;
the communication interface is configured to communicate with other devices; and
the processor is configured to read instructions stored in the memory, to cause the network device to perform the method of any one of claims 1 to 12.
26. A computer storage medium having stored therein instructions that, when executed on a computer, cause the computer to perform the method of any one of claims 1 to 12.
27. A communication system, comprising at least one network device, wherein
the network device is configured to perform the data storage method according to any one of claims 1 to 12.
28. A converged memory system, comprising a plurality of network devices, wherein the network devices store data using a target memory pool; and
the target memory pool comprises a local memory area and a shared memory area, the local memory area comprises a disk and a cache, and the shared memory area comprises a volatile medium and a nonvolatile medium.
29. A converged memory system, comprising a first network device and a plurality of second network devices, wherein the first network device stores data using a target memory pool; and
the target memory pool comprises a local memory area and a shared memory area, the local memory area indicates a local memory of the first network device and comprises a disk and a cache, and the shared memory area indicates a logical memory constructed by pooling memory resources of the plurality of second network devices and comprises a volatile medium and a nonvolatile medium.
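The pooling step of claim 29 amounts to aggregating the memory contributed by each second network device into one logical shared capacity visible to the first device. A trivial illustrative sketch (capacities and units are assumed, not claimed):

```python
def build_shared_area(second_device_capacities: list[int]) -> int:
    """Pool the memory contributed by each second network device into a
    single logical shared-memory capacity (claim 29)."""
    return sum(second_device_capacities)

# Three second devices contributing 4, 8, and 16 GiB of memory yield a
# 28 GiB logical shared memory area for the first device.
assert build_shared_area([4, 8, 16]) == 28
```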
CN202110821703.0A 2021-06-25 2021-07-20 Data storage method and related equipment Pending CN115525443A (en)

Applications Claiming Priority (2)

Application Number  Priority Date  Filing Date  Title
CN202110713213     2021-06-25
CN2021107132139    2021-06-25

Publications (1)

Publication Number  Publication Date
CN115525443A        2022-12-27

Family ID: 84693752

Family Applications (1)

Application Number  Title                                      Priority Date  Filing Date
CN202110821703.0A   Data storage method and related equipment  2021-06-25     2021-07-20

Country Status (1)

Country  Link
CN (1)   CN115525443A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination