CN110535811B - Remote memory management method and system, server, client and storage medium - Google Patents

Remote memory management method and system, server, client and storage medium

Info

Publication number
CN110535811B
Authority
CN
China
Prior art keywords
client
message
message block
server
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810515949.3A
Other languages
Chinese (zh)
Other versions
CN110535811A (en)
Inventor
Tang Xiaolan (唐小岚)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
ZTE Corp
Original Assignee
Tsinghua University
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, ZTE Corp filed Critical Tsinghua University
Priority to CN201810515949.3A
Publication of CN110535811A
Application granted
Publication of CN110535811B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/133 Protocols for remote procedure calls [RPC]
    • H04L 67/50 Network services
    • H04L 67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources

Abstract

The invention discloses a remote memory management method and system, a server, a client and a computer-readable storage medium. The method includes: applying for a block of memory as a message area and registering the message area with a network card; receiving a message block application sent by a client through a Remote Direct Memory Access (RDMA) atomic operation, allocating a message block to the client and specifying an exclusive time for the message block; receiving a remote request sent by the client through an RDMA write operation and storing the information of the remote request in the message block; and processing and responding to the remote request of the client. Because the server applies for memory as a message area, registers the message area with the network card, and receives and processes the message block applications and remote requests that clients send through RDMA operations, memory occupation during cross-node message transmission can be effectively reduced and remote request throughput can be improved.

Description

Remote memory management method and system, server, client and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a remote memory management method and system, a server, a client, and a computer-readable storage medium.
Background
The efficiency of RPC (Remote Procedure Call) in a distributed environment determines the I/O (Input/Output) performance of the whole system. A conventional distributed system performs data transmission by remote procedure calls based on TCP/IP (Transmission Control Protocol/Internet Protocol), so the memory management of message transmission is folded into the network protocol stack and is not considered in system design.
In recent years, in order to improve the computational efficiency of distributed memory, RDMA (Remote Direct Memory Access) technology has been widely researched and applied in storage systems to accelerate upper-layer computing performance. RDMA transmission semantics differ from those of traditional networking, and memory can be accessed remotely and directly, so remote memory management becomes particularly important when designing RDMA-based remote procedure calls.
At present, there is no memory management module designed for RDMA that can effectively reduce memory occupation during cross-node message transmission. Meanwhile, for parallel processing, the prior art still adopts a simple batch-processing model and cannot fully exploit the high-throughput characteristic of the network card.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a remote memory management method and system, a server, a client, and a computer-readable storage medium, so as to solve the problem of how to effectively reduce memory usage during a cross-node message transmission process and improve remote request throughput.
The technical scheme adopted by the embodiment of the invention for solving the technical problems is as follows:
according to an aspect of an embodiment of the present invention, there is provided a remote memory management method, where the method is used for a server, and the method includes:
applying for a memory as a message area, and registering the message area to a network card;
receiving a message block application sent by a client through RDMA atomic operation, distributing a message block for the client and assigning exclusive time of the message block;
receiving a remote request sent by the client through RDMA write operation, and storing the information of the remote request into the message block; and processing and responding to the remote request of the client.
According to another aspect of the embodiments of the present invention, there is provided a server, where the server includes: the memory management system comprises a memory, a processor and a remote memory management program which is stored on the memory and can run on the processor, wherein the remote memory management program realizes the steps of the remote memory management method when being executed by the processor.
According to another aspect of the embodiments of the present invention, a computer-readable storage medium is provided, in which a remote memory management program is stored, and when executed by a processor, the remote memory management program implements the steps of the remote memory management method described above.
According to another aspect of the embodiments of the present invention, there is provided a remote memory management method, where the method is used for a client, and the method includes:
sending a message block application to a server through RDMA atomic operation; the server applies for a memory as a message area and registers the message area with the network card, receives the message block application sent by the client through the RDMA atomic operation, distributes a message block for the client and assigns the exclusive time of the message block;
after the message block application is successful, sending a remote request to the server through RDMA write operation and waiting for the server's response; the server receives the remote request sent by the client through the RDMA write operation, stores the information of the remote request into the message block, and processes and responds to the remote request of the client.
According to another aspect of the embodiments of the present invention, there is provided a client, including: the memory management system comprises a memory, a processor and a remote memory management program which is stored on the memory and can run on the processor, wherein the remote memory management program realizes the steps of the remote memory management method when being executed by the processor.
According to another aspect of the embodiments of the present invention, a computer-readable storage medium is provided, in which a remote memory management program is stored, and when executed by a processor, the remote memory management program implements the steps of the remote memory management method described above.
According to another aspect of the embodiments of the present invention, a remote memory management system is provided, where the system includes the server and the client.
According to the remote memory management method and system, the server, the client and the computer readable storage medium, the server applies for the memory as a message area and registers the message area to the network card, and receives and processes a message block application and a remote request sent by the client through RDMA operation; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Drawings
FIG. 1 is a diagram illustrating an RDMA data transfer architecture according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a remote memory management method according to a first embodiment of the present invention;
FIG. 3 is a diagram illustrating a server-side structure according to a second embodiment of the present invention;
fig. 4 is a flowchart illustrating a remote memory management method according to a fourth embodiment of the present invention;
FIG. 5 is a schematic diagram of a client according to a fifth embodiment of the present invention;
fig. 6 is a schematic structural diagram of a remote memory management system according to a seventh embodiment of the present invention;
FIG. 7 is a schematic diagram of the construction of the working zone and the preheating zone of an embodiment of the present invention;
fig. 8 is a diagram illustrating an allocation field structure according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the advantageous effects of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
For a better understanding of the present embodiment, before setting forth the present embodiment, the following terms of the present embodiment are explained:
the DMA (Direct Memory Access) allows some hardware devices to independently and directly read and write the Memory without the participation of a large amount of CPU (Central Processing Unit).
RDMA is a new network communication technology that can directly access remote memory without the direct participation of the operating systems of either party, achieving high throughput and low latency. RDMA achieves zero-copy data transmission by letting the network adapter transfer data directly into the peer's memory, thereby eliminating the direct participation of the CPU and cache and reducing redundant context switches. Currently, the network protocol stacks supporting RDMA technology include InfiniBand, RoCE (RDMA over Converged Ethernet) and iWARP; the first two are supported in hardware by Mellanox, while the latter two are fully compatible with Ethernet because they use the data link layer of ordinary Ethernet.
Fig. 1 shows the specific flow of RDMA communication: first, the local CPU (node 1) issues a communication command to the network card by means of MMIO (memory-mapped I/O); after the local network card detects the new command, it reads the data to be transmitted from memory by DMA, packs the data and transmits it over the RDMA network. After the peer's network card (node 2) receives the data, it writes the data directly into the corresponding memory address area by DMA and writes the corresponding completion information into the completion queue. The peer's CPU is not involved at any point, the kernels of both sides are bypassed, and zero-copy data transmission is achieved.
Before establishing communication, the two communicating parties need to go through the following steps. Open the network card device. Create a protection domain; the protection domain is bound to objects created later to ensure the safety of data transmission, and any cross-domain operation causes a communication error. Register memory: the memory used for communication is registered at this stage; specifically, a mapping table between the user-space addresses and the memory addresses of the segment is established and cached in the network card, and a key pair (lkey and rkey) is generated for the memory segment, which the network card must carry for identity confirmation when accessing the memory locally or remotely. Create a CQ (Completion Queue): after a message is successfully sent by the sender or received by the receiver, the corresponding completion information is placed into the completion queue, and the user repeatedly polls the completion queue to verify whether the message has been fully sent. Create a QP (Queue Pair): a QP consists of a Send Queue and a Receive Queue; the sender puts the message to be sent into the Send Queue, the receiver puts a receive request into the Receive Queue, and the two parties communicate over the network in this way. Initialize the QP state: after the two parties create one-to-one corresponding QPs, a series of handshake state transitions is performed until the communication link is successfully established. A QP may establish different connection types, including RC (Reliable Connection), UC (Unreliable Connection) and UD (Unreliable Datagram). In RC mode, a QP can only perform one-to-one reliable transmission, and corresponding acknowledgement feedback is generated after a data packet is successfully sent; in UC mode, a QP performs one-to-one transmission without acknowledgement feedback; UD mode has no one-to-one restriction and no acknowledgement feedback. The three transmission modes have different characteristics and support the communication primitives to different degrees.
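As an illustrative aid only, the setup sequence above can be sketched with the libibverbs API roughly as follows; the device choice, queue depths and RC connection type are assumptions for the example, and the QP handshake (INIT, RTR, RTS) and the out-of-band exchange of addresses and keys are omitted.

#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal sketch of the RDMA setup steps described above (open device,
 * protection domain, memory registration, completion queue, queue pair).
 * Error handling is abbreviated and the QP state transitions are omitted. */
int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) { fprintf(stderr, "no RDMA device\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);   /* open the NIC */
    struct ibv_pd *pd = ibv_alloc_pd(ctx);                /* protection domain */

    /* Register a communication buffer; lkey/rkey are generated here and the
     * rkey must be given to the peer so it can access this memory remotely. */
    size_t len = 4 * 1024 * 1024;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE |
                                   IBV_ACCESS_REMOTE_ATOMIC);

    struct ibv_cq *cq = ibv_create_cq(ctx, 128, NULL, NULL, 0); /* completion queue */

    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .cap     = { .max_send_wr = 64, .max_recv_wr = 64,
                     .max_send_sge = 1, .max_recv_sge = 1 },
        .qp_type = IBV_QPT_RC,          /* reliable connection (RC) mode */
    };
    struct ibv_qp *qp = ibv_create_qp(pd, &attr);

    printf("qp_num=%u lkey=%u rkey=%u\n", qp->qp_num, mr->lkey, mr->rkey);

    /* ... exchange qp_num/rkey/address with the peer and drive the QP
     * through INIT -> RTR -> RTS before communicating ... */

    ibv_destroy_qp(qp); ibv_destroy_cq(cq); ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd); ibv_close_device(ctx); ibv_free_device_list(devs);
    free(buf);
    return 0;
}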
Memory computing refers to a new processing mode in which, facing the demands of massive data and highly real-time processing, data is moved into memory for real-time processing, because a conventional storage system that uses magnetic disks as the storage medium has difficulty coping with these new challenges due to its slow access speed. Memory storage systems mainly fall into two categories: in-memory database systems and in-memory file systems. Currently, the mainstream in-memory file systems include Alluxio, IGFS, and the like. Alluxio is mainly used to address the existing problems of the Spark computing framework, accelerating data-processing performance and using lineage to achieve single-copy storage and reliable recovery of data. IGFS is a cache file system sitting between the computing framework and HDFS (Hadoop Distributed File System); it provides an HDFS-compatible interface to the upper layer but, unlike HDFS, has no separate metadata server and distributes data by hashing.
RPC is a remote communication protocol that enables a program running on one computer to remotely call a function on another computer without the user having to be concerned with the underlying communication and interaction strategy. Remote procedure calls are widely used in the field of distributed systems and adopt a client-server model; the call is always initiated by the client, which packs information such as the serial number of the called function and the call parameters and sends it to the server; the server then receives and executes the request and, after execution finishes, returns the execution result to the client.
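Purely as an illustration of the information such a call packs together, a request header might look like the sketch below; the field names and widths are hypothetical and not part of any protocol described here.

#include <stdint.h>

/* Hypothetical RPC request header: the client packs the function serial
 * number and the serialized parameters and sends them to the server, which
 * executes the call and returns the result. */
struct rpc_request {
    uint32_t func_id;    /* serial number of the called function */
    uint32_t arg_len;    /* length of the serialized parameters in bytes */
    uint8_t  args[];     /* serialized call parameters (flexible array member) */
};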
First embodiment
As shown in fig. 2, a first embodiment of the present invention provides a remote memory management method, where the method is used for a server, and the method includes:
step S11: and applying for a memory as a message area and registering the message area to the network card.
In this embodiment, a block of memory is applied for at the server as a message area and registered with the network card to support direct remote reading and writing by the client, so that the client can access it directly and remotely through RDMA primitives and message transmission between nodes is realized. The message area is used for storing the information of the clients' remote requests.
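Continuing the sketch above, step S11 might look like the following; the block size, block count and structure names are illustrative assumptions, not values fixed by the embodiment.

#include <infiniband/verbs.h>
#include <stdint.h>
#include <stdlib.h>

#define MSG_BLOCK_SIZE  4096   /* assumed fixed message block size */
#define MSG_BLOCK_COUNT 64     /* assumed number of blocks in the area */

/* One contiguous message area made of independent, fixed-size message blocks. */
struct message_area {
    uint8_t blocks[MSG_BLOCK_COUNT][MSG_BLOCK_SIZE];
};

/* Step S11: apply for memory as the message area and register it with the
 * network card. REMOTE_WRITE lets clients RDMA-write their requests into the
 * blocks; REMOTE_ATOMIC lets them apply for blocks with FETCH_AND_ADD. */
struct ibv_mr *register_message_area(struct ibv_pd *pd,
                                     struct message_area **out)
{
    struct message_area *area = calloc(1, sizeof(*area));
    if (!area)
        return NULL;
    *out = area;
    return ibv_reg_mr(pd, area, sizeof(*area),
                      IBV_ACCESS_LOCAL_WRITE |
                      IBV_ACCESS_REMOTE_WRITE |
                      IBV_ACCESS_REMOTE_ATOMIC);
}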
Step S12: receiving a message block application sent by a client through Remote Direct Memory Access (RDMA) atomic operation, distributing a message block for the client and assigning the exclusive time of the message block.
In this embodiment, the message area consists of multiple independent message blocks. Before sending a remote request, a client leases a message block from the server through an RDMA atomic operation and occupies it exclusively for a system-specified time.
It should be noted that the server can detect multiple requests from the same client at the same time, perform batch processing and transmission, and improve response throughput.
Step S13: receiving a remote request sent by the client through RDMA write operation, and storing the information of the remote request into the message block; and processing and responding to the remote request of the client.
In this embodiment, after a message block application succeeds, the client sends a remote request to the server through an RDMA write operation and waits for the server's response. After receiving the client's remote request, the server stores the information of the remote request into the message block. Meanwhile, the server starts a service thread that polls and scans the message area and performs the corresponding processing once it detects the information of a remote request.
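The embodiment does not specify how the service thread detects a newly written request; one possible approach, shown here purely as an assumed sketch, is a per-block flag that the client is assumed to write last, so the polling thread only sees complete requests. The block layout and the handler callback are additions for illustration.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MSG_BLOCK_COUNT 64
#define MSG_PAYLOAD_MAX 4088   /* 4096-byte block minus the 8-byte header */

/* Assumed on-wire layout of one message block: the client RDMA-writes the
 * payload first and the 'valid' flag last, so the polling thread sees a
 * complete request once 'valid' becomes non-zero. */
struct message_block {
    _Atomic uint32_t valid;            /* 0 = empty, 1 = request present */
    uint32_t         length;           /* payload length in bytes */
    uint8_t          payload[MSG_PAYLOAD_MAX];
};

/* Service thread: poll-scan every block of the zone it is responsible for. */
void service_thread_loop(struct message_block *zone, volatile bool *running,
                         void (*handle_request)(struct message_block *))
{
    while (*running) {
        for (int i = 0; i < MSG_BLOCK_COUNT; i++) {
            if (atomic_load_explicit(&zone[i].valid, memory_order_acquire)) {
                handle_request(&zone[i]);               /* process + respond */
                atomic_store_explicit(&zone[i].valid, 0,
                                      memory_order_release);  /* mark empty */
            }
        }
    }
}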
In one embodiment, the method further comprises:
dividing the message area into a working area and a preheating area;
and processing the message block application and the remote request sent by the client side in parallel through the working area and the preheating area.
In this embodiment, the message area of the server is divided into a working area and a preheating area, which can be used to speed up the server's message processing. Specifically, while the server's service thread processes new requests in the working area, the preheating area is open for applications, so a client can apply for a message block in the preheating area in advance and fill new requests into it in advance.
In this embodiment, the server may open multiple service threads that monitor different areas of the message region for parallel processing.
In this embodiment, after processing the message block application and the remote request sent by the client in parallel through the working area and the preheating area, the method further includes:
acquiring switching signals of the working area and the preheating area;
and switching the working area into a preheating area and switching the preheating area into the working area according to the switching signal.
Specifically, when the service thread obtains the switching signal, it first finishes processing the remaining messages of the working area, then switches the preheating area into the working area, and at the same time switches the previous working area into the preheating area, which then accepts applications from clients.
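Under the same assumed layout as the polling sketch above, the switch might be expressed as follows; the zone_pair structure is illustrative, and the expiration notifications to clients of the old working area (described further below) are omitted for brevity.

/* Illustrative zone-switch logic for one service thread, reusing the
 * message_block layout and MSG_BLOCK_COUNT assumed in the earlier sketch. */
struct zone_pair {
    struct message_block *working;     /* zone currently being served */
    struct message_block *preheating;  /* zone open for new applications */
};

void switch_zones(struct zone_pair *z,
                  void (*handle_request)(struct message_block *))
{
    /* 1. Finish processing the remaining messages of the working area. */
    for (int i = 0; i < MSG_BLOCK_COUNT; i++) {
        if (atomic_load_explicit(&z->working[i].valid, memory_order_acquire)) {
            handle_request(&z->working[i]);
            atomic_store_explicit(&z->working[i].valid, 0, memory_order_release);
        }
    }
    /* 2. The preheating area becomes the working area, and the previous
     *    working area becomes the preheating area and accepts applications. */
    struct message_block *tmp = z->working;
    z->working    = z->preheating;
    z->preheating = tmp;
}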
In one embodiment, after receiving a message block application sent by a client through Remote Direct Memory Access (RDMA) atomic operation, distributing a message block for the client and assigning the exclusive time of the message block, the method further includes:
if the number of the clients is larger than that of the message blocks, shortening the exclusive time of the message blocks;
and if the exclusive time of the message block is expired, feeding back the expiration information to the client so that the client clears the occupation state information and resends the message block application.
In the embodiment, through a management mechanism of 'application' and 'expiration', a plurality of clients are forced to be grouped according to a time sequence, and a service thread processes client requests of different groups in different time periods, so that a time sharing effect is achieved.
Specifically, when the number of clients is greater than the number of the server's message blocks, the server forcibly shortens the exclusive time of the message blocks applied for by the clients. In particular, when the number of clients is smaller than the number of message blocks, the clients' occupation time of the message blocks is set to infinity; in this case, all clients form a single group and the message area is sufficient, so no time-sharing processing and response is needed and each client can always occupy its corresponding message block exclusively. When a client's occupation of a message block comes to an end, the server attaches the expiration information to the message response, and the client needs to re-apply for a message block after parsing the expiration information.
When the service thread switches the working area and the preheating area, the clients contained in the working area generate an expiration event. Before the service thread performs the switch, it scans the whole working area and responds to the remaining requests, adding an 'expiration' field to the reply information of those remaining requests; for the other clients, a new 'expiration' request message is sent to notify them that their message block has expired. After receiving the expiration information, a client clears its local occupation state information and re-applies after waiting a fixed time. Based on this method, when the clients' request density is high, a higher proportion of remaining requests exists each time the service thread performs a message area switch, so the server can more easily notify clients that their message blocks are invalid by responding to the remaining requests, and the proportion of separately sent invalidation messages is relatively small, so the additional network overhead introduced is small. When the clients' request density is low, the extra notification requests do not affect overall performance because the network hardware utilization is not high at that time. In summary, this message release mechanism based on active feedback from the server can adapt to scenarios with different degrees of concurrency.
To better illustrate the working and preheating zones, the following description is made in conjunction with FIG. 7:
as shown in fig. 7, the message area of the server is divided into a working area and a preheating area, the server can start a plurality of service threads, and the service threads monitor different areas of the message area for parallel processing. The two-way arrow indicates that the working area and the preheating area can be switched, and the time line in the middle shows that the service thread processes different groups of client requests in different time periods, so that the time sharing effect is achieved.
In another embodiment, the applying for a memory as a message area and registering the message area to the network card further includes:
configuring, at the head of the message block, a distribution field used for judging the result of the client's message block application;
the value of the allocation field is updated periodically.
As an example, the structure of the allocation field may be as shown in fig. 8, where the Message Zone Index represents the index number of the message zone and the Client Number is the number of clients, that is, the number of applications for message blocks. Both the Message Zone Index and the Client Number are 32 bits long. The message block application process is roughly as follows:
the initial allocation field is (0, 0), wherein the first 0 represents that the first half of the message area is the preheating area, and the second 0 represents that the application number of the current message block is 0.
The client executes an RDMA FETCH_AND_ADD atomic operation, remotely performing an atomic addition on the allocation field and simultaneously obtaining the value (m, n) the field held before the addition. If n is smaller than the total number of message blocks, the message block application succeeds; otherwise it fails, and the same operation is executed again after waiting a fixed time. The application and allocation of message blocks are implemented with RDMA atomic operations in order to reduce the server's overhead for allocating and releasing message blocks. To keep the client's response time within a reasonable range, the exclusive time is generally set to less than one millisecond, so message block application and release operations occur frequently; if the application and release protocol were designed to be too complex, huge overhead would be introduced at the server and the overall performance of the system would drop. Because the application process is based on RDMA primitives, the server does not need to participate, which greatly reduces the CPU overhead of the application process.
In this embodiment, the server periodically updates the value of the allocation field for guiding the client to dynamically apply for the message block.
As an example, when the number of client groups does not change (taking two groups as an example), the server periodically switches the allocation field between (0, a) and (1, b), where 0 represents the first half of the message area, 1 represents the second half of the message area, a represents the current occupation of the first half of the message area, and b represents the occupation of the second half. When the client applies through an RDMA atomic operation, if the value obtained before the addition is (0, a1) and a1 < Bmax, where Bmax denotes the total number of message blocks, then (0, a1) indicates that the client has applied for a message block in the first half of the message area; otherwise the application fails, the client waits a fixed time, and the application flow is executed again.
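For illustration only, the client side of this application flow could be posted with the libibverbs API as sketched below. The parameter names, the busy-poll completion handling, and which 32-bit half of the allocation field holds the zone index are assumptions; the remote address of the allocation field, the rkey, and the total block count are assumed to have been exchanged during connection setup.

#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

/* Post an RDMA FETCH_AND_ADD of +1 on the server's 64-bit allocation field
 * and decode the value it held before the addition. Returns 0 and fills
 * (*zone, *num) on success, -1 on failure. */
int apply_for_message_block(struct ibv_qp *qp, struct ibv_cq *cq,
                            struct ibv_mr *local_mr,   /* 8-byte result buffer */
                            uint64_t alloc_field_raddr, uint32_t rkey,
                            uint32_t *zone, uint32_t *num)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)local_mr->addr,
        .length = sizeof(uint64_t),
        .lkey   = local_mr->lkey,
    };
    struct ibv_send_wr wr, *bad = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode     = IBV_WR_ATOMIC_FETCH_AND_ADD;
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.send_flags = IBV_SEND_SIGNALED;
    wr.wr.atomic.remote_addr = alloc_field_raddr;  /* allocation field address */
    wr.wr.atomic.rkey        = rkey;
    wr.wr.atomic.compare_add = 1;                  /* value to add */

    if (ibv_post_send(qp, &wr, &bad))
        return -1;

    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)           /* busy-poll completion */
        ;
    if (wc.status != IBV_WC_SUCCESS)
        return -1;

    /* Old value (m, n): high 32 bits = zone index, low 32 bits = client
     * number; which half is which is an assumption, the patent only fixes
     * that each part is 32 bits wide. */
    uint64_t old = *(uint64_t *)local_mr->addr;
    *zone = (uint32_t)(old >> 32);
    *num  = (uint32_t)(old & 0xffffffffu);
    return 0;
}

/* The caller treats the application as successful if *num is smaller than the
 * total number of message blocks; otherwise it waits a fixed time and retries. */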
According to the remote memory management method, a server applies for a memory as a message area and registers the message area to a network card, and receives and processes a message block application and a remote request sent by a client through RDMA operation; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Second embodiment
As shown in fig. 3, a second embodiment of the present invention provides a server, where the server includes: a memory 21, a processor 22, and a remote memory management program stored on the memory 21 and executable on the processor 22, wherein the remote memory management program, when executed by the processor 22, is configured to implement the following steps of the remote memory management method:
applying for a memory as a message area, and registering the message area to a network card;
receiving a message block application sent by a client through Remote Direct Memory Access (RDMA) atomic operation, distributing a message block for the client and assigning the exclusive time of the message block;
receiving a remote request sent by the client through RDMA write operation, and storing the information of the remote request into the message block; and processing and responding to the remote request of the client.
When executed by the processor 22, the remote memory management program is further configured to implement the following steps of the remote memory management method:
dividing the message area into a working area and a preheating area;
and processing the message block application and the remote request sent by the client side in parallel through the working area and the preheating area.
When executed by the processor 22, the remote memory management program is further configured to implement the following steps of the remote memory management method:
acquiring switching signals of the working area and the preheating area;
and switching the working area into a preheating area and switching the preheating area into the working area according to the switching signal.
When executed by the processor 22, the remote memory management program is further configured to implement the following steps of the remote memory management method:
if the number of the clients is larger than that of the message blocks, shortening the exclusive time of the message blocks;
and if the exclusive time of the message block is expired, feeding back the expiration information to the client so that the client clears the occupation state information and resends the message block application.
When executed by the processor 22, the remote memory management program is further configured to implement the following steps of the remote memory management method:
configuring a distribution field for judging the message block application result of the client at the head of the message block;
the value of the allocation field is updated periodically.
The server side of the embodiment of the invention receives and processes the message block application and the remote request sent by the client side through RDMA operation by applying the memory as the message area at the server side and registering the message area to the network card; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Third embodiment
A third embodiment of the present invention provides a computer-readable storage medium, where a remote memory management program is stored on the computer-readable storage medium, and the remote memory management program is used to implement the steps of the remote memory management method according to the first embodiment when executed by a processor.
The computer-readable storage medium of the embodiment of the invention receives and processes the message block application and the remote request sent by the client through RDMA operation by applying the memory as the message area at the server and registering the message area to the network card; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Fourth embodiment
As shown in fig. 4, a fourth embodiment of the present invention provides a remote memory management method, where the method is applied to a client, and the method includes:
step S31: sending a message block application to a server through RDMA atomic operation; the server side applies a memory as a message area and registers the message area to the network card; receiving a message block application sent by the client through RDMA atomic operation, distributing a message block for the client and assigning exclusive time of the message block.
In this embodiment, a block of memory is applied for at the server as a message area and registered with the network card to support direct remote reading and writing by the client, so that the client can access it directly and remotely through RDMA primitives and message transmission between nodes is realized. The message area is used for storing the information of the clients' remote requests.
In this embodiment, the message area consists of multiple independent message blocks. Before sending a remote request, a client leases a message block from the server through an RDMA atomic operation and occupies it exclusively for a system-specified time.
It should be noted that the client may send multiple requests simultaneously, and the server may detect multiple requests from the same client simultaneously, perform batch processing and transmission, and improve response throughput.
Step S32: after the message block application is successful, sending a remote request to the server through RDMA write operation and waiting for the server's response; the server receives the remote request sent by the client through the RDMA write operation, stores the information of the remote request into the message block, and processes and responds to the remote request of the client.
In this embodiment, after a message block application succeeds, the client sends a remote request to the server through an RDMA write operation and waits for the server's response. After receiving the client's remote request, the server stores the information of the remote request into the message block. Meanwhile, the server starts a service thread that polls and scans the message area and performs the corresponding processing once it detects the information of a remote request.
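A corresponding client-side sketch of the RDMA write in step S32 is given below; writing the whole request into the leased block with a single RDMA write, the 4 KB block size, and the parameter names are simplifying assumptions for the example.

#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

#define MSG_BLOCK_SIZE 4096   /* assumed block size, matching the server sketch */

/* Write a request of 'len' bytes into message block 'block_idx' of the
 * server's message area with one RDMA write. msg_area_raddr and rkey are
 * assumed to come from connection setup. */
int send_remote_request(struct ibv_qp *qp, struct ibv_cq *cq,
                        struct ibv_mr *req_mr,      /* registered request buffer */
                        uint32_t len, uint32_t block_idx,
                        uint64_t msg_area_raddr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)req_mr->addr,
        .length = len,
        .lkey   = req_mr->lkey,
    };
    struct ibv_send_wr wr, *bad = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode     = IBV_WR_RDMA_WRITE;
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.send_flags = IBV_SEND_SIGNALED;
    wr.wr.rdma.remote_addr = msg_area_raddr + (uint64_t)block_idx * MSG_BLOCK_SIZE;
    wr.wr.rdma.rkey        = rkey;

    if (ibv_post_send(qp, &wr, &bad))
        return -1;

    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)   /* wait for the local completion; the
                                              server's response arrives later */
        ;
    return wc.status == IBV_WC_SUCCESS ? 0 : -1;
}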
In one embodiment, if the server divides the message area into a working area and a preheating area, a message block application is sent to the preheating area through RDMA atomic operation.
In this embodiment, the message area of the server is divided into a working area and a preheating area, which can be used to speed up the server's message processing. Specifically, while the server's service thread processes new requests in the working area, the preheating area is open for applications, so a client can apply for a message block in the preheating area in advance and fill new requests into it in advance.
In this embodiment, the server may open multiple service threads that monitor different areas of the message region for parallel processing.
In this embodiment, if the server switches the working area to the preheating area and switches the preheating area to the working area according to the switching signal, the server sends a message block application to the working area switched to the preheating area through RDMA atomic operation.
Specifically, when the service thread obtains the switching signal, it first finishes processing the remaining messages of the working area, then switches the preheating area into the working area, and at the same time switches the previous working area into the preheating area, which then accepts applications from clients.
In an embodiment, after the message block application is successful, if expiration information fed back by the server is received, the occupied state information is cleared and the message block application is sent to the server again.
In the embodiment, through a management mechanism of 'application' and 'expiration', a plurality of clients are forced to be grouped according to a time sequence, and a service thread processes client requests of different groups in different time periods, so that a time sharing effect is achieved.
Specifically, when the number of clients is greater than the number of the server's message blocks, the server forcibly shortens the exclusive time of the message blocks applied for by the clients. In particular, when the number of clients is smaller than the number of message blocks, the clients' occupation time of the message blocks is set to infinity; in this case, all clients form a single group and the message area is sufficient, so no time-sharing processing and response is needed and each client can always occupy its corresponding message block exclusively. When a client's occupation of a message block comes to an end, the server attaches the expiration information to the message response, and the client needs to re-apply for a message block after parsing the expiration information.
When the service thread switches the working area and the preheating area, the clients contained in the working area generate an expiration event. Before the service thread performs the switch, it scans the whole working area and responds to the remaining requests, adding an 'expiration' field to the reply information of those remaining requests; for the other clients, a new 'expiration' request message is sent to notify them that their message block has expired. After receiving the expiration information, a client clears its local occupation state information and re-applies after waiting a fixed time. Based on this method, when the clients' request density is high, a higher proportion of remaining requests exists each time the service thread performs a message area switch, so the server can more easily notify clients that their message blocks are invalid by responding to the remaining requests, and the proportion of separately sent invalidation messages is relatively small, so the additional network overhead introduced is small. When the clients' request density is low, the extra notification requests do not affect overall performance because the network hardware utilization is not high at that time. In summary, this message release mechanism based on active feedback from the server can adapt to scenarios with different degrees of concurrency.
To better illustrate the working and preheating zones, the following description is made in conjunction with FIG. 7:
as shown in fig. 7, the message area of the server is divided into a working area and a preheating area, the server can start a plurality of service threads, and the service threads monitor different areas of the message area for parallel processing. The two-way arrow indicates that the working area and the preheating area can be switched, and the time line in the middle shows that the service thread processes different groups of client requests in different time periods, so that the time sharing effect is achieved.
In another embodiment, the message block application is judged to be successful by:
obtaining the value (a, b) of the distribution field; if b is smaller than the total number of message blocks, the message block application is judged to be successful. The distribution field is configured at the head of the message block for the server to decide the result of the client's message block application, where a represents the index number of the message area and b represents the number of applications for message blocks.
As an example, the structure of the allocation field may be as shown in fig. 8, where the Message Zone Index represents the index number of the message zone and the Client Number is the number of clients, that is, the number of applications for message blocks. Both the Message Zone Index and the Client Number are 32 bits long.
According to the remote memory management method, a server applies for a memory as a message area and registers the message area to a network card, and receives and processes a message block application and a remote request sent by a client through RDMA operation; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Fifth embodiment
As shown in fig. 5, a fifth embodiment of the present invention provides a client, where the client includes: a memory 31, a processor 32, and a remote memory management program stored on the memory 31 and executable on the processor 32, wherein the remote memory management program, when executed by the processor 32, is configured to implement the following steps of the remote memory management method:
sending a message block application to a server through RDMA atomic operation; the server applies for a memory as a message area and registers the message area with the network card, receives the message block application sent by the client through the RDMA atomic operation, distributes a message block for the client and assigns the exclusive time of the message block;
after the message block application is successful, sending a remote request to the server through RDMA write operation and waiting for the server's response; the server receives the remote request sent by the client through the RDMA write operation, stores the information of the remote request into the message block, and processes and responds to the remote request of the client.
When executed by the processor 32, the remote memory management program is further configured to implement the following steps of the remote memory management method:
and if the server divides the message area into a working area and a preheating area, sending a message block application to the preheating area through RDMA atomic operation.
When executed by the processor 32, the remote memory management program is further configured to implement the following steps of the remote memory management method:
and if the server side switches the working area into the preheating area and switches the preheating area into the working area according to the switching signal, sending a message block application to the working area switched into the preheating area through RDMA atomic operation.
When executed by the processor 32, the remote memory management program is further configured to implement the following steps of the remote memory management method:
and after the message block application is successful, if the expiration information fed back by the server is received, emptying the occupation state information and sending the message block application to the server again.
When executed by the processor 32, the remote memory management program is further configured to implement the following steps of the remote memory management method:
obtaining the value (a, b) of the distribution field; if b is smaller than the total number of message blocks, the message block application is judged to be successful. The distribution field is configured at the head of the message block for the server to decide the result of the client's message block application, where a represents the index number of the message area and b represents the number of applications for message blocks.
The client side of the embodiment of the invention receives and processes the message block application and the remote request sent by the client side through RDMA operation by applying the memory as the message area at the server side and registering the message area to the network card; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Sixth embodiment
A sixth embodiment of the present invention provides a computer-readable storage medium, where a remote memory management program is stored on the computer-readable storage medium, and the remote memory management program is used to implement the steps of the remote memory management method according to the fourth embodiment when executed by a processor.
The computer-readable storage medium of the embodiment of the invention receives and processes the message block application and the remote request sent by the client through RDMA operation by applying the memory as the message area at the server and registering the message area to the network card; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
Seventh embodiment
As shown in fig. 6, a seventh embodiment of the present invention provides a remote memory management system, where the system includes a server 41 and a client 42, and the server 41 refers to the second embodiment, and the client 42 refers to the fifth embodiment, which are not described herein again.
The remote memory management system of the embodiment of the invention receives and processes the message block application and the remote request sent by the client through RDMA operation by applying the memory as the message area at the server and registering the message area to the network card; the memory occupation in the cross-node message transmission process can be effectively reduced, and the remote request throughput is improved.
It should be noted that the device embodiment and the method embodiment belong to the same concept, and specific implementation processes thereof are described in the method embodiment in detail, and technical features in the method embodiment are correspondingly applicable in the device embodiment, which is not described herein again.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better embodiment. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The preferred embodiments of the present invention have been described above with reference to the accompanying drawings and are not intended to limit the scope of the invention. Those skilled in the art can implement the invention with various modifications without departing from the scope and spirit of the invention; for example, features from one embodiment can be used in another embodiment to yield yet a further embodiment. Any modification, equivalent replacement or improvement made within the technical concept of the present invention shall fall within the scope of protection of the present invention.

Claims (13)

1. A remote memory management method is used for a server side, and is characterized by comprising the following steps:
applying for a memory as a message area, and registering the message area to a network card;
receiving a message block application sent by a client through Remote Direct Memory Access (RDMA) atomic operation, distributing a message block for the client and assigning the exclusive time of the message block;
receiving a remote request sent by the client through RDMA write operation, and storing the information of the remote request into the message block; processing and responding to the remote request of the client;
the message area is divided into a working area and a preheating area, so that the message block application and the remote request sent by the client are processed in parallel through the working area and the preheating area.
2. The method of claim 1, wherein the parallel processing of the message block request and the remote request sent by the client via the working area and the pre-heating area further comprises:
acquiring switching signals of the working area and the preheating area;
and switching the working area into a preheating area and switching the preheating area into the working area according to the switching signal.
3. The method of claim 1, wherein receiving a message block request sent by a client via a Remote Direct Memory Access (RDMA) atomic operation, after assigning a message block to the client and specifying an exclusive time for the message block, further comprises:
if the number of the clients is larger than that of the message blocks, shortening the exclusive time of the message blocks;
and if the exclusive time of the message block is expired, feeding back the expiration information to the client so that the client clears the occupation state information and resends the message block application.
4. The method of claim 1, wherein the applying for a memory as a message area and registering the message area to the network card further comprises:
configuring a distribution field for judging the message block application result of the client at the head of the message block;
the value of the allocation field is updated periodically.
5. A server, characterized in that the server comprises: a memory, a processor and a remote memory management program stored on the memory and executable on the processor, the remote memory management program when executed by the processor implementing the steps of the remote memory management method according to any one of claims 1 to 4.
6. A computer-readable storage medium, having a remote memory management program stored thereon, wherein the remote memory management program, when executed by a processor, implements the steps of the remote memory management method according to any one of claims 1 to 4.
7. A remote memory management method, which is used for a client, is characterized by comprising the following steps:
sending a message block application to a server through RDMA atomic operation, so that the server receives the message block application sent by the client through the RDMA atomic operation, allocates a message block to the client and specifies the exclusive time of the message block; if the server divides the message area into a working area and a preheating area so as to process the message block application and the remote request sent by the client side in parallel through the working area and the preheating area, the sending of the message block application to the server side through the RDMA atomic operation specifically comprises: sending a message block application to the preheat zone through RDMA atomic operation;
after the application of the message block is successful, sending a remote request to the server through RDMA write operation, and waiting for the response of the server, so that the server receives the remote request sent by the client through the RDMA write operation, stores the information of the remote request into the message block, and processes and responds to the remote request of the client.
8. The method of claim 7, wherein if the server switches the working area to the preheating area and switches the preheating area to the working area according to a switching signal, sending a message block application to the server through RDMA atomic operation specifically comprises: and sending a message block application to the working area switched into the preheating area through RDMA atomic operation.
9. The method according to claim 7, wherein after the message block application is successful, if the expiration information fed back by the server is received, the occupancy state information is cleared and the message block application is sent to the server again.
10. The method of claim 7, wherein the message block request is determined to be successful by:
obtaining values (a, b) of the distribution fields, and if b is smaller than the total number of the message blocks, judging that the message block application is successful; the distribution field is configured in the header of the message block for the server to decide the message block application result of the client, wherein a represents the index number of the message area, and b represents the application number of the message block.
11. A client, the client comprising: a memory, a processor and a remote memory management program stored on the memory and executable on the processor, the remote memory management program when executed by the processor implementing the steps of the remote memory management method according to any one of claims 7 to 10.
12. A computer-readable storage medium, having a remote memory management program stored thereon, which when executed by a processor implements the steps of the remote memory management method according to any one of claims 7 to 10.
13. A remote memory management system, characterized in that the system comprises the server of claim 5 and the client of claim 11.
CN201810515949.3A 2018-05-25 2018-05-25 Remote memory management method and system, server, client and storage medium Active CN110535811B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810515949.3A CN110535811B (en) 2018-05-25 2018-05-25 Remote memory management method and system, server, client and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810515949.3A CN110535811B (en) 2018-05-25 2018-05-25 Remote memory management method and system, server, client and storage medium

Publications (2)

Publication Number Publication Date
CN110535811A CN110535811A (en) 2019-12-03
CN110535811B true CN110535811B (en) 2022-03-04

Family

ID=68657010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810515949.3A Active CN110535811B (en) 2018-05-25 2018-05-25 Remote memory management method and system, server, client and storage medium

Country Status (1)

Country Link
CN (1) CN110535811B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404931B (en) * 2020-03-13 2021-03-30 清华大学 Remote data transmission method based on persistent memory
CN112596669A (en) * 2020-11-25 2021-04-02 新华三云计算技术有限公司 Data processing method and device based on distributed storage
CN113448634B (en) * 2021-05-31 2022-07-19 山东英信计算机技术有限公司 ROCE network card resource management method, device, equipment and readable medium
CN114153785B (en) * 2021-11-29 2022-08-30 北京志凌海纳科技有限公司 Memory management method and device based on remote direct memory access

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530167A (en) * 2013-09-30 2014-01-22 华为技术有限公司 Virtual machine memory data migration method and relevant device and cluster system
CN103929415A (en) * 2014-03-21 2014-07-16 华为技术有限公司 Method and device for reading and writing data under RDMA and network system
CN105353992A (en) * 2015-12-10 2016-02-24 浪潮(北京)电子信息产业有限公司 Energy-saving dispatching method for disks
CN106603409A (en) * 2016-11-30 2017-04-26 中国科学院计算技术研究所 Data processing system, method and apparatus
CN106657365A (en) * 2016-12-30 2017-05-10 清华大学 High concurrent data transmission method based on RDMA (Remote Direct Memory Access)
CN106873915A (en) * 2017-02-22 2017-06-20 郑州云海信息技术有限公司 A kind of data transmission method and device based on RDMA registers memory blocks
CN107479833A (en) * 2017-08-21 2017-12-15 中国人民解放军国防科技大学 Key value storage-oriented remote nonvolatile memory access and management method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10409762B2 (en) * 2016-03-08 2019-09-10 International Business Machines Corporation Remote direct memory access-based on static analysis of asynchronous blocks
US11086814B2 (en) * 2016-04-15 2021-08-10 Nec Corporation System and method for communication efficient sparse-reduce

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Distributed queue-based locking using advanced network features";A. Devulapalli;《2005 International Conference on Parallel Processing (ICPP"05)》;20050801;全文 *
"基于RDMA跨态通信协议的研究与实现";李亮;《中国优秀硕士学位论文全文数据库-信息科技辑》;20171115;全文 *

Also Published As

Publication number Publication date
CN110535811A (en) 2019-12-03

Similar Documents

Publication Publication Date Title
CN110535811B (en) Remote memory management method and system, server, client and storage medium
CN109088892B (en) Data transmission method, system and proxy server
US10284626B2 (en) Transporting operations of arbitrary size over remote direct memory access
CN111277616B (en) RDMA-based data transmission method and distributed shared memory system
EP0891585B1 (en) A method and apparatus for client managed flow control on a limited memory computer system
US7627627B2 (en) Controlling command message flow in a network
CN108268208A (en) A kind of distributed memory file system based on RDMA
US20140280398A1 (en) Distributed database management
EP2993838A1 (en) Method for setting identity of gateway device and management gateway device
CN102831018B (en) Low latency FIFO messaging system
CN112631788B (en) Data transmission method and data transmission server
WO2014082562A1 (en) Method, device, and system for information processing based on distributed buses
CN111404931B (en) Remote data transmission method based on persistent memory
US8539089B2 (en) System and method for vertical perimeter protection
WO2009097776A1 (en) System, device and method for achieving service upgrade
CN113259415B (en) Network message processing method and device and network server
CN105141603A (en) Communication data transmission method and system
WO2023236589A1 (en) Communication method and related components
CN111404986B (en) Data transmission processing method, device and storage medium
CN112351089B (en) Data transmission method, system and device between virtual machine and accelerator
KR20090071542A (en) Host posing network device and method thereof
CN110609746A (en) Method, apparatus and computer program product for managing network system
CN115509435A (en) Data reading and writing method, device, equipment and medium
CN116361038B (en) Acceleration computing management method, system, equipment and storage medium
CN114172945B (en) Method and equipment for realizing full duplex instant messaging through simulation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant