CN115586980A - Remote procedure calling device and method - Google Patents


Info

Publication number
CN115586980A
Authority
CN
China
Prior art keywords
memory
rpc
allocated
protocol frame
called
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211230705.3A
Other languages
Chinese (zh)
Inventor
姜哓庆
王鲲
陈飞
邹懋
杨智佳
刘福财
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vita Technology Beijing Co ltd
Original Assignee
Vita Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vita Technology Beijing Co ltd
Priority to CN202211230705.3A
Publication of CN115586980A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • G06F9/544 Buffers; Shared memory; Pipes
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/544 Remote

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The disclosure relates to a remote procedure call device and method that eliminate memory copies, reduce CPU resource consumption, and improve RPC communication performance. The device comprises a memory allocator, a shared memory, an RDMA module, and an RPC component. The memory allocator is configured to determine, in response to an RPC interface called by a calling end, an allocated memory on a target memory, wherein the target memory is either a to-be-allocated shared memory acquired by the shared memory from an operating system, or a registered memory acquired by the RDMA module from the operating system and registered to an RDMA network card. The RPC component is configured to encapsulate, on the allocated memory, an RPC protocol frame corresponding to the RPC interface, the RPC protocol frame instructing a called end to execute the operation corresponding to the RPC interface.

Description

Remote procedure calling device and method
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a remote procedure call apparatus and method.
Background
RPC (Remote Procedure Call) is widely used for inter-process communication in scenarios such as data transmission, process synchronization, and protocol handshakes. RPC usually works through two types of operations, the RPC request and the RPC reply, on which RPC protocol frames carrying small or large amounts of data can be encapsulated. In a distributed system, RPC is often the underlying communication component that connects a set of processes within a node or across nodes. Applying RPC to inter-process communication consumes hardware resources: the greater the resource consumption, the lower the data transmission performance, and both matter for metrics such as the overall latency and throughput of a distributed system. How to reduce the resource consumption of RPC and thereby improve data transmission performance is therefore an urgent problem to be solved.
Disclosure of Invention
The purpose of the present disclosure is to provide a remote procedure call device and method that eliminate memory copies, reduce CPU resource consumption, and improve RPC communication performance.
In order to achieve the above object, a first aspect of the embodiments of the present disclosure provides a remote procedure call apparatus, including: a memory allocator, a shared memory, an RDMA module, and an RPC component;
the memory allocator is configured to determine, in response to an RPC interface called by a calling end, an allocated memory on a target memory, wherein the target memory is a to-be-allocated shared memory acquired by the shared memory from an operating system, or a registered memory acquired by the RDMA module from the operating system and registered to an RDMA network card;
the RPC component is configured to encapsulate, on the allocated memory, the RPC protocol frame corresponding to the RPC interface, wherein the RPC protocol frame is used to instruct a called end to execute the operation corresponding to the RPC interface.
Optionally, the shared memory is configured to obtain a to-be-allocated shared memory of a first preset size from an operating system;
the memory allocator is configured to determine, in response to an RPC interface that is called by a calling end and communicates with a called end on the same node, a first allocated memory on the to-be-allocated shared memory;
and the RPC component is configured to encapsulate a first RPC protocol frame on the first allocated memory and trigger a notification event indicating that frame encapsulation is complete, wherein the notification event is used to instruct the called end to execute the operation corresponding to the RPC interface.
Optionally, the RDMA module is configured to obtain a to-be-registered memory of a second preset size from an operating system, and register the to-be-registered memory to the RDMA network card to obtain a registered memory;
the memory allocator is configured to determine, in response to an RPC interface that is called by a calling end and communicates across nodes with a called end, a second allocated memory on the registered memory;
and the RPC component is configured to encapsulate a second RPC protocol frame on the second allocated memory and send the second RPC protocol frame to the called end, so that the called end executes the operation corresponding to the RPC interface.
Optionally, the shared memory is further configured to map the shared memory to be allocated to a preset virtual address space, where the preset virtual address space is the same as the virtual address spaces of the calling end and the called end.
Optionally, the memory allocator is configured to determine, in response to a calling end calling an RPC interface that uses heap memory allocation, an allocated memory corresponding to a heap structure on the target memory.
Optionally, the apparatus further comprises a coroutine module;
the coroutine module is configured to initialize a coroutine stack space that includes a coroutine structure, so as to provide the coroutine structure when the memory allocator allocates stack memory;
the memory allocator is configured to determine, in response to a calling end calling an RPC interface that uses stack memory allocation, an allocated memory corresponding to the coroutine structure on the target memory.
Optionally, the apparatus further comprises an address negotiation and translation module;
the address negotiation and translation module is configured to determine, when the memory allocator determines an allocated memory, a virtual memory address that does not overlap with the allocated memories of other calling ends.
Optionally, the apparatus further comprises an address negotiation and translation module; the notification event comprises a virtual memory address of the first allocated memory;
and the address negotiation and translation module is configured to translate the virtual memory address of the first allocated memory when the calling end and the called end communicate on the same node, so that the RPC component acquires the first RPC protocol frame from the first allocated memory according to the translated virtual memory address.
Optionally, the RPC component is further configured to, when the calling end and the called end communicate on the same node, obtain the first RPC protocol frame from the first allocated memory, parse the obtained first RPC protocol frame, and determine an operation code corresponding to the RPC interface from the parsed first RPC protocol frame, so that the called end executes the operation corresponding to the operation code.
A second aspect of the embodiments of the present disclosure provides a remote procedure call method, which may be applied to the remote procedure call apparatus of the first aspect, where the method includes:
determining, by the memory allocator and in response to an RPC interface called by a calling end, an allocated memory on a target memory, wherein the target memory is a to-be-allocated shared memory acquired by the shared memory from an operating system, or a registered memory acquired by the RDMA module from the operating system and registered to the RDMA network card;
and encapsulating, by the RPC component, the RPC protocol frame corresponding to the RPC interface on the allocated memory, wherein the RPC protocol frame is used to instruct a called end to execute the operation corresponding to the RPC interface.
In the above technical solution, an allocated memory is determined on a target memory in response to an RPC interface called by a calling end, where the target memory is the to-be-allocated shared memory acquired by the shared memory from the operating system, or the registered memory acquired by the RDMA module from the operating system and registered to the RDMA network card. The RPC protocol frame corresponding to the RPC interface is then encapsulated directly on the allocated memory. This avoids copying the encapsulated RPC protocol frame to the shared memory or to the registered memory of the RDMA network card, so that zero-copy RPC communication can be achieved, CPU resource consumption is reduced, and data transmission performance is improved.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a block diagram illustrating a remote procedure call device according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating a memory allocator determining allocated memory according to an exemplary embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a coroutine module providing coroutine structure according to an exemplary embodiment of the disclosure.
Fig. 4 is a schematic diagram illustrating an address negotiation and translation module allocating a virtual memory address according to an exemplary embodiment of the disclosure.
FIG. 5 is a flow chart illustrating a method for remote procedure call in accordance with an exemplary embodiment of the present disclosure.
Detailed Description
The following detailed description of the embodiments of the disclosure refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
In the related art, the data channel of RPC communication may be implemented based on a file handle, a shared memory, a TCP Socket, and the like. In same-node RPC communication, inter-process communication based on shared memory can eliminate the transfer of data between processes. In cross-node RPC communication, inter-process communication based on a TCP Socket inevitably introduces multiple data copies, such as copies within user mode, copies between a user-mode buffer and a kernel-mode buffer, and copies between the kernel-mode buffer and the network card. In response, RDMA (Remote Direct Memory Access) high-performance networks have been developed. The RPC component can achieve high-performance cross-node communication by relying on the offload capability of the RDMA network card. RDMA eliminates the data copies that TCP requires between the user-mode buffer and the kernel-mode buffer and between the kernel-mode buffer and the network card device, and thereby improves the processing performance of the RPC component to a certain extent.
However, because the memory allocator in the related art is independent of both the shared memory and the registered memory of the RDMA network card, one user-mode data copy remains in the same-node RPC procedure based on shared memory and in the cross-node RPC procedure based on RDMA. Specifically, when the calling end performs an RPC call, the memory allocator initializes data on stack memory or heap memory (which of the two is used depends on how the calling end makes the call), and the RPC component then packs the initialized data to obtain the corresponding RPC protocol frame. On this basis, in same-node RPC communication the RPC protocol frame is copied to the shared memory so that the called end can acquire it from there, and in cross-node RPC communication the RPC protocol frame is copied to the registered memory so that the RDMA module can operate on it there and send it to the called end. In this process, one copy of user-mode data remains (namely, the copy of the RPC protocol frame to the shared memory or the registered memory). Memory copy operations are typically implemented, explicitly or implicitly, by repeated-execution instructions of the CPU (e.g., a rep-prefixed instruction), and relevant statistics indicate that the CPU spends at least 30% of its time executing such repeated instruction sequences, which results in considerable resource consumption. It is therefore necessary to reduce the number of iterations or executions of such instructions in order to effectively reduce CPU resource consumption.
In view of this, embodiments of the present disclosure provide a remote procedure call apparatus and method in which a corresponding memory allocator is implemented on the shared memory and on the registered memory of the RDMA network card. An allocated memory can then be determined directly on the to-be-allocated shared memory acquired by the shared memory from the operating system, or on the registered memory acquired by the RDMA module from the operating system and registered to the RDMA network card. This avoids copying the RPC protocol frame encapsulated on the allocated memory to the shared memory or to the registered memory of the RDMA network card, so that zero-copy RPC communication is achieved, CPU resource consumption is reduced, and data transmission performance is improved.
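The difference between the related-art path and the path proposed here can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the frame layout (a 16-bit opcode followed by a 32-bit payload length), the two function names, and the use of a `bytearray` as a stand-in for the shared or registered memory are all hypothetical, since the disclosure does not specify a frame format.

```python
import struct

# Hypothetical frame header for illustration: opcode (u16), payload length (u32).
HEADER = struct.Struct("<HI")


def encapsulate_with_copy(region, offset, opcode, payload):
    """Related-art path: build the frame in private memory, then copy it over."""
    frame = HEADER.pack(opcode, len(payload)) + payload   # private buffer
    region[offset:offset + len(frame)] = frame            # the extra user-mode copy
    return len(frame)


def encapsulate_in_place(region, offset, opcode, payload_len, fill_payload):
    """Proposed path: header and payload are written directly into the region."""
    HEADER.pack_into(region, offset, opcode, payload_len)
    start = offset + HEADER.size
    # The caller fills the payload through a view of the region itself,
    # so no separately built frame has to be copied afterwards.
    fill_payload(memoryview(region)[start:start + payload_len])
    return HEADER.size + payload_len
```

Both functions leave the same bytes in the region; the second one simply never materializes the frame anywhere else, which is the copy the embodiments aim to remove.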
Referring to fig. 1, fig. 1 is a block diagram illustrating a remote procedure call device 100 according to an exemplary embodiment of the present disclosure. As shown in fig. 1, the remote procedure call device 100 may include a memory allocator 101, a shared memory 102, an RDMA module 103, and an RPC component 104.
The memory allocator 101 is configured to determine, in response to an RPC interface called by a calling end, an allocated memory on a target memory, where the target memory is a to-be-allocated shared memory acquired by the shared memory 102 from an operating system, or a registered memory acquired by the RDMA module 103 from the operating system and registered to the RDMA network card;
and the RPC component 104 is configured to encapsulate an RPC protocol frame corresponding to the RPC interface on the allocated memory, where the RPC protocol frame is used to instruct the called end to execute an operation corresponding to the RPC interface.
The calling end may refer to a first process running in an electronic device, and the called end may refer to a second process running in the same electronic device or in another electronic device. By calling the RPC interface, the calling end communicates with the called end, and the called end executes the operation corresponding to the RPC interface.
It should be understood that the memory allocator 101 may include a shared memory allocator and a registered memory allocator. The RDMA module 103 may include an RDMA network card, and the RDMA module 103 may operate on the registered memory registered to the RDMA network card. On this basis, a corresponding memory allocator 101 (for example, the above-mentioned shared memory allocator) may be implemented on the to-be-allocated shared memory acquired by the shared memory 102 from the operating system, so that when the calling end and the called end perform same-node RPC communication, the memory allocator 101 may determine an allocated memory on the to-be-allocated shared memory in response to the RPC interface called by the calling end. Similarly, a corresponding memory allocator 101 (for example, the above-mentioned registered memory allocator) may be implemented on the registered memory acquired by the RDMA module 103 from the operating system and registered to the RDMA network card, so that when the calling end and the called end perform cross-node RPC communication, the memory allocator 101 may determine an allocated memory on the registered memory in response to the RPC interface called by the calling end. The RPC protocol frame corresponding to the RPC interface can then be encapsulated on the allocated memory, so that the called end is instructed to execute the operation corresponding to the RPC interface on a zero-copy basis.
Because the corresponding memory allocator 101 is implemented on the shared memory 102 and on the registered memory of the RDMA network card, the allocated memory can be determined, in response to the RPC interface called by the calling end, directly on the to-be-allocated shared memory acquired by the shared memory 102 from the operating system or on the registered memory acquired by the RDMA module 103 from the operating system and registered to the RDMA network card. This avoids copying the RPC protocol frame encapsulated on the allocated memory to the shared memory 102 or to the registered memory of the RDMA network card, so that zero-copy RPC communication can be implemented and the processing delay on the RPC data path can be effectively reduced. Eliminating memory copies in the RPC process as far as possible not only reduces contention on the memory bus and consumption of CPU resources, but also effectively improves the overall performance of the upper-layer service system.
It is understood that the calling end may call an RPC interface that communicates with a called end on the same node. Referring to fig. 2, fig. 2 is a schematic diagram illustrating a memory allocator 101 determining allocated memory according to an exemplary embodiment of the present disclosure. As shown in fig. 2, in response to the RPC interfaces called by the calling end, the memory allocator 101 may allocate corresponding allocated memory instances, such as the memory block (block) 1011 and the memory block (block) 1012 shown in fig. 2. A memory block may include a plurality of memory objects (objects), and both memory blocks and memory objects may be allocated from the shared memory 102. In a possible implementation, different memory blocks may have different sizes, and memory blocks may be further divided into memory objects of various sizes according to the typical object sizes of the application program, so as to provide fine-grained memory allocation for the calling end and thereby reduce internal fragmentation during memory allocation. When memory is released, adjacent free blocks may be merged according to the Buddy System algorithm, so as to reduce the external fragmentation generated during memory use.
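The split-on-allocate and merge-on-release behaviour of a buddy system can be sketched as follows. This is a toy model that manages offsets within a region rather than real shared memory, and it covers only the buddy merging mentioned above, not the division of blocks into per-size-class objects. The class name, the method names, and the 64-byte minimum block size are assumptions chosen for illustration.

```python
class BuddyAllocator:
    """Toy buddy allocator over a region of `total` bytes (a power of two)."""

    def __init__(self, total, min_block=64):
        if total & (total - 1) or min_block & (min_block - 1):
            raise ValueError("sizes must be powers of two")
        self.total, self.min_block = total, min_block
        self.free = {total: {0}}   # block size -> set of free offsets
        self.used = {}             # offset -> size of the allocated block

    def alloc(self, size):
        want = self.min_block
        while want < size:         # round the request up to a power of two
            want *= 2
        cur = want
        while cur <= self.total and not self.free.get(cur):
            cur *= 2               # look for the smallest larger free block
        if cur > self.total:
            raise MemoryError("region exhausted")
        off = self.free[cur].pop()
        while cur > want:          # split, keeping the upper half on the free list
            cur //= 2
            self.free.setdefault(cur, set()).add(off + cur)
        self.used[off] = want
        return off

    def free_block(self, off):
        size = self.used.pop(off)
        while size < self.total:
            buddy = off ^ size     # a block and its buddy differ in exactly one bit
            peers = self.free.get(size, set())
            if buddy not in peers:
                break
            peers.remove(buddy)    # merge with the free buddy into a larger block
            off = min(off, buddy)
            size *= 2
        self.free.setdefault(size, set()).add(off)
```

After every allocated block has been released, the merges cascade until the free list again holds a single block covering the whole region, which is how the scheme limits external fragmentation.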
Optionally, the shared memory 102 may be configured to obtain a to-be-allocated shared memory of a first preset size from the operating system;
the memory allocator 101 may be configured to determine, in response to an RPC interface that is called by a calling end and performs same-node communication with a called end, a first allocated memory on the to-be-allocated shared memory;
the RPC component 104 may be configured to encapsulate the first RPC protocol frame on the first allocated memory, and trigger a notification event indicating that the encapsulation is complete, where the notification event is used to instruct the called end to execute an operation corresponding to the RPC interface.
The to-be-allocated shared memory should be large enough for the corresponding memory allocator 101 to be implemented on it. On this basis, the first preset size may be set according to the actual situation, which is not specifically limited in the present disclosure.
In a possible implementation, the shared memory 102 may obtain the to-be-allocated shared memory of the first preset size from the operating system through a system call such as mmap. The memory allocator 101 may then determine the first allocated memory on the to-be-allocated shared memory in response to the RPC interface that is called by the calling end and performs same-node communication with the called end. Then, the RPC component 104 may encapsulate the first RPC protocol frame on the first allocated memory, and trigger a notification event indicating that the encapsulation is complete, so that the called end executes an operation corresponding to the RPC interface.
It should be further noted that a corresponding memory allocator 101 may be implemented on the shared memory 102, and a corresponding interface may be provided based on the memory allocator 101, through which a virtual memory address on the to-be-allocated shared memory obtained by the shared memory 102 from the operating system can be requested. In this way, the first allocated memory used to encapsulate the first RPC protocol frame can be provided, on the to-be-allocated shared memory, for the RPC interface called by the calling end.
Because the corresponding memory allocator 101 is implemented on the shared memory 102, and the first allocated memory is determined by the memory allocator 101, the operation of copying the RPC protocol frame encapsulated on the allocated memory to the shared memory 102 can be avoided. Zero-copy RPC communication is thereby implemented, which reduces CPU resource consumption and improves data transmission performance.
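The same-node flow described above (obtain a region through mmap, allocate on it, encapsulate in place, raise a notification event, and let the called end read the frame where it lies) can be sketched in a single process as follows. This is a sketch under stated assumptions: the header layout, the bump allocator, the `call` and `serve` helpers, and the use of a `deque` as the notification channel are all hypothetical. A real deployment would share the mapping between two processes and use an inter-process notification mechanism instead.

```python
import mmap
import struct
from collections import deque

HEADER = struct.Struct("<HI")   # hypothetical header: opcode (u16), payload length (u32)
ARGS = struct.Struct("<ii")     # hypothetical payload: two signed 32-bit arguments


class SharedRegion:
    """Anonymous mmap region with a trivial bump allocator (no free, for brevity)."""

    def __init__(self, size):
        self.buf = mmap.mmap(-1, size)   # stands in for the to-be-allocated shared memory
        self.size, self.next = size, 0

    def alloc(self, nbytes):
        if self.next + nbytes > self.size:
            raise MemoryError("shared region exhausted")
        off, self.next = self.next, self.next + nbytes
        return off


def call(region, events, opcode, a, b):
    """Calling end: allocate on the region, encapsulate in place, post a notification."""
    off = region.alloc(HEADER.size + ARGS.size)
    HEADER.pack_into(region.buf, off, opcode, ARGS.size)
    ARGS.pack_into(region.buf, off + HEADER.size, a, b)
    events.append(off)                   # the notification carries the frame location


def serve(region, events, handlers):
    """Called end: parse the frame in the region and dispatch on its opcode."""
    off = events.popleft()
    opcode, length = HEADER.unpack_from(region.buf, off)
    if length != ARGS.size:
        raise ValueError("unexpected payload length")
    a, b = ARGS.unpack_from(region.buf, off + HEADER.size)
    return handlers[opcode](a, b)
```

A usage example: after `call(region, events, 1, 20, 22)`, a called end with `handlers = {1: lambda a, b: a + b}` obtains the sum of the two arguments from `serve(region, events, handlers)` without the frame ever being copied out of the region.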
Optionally, the shared memory 102 is further configured to map the shared memory to be allocated to a preset virtual address space, where the preset virtual address space is the same as virtual address spaces of the calling end and the called end.
The preset virtual address space may be determined according to actual conditions, and this disclosure is not limited in this respect.
It should be noted that the calling end and the called end perform inter-process communication based on the shared memory 102, so both ends access the same physical memory. On this basis, the shared memory 102 maps the to-be-allocated shared memory to a preset virtual address space that is the same in the calling end and the called end. This reduces the address translation work the called end would otherwise perform to align virtual addresses, so that the called end can quickly find the first allocated memory in which the first RPC protocol frame is encapsulated. In a possible implementation, the virtual address pointers of the respective processes can be encapsulated directly when the first RPC protocol frame is encapsulated, which achieves zero memory copy.
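The benefit of mapping the region at the same virtual base in both processes can be seen in a small sketch of the translation that would otherwise be needed. The function name and the example addresses are hypothetical; the disclosure does not prescribe a translation routine.

```python
def translate(addr, src_base, dst_base, size):
    """Rebase a virtual address from the sender's mapping to the receiver's mapping."""
    offset = addr - src_base
    if not 0 <= offset < size:
        raise ValueError("address lies outside the shared mapping")
    return dst_base + offset
```

When both ends use the same preset base, `translate(p, base, base, size)` returns `p` unchanged, so a pointer embedded in the frame is usable as-is. With different bases, every embedded pointer has to be rebased on receipt.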
It is understood that the caller may call an RPC interface that communicates cross-node with the callee.
Optionally, the RDMA module 103 may be configured to obtain a to-be-registered memory of a second preset size from the operating system, and register the to-be-registered memory to the RDMA network card to obtain a registered memory;
the memory allocator 101 may be configured to determine, in response to an RPC interface that is called by a calling end and performs cross-node communication with a called end, a second allocated memory on the registered memory;
the RPC component 104 may be configured to encapsulate the second RPC protocol frame on the second allocated memory, and send the second RPC protocol frame to the called end, so that the called end executes an operation corresponding to the RPC interface.
The size of the memory to be registered should be enough to implement the corresponding memory allocator 101 thereon, and on this basis, the second preset size may be set according to an actual situation, which is not specifically limited in the present disclosure.
It should be noted that RDMA programming requires an application to register the relevant memory area of its process with the RDMA network card in advance. After registration succeeds, all data-plane transmission operations are offloaded to the network card, and data is no longer copied by the CPU. Meanwhile, context switching and memory copying between user mode and kernel mode can be avoided by means of kernel-bypass technology. However, one copy of user-mode memory still remains in the RPC flow.
In this embodiment, before the instance of the memory allocator 101 is initialized, the to-be-registered memory acquired by the RDMA module 103 from the operating system may be registered to the RDMA network card, so as to obtain the registered memory. This may enable the RDMA module 103 to index memory addresses of all current processes in an address translation table of the RDMA network card, implement the corresponding memory allocator 101 on the registered memory, and determine the second allocated memory through the memory allocator 101. Then, the RPC component 104 may encapsulate the second RPC protocol frame on the second allocated memory, and send the second RPC protocol frame to the called end, so that the called end executes an operation corresponding to the RPC interface.
Because the corresponding memory allocator 101 is implemented on the registered memory, and the second allocated memory is determined by the memory allocator 101, the operation of copying the RPC protocol frame encapsulated on the allocated memory to the registered memory can be avoided. Zero-copy RPC communication is thereby implemented, which reduces CPU resource consumption and improves data transmission performance.
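The ordering described above (register first, then build the allocator on the registered range) can be illustrated with a pure simulation. Nothing below talks to real RDMA hardware or uses a real RDMA library: `MockNic`, `RegisteredPool`, and their methods are hypothetical names that only model the rule that a buffer handed to the network card must lie inside a registered region.

```python
class MockNic:
    """Simulation only: records registered ranges and the buffers posted for sending."""

    def __init__(self):
        self.registered = []   # (base_address, length) pairs
        self.sent = []         # (address, length) pairs accepted for transmission

    def register(self, base, length):
        self.registered.append((base, length))

    def post_send(self, addr, length):
        for base, size in self.registered:
            if base <= addr and addr + length <= base + size:
                self.sent.append((addr, length))   # no staging copy is needed
                return
        raise ValueError("buffer is not inside a registered region")


class RegisteredPool:
    """Bump allocator whose whole range is registered before first use."""

    def __init__(self, nic, base, size):
        nic.register(base, size)   # registration precedes allocator initialization
        self.base, self.size, self.next = base, size, 0

    def alloc(self, nbytes):
        if self.next + nbytes > self.size:
            raise MemoryError("registered region exhausted")
        addr = self.base + self.next
        self.next += nbytes
        return addr
```

Any address returned by `RegisteredPool.alloc` can be posted directly, whereas a frame built elsewhere would first have to be copied into a registered buffer, which is the copy this embodiment avoids.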
It should be noted that RPC data framing and unframing involve two types of copy scenarios, both of which arise when data is serialized and/or deserialized. In the first type, the calling end uses dynamically allocated memory, and the RPC protocol frame encapsulated on the allocated heap memory is copied to the shared memory 102 or the registered memory. In the second type, the calling end uses local variable memory on the function stack, and the RPC protocol frame encapsulated on the allocated stack memory is copied to the shared memory 102 or the registered memory.
Optionally, for the first type of copy scenario, the memory allocator 101 may be configured to determine, in response to the calling end calling an RPC interface that uses heap memory allocation, an allocated memory corresponding to the heap structure on the target memory.
It is understood that the to-be-allocated shared memory acquired by the shared memory 102 from the operating system and the to-be-registered memory acquired by the RDMA module 103 from the operating system both belong to heap memory. Therefore, when the calling end calls an RPC interface that uses heap memory allocation, the allocated memory corresponding to the heap structure can be determined directly on the target memory.
Optionally, the remote procedure call apparatus provided in the embodiment of the present disclosure further includes a coroutine module. For the second type of copy scenario, a function execution stack may be constructed by the coroutine module, so that the stack memory is replaced with heap memory owned by the coroutine. A coroutine is a user-mode thread that can use dynamically allocated memory (also called heap memory) to construct a function runtime stack, and that switches the executing coroutine context at runtime through active enter and yield operations. Therefore, after the coroutine stack space is initialized by the coroutine module, the coroutine stack space including the coroutine structure is placed on the allocated memory determined by the memory allocator 101, so that the coroutine stack space of the coroutine module can also be allocated from the shared memory 102.
Referring to fig. 3, fig. 3 is a schematic diagram of a coroutine module providing a coroutine structure according to an exemplary embodiment of the disclosure. As shown in fig. 3, the coroutine module may allocate a coroutine execution stack including a coroutine structure to an allocated memory instance allocated by the memory allocator 101 in response to the RPC interface called by the calling end (the allocated memory instance is located on the shared memory 102), for example, allocate the coroutine execution stack 3051 shown in fig. 3 to a memory object (object) instance 1021, allocate the coroutine execution stack 3052 to a memory object (object) instance 1022, and allocate the coroutine execution stack 3053 to a memory object (object) instance 1023.
The coroutine module may be configured to initialize a coroutine stack space including a coroutine structure, so as to provide the coroutine structure when the memory allocator 101 allocates a stack memory;
the memory allocator 101 may be configured to determine, in response to the calling end calling an RPC interface for stack memory allocation, an allocated memory corresponding to the coroutine structure on the target memory.
In a possible implementation, the initialization of at least one coroutine execution stack and of a coroutine pool may be completed by the coroutine module before the process is started. A corresponding coroutine execution stack can thus be dynamically allocated for each RPC call request, and the request and completion procedures of the RPC are tracked asynchronously. Exemplarily, after the RPC request is issued, the coroutine execution stack is exited, and after the RPC request is completed, the original coroutine context is re-entered based on the event loop or the coroutine scheduler, so that the complete RPC flow is finished. By means of the coroutine module, the RPC frame encapsulation and decapsulation processes on the function stack can be completed, and zero-copy RPC communication is thereby achieved in combination with other embodiments of the disclosure.
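The enter and yield control flow described above can be sketched as follows, purely as an illustration. Python generators are used here as a stand-in for coroutines; they do not place an execution stack on shared memory, so the sketch shows only the asynchronous tracking of request and completion. The names `rpc_call` and `run_event_loop` and the reply strings are hypothetical.

```python
from collections import deque


def rpc_call(request_id, completions):
    """Hypothetical coroutine body: issue a request, yield, resume on completion."""
    # Issue the request, then leave the coroutine (the "yield" operation).
    reply = yield ("issued", request_id)
    # Execution resumes here once the scheduler re-enters the coroutine.
    completions.append((request_id, reply))


def run_event_loop(request_ids):
    """Hypothetical scheduler: one coroutine per RPC request, resumed on completion."""
    completions = []
    pending = deque()
    for request_id in request_ids:
        task = rpc_call(request_id, completions)
        next(task)  # enter: runs until the request has been issued
        pending.append((request_id, task))
    while pending:
        request_id, task = pending.popleft()
        try:
            task.send(f"reply-{request_id}")  # re-enter the original context
        except StopIteration:
            pass  # the coroutine finished its RPC flow
    return completions


results = run_event_loop([1, 2, 3])
```

Each request keeps its own suspended context between issue and completion, which is what allows a complete RPC flow to be written as straight-line code.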
It should be further noted that, on the basis of the coroutine module, the RDMA module 103 provided in this embodiment of the present disclosure does not require a separately designed RDMA-based RPC memory buffer, which facilitates cache management and RDMA flow control management.
Optionally, the remote procedure call apparatus provided in the embodiment of the present disclosure further includes an address negotiation and translation module. Referring to fig. 4, fig. 4 is a schematic diagram illustrating an address negotiation and translation module 406 allocating virtual memory addresses according to an exemplary embodiment of the disclosure. As shown in fig. 4, when the memory allocator 101 allocates corresponding allocated memory instances for RPC interfaces called by different calling ends (e.g., the calling end 4071, the calling end 4072, and the calling end 4073 shown in fig. 4), the address negotiation and translation module 406 may perform cross-process address negotiation to assist the address allocation.
The address negotiation and translation module 406 may be configured to determine, when the memory allocator 101 determines the allocated memory, a virtual memory address that does not overlap with the allocated memory of other calling ends.
In a possible implementation, the initialization of the respective allocated memory instances may be completed for different processes based on the address negotiation mechanism of the address negotiation and translation module 406. Through the address negotiation and translation module 406, the allocated memory instances of the different processes are aware of all allocatable addresses of the shared memory 102, but each uses only its assigned address space. That is, relying on a virtual address negotiation mechanism, the address negotiation and translation module 406 may determine, when the memory allocator 101 determines the allocated memory, a virtual memory address that does not overlap with the allocated memory of other calling ends. The calling end and the other calling ends thus share the shared memory to be allocated while each uses only its own assigned virtual memory addresses, which avoids usage conflicts between different processes.
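As a non-limiting sketch of such negotiation, the following Python example hands each calling end a disjoint range of one shared address space. The class `AddressNegotiator`, the fixed slice size, the base address, and the caller identifiers are all assumptions chosen for the example; the disclosure does not specify how ranges are sized or assigned.

```python
class AddressNegotiator:
    """Hypothetical negotiator handing each calling end a disjoint address range."""

    def __init__(self, base: int, total_size: int, slice_size: int) -> None:
        self.base = base
        self.total_size = total_size
        self.slice_size = slice_size
        self.assigned = {}  # caller id -> (start, end), end exclusive

    def assign(self, caller_id: str) -> tuple:
        """Return the caller's range, assigning the next free slice on first use."""
        if caller_id in self.assigned:
            return self.assigned[caller_id]
        start = self.base + len(self.assigned) * self.slice_size
        end = start + self.slice_size
        if end > self.base + self.total_size:
            raise MemoryError("no unassigned address range left")
        self.assigned[caller_id] = (start, end)
        return start, end

    def all_ranges(self) -> list:
        """Every assigned range is visible to all participants."""
        return sorted(self.assigned.values())


negotiator = AddressNegotiator(base=0x10000000, total_size=3 * 4096, slice_size=4096)
range_1 = negotiator.assign("calling-end-4071")
range_2 = negotiator.assign("calling-end-4072")
range_3 = negotiator.assign("calling-end-4073")
```

Every participant can see all ranges through `all_ranges()`, yet each one allocates only inside the range returned by its own `assign` call.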
In addition, the address negotiation and translation module 406 may also be configured to translate a virtual memory address of the allocated memory, so that the called end may find the corresponding allocated memory according to the translated virtual memory address, and obtain the corresponding RPC protocol frame.
Optionally, the notification event triggered by the calling end may include a virtual memory address of the first allocated memory. On this basis, the address negotiation and translation module 406 may be configured to translate the virtual memory address of the first allocated memory when the calling end and the called end perform same-node communication, so that the RPC component 104 obtains the first RPC protocol frame from the first allocated memory according to the translated virtual memory address.
It is understood that, when the calling end and the called end communicate within the same node, the called end may translate, through the address negotiation and translation module 406, the virtual memory address of the first allocated memory included in the notification event, so that the RPC component 104 obtains the first RPC protocol frame from the first allocated memory according to the translated virtual memory address.
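The disclosure does not state the translation rule itself. Purely as an illustration, the sketch below assumes a base-plus-offset scheme in which the same region is mapped at a known base address in each process; the function name `translate_address` and all addresses are hypothetical.

```python
def translate_address(virtual_address: int, caller_base: int,
                      callee_base: int, region_size: int) -> int:
    """Hypothetical translation: keep the offset, swap the mapping base."""
    offset = virtual_address - caller_base
    if not 0 <= offset < region_size:
        raise ValueError("address lies outside the shared region")
    return callee_base + offset


# Address carried in the notification event (example values only).
frame_address = 0x1000 + 0x40
translated = translate_address(frame_address, caller_base=0x1000,
                               callee_base=0x9000, region_size=0x100)
```

Under this assumption, when both ends map the region at the same base address the translation reduces to the identity.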
Optionally, the RPC component 104 is further configured to, when the calling end and the called end perform same-node communication, obtain the first RPC protocol frame from the first allocated memory, parse the obtained first RPC protocol frame, and determine an operation code corresponding to the RPC interface according to the parsed first RPC protocol frame, so that the called end executes an operation corresponding to the operation code.
It should be noted that, because the corresponding memory allocator 101 is implemented on the shared memory 102, the first allocated memory is determined by the memory allocator 101, and the virtual address spaces of the calling end and the called end are the same, the called end can directly find the first allocated memory according to the information related to the first RPC protocol frame that is included in the notification event (for example, the virtual memory address information of the first allocated memory), and obtain the first RPC protocol frame from the first allocated memory. After parsing the obtained first RPC protocol frame, the called end may determine the operation code corresponding to the RPC interface according to the parsed first RPC protocol frame, and execute the operation corresponding to the operation code.
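The called-end handling described above might look like the following sketch. The frame layout (a 2-byte operation code followed by a 4-byte payload length), the handler table, and the function names are assumptions introduced for the example; the disclosure does not define a concrete frame format. A `bytearray` stands in for the first allocated memory.

```python
import struct

# Hypothetical frame header: 2-byte operation code, 4-byte payload length.
HEADER = struct.Struct("<HI")


def encapsulate(buffer, offset: int, opcode: int, payload: bytes) -> int:
    """Calling end: write the frame in place and return its total length."""
    HEADER.pack_into(buffer, offset, opcode, len(payload))
    start = offset + HEADER.size
    buffer[start:start + len(payload)] = payload
    return HEADER.size + len(payload)


def parse(buffer, offset: int):
    """Called end: read the header and return the opcode with a payload view."""
    opcode, length = HEADER.unpack_from(buffer, offset)
    start = offset + HEADER.size
    return opcode, memoryview(buffer)[start:start + length]


# Hypothetical operations keyed by operation code.
HANDLERS = {
    1: lambda payload: bytes(payload).upper(),
    2: lambda payload: bytes(payload)[::-1],
}


def handle_notification(buffer, offset: int) -> bytes:
    """Called end: locate the frame, parse it, and run the matching operation."""
    opcode, payload = parse(buffer, offset)
    return HANDLERS[opcode](payload)


first_allocated_memory = bytearray(64)
frame_length = encapsulate(first_allocated_memory, 8, 1, b"ping")
result = handle_notification(first_allocated_memory, 8)
```

The notification only needs to carry the location of the frame (here the offset 8); the payload itself stays where it was written.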
In the embodiment of the present disclosure, the allocated memory is determined on the target memory in response to the RPC interface called by the calling end. Because the target memory is either the shared memory to be allocated, which the shared memory 102 acquires from the operating system, or the registered memory, which the RDMA module 103 acquires from the operating system and registers to the RDMA network card, encapsulating the RPC protocol frame corresponding to the RPC interface on the allocated memory avoids the operation of copying the encapsulated RPC protocol frame to the shared memory 102 or to the registered memory of the RDMA network card. Zero-copy RPC communication can therefore be realized, CPU resource consumption is reduced, and data transmission performance is improved.
Based on the same inventive concept, an embodiment of the present disclosure further provides a remote procedure call method, and referring to fig. 5, fig. 5 is a flowchart illustrating a remote procedure call method according to an exemplary embodiment of the present disclosure. The remote procedure call method may be applied to the remote procedure call apparatus in the above embodiment, and the method includes:
S501, in response to an RPC interface called by a calling end, determining, by a memory allocator, an allocated memory on a target memory, wherein the target memory is a shared memory to be allocated that is acquired by the shared memory from an operating system, or a registered memory that is acquired by an RDMA module from the operating system and registered to an RDMA network card;
S502, encapsulating, by the RPC component, an RPC protocol frame corresponding to the RPC interface on the allocated memory, wherein the RPC protocol frame is used to instruct the called end to execute the operation corresponding to the RPC interface.
Because the RPC protocol frame corresponding to the RPC interface is encapsulated on the allocated memory, the operation of copying the encapsulated RPC protocol frame to the shared memory or to the registered memory of the RDMA network card can be avoided. Zero-copy RPC communication can thus be realized, which reduces CPU (central processing unit) resource consumption and improves data transmission performance.
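Steps S501 and S502 can be sketched as follows, as an illustration only. An anonymous `mmap` mapping stands in for the target memory, and the function names, the fixed offset, and the frame header layout (2-byte operation code, 4-byte payload length) are hypothetical choices not found in the disclosure.

```python
import mmap
import struct

# Hypothetical frame header: 2-byte operation code, 4-byte payload length.
FRAME_HEADER = struct.Struct("<HI")


def s501_determine_allocated_memory(target, offset: int, nbytes: int) -> memoryview:
    """S501: pick a block inside the target memory and return a view of it."""
    if offset < 0 or offset + nbytes > len(target):
        raise MemoryError("block does not fit in the target memory")
    return memoryview(target)[offset:offset + nbytes]


def s502_encapsulate(block: memoryview, opcode: int, payload: bytes) -> None:
    """S502: write the frame straight into the allocated block."""
    FRAME_HEADER.pack_into(block, 0, opcode, len(payload))
    block[FRAME_HEADER.size:FRAME_HEADER.size + len(payload)] = payload


target_memory = mmap.mmap(-1, 4096)  # stand-in for shared or registered memory
payload = b"hello"
block = s501_determine_allocated_memory(
    target_memory, 128, FRAME_HEADER.size + len(payload))
s502_encapsulate(block, 7, payload)
```

After S502 the frame is already located in the target memory at offset 128, so no intermediate buffer has to be copied into it before the called end is notified or the frame is sent.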
In an embodiment, the method further comprises:
acquiring, by the shared memory, a shared memory to be allocated of a first preset size from an operating system;
the determining, by the memory allocator, an allocated memory on the target memory in response to the RPC interface called by the calling end includes:
in response to an RPC interface that is called by the calling end and used for same-node communication with the called end, determining, by the memory allocator, a first allocated memory on the shared memory to be allocated;
the encapsulating, by the RPC component, the RPC protocol frame corresponding to the RPC interface on the allocated memory includes:
encapsulating, by the RPC component, a first RPC protocol frame on the first allocated memory, and triggering a notification event indicating that frame encapsulation is complete, wherein the notification event is used to instruct the called end to execute the operation corresponding to the RPC interface.
In an embodiment, the method further comprises:
acquiring, by the RDMA (remote direct memory access) module, a memory to be registered of a second preset size from the operating system, and registering the memory to be registered to the RDMA network card to obtain a registered memory;
the determining, by the memory allocator, an allocated memory on the target memory in response to the RPC interface called by the calling end includes:
in response to an RPC interface that is called by the calling end and used for cross-node communication with the called end, determining, by the memory allocator, a second allocated memory on the registered memory;
the encapsulating, by the RPC component, the RPC protocol frame corresponding to the RPC interface on the allocated memory includes:
encapsulating, by the RPC component, a second RPC protocol frame on the second allocated memory, and sending the second RPC protocol frame to the called end, so that the called end executes the operation corresponding to the RPC interface.
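The choice between the two kinds of target memory in the same-node and cross-node embodiments can be illustrated with the short sketch below. The enum, the function name, and the use of node name strings to decide locality are assumptions for the example; how locality is actually detected is not stated in the disclosure.

```python
from enum import Enum


class TargetMemory(Enum):
    """The two kinds of target memory named in the method."""
    SHARED_MEMORY = "shared memory to be allocated"
    REGISTERED_MEMORY = "memory registered to the RDMA network card"


def select_target_memory(calling_node: str, called_node: str) -> TargetMemory:
    """Hypothetical rule: same node uses shared memory, otherwise registered memory."""
    if calling_node == called_node:
        return TargetMemory.SHARED_MEMORY      # first allocated memory
    return TargetMemory.REGISTERED_MEMORY      # second allocated memory


same_node_target = select_target_memory("node-a", "node-a")
cross_node_target = select_target_memory("node-a", "node-b")
```

In both branches the frame is encapsulated where it will be consumed: the called end is notified in the same-node case, and the frame is sent through the RDMA network card in the cross-node case.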
In an embodiment, the method further comprises:
mapping, by the shared memory, the shared memory to be allocated to a preset virtual address space, wherein the preset virtual address space is the same as the virtual address spaces of the calling end and the called end.
In one embodiment, the determining, by the memory allocator, an allocated memory on the target memory in response to the RPC interface called by the calling end includes:
in response to the calling end calling an RPC interface for heap memory allocation, determining, by the memory allocator, the allocated memory corresponding to the heap structure on the target memory.
In an embodiment, the method further comprises:
initializing, by the coroutine module, a coroutine stack space including a coroutine structure, so as to provide the coroutine structure when the memory allocator performs stack memory allocation;
the determining, by the memory allocator, an allocated memory on the target memory in response to the RPC interface called by the calling end includes:
in response to the calling end calling an RPC interface for stack memory allocation, determining, by the memory allocator, an allocated memory corresponding to the coroutine structure on the target memory.
In an embodiment, the method further comprises:
determining, by the address negotiation and translation module, when the memory allocator determines the allocated memory, a virtual memory address that does not overlap with the allocated memory of other calling ends.
In one embodiment, the notification event includes a virtual memory address of the first allocated memory; the method further comprises:
translating, by the address negotiation and translation module, the virtual memory address of the first allocated memory when the calling end and the called end perform same-node communication, so that the RPC component obtains the first RPC protocol frame from the first allocated memory according to the translated virtual memory address.
In an embodiment, the method further comprises:
obtaining, by the RPC component, the first RPC protocol frame from the first allocated memory when the calling end and the called end perform same-node communication, parsing the obtained first RPC protocol frame, and determining an operation code corresponding to the RPC interface according to the parsed first RPC protocol frame, so that the called end executes the operation corresponding to the operation code.
With regard to the method in the above-described embodiment, the respective steps have been described in detail in the embodiment of the corresponding apparatus, and will not be elaborated herein.
The preferred embodiments of the present disclosure are described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present disclosure within its technical idea, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner; in order to avoid unnecessary repetition, the possible combinations are not described separately in the present disclosure.
In addition, the various embodiments of the present disclosure may be combined in any manner, and such combinations should likewise be regarded as content disclosed by the present disclosure as long as they do not depart from its gist.

Claims (10)

1. A remote procedure call apparatus, the apparatus comprising a memory allocator, a shared memory, an RDMA module, and an RPC component;
the memory allocator is configured to determine, in response to an RPC interface called by a calling end, an allocated memory on a target memory, wherein the target memory is a shared memory to be allocated that is acquired by the shared memory from an operating system, or a registered memory that is acquired by the RDMA module from the operating system and registered to an RDMA network card;
the RPC component is configured to encapsulate an RPC protocol frame corresponding to the RPC interface on the allocated memory, wherein the RPC protocol frame is used to instruct a called end to execute the operation corresponding to the RPC interface.
2. The apparatus of claim 1,
the shared memory is configured to acquire a shared memory to be allocated of a first preset size from an operating system;
the memory allocator is configured to determine, in response to an RPC interface that is called by the calling end and used for same-node communication with the called end, a first allocated memory on the shared memory to be allocated;
and the RPC component is configured to encapsulate a first RPC protocol frame on the first allocated memory and trigger a notification event indicating that frame encapsulation is complete, wherein the notification event is used to instruct the called end to execute the operation corresponding to the RPC interface.
3. The apparatus of claim 1,
the RDMA module is configured to acquire a memory to be registered of a second preset size from an operating system, and register the memory to be registered to the RDMA network card to obtain a registered memory;
the memory allocator is configured to determine, in response to an RPC interface that is called by the calling end and used for cross-node communication with the called end, a second allocated memory on the registered memory;
and the RPC component is configured to encapsulate a second RPC protocol frame on the second allocated memory and send the second RPC protocol frame to the called end, so that the called end executes the operation corresponding to the RPC interface.
4. The apparatus of claim 2,
the shared memory is further configured to map the shared memory to be allocated to a preset virtual address space, where the preset virtual address space is the same as virtual address spaces of the calling end and the called end.
5. The apparatus according to any one of claims 1-3,
the memory allocator is configured to determine, in response to the calling end calling an RPC interface for heap memory allocation, the allocated memory corresponding to the heap structure on the target memory.
6. The apparatus of any of claims 1-3, further comprising a coroutine module;
the coroutine module is configured to initialize a coroutine stack space including a coroutine structure, so as to provide the coroutine structure when the memory allocator performs stack memory allocation;
and the memory allocator is configured to determine, in response to the calling end calling an RPC interface for stack memory allocation, the allocated memory corresponding to the coroutine structure on the target memory.
7. The apparatus of any of claims 1-3, wherein the apparatus further comprises an address negotiation and translation module;
the address negotiation and translation module is configured to determine, when the memory allocator determines the allocated memory, a virtual memory address that does not overlap with the allocated memory of other calling ends.
8. The apparatus of claim 2, further comprising an address negotiation and translation module, wherein the notification event comprises a virtual memory address of the first allocated memory;
the address negotiation and translation module is configured to translate the virtual memory address of the first allocated memory when the calling end and the called end perform same-node communication, so that the RPC component obtains the first RPC protocol frame from the first allocated memory according to the translated virtual memory address.
9. The apparatus of claim 2,
the RPC component is further configured to obtain the first RPC protocol frame from the first allocated memory when the calling end and the called end perform same-node communication, parse the obtained first RPC protocol frame, and determine an operation code corresponding to the RPC interface according to the parsed first RPC protocol frame, so that the called end executes the operation corresponding to the operation code.
10. A remote procedure call method applied to the remote procedure call apparatus according to any one of claims 1 to 9, the method comprising:
in response to an RPC interface called by a calling end, determining, by the memory allocator, an allocated memory on a target memory, wherein the target memory is a shared memory to be allocated that is acquired by the shared memory from an operating system, or a registered memory that is acquired by the RDMA module from the operating system and registered to the RDMA network card;
and encapsulating, by the RPC component, an RPC protocol frame corresponding to the RPC interface on the allocated memory, wherein the RPC protocol frame is used to instruct a called end to execute the operation corresponding to the RPC interface.
CN202211230705.3A 2022-10-09 2022-10-09 Remote procedure calling device and method Pending CN115586980A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211230705.3A CN115586980A (en) 2022-10-09 2022-10-09 Remote procedure calling device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211230705.3A CN115586980A (en) 2022-10-09 2022-10-09 Remote procedure calling device and method

Publications (1)

Publication Number Publication Date
CN115586980A true CN115586980A (en) 2023-01-10

Family

ID=84780586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211230705.3A Pending CN115586980A (en) 2022-10-09 2022-10-09 Remote procedure calling device and method

Country Status (1)

Country Link
CN (1) CN115586980A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105978985A (en) * 2016-06-07 2016-09-28 华中科技大学 Memory management method of user-state RPC over RDMA
CN108021449A (en) * 2017-12-01 2018-05-11 厦门安胜网络科技有限公司 One kind association journey implementation method, terminal device and storage medium
CN110086571A (en) * 2019-04-10 2019-08-02 广州华多网络科技有限公司 A kind of data transmission and received method, apparatus and data processing system
CN113485834A (en) * 2021-07-12 2021-10-08 深圳华锐金融技术股份有限公司 Shared memory management method and device, computer equipment and storage medium
CN114756388A (en) * 2022-03-28 2022-07-15 北京航空航天大学 RDMA (remote direct memory Access) -based method for sharing memory among cluster system nodes as required
CN115113922A (en) * 2022-05-31 2022-09-27 青岛海尔科技有限公司 Method, device, equipment and storage medium for realizing stack-free coroutine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 1022, Floor 10, No. 1, Zhongguancun Street, Haidian District, Beijing 100085

Applicant after: Vita Technology (Beijing) Co.,Ltd.

Address before: 1022, Floor 10, No. 1, Zhongguancun Street, Haidian District, Beijing 100080

Applicant before: Vita Technology (Beijing) Co.,Ltd.