CN113626184A - Hyper-convergence performance optimization method, apparatus, and device - Google Patents

Hyper-convergence performance optimization method, apparatus, and device

Info

Publication number
CN113626184A
CN113626184A · CN202110745161.3A · CN202110745161A
Authority
CN
China
Prior art keywords
iSER
memory
hyper-convergence
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110745161.3A
Other languages
Chinese (zh)
Inventor
马怀旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Inspur Data Technology Co Ltd
Original Assignee
Jinan Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Inspur Data Technology Co Ltd filed Critical Jinan Inspur Data Technology Co Ltd
Priority to CN202110745161.3A priority Critical patent/CN113626184A/en
Publication of CN113626184A publication Critical patent/CN113626184A/en
Pending legal-status Critical Current

Classifications

    • G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being hardware other than CPUs, servers, and terminals, namely the memory
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/546: Interprogram communication using message passing systems or structures, e.g. queues
    • G06F9/547: Interprogram communication using remote procedure calls [RPC]; Web services
    • G06F2209/541: Indexing scheme relating to G06F9/54; Client-server
    • G06F2209/548: Indexing scheme relating to G06F9/54; Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method, an apparatus, and a device for optimizing hyper-convergence performance. In the method, the hyper-converged storage end first provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and a memory address is shared across multiple remote direct memory access (RDMA) connections. The iSER server registers that memory address directly and, when an IO request arrives, sends the content of the memory address to be accessed together with a first read-write key to the remote node, and sends the memory address content together with a second read-write key to the iSER client. The remote node (or the iSER client) then passes the memory address content and the first (or second) read-write key to its smart NIC, which copies the data at the corresponding memory location and returns it to the remote node (or the iSER client) for delivery. In this way, hyper-convergence performance can be reasonably optimized, hyper-convergence latency is reduced, and CPU efficiency is improved.

Description

Hyper-convergence performance optimization method, apparatus, and device
Technical Field
The application relates to the field of computer technology, and in particular to a method, an apparatus, and a device for optimizing hyper-convergence performance.
Background
In the era of explosive information growth, data volumes keep rising; traditional storage is costly and inefficient and cannot keep pace with the growth of user data, which has driven the development of efficient, intelligent distributed storage technology.
Distributed storage has several notable properties: high performance, high reliability, high scalability, transparency, and autonomy. To store data, a distributed system first shards it into blocks, then computes or looks up each block's storage location through a placement algorithm or a metadata service. Because the user's data is split into many blocks, the loss of any single block can make the data unavailable, so a distributed store must adopt a reasonable redundancy model that keeps several redundant copies of each block to guarantee data safety and reliability. Inside a hyper-converged system, replica data is forwarded over the network, and services are likewise provided to the outside over the network, so the network is involved in the entire hyper-convergence workflow. Conventional network transmission requires CPU participation, and as link speeds increase, the resulting CPU load grows high. How to reasonably optimize hyper-convergence performance, so as to accelerate hyper-converged storage and reduce its latency, is therefore an urgent problem.
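The shard-then-replicate scheme described above can be sketched in a few lines. The snippet below is a toy illustration with invented names; real systems use placement algorithms such as hash-based maps or a metadata service, as noted in the text:

```python
import hashlib

def place_replicas(block_id: str, nodes: list[str], copies: int = 3) -> list[str]:
    """Pick `copies` distinct storage nodes for one data block by hashing
    the block id (a stand-in for the placement algorithm or metadata
    service mentioned above)."""
    start = int(hashlib.sha256(block_id.encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

nodes = ["node-a", "node-b", "node-c", "node-d"]
replicas = place_replicas("volume1/block/0042", nodes)
# Three distinct copies guard against the loss of any single block.
assert len(set(replicas)) == 3
```

Hashing makes placement deterministic, so any node can recompute where a block's replicas live without consulting a central directory.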
Disclosure of Invention
The main purpose of the embodiments of the present application is to provide a method, an apparatus, and a device for optimizing hyper-convergence performance, which can reasonably optimize hyper-convergence performance, accelerate hyper-converged storage, reduce hyper-convergence latency, and improve CPU efficiency.
In a first aspect, an embodiment of the present application provides a method for optimizing hyper-convergence performance, including:
the hyper-converged storage end provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple remote direct memory access (RDMA) connections;
the iSER server registers the memory address directly and, when an IO request arrives, sends the content of the memory address to be accessed and a first read-write key to the remote node in the hyper-converged structure, and sends the memory address content and a second read-write key to the iSER client in the hyper-converged structure;
the remote node sends the memory address content and the first read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the remote node for delivery;
the iSER client sends the memory address content and the second read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the iSER client for delivery.
Optionally, the hyper-converged storage is accessed by the client over the iSER protocol, and the iSER server offloads the full hyper-convergence link protocol by means of huge-page memory, a lock-free queue, the RoCE protocol, and a polling mechanism.
Optionally, the huge-page memory is used to avoid page faults; the lock-free queue is used to raise the concurrency of the hyper-converged system; the iSER and RoCE protocol offloads are used to reduce network transmission latency; and the polling mechanism is used to improve event awareness.
Optionally, the iSER client and the iSER server access memory directly through the smart NIC; their IO submission is driven by RDMA, and the iSER server actively accesses the iSER client's data address through the smart NIC and copies it to the locally registered huge-page memory address.
In a second aspect, an embodiment of the present application further provides an apparatus for optimizing hyper-convergence performance, including:
a sharing unit, used by the hyper-converged storage end to provide an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple remote direct memory access (RDMA) connections;
a first sending unit, used by the iSER server to register the memory address directly and, when an IO request arrives, to send the content of the memory address to be accessed and a first read-write key to the remote node in the hyper-converged structure, and to send the memory address content and a second read-write key to the iSER client in the hyper-converged structure;
a second sending unit, used by the remote node to send the memory address content and the first read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the remote node for delivery;
a third sending unit, used by the iSER client to send the memory address content and the second read-write key to its own smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the iSER client for delivery.
Optionally, the hyper-converged storage is accessed by the client over the iSER protocol, and the iSER server offloads the full hyper-convergence link protocol by means of huge-page memory, a lock-free queue, the RoCE protocol, and a polling mechanism.
Optionally, the huge-page memory is used to avoid page faults; the lock-free queue is used to raise the concurrency of the hyper-converged system; the iSER and RoCE protocol offloads are used to reduce network transmission latency; and the polling mechanism is used to improve event awareness.
Optionally, the iSER client and the iSER server access memory directly through the smart NIC; their IO submission is driven by RDMA, and the iSER server actively accesses the iSER client's data address through the smart NIC and copies it to the locally registered huge-page memory address.
An embodiment of the present application further provides a hyper-convergence performance optimization device, including a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to perform any implementation of the hyper-convergence performance optimization method described above.
An embodiment of the present application further provides a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to execute any implementation of the hyper-convergence performance optimization method described above.
In the method, apparatus, and device for optimizing hyper-convergence performance provided by the embodiments of the present application, the hyper-converged storage end first provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple remote direct memory access (RDMA) connections. The iSER server then registers the memory address directly and, when an IO request arrives, sends the content of the memory address to be accessed and a first read-write key to the remote node in the hyper-converged structure, and sends the memory address content and a second read-write key to the iSER client in the hyper-converged structure. The remote node then sends the memory address content and the first read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the remote node for delivery; likewise, the iSER client sends the memory address content and the second read-write key to its own smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the iSER client for delivery. In this way, hyper-convergence performance can be reasonably optimized, hyper-converged storage is accelerated, hyper-convergence latency is reduced, and CPU efficiency is improved.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a hyper-convergence performance optimization method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a physical architecture for hyper-convergence performance optimization according to an embodiment of the present application;
Fig. 3 is an overall schematic diagram of hyper-convergence performance optimization according to an embodiment of the present application;
Fig. 4 is a schematic composition diagram of a hyper-convergence performance optimization apparatus according to an embodiment of the present application.
Detailed Description
At present, replica data inside a hyper-converged system is forwarded over the network, and services are provided to the outside over the network as well, so the network is involved in the entire hyper-convergence workflow. Conventional network transmission requires CPU participation, and as link speeds increase, the resulting CPU load grows high. How to reasonably optimize hyper-convergence performance, so as to accelerate hyper-converged storage and reduce its latency, is therefore an urgent problem.
To remedy these defects, an embodiment of the present application provides a hyper-convergence performance optimization method. The hyper-converged storage end provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple RDMA connections. The iSER server then registers the memory address directly and, when an IO request arrives, sends the content of the memory address to be accessed and a first read-write key to the remote node in the hyper-converged structure, and sends the memory address content and a second read-write key to the iSER client in the hyper-converged structure. The remote node then sends the memory address content and the first read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the remote node for delivery; likewise, the iSER client sends the memory address content and the second read-write key to its own smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the iSER client for delivery. In this way, hyper-convergence performance can be reasonably optimized, hyper-converged storage is accelerated, hyper-convergence latency is reduced, and CPU efficiency is improved.
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art from the given embodiments without creative effort fall within the protection scope of the present application.
First embodiment
Referring to Fig. 1, a schematic flow chart of the hyper-convergence performance optimization method provided in this embodiment is shown; the method includes the following steps:
S101: the hyper-converged storage end provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple remote direct memory access (RDMA) connections.
In this embodiment, both the iSER protocol offload and the RoCE protocol offload in fact rely on the smart NIC for memory registration: when data is read, written, sent, or received, only the address is passed to the smart NIC, and the smart NIC copies the data actively. The hyper-converged storage end therefore provides the iSER server with huge-page memory and shared memory, so that memory is shared inside the system and the memory address is shared across multiple RDMA connections. The huge-page memory is used for NIC registration; that is, direct memory-address access to data inside a node is achieved through huge-page memory sharing, which avoids the page faults that would otherwise occur during use and degrade the performance of the hyper-converged infrastructure (HCI).
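The idea of registering one huge-page-backed buffer once and then sharing its address and key across many RDMA connections can be sketched as follows. This is an illustrative model with invented names (`MemoryRegistry`, `register`), not a real verbs API:

```python
import secrets

class MemoryRegistry:
    """Toy model of RDMA memory registration: one hugepage-backed buffer
    is registered once, then its (address, key) pair is reused by every
    connection instead of re-registering per connection."""
    def __init__(self):
        self.regions = {}            # addr -> (buffer, rkey)

    def register(self, buf: bytearray) -> tuple[int, int]:
        addr = id(buf)               # stand-in for the buffer's address
        rkey = secrets.randbits(32)  # stand-in for the key a NIC would issue
        self.regions[addr] = (buf, rkey)
        return addr, rkey

buf = bytearray(2 * 1024 * 1024)     # pretend this is one 2 MiB huge page
reg = MemoryRegistry()
addr, rkey = reg.register(buf)
# One registration, shared by several connections.
connections = [{"id": c, "addr": addr, "rkey": rkey} for c in range(4)]
assert all(c["addr"] == addr for c in connections)
```

Registering once and sharing the pair avoids repeating the (expensive) pin-and-register step for every connection, which is the point of the shared-memory design above.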
S102: the iser server end uses the memory address to directly register, and sends the memory address content needing IO and the read-write first key value to the remote node remote in the super-fusion structure when an IO request exists; and when an IO request exists, the memory address content needing IO and the read-write second key value are sent to an iser Client terminal in the super-fusion structure.
S103: the remote node sends the memory address content and the read-write first key value to the intelligent network card of the remote node, so that the intelligent network card returns the data to the remote node for issuing after copying the data at the corresponding memory position.
S104: the iser Client sends the memory address content and the read-write second key value to the intelligent network card of the iser Client, so that the intelligent network card copies data at the corresponding memory position and returns the data to the iser Client for issuing.
In this embodiment, as shown in Fig. 2, when the iSER server issues an IO request it notifies the smart NIC of its own memory address; once the remote smart NIC has prepared an address, the data is copied between the smart NICs directly into the iSER server's memory address.
Meanwhile, the iSER server transmits IO in slices: the IO requests and the data-segment contents are carried over the RoCE protocol. Because a memory address can be registered with the NIC multiple times, the iSER server's memory address is registered directly; part of the protocol's transmitted messages use a private memory address registered with the NIC, while the data segments directly use the data-segment addresses received by the iSER server as the copy targets for transmission.
On this basis, after receiving the request, the node holding the HCI replica directly issues a read or write request, directly accesses the memory address registered at the remote end, copies the replica data into local memory, and issues the IO directly. After the replica data has been written, a write request is used to signal that delivery of the replica data is complete and the write has succeeded. The remote end polls and, once the task finishes, notifies the iSER client that the request has been processed.
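The smart-NIC behaviour described above (copy the data at a given memory location once the key checks out, then report completion) can be modelled roughly as below. `SmartNic` and `rdma_read` are invented names for this sketch, not real hardware or driver calls:

```python
class SmartNic:
    """Toy one-sided read: the 'NIC' validates the key and copies the
    bytes itself, so neither side's CPU touches the payload."""
    def __init__(self, regions):
        self.regions = regions       # rkey -> bytearray of registered memory

    def rdma_read(self, rkey, offset, length, local_buf):
        remote = self.regions.get(rkey)
        if remote is None:
            raise PermissionError("unknown rkey")  # NIC rejects bad keys
        local_buf[:length] = remote[offset:offset + length]
        return length                               # completion: bytes copied

replica_mem = bytearray(b"copy-of-user-data-block!")
nic = SmartNic({0xABCD: replica_mem})
local = bytearray(16)
done = nic.rdma_read(0xABCD, 0, 16, local)
assert bytes(local) == b"copy-of-user-dat" and done == 16
```

The key check mirrors how a real NIC uses the read-write key to gate access to registered memory; an unknown key means the copy is refused rather than performed.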
Thus, in this embodiment of the application, the hyper-converged storage is accessed by the client over the iSER protocol, and the iSER server offloads the full hyper-convergence link protocol by means of huge-page memory, a lock-free queue, the RoCE protocol, and a polling mechanism. The huge-page memory is used to avoid page faults; the lock-free queue is used to raise the concurrency of the hyper-converged system; the iSER and RoCE protocol offloads are used to reduce network transmission latency; and the polling mechanism is used to improve event awareness, which in turn improves the system's CPU efficiency.
Specifically, the iSER client and the iSER server access memory directly through the smart NIC; their IO submission is driven by RDMA, and the iSER server actively accesses the iSER client's data address through the smart NIC and copies it to the locally registered huge-page memory address.
Distributed storage generally keeps data in multiple replicas to provide redundancy and avoid single points of failure, and it also shards the data. After the iSER server receives a data shard, it notifies the remote node's smart NIC to copy the data according to the position of each shard; once the remote node finishes processing, it writes the result back to the iSER server's memory address through direct memory access. This direct memory-operation mode reduces CPU-bound operations such as send/recv, lowers CPU involvement, and replaces blocking waits with non-blocking polling of memory, reducing service wait time.
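The lock-free queue used to raise concurrency is typically a single-producer/single-consumer ring in which each index is written by only one side, so no lock is needed. A minimal sketch (illustrative, not the patent's actual implementation):

```python
class SpscRing:
    """Single-producer/single-consumer ring buffer: `tail` is written only
    by the producer and `head` only by the consumer, so neither side needs
    a lock. One slot is kept empty to distinguish full from empty."""
    def __init__(self, capacity: int):
        self.buf = [None] * capacity
        self.head = 0   # consumer-owned index
        self.tail = 0   # producer-owned index

    def push(self, item) -> bool:
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False          # full
        self.buf[self.tail] = item
        self.tail = nxt
        return True

    def pop(self):
        if self.head == self.tail:
            return None           # empty
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return item

q = SpscRing(4)
assert q.push("io-1") and q.push("io-2")
assert q.pop() == "io-1"
```

Because ownership of each index is exclusive, IO submissions can flow from the request path to the completion path without lock contention, which is what lets the queue scale with concurrency.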
In addition, in this embodiment, polling occupies one CPU core exclusively: it detects in time that a network request has arrived and that the memory state bit has been set, and it handles short, non-time-consuming requests directly without a CPU context switch, achieving timely responses. Memory access speed can be further improved through NUMA affinity awareness.
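Busy-polling a completion bit in memory, instead of blocking in the kernel, can be illustrated as follows. This is a toy sketch in which a Python thread stands in for the NIC writing the state bit:

```python
import threading
import time

done = bytearray(1)          # the "memory state bit" a NIC would set

def completer():
    """Stand-in for the NIC: finish some work, then flip the bit."""
    time.sleep(0.01)
    done[0] = 1              # completion lands directly in memory

threading.Thread(target=completer).start()

spins = 0
while done[0] == 0:          # poll the bit; no blocking syscall, no
    spins += 1               # context switch on the polling core
assert done[0] == 1
```

The trade-off is explicit: one core is burned spinning, but completions are noticed as soon as the bit flips, without the latency of waking a sleeping thread.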
Thus, for the physical architecture of hyper-convergence performance optimization shown in Fig. 2, executing steps S101 to S104 realizes the overall hyper-convergence optimization flow shown in Fig. 3: it provides full-link protocol offload, ensures that data on the IO path is copied without interrupts, accelerates IO access, reduces IO access latency, and improves storage performance in distributed-storage virtualization scenarios.
In summary, in the hyper-convergence performance optimization method provided by this embodiment, the hyper-converged storage end provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple RDMA connections. The iSER server then registers the memory address directly and, when an IO request arrives, sends the content of the memory address to be accessed and a first read-write key to the remote node in the hyper-converged structure, and sends the memory address content and a second read-write key to the iSER client in the hyper-converged structure. The remote node then sends the memory address content and the first read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the remote node for delivery; likewise, the iSER client sends the memory address content and the second read-write key to its own smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the iSER client for delivery. In this way, hyper-convergence performance can be reasonably optimized, hyper-converged storage is accelerated, hyper-convergence latency is reduced, and CPU efficiency is improved.
Second embodiment
This embodiment describes a hyper-convergence performance optimization apparatus; for related content, please refer to the method embodiment above.
Referring to Fig. 4, a schematic composition diagram of the hyper-convergence performance optimization apparatus provided in this embodiment is shown; the apparatus includes:
the sharing unit 401 is configured to provide the iser server side with the super fusion storage side to perform large-page memory storage and shared memory storage, so as to share the memory inside the system, and share the memory address in multiple remote direct data access RDMA connections;
a first sending unit 402, configured to perform direct registration by using the memory address at the iser server, and send memory address content and a read-write first key value that need IO to be sent to a remote node remote in a super-fusion structure when an IO request is made; when an IO request exists, the memory address content needing IO and the read-write second key value are sent to an iser Client terminal in the super-fusion structure;
a second sending unit 403, configured to send, by the remote node, the memory address content and the read-write first key value to an intelligent network card of the remote node, so that after the intelligent network card performs data copy at a corresponding memory location, the data is returned to the remote node for issuing;
and a third sending unit 404, configured to send, by the iser Client, the memory address content and the read-write second key value to an intelligent network card of the third sending unit, so that after the intelligent network card performs data copy at a corresponding memory location, the intelligent network card returns the data to the iser Client for issuing.
In an implementation manner of this embodiment, the hyper-fusion storage is accessed to the Client terminal by using an iSER protocol; the iser server side realizes the unloading of the super-fusion full link protocol by using a large memory page, a locking-free queue, a RoCE protocol and a polling mechanism.
In an implementation manner of this embodiment, the large-page memory is used to avoid page-missing interruption; the lock-free queue is used for improving the concurrency capability of the super-fusion; the iser protocol offload and the RoCE protocol offload are used for reducing the time delay of network transmission; the polling mechanism is used to improve event awareness.
In an implementation manner of this embodiment, the iser client and the iser server directly perform memory access through an intelligent network card; the IO issuing of the iser client and the iser server is driven by RDMA, and the iser server actively accesses the data address of the iser client and copies the data address to the local large registered memory page address through the intelligent network card.
In summary, with the hyper-convergence performance optimization apparatus provided by this embodiment, the hyper-converged storage end provides an iSER server that uses huge-page memory and shared memory, so that memory is shared within the system and the memory address is shared across multiple RDMA connections. The iSER server then registers the memory address directly and, when an IO request arrives, sends the content of the memory address to be accessed and a first read-write key to the remote node in the hyper-converged structure, and sends the memory address content and a second read-write key to the iSER client in the hyper-converged structure. The remote node then sends the memory address content and the first read-write key to its smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the remote node for delivery; likewise, the iSER client sends the memory address content and the second read-write key to its own smart NIC, so that the smart NIC copies the data at the corresponding memory location and returns it to the iSER client for delivery. In this way, hyper-convergence performance can be reasonably optimized, hyper-converged storage is accelerated, hyper-convergence latency is reduced, and CPU efficiency is improved.
Further, an embodiment of the present application further provides a super-fusion performance optimization device, including: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform any of the above-described super-fusion performance optimization methods.
Further, an embodiment of the present application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a terminal device, the terminal device is caused to execute any implementation of the above super-fusion performance optimization method.
From the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the methods of the above embodiments can be implemented by software plus a necessary general hardware platform. Based on this understanding, the technical solution of the present application may, in essence or in part, be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway) to execute the methods described in the embodiments or in some parts of the embodiments of the present application.
It should be noted that, in the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same and similar parts among the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and reference may be made to the method description for the relevant details.
It is further noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A super-fusion performance optimization method, characterized by comprising the following steps:
the super-fusion storage end provides the iser server end with large-page memory and shared memory, so that memory is shared inside the system and the memory address is shared across a plurality of remote direct memory access (RDMA) connections;
the iser server end performs direct registration using the memory address and, when an IO request exists, sends the content of the memory address requiring IO and a first read-write key value to the remote node in the super-fusion structure, and sends the content of the memory address requiring IO and a second read-write key value to the iser Client in the super-fusion structure;
the remote node sends the memory address content and the first read-write key value to an intelligent network card of the remote node, so that the intelligent network card copies the data at the corresponding memory location and returns the data to the remote node for issuing; and
the iser Client sends the memory address content and the second read-write key value to an intelligent network card of the iser Client, so that the intelligent network card copies the data at the corresponding memory location and returns the data to the iser Client for issuing.
2. The method of claim 1, wherein the super-fusion storage is accessed on the Client side by using the iSER protocol; and the iser server end realizes super-fusion full-link protocol offload by using large memory pages, a lock-free queue, the RoCE protocol, and a polling mechanism.
3. The method of claim 2, wherein the large-page memory is used to avoid page-fault interrupts; the lock-free queue is used to improve the concurrency capability of the super-fusion system; the iSER protocol offload and the RoCE protocol offload are used to reduce network transmission latency; and the polling mechanism is used to improve event awareness.
4. The method according to claim 1, wherein the iser client and the iser server perform direct memory access through an intelligent network card; IO issuing by the iser client and the iser server is driven by RDMA, and the iser server actively accesses the data address of the iser client and copies the data, through the intelligent network card, to the address of its locally registered large memory pages.
5. A super-fusion performance optimization apparatus, comprising:
a sharing unit, configured for the super-fusion storage end to provide the iser server end with large-page memory and shared memory, so that memory is shared inside the system and the memory address is shared across a plurality of remote direct memory access (RDMA) connections;
a first sending unit, configured for the iser server end to perform direct registration using the memory address and, when an IO request exists, send the content of the memory address requiring IO and a first read-write key value to the remote node in the super-fusion structure, and send the content of the memory address requiring IO and a second read-write key value to the iser Client in the super-fusion structure;
a second sending unit, configured for the remote node to send the memory address content and the first read-write key value to an intelligent network card of the remote node, so that the intelligent network card copies the data at the corresponding memory location and returns the data to the remote node for issuing; and
a third sending unit, configured for the iser Client to send the memory address content and the second read-write key value to an intelligent network card of the iser Client, so that the intelligent network card copies the data at the corresponding memory location and returns the data to the iser Client for issuing.
6. The apparatus of claim 5, wherein the super-fusion storage is accessed by using the iSER protocol; and the iser server end realizes super-fusion full-link protocol offload by using large memory pages, a lock-free queue, the RoCE protocol, and a polling mechanism.
7. The apparatus of claim 6, wherein the large-page memory is configured to avoid page-fault interrupts; the lock-free queue is configured to improve the concurrency capability of the super-fusion system; the iSER protocol offload and the RoCE protocol offload are configured to reduce network transmission latency; and the polling mechanism is configured to improve event awareness.
8. The apparatus of claim 5, wherein the iser client and the iser server perform direct memory access through an intelligent network card; IO issuing by the iser client and the iser server is driven by RDMA, and the iser server actively accesses the data address of the iser client and copies the data, through the intelligent network card, to the address of its locally registered large memory pages.
9. A super-fusion performance optimization device, comprising: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method of any one of claims 1-4.
10. A computer-readable storage medium having instructions stored therein which, when run on a terminal device, cause the terminal device to perform the method of any one of claims 1-4.
CN202110745161.3A 2021-06-30 2021-06-30 Super-fusion performance optimization method, device and equipment Pending CN113626184A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110745161.3A CN113626184A (en) 2021-06-30 2021-06-30 Super-fusion performance optimization method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110745161.3A CN113626184A (en) 2021-06-30 2021-06-30 Super-fusion performance optimization method, device and equipment

Publications (1)

Publication Number Publication Date
CN113626184A true CN113626184A (en) 2021-11-09

Family

ID=78378864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110745161.3A Pending CN113626184A (en) 2021-06-30 2021-06-30 Super-fusion performance optimization method, device and equipment

Country Status (1)

Country Link
CN (1) CN113626184A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023116141A1 (en) * 2021-12-21 2023-06-29 阿里巴巴(中国)有限公司 Data processing method, system and device, and medium
CN114780465A (en) * 2022-03-01 2022-07-22 阿里巴巴(中国)有限公司 Method and device for creating sharable remote direct data access link
CN114780465B (en) * 2022-03-01 2024-04-16 阿里巴巴(中国)有限公司 Creation method and device for sharable remote direct data access link
CN117215995A (en) * 2023-11-08 2023-12-12 苏州元脑智能科技有限公司 Remote direct memory access method, distributed storage system and electronic equipment
CN117215995B (en) * 2023-11-08 2024-02-06 苏州元脑智能科技有限公司 Remote direct memory access method, distributed storage system and electronic equipment
CN117573043A (en) * 2024-01-17 2024-02-20 济南浪潮数据技术有限公司 Transmission method, device, system, equipment and medium for distributed storage data

Similar Documents

Publication Publication Date Title
CN113626184A (en) Super-fusion performance optimization method, device and equipment
US20170193416A1 (en) Reducing costs related to use of networks based on pricing heterogeneity
WO2019141186A1 (en) Data processing method and device
CN112597251B (en) Database cluster log synchronization method and device, server and storage medium
CN108989432B (en) User-mode file sending method, user-mode file receiving method and user-mode file receiving and sending device
WO2014047269A2 (en) System and method for small batching processing of usage requests
CN112988680B (en) Data acceleration method, cache unit, electronic device and storage medium
US9910808B2 (en) Reflective memory bridge for external computing nodes
CN111404931A (en) Remote data transmission method based on persistent memory
CN113010549A (en) Data processing method based on remote multi-active system, related equipment and storage medium
CN113703672A (en) Super-fusion system, IO request issuing method thereof and physical server
JP4208506B2 (en) High-performance storage device access environment
WO2019000423A1 (en) Data storage method and device
CN113411363A (en) Uploading method of image file, related equipment and computer storage medium
CN112052104A (en) Message queue management method based on multi-computer-room realization and electronic equipment
CN111770054A (en) Interaction acceleration method and system for SMB protocol read request
JPH07239808A (en) Distributed data managing system
CN116302605A (en) Message engine-based message transmission method
US10565004B2 (en) Interrupt and message generation independent of status register content
CN113434290A (en) Data processing method and device based on RAFT protocol, and computer storage medium
CN107615259A (en) A kind of data processing method and system
CN116601616A (en) Data processing device, method and related equipment
US12001450B2 (en) Distributed table storage processing method, device and system
US11914865B2 (en) Methods and systems for limiting data traffic while processing computer system operations
US20070277019A1 (en) Communication interface device and communication method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination