CN113595807B - Computer system, RDMA network card and data communication method - Google Patents


Info

Publication number
CN113595807B
Authority
CN
China
Prior art keywords
data
api
request
network card
accelerator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111146457.XA
Other languages
Chinese (zh)
Other versions
CN113595807A (en)
Inventor
杨航 (Yang Hang)
方兴 (Fang Xing)
李金虎 (Li Jinhu)
Current Assignee
Alibaba Cloud Computing Ltd
Original Assignee
Alibaba Cloud Computing Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Cloud Computing Ltd
Priority to CN202111146457.XA
Publication of CN113595807A
Application granted
Publication of CN113595807B
Legal status: Active
Anticipated expiration


Classifications

    • H: Electricity
    • H04: Electric communication technique
    • H04L: Transmission of digital information, e.g. telegraphic communication
    • H04L 41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08: Configuration management of networks or network elements
    • H04L 41/0803: Configuration setting
    • H04L 41/0823: Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H04L 41/083: Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability, for increasing network speed
    • G: Physics
    • G06: Computing; calculating or counting
    • G06F: Electric digital data processing
    • G06F 13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F 13/14: Handling requests for interconnection or transfer
    • G06F 13/20: Handling requests for interconnection or transfer for access to input/output bus
    • G06F 13/28: Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • H: Electricity
    • H04: Electric communication technique
    • H04L: Transmission of digital information, e.g. telegraphic communication
    • H04L 12/00: Data switching networks
    • H04L 12/02: Details


Abstract

An embodiment of the present application provides a computer system, an RDMA network card, and a data communication method. The computer system includes a first RDMA network card and a first CPU; at least one accelerator is integrated in the first RDMA network card, and a first application developed based on at least one API runs on the first CPU. The first application calls the at least one API to add a work request, generated based on the first acceleration operator pair corresponding to a data processing requirement, to a work queue. The first RDMA network card obtains the work request from the work queue, invokes the accelerator indicated by the first acceleration operator pair to process the first data to be processed, obtains first target data, and performs remote access processing on the first target data. The embodiment of the present application reduces data communication latency.

Description

Computer system, RDMA network card and data communication method
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to a computer system, an RDMA network card, and a data communication method.
Background
RDMA (Remote Direct Memory Access) is a technique for direct remote memory access: the two communicating parties can transfer data directly and rapidly through the RDMA network cards in their hosts, without operating-system involvement. This reduces the consumption of the central processing unit (CPU) in the data transfer process, so RDMA network access features high bandwidth, low latency, and low CPU utilization. It is widely used in distributed applications and cluster scenarios.
In practice, when two parties communicate data over RDMA, corresponding data processing operations on the data, such as data recognition, filtering, sorting, or encryption and decryption, may be involved.
Disclosure of Invention
An embodiment of the present application provides a computer system, an RDMA (Remote Direct Memory Access) network card, and a data communication method, which are intended to solve the technical problem of RDMA data communication latency in the prior art.
In a first aspect, an embodiment of the present application provides a computer system, including a first remote direct memory access (RDMA) network card and a first central processing unit (CPU); at least one accelerator is integrated in the first RDMA network card, and a first application developed based on at least one preconfigured application programming interface (API) runs on the first CPU;
the first application is configured to call the at least one API to add a work request, generated based on the first acceleration operator pair corresponding to a data processing requirement, to a work queue;
the first RDMA network card is configured to obtain the work request from the work queue, determine the first acceleration operator pair, invoke the first target accelerator indicated by the first acceleration operator pair to process the first data to be processed, obtain first target data, and perform remote access processing on the first target data.
Optionally, the first application is configured to call the at least one API to query the acceleration operators supported by the first RDMA network card; determine the first acceleration operator pair according to a data processing requirement and generate a work request based on the first acceleration operator pair; and call the at least one API to add the work request to a work queue.
Optionally, the work request includes a data sending request or a data writing request;
the first RDMA network card invoking the first target accelerator indicated by the first acceleration operator pair to process the first data to be processed, obtaining first target data, and performing remote access processing on the first target data includes: obtaining the first data to be processed from memory; invoking the first target accelerator indicated by the first acceleration operator pair to process the first data to be processed, to obtain the first target data; generating a message request based on the first target data and the first acceleration operator pair; and sending the message request to a destination end. The destination end is configured to invoke the second target accelerator indicated by the first acceleration operator pair to process the first target data and to store the resulting processed data.
Optionally, the work request includes a data read request;
the first RDMA network card invoking the first target accelerator indicated by the first acceleration operator pair to process the first data to be processed, obtaining first target data, and performing remote access processing on the first target data includes: generating a message request based on the first acceleration operator pair; sending the message request to a destination end; reading the first data to be processed from the destination end; invoking the first target accelerator indicated by the first acceleration operator pair to process the first data to be processed, to obtain the first target data; and storing or transmitting the first target data. The first data to be processed is obtained by the destination end invoking, in response to the message request, the second target accelerator indicated by the first acceleration operator pair to process the data read by the request.
Optionally, the at least one API includes at least one resource pass-through API and at least one request-processing API;
specifically, the first application calls a corresponding resource pass-through API to obtain the list of acceleration operator pairs in the first RDMA network card; selects the first acceleration operator pair according to the data processing requirement, and calls a corresponding resource pass-through API to obtain the parameter list required by the first acceleration operator pair; determines the parameter data of the first acceleration operator pair according to that parameter list; generates a work request based on the first acceleration operator pair and the parameter data; and calls a corresponding request-processing API to add the work request to a work queue.
Optionally, the at least one resource pass-through API includes one or more of a first resource pass-through API for obtaining the list of acceleration operator pairs of the first RDMA network card, and a second resource pass-through API for obtaining the parameter list of any acceleration operator pair from the first RDMA network card; the at least one request-processing API includes one or more of a first request-processing API for adding, to the work queue, a work request for data sending based on any acceleration operator pair, and a second request-processing API for adding, to the work queue, a work request for data receiving based on any acceleration operator pair.
In a second aspect, an embodiment of the present application provides a computer system, where the computer system includes a second RDMA network card and a second CPU; at least one accelerator is integrated in the second RDMA network card, and a second application developed based on the preconfigured at least one API runs on the second CPU;
the second RDMA network card is configured to receive a message request sent by a source end, invoke the second target accelerator to process the second data to be processed based on the first acceleration operator pair specified in the message request, obtain second target data, and perform remote access processing on the second target data.
Optionally, the second application is configured to call the at least one API to add a data receiving request, generated based on the second acceleration operator pair corresponding to a data processing requirement, to a work queue;
the second RDMA network card is further configured to obtain the data receiving request from the work queue and determine the second acceleration operator pair based on the data receiving request; and to receive the first target data sent by the source end, invoke the second target accelerator indicated by the second acceleration operator pair to process the first target data, obtain processing result data, and store the processing result data.
Optionally, the second application is further configured to call the at least one API to add a work request, generated based on a third acceleration operator pair corresponding to a data processing requirement, to a work queue;
the second RDMA network card is further configured to obtain the work request from the work queue, determine the third acceleration operator pair, invoke the first target accelerator indicated by the third acceleration operator pair to process the third data to be processed, obtain third target data, and perform remote access processing on the third target data.
In a third aspect, an embodiment of the present application provides an RDMA network card, where the RDMA network card is configured in a computer system and includes an RDMA network card unit and at least one accelerator;
the RDMA network card unit is configured to obtain a work request from a work queue, invoke the corresponding first target accelerator, based on the acceleration operator pair indicated by the work request, to process the first data to be processed, obtain first target data, and perform remote access processing on the first target data; or to receive a message request sent by a source end, invoke the corresponding second target accelerator, based on the acceleration operator pair specified in the message request, to process the second data to be processed, obtain second target data, and perform remote access processing on the second target data;
where the work request is generated by a first application based on the acceleration operator pair corresponding to a data processing requirement and is added to the work queue by calling at least one preconfigured API.
In a fourth aspect, an embodiment of the present application provides a data communication method, where the data communication method is applied to a first computer system, where the first computer system includes a first RDMA network card and a first CPU; the first RDMA network card integrates at least one accelerator; a first application developed based on the preconfigured at least one API runs in the first CPU;
the method comprises the following steps:
calling the at least one API to add a work request, generated based on a first acceleration operator pair corresponding to a data processing requirement, to a work queue, so that the first RDMA network card obtains the work request from the work queue, invokes the first target accelerator indicated by the first acceleration operator pair to process the first data to be processed to obtain first target data, and performs remote access processing on the first target data.
In a fifth aspect, an embodiment of the present application provides a data communication method, where the data communication method is applied to a first computer system, where the first computer system includes a first RDMA network card and a first CPU; the first RDMA network card integrates at least one accelerator; a first application developed based on the preconfigured at least one API runs in the first CPU;
the method comprises the following steps:
acquiring a work request from a work queue;
determining a target acceleration operator pair indicated by the work request;
invoking the first target accelerator indicated by the target acceleration operator pair to process the first data to be processed, to obtain first target data;
and performing remote access processing on the first target data.
In a sixth aspect, an embodiment of the present application provides a data communication method, where the data communication method is applied to a second computer system, where the second computer system includes a second RDMA network card and a second CPU; the second RDMA network card integrates at least one accelerator, and a second application developed based on the preconfigured at least one API runs in the second CPU;
the method comprises the following steps:
receiving a message request sent by a source end;
determining a first acceleration operator pair indicated in the message request;
invoking the second target accelerator indicated by the first acceleration operator pair to process the second data to be processed, to obtain second target data;
and performing remote access processing on the second target data.
In a seventh aspect, an embodiment of the present application provides a computer storage medium, which stores a computer program, and the computer program, when executed by a computer, implements the data communication method according to the fourth, fifth or sixth aspect.
The computer system provided by the embodiment of the present application includes a first RDMA network card and a first CPU, where at least one accelerator is integrated in the first RDMA network card and the first CPU can run a first application developed based on at least one API. The first application calls the at least one API to add a work request, generated based on the first acceleration operator pair corresponding to a data processing requirement, to a work queue; the first RDMA network card consumes the work request from the work queue, and once it has the work request it can determine the first acceleration operator pair, invoke the first target accelerator indicated by the pair to process the first data to be processed, obtain first target data, and then perform remote access processing on the first target data. In the embodiment of the present application, at least one accelerator is integrated in the RDMA network card, and data processing and network transmission are combined inside the RDMA network card through the at least one preconfigured API. Data processing operations can therefore rely on the acceleration capability of the RDMA network card and be offloaded from the CPU, which reduces CPU resource consumption, achieves inline data acceleration, and further reduces data communication latency on top of the already low communication latency of RDMA.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram illustrating an embodiment of a data communication system provided by the present application;
FIG. 2 is a diagram illustrating a data communication process in a practical application of an embodiment of the present application;
FIG. 3 is a schematic structural diagram illustrating an embodiment of a computer system provided by the present application;
FIG. 4 is a schematic structural diagram illustrating a further embodiment of a computer system provided by the present application;
FIG. 5 is a schematic diagram illustrating scene interaction in a practical application of an embodiment of the present application;
FIG. 6 is a schematic structural diagram illustrating an embodiment of an RDMA network card provided by the present application;
FIG. 7 is a flowchart illustrating an embodiment of a data communication method provided by the present application;
FIG. 8 is a flowchart illustrating a further embodiment of a data communication method provided by the present application;
FIG. 9 is a flowchart illustrating yet another embodiment of a data communication method provided by the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Some of the flows described in the specification and claims of the present application and in the drawings above contain operations that appear in a particular order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 are merely used to distinguish different operations; the numbers themselves do not imply any execution order. In addition, these flows may include more or fewer operations, and these operations may be executed sequentially or in parallel. Note that the terms "first", "second", and so on herein are used to distinguish different messages, devices, modules, etc.; they do not imply a sequence, nor do they require that the things labelled "first" and "second" be of different types.
The technical scheme of the embodiment of the application is suitable for a scene of data communication by using an RDMA (Remote Direct Memory Access) technology, such as a scene of data communication between distributed applications or cluster systems by using the RDMA technology. The RDMA resources provided by the embodiments of the present application may be provided by a cloud computing platform.
In order to facilitate understanding of the technical solutions of the present application, the following first explains technical terms that may be involved in the present application:
RDMA: according to the communication standard, two communication parties can directly access data from a memory by means of an RDMA network card and directly transmit the data without participation of a CPU. Both communicating parties may refer to upper layer applications in different computer systems, also referred to as RDMA applications, which run on the CPU.
Verbs: the transport interface layer (software transport interface) between RDMA applications and RDMA network cards, defined by the IBTA (InfiniBand Trade Association); it defines the actions performed to access an RDMA network card.
Verbs API (Application Programming Interface): the basic API for using RDMA services, defined by the OFA (OpenFabrics Alliance, an open-source organization); it is the functional interface for operating RDMA network cards. RDMA applications are written directly against this set of Verbs APIs, or against the interfaces that various middleware layers encapsulate on top of the Verbs APIs.
Channel-IO: the data channel created when the two communicating parties perform data communication.
WR (Work Request): describes the content of the communication between an RDMA application and its peer; when an RDMA application needs to communicate data with the peer, it creates a work request.
WQ (Work Queue): includes the SQ (Send Queue), the RQ (Receive Queue), and so on. RDMA provides queue-based point-to-point communication: a work request created by an RDMA application is added to a work queue and converted into a WQE (Work Queue Element), which points to a memory buffer that holds the data. The RDMA network card consumes WQEs from the WQ, i.e., obtains WRs and executes them: it stores data sent by the peer into the buffer pointed to by the WQE, or takes the data from that buffer and sends it to the peer.
CQ (Completion Queue): when a WR finishes executing, the RDMA network card generates a WC (Work Completion) and adds it to the CQ; the RDMA application consumes WCs from the CQ to determine the execution results of its work requests.
send/receive: a data transfer mode provided by RDMA. It is a two-sided operation: data transfer completes only with the awareness and participation of the destination-end application (the communication receiver). In this transfer mode, a WR may be a data sending request or a data receiving request.
write: a data transfer mode provided by RDMA. It is a one-sided operation: only the source end (the communication initiator) needs to specify the source and destination addresses, and the destination-end application does not need to be aware of the communication; in essence, data in the source-end memory is pushed into the destination-end memory. In this transfer mode, a WR may be a data writing request.
read: a data transfer mode provided by RDMA. It is a one-sided operation: only the source-end application needs to specify the source and destination addresses, and the destination-end application does not need to be aware of the communication; in essence, data in the destination-end memory is pulled back into the local memory. In this transfer mode, a WR may be a data read request.
Distributed application: an application whose components are distributed on different computer systems and jointly complete a task through the network; its different components run in separate runtime environments.
In the traditional RDMA data communication mode, if a data processing operation is involved, the RDMA network card still has to hand the data to the CPU for processing, and only then perform remote access processing on the CPU-processed data, such as transmitting it to the peer or storing it in local memory. The required CPU participation not only consumes CPU resources but also introduces data communication delay. To reduce this delay, the inventors arrived, through a series of studies, at the technical solution of the present application: an accelerator is integrated in the RDMA network card, and through at least one preconfigured API the integrated accelerator performs data acceleration during RDMA communication. Data processing operations are thus combined with data communication, the CPU load in the data communication process is offloaded, data processing can be completed without CPU participation, and data communication latency is further reduced on top of the already low communication latency of RDMA.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solution of the embodiment of the present application may be applied to a data communication system as shown in FIG. 1; the data communication system may include a first computer system 10 and a second computer system 20. In this embodiment, the technical solution of the present application is described mainly with the first computer system 10 as the source-end device and the second computer system 20 as the destination-end device.
The first computer system 10 may include a first RDMA network card 101 and a first CPU 102. The first RDMA network card (RDMA Network Interface Controller, RNIC) 101 may integrate at least one accelerator 30 (only one accelerator is shown in the figure by way of example); the first CPU 102 runs a first application 103 developed based on at least one API, where the at least one API may be preconfigured in an API function library.
The second computer system 20 may include a second RDMA network card 201 and a second CPU 202. The second RDMA network card 201 may integrate at least one accelerator 30 (only one accelerator is shown in the figure by way of example); the second CPU 202 runs a second application 203 developed based on the at least one API, where the at least one API may be preconfigured in the API function library.
The first computer system 10 or the second computer system 20 may be an elastic computing system provided by a physical device or a cloud computing platform, and in this case, the RDMA network card, the CPU, and the like may refer to a basic server resource provided by the cloud computing platform.
In the first computer system 10 or the second computer system 20, the RDMA network card and the CPU may be respectively disposed in different computer devices.
Of course, it can be understood by those skilled in the art that the computer device may also include some other components, such as a memory, a bus, etc., which are not described in detail herein.
The API function library provides the APIs through which applications operate the RDMA network card; in addition to the original APIs, such as the Verbs API, it includes the newly added APIs, i.e., the at least one API described above.
The accelerator 30 may provide an accelerated processing capability, and in this embodiment, the accelerated processing capability that the accelerator 30 may provide is represented by an acceleration operator, and the accelerated processing capability may include, for example, data processing operations such as data decompression, data encryption and decryption, image recognition, data sorting, and data filtering, which is not specifically limited in this application.
Alternatively, the accelerator 30 may specifically provide hardware acceleration, performing data processing operations with hardware. Because hardware processes data quickly, this achieves more efficient data processing than software executing on a CPU, further reducing data communication latency.
In practical applications, the accelerator 30 may be implemented by, for example, an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a special-purpose processor, and the like, and the application is not limited thereto.
In combination with actual requirements, the first RDMA network card 101 or the second RDMA network card 201 may integrate one or more accelerators, and accelerators may be added as needed, where each accelerator may provide at least one acceleration operator. The network card function of the first RDMA network card 101 or the second RDMA network card 201 may be provided by an RDMA network card unit.
The first application 103 and the second application 203 may be two independent application programs, or may be two separate components in a distributed application, and the like.
By using RDMA technology, the first application 103 and the second application 203 can directly perform data transmission through the RDMA network card.
In this embodiment, the first application 103 may call the at least one API to add a work request, generated based on the first acceleration operator pair corresponding to a data processing requirement, to a work queue.
Optionally, the first application may first call the at least one API to query the acceleration operators supported by the first RDMA network card; that is, one or more of the newly added at least one API may query the first RDMA network card for the acceleration operators it supports and pass them through to the upper-layer application. The first application may then determine the first acceleration operator pair according to the data processing requirement, generate a work request based on the first acceleration operator pair, and call the at least one API to add the work request to the work queue. Querying the first RDMA network card for its supported acceleration operators through the at least one API makes those operators easy to determine: the first RDMA network card can integrate different accelerators according to actual requirements, accelerators can be newly added, and the first application needs only the at least one API to discover which acceleration operators the first RDMA network card supports.
Of course, as another alternative, the first application may also determine the acceleration operators supported by the first RDMA network card from pre-stored information: the first application may be developed based on the acceleration operators supported by the first RDMA network card, with the related information of those operators stored in advance.
After the first RDMA network card 101 is powered on and started, it can determine the acceleration operators supported by the at least one accelerator currently integrated in it; after the first application is started, it can call the at least one API to acquire those acceleration operators from the first RDMA network card.
The first acceleration operator pair may include the acceleration operators adopted by the source end and the destination end respectively. For example, when data is encrypted and decrypted, the source end adopts an encryption operator and the destination end adopts a decryption operator, and the two form an acceleration operator pair; when data is compressed and decompressed, the source end adopts a compression operator and the destination end adopts a decompression operator, which likewise form an acceleration operator pair. Of course, the acceleration operator used by the source end or the destination end in the first acceleration operator pair may be null, so that data acceleration processing is performed only at the source end or only at the destination end, according to the data processing requirement.
Optionally, in some cases, an acceleration operator pair list may be generated by the first RDMA network card. The list includes identification information and the like of the acceleration operator pairs supported by the first RDMA network card, where each acceleration operator pair includes the acceleration operators adopted by the source end and the destination end respectively.
Depending on the data transmission mode, the work request may be a data send request, a data write request, a data read request, or the like. After the work request is generated based on the first acceleration operator pair, the at least one API may be invoked to place it into a designated work queue. The work request may include identification information of the first acceleration operator pair, and the like.
The first RDMA network card 101 may be configured to acquire the work request from the work queue, determine the first acceleration operator pair, invoke the indicated first target accelerator to process the first to-be-processed data, obtain first target data, and perform remote access processing on the first target data.
The first RDMA network card 101 may consume the work requests in the work queue; after taking a work request from the queue, it may determine the first acceleration operator pair based on the identification information carried in the work request, and invoke the indicated first target accelerator to process the first to-be-processed data.
The first target accelerator may be the accelerator required by the acceleration operator adopted at the source end.
Here, remote access processing may refer to RDMA-related processing.
For example, in the send/receive mode, the first to-be-processed data may refer to data in a memory specified by the first application, and is sent to the destination after being processed, the first target data is data sent to the destination, and the remote access processing on the first target data is specifically sending the first target data to the destination.
In the write mode, the first to-be-processed data may be data in a specified memory of the first application, and needs to be pushed to the destination after data processing, where the first target data is data pushed to the destination, and the remote access processing on the first target data is specifically to push the first target data to the destination.
In read mode, the first data to be processed may refer to data pulled back from the destination, and need to be stored or read after data processing. The remote access processing on the first target data may also refer to storing or reading the first target data.
The first RDMA network card 101 interacts with the second RDMA network card 201 of the destination end based on messages. Therefore, in this embodiment, the second RDMA network card 201 may be configured to receive a message request sent by the source end, invoke the second target accelerator to process the second to-be-processed data based on the first acceleration operator pair specified in the message request, obtain second target data, and perform remote access processing on the second target data.
The second target accelerator may be the accelerator required by the acceleration operator adopted at the destination end.
The content of the message request differs among the data transmission modes. Thus, as an alternative, when the work request is a data send request or a data write request:
the first RDMA network card 101 invokes the indicated first target accelerator to process the first to-be-processed data to obtain the first target data, and the remote access processing on the first target data may specifically be: acquiring the first to-be-processed data from memory, invoking the indicated first target accelerator to process it to obtain the first target data, generating a message request based on the first target data and the first acceleration operator pair, and sending the message request to the destination end.
The message request may specifically be a send request or a write request. It may include the first target data and related information of the first acceleration operator pair, such as identification information and parameter information of the acceleration operator used at the destination end, and may also include message content required by the two communicating parties in send or write mode, such as addressing information, which this application does not describe in detail.
Based on the message request, the destination end can invoke the indicated second target accelerator to process the first target data, and store the processing result data.
Optionally, the message request may specifically be sent to the second RDMA network card 201 of the second computer system 20, and the second RDMA network card 201 may invoke the second target accelerator indicated by the first acceleration operator pair to process the first target data and store the processing result data. That is, in this case, the first target data is the second to-be-processed data, and the second target data is the processing result data.
As yet another alternative, when the work request is a data read request:
the first RDMA network card 101 invokes the indicated first target accelerator to process the first to-be-processed data to obtain the first target data, and the remote access processing on the first target data may specifically be: generating a message request based on the first acceleration operator pair, sending the message request to the destination end, reading the first to-be-processed data back from the destination end, invoking the indicated first target accelerator to process it to obtain the first target data, and storing or transmitting the first target data.
The first to-be-processed data may be obtained by the destination end invoking the indicated second target accelerator, based on the message request, to process the data requested to be read.
The message request is specifically a read request, and may include related information of the first acceleration operator pair, such as identification information and parameter information of the acceleration operator used at the destination end, and may also include message content required by the two communicating parties in read mode, such as addressing information.
Optionally, the message request is specifically sent to the second RDMA network card 201 in the second computer system 20, and the first to-be-processed data may be obtained by the second RDMA network card 201 invoking the indicated second target accelerator, based on the message request, to process the requested data. That is, in this case, the second to-be-processed data is the data read based on the message request, and the second target data is the first to-be-processed data.
In addition, as another implementation, in send/receive mode the destination end may also create work requests, work queues, and the like, and may likewise determine the required acceleration operator pair according to a data processing requirement. Therefore, in some embodiments, the second application 203 may be configured to call the at least one API to add a work request, generated based on a second acceleration operator pair corresponding to the data processing requirement, to a work queue. Optionally, the second application may call the at least one API to query the acceleration operators supported by the second RDMA network card 201; that is, one or more of the at least one API may query the second RDMA network card for the acceleration operators it supports.
The second application can determine a second acceleration operator pair according to the data processing requirement, generate a data receiving request based on the second acceleration operator pair, and then call the at least one API again to add the data receiving request into the work queue;
the second RDMA network card 201 is further configured to obtain a data receiving request from the work queue, and based on the data receiving request, accelerate the operator pair; and receiving first target data sent by the source end, calling a second acceleration operator to process the first target data on the indicated second target accelerator, obtaining processing result data, and storing the processing result data.
To clarify the data communication of the technical solution of this embodiment in the three data transmission modes, the scene interaction diagram of fig. 2 shows a rough process of data communication in each mode. In send/receive mode, a first application in the first computer system creates a WQ (in this mode also referred to as an SQ) in the first RDMA network card; a WR generated by the first application is added to the WQ and converted into WQEs, each of which points to a buffer in the source-end memory. A second application in the second computer system creates a WQ (in this mode also referred to as an RQ) in the second RDMA network card; the second application likewise generates a WR, adds it to the WQ, and converts it into WQEs, each of which points to a buffer in the destination-end memory.
The first RDMA network card acquires a WQE (i.e., the WR) and executes it: it acquires the first to-be-processed data from the buffer pointed to by the WR, determines the acceleration operator adopted at the source end according to the first acceleration operator pair, invokes the first target accelerator supporting that operator to process the first to-be-processed data into the first target data, and may then send a message request, namely a send request, to the second RDMA network card based on the first target data.
The second RDMA network card acquires a WQE (i.e., the WR) and executes it: it receives the message request from the first RDMA network card, determines the first target data, invokes the second target accelerator to accelerate the first target data according to the acceleration operator in the first acceleration operator pair indicated in the message request (or the acceleration operator in the second acceleration operator pair indicated in the WR), and stores the processing result data into the buffer pointed to by the WR.
In the write mode, a first application creates a WQ in a first RDMA network card, WR generated by the first application is added into the WQ and converted into WQEs, and each WQE points to a buffer in a source end memory.
The first RDMA network card acquires a WQE (i.e., the WR) and executes it: it acquires the first to-be-processed data from the buffer pointed to by the WR, determines the acceleration operator adopted at the source end according to the first acceleration operator pair, invokes the first target accelerator supporting that operator to process the first to-be-processed data into the first target data, and may then send a message request, namely a write request, to the second RDMA network card based on the first target data. After the second RDMA network card receives the first target data, if an acceleration operator is indicated in the message request, it may invoke the corresponding second target accelerator to accelerate the first target data, and then store the processing result data to the destination buffer.
In a read mode, a first application creates a WQ in a first RDMA network card, WR generated by the first application is added into the WQ and converted into WQEs, and each WQE points to a buffer in a source end memory.
The first RDMA network card acquires a WQE (i.e., the WR), executes it, generates a message request, namely a read request, according to the first acceleration operator pair, and sends the message request to the second RDMA network card. The second RDMA network card invokes the corresponding second target accelerator, according to the acceleration operator indicated in the message request, to accelerate the data requested to be read, obtaining the first to-be-processed data, and feeds it back to the first RDMA network card in the form of a message. The first RDMA network card then invokes the corresponding first target accelerator, according to the first acceleration operator pair, to accelerate the pulled data, obtaining the first target data, and stores it into the buffer.
As can be seen from the above data communication processes, the data processing operation is offloaded from the CPU to the RDMA network card for execution; without CPU participation, data communication delay is further reduced.
Furthermore, as can be seen from the foregoing description, the second computer system is described as a destination-end device, but it can also serve as a source-end device. When it serves as a source-end device, the second application and the second RDMA network card can implement the same functional operations as the first application and the first RDMA network card, for example:
the second application can also call the at least one API to add a work request, generated based on a third acceleration operator pair corresponding to a data processing requirement, to a work queue; optionally, the second application may first call the at least one API to query the acceleration operators supported by the second RDMA network card; the second application may determine the third acceleration operator pair according to the data processing requirement, generate a work request based on it, and then call the at least one API to add the work request to the work queue.
The second RDMA network card may also be configured to acquire the work request from the work queue, determine the third acceleration operator pair, invoke the indicated first target accelerator to process third to-be-processed data, obtain third target data, and perform remote access processing on the third target data.
For specific implementations, see the related description of the first application and the first RDMA network card, which will not be repeated here. The descriptors first, second, third, fourth, and so on used herein are only for convenience of distinguishing like content; for example, the first, second, and third acceleration operator pairs merely distinguish the acceleration operator pairs designated in different situations, and carry no other meaning. To avoid introducing too many concepts, the first target accelerator may be understood as the accelerator corresponding to the acceleration operator used at the source end (an acceleration operator corresponds to an accelerator with the relevant data processing capability), and the second target accelerator as the accelerator corresponding to the acceleration operator used at the destination end.
For the processing operation on the third to-be-processed data, reference may be made to the processing operation on the first to-be-processed data, and for the processing operation on the third target data, reference may be made to the processing operation on the first target data, which will not be described herein again.
The first computer system is described as a source-end device, but it can certainly also serve as a destination-end device. When it does, the first application and the first RDMA network card can implement the same functional operations as the second application and the second RDMA network card; for example, the first RDMA network card is also used to receive a message request sent by the corresponding source end, invoke the indicated second target accelerator to process fourth to-be-processed data based on a fourth acceleration operator pair specified in the message request, obtain fourth target data, and perform remote access processing on the fourth target data.
For the processing operation on the fourth to-be-processed data, reference may be made to the processing operation on the second to-be-processed data, and for the processing operation on the fourth target data, reference may be made to the processing operation on the second target data, which will not be described herein again.
In the embodiments of this application, through the at least one preconfigured API, the accelerator integrated in the RDMA network card performs the data acceleration processing during RDMA data communication, combining data processing with data communication. The CPU load is thus offloaded during data communication: data processing can be accomplished without CPU participation, and data communication delay is further reduced.
In combination with the foregoing description, in the embodiments of this application at least one API is newly added to the API function library. Optionally, the at least one API may include at least one resource pass-through API and at least one request processing API.
The first application can call the corresponding resource pass-through API to obtain the acceleration operator pair list from the first RDMA network card; select a first acceleration operator pair according to the data processing requirement; call the corresponding resource pass-through API to acquire the parameter list required by the first acceleration operator pair, and determine the parameter data of the first acceleration operator pair according to that parameter list; and generate a work request based on the first acceleration operator pair and the parameter data, calling the corresponding request processing API to add the work request to the work queue.
In one implementation, the at least one resource pass-through API may include one or both of a first resource pass-through API for obtaining the acceleration operator pair list from the first RDMA network card, and a second resource pass-through API for obtaining the parameter list of any acceleration operator pair from the first RDMA network card;
the at least one request processing API may include one or both of a first request processing API that adds a work request for data transmission based on any acceleration operator pair to the work queue, and a second request processing API that adds a work request for data reception based on any acceleration operator pair to the work queue.
Here, data transmission may include send, read, or write, and data reception refers to receive.
For ease of understanding, function description documents of the first resource pass-through API, the second resource pass-through API, the first request processing API, and the second request processing API are given below as examples; the application is not limited to these, and the inputs, outputs, and return values of the functions are not limited to the following information. An application developer can learn about the newly added target APIs from the function description documents and develop applications accordingly.
A: ibv_get_device_acc_pair_list, representing the first resource pass-through API, used to obtain the acceleration operator pair list of a specified RDMA network card.
Description: acquires the acceleration operator pair list supported by the specified RDMA network card. On success it returns a pointer to a newly allocated acceleration operator pair list for that RDMA network card. The application can offload the relevant CPU load according to the list, and should release the memory occupied by the list when the program finishes.
Input: a device pointer or device context, to uniquely determine the RDMA network card being accessed.
Output: a pointer to the newly created acceleration operator pair list of the RDMA network card. The list may include basic information such as identification information and parameter types of the acceleration operator pairs, and each acceleration operator pair may include the acceleration operators adopted by the source end and the destination end respectively.
Return value: success or failure.
B: ibv_get_device_acc_pair_parameter, representing the second resource pass-through API, used to obtain the parameter list of an acceleration operator pair in a specified RDMA network card.
Description: acquires the parameter list of a given acceleration operator pair in the specified RDMA network card. On success it returns a pointer to a newly allocated parameter list for that acceleration operator pair.
Input: a device pointer or device context to uniquely determine the RDMA network card being accessed, identification information of the acceleration operator pair, and the like.
Output: a pointer to the newly allocated parameter list of the acceleration operator pair. The parameter list may include basic information such as parameter length, parameter number, and parameter type, and contains the parameter information required by the source end and the destination end respectively.
Return value: success or failure.
C: ibv_post_send_acc, representing the first request processing API, used to initiate a data transmission that specifies an acceleration operator pair, including send, read, or write.
Description: converts the work request into a WQE in the WQ, to initiate a send operation (including write and read).
Input: the WQ to be used and the work request; the work request includes information such as the local memory list, the specified acceleration operator pair and its parameter data, and the memory address of the destination end.
Output: a pointer to the first work request rejected by the RDMA network card.
Return value: success or failure.
D: ibv_post_receive_acc, representing the second request processing API, used to initiate a data reception (receive) operation that specifies an acceleration operator pair.
Description: converts the work request into a WQE in the WQ, to initiate a receive operation.
Input: the WQ to be used and the work request; the work request includes information such as the local memory list and the specified acceleration operator pair and its parameters.
Output: a pointer to the first work request rejected by the hardware.
Return value: success or failure.
An embodiment of the present application further provides a computer system, as shown in fig. 3, where the computer system may be specifically implemented as the first computer system shown in fig. 1, and may include a first RDMA network card 101 and a first CPU 102; the first RDMA network card 101 integrates at least one accelerator 30; a first application 103 developed based on an API function library is configured in the first CPU 102; the API function library includes at least one API that is preconfigured;
the first application 103 is configured to call the at least one API to add a work request, generated based on a first acceleration operator pair corresponding to a data processing requirement, to a work queue; optionally, the first application may first call the at least one API to query the acceleration operators supported by the first RDMA network card, determine the first acceleration operator pair according to the data processing requirement, generate the work request based on it, and then call the at least one API to add the work request to the work queue;
the first RDMA network card 101 is configured to acquire the work request from the work queue, determine the first acceleration operator pair, invoke the indicated first target accelerator to process the first to-be-processed data, obtain first target data, and perform remote access processing on the first target data.
Specific structural implementation and functional implementation in the computer system shown in fig. 3 may be detailed in the related description of the first computer system shown in fig. 1, and will not be repeated herein.
In addition, an embodiment of the present application further provides a computer system, as shown in fig. 4, which may be specifically implemented as the second computer system shown in fig. 1 and may include a second RDMA network card 201 and a second CPU 202; the second RDMA network card 201 integrates at least one accelerator 30, and the second CPU 202 is configured with a second application 203 developed based on an API function library; the API function library includes at least one preconfigured API;
the second RDMA network card 201 is configured to receive a message request sent by the source end, call a second target accelerator to process second to-be-processed data based on a first acceleration operator pair specified in the message request, obtain second target data, and perform remote access processing on the second target data.
In some embodiments, in send/receive mode, the second application may further call the at least one API to add a data receiving request, generated based on a second acceleration operator pair corresponding to the data processing requirement, to the work queue. Optionally, the second application may first call the at least one API to query the acceleration operators supported by the second RDMA network card, determine the second acceleration operator pair according to the data processing requirement, generate the data receiving request based on it, and then call the at least one API to add the data receiving request to the work queue.
The second RDMA network card can also be used to acquire the data receiving request from the work queue and determine the second acceleration operator pair based on it; and to receive the first target data sent by the source end, invoke the indicated second target accelerator to process the first target data, obtain processing result data, and store the processing result data.
Specific structural implementation and functional implementation in the computer system shown in fig. 4 may be detailed in the related description of the second computer system shown in fig. 1, and will not be repeated herein.
The computer systems shown in fig. 3 and 4 describe data communication processes from different function implementation perspectives, and the computer system shown in fig. 3 may also serve as a destination device, and the computer system shown in fig. 4 may also serve as a source device.
Therefore, in some embodiments, the second application may be further configured to call the at least one API to add a data receiving request, generated based on a second acceleration operator pair corresponding to the data processing requirement, to the work queue; optionally, the second application may first call the at least one API to query the acceleration operators supported by the second RDMA network card, determine the second acceleration operator pair according to the data processing requirement, generate the data receiving request based on it, and then call the at least one API to add the data receiving request to the work queue;
the second RDMA network card is also used to acquire the data receiving request from the work queue and determine the second acceleration operator pair based on it; and to receive the first target data sent by the source end, invoke the indicated second target accelerator to process the first target data, obtain processing result data, and store the processing result data.
In some embodiments, the second application may further call the at least one API to add a work request, generated based on a third acceleration operator pair corresponding to the data processing requirement, to the work queue; optionally, the second application may first call the at least one API to query the acceleration operators supported by the second RDMA network card, determine the third acceleration operator pair according to the data processing requirement, generate the work request based on it, and then call the at least one API to add the work request to the work queue;
the second RDMA network card is further used to acquire the work request from the work queue, determine the third acceleration operator pair, invoke the indicated first target accelerator to process third to-be-processed data, obtain third target data, and perform remote access processing on the third target data.
In the scene interaction diagram shown in fig. 5, the computer system 50, when acting as a source-end device, may communicate data with the destination end 60 in RDMA mode, and when acting as a destination-end device, may communicate data with the source end 70 in RDMA mode. At least one accelerator 30 is integrated in the RDMA network card 501 of the computer system 50; when communicating with the source end 70 or the destination end 60, an accelerator in the RDMA network card can be selected to perform the corresponding data processing on the data, so the data processing does not need to be executed by the CPU and consumes no CPU resources. Because the data processing operation is offloaded from the CPU to the RDMA network card, data communication delay can be further reduced.
In addition, referring to fig. 6, an embodiment of the present application further provides an RDMA network card, which may include an RDMA network card unit 601 and at least one accelerator 602;
the RDMA network card unit 601 is configured to acquire a work request from a work queue, call a corresponding first target accelerator to process first to-be-processed data based on an acceleration operator pair indicated by the work request, acquire first target data, and perform remote access processing on the first target data; or receiving a message request sent by the source end, calling a corresponding second target accelerator to process second data to be processed based on an acceleration operator pair specified in the message request, obtaining second target data, and performing remote access processing on the second target data.
The work request is generated by a first application based on the acceleration operator pair corresponding to a data processing requirement, and is added into the work queue through at least one preconfigured API. The first application is developed based on the at least one API and runs in the CPU.
The specific operation of the RDMA network card is detailed in the operations executed by the first RDMA network card and the second RDMA network card in the foregoing embodiments, and is not repeated here. The RDMA network card unit may also refer to the hardware implementing the corresponding functions of the RDMA network card; in practice, the RDMA network card unit may specifically be a network card supporting RDMA.
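The two entry paths of the RDMA network card unit (a work request popped from the local work queue, or a message request arriving from a remote source end) can be sketched as follows. The `card_unit_dispatch` function and the `op_pair` record are illustrative assumptions, not the patent's actual interface; the sketch only shows that both paths converge on "process on the indicated accelerator, then perform remote access processing":

```python
def card_unit_dispatch(event, accelerators, remote_access):
    """Dispatch one event on the NIC unit. `event` is (kind, request), where
    kind is "work_request" (local work queue) or "message_request" (remote)."""
    kind, request = event
    assert kind in ("work_request", "message_request")
    # Either way, the request carries the acceleration operator pair:
    # the operator name and the target accelerator that should run it.
    op_name, acc_id = request["op_pair"]
    result = accelerators[acc_id][op_name](request["data"])
    remote_access(result)  # both paths end in remote access processing
    return result

# Stand-in accelerator 0 exposing a toy checksum operator.
accs = {0: {"crc": lambda d: sum(d) % 256}}
log = []
# Path 1: a work request taken from the local work queue.
r1 = card_unit_dispatch(
    ("work_request", {"op_pair": ("crc", 0), "data": [1, 2]}), accs, log.append)
# Path 2: a message request received from a remote source end.
r2 = card_unit_dispatch(
    ("message_request", {"op_pair": ("crc", 0), "data": [250, 10]}), accs, log.append)
```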
In addition, an embodiment of the present application further provides a data communication method. As shown in fig. 7, the technical solution of this embodiment may be applied to a first computer system, where the first computer system may include a first RDMA network card and a first CPU; the first RDMA network card integrates at least one accelerator; a first application developed based on at least one API runs in the first CPU, as described in the embodiments shown in fig. 1 and fig. 3. The technical solution of this embodiment may be specifically executed by the first application, which calls the at least one API to add a work request, generated based on a first acceleration operator pair corresponding to a data processing requirement, into a work queue. Optionally, the method may include the following steps:
701: an acceleration operator supported by the first RDMA network card is determined.
Optionally, the at least one API may be called to query an acceleration operator supported by the first RDMA network card.
702: A first acceleration operator pair is determined according to the data processing requirement.
703: A work request is generated based on the first acceleration operator pair.
704: The at least one API is called to add the work request into the work queue, so that the first RDMA network card acquires the work request from the work queue, invokes the first acceleration operator on the indicated first target accelerator to process first to-be-processed data, obtains first target data, and performs remote access processing on the first target data.
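Steps 701-704 above can be sketched as follows. All names here (`RdmaNic`, `query_operators`, `post_work_request`, `build_work_request`) are hypothetical stand-ins for the patent's unnamed resource pass-through and request processing APIs; the sketch only illustrates the query, select-pair, generate-request, post-to-queue sequence:

```python
class RdmaNic:
    """Toy stand-in for the first RDMA network card."""
    def __init__(self, supported_ops):
        self.supported_ops = supported_ops
        self.work_queue = []

    def query_operators(self):        # step 701: a resource pass-through API
        return list(self.supported_ops)

    def post_work_request(self, wr):  # step 704: a request processing API
        self.work_queue.append(wr)

def build_work_request(nic, required_op, accelerator_id, payload):
    ops = nic.query_operators()                 # 701: query supported operators
    if required_op not in ops:
        raise ValueError(f"NIC does not accelerate {required_op!r}")
    pair = (required_op, accelerator_id)        # 702: choose the operator pair
    return {"op_pair": pair, "data": payload}   # 703: generate the work request

nic = RdmaNic(["compress", "encrypt"])          # stand-in operator names
wr = build_work_request(nic, "compress", 0, b"hello")
nic.post_work_request(wr)                       # 704: add to the work queue
```

Note the application never touches the payload itself; it only chooses which operator the network card should apply.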
An embodiment of the present application further provides a data communication method. As shown in fig. 8, the technical solution of this embodiment may be applied to a first computer system, where the first computer system may include a first RDMA network card and a first CPU; the first RDMA network card integrates at least one accelerator; a first application developed based on at least one API runs in the first CPU. The specific implementation of the first computer system is described in detail in the embodiments shown in fig. 1 and fig. 3. The technical solution of this embodiment may be specifically executed by the first RDMA network card, and the method may include the following steps:
801: the work request is obtained from the work queue.
802: a first acceleration operator pair indicated by the work request is determined.
803: The first acceleration operator is invoked on the indicated first target accelerator to process the first to-be-processed data, obtaining first target data.
804: Remote access processing is performed on the first target data.
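A minimal sketch of steps 801-804 from the network card's side, under the assumption that a work request carries its operator-pair indication as a simple record; the function and data-structure names are illustrative, not the patent's:

```python
def nic_process(work_queue, accelerators, send):
    wr = work_queue.pop(0)                # 801: get the request from the queue
    op_name, acc_id = wr["op_pair"]       # 802: the operator pair it indicates
    target = accelerators[acc_id]         # the indicated first target accelerator
    result = target[op_name](wr["data"])  # 803: process to get first target data
    send(result)                          # 804: remote access processing
    return result

# Stand-in accelerator 0 exposing a toy "compress" operator (truncation here).
accelerators = {0: {"compress": lambda d: d[:2]}}
sent = []
wq = [{"op_pair": ("compress", 0), "data": b"abcd"}]
out = nic_process(wq, accelerators, sent.append)
```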
An embodiment of the present application further provides a data communication method. As shown in fig. 9, the technical solution of this embodiment may be applied to a second computer system, where the second computer system may include a second RDMA network card and a second CPU; the second RDMA network card integrates at least one accelerator, and a second application developed based on at least one API runs in the second CPU. The specific implementation of the second computer system is described in detail in the embodiments shown in fig. 1 and fig. 4. The technical solution of this embodiment may be specifically executed by the second RDMA network card, and the method may include the following steps:
901: A message request sent by the source end is received.
902: a first acceleration operator pair indicated in the message request is determined.
903: The first acceleration operator is invoked on the indicated second target accelerator to process second to-be-processed data, obtaining second target data.
904: Remote access processing is performed on the second target data.
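Steps 901-904 can likewise be sketched. The names below are hypothetical; the point is that the message request already carries the acceleration operator pair chosen by the source end, so the destination network card can process the payload before storing it, without involving its CPU:

```python
def dest_nic_handle(message, accelerators, memory):
    op_name, acc_id = message["op_pair"]      # 901-902: parse the message request
    target = accelerators[acc_id]             # the indicated second target accelerator
    result = target[op_name](message["data"]) # 903: obtain second target data
    memory.append(result)                     # 904: remote access processing (store)
    return result

# Stand-in accelerator 1 exposing a toy "decrypt" operator (lowercasing here).
accelerators = {1: {"decrypt": lambda d: d.lower()}}
memory = []
msg = {"op_pair": ("decrypt", 1), "data": "SECRET"}
out = dest_nic_handle(msg, accelerators, memory)
```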
The operations involved in the data communication methods of the above embodiments have been described in detail in connection with the specific manner in which the modules and units of the related devices perform them, and are not repeated here.
In addition, an embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a computer, the data communication method described in any one of the above embodiments of fig. 7, fig. 8, or fig. 9 may be implemented.
With the technical solutions of the embodiments of the present application, the acceleration operators supported by the RDMA network card are exposed to upper-layer applications through the newly added APIs, and data processing is combined with network transmission inside the RDMA network card, realizing inline data acceleration and further reducing data communication delay on top of the already low latency of the RDMA mode.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (14)

1. A computer system, comprising a first remote direct memory access (RDMA) network card and a first central processing unit (CPU); the first RDMA network card integrates at least one accelerator; a first application developed based on at least one preconfigured application program interface (API) runs in the first CPU; the at least one API is newly added on the basis of the original APIs in an API function library;
the first application is used for calling the at least one API to add a work request, generated based on a first acceleration operator pair corresponding to a data processing requirement, into a work queue;
the first RDMA network card is used for acquiring the work request from the work queue, determining the first acceleration operator pair, invoking the first acceleration operator on the indicated first target accelerator to process first to-be-processed data, obtaining first target data, and performing remote access processing on the first target data.
2. The computer system of claim 1, wherein the first application is configured to call the at least one API to query the acceleration operators supported by the first RDMA network card; determine the first acceleration operator pair according to the data processing requirement, and generate the work request based on the first acceleration operator pair; and call the at least one API to add the work request to the work queue.
3. The computer system of claim 1, wherein the work request comprises a data send request or a data write request;
the first RDMA network card invoking the first acceleration operator on the indicated first target accelerator to process the first to-be-processed data to obtain first target data, and performing remote access processing on the first target data, comprises: acquiring the first to-be-processed data from a memory, invoking the first acceleration operator on the indicated first target accelerator to process the first to-be-processed data to obtain the first target data, generating a message request based on the first target data and the first acceleration operator pair, and sending the message request to a destination; wherein the destination is used for invoking the first acceleration operator on the indicated second target accelerator to process the first target data, and storing the processing result data.
4. The computer system of claim 1, wherein the work request comprises a data read request;
the first RDMA network card invoking the first acceleration operator on the indicated first target accelerator to process the first to-be-processed data to obtain first target data, and performing remote access processing on the first target data, comprises: generating a message request based on the first acceleration operator pair, sending the message request to a destination, reading the first to-be-processed data from the destination, invoking the first acceleration operator on the first target accelerator indicated by the first acceleration operator pair to process the first to-be-processed data to obtain the first target data, and storing or transmitting the first target data; wherein the first to-be-processed data is obtained by the destination invoking, according to the message request, the first acceleration operator on the indicated second target accelerator to process the data requested to be read.
5. The computer system of claim 1, wherein the at least one API comprises at least one resource pass-through API and at least one request processing API;
the first application specifically calls the corresponding resource pass-through API to acquire a list of acceleration operator pairs in the first RDMA network card; selects the first acceleration operator pair according to the data processing requirement, and calls the corresponding resource pass-through API to obtain a parameter list required by the first acceleration operator pair; determines parameter data of the first acceleration operator pair according to the parameter list; generates the work request based on the first acceleration operator pair and the parameter data; and calls the corresponding request processing API to add the work request into the work queue.
6. The computer system of claim 5, wherein the at least one resource pass-through API comprises one or more of a first resource pass-through API for obtaining the list of acceleration operator pairs of the first RDMA network card, and a second resource pass-through API for obtaining the parameter list of any acceleration operator pair from the first RDMA network card; and the at least one request processing API comprises one or more of a first request processing API for adding a work request for data transmission, generated based on any acceleration operator pair, to the work queue, and a second request processing API for adding a work request for data reception, generated based on any acceleration operator pair, to the work queue.
7. A computer system, comprising a second RDMA network card and a second CPU; the second RDMA network card integrates at least one accelerator; a second application developed based on at least one preconfigured API runs in the second CPU;
the second RDMA network card is used for receiving a message request sent by a source end, invoking the second target accelerator, based on the first acceleration operator pair specified in the message request, to process second to-be-processed data, obtaining second target data, and performing remote access processing on the second target data.
8. The computer system of claim 7, wherein the second application is configured to call the at least one API to add a data receiving request, generated based on a second acceleration operator pair corresponding to a data processing requirement, to a work queue;
the second RDMA network card is further used for acquiring the data receiving request from the work queue and determining the second acceleration operator pair based on the data receiving request; and receiving first target data sent by the source end, invoking the second acceleration operator on the indicated second target accelerator to process the first target data, obtaining processing result data, and storing the processing result data.
9. The computer system of claim 7, wherein the second application is further configured to call the at least one API to add a work request, generated based on a third acceleration operator pair corresponding to a data processing requirement, to a work queue;
the second RDMA network card is further configured to acquire the work request from the work queue, determine the third acceleration operator pair, invoke the third acceleration operator on the indicated first target accelerator to process third to-be-processed data, obtain third target data, and perform remote access processing on the third target data.
10. An RDMA network card, configured in a computer system, comprising an RDMA network card unit and at least one accelerator;
the RDMA network card unit is used for acquiring a work request from a work queue, invoking the corresponding first target accelerator, based on the acceleration operator pair indicated by the work request, to process first to-be-processed data, obtaining first target data, and performing remote access processing on the first target data; or receiving a message request sent by a source end, invoking the corresponding second target accelerator, based on the acceleration operator pair specified in the message request, to process second to-be-processed data, obtaining second target data, and performing remote access processing on the second target data;
the work request is generated by a first application based on an accelerator operator pair corresponding to a data processing requirement, and at least one preset API is called to be added into the work queue; the at least one API is an API which is newly added on the basis of the original API in the API function library.
11. A data communication method, applied to a first computer system, wherein the first computer system comprises a first RDMA network card and a first CPU; the first RDMA network card integrates at least one accelerator; a first application developed based on at least one preconfigured API runs in the first CPU; the at least one API is newly added on the basis of the original APIs in an API function library;
the method comprises the following steps:
calling the at least one API to add a work request, generated based on a first acceleration operator pair corresponding to a data processing requirement, into a work queue, so that the first RDMA network card acquires the work request from the work queue, invokes the first acceleration operator on the indicated first target accelerator to process first to-be-processed data to obtain first target data, and performs remote access processing on the first target data.
12. A data communication method, applied to a first computer system, wherein the first computer system comprises a first RDMA network card and a first CPU; the first RDMA network card integrates at least one accelerator; a first application developed based on at least one preconfigured API runs in the first CPU; the at least one API is newly added on the basis of the original APIs in an API function library;
the method comprises the following steps:
acquiring a work request from a work queue; wherein the work request is generated by the first application based on an acceleration operator pair corresponding to a data processing requirement, and is added into the work queue by calling at least one preconfigured API;
determining a first acceleration operator pair indicated by the work request;
invoking the first acceleration operator on the indicated first target accelerator to process first to-be-processed data to obtain first target data;
and performing remote access processing on the first target data.
13. A data communication method, applied to a second computer system, wherein the second computer system comprises a second RDMA network card and a second CPU; the second RDMA network card integrates at least one accelerator, and a second application developed based on at least one preconfigured API runs in the second CPU;
the method comprises the following steps:
receiving a message request sent by a source end;
determining a first acceleration operator pair indicated in the message request;
invoking the first acceleration operator on the second target accelerator indicated by the first acceleration operator pair to process second to-be-processed data to obtain second target data;
and performing remote access processing on the second target data.
14. A computer storage medium, in which a computer program is stored, wherein the computer program, when executed by a computer, implements the data communication method of any one of claims 11 to 13.
CN202111146457.XA 2021-09-28 2021-09-28 Computer system, RDMA network card and data communication method Active CN113595807B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111146457.XA CN113595807B (en) 2021-09-28 2021-09-28 Computer system, RDMA network card and data communication method

Publications (2)

Publication Number Publication Date
CN113595807A CN113595807A (en) 2021-11-02
CN113595807B true CN113595807B (en) 2022-03-01

Family

ID=78242447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111146457.XA Active CN113595807B (en) 2021-09-28 2021-09-28 Computer system, RDMA network card and data communication method

Country Status (1)

Country Link
CN (1) CN113595807B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934323B (en) * 2022-12-02 2024-01-19 北京首都在线科技股份有限公司 Cloud computing resource calling method and device, electronic equipment and storage medium
CN116627888B (en) * 2023-07-25 2023-10-03 苏州浪潮智能科技有限公司 Hardware computing module, device, method, electronic device, and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016195624A1 (en) * 2015-05-29 2016-12-08 Hewlett Packard Enterprise Development Lp Transferring an image file over a network
DK3358463T3 (en) * 2016-08-26 2020-11-16 Huawei Tech Co Ltd METHOD, DEVICE AND SYSTEM FOR IMPLEMENTING HARDWARE ACCELERATION TREATMENT
US20180150256A1 (en) * 2016-11-29 2018-05-31 Intel Corporation Technologies for data deduplication in disaggregated architectures
US11474700B2 (en) * 2019-04-30 2022-10-18 Intel Corporation Technologies for compressing communication for accelerator devices
CN112256407A (en) * 2020-12-17 2021-01-22 烽火通信科技股份有限公司 RDMA (remote direct memory Access) -based container network, communication method and computer-readable medium

Also Published As

Publication number Publication date
CN113595807A (en) 2021-11-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant