CN115237500A - Data processing method, device, equipment and medium of pooling platform

Info

Publication number
CN115237500A
Authority
CN
China
Prior art keywords
acceleration
application
pooling
platform
data
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202210909198.XA
Other languages
Chinese (zh)
Inventor
王江为
阚宏伟
郝锐
王彦伟
Current Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd filed Critical Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN202210909198.XA priority Critical patent/CN115237500A/en
Publication of CN115237500A publication Critical patent/CN115237500A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 - Arrangements for executing specific programs
    • G06F9/445 - Program loading or initiating
    • G06F9/44505 - Configuring for program initiating, e.g. using registry, configuration files
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 - Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 - Information transfer, e.g. on bus
    • G06F13/40 - Bus structure
    • G06F13/4063 - Device-to-bus coupling
    • G06F13/4068 - Electrical coupling

Abstract

The application relates to the technical field of distributed applications, and discloses a data processing method, apparatus, device and medium for a pooling platform. Configuration information is added to a custom field of a transmission protocol based on an application acceleration requirement; the configuration information includes an operation identifier, address information and calculation information that match the application acceleration requirement. Application data transmitted by a host server is processed according to the operation identifier and the calculation information, and the processed application data is transmitted to the acceleration component indicated by the address information, until processing of the application data on the different acceleration components in the pooling platform is completed. Because the configuration information used for processing the application data is carried in the transmission protocol, the application data can be processed directly according to that configuration information, which reduces the number of configuration interactions between acceleration components, lowers latency, and improves the heterogeneous acceleration performance of the pooling platform. Placing the configuration information in a custom field of the transmission protocol also simplifies the original protocol fields.

Description

Data processing method, device, equipment and medium of pooling platform
Technical Field
The present application relates to the field of distributed application technologies, and in particular, to a data processing method, apparatus, device, and computer-readable storage medium for a pooling platform.
Background
In an FPGA (Field Programmable Gate Array) pooling platform, a large number of FPGA accelerator cards form an acceleration resource pool for the accelerated processing of distributed applications. The accelerator cards can be deployed in two forms: as a coprocessor of a host server, or in FPGA BOX form, i.e. card-machine decoupling, in which there is no host server and the accelerator card exists as an independent acceleration unit. Data interaction between the FPGA accelerator cards is carried out through a transmission protocol.
The acceleration of the FPGA pooling platform involves two aspects: acceleration inside an FPGA accelerator card, and acceleration of data transmission between FPGA accelerator cards. The logic of an FPGA accelerator card consists of three parts: a kernel (acceleration processing) unit that can be dynamically reconfigured according to the application, a memory unit for storing data, and a PCIe (Peripheral Component Interconnect Express) interface or MAC (Media Access Control) interface for connecting with peripherals.
The data acceleration process of the FPGA pooling platform comprises: the application data to be accelerated is transmitted from the host server to the memory unit of an FPGA accelerator card through the PCIe interface; the host configures the kernel unit to fetch the data from the memory unit for accelerated calculation; and the host or the kernel unit configures a DMA (Direct Memory Access) IP to transmit the calculation result back to the host through the PCIe interface, or to other FPGA accelerator cards of the pooling platform through the MAC interface.
At present, data transmission between FPGA accelerator cards is usually realized by RDMA (Remote Direct Memory Access) technology. However, in this acceleration method calculation and transmission are separated: the host or a remote kernel unit configures the local kernel unit to complete the calculation, and then the local kernel or the host configures the RDMA IP to initiate the RDMA data transfer. An application running on the FPGA pooling platform therefore has to complete multiple configuration passes, which increases the overall processing latency and weakens the acceleration advantage of the FPGA pooling platform.
It can be seen that how to reduce the processing delay of the pooling platform is a problem to be solved by those skilled in the art.
Disclosure of Invention
An object of the embodiments of the present application is to provide a data processing method, apparatus, device and computer-readable storage medium for a pooling platform, which can reduce processing latency of the pooling platform.
In order to solve the foregoing technical problem, an embodiment of the present application provides a data processing method for a pooling platform, including:
adding configuration information to a custom field of a transmission protocol based on application acceleration requirements; the configuration information comprises an operation identifier, address information and calculation information which are matched with the application acceleration requirement;
receiving application data transmitted by a host server;
and processing the application data according to the operation identifier and the calculation information, transmitting the processed application data to an acceleration component pointed by the address information until the application data is processed on different acceleration components in the pooling platform, and ending the operation.
Optionally, the acceleration component is an application-specific integrated circuit (ASIC).
Optionally, the ASIC is an FPGA accelerator card.
Optionally, in a case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA accelerator card, and at least one of the acceleration processing modules corresponds to multi-instruction calculation, the calculation information includes an operation sequence instruction and an instruction address; wherein the instruction address points to an instruction required by the application acceleration requirement;
the processing the application data according to the operation identifier and the calculation information comprises:
and the plurality of acceleration processing modules sequentially call the instruction pointed by the instruction address to process the application data according to the operation sequence instruction.
Optionally, in a case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA acceleration card, and each acceleration processing module corresponds to single-instruction calculation, the calculation information includes an instruction required by the application acceleration requirement;
the processing the application data according to the operation identifier and the calculation information comprises:
and the acceleration processing modules process the application data according to the corresponding instructions.
Optionally, in a case that the application acceleration requirement corresponds to an acceleration processing module for internal computation on the FPGA accelerator card, the computation information includes an instruction address; wherein the instruction address points to an internal computation instruction required by the application acceleration requirement;
the processing the application data according to the operation identifier and the calculation information comprises:
and the acceleration processing module calls an internal calculation instruction to process the application data according to the instruction address.
Optionally, in a case that the operation identifier is a remote direct data access operation identifier, the address information includes a logical address of the target acceleration component, a read-write identifier determined according to the calculation information and the remote direct data access operation identifier, and a transmission length of the remote direct data access operation.
Optionally, in a case that the operation identifier is a stream operation identifier, the address information includes a target acceleration component logical address.
Optionally, the configuration information further includes a packet sequence number;
after the transmitting the processed application data to the acceleration component pointed by the address information, the method further comprises:
judging whether the processed application data is matched with the packet sequence number or not;
and under the condition that the processed application data is not matched with the packet sequence number, feeding back packet loss prompt information carrying a missing sequence number to the host server.
The embodiment of the application also provides a data processing device of the pooling platform, which comprises an adding unit, a receiving unit, a processing unit and a transmission unit;
the adding unit is used for adding configuration information to the custom field of the transmission protocol based on the application acceleration requirement; the configuration information comprises an operation identifier, address information and calculation information which are matched with the application acceleration requirement;
the receiving unit is used for receiving the application data transmitted by the host server;
the processing unit is used for processing the application data according to the operation identifier and the calculation information;
and the transmission unit is used for transmitting the processed application data to the acceleration component pointed by the address information until the processing of the application data on different acceleration components in the pooling platform is completed, and ending the operation.
Optionally, the acceleration component is an application specific integrated circuit.
Optionally, the ASIC is an FPGA accelerator card.
Optionally, in a case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA accelerator card, and at least one of the acceleration processing modules corresponds to multi-instruction calculation, the calculation information includes an operation sequence instruction and an instruction address; wherein the instruction address points to an instruction required by the application acceleration requirement;
and the processing unit is used for sequentially calling the instructions pointed by the instruction addresses to the plurality of accelerated processing modules according to the operation sequence instructions so as to process the application data.
Optionally, in a case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA acceleration card, and each acceleration processing module corresponds to single-instruction calculation, the calculation information includes an instruction required by the application acceleration requirement;
and the processing unit is used for processing the application data by the plurality of accelerated processing modules according to respective corresponding instructions.
Optionally, in a case that the application acceleration requirement corresponds to an acceleration processing module for internal computation on the FPGA accelerator card, the computation information includes an instruction address; wherein the instruction address points to an internal computation instruction required by the application acceleration requirement;
and the processing unit is used for calling an internal calculation instruction by the accelerated processing module according to the instruction address to process the application data.
Optionally, in a case that the operation identifier is a remote direct data access operation identifier, the address information includes a logical address of the target acceleration component, a read-write identifier determined according to the calculation information and the remote direct data access operation identifier, and a transmission length of the remote direct data access operation.
Optionally, in a case that the operation identifier is a stream operation identifier, the address information includes a target acceleration component logical address.
Optionally, the configuration information further includes a packet sequence number; the device also comprises a judging unit and a feedback unit;
the judging unit is used for judging whether the processed application data is matched with the packet sequence number;
and the feedback unit is configured to feed back packet loss prompt information carrying a missing serial number to the host server under the condition that the processed application data is not matched with the packet serial number.
An embodiment of the present application further provides an electronic device, including:
a memory for storing a computer program;
a processor for executing said computer program for implementing the steps of the data processing method of the pooling platform as described above.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor, the computer program implements the steps of the data processing method of the pooling platform.
According to the technical scheme, configuration information is added to the custom field of the transmission protocol based on the application acceleration requirement; the configuration information may include an operation identifier, address information and calculation information matching the application acceleration requirement. The operation identifier indicates the type of operation to be executed, the address information indicates the acceleration component that processes the application data, and the calculation information indicates the specific operation to be executed on the application data. Application data transmitted by a host server is received; the application data is processed according to the operation identifier and the calculation information, and the processed application data is transmitted to the acceleration component indicated by the address information, until processing of the application data on the different acceleration components in the pooling platform is completed. Because the configuration information for processing the application data is carried in the transmission protocol, the application data can be processed directly according to that configuration information after it is received, which reduces the number of configuration interactions between acceleration components, lowers latency, and improves the heterogeneous acceleration performance of the pooling platform. Moreover, setting the configuration information in the custom field of the transmission protocol according to the actual application acceleration requirement simplifies the original protocol fields and the internal processing logic, further improving processing performance.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a data processing method of a pooling platform according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of a pooling platform provided by an embodiment of the present application;
fig. 3 is a schematic structural diagram of a pooling platform for processing application data based on two FPGA accelerator cards according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a data processing apparatus of a pooling platform according to an embodiment of the present disclosure;
fig. 5 is a structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the present application.
The terms "including" and "having," and any variations thereof, in the description and claims of this application and the drawings described above, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may include other steps or elements not expressly listed.
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings.
Next, a data processing method of a pooling platform provided in an embodiment of the present application is described in detail. Fig. 1 is a flowchart of a data processing method of a pooling platform according to an embodiment of the present application, where the method includes:
s101: based on the application acceleration requirements, configuration information is added to the custom field of the transport protocol.
The configuration information may include operation identification, address information, and calculation information that match the application acceleration requirements.
The operation identifier indicates the type of operation to be executed, the address information indicates the acceleration component that processes the application data, and the calculation information indicates the specific operation to be executed on the application data. Compared with a processor, an acceleration component can accelerate the data processing flow and improve the data processing speed. As an embodiment of the present application, the acceleration component may be an application-specific integrated circuit (ASIC).
In the embodiment of the present application, the ASIC may be an FPGA accelerator card.
In this embodiment, the transport protocol may adopt an enhanced remote direct memory access (RDMA_Enhance) transport protocol. The format of the RDMA_Enhance transport protocol is shown in Table 1.
TABLE 1
| Ethernet L2 Header | IP Header | UDP Header | RDMA_Enhance | Payload | ICRC | FCS |
The Ethernet L2 Header, IP Header and UDP Header are standard Ethernet header fields; RDMA_Enhance is the custom field; Payload represents the message payload; and ICRC and FCS correspond to redundancy check and frame check respectively.
The format of the custom field may be set based on the actual application acceleration requirements. One common format for the custom field is shown in Table 2.
TABLE 2
| Field    | Meaning                                                 |
| opcode   | operation identifier (RDMA or Stream)                   |
| dqp      | target acceleration component logical address           |
| cal_code | custom calculation information                          |
| psn      | packet sequence number (data integrity check)           |
| addr     | read-write identifier                                   |
| len      | transfer length for RDMA operations                     |
The opcode is the operation identifier, and may indicate a remote direct data access (RDMA) operation or a stream (Stream) operation. dqp represents the target acceleration component's logical address. cal_code represents the custom calculation information. psn denotes the packet sequence number used to check the integrity of the data. addr represents a read-write identifier defined according to the operation identifier and the calculation information. len denotes the transfer length of an RDMA operation.
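Concretely, the custom field of Table 2 could be serialized as a fixed-size header. The field widths in the sketch below are assumptions chosen for illustration only; the patent describes the fields but does not fix their byte sizes or encodings:

```python
import struct

RDMA_OPCODE = 0x01    # remote direct data access operation (assumed encoding)
STREAM_OPCODE = 0x02  # stream operation (assumed encoding)

def pack_custom_field(opcode, dqp, cal_code, psn, addr, length):
    """Pack opcode/dqp/cal_code/psn/addr/len into a 24-byte big-endian header.

    Assumed layout: opcode 1 byte, 1 pad byte, dqp 2 bytes, cal_code 4 bytes,
    psn 4 bytes, addr 8 bytes, len 4 bytes.
    """
    return struct.pack(">BxHIIQI", opcode, dqp, cal_code, psn, addr, length)

def unpack_custom_field(raw):
    """Inverse of pack_custom_field: recover the named fields from bytes."""
    keys = ("opcode", "dqp", "cal_code", "psn", "addr", "len")
    return dict(zip(keys, struct.unpack(">BxHIIQI", raw)))
```

A receiver would parse the custom field with `unpack_custom_field` and then dispatch on `opcode`, as the embodiments below describe.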
The acceleration components invoked by different application acceleration requirements, and the operations each acceleration component needs to perform, may vary. Therefore, for the current application acceleration requirement, the bytes of the custom field can be partitioned and the configuration information set in the partitioned bytes, so that each acceleration component in the pooling platform can complete the processing of the application data relying on this configuration information.
S102: and receiving the application data transmitted by the host server.
Taking the FPGA accelerator card as an example, the pooling platform may include a plurality of FPGA accelerator cards. The PCIe DMA module in an FPGA accelerator card implements the interaction with the host server; in practical application, the host server transmits the application data to the PCIe DMA module of the FPGA accelerator card.
S103: and processing the application data according to the operation identifier and the calculation information, transmitting the processed application data to the acceleration component pointed by the address information until the application data is processed on different acceleration components in the pooling platform, and ending the operation.
The FPGA accelerator card comprises acceleration processing modules, namely kernel modules, for processing the application data. Depending on the application acceleration requirement, the number of FPGA accelerator cards that need to be called, and the kernel modules involved on each card, differ.
In practical application, the application acceleration requirements are different, and the corresponding calculation information is different. For scenarios where the application acceleration requirement can be fulfilled by a simple algorithm, the calculation information may comprise instructions required for the application acceleration requirement.
For a scenario that the application acceleration requirement needs to be completed through a complex algorithm, an acceleration processing module that can implement internal computation needs to be called, or multiple acceleration processing modules need to be called to complete processing of application data, so that the computation information may include an instruction address, and the instruction address may be used to point to an instruction needed by the application acceleration requirement. For a scenario of calling the plurality of accelerated processing modules, the calculation information may further include an operation sequence instruction for indicating an operation sequence of the plurality of accelerated processing modules.
Taking the case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA acceleration card, and at least one acceleration processing module corresponds to multi-instruction calculation as an example, the calculation information may include an operation sequence instruction and an instruction address; wherein the instruction address points to an instruction required by the application acceleration requirement.
The process of processing the application data by the FPGA accelerator card according to the operation identifier and the calculation information may include sequentially calling instructions pointed by the instruction address to process the application data by the plurality of accelerator processing modules according to the operation sequence instructions.
Taking a case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA acceleration card, and each acceleration processing module corresponds to single-instruction calculation as an example, the calculation information may include an instruction required by the application acceleration requirement.
The process of processing the application data by the FPGA accelerator card according to the operation identifier and the calculation information may include processing the application data by the plurality of acceleration processing modules according to respective corresponding instructions.
Taking the case that the application acceleration requirement corresponds to an acceleration processing module for internal computation on the FPGA accelerator card as an example, the computation information may include an instruction address; wherein the instruction address points to an internal computation instruction required by the application acceleration requirement.
The process of processing the application data by the FPGA accelerator card according to the operation identifier and the calculation information may include the accelerator processing module calling an internal calculation instruction according to an instruction address to process the application data.
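The three cases above amount to a dispatch on the form of the calculation information. The following sketch is a software analogy only (the dict-based cal_info, the instruction store, and all names are assumptions; the real kernel modules are hardware units):

```python
def process_application_data(data, cal_info, instruction_store):
    """Apply the calculation information to the application data.

    Forms of cal_info (assumed keys, mirroring the three described cases):
      - "instruction": a directly carried instruction (single-instruction
        calculation on each acceleration processing module)
      - "instr_addr": an instruction address pointing into `instruction_store`
        (internal computation)
      - "op_sequence": an ordering over the stored instructions, used when
        several modules run multi-instruction calculation
    """
    if "instruction" in cal_info:
        return cal_info["instruction"](data)
    ops = instruction_store[cal_info["instr_addr"]]
    # Without an explicit operation-sequence instruction, run in stored order.
    order = cal_info.get("op_sequence", range(len(ops)))
    for idx in order:
        data = ops[idx](data)
    return data
```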
In practical applications, the types of operations performed by the FPGA accelerator card may include remote direct data access operations and stream operations. The operation identifier may thus be a remote direct data access operation identifier or a stream operation identifier.
In the case that the operation identifier is a remote direct data access operation identifier, the address information may include a logical address of the target acceleration component, a read-write identifier determined according to the calculation information and the remote direct data access operation identifier, and a transmission length of the remote direct data access operation.
The stream operation is used to transfer application data between different FPGA accelerator cards; therefore, in the case that the operation identifier is a stream operation identifier, the address information may comprise only the target acceleration component's logical address.
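Put together, the address information varies with the operation identifier. A minimal sketch of this distinction (function and field names are assumptions, not from the patent):

```python
def build_address_info(opcode, dqp, addr=None, length=None):
    """Assemble the address information for the custom field.

    An RDMA operation carries the target acceleration component's logical
    address (dqp), a read/write identifier (addr), and a transfer length;
    a stream operation needs only the logical address.
    """
    if opcode == "rdma":
        if addr is None or length is None:
            raise ValueError("RDMA operations require addr and length")
        return {"dqp": dqp, "addr": addr, "len": length}
    if opcode == "stream":
        return {"dqp": dqp}
    raise ValueError(f"unknown operation identifier: {opcode}")
```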
In this way, the FPGA accelerator cards realize the accelerated processing of the application data in the pooling platform.
Fig. 2 is a schematic structural diagram of a pooling platform according to an embodiment of the present disclosure. Fig. 2 takes three FPGA accelerator cards as an example: the leftmost FPGA accelerator card serves as a coprocessor of the host server, and the two FPGA accelerator cards on the right exist as independent acceleration units in FPGA BOX form. Each FPGA accelerator card may comprise a PCIe DMA module, a Memory module, a DMA module, a Stream module, a MAC module and at least one Kernel module. The arrows in fig. 2 indicate the flow of application data. Data interaction between different FPGA accelerator cards is realized through the switch unit. In fig. 2, RDMA_Enhance is marked between each FPGA accelerator card and the switch unit, indicating that data interaction between different FPGA accelerator cards follows the RDMA_Enhance transport protocol.
Take as an example an application acceleration requirement that processes application data with two FPGA accelerator cards, called FPGA accelerator card 1 and FPGA accelerator card 2, each programmed with acceleration processing modules. FPGA accelerator card 1 uses 3 kernel modules, implementing decompression, internal calculation, and encryption respectively. FPGA accelerator card 2 uses 2 kernel modules, implementing decryption and internal calculation respectively.
The process of adding configuration information to the custom field of the transport protocol may be as follows. Based on the RDMA_Enhance protocol, the host server configures, through registers, the configuration information of the local FPGA accelerator card 1 as:
opcode_1: the PCIe DMA module inputs data, and the Stream module outputs data;
cal_code_1: 3 kernel modules in sequential computation mode;
dqp_1: FPGA accelerator card 2;
addr_1: a memory read address;
len_1: the data length the kernel module reads from the Memory module.
Meanwhile, the RDMA_Enhance configuration information for the remote FPGA accelerator card 2 is configured in the local FPGA accelerator card 1 as:
opcode_2: the Stream module inputs data, and the DMA module outputs data; the destination memory is the remote host memory;
cal_code_2: 2 kernel modules fetch an instruction set from the storage unit for processing;
dqp_2: FPGA accelerator card 1;
addr_2: a memory write address;
len_2: the data length the kernel module writes to the Memory module.
As shown in fig. 3, which is a schematic structural diagram of a pooling platform processing application data based on two FPGA accelerator cards according to an embodiment of the present disclosure, the two cards may be referred to as FPGA accelerator card 1 and FPGA accelerator card 2. The Kernel unit in FPGA accelerator card 1 comprises three kernel modules, namely Kernel1, Kernel2 and Kernel3; the Kernel unit in FPGA accelerator card 2 comprises two kernel modules, namely Kernel1 and Kernel2. It should be noted that the operations executed by Kernel1 of FPGA accelerator card 1 and Kernel1 of FPGA accelerator card 2 are different, as are the operations executed by Kernel2 of each card.
The reference numerals between the different modules in fig. 3 indicate the processing sequence of the application data. The processing flow of the application data includes the following steps:
(1) The compressed application data is stored from the host server into the Memory through the PCIe DMA module on FPGA accelerator card 1;
(2) Kernel1 detects that the internal DMA controller has set the completion signal to 1, and starts to read data from the Memory;
(3) Kernel1 performs the decompression calculation and transmits the result to Kernel2;
(4) Kernel2 performs the first-stage calculation of the custom algorithm model and, after the calculation is finished, transmits the result to Kernel3;
(5) Kernel3 performs the encryption calculation and sends the result to the target accelerator card, namely FPGA accelerator card 2, in stream mode based on the RDMA_Enhance protocol;
(6) The target acceleration component receives the RDMA_Enhance protocol message, parses and extracts the relevant fields, and sends them to the logic module and the kernel modules; meanwhile, the data portion of the message is sent to Kernel1;
(7) Kernel1 performs the decryption calculation and sends the result to Kernel2;
(8) Kernel2 reads a calculation instruction from the storage unit, executes the second-stage calculation of the custom algorithm model, and stores the result into the Memory;
(9) After storing the data into the Memory, Kernel2 writes 1 to the DMA internal register Memory_wr_done; the DMA then fetches the data from the Memory, organizes it into an RDMA_Enhance protocol message, and transmits it to the target acceleration component, namely the host server of FPGA accelerator card 1, thereby completing the acceleration calculation task.
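The two-card processing flow above can be mimicked in software as a rough functional model. Everything concrete here is an assumption made for illustration: zlib stands in for the decompression kernel, a simple XOR stands in for the encryption and decryption kernels, and byte-wise doubling and incrementing stand in for the two stages of the custom algorithm model; the DMA, Stream, and RDMA_Enhance transport steps are elided.

```python
import zlib

def xor_cipher(data: bytes, key: int = 0x5A) -> bytes:
    """Stand-in for the encryption/decryption kernels (XOR is its own inverse)."""
    return bytes(b ^ key for b in data)

def card1(compressed: bytes) -> bytes:
    """FPGA accelerator card 1: decompress -> first-stage calc -> encrypt."""
    data = zlib.decompress(compressed)             # Kernel1: decompression
    stage1 = bytes((b * 2) & 0xFF for b in data)   # Kernel2: first-stage calculation
    return xor_cipher(stage1)                      # Kernel3: encryption, then stream out

def card2(encrypted: bytes) -> bytes:
    """FPGA accelerator card 2: decrypt -> second-stage calc."""
    data = xor_cipher(encrypted)                   # Kernel1: decryption
    return bytes((b + 1) & 0xFF for b in data)     # Kernel2: second-stage calculation

# End-to-end pipeline: host compresses, card 1 processes, card 2 finishes.
payload = zlib.compress(b"\x01\x02\x03")
result = card2(card1(payload))
```

Running the model shows each byte doubled by the first stage and incremented by the second, matching the two-stage structure of the flow.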
In the prior art, 5 configuration operations need to be performed to execute the same accelerated computing task on the application data, respectively: (1) after the data storage of step (1) is completed, a configuration triggering the kernel calculation needs to be initiated once; (2) before step (5) starts, a configuration triggering the data movement is needed once; (3) after the operation of step (6) is completed, a configuration triggering the kernel calculation is needed once, and the Ethernet packet assembly also needs to be configured; (4) before step (8), after Kernel2 completes its calculation, a configuration triggering the data storage is needed once; (5) before step (9), a configuration triggering the data movement is needed once.
Compared with the 5 configuration operations required in the prior art, completing one round of application data processing in the embodiment of the present application requires only a single configuration: the customized RDMA_Enhance protocol simplifies the content of the RDMA protocol, and the configuration information only needs to be added to the custom field of the transmission protocol, based on the application acceleration requirement, before the application data is processed. That is, the application data can be processed with one configuration, which simplifies the internal processing logic and effectively improves the processing efficiency of the application data.
According to the above technical solution, configuration information is added to the custom field of the transmission protocol based on the application acceleration requirement; the configuration information may include an operation identifier, address information, and calculation information matching the application acceleration requirement. The operation identifier indicates the type of operation to be executed, the address information indicates the acceleration component that processes the application data, and the calculation information indicates the specific operation to be executed on the application data. Application data transmitted by the host server is received and processed according to the operation identifier and the calculation information, and the processed application data is transmitted to the acceleration component pointed to by the address information, until the processing of the application data on the different acceleration components in the pooling platform is completed, at which point the operation ends. In this technical solution, the configuration information for processing the application data is added to the transmission protocol, so that after the application data is received it can be processed directly according to the configuration information in the transmission protocol. This reduces the number of configuration interactions between acceleration components, reduces the time delay, and improves the heterogeneous acceleration performance of the pooling platform. Moreover, setting the configuration information in the custom field of the transmission protocol according to the actual application acceleration requirement simplifies the original protocol fields and the internal processing logic, further improving the processing performance.
In this embodiment of the present application, in order to implement packet loss detection on the application data, a packet sequence number may be set in the configuration information. After the FPGA accelerator card transmits the processed application data to the acceleration component pointed to by the address information, it may be judged whether the processed application data matches the packet sequence number; if the processed application data does not match the packet sequence number, packet loss prompt information carrying the missing sequence number may be fed back to the host server.
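A minimal sketch of this packet-loss check might look as follows. The function name and the list-based interface are hypothetical; a hardware implementation would track the expected sequence number in a register rather than a Python set, but the missing-number computation is the same.

```python
def check_sequence(received: list[int], expected_first: int) -> list[int]:
    """Return the packet sequence numbers missing from a run of packets.

    `received` holds the sequence numbers actually seen; any gap between
    `expected_first` and the highest number seen is reported as lost, so
    the host server can be sent packet-loss prompt information.
    """
    expected = set(range(expected_first, max(received) + 1))
    return sorted(expected - set(received))

# Packets 2 and 5 never arrived, so they are reported as missing.
missing = check_sequence([0, 1, 3, 4, 6], expected_first=0)
```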
Fig. 4 is a schematic structural diagram of a data processing apparatus of a pooling platform according to an embodiment of the present application, including an adding unit 41, a receiving unit 42, a processing unit 43, and a transmitting unit 44;
an adding unit 41, configured to add configuration information to a custom field of a transmission protocol based on an application acceleration requirement; the configuration information comprises an operation identifier matched with an application acceleration requirement, address information and calculation information;
a receiving unit 42, configured to receive application data transmitted by the host server;
a processing unit 43, configured to process the application data according to the operation identifier and the calculation information;
and the transmission unit 44 is configured to transmit the processed application data to the acceleration component pointed to by the address information, until the processing of the application data on the different acceleration components in the pooling platform is completed, at which point the operation ends.
The operation identifier is used for indicating the type of operation required to be executed, the address information is used for indicating an acceleration component for processing the application data, and the calculation information is used for indicating the specific operation required to be executed on the application data.
The acceleration components invoked by different application acceleration requirements, and the operations each acceleration component needs to perform, will vary. Therefore, for the current application acceleration requirement, the bytes of the custom field can be divided, and the configuration information can be set in the divided bytes, so that each acceleration component in the pooling platform can complete the processing of the application data by relying on the configuration information.
Optionally, the acceleration component is an application specific integrated circuit.
Optionally, the application specific integrated circuit is an FPGA accelerator card.
Optionally, in the case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA accelerator card, and at least one acceleration processing module corresponds to multi-instruction calculation, the calculation information includes an operation sequence instruction and an instruction address, where the instruction address points to the instructions required by the application acceleration requirement;
and the processing unit is configured such that the plurality of acceleration processing modules sequentially call the instructions pointed to by the instruction address, according to the operation sequence instruction, to process the application data.
Optionally, in the case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on the FPGA accelerator card, and each acceleration processing module corresponds to single-instruction calculation, the calculation information includes the instructions required by the application acceleration requirement;
and the processing unit is configured such that the plurality of acceleration processing modules process the application data according to their corresponding instructions.
Optionally, in the case that the application acceleration requirement corresponds to an acceleration processing module for internal calculation on the FPGA accelerator card, the calculation information includes an instruction address, where the instruction address points to the internal calculation instruction required by the application acceleration requirement;
and the processing unit is configured such that the acceleration processing module calls the internal calculation instruction according to the instruction address to process the application data.
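The three optional calculation-information cases above can be contrasted in a small software sketch. The instruction store, the instruction address encodings, and the lambda operations are purely illustrative assumptions; the real instructions would live in the storage unit on the accelerator card.

```python
# Hypothetical instruction store addressed by instruction address.
INSTRUCTION_STORE = {0x10: lambda x: x + 1, 0x11: lambda x: x * 2}

def run_multi_instruction(data, order, instr_addrs):
    """Multi-instruction case: modules fire in the operation sequence given by
    `order`, each calling the instruction pointed to by its instruction address."""
    for idx in order:
        data = INSTRUCTION_STORE[instr_addrs[idx]](data)
    return data

def run_single_instruction(data, instrs):
    """Single-instruction case: each module applies the one instruction carried
    directly in the calculation information."""
    for fn in instrs:
        data = fn(data)
    return data

def run_internal(data, instr_addr):
    """Internal-calculation case: one module fetches its internal calculation
    instruction via the instruction address in the calculation information."""
    return INSTRUCTION_STORE[instr_addr](data)

r1 = run_multi_instruction(3, order=[0, 1], instr_addrs=[0x10, 0x11])  # (3+1)*2
r2 = run_single_instruction(3, [lambda x: x - 1])
r3 = run_internal(3, 0x11)
```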
Optionally, in the case that the operation identifier is a remote direct data access operation identifier, the address information includes a logical address of the target acceleration component, a read-write identifier determined according to the calculation information and the remote direct data access operation identifier, and a transmission length of the remote direct data access operation.
Optionally, in the case that the operation identifier is a stream operation identifier, the address information includes a target acceleration component logical address.
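The two address-information layouts above can be sketched as a small helper. The dictionary keys and the `build_address_info` name are assumptions for illustration, standing in for the byte-level layout of the custom field.

```python
# Assumed operation-identifier values for the two cases described above.
OP_RDMA = "rdma"      # remote direct data access operation
OP_STREAM = "stream"  # stream operation

def build_address_info(op_id, target_addr, rw=None, length=None):
    """Build the address information carried in the custom field.

    RDMA operations carry the target acceleration component's logical address,
    a read/write identifier (determined from the calculation information), and
    the transmission length; stream operations carry only the logical address.
    """
    if op_id == OP_RDMA:
        return {"target": target_addr, "rw": rw, "len": length}
    if op_id == OP_STREAM:
        return {"target": target_addr}
    raise ValueError(f"unknown operation identifier: {op_id}")

info_rdma = build_address_info(OP_RDMA, 0x02, rw="write", length=4096)
info_stream = build_address_info(OP_STREAM, 0x02)
```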
Optionally, the configuration information further includes a packet sequence number, and the apparatus further comprises a judging unit and a feedback unit;
the judging unit is configured to judge whether the processed application data matches the packet sequence number;
and the feedback unit is configured to feed back packet loss prompt information carrying the missing sequence number to the host server in the case that the processed application data does not match the packet sequence number.
The description of the features in the embodiment corresponding to fig. 4 may refer to the related description of the embodiment corresponding to fig. 1, and is not repeated here.
According to the above technical solution, configuration information is added to the custom field of the transmission protocol based on the application acceleration requirement; the configuration information may include an operation identifier, address information, and calculation information matching the application acceleration requirement. The operation identifier indicates the type of operation to be executed, the address information indicates the acceleration component that processes the application data, and the calculation information indicates the specific operation to be executed on the application data. Application data transmitted by the host server is received and processed according to the operation identifier and the calculation information, and the processed application data is transmitted to the acceleration component pointed to by the address information, until the processing of the application data on the different acceleration components in the pooling platform is completed, at which point the operation ends. In this technical solution, the configuration information for processing the application data is added to the transmission protocol, so that after the application data is received it can be processed directly according to the configuration information in the transmission protocol. This reduces the number of configuration interactions between acceleration components, reduces the time delay, and improves the heterogeneous acceleration performance of the pooling platform. Moreover, setting the configuration information in the custom field of the transmission protocol according to the actual application acceleration requirement simplifies the original protocol fields and the internal processing logic, further improving the processing performance.
Fig. 5 is a structural diagram of an electronic device according to an embodiment of the present application, and as shown in fig. 5, the electronic device includes: a memory 20 for storing a computer program;
a processor 21 for implementing the steps of the data processing method of the pooling platform as described above in the previous embodiments when executing the computer program.
The electronic device provided by the embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, or a desktop computer.
The processor 21 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like. The processor 21 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 21 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in a wake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 21 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 21 may further include an AI (Artificial Intelligence) processor for processing a calculation operation related to machine learning.
The memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 20 is at least used for storing the following computer program 201, wherein after being loaded and executed by the processor 21, the computer program can implement the relevant steps of the data processing method of the pooling platform disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may also include an operating system 202, data 203, and the like, and the storage manner may be a transient storage manner or a permanent storage manner. Operating system 202 may include, among others, windows, unix, linux, and the like. Data 203 may include, but is not limited to, configuration information, and the like.
In some embodiments, the electronic device may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
Those skilled in the art will appreciate that the configuration shown in fig. 5 is not intended to be limiting of electronic devices and may include more or fewer components than those shown.
It should be understood that if the data processing method of the pooling platform in the above embodiments is implemented in the form of a software functional unit and sold or used as a standalone product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied, in whole or in part, in the form of a software product that is stored in a storage medium and executes all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable ROM, a register, a hard disk, a removable magnetic disk, a CD-ROM, a magnetic disk, or an optical disk.
Based on this, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the data processing method of the pooling platform.
The functions of the functional modules of the computer-readable storage medium according to the embodiment of the present invention may be specifically implemented according to the method in the foregoing method embodiment, and the specific implementation process may refer to the related description of the foregoing method embodiment, which is not described herein again.
The foregoing details a data processing method, an apparatus, a device, and a computer-readable storage medium for a pooling platform provided in the embodiments of the present application. The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The foregoing detailed description is directed to a method, an apparatus, a device, and a computer-readable storage medium for processing data of a pooling platform provided in the present application. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present application.

Claims (12)

1. A data processing method of a pooling platform, comprising:
adding configuration information to a custom field of a transmission protocol based on application acceleration requirements; the configuration information comprises an operation identifier, address information and calculation information which are matched with the application acceleration requirement;
receiving application data transmitted by a host server;
and processing the application data according to the operation identifier and the calculation information, transmitting the processed application data to an acceleration component pointed by the address information until the application data is processed on different acceleration components in the pooling platform, and ending the operation.
2. The data processing method of the pooling platform of claim 1, wherein said acceleration component is an application specific integrated circuit.
3. The data processing method of the pooling platform of claim 2, wherein said application specific integrated circuit is an FPGA accelerator card.
4. The data processing method of the pooling platform of claim 3, wherein in case said application acceleration requirement corresponds to a plurality of acceleration processing modules on an FPGA acceleration card, and at least one acceleration processing module corresponds to a multi-instruction computation, said computation information comprises an operation sequence instruction and an instruction address; wherein the instruction address points to an instruction required by the application acceleration requirement;
the processing the application data according to the operation identifier and the calculation information comprises:
and the plurality of accelerated processing modules sequentially call the instruction pointed by the instruction address to process the application data according to the operation sequence instruction.
5. The data processing method of the pooling platform of claim 3, wherein in a case that the application acceleration requirement corresponds to a plurality of acceleration processing modules on an FPGA acceleration card, and each acceleration processing module corresponds to a single-instruction calculation, the calculation information includes an instruction required by the application acceleration requirement;
the processing the application data according to the operation identifier and the calculation information comprises:
and the acceleration processing modules process the application data according to the corresponding instructions.
6. The data processing method of the pooling platform of claim 3, wherein in case said application acceleration requirement corresponds to an acceleration processing module for internal computation on an FPGA acceleration card, said computation information includes an instruction address; wherein the instruction address points to an internal computation instruction required by the application acceleration requirement;
the processing the application data according to the operation identifier and the calculation information comprises:
and the acceleration processing module calls an internal calculation instruction to process the application data according to the instruction address.
7. The data processing method of the pooling platform of claim 1, wherein in case that the operation identifier is a remote direct data access operation identifier, the address information includes a logical address of a target acceleration unit, a read-write identifier determined according to the calculation information and the remote direct data access operation identifier, and a transmission length of the remote direct data access operation.
8. The data processing method of the pooling platform of claim 1, wherein the address information includes a target acceleration component logical address in a case that the operation identification is a streaming operation identification.
9. The data processing method of the pooling platform of any of claims 1-8, wherein said configuration information further includes a packet sequence number;
after the transmitting the processed application data to the acceleration component pointed by the address information, the method further comprises:
judging whether the processed application data is matched with the packet sequence number;
and under the condition that the processed application data is not matched with the packet sequence number, feeding back packet loss prompt information carrying a missing sequence number to the host server.
10. A data processing apparatus of a pooling platform, characterized by comprising an adding unit, a receiving unit, a processing unit, and a transmission unit;
the adding unit is used for adding configuration information to the custom field of the transmission protocol based on the application acceleration requirement; the configuration information comprises an operation identifier, address information and calculation information which are matched with the application acceleration requirement;
the receiving unit is used for receiving the application data transmitted by the host server;
the processing unit is used for processing the application data according to the operation identifier and the calculation information;
and the transmission unit is used for transmitting the processed application data to the acceleration component pointed by the address information until the processing of the application data on different acceleration components in the pooling platform is completed, and ending the operation.
11. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the data processing method of the pooling platform of any one of claims 1 to 9 when executing the computer program.
12. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the data processing method of the pooling platform of any one of claims 1 to 9.
CN202210909198.XA 2022-07-29 2022-07-29 Data processing method, device, equipment and medium of pooling platform Pending CN115237500A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210909198.XA CN115237500A (en) 2022-07-29 2022-07-29 Data processing method, device, equipment and medium of pooling platform


Publications (1)

Publication Number Publication Date
CN115237500A true CN115237500A (en) 2022-10-25

Family

ID=83678108


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858206A * 2023-02-27 2023-03-28 联宝(合肥)电子科技有限公司 Data processing method and device, electronic equipment and storage medium
CN115858206B * 2023-02-27 2023-07-11 联宝(合肥)电子科技有限公司 Data processing method, device, electronic equipment and storage medium
WO2023231330A1 * 2022-05-31 2023-12-07 广东浪潮智慧计算技术有限公司 Data processing method and apparatus for pooling platform, device, and medium
CN117806988A * 2024-02-29 2024-04-02 山东云海国创云计算装备产业创新中心有限公司 Task execution method, task configuration method, board card and server

Similar Documents

Publication Publication Date Title
CN115237500A (en) Data processing method, device, equipment and medium of pooling platform
EP3496007B1 (en) Device and method for executing neural network operation
CN108268328B (en) Data processing device and computer
CN109739786B (en) DMA controller and heterogeneous acceleration system
CN111651384B (en) Register reading and writing method, chip, subsystem, register set and terminal
WO2016115831A1 (en) Fault tolerant method, apparatus and system for virtual machine
US10909655B2 (en) Direct memory access for graphics processing unit packet processing
US9703603B1 (en) System and method for executing accelerator call
CN109040210B (en) Communication method between applications, terminal equipment and storage medium
CN114095427A (en) Method and network card for processing data message
US11868297B2 (en) Far-end data migration device and method based on FPGA cloud platform
US11182150B2 (en) Zero packet loss upgrade of an IO device
CN110380992A (en) Message processing method, device and network flow acquire equipment
US9594702B2 (en) Multi-processor with efficient search key processing
CN116627888A (en) Hardware computing module, device, method, electronic device, and storage medium
CN108829530B (en) Image processing method and device
CN117118828A (en) Protocol converter, electronic equipment and configuration method
CN112422485A (en) Communication method and device of transmission control protocol
WO2023231330A1 (en) Data processing method and apparatus for pooling platform, device, and medium
CN103973581A (en) Method, device and system for processing message data
CN110647355B (en) Data processor and data processing method
CN113472523A (en) User mode protocol stack message processing optimization method, system, device and storage medium
CN115878521B (en) Command processing system, electronic device and electronic equipment
US20240048543A1 (en) Encryption acceleration for network communication packets
US20230350720A1 (en) Chaining Services in an Accelerator Device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination