CN117579570B - A PCIe-based data transmission method, device and system - Google Patents

A PCIe-based data transmission method, device and system

Info

Publication number
CN117579570B
Authority
CN
China
Prior art keywords
data
descriptor
data packets
packets
packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311617177.1A
Other languages
Chinese (zh)
Other versions
CN117579570A (en)
Inventor
秦向东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yusur Technology Co ltd
Original Assignee
Yusur Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yusur Technology Co ltd
Priority to CN202311617177.1A
Publication of CN117579570A
Application granted
Publication of CN117579570B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00: Traffic control in data switching networks
    • H04L47/10: Flow control; Congestion control
    • H04L47/28: Flow control; Congestion control in relation to timing considerations
    • H04L47/283: Flow control; Congestion control in relation to timing considerations in response to processing delays, e.g. caused by jitter or round trip time [RTT]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00: Traffic control in data switching networks
    • H04L47/10: Flow control; Congestion control
    • H04L47/36: Flow control; Congestion control by determining packet size, e.g. maximum transfer unit [MTU]
    • H04L47/365: Dynamic adaptation of the packet size
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Communication Control (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a PCIe-based data transmission method, device, and system. The method is executed on the device side, where a host-side driver identifies the multiple ports of the device so that each port is matched with a corresponding data buffer. The method includes: segmenting each data packet into small data packets; a descriptor generator generating a descriptor for each small data packet; a descriptor arbiter polling the channel corresponding to each port to detect descriptors; passing each detected pending descriptor to a DMA controller, which generates control information and sends it to a data grabber and a data transmitter; the data grabber reading the small data packet from the corresponding channel and passing it to the data transmitter; and the data transmitter sending the small data packet to the host-side data buffer for the driver to process. The invention simplifies the transmission flow and reduces data transmission latency.

Description

PCIe-based data transmission method, device and system
Technical Field
The present invention relates to the field of computer communications technologies, and in particular, to a PCIe-based data transmission method, apparatus, and system.
Background
As industries such as financial securities, data centers, and 5G place ever stricter demands on network latency, traditional network cards can no longer meet service-processing requirements, so network latency must be specifically optimized to accelerate these services.
DMA (Direct Memory Access) technology allows external devices to access computer memory directly without intervention by the central processing unit (CPU). It improves data transmission efficiency, reduces latency, and relieves the CPU so that it can execute other tasks, which improves the overall efficiency of the computer system.
In conventional DMA, the Host generates descriptor information according to the data transmission requirement and notifies the device. The device fetches the descriptor information from a designated location in Host memory, and a DMA controller moves data between Host memory and the device according to the destination address, source address, length, and other information provided by the descriptor.
Compared with conventional DMA, low-latency DMA reduces latency in two main ways. One is to simplify or optimize steps in the DMA transfer flow, for example by having the Host place descriptors directly at a designated location in the device through PIO (programmed I/O). The other is to reduce the number of times data is copied in memory, for example with zero-copy techniques. DMA in the C2H (card-to-host) direction requires the Host to pre-fill descriptors that tell the device how to organize the data and send it to a specified location in Host memory. Moreover, network packets received on multiple port channels must all be placed at their corresponding memory locations in this way, so packet latency is relatively large and significant jitter arises easily. When traffic is heavy, network packets cannot be processed in time, which causes packet loss.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention provide a PCIe-based data transmission method, apparatus, and system that obviate or mitigate one or more disadvantages in the prior art.
One aspect of the present invention provides a PCIe-based data transmission method executed on the device side, wherein a plurality of ports on the device side are identified, configured, and enabled by a driver on the host side, and each port is matched with a data buffer allocated in advance by the host side, the method comprising the following steps:
segmenting the data packet to be sent by each port according to a preset rule to obtain small data packets, and invoking a descriptor generator to generate a descriptor for each small data packet;
invoking a descriptor arbiter to poll the channel corresponding to each port to detect and acquire descriptors, and passing each detected descriptor to a DMA controller;
invoking the DMA controller to send control information to a data grabber and a data transmitter, instructing the data grabber to read the small data packet from the corresponding channel and forward it to the data transmitter; and
invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read the small data packets in the data buffer one by one and aggregate them.
In some embodiments of the invention, the method further comprises:
the data transmitter sends feedback information to the DMA controller, and the feedback information is passed on to the descriptor arbiter; and
the descriptor arbiter receives the feedback information and continues polling the channels to acquire the next descriptor.
In some embodiments of the present invention, each small data packet includes a data field and a tag field; the data field contains the valid data, and the tag field contains the packet ID, a timestamp, the length of the valid data, packet statistics, and hardware status information.
In some embodiments of the present invention, invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read and aggregate them one by one, further comprises:
the data transmitter sends interrupt information to the host side, and the driver reads and aggregates the small data packets in the data buffer one by one in response to the interrupt information.
In some embodiments of the present invention, the interrupt on the host side is triggered either as a default interrupt or by interrupt aggregation.
In some embodiments of the present invention, invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read and aggregate them one by one, further comprises:
the driver polls the data buffer to obtain the small data packets, and reads and aggregates them one by one.
In some embodiments of the present invention, invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read and aggregate them one by one, further comprises:
the data transmitter places the small data packets into consecutive basic units of the data buffer in a preset order; and
when the driver processes the small data packets in the data buffer, it takes them out one by one in order and applies offset processing.
Another aspect of the invention provides a PCIe-based data transfer device comprising a plurality of ports, a descriptor generator, a descriptor arbiter, a DMA controller, a data grabber, and a data transmitter;
the descriptor generator is used for generating a descriptor for each small data packet;
the descriptor arbiter is used for polling the channel corresponding to each port to detect and acquire descriptors, and for passing each detected descriptor to the DMA controller;
the DMA controller is used for sending control information to the data grabber and the data transmitter, instructing the data grabber to read the small data packet from the corresponding channel and forward it to the data transmitter;
the data grabber is used for reading the small data packet from the corresponding channel according to the control information and passing it to the data transmitter; and
the data transmitter is used for sending the small data packets to the host side and storing them in the data buffer matched with their port, so that a driver can read the small data packets in the data buffer one by one and aggregate them.
Another aspect of the invention provides a PCIe-based data transmission system comprising a processor and a memory, the memory storing computer instructions and the processor being configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the system implements the steps of any one of the methods described above.
Another aspect of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the methods described above.
The beneficial effects of the invention at least comprise:
The invention provides a PCIe-based data transmission method, device, and system. The method is executed on the device side, whose ports are identified by a host-side driver so that each is matched with a corresponding data buffer. On the device side, each data packet is segmented into small data packets, and a descriptor generator is invoked to generate a descriptor for each small data packet. A descriptor arbiter polls the channel corresponding to each port to detect descriptors and passes them to a DMA controller, which generates control information and sends it to a data grabber and a data transmitter. The data grabber reads the small data packets from the corresponding channels and passes them to the data transmitter, and the data transmitter sends them to the host-side data buffers for the driver to process. Because the host does not need to generate descriptors, the method removes host-side steps from the DMA flow, simplifies the transmission flow, reduces DMA latency, and improves overall system efficiency. Combining small data packets with fair polling of descriptors prevents large packets or high-priority channel data from monopolizing PCIe. Because each data packet is segmented into small data packets, transmission can begin in units of small data packets before the device has received the whole packet. This enables pipelined transmission and avoids data backlog, which reduces the load on the PCIe bus and the processor system at any given moment and improves PCIe bus utilization.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the above-described specific ones, and that the above and other objects that can be achieved with the present invention will be more clearly understood from the following detailed description.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate and together with the description serve to explain the application. In the drawings:
FIG. 1 is a flowchart of a PCIe-based data transmission method according to an embodiment of the invention.
FIG. 2 is a schematic structural diagram of a PCIe-based data transmission system according to another embodiment of the present invention.
FIG. 3 is a flowchart of the initialization configuration of a PCIe-based data transmission method according to another embodiment of the present invention.
FIG. 4 is a flowchart of the default execution flow of a PCIe-based data transmission method according to another embodiment of the invention.
FIG. 5 is a flowchart of a PCIe-based data transmission method according to another embodiment of the present invention as executed in a practical scenario.
FIG. 6 (a) and (b) are schematic diagrams of the small data packet format according to another embodiment of the present invention.
FIG. 7 is a diagram of the relationship between a large packet and small packets according to another embodiment of the present invention.
FIG. 8 is a diagram of example descriptors according to another embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments and the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. The exemplary embodiments of the present invention and the descriptions thereof are used herein to explain the present invention, but are not intended to limit the invention.
It should be noted here that, in order to avoid obscuring the present invention due to unnecessary details, only structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, while other details not greatly related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising" when used herein is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
It is also noted herein that the term "coupled" may refer to not only a direct connection, but also an indirect connection in which an intermediate is present, unless otherwise specified.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In the drawings, the same reference numerals represent the same or similar components, or the same or similar steps.
PCIe (Peripheral Component Interconnect Express) is a computer bus standard for connecting various hardware devices, such as video cards, network adapters, storage devices, etc., to a computer motherboard. PCIe provides a higher data transfer rate than a conventional PCI (Peripheral Component Interconnect) bus. PCIe is a point-to-point connection architecture in which each device is directly connected to a PCIe slot on the motherboard, rather than sharing a bus, which helps to improve data transfer efficiency.
DMA (Direct Memory Access) is a data transfer technique in computer systems that allows external devices to access system memory directly, without direct intervention by the central processing unit (CPU). The main purpose of DMA is to increase the data transfer rate and relieve the CPU so that it can perform other tasks. In practice, DMA is typically managed by a dedicated hardware controller.
One aspect of an embodiment of the present invention provides a PCIe-based data transmission method executed on the device side. A plurality of ports on the device side are identified, configured, and enabled by a driver on the host side, and each port is matched with a data buffer allocated in advance by the host side. The method includes steps S101 to S104:
Step S101: segment the data packet to be sent by each port according to a preset rule to obtain small data packets, and invoke a descriptor generator to generate a descriptor for each small data packet.
Segmenting each data packet into small data packets prevents a large data packet from monopolizing the PCIe bus for a long time during transmission, which keeps transmission efficient.
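As a non-limiting illustration, the segmentation in step S101 can be sketched in Python as follows. The function name segment_packet and the 48-byte chunk size are assumptions made for this example only; the embodiment states only that segmentation follows a preset rule.

```python
from typing import List


def segment_packet(payload: bytes, chunk_size: int = 48) -> List[bytes]:
    """Split one large packet into fixed-size small-packet payloads.

    chunk_size is a hypothetical per-small-packet data capacity.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # An empty packet still yields one empty chunk so that it stays visible.
    if not payload:
        return [b""]
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


chunks = segment_packet(bytes(100), chunk_size=48)
print([len(c) for c in chunks])  # [48, 48, 4]
```

Only the last chunk may be shorter than the chunk size, which is why a separate valid-length value is needed on the receiving side.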
A descriptor is an abstract data structure that describes the attributes, location, and other information of data. A data packet is the logical unit of transmitted data, and a descriptor is the means of managing and controlling that data. A descriptor may contain a pointer to the data packet or meta-information about it. In this embodiment, the descriptor specifies the location, size, and other relevant information of a small data packet.
Step S102: invoke a descriptor arbiter to poll the channel corresponding to each port to detect and acquire descriptors, and pass each detected descriptor to a DMA controller.
Polling is a technique for acquiring or checking information. In this embodiment, descriptors arrive at the descriptor arbiter as requests through their respective descriptor channels. The arbiter polls each channel in turn and, if the channel currently being polled holds a descriptor, takes the descriptor out so that the transfer can be performed. Other channels may hold valid descriptors at that moment, but their transfers can only be performed once the arbiter's polling reaches them.
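The polling behaviour described above can be illustrated with a small software model. This is a sketch under the assumption that channels are visited in a fixed circular (round-robin) order; the class name RoundRobinArbiter and its methods are hypothetical, and the actual arbiter is device logic rather than software.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class RoundRobinArbiter:
    """Polls per-port descriptor channels in a fixed circular order."""

    def __init__(self, num_channels: int) -> None:
        self.channels: List[Deque[str]] = [deque() for _ in range(num_channels)]
        self._next = 0  # channel to inspect first on the next poll

    def submit(self, channel: int, descriptor: str) -> None:
        self.channels[channel].append(descriptor)

    def poll(self) -> Optional[Tuple[int, str]]:
        """Return (channel, descriptor) from the first non-empty channel, or None."""
        n = len(self.channels)
        for step in range(n):
            ch = (self._next + step) % n
            if self.channels[ch]:
                # Resume after the served channel so that a busy channel
                # cannot be served twice while another channel is waiting.
                self._next = (ch + 1) % n
                return ch, self.channels[ch].popleft()
        return None


arb = RoundRobinArbiter(3)
for d in ("a0", "a1", "a2"):
    arb.submit(0, d)
arb.submit(2, "c0")
order = []
while (item := arb.poll()) is not None:
    order.append(item)
print(order)  # [(0, 'a0'), (2, 'c0'), (0, 'a1'), (0, 'a2')]
```

In the example, channel 2 is served after only one descriptor from channel 0, even though channel 0 has three descriptors queued.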
Step S103: invoke the DMA controller to send control information to the data grabber and the data transmitter, instructing the data grabber to read the small data packet from the corresponding channel and forward it to the data transmitter.
Step S104: invoke the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read the small data packets in the data buffer one by one and aggregate them.
The driver of the host refers to a device driver in the computer system, and is responsible for communication and coordination with the external device, translating a request of the operating system into an instruction which can be understood by hardware, and translating a response of the external device into data which can be processed by the operating system. The drivers provide interfaces to the operating system for external devices so that applications can access device functions through standard system calls or APIs (application program interfaces).
In some embodiments of the invention, the method further comprises:
The data transmitter sends feedback information to the DMA controller, and the feedback information is passed on to the descriptor arbiter.
The descriptor arbiter receives the feedback information and continues to poll each channel to obtain the next descriptor.
In some embodiments of the present invention, each small data packet includes a data field and a tag field. The data field (DATA field) holds the actual data content; the actual length of the valid data may be smaller than the length of the DATA field. The tag field (TAG field) holds meta-information about the packet, including the packet ID, a timestamp, the length of the valid data, packet statistics, and hardware status information, and may carry other information that supports additional functions.
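To make the DATA/TAG split concrete, the following sketch packs and unpacks one small data packet with Python's struct module. The layout is an assumption chosen for illustration: a 48-byte DATA field followed by a 16-byte TAG field (32-bit packet ID, 16-bit valid length, 16-bit flags, 64-bit timestamp), little-endian, 64 bytes in total. The source does not specify field widths or ordering.

```python
import struct

DATA_LEN = 48                      # assumed DATA field size in bytes
# DATA | packet_id | valid_len | flags | timestamp (all assumed widths)
FRAME = struct.Struct("<48sIHHQ")
LAST_FLAG = 0x1                    # assumed flag bit: last small packet


def pack_small_packet(packet_id: int, data: bytes, last: bool,
                      timestamp: int = 0) -> bytes:
    if len(data) > DATA_LEN:
        raise ValueError("data does not fit in the DATA field")
    flags = LAST_FLAG if last else 0
    # struct zero-pads the 48s field when data is shorter than 48 bytes.
    return FRAME.pack(data, packet_id, len(data), flags, timestamp)


def unpack_small_packet(frame: bytes) -> dict:
    data, packet_id, valid_len, flags, timestamp = FRAME.unpack(frame)
    return {
        "packet_id": packet_id,
        "data": data[:valid_len],  # drop the padding using the TAG length
        "last": bool(flags & LAST_FLAG),
        "timestamp": timestamp,
    }


frame = pack_small_packet(7, b"hello", last=True, timestamp=123)
print(len(frame), unpack_small_packet(frame)["data"])  # 64 b'hello'
```

The valid-length value in the TAG is what lets the receiver distinguish payload from padding when the DATA field is only partly filled.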
In some embodiments of the present invention, invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read and aggregate them one by one, further comprises:
The data transmitter sends interrupt information to the host side, and the driver reads and aggregates the small data packets in the data buffer one by one in response to the interrupt information.
Interrupt information is a hardware- or software-generated signal that breaks the normal program execution flow and causes the processor to execute a specific interrupt service routine instead. Interrupts may be triggered by external devices, program errors, or other events that require the processor's attention. In this embodiment, the interrupt information is used by the device side to notify the host side that small data packets sent by the device are ready to be processed.
In some embodiments of the present invention, the interrupt on the host side is triggered either as a default interrupt or by interrupt aggregation.
A default interrupt is the interrupt triggering mode preset, or used by default, on the host side.
Interrupt aggregation merges multiple interrupt events into a single interrupt event. Rather than handling each interrupt individually, the system merges similar or related interrupts into one, which reduces interrupt frequency and processing overhead. This improves system efficiency, especially under a large number of interrupt events.
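A count-based form of interrupt aggregation can be sketched as follows. The class name InterruptCoalescer and the threshold parameter are hypothetical; the source does not say how aggregation is triggered, and practical schemes often add a timeout so that a small number of pending completions is not delayed indefinitely.

```python
class InterruptCoalescer:
    """Raises one interrupt per `threshold` completed transfers."""

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.pending = 0   # completions not yet signalled to the host
        self.raised = 0    # interrupts raised so far

    def on_transfer_complete(self) -> bool:
        """Record one completion; return True if an interrupt is raised now."""
        self.pending += 1
        if self.pending >= self.threshold:
            self.pending = 0
            self.raised += 1
            return True
        return False


coalescer = InterruptCoalescer(threshold=4)
fired = [coalescer.on_transfer_complete() for _ in range(10)]
print(coalescer.raised, coalescer.pending)  # 2 2
```

With a threshold of 1 the model degenerates to one interrupt per transfer, which is one plausible reading of the default interrupt mode.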
In some embodiments of the present invention, invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read and aggregate them one by one, further comprises:
The driver polls the data buffer to obtain the small data packets, and reads and aggregates them one by one.
In some embodiments of the present invention, invoking the data transmitter to send the small data packets to the host side and store them in the data buffer matched with their port, so that the driver can read and aggregate them one by one, further comprises:
The data transmitter places the small data packets into consecutive basic units of the data buffer in a preset order.
When the driver processes the small data packets in the data buffer, it takes them out one by one in order and applies offset processing.
Offset processing adjusts the offset of the data packet. The offset is the position of a specific portion of the data packet relative to the start of the entire data packet.
In some embodiments of the present invention, the length field of the descriptor corresponding to each small data packet is 64B, while the length of the valid data is recorded in the tag field (TAG field) and parsed by the driver. The source address and destination address of a descriptor are each offset by 64B from the source address and destination address of the adjacent descriptor; that is, consecutive small data packets are offset by 64B.
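The 64B address stride between adjacent descriptors can be illustrated with a minimal descriptor generator. The Descriptor fields and the function name generate_descriptors are assumptions for this sketch, as are the base addresses in the usage example; the source states only that adjacent descriptors differ by 64B in both source and destination address.

```python
from dataclasses import dataclass
from typing import List

UNIT = 64  # bytes per small data packet, as stated in the text


@dataclass(frozen=True)
class Descriptor:
    src_addr: int        # where the small packet sits on the device side
    dst_addr: int        # where it should land in the host data buffer
    length: int = UNIT   # fixed transfer length


def generate_descriptors(src_base: int, dst_base: int, count: int) -> List[Descriptor]:
    """Build `count` descriptors whose addresses advance by one unit each."""
    return [
        Descriptor(src_base + i * UNIT, dst_base + i * UNIT)
        for i in range(count)
    ]


for d in generate_descriptors(0x1000, 0x8000, 3):
    print(hex(d.src_addr), hex(d.dst_addr), d.length)
```

Because the stride is fixed, a descriptor can be derived from the packet's index alone, which fits the idea of generating descriptors on the device without host involvement.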
Another aspect of embodiments of the invention provides a PCIe-based data transfer device comprising a plurality of ports, a descriptor generator, a descriptor arbiter, a DMA controller, a data grabber, and a data transmitter.
The descriptor generator is used for generating a descriptor for each small data packet.
The descriptor arbiter is used for polling the channel corresponding to each port to detect and acquire descriptors, and for passing each detected descriptor to the DMA controller.
The DMA controller is used for sending control information to the data grabber and the data transmitter, instructing the data grabber to read the small data packets from the corresponding channels and forward them to the data transmitter.
The data grabber is used for reading the small data packets from the corresponding channels according to the control information and passing them to the data transmitter.
The data transmitter is used for sending the small data packets to the host side and storing them in the data buffer matched with their port, so that the driver can read the small data packets in the data buffer one by one and aggregate them.
Another aspect of the embodiments of the invention provides a PCIe-based data transmission system comprising a processor and a memory, the memory storing computer instructions and the processor being configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the system implements the steps of the method of any of the embodiments described above.
Another aspect of an embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of any of the embodiments described above.
Another embodiment of the present invention provides a PCIe-based data transmission method, apparatus, and system, where the specific implementation manner is as follows:
The overall architecture of this embodiment is shown schematically in FIG. 2. The part to the left of PCIe resides within the device and includes a descriptor generator (Generator), a descriptor arbiter (Arbiter), a DMA controller (Controller), a data grabber (Data Fetcher), and a data transmitter (Transmitter). The descriptor generator generates descriptors for the small data packets segmented from each data packet and supplies the descriptors to the descriptor arbiter. The descriptor arbiter polls each channel for descriptors and forwards them to the DMA controller. The DMA controller sends control information to the data grabber and the data transmitter. The data grabber fetches data from the corresponding channel according to the control information from the DMA controller. The data transmitter sends the small data packet returned by the data grabber over PCIe to the corresponding address or queue position in Host memory, according to the control information from the DMA controller.
The part to the right of PCIe is the Host side. Data buffers (queues, channels, etc.) in memory store the small data packets sent by the device. The correspondence between buffers and channels in the device is determined according to the actual situation, and a queue may correspond to a specific user application, virtual machine, or similar object according to some mapping rule. The driver (Driver) configures the device during the initialization phase and, during the actual working phase, further processes the packets in the buffer in response to interrupts or by polling.
The implementation is divided into two major phases, an initialization configuration phase and an actual execution phase.
The initialization configuration phase is shown in FIG. 3. Its main action is that, after the driver loads and identifies the board, it allocates memory resources for the device according to the number of ports (physical or virtual) of the device.
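The per-port allocation can be modelled in a few lines. This is only a host-memory stand-in: the function name, the 256-unit depth, and the use of a bytearray are assumptions for illustration, whereas a real driver would allocate DMA-capable memory and program the buffer addresses into the device.

```python
from typing import Dict


def allocate_port_buffers(num_ports: int, units_per_port: int = 256,
                          unit_size: int = 64) -> Dict[int, bytearray]:
    """Create one zero-filled data buffer per port (physical or virtual)."""
    if num_ports < 0:
        raise ValueError("num_ports must be non-negative")
    return {port: bytearray(units_per_port * unit_size)
            for port in range(num_ports)}


buffers = allocate_port_buffers(num_ports=4)
print(len(buffers), len(buffers[0]))  # 4 16384
```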
For the actual execution phase, the default execution flow after initialization configuration is shown in FIG. 4. The arbiter (Arbiter) continually polls each channel and, when it detects a pending descriptor, sends it to the DMA controller. The DMA controller sends control information to the data grabber and the data transmitter based on the descriptor. The data grabber reads the data from the corresponding channel according to the control information and sends the returned data to the data transmitter. The data transmitter writes the data to the corresponding location in memory, then sends completion feedback to the arbiter and notifies the interrupt-sending logic to send an interrupt to the Host. On receiving the feedback, the arbiter determines that the transfer is complete and checks whether another descriptor is pending. After the driver receives the interrupt, it fetches the data from the corresponding location in memory and processes it further.
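The default flow can be traced end to end with a small software simulation. It is a sketch under several assumptions: each channel is modelled as a queue of 64-byte frames, one frame in a channel stands for one pending descriptor, channels are visited round-robin, and one interrupt is counted per completed transfer. The function name run_c2h and the control-information dictionary are hypothetical.

```python
from collections import deque
from typing import Deque, List

UNIT = 64  # bytes moved per transfer, matching the 64B unit in the text


def run_c2h(channels: List[Deque[bytes]], host_buffers: List[bytearray]) -> int:
    """Drain every channel into its host buffer; return the interrupt count."""
    n = len(channels)
    write_pos = [0] * n   # next free offset in each host buffer
    interrupts = 0
    ch = 0                # channel the arbiter inspects next
    idle = 0              # consecutive empty channels seen
    while idle < n:
        if channels[ch]:
            idle = 0
            # Arbiter to DMA controller: derive the control information.
            control = {"channel": ch, "dst": write_pos[ch], "length": UNIT}
            # Data grabber: read the frame from the selected channel.
            frame = channels[control["channel"]].popleft()
            if len(frame) != control["length"]:
                raise ValueError("frame must be exactly UNIT bytes")
            # Data transmitter: write the frame into the port's host buffer.
            dst = control["dst"]
            if dst + UNIT > len(host_buffers[ch]):
                raise OverflowError("host buffer full")
            host_buffers[ch][dst:dst + UNIT] = frame
            write_pos[ch] += UNIT
            # Completion feedback returns to the arbiter; count one interrupt.
            interrupts += 1
        else:
            idle += 1
        ch = (ch + 1) % n
    return interrupts


chans = [deque([b"A" * UNIT, b"B" * UNIT]), deque([b"C" * UNIT])]
bufs = [bytearray(2 * UNIT), bytearray(2 * UNIT)]
print(run_c2h(chans, bufs))  # 3
```

The loop ends once a full pass over the channels finds no pending work, whereas the hardware arbiter would keep polling.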
In a practical application environment, the data volume and the frequency of packet transmission are uncertain. For example, frequent interrupt handling under a large data volume may consume substantial CPU resources and reduce data transmission efficiency. The driver can therefore configure the relevant registers to choose whether interrupts are sent and how often (interrupt aggregation), or the Host side can use purely software-based interrupt aggregation or process data by polling; the corresponding execution relationships are shown in FIG. 5. The dotted-line path in the figure can use the default interrupt mode, purely software-based interrupt aggregation, and so on; the solid-line path represents the driver polling for packets by itself; a combination of interrupts and polling is also possible. In practice, these modes can be combined or dynamically configured according to the needs of the actual scenario.
The format of the data packet carried by each DMA transfer is shown in fig. 6 (a). The data packets are obtained by segmenting a complete network packet (a large data packet) received from the Ethernet network; the relationship between the two is shown in fig. 7. Each data packet has its own descriptor. The length field of the descriptor is fixed at 64B, while the length of the valid data is recorded in the TAG and parsed by the driver. As shown in fig. 8, the source and destination addresses of a descriptor are offset from those of the adjacent descriptor by 64B, that is, consecutive data packets are spaced 64B apart. The benefit of segmentation is that a large data packet cannot monopolize the PCIe bus for a long time, so that possibly real-time data on other channels is handled promptly. Each data packet consists of a data field (DATA field) and a tag field (TAG field), both of fixed length. The length of the valid data in the DATA field may be smaller than the length of the DATA field itself. The TAG field contains the ID of the data packet, the length of the valid data, and a flag bit indicating whether the data packet is the last one; it may also carry further information including, but not limited to, a timestamp, packet statistics, and hardware status. As shown in fig. 6 (b), each DMAC2H places data packets by DMA, in order, into consecutive queue elements of the data buffer, and the data buffer thereby aggregates them into large data packets. The driver takes the data packets out of the buffer one by one and, after applying offset processing to their data, can deliver the result as a single large data packet to the corresponding application program.
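Segmentation and driver-side aggregation can be sketched as below. Only the 64B packet size comes from the text; the split into a 56B DATA field followed by an 8B TAG field, and the TAG layout `TAG_FORMAT` (packet ID, valid length, flags, padding), are assumptions made for illustration, since the field widths of fig. 6 (a) are not given here.

```python
import struct

PACKET_SIZE = 64                        # per-packet size stated in the text
TAG_FORMAT = "<HHB3x"                   # assumed: uint16 ID, uint16 valid length, uint8 flags, 3 pad bytes
TAG_SIZE = struct.calcsize(TAG_FORMAT)  # 8 bytes under this assumption
DATA_SIZE = PACKET_SIZE - TAG_SIZE      # 56 bytes under this assumption
LAST_FLAG = 0x01                        # assumed flag bit marking the last packet


def segment(large_packet):
    """Split a large data packet into fixed-size data packets (DATA + TAG)."""
    chunks = [large_packet[i:i + DATA_SIZE]
              for i in range(0, len(large_packet), DATA_SIZE)] or [b""]
    packets = []
    for pkt_id, chunk in enumerate(chunks):
        flags = LAST_FLAG if pkt_id == len(chunks) - 1 else 0
        data = chunk.ljust(DATA_SIZE, b"\x00")   # pad the DATA field to its fixed length
        tag = struct.pack(TAG_FORMAT, pkt_id, len(chunk), flags)
        packets.append(data + tag)
    return packets


def reassemble(buffer):
    """Driver side: walk consecutive 64B queue elements and rebuild the large packet."""
    out = bytearray()
    for off in range(0, len(buffer), PACKET_SIZE):
        element = buffer[off:off + PACKET_SIZE]
        _pkt_id, valid_len, flags = struct.unpack(TAG_FORMAT, element[DATA_SIZE:])
        out += element[:valid_len]               # keep only the valid bytes of the DATA field
        if flags & LAST_FLAG:
            break                                # last data packet of this large packet
    return bytes(out)
```

For example, a 150-byte payload yields three 64B data packets under these assumptions (56, 56 and 38 valid bytes), and `reassemble(b"".join(segment(payload)))` returns the original payload.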
In practical applications of the DMAC2H scheme, descriptors are generated by a descriptor generator located adjacent to the DMAC2H logic inside the device, which reduces the delay of the descriptor phase of the DMA transfer flow to single-digit nanoseconds. In addition, segmenting the data simplifies the DMAC2H logic to some extent, which reduces the delay of the whole transfer process. At the same time, large data packets are placed into the Host-side buffer sequentially, one data packet at a time, which achieves packet aggregation.
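Because every data packet has the same size, descriptor generation reduces to simple address arithmetic, which is one reason it can sit next to the DMAC2H logic. The sketch below is an illustration only: the `Descriptor` fields and the function `generate_descriptors` are hypothetical names, and the real descriptor format is not specified in the text beyond the 64B length and 64B address offset.

```python
from dataclasses import dataclass

PACKET_SIZE = 64  # fixed descriptor length and address stride described in the text


@dataclass(frozen=True)
class Descriptor:
    src: int     # assumed: device-side address of the data packet
    dst: int     # assumed: Host-side address inside the port's data buffer
    length: int  # always PACKET_SIZE; the valid length travels in the TAG


def generate_descriptors(src_base, dst_base, packet_count):
    """Yield one descriptor per data packet, each offset 64B from its neighbor."""
    for i in range(packet_count):
        offset = i * PACKET_SIZE
        yield Descriptor(src=src_base + offset,
                         dst=dst_base + offset,
                         length=PACKET_SIZE)
```

For instance, `list(generate_descriptors(0x1000, 0x8000, 3))` produces destination addresses 0x8000, 0x8040 and 0x8080.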
Corresponding to the above method, the present invention also provides a system comprising a computer device. The computer device comprises a processor and a memory; the memory stores computer instructions, and the processor is configured to execute the computer instructions stored in the memory. When the computer instructions are executed by the processor, the apparatus/system implements the steps of the method described above.
The embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the PCIe-based data transmission method described above. The computer readable storage medium may be a tangible storage medium such as random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a floppy disk, a hard disk, a removable storage disk, a CD-ROM, or any other form of storage medium known in the art.
In summary, the present invention provides a PCIe-based data transmission method, apparatus and system. The method is performed at the device side, where a plurality of device-side ports are identified by a host-side driver and matched with corresponding data buffers. The method comprises: segmenting a large data packet into data packets; generating, by a descriptor generator, descriptors according to the data packets; polling, by a descriptor arbiter, the channel corresponding to each port to detect the descriptors and transmitting them to a DMA controller; generating, by the DMA controller, control information and sending it to a data grabber and a data transmitter; reading, by the data grabber, the data packets from the corresponding channels and passing them to the data transmitter; and transmitting, by the data transmitter, the data packets to the host-side data buffers for processing by the driver. The invention simplifies the DMA transfer flow and reduces data transmission delay. Segmenting a whole large data packet into small data packets enables pipelined transmission, avoids data backlog, reduces the load on the PCIe bus and on the overall processor system, and improves PCIe bus utilization.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein can be implemented as hardware, software, or a combination of both. Whether a particular implementation is realized in hardware or software depends on the specific application of the solution and its design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine readable medium, or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave.
It should be understood that the invention is not limited to the particular configurations and processes described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. The method processes of the present invention are not limited to the specific steps described and shown; those skilled in the art may make various changes, modifications and additions, or change the order of the steps, after appreciating the spirit of the present invention.
In this disclosure, features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. A PCIe-based data transmission method, wherein the method is executed at a device side, a plurality of ports of the device side are identified and configured to be enabled by a driver of a host side, each port is matched with a data buffer applied for in advance by the host side, and the method comprises the following steps:
segmenting a large data packet to be sent by each port according to a preset rule to obtain data packets, and invoking a descriptor generator to generate descriptors according to the data packets, wherein each data packet comprises a data field and a tag field, the data field comprises valid data information, and the tag field comprises a data packet ID, a time stamp, the length of the valid data, packet statistics and hardware state information;
invoking a descriptor arbiter to poll the channel corresponding to each port to detect and acquire the descriptor, and transmitting the detected descriptor to a DMA controller;
invoking the DMA controller to send control information to a data grabber and a data transmitter, instructing the data grabber to read the data packet from the corresponding channel and forward the data packet to the data transmitter;
invoking the data transmitter to transmit the data packets to the host side and store them in the data buffer matched with the port corresponding to the data packets, so that the driver can read and aggregate the data packets in the data buffer one by one, wherein the data transmitter transmits feedback information to the DMA controller, the DMA controller transmits the feedback information to the descriptor arbiter, and the descriptor arbiter, after receiving the feedback information, continues to poll each channel to acquire the next descriptor, and wherein the reading and aggregating modes comprise:
the data transmitter transmits interrupt information to the host, and the driver reads and aggregates the data packets in the data buffer area one by one according to the interrupt information;
or the driver polls the data buffer to acquire the data packets, and reads and aggregates the data packets in the data buffer one by one;
or the data transmitter places the data packets into consecutive basic units in the data buffer according to a preset sequence, and the driver, when processing the data packets in the data buffer, takes them out one by one in order and performs offset processing.
2. The PCIe-based data transmission method according to claim 1, wherein, when the data transmitter transmits interrupt information to the host side, the interrupt is triggered at the host side either in the default interrupt mode or by interrupt aggregation.
3. A PCIe-based data transmission device is characterized by comprising a plurality of ports, a descriptor generator, a descriptor arbiter, a DMA controller, a data grabber and a data transmitter;
The descriptor generator is used for generating a descriptor according to a data packet, wherein the data packet comprises a data field and a tag field, the data field comprises effective data information, and the tag field comprises a data packet ID, a time stamp, the length of the effective data, packet statistics and hardware state information;
The descriptor arbiter is used for polling the channel corresponding to each port to detect and acquire the descriptor, and transmitting the detected descriptor to the DMA controller;
The DMA controller is used for sending control information to the data grabber and the data transmitter, instructing the data grabber to read the data packet from the corresponding channel and forwarding the data packet to the data transmitter;
the data grabber is used for reading the data packet from the corresponding channel according to the control information and transmitting the data packet to the data transmitter;
The data transmitter is used for transmitting the data packets to a host side and storing them in the data buffer matched with the port corresponding to the data packets, so that a driver can read and aggregate the data packets in the data buffer one by one; the data transmitter transmits feedback information to the DMA controller, the DMA controller transmits the feedback information to the descriptor arbiter, and the descriptor arbiter, after receiving the feedback information, continues to poll each channel to acquire the next descriptor; the reading and aggregating modes comprise: the data transmitter transmits interrupt information to the host side, and the driver reads and aggregates the data packets in the data buffer one by one according to the interrupt information; or the driver polls the data buffer to acquire the data packets, and reads and aggregates the data packets in the data buffer one by one; or the data transmitter places the data packets into consecutive units in the data buffer according to a preset sequence, and the driver takes out and processes the data packets in the consecutive units one by one according to the preset sequence.
4. A PCIe-based data transmission system comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute the computer instructions stored in the memory, and the system, when the computer instructions are executed by the processor, implements the steps of the method according to any one of claims 1 to 2.
5. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 2.
CN202311617177.1A 2023-11-29 2023-11-29 A PCIe-based data transmission method, device and system Active CN117579570B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311617177.1A CN117579570B (en) 2023-11-29 2023-11-29 A PCIe-based data transmission method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311617177.1A CN117579570B (en) 2023-11-29 2023-11-29 A PCIe-based data transmission method, device and system

Publications (2)

Publication Number Publication Date
CN117579570A CN117579570A (en) 2024-02-20
CN117579570B true CN117579570B (en) 2025-10-21

Family

ID=89860621

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311617177.1A Active CN117579570B (en) 2023-11-29 2023-11-29 A PCIe-based data transmission method, device and system

Country Status (1)

Country Link
CN (1) CN117579570B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120710964A (en) * 2025-06-05 2025-09-26 北京云豹创芯智能科技有限公司 Data packet receiving method, data processing unit, host and network card

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10783103B1 (en) * 2017-02-24 2020-09-22 Xilinx, Inc. Split control for direct memory access transfers
CN112970010A (en) * 2018-11-09 2021-06-15 赛灵思公司 Streaming platform streams and architectures

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9684615B1 (en) * 2015-01-08 2017-06-20 Altera Corporation Apparatus and methods for multiple-channel direct memory access
US10795836B2 (en) * 2017-04-17 2020-10-06 Microsoft Technology Licensing, Llc Data processing performance enhancement for neural networks using a virtualized data iterator
CN109471816B (en) * 2018-11-06 2021-07-06 西安微电子技术研究所 Descriptor-based PCIE bus DMA controller and data transmission control method
CN113366459B (en) * 2018-11-28 2024-11-22 马维尔亚洲私人有限公司 Network switch with endpoint and direct memory access controller for in-vehicle data transfer
US11392533B1 (en) * 2020-12-30 2022-07-19 Cadence Design Systems, Inc. Systems and methods for high-speed data transfer to multiple client devices over a communication interface
CN116166581B (en) * 2023-02-24 2025-08-15 北京轩宇空间科技有限公司 Queue type DMA controller circuit for PCIE bus and data transmission method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant