CN113676421A - Multi-port network message receiving and transmitting method based on PCIe - Google Patents

Multi-port network message receiving and transmitting method based on PCIe

Info

Publication number
CN113676421A
CN113676421A
Authority
CN
China
Prior art keywords
message
page
network
sending
merge
Prior art date
Legal status
Granted
Application number
CN202111237181.6A
Other languages
Chinese (zh)
Other versions
CN113676421B (en)
Inventor
沈文君 (Shen Wenjun)
张富军 (Zhang Fujun)
Current Assignee
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202111237181.6A priority Critical patent/CN113676421B/en
Publication of CN113676421A publication Critical patent/CN113676421A/en
Application granted granted Critical
Publication of CN113676421B publication Critical patent/CN113676421B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/70 Admission control; Resource allocation
    • H04L47/72 Admission control; Resource allocation using reservation actions during connection setup
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/70 Admission control; Resource allocation
    • H04L47/72 Admission control; Resource allocation using reservation actions during connection setup
    • H04L47/722 Admission control; Resource allocation using reservation actions during connection setup at the destination endpoint, e.g. reservation of terminal resources or buffer space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026 PCI express

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a multi-port network message receiving and transmitting method based on PCIe, which comprises the following steps. S1: the ARM and the FPGA virtualize a plurality of network devices and add port numbers to the original network messages, so that a plurality of network ports can send and receive data simultaneously over a single PCIe channel. S2: for transmission, the ARM merges outgoing network messages into pages, aggregates multiple pages per transfer and falls back to timeout-driven sending; page merging and merged-page sending run in two independent threads connected by two lock-free caches. S3: the ARM binds the processing threads for sending and receiving network messages to fixed CPUs. Because the ARM's DMA transfers operate in units of pages (4096 bytes), the ARM creates two lock-free caches and corresponding processing threads to merge outgoing network messages into pages; the page-aggregation and timeout sending mechanisms effectively reduce the number of DMA transfers and improve network transmission efficiency, while real-time delivery is preserved when network traffic is light.

Description

Multi-port network message receiving and transmitting method based on PCIe
Technical Field
The invention relates to the technical field of computer networks, and in particular to a multi-port network message receiving and transmitting method based on PCIe.
Background
In embedded network devices, scenarios that require many network ports cannot be served by the native ports of a general-purpose ARM processor alone; in such cases additional ports can be provided through the ARM's PCIe channel and an FPGA. For cost-sensitive embedded equipment the FPGA does not provide SR-IOV, so multiple network ports have to be realized with virtual network devices and by modifying the original network messages. When network messages are sent and received frequently, transceiving efficiency must be guaranteed while the real-time behaviour of the messages is also taken into account, so designing an efficient, real-time PCIe-based network message transceiving method is key to the network performance of such a device.
Disclosure of Invention
The object of the invention is to provide a multi-port network message receiving and transmitting method based on PCIe (Peripheral Component Interconnect Express) so as to overcome the deficiencies of the prior art.
To achieve the above object, the invention provides the following technical solution:
The invention discloses a multi-port network message receiving and transmitting method based on PCIe, which comprises the following steps:
S1: the ARM and the FPGA virtualize a plurality of network devices and add port numbers to the original network messages, realizing simultaneous data transceiving by a plurality of network ports over a single PCIe channel;
S2: the ARM adopts a mechanism of merging outgoing network messages into pages, multi-page aggregated sending and timeout-driven sending, and uses two independent threads and two lock-free caches for the message page merging and merged-page sending processes;
S3: the ARM binds the processing threads for sending and receiving network messages to fixed CPUs.
Preferably, step S1 comprises the following sub-steps:
S11: the ARM creates a plurality of virtual network devices in its network driver, the FPGA creates the same number of network devices for the external network ports, and data interaction is carried out through the PCIe channel;
S12: the ARM adds a port number to each network message according to its virtual network device and then inserts the message into the network original sending message lock-free cache queue (a code sketch of this transmit path is given after these sub-steps), comprising the following sub-steps:
S121: the ARM creates a network original sending message lock-free cache queue that can store M message structure addresses;
S122: the ARM determines the port number of the outgoing network message according to the virtual network device and fills it into the vlan_cfi field of the message structure sk_buff;
S123: the modified message structure address is inserted into the network original sending message lock-free cache queue;
S13: the ARM receives network messages from the FPGA, parses out the port number, and delivers the original message to the corresponding virtual network device according to the port number, comprising the following sub-steps:
S131: the ARM applies for a receive page cache containing N pages and sends the physical address of the cache to the FPGA;
S132: the FPGA receives data from the external network ports, adds a port number at the head of each network message, writes the data into the cache and updates its sending-engine state; only one message is written per page so that the ARM can read it conveniently;
S133: the ARM creates a network message receiving thread that cyclically reads the state of the FPGA's sending engine; when a message arrives it obtains the head and tail pointers of the messages in the cache, reads the messages in order, parses out the port number, and delivers the original message with the port number stripped to the corresponding virtual network device.
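As a concrete illustration of sub-steps S12 and S121-S123, the following Linux-kernel-style sketch shows how the transmit hook of one virtual network device could tag an outgoing sk_buff with its port number and push its address into the lock-free send queue. The names pcie_vport, tx_raw_fifo and TX_RAW_QUEUE_LEN, the use of a kfifo as the lock-free cache queue, and the use of skb->vlan_tci as a stand-in for the vlan_cfi field named by the patent are assumptions made for this sketch, not details taken from the patent.

    #include <linux/netdevice.h>
    #include <linux/skbuff.h>
    #include <linux/kfifo.h>

    #define TX_RAW_QUEUE_LEN 1024                 /* M message addresses (power of two) */

    struct pcie_vport {
        struct net_device *ndev;
        u8 port_id;                                /* 0..3 for the four virtual ports */
    };

    /* S121: lock-free queue holding the addresses of raw outgoing messages */
    static DEFINE_KFIFO(tx_raw_fifo, struct sk_buff *, TX_RAW_QUEUE_LEN);

    static netdev_tx_t vport_start_xmit(struct sk_buff *skb, struct net_device *ndev)
    {
        struct pcie_vport *vp = netdev_priv(ndev);

        /* S122: record the port number in the skb (the patent uses the vlan_cfi field) */
        skb->vlan_tci = vp->port_id;

        /* S123: insert the message address into the lock-free send queue */
        if (!kfifo_put(&tx_raw_fifo, skb)) {
            ndev->stats.tx_dropped++;              /* queue full: drop the message */
            dev_kfree_skb_any(skb);
            return NETDEV_TX_OK;
        }
        return NETDEV_TX_OK;
    }

A kfifo is lock-free only for a single producer and a single consumer, which matches the intended pairing of this transmit path with the single merging thread described in step S2.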
Preferably, step S2 comprises the following sub-steps (a set-up code sketch follows this list):
S21: the ARM creates a network message merge page lock-free cache queue that can store P pages, where P < M;
S22: the ARM allocates an array pkg_delay_times of P unsigned chars to store the page merge delay time of each page and an array page_data_len of P unsigned shorts to store the valid data length of each page, and defines a page merge timeout threshold pkg_delay_threshold_times;
S23: the ARM creates a network original sending message merging thread that performs page merging on the network original sending message lock-free cache queue and inserts the merged pages into the network message merge page lock-free cache queue;
S24: a message merge page sending trigger threshold page_threshold_nums is defined, with page_threshold_nums < P, and a message merge page delay trigger threshold delay_threshold_times is defined;
S25: the ARM creates a network message merge page sending thread that processes the message merge pages in the network message merge page lock-free cache queue and transfers the data to the FPGA through DMA.
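The pairing of two independent threads with two lock-free caches described in S21-S25 could be organised roughly as below; the type and variable names (merge_page, merge_ctx, merge_page_fifo, P_MERGE_PAGES) and the reuse of a kfifo as the second lock-free cache queue are illustrative assumptions. The bodies of the two threads are sketched after the corresponding detailed steps further below.

    #include <linux/kfifo.h>
    #include <linux/kthread.h>
    #include <linux/types.h>

    #define MERGE_PAGE_SIZE 4096                   /* the ARM's DMA works in page units */
    #define P_MERGE_PAGES   256                    /* P merge pages, P < M (power of two) */

    struct merge_page {
        void       *vaddr;                         /* 4096-byte page holding merged messages */
        dma_addr_t  dma;                           /* bus address used for the DMA to the FPGA */
    };

    struct merge_ctx {
        struct merge_page pages[P_MERGE_PAGES];    /* backing store, allocated and mapped at probe time */
    };

    /* S21: lock-free queue of merge pages that are ready to be sent to the FPGA */
    static DEFINE_KFIFO(merge_page_fifo, struct merge_page *, P_MERGE_PAGES);

    /* S23/S25: the two independent threads working on the two lock-free queues */
    static struct task_struct *merge_task, *send_task;
    static int pkt_merge_thread_fn(void *arg);     /* fills merge pages (Fig. 4) */
    static int page_send_thread_fn(void *arg);     /* DMAs merge pages to the FPGA (Fig. 5) */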
Preferably, the message merge page delay trigger threshold delay_threshold_times defined in step S24 is greater than the page merge timeout threshold pkg_delay_threshold_times defined in step S22 (both thresholds appear in the data-structure sketch below).
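A possible in-memory layout for one merge page (cf. Fig. 2), together with the bookkeeping arrays and thresholds of S22 and S24, is sketched below. The field widths and the numeric threshold values are assumptions; the patent only fixes the relations page_threshold_nums < P and delay_threshold_times > pkg_delay_threshold_times.

    #include <linux/types.h>

    /* head of one merge page (cf. Fig. 2): a message count followed by the merged entries */
    struct merge_page_hdr {
        __u16 msg_count;                           /* "message number" field of the page */
    } __packed;

    /* header prepended to every message copied into a merge page */
    struct merged_msg_hdr {
        __u16 port_id;                             /* port number taken from the vlan_cfi tag */
        __u16 data_len;                            /* length of the message payload that follows */
    } __packed;

    /* S22: per-page merge delay counters (1 unit = 1 us) and valid data lengths */
    static u8  pkg_delay_times[P_MERGE_PAGES];
    static u16 page_data_len[P_MERGE_PAGES];

    /* S22/S24: timeout and trigger thresholds; the values below are examples only */
    static const u8  pkg_delay_threshold_times = 50;   /* per-page merge timeout, us */
    static const u16 page_threshold_nums       = 8;    /* page count that triggers a DMA send */
    static const u16 delay_threshold_times     = 100;  /* total-delay trigger, us (> 50) */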
Preferably, step S23 comprises the following sub-steps:
S231: detect whether the network message merge page lock-free cache queue is full; once it is detected to be not full, obtain the current network message merge page and detect whether the network original sending message lock-free cache queue has a message to be sent; if there is no message to be sent in the queue, execute step S232, and if there is, execute step S235;
S232: detect whether the valid data length page_data_len of the current network message merge page is 0; if page_data_len = 0, return to step S231, otherwise execute step S233;
S233: add 1 (i.e. 1 µs) to the page merge delay time pkg_delay_times of the current network message merge page, let the network original sending message merging thread execute udelay(1) to sleep for 1 µs, and detect whether the page merge delay time pkg_delay_times of the current page is greater than the page merge timeout threshold pkg_delay_threshold_times; if it is, execute step S234, otherwise return to step S231;
S234: insert the current network message merge page into the network message merge page lock-free cache queue and return to step S231;
S235: read the length of the oldest message in the network original sending message lock-free cache queue and detect whether the current network message merge page has enough remaining space to store the message; if the remaining space is insufficient, execute step S236, and if it is sufficient, execute step S237;
S236: insert the current network message merge page into the network message merge page lock-free cache queue, detect whether the queue is full, and once it is detected to be not full obtain a new current network message merge page and execute step S237;
S237: copy the vlan_cfi field, the data_len field and the message data of the message structure sk_buff into the port number field, the message data length field and the message data content part of the network message merge page respectively, update the message number field of the current merge page, update the corresponding valid data length page_data_len, and release the sk_buff cache of the message and its entry in the network original sending message lock-free cache queue;
S238: detect whether the network original sending message lock-free cache queue still has a message to be sent; if it does, return to step S237, and if it does not, return to step S231.
Preferably, step S25 comprises the following sub-steps:
S251: detect whether the network message merge page lock-free cache queue has message merge pages to be sent; once pages to be sent are detected, execute S252;
S252: calculate the number page_nums of message merge pages to be sent and determine whether it is greater than the merge page sending trigger threshold page_threshold_nums; if page_nums > page_threshold_nums, execute step S255, otherwise execute step S253;
S253: sum the pkg_delay_times values of the message merge pages to be sent and add the thread delay time kthread_delay_times to obtain the total delay time all_delay_times, then judge whether all_delay_times is greater than the message merge page delay trigger threshold delay_threshold_times; if it is, execute step S255, otherwise execute step S254;
S254: the thread executes udelay(1) to delay 1 µs, the thread delay time kthread_delay_times is increased by 1, and the flow returns to step S251;
S255: combine the message merge pages to be sent and transfer them to the FPGA through DMA, set the corresponding valid data lengths page_data_len and page merge delay times pkg_delay_times to 0, set the thread delay time kthread_delay_times to 0, release the resources of the sent message merge pages in the network message merge page lock-free cache queue, and return to step S251.
Preferably, the ARM adopts a six-core processor whose cores are denoted CPU1 to CPU6.
Preferably, step S3 comprises the following sub-steps:
S31: the network original sending message merging thread is bound to CPU4;
S32: the network message merge page sending thread is bound to CPU5;
S33: the network message receiving thread is bound to CPU6.
The invention has the following beneficial effects:
1. by virtualizing four network devices through the ARM and the FPGA and modifying the original network messages, four network ports can send and receive data simultaneously over a single PCIe channel, which effectively reduces cost;
2. because the ARM's DMA transfers operate in units of pages (4096 bytes), the ARM creates two lock-free caches and corresponding processing threads to merge outgoing network messages into pages; the page-aggregation and timeout sending mechanisms effectively reduce the number of DMA transfers and improve network transmission efficiency, while real-time delivery is preserved when traffic is light;
3. the ARM binds the network original sending message merging thread, the network message merge page sending thread and the network message receiving thread to fixed CPUs, which effectively improves network message transceiving performance.
The features and advantages of the present invention will be described in detail by embodiments in conjunction with the accompanying drawings.
Drawings
FIG. 1 is a general architecture diagram of a PCIe-based multiport network of the present invention;
FIG. 2 is a data structure diagram of a network message merge page according to the present invention;
FIG. 3 is a flow chart of an original message sending interface of the present invention;
FIG. 4 is a flow chart of a merging thread of original sending messages of the network according to the present invention;
FIG. 5 is a flow chart of a network message merge page send thread according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood, however, that the description herein of specific embodiments is only intended to illustrate the invention and not to limit the scope of the invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
As shown in fig. 1, the overall framework of the invention, a multi-port network message receiving and transmitting method based on PCIe, specifically comprises the following processing steps:
S1: the ARM and the FPGA virtualize four network devices and add port numbers to the original network messages, so that four network ports send and receive data simultaneously over a single PCIe channel; the specific steps are as follows:
S11: the ARM creates four virtual network devices in its network driver, the FPGA creates four network devices for the external network ports, and data interaction is carried out through the PCIe channel;
S12: the ARM adds a port number to each network message according to its virtual network device and then inserts the message into the network original sending message lock-free cache queue; the specific steps are as follows:
S121: the ARM creates a network original sending message lock-free cache queue that can store M message structure addresses;
S122: the ARM determines the port number of the outgoing network message according to the virtual network device and fills it into the vlan_cfi field of the message structure sk_buff;
S123: the modified message structure address is inserted into the network original sending message lock-free cache queue.
S13: the ARM receives network messages from the FPGA, parses out the port number, and delivers the original message to the corresponding virtual network device according to the port number; the specific steps are as follows:
S131: the ARM applies for a receive page cache containing N pages (4096 bytes each) and sends the physical address of the cache to the FPGA;
S132: the FPGA receives data from the external network ports, adds a port number at the head of each network message, writes the data into the cache and updates its sending-engine state; only one message is written per page so that the ARM can read it conveniently;
S133: the ARM creates a network message receiving thread that cyclically reads the state of the FPGA's sending engine; when a message arrives it obtains the head (rx_head) and tail (rx_tail) of the messages in the cache, reads the messages in order, parses out the port number, and delivers the original message with the port number stripped to the corresponding virtual network device; the specific flow is shown in FIG. 3.
In this step, four network devices are virtualized by the ARM and the FPGA and the original network messages are modified, so that four network ports send and receive data simultaneously over a single PCIe channel, effectively reducing cost; a code sketch of this receive path is given below.
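A rough sketch of the receive thread of S131-S133 follows. The register helpers fpga_engine_has_data(), fpga_rx_head(), fpga_rx_tail() and fpga_ack_pages(), the rx_page_hdr layout and the pcie_rx_ctx structure are hypothetical; only the overall flow (poll the sending-engine state, walk the pages from head to tail, strip the port number, hand the raw frame to the matching virtual network device) follows the text. The pcie_vport structure from the earlier transmit sketch is reused.

    #include <linux/kthread.h>
    #include <linux/netdevice.h>
    #include <linux/etherdevice.h>
    #include <linux/delay.h>

    /* assumed layout of the per-page header written by the FPGA in S132 */
    struct rx_page_hdr {
        __u16 port_id;                             /* port number added by the FPGA */
        __u16 data_len;                            /* length of the frame that follows */
    } __packed;

    struct pcie_rx_ctx {
        u32 nr_pages;                              /* N receive pages shared with the FPGA */
        u8 **rx_pages;                             /* kernel virtual addresses of those pages */
        struct pcie_vport *vports[4];              /* the four virtual netdevs */
    };

    /* hypothetical MMIO helpers for the FPGA sending engine (not from the patent) */
    bool fpga_engine_has_data(struct pcie_rx_ctx *ctx);
    u32  fpga_rx_head(struct pcie_rx_ctx *ctx);
    u32  fpga_rx_tail(struct pcie_rx_ctx *ctx);
    void fpga_ack_pages(struct pcie_rx_ctx *ctx, u32 head, u32 tail);

    static int pcie_rx_thread_fn(void *arg)
    {
        struct pcie_rx_ctx *ctx = arg;

        while (!kthread_should_stop()) {
            u32 head, tail, i;

            if (!fpga_engine_has_data(ctx)) {      /* poll the sending-engine state */
                usleep_range(10, 20);              /* the patent busy-polls; a short sleep is used here */
                continue;
            }
            head = fpga_rx_head(ctx);
            tail = fpga_rx_tail(ctx);

            for (i = head; i != tail; i = (i + 1) % ctx->nr_pages) {
                struct rx_page_hdr *hdr = (struct rx_page_hdr *)ctx->rx_pages[i];
                struct net_device *ndev;
                struct sk_buff *skb;

                if (hdr->port_id >= 4)             /* malformed page, skip it */
                    continue;
                ndev = ctx->vports[hdr->port_id]->ndev;

                skb = netdev_alloc_skb(ndev, hdr->data_len);
                if (!skb)
                    continue;
                /* S133: strip the port header and deliver the original frame */
                skb_put_data(skb, ctx->rx_pages[i] + sizeof(*hdr), hdr->data_len);
                skb->protocol = eth_type_trans(skb, ndev);
                netif_rx(skb);
            }
            fpga_ack_pages(ctx, head, tail);       /* hand the pages back to the FPGA */
        }
        return 0;
    }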
S2: the ARM adopts a mechanism of merging outgoing network messages into pages, multi-page aggregated sending and timeout-driven sending, and uses two independent threads and two lock-free caches for the message page merging and merged-page sending processes; the specific steps are as follows:
S21: the ARM creates a network message merge page lock-free cache queue that can store P pages (4096 bytes each), with P < M;
S22: the ARM allocates an array pkg_delay_times of P unsigned chars to store the page merge delay time of each page and an array page_data_len of P unsigned shorts to store the valid data length of each page, and defines a page merge timeout threshold pkg_delay_threshold_times;
S23: the ARM creates a network original sending message merging thread that performs page merging on the network original sending message lock-free cache queue and inserts the merged pages into the network message merge page lock-free cache queue, as shown in fig. 4 (a condensed code sketch is given after these sub-steps); the specific steps are as follows:
S231: detect whether the network message merge page lock-free cache queue is full; if it is full, continue to execute S231, and if it is not full, execute S232;
S232: obtain the ID of the current network message merge page, denoted i, and detect whether the network original sending message lock-free cache queue has a message to be sent; if not, execute S233, and if so, execute S236;
S233: detect page_data_len[i] of the current network message merge page; if page_data_len[i] = 0, execute S231, otherwise execute S234;
S234: add 1 to pkg_delay_times[i] of the current network message merge page and let the network original sending message merging thread execute udelay(1) to delay 1 µs; then check pkg_delay_times[i]: if pkg_delay_times[i] > pkg_delay_threshold_times, execute S235, otherwise execute S231;
S235: insert the current network message merge page (ID i) into the network message merge page lock-free cache queue and execute S231;
S236: obtain the ID of the oldest message in the network original sending message lock-free cache queue, denoted j, and detect whether the current network message merge page has enough space to store the message; if not, execute S237, and if so, execute S2310;
S237: insert the current network message merge page (ID i) into the network message merge page lock-free cache queue, then execute S238;
S238: detect whether the network message merge page lock-free cache queue is full; if it is full, continue to execute S238, and if it is not full, execute S239;
S239: obtain the ID of the new current network message merge page, denoted k, then execute S2310;
S2310: copy the vlan_cfi field, the data_len field and the message data of the network original message structure sk_buff with ID j into the port number field, the data length field and the message data content of the current network message merge page (ID i or k) respectively, and update the message number field of the current merge page; the resulting page data structure is shown in fig. 2;
update the corresponding value of page_data_len[i] or page_data_len[k], release the sk_buff cache of network original message j and its entry in the network original sending message lock-free cache queue, then execute S2311;
S2311: detect whether the network original sending message lock-free cache queue still has a message to be sent; if not, execute S231, and if so, execute S236.
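A condensed sketch of the merge thread of S231-S2311 (Fig. 4) follows, reusing the queues and structures from the earlier sketches. The handshake that waits for page_data_len to drop back to 0 before a merge page is reused, and the omission of explicit memory barriers beyond READ_ONCE, are simplifications of this example rather than details of the patent.

    #include <linux/kfifo.h>
    #include <linux/kthread.h>
    #include <linux/skbuff.h>
    #include <linux/delay.h>
    #include <linux/string.h>

    static int pkt_merge_thread_fn(void *arg)
    {
        struct merge_ctx *mc = arg;
        u32 i = 0;                                 /* index of the merge page currently being filled */

        while (!kthread_should_stop()) {
            struct sk_buff *skb;
            struct merged_msg_hdr *hdr;
            u8 *dst;

            /* S231: only work while the merge-page queue has room */
            if (kfifo_is_full(&merge_page_fifo)) {
                udelay(1);
                continue;
            }

            /* S232/S236: is a raw message waiting in the send queue? */
            if (!kfifo_get(&tx_raw_fifo, &skb)) {
                if (page_data_len[i] == 0)         /* S233: nothing to age on an empty page */
                    continue;
                udelay(1);                         /* S234: age the partially filled page by 1 us */
                if (++pkg_delay_times[i] > pkg_delay_threshold_times) {
                    kfifo_put(&merge_page_fifo, &mc->pages[i]);   /* S235: flush on timeout */
                    i = (i + 1) % P_MERGE_PAGES;
                    while (READ_ONCE(page_data_len[i]) && !kthread_should_stop())
                        udelay(1);                 /* wait until the send thread drains that page */
                }
                continue;
            }

            /* initialise the page header when starting a fresh page */
            if (page_data_len[i] == 0) {
                ((struct merge_page_hdr *)mc->pages[i].vaddr)->msg_count = 0;
                page_data_len[i] = sizeof(struct merge_page_hdr);
            }

            /* S236-S239: close the current page if the message does not fit */
            if (page_data_len[i] + sizeof(struct merged_msg_hdr) + skb->len > MERGE_PAGE_SIZE) {
                kfifo_put(&merge_page_fifo, &mc->pages[i]);
                i = (i + 1) % P_MERGE_PAGES;       /* switch to the next merge page ("k") */
                while (READ_ONCE(page_data_len[i]) && !kthread_should_stop())
                    udelay(1);
                ((struct merge_page_hdr *)mc->pages[i].vaddr)->msg_count = 0;
                page_data_len[i] = sizeof(struct merge_page_hdr);
            }

            /* S2310: copy port number, length and payload into the merge page */
            dst = (u8 *)mc->pages[i].vaddr + page_data_len[i];
            hdr = (struct merged_msg_hdr *)dst;
            hdr->port_id  = skb->vlan_tci;         /* stand-in for the patent's vlan_cfi tag */
            hdr->data_len = skb->len;
            memcpy(dst + sizeof(*hdr), skb->data, skb->len);
            ((struct merge_page_hdr *)mc->pages[i].vaddr)->msg_count++;
            page_data_len[i] += sizeof(*hdr) + skb->len;
            dev_kfree_skb_any(skb);                /* release the sk_buff and its queue slot */
        }
        return 0;
    }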
S24: define a message merge page sending trigger threshold page_threshold_nums, with page_threshold_nums < P, and a message merge page delay trigger threshold delay_threshold_times, with delay_threshold_times > pkg_delay_threshold_times;
S25: the ARM creates a network message merge page sending thread that processes the message merge pages in the network message merge page lock-free cache queue and transfers the data to the FPGA through DMA, as shown in fig. 5 (a condensed code sketch is given after these sub-steps); the specific steps are as follows:
S251: detect whether the network message merge page lock-free cache queue has message merge pages to be sent; if not, execute S251, otherwise execute S252;
S252: calculate the number page_nums of message merge pages to be sent; if page_nums > page_threshold_nums, execute S255, otherwise execute S253;
S253: sum the pkg_delay_times values of the message merge pages to be sent and add the thread delay time kthread_delay_times to obtain the total delay time all_delay_times; if all_delay_times > delay_threshold_times, execute S255, otherwise execute S254;
S254: the thread executes udelay(1) to delay 1 µs, the kthread_delay_times value is increased by 1, and then S251 is executed;
S255: combine the message merge pages to be sent and transfer them to the FPGA through DMA, set the corresponding page_data_len and pkg_delay_times values to 0, set kthread_delay_times to 0, release the resources of the sent message merge pages in the network message merge page lock-free cache queue, and execute S251.
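A matching sketch of the merge-page sending thread of S251-S255 (Fig. 5) follows. Pages already taken from the lock-free queue are kept in a local batch until either enough pages have accumulated or the total delay exceeds delay_threshold_times; dma_send_pages() is a placeholder for the descriptor set-up and doorbell write, which the patent does not detail.

    #include <linux/kfifo.h>
    #include <linux/kthread.h>
    #include <linux/delay.h>

    /* placeholder for the real descriptor set-up and doorbell write */
    void dma_send_pages(struct merge_page **batch, u32 n);

    static int page_send_thread_fn(void *arg)
    {
        struct merge_ctx *mc = arg;
        static struct merge_page *batch[P_MERGE_PAGES];   /* single send thread, so static is fine */
        u32 page_nums = 0;
        u32 kthread_delay_times = 0;

        while (!kthread_should_stop()) {
            struct merge_page *pg;
            u32 all_delay_times, i;

            /* S251: collect every merge page currently waiting in the lock-free queue */
            while (page_nums < P_MERGE_PAGES && kfifo_get(&merge_page_fifo, &pg))
                batch[page_nums++] = pg;

            if (page_nums == 0) {
                udelay(1);
                continue;
            }

            /* S253: total delay = per-page merge delays + this thread's own delay */
            all_delay_times = kthread_delay_times;
            for (i = 0; i < page_nums; i++)
                all_delay_times += pkg_delay_times[batch[i] - mc->pages];

            /* S252/S253: send when enough pages or enough accumulated delay */
            if (page_nums <= page_threshold_nums &&
                all_delay_times <= delay_threshold_times) {
                udelay(1);                         /* S254: wait 1 us and account for it */
                kthread_delay_times++;
                continue;
            }

            /* S255: DMA the combined pages to the FPGA and reset the bookkeeping */
            dma_send_pages(batch, page_nums);
            for (i = 0; i < page_nums; i++) {
                u32 idx = batch[i] - mc->pages;
                pkg_delay_times[idx] = 0;
                WRITE_ONCE(page_data_len[idx], 0); /* hands the page back to the merge thread */
            }
            page_nums = 0;
            kthread_delay_times = 0;
        }
        return 0;
    }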
In this step, the ARM creates two lock-free caches and corresponding processing threads and performs page merging on outgoing network messages; the page-aggregation and timeout sending mechanisms effectively reduce the number of DMA transfers and improve network transmission efficiency, while real-time delivery is preserved when traffic is light.
S3: the ARM performs CPU binding on the processing threads for sending and receiving network messages; a six-core processor with cores CPU1 to CPU6 is adopted, and the specific steps are as follows:
S31: the network original sending message merging thread is bound to CPU4;
S32: the network message merge page sending thread is bound to CPU5;
S33: the network message receiving thread is bound to CPU6.
In this step, the ARM binds the network original sending message merging thread, the network message merge page sending thread and the network message receiving thread to fixed CPUs, which effectively improves network message transceiving performance; a thread-binding sketch is given below.
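The CPU binding of S31-S33 could be done by creating the three threads with kthread_create() and pinning them with kthread_bind() before the first wakeup, as sketched below. The patent numbers the cores CPU1 to CPU6; whether that maps to Linux cpu ids 0-5 or 1-6 is not stated, so the ids 3, 4 and 5 used here are an assumption.

    #include <linux/kthread.h>
    #include <linux/err.h>

    static struct task_struct *rx_task;            /* merge_task and send_task were declared earlier */

    static int bind_thread(int (*fn)(void *), void *arg, const char *name,
                           unsigned int cpu, struct task_struct **out)
    {
        struct task_struct *t = kthread_create(fn, arg, "%s", name);

        if (IS_ERR(t))
            return PTR_ERR(t);
        kthread_bind(t, cpu);                      /* pin the thread before its first wakeup */
        wake_up_process(t);
        *out = t;
        return 0;
    }

    static int pcie_multiport_start(struct merge_ctx *mc, struct pcie_rx_ctx *rx)
    {
        int ret;

        ret = bind_thread(pkt_merge_thread_fn, mc, "pcie_pkt_merge", 3, &merge_task);  /* S31: "CPU4" */
        if (ret)
            return ret;
        ret = bind_thread(page_send_thread_fn, mc, "pcie_page_send", 4, &send_task);   /* S32: "CPU5" */
        if (ret) {
            kthread_stop(merge_task);
            return ret;
        }
        ret = bind_thread(pcie_rx_thread_fn, rx, "pcie_pkt_rx", 5, &rx_task);          /* S33: "CPU6" */
        if (ret) {
            kthread_stop(send_task);
            kthread_stop(merge_task);
        }
        return ret;
    }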
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents or improvements made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A multi-port network message receiving and transmitting method based on PCIe, characterized by comprising the following steps:
S1: the ARM and the FPGA virtualize a plurality of network devices and add port numbers to the original network messages, realizing simultaneous data transceiving by a plurality of network ports over a single PCIe channel;
S2: the ARM adopts a mechanism of merging outgoing network messages into pages, multi-page aggregated sending and timeout-driven sending, and uses two independent threads and two lock-free caches for the message page merging and merged-page sending processes;
S3: the ARM binds the processing threads for sending and receiving network messages to fixed CPUs.
2. The PCIe-based multi-port network message receiving and transmitting method of claim 1, wherein step S1 comprises the following sub-steps:
S11: the ARM creates a plurality of virtual network devices in its network driver, the FPGA creates the same number of network devices for the external network ports, and data interaction is carried out through the PCIe channel;
S12: the ARM adds a port number to each network message according to its virtual network device and then inserts the message into the network original sending message lock-free cache queue, comprising the following sub-steps:
S121: the ARM creates a network original sending message lock-free cache queue that can store M message structure addresses;
S122: the ARM determines the port number of the outgoing network message according to the virtual network device and fills it into the vlan_cfi field of the message structure sk_buff;
S123: the modified message structure address is inserted into the network original sending message lock-free cache queue;
S13: the ARM receives network messages from the FPGA, parses out the port number, and delivers the original message to the corresponding virtual network device according to the port number, comprising the following sub-steps:
S131: the ARM applies for a receive page cache containing N pages and sends the physical address of the cache to the FPGA;
S132: the FPGA receives data from the external network ports, adds a port number at the head of each network message, writes the data into the cache and updates its sending-engine state; only one message is written per page so that the ARM can read it conveniently;
S133: the ARM creates a network message receiving thread that cyclically reads the state of the FPGA's sending engine; when a message arrives it obtains the head and tail pointers of the messages in the cache, reads the messages in order, parses out the port number, and delivers the original message with the port number stripped to the corresponding virtual network device.
3. The PCIe-based multi-port network message receiving and transmitting method of claim 1, wherein step S2 comprises the following sub-steps:
S21: the ARM creates a network message merge page lock-free cache queue that can store P pages, where P < M;
S22: the ARM allocates an array pkg_delay_times of P unsigned chars to store the page merge delay time of each page and an array page_data_len of P unsigned shorts to store the valid data length of each page, and defines a page merge timeout threshold pkg_delay_threshold_times;
S23: the ARM creates a network original sending message merging thread that performs page merging on the network original sending message lock-free cache queue and inserts the merged pages into the network message merge page lock-free cache queue;
S24: a message merge page sending trigger threshold page_threshold_nums is defined, with page_threshold_nums < P, and a message merge page delay trigger threshold delay_threshold_times is defined;
S25: the ARM creates a network message merge page sending thread that processes the message merge pages in the network message merge page lock-free cache queue and transfers the data to the FPGA through DMA.
4. The PCIe-based multi-port network message receiving and transmitting method of claim 3, wherein the message merge page delay trigger threshold delay_threshold_times defined in step S24 is greater than the page merge timeout threshold pkg_delay_threshold_times defined in step S22.
5. The PCIe-based multi-port network message receiving and transmitting method of claim 3, wherein step S23 comprises the following sub-steps:
S231: detect whether the network message merge page lock-free cache queue is full; once it is detected to be not full, obtain the current network message merge page and detect whether the network original sending message lock-free cache queue has a message to be sent; if there is no message to be sent in the queue, execute step S232, and if there is, execute step S235;
S232: detect whether the valid data length page_data_len of the current network message merge page is 0; if page_data_len = 0, return to step S231, otherwise execute step S233;
S233: add 1 (i.e. 1 µs) to the page merge delay time pkg_delay_times of the current network message merge page, let the network original sending message merging thread execute udelay(1) to sleep for 1 µs, and detect whether the page merge delay time pkg_delay_times of the current page is greater than the page merge timeout threshold pkg_delay_threshold_times; if it is, execute step S234, otherwise return to step S231;
S234: insert the current network message merge page into the network message merge page lock-free cache queue and return to step S231;
S235: read the length of the oldest message in the network original sending message lock-free cache queue and detect whether the current network message merge page has enough remaining space to store the message; if the remaining space is insufficient, execute step S236, and if it is sufficient, execute step S237;
S236: insert the current network message merge page into the network message merge page lock-free cache queue, detect whether the queue is full, and once it is detected to be not full obtain a new current network message merge page and execute step S237;
S237: copy the vlan_cfi field, the data_len field and the message data of the message structure sk_buff into the port number field, the message data length field and the message data content part of the network message merge page respectively, update the message number field of the current merge page, update the corresponding valid data length page_data_len, and release the sk_buff cache of the message and its entry in the network original sending message lock-free cache queue;
S238: detect whether the network original sending message lock-free cache queue still has a message to be sent; if it does, return to step S237, and if it does not, return to step S231.
6. The PCIe-based multi-port network message receiving and transmitting method of claim 3, wherein step S25 comprises the following sub-steps:
S251: detect whether the network message merge page lock-free cache queue has message merge pages to be sent; once pages to be sent are detected, execute S252;
S252: calculate the number page_nums of message merge pages to be sent and determine whether it is greater than the merge page sending trigger threshold page_threshold_nums; if page_nums > page_threshold_nums, execute step S255, otherwise execute step S253;
S253: sum the pkg_delay_times values of the message merge pages to be sent and add the thread delay time kthread_delay_times to obtain the total delay time all_delay_times, then judge whether all_delay_times is greater than the message merge page delay trigger threshold delay_threshold_times; if it is, execute step S255, otherwise execute step S254;
S254: the thread executes udelay(1) to delay 1 µs, the thread delay time kthread_delay_times is increased by 1, and the flow returns to step S251;
S255: combine the message merge pages to be sent and transfer them to the FPGA through DMA, set the corresponding valid data lengths page_data_len and page merge delay times pkg_delay_times to 0, set the thread delay time kthread_delay_times to 0, release the resources of the sent message merge pages in the network message merge page lock-free cache queue, and return to step S251.
7. The PCIe-based multi-port network message receiving and transmitting method of claim 1, wherein the ARM adopts a six-core processor whose cores are denoted CPU1 to CPU6.
8. The PCIe-based multi-port network message receiving and transmitting method of claim 7, wherein step S3 comprises the following sub-steps:
S31: the network original sending message merging thread is bound to CPU4;
S32: the network message merge page sending thread is bound to CPU5;
S33: the network message receiving thread is bound to CPU6.
CN202111237181.6A 2021-10-25 2021-10-25 Multi-port network message receiving and transmitting method based on PCIe Active CN113676421B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111237181.6A CN113676421B (en) 2021-10-25 2021-10-25 Multi-port network message receiving and transmitting method based on PCIe

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111237181.6A CN113676421B (en) 2021-10-25 2021-10-25 Multi-port network message receiving and transmitting method based on PCIe

Publications (2)

Publication Number Publication Date
CN113676421A 2021-11-19
CN113676421B CN113676421B (en) 2022-01-28

Family

ID=78550976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111237181.6A Active CN113676421B (en) 2021-10-25 2021-10-25 Multi-port network message receiving and transmitting method based on PCIe

Country Status (1)

Country Link
CN (1) CN113676421B (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286866A (en) * 2008-05-30 2008-10-15 杭州华三通信技术有限公司 Multicast implementing method and system based on switching network of high-speed peripheral extended interface
CN102185770A (en) * 2011-05-05 2011-09-14 汉柏科技有限公司 Multi-core-architecture-based batch message transmitting and receiving method
CN106998347A (en) * 2016-01-26 2017-08-01 中兴通讯股份有限公司 The apparatus and method of server virtualization network share
CN106897106A (en) * 2017-01-12 2017-06-27 北京三未信安科技发展有限公司 The sequential scheduling method and system of the concurrent DMA of multi-dummy machine under a kind of SR IOV environment
US20190272249A1 (en) * 2018-03-01 2019-09-05 Samsung Electronics Co., Ltd. SYSTEM AND METHOD FOR SUPPORTING MULTI-MODE AND/OR MULTI-SPEED NON-VOLATILE MEMORY (NVM) EXPRESS (NVMe) OVER FABRICS (NVMe-oF) DEVICES
CN110247860A (en) * 2018-03-09 2019-09-17 三星电子株式会社 Multi-mode and/or multiple speed NVMe-oF device
CN108595353A (en) * 2018-04-09 2018-09-28 杭州迪普科技股份有限公司 A kind of method and device of the control data transmission based on PCIe buses
WO2020177252A1 (en) * 2019-03-06 2020-09-10 上海熠知电子科技有限公司 Pcie protocol-based dma controller, and dma data transmission method
CN110545152A (en) * 2019-09-10 2019-12-06 清华大学 upper computer with real-time transmission function in Ethernet and Ethernet system
CN110943941A (en) * 2019-12-06 2020-03-31 北京天融信网络安全技术有限公司 Message receiving method, message sending method, network card and electronic equipment
US20210306302A1 (en) * 2019-12-19 2021-09-30 Xiamen Wangsu Co., Ltd. Datagram processing method, processing unit and vpn server
CN112437028A (en) * 2020-12-10 2021-03-02 福州创实讯联信息技术有限公司 Method and system for expanding multiple network ports of embedded system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨惠 (Yang Hui) et al.: "Thread-affinity buffer management mechanism for multi-core network packet processing systems" (面向多核网络分组处理系统的线程亲和缓冲区管理机制), Journal of National University of Defense Technology (国防科技大学学报) *

Also Published As

Publication number Publication date
CN113676421B (en) 2022-01-28

Similar Documents

Publication Publication Date Title
US11899596B2 (en) System and method for facilitating dynamic command management in a network interface controller (NIC)
US11500689B2 (en) Communication method and apparatus
WO2018076793A1 (en) Nvme device, and methods for reading and writing nvme data
US10911358B1 (en) Packet processing cache
USRE47756E1 (en) High performance memory based communications interface
US7996548B2 (en) Message communication techniques
Flajslik et al. Network Interface Design for Low Latency Request-Response Protocols
US20100169528A1 (en) Interrupt technicques
US8966484B2 (en) Information processing apparatus, information processing method, and storage medium
JPH11175454A (en) Computer system equipped with automatic direct memory access function
CN112650558B (en) Data processing method and device, readable medium and electronic equipment
JPH11175455A (en) Communication method in computer system and device therefor
WO2020000485A1 (en) Nvme-based data writing method, device, and system
WO2020000482A1 (en) Nvme-based data reading method, apparatus and system
CN115934625B (en) Doorbell knocking method, equipment and medium for remote direct memory access
CN117178263A (en) Network-attached MPI processing architecture in SmartNIC
WO2014019511A1 (en) Multicast message replication method and device
CN115248795A (en) Peripheral Component Interconnect Express (PCIE) interface system and method of operating the same
CN113676421B (en) Multi-port network message receiving and transmitting method based on PCIe
US9288163B2 (en) Low-latency packet receive method for networking devices
US10255213B1 (en) Adapter device for large address spaces
CN116601616A (en) Data processing device, method and related equipment
US11785087B1 (en) Remote direct memory access operations with integrated data arrival indication
US20240111702A1 (en) Virtual wire protocol for transmitting side band channels
Binkert Integrated system architectures for high-performance Internet servers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant