CN110995507B - Network acceleration controller and method

Info

Publication number: CN110995507B
Application number: CN201911317999.1A
Authority: CN
Inventors: 张鹏程; 张洪柳; 刘田明; 卢方勇; 王中晓
Original assignee: Qingdao Fangcun Microelectronic Technology Co ltd; Shandong Fangcun Microelectronics Technology Co ltd
Current assignee: Qingdao Fangcun Microelectronic Technology Co ltd; Shandong Fangcun Microelectronics Technology Co ltd
Priority date: 2019-12-19
Filing date: 2019-12-19
Publication date: 2022-08-12
Anticipated expiration: 2039-12-19
Also published as: CN110995507A

Abstract

The controller comprises at least a TOE module, a TXMAC module, an RXMAC module, a TX FIFO module, an RX FIFO module, a TXDMA module, an RXDMA module, an ARBITER module and an AXI Master module. This effectively reduces the involvement of software in network processing and lowers the load on the CPU. Operations are carried out in hardware through an internal handshake mechanism, and the CPU is notified only after a certain number of interrupts have accumulated, so interrupts need not be sent to the CPU frequently, unnecessary waiting time is reduced, and the processing speed of network information is greatly improved.

Description

Network acceleration controller and method

Technical Field

The present disclosure relates to the field of network acceleration technologies, and in particular, to a network acceleration controller and a method.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

With the development of network technology, the timeliness requirements on network information processing mean that the TCP/IP protocol stack must process information ever faster, which in turn places higher and higher demands on CPU performance.

The inventors of the present disclosure have found that most products currently on the market rely on a high-performance CPU to accelerate TCP/IP protocol stack processing, so the network information processing speed is difficult to increase further unless CPU performance improves.

Disclosure of Invention

To remedy the deficiencies of the prior art, the disclosure provides a network acceleration controller and a method that rely on hardware to complete part of the TCP/IP protocol stack processing, reducing the load on the CPU during network information processing.

To achieve this purpose, the disclosure adopts the following technical solution:

A first aspect of the present disclosure provides a network acceleration controller.

A network acceleration controller comprises a TOE module, a TXMAC module, an RXMAC module, a TX FIFO module and an RX FIFO module, wherein the TXMAC module is used for transmitting data from the TX FIFO to a network, the RXMAC module is used for receiving the data from the network and storing the data to the RX FIFO module, and the TOE module is respectively connected with the TXMAC module, the RXMAC module, the TX FIFO module and the RX FIFO module and is used for processing network information of a receiving end and a transmitting end;

the controller further comprises a TXDMA module, an RXDMA module, an ARBITER module and an AXI Master module, wherein the TXDMA module is connected with the TOE module and used for reading a descriptor of a transmitting end, reading corresponding transmitting data according to the read descriptor and storing the corresponding transmitting data into the TX FIFO module; the RXDMA module is connected with the TOE module and used for reading a descriptor of a receiving end and moving data from the RX FIFO to a corresponding system memory according to the read descriptor;

the ARBITER module is used for processing the processing priority between the descriptor rings of the sending end and the descriptor rings of the receiving end, and the AXI Master module is used for reading data from the system memory to the network acceleration controller during reading operation and writing the data back to the system memory during writing operation according to an operation instruction sent by the ARBITER module.

As some possible implementation manners, the controller further includes a network transmission interface control module for receiving and sending network information, and at least three transmission modes of 10 Mbit/s, 100 Mbit/s and 1000 Mbit/s are supported.

As some possible implementation manners, the controller further includes an RGF module and an MDC module, where the RGF module is used for read-write maintenance of a register; the MDC module is connected with the RGF module and used for controlling the network PHY, and the read-write operation of the network PHY can be realized through the read-write of the register.

As a further limitation, the controller further includes an AHB Slave module, and the main control module accesses the register through the AHB Slave module, which specifically includes:

when the main control module is about to write data into the network acceleration controller, the AHB Slave module executes a write operation, writes the data into a corresponding address in the RGF module and returns a state to the main control module to inform the main control module that the write operation is successful;

when the main control module reads data from the network acceleration controller, the AHB Slave module executes a read operation, reads the value of the corresponding register in the RGF module according to the address and returns a status to inform the main control module that the read operation is successful.

As some possible implementation manners, the TOE module processes the network information of the sending end, specifically:

the sending end can split a large data packet sent by an upper layer, with the header of each resulting data packet computed and filled in by hardware from the original header; VLAN control information can be added according to the information in the descriptor; the checksum of the IPv4 header can be calculated and filled into the corresponding position of the data packet; the checksum of a UDP data packet can be calculated and filled into the corresponding position; the receiving end can perform UDP header checksum comparison;

the checksum of a TCP data packet can be calculated and filled into the corresponding position; interrupts can be accumulated, and a new interrupt is raised to notify the CPU once a certain number of interrupts has been reached.

As some possible implementation manners, the TOE module processes the network information of the receiving end, specifically:

the receiving end can perform a HASH calculation on the received data packet and write the value back to the descriptor, so that software can classify the data packet conveniently; the received data packet can be separated into header and payload according to the descriptor, which facilitates software processing; VLAN control information can be removed and written back into the descriptor; IPv4 header checksum comparison can be performed; UDP header checksum comparison can be performed; interrupts can be accumulated, and a new interrupt is raised to notify the CPU once a certain number of interrupts has been reached.
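
The IPv4 header checksum handling mentioned above for the sending and receiving ends can be illustrated with a minimal Python sketch. It assumes the standard Internet checksum (the one's-complement sum of 16-bit words described in RFC 1071); the function names are hypothetical and are not taken from the disclosure, which performs these operations in hardware.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def fill_ipv4_checksum(header: bytes) -> bytes:
    """Sending end: zero the checksum field (bytes 10-11), compute, and fill it in."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return header[:10] + struct.pack("!H", internet_checksum(zeroed)) + header[12:]

def ipv4_checksum_ok(header: bytes) -> bool:
    """Receiving end: a valid header, checksum field included, yields zero."""
    return internet_checksum(header) == 0
```

The same routine, applied to the pseudo-header plus the L4 header and payload, would give the UDP and TCP checksums; that extension is omitted here.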

The second aspect of the present disclosure provides a method for implementing network acceleration.

A method for implementing network acceleration uses the network acceleration controller according to the first aspect of the present disclosure; the control method of the transmitting end includes the following steps:

step 7-1: software issues the processed initial data packet to the network layer, then modifies the descriptor ring and, by operating a register of the network acceleration controller, informs the network acceleration controller to carry out the data reading operation;

step 7-2: the network acceleration controller starts to process the descriptor rings, handling them in order according to the priority preset in the ARBITER module and passing the request to the AXI Master module;

step 7-3: the AXI Master module fetches the content of the descriptor from the address given in the instruction sent by the ARBITER module;

step 7-4: the network acceleration controller determines the next operation according to the content of the received descriptor;

step 7-5: after the data is written into the TX FIFO module, the TXMAC module is informed to fetch the data, and the data is sent when the TXMAC module detects that the network is idle;

step 7-6: if a collision occurs during sending, whether the set maximum limit has been exceeded is checked; if not, a binary exponential backoff algorithm is used to calculate a delay, after which the data is sent again;

step 7-7: after the transmission is completed, the transmission status is written into the descriptor through the AXI Master module, whether any descriptors are still waiting for transmission is queried, and if so the method returns to step 7-2.
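
The delay calculation of step 7-6 can be sketched as follows. This is only an illustration of truncated binary exponential backoff as used in half-duplex Ethernet; the constants are the usual IEEE 802.3 values (16 attempts, exponent capped at 10) and are assumptions here, since the disclosure only states that a set maximum is checked. The function name is hypothetical.

```python
import random
from typing import Optional

MAX_ATTEMPTS = 16    # assumed attempt limit (IEEE 802.3 default)
BACKOFF_LIMIT = 10   # assumed cap on the backoff exponent (IEEE 802.3 default)

def backoff_slots(collisions: int, rng: random.Random) -> Optional[int]:
    """Return how many slot times to wait after the n-th collision.

    Returns None when the attempt limit is reached and the frame is abandoned.
    """
    if collisions >= MAX_ATTEMPTS:
        return None
    k = min(collisions, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)   # uniform integer in [0, 2**k - 1]
```

For example, after the third collision the sender waits between 0 and 7 slot times before retrying.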

As some possible implementations, in step 7-4, specifically:

judging whether GSOEN is valid, and if not, performing no segmentation;

if GSOEN is valid, performing segmentation: the lengths of the MAC layer header, the IP layer header and the L4 layer header are determined according to the MAC LEN, the IPLEN and the L4LEN in the descriptor, the headers are stored, and the headers of the resulting fragments are modified;

if the packet is only an IP packet or a UDP/IP packet, fragmentation is performed at the IP layer, and if the packet is a TCP/IP packet, segmentation is performed at the TCP layer;

judging whether VLAN control information needs to be inserted and whether checksum calculation needs to be carried out;

and calling the AXI Master module to fetch the data, one packet-sized piece at a time, using the address and the data packet length in the descriptor, then modifying the data header, calculating the checksum and adding the VLAN control information.
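
As a rough illustration of the TCP-layer case in step 7-4, the sketch below splits one large payload into segments and derives the per-segment fields that the detailed description says must be rewritten (the TCP sequence number and the IPv4 total length). The `Segment` type, the `mss` parameter and the default header lengths are hypothetical; checksum recalculation and the IP identification field are left out.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    seq: int            # TCP sequence number of the first payload byte
    ip_total_len: int   # IPv4 total length: IP header + L4 header + payload
    payload: bytes

def segment_tcp(payload: bytes, first_seq: int, mss: int,
                iplen: int = 20, l4len: int = 20) -> List[Segment]:
    """Split a large TCP payload into pieces of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        segments.append(Segment(
            seq=(first_seq + offset) & 0xFFFFFFFF,   # sequence numbers wrap at 2**32
            ip_total_len=iplen + l4len + len(chunk),
            payload=chunk,
        ))
    return segments
```

Each segment would then receive a copy of the stored MAC, IP and L4 headers with these fields patched in before it is written to the TX FIFO.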

The third aspect of the present disclosure provides a method for implementing network acceleration.

A method for implementing network acceleration uses the network acceleration controller according to the first aspect of the present disclosure; the control method of the receiving end includes the following steps:

step 9-1: carrying out address filtering and CRC (cyclic redundancy check) checking on the received data, and directly discarding the data if address filtering fails, the CRC check fails, or the received data packet is shorter than 64 bytes;

step 9-2: at the same time, verifying the checksum, determining the packet type by parsing the data packet header, extracting the useful information, and calculating a 32-bit hash value using the Toeplitz algorithm;

step 9-3: extracting seven bits from the 32-bit value to index the table that stores the descriptor ring numbers, and reading from the table a three-bit value that identifies the descriptor ring;

step 9-4: determining the position of the descriptor ring according to the three-bit value, and reading the descriptor from the currently available address of the specified descriptor ring;

step 9-5: writing the data in the RX FIFO module to the corresponding address through the AXI Master module according to the address pointed to by the descriptor;

step 9-6: after the transmission is completed, the descriptor ring is updated.
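
Steps 9-3 and 9-4 amount to a table lookup, which can be sketched as below. The disclosure does not say which seven bits of the hash are used or how the table is filled; taking the low seven bits and spreading the rings evenly over the 128 entries are assumptions made only for this example, and the function names are hypothetical.

```python
from typing import List

TABLE_SIZE = 128   # seven index bits address 2**7 table entries
NUM_RINGS = 8      # a three-bit entry can name one of eight descriptor rings

def make_table(num_rings: int = NUM_RINGS) -> List[int]:
    """Build a table that spreads hash buckets evenly over the rings (assumed policy)."""
    return [i % num_rings for i in range(TABLE_SIZE)]

def select_ring(hash32: int, table: List[int]) -> int:
    """Map a 32-bit hash value to a receive descriptor ring number."""
    index = hash32 & 0x7F          # assumption: the low seven bits form the index
    return table[index] & 0x7      # three-bit descriptor ring number
```

Because packets of the same flow produce the same hash, they always land in the same ring, which lets one CPU core handle a flow without interference from the others.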

As some possible implementations, in step 9-2, the packet type includes:

an IPv4 packet, from which the destination IP address and the source IP address are extracted;

an IPv4 + TCP/UDP packet, from which the destination IP address, the source IP address, the destination port number and the source port number are extracted;

an IPv6 packet, from which the destination IP address and the source IP address are extracted;

an IPv6 + TCP/UDP packet, from which the destination IP address, the source IP address, the destination port number and the source port number are extracted.
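
The Toeplitz calculation of step 9-2 over the fields listed above can be sketched in Python as follows. The key shown is a placeholder, and the order in which the extracted fields are concatenated (source address, destination address, source port, destination port) is an assumption borrowed from common receive-side-scaling practice; the disclosure specifies neither.

```python
import ipaddress
import struct
from typing import Optional

# Hypothetical 40-byte secret key; a real device would be configured with its own.
KEY = bytes(range(1, 41))

def toeplitz_hash(key: bytes, data: bytes) -> int:
    """XOR together the 32-bit key window aligned with every set input bit."""
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 32:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):          # input bit i, MSB first
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result

def hash_input(src_ip: str, dst_ip: str,
               src_port: Optional[int] = None,
               dst_port: Optional[int] = None) -> bytes:
    """Concatenate the extracted fields (the field order is an assumption)."""
    data = ipaddress.ip_address(src_ip).packed + ipaddress.ip_address(dst_ip).packed
    if src_port is not None and dst_port is not None:
        data += struct.pack("!HH", src_port, dst_port)
    return data
```

A call such as `toeplitz_hash(KEY, hash_input("192.0.2.1", "198.51.100.7", 1234, 80))` covers the IPv4 + TCP/UDP case; omitting the ports covers the address-only cases, and IPv6 addresses are handled by the same functions.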

Compared with the prior art, the beneficial effects of this disclosure are:

1. The disclosure implements in hardware part of the operations that originally belonged to TCP/IP protocol stack processing. For a large data packet, the protocol stack needs to take part in only one operation and the segmentation is completed by hardware, which effectively reduces the involvement of software in network processing and lowers the load on the CPU (central processing unit).

2. The disclosure carries out operations in hardware through an internal handshake mechanism and notifies the CPU only after a certain number of interrupts have accumulated, so interrupts need not be sent to the CPU frequently, unnecessary waiting time is reduced, and the processing speed of network information is improved.

3. The disclosure supports multiple descriptor ring queues and can be applied to processors with a multi-core architecture, with each core responsible for processing a different descriptor ring without mutual interference.

4. The disclosure is compatible with the Linux software processing framework and with GSO (Generic Segmentation Offload), and supports turning the GSO function on or off; the HASH value written back into the descriptor facilitates processing by software GRO (Generic Receive Offload).

Drawings

Fig. 1 is a schematic architecture diagram of a network accelerator provided in embodiment 1 of the present disclosure.

Detailed Description

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.

The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.

Example 1:

As shown in Fig. 1, embodiment 1 of the present disclosure provides a network acceleration controller that implements in hardware part of the processing originally belonging to the TCP/IP protocol stack, so as to reduce the load on the CPU while the TCP/IP protocol stack processes network information. It specifically includes the following modules:

(1) The RGMII (Reduced Gigabit Media Independent Interface) module is the network transmission interface control module and is used to receive and send network information.

Three transmission modes of 10 Mbit/s, 100 Mbit/s and 1000 Mbit/s are supported. Compared with other transmission interfaces, RGMII supports more transmission rate modes and uses relatively few signal lines.

(2) The TXMAC module is mainly responsible for transmitting data from the TX FIFO to the network; it can automatically add the CRC check and preamble fields to the data packet and automatically generate a Jam packet for collision control.

The main control is implemented by a state machine. When the sending end needs to send data, it must first detect the state of the Ethernet, and sending can proceed only when the Ethernet is idle. If a collision is detected while the sending end is sending data, the sending end sends a Jam packet and determines whether the maximum limit has been exceeded; if not, it waits for the backoff time and then resends the current data packet. When the frame to be transmitted is shorter than 64 bytes (the minimum frame specified by Ethernet), the sending end automatically pads it to 64 bytes before transmitting it.

In addition, the TXMAC module can also send IEEE 1588 PTP frames, giving the user more accurate time information.

(3) The RXMAC module is mainly responsible for receiving data from the network and storing it in the RX FIFO module, with address filtering and CRC checking performed internally. The main control is implemented by a state machine. When the receiving end finishes receiving data, CRC (cyclic redundancy check) checking and address filtering are performed on the data; if the checks pass, the internal DMA is informed to move the data to the corresponding memory; if a check fails, the data is discarded and the controller is notified.

In addition, the RXMAC module can identify IEEE 1588 PTP frames and parse the precise time from them.

(4) The TX FIFO module is mainly responsible for buffering data at the sending end. When data is to be sent, the internal DMA moves the data from system memory to the TX FIFO module of the network controller; after the whole data packet has been stored, the TXMAC module is informed to take the data out of the TX FIFO and send it to the network. The data in the TX FIFO module is overwritten only after the entire data packet has been transmitted correctly, so that the data is still available if a collision occurs and the packet must be retransmitted.

(5) The RX FIFO module is mainly responsible for storing the data received by the receiving end. When the CRC comparison of the received data passes, the RXDMA is informed to start and moves the received data to system memory according to the descriptor of the receiving end; if the received packet is shorter than 64 bytes or the CRC comparison fails, the packet is discarded directly.

(6) The TOE module is mainly responsible for processing network information of a receiving end and a sending end.

Mainly comprises the following aspects:

(6-1) the sending end can split a large data packet sent by an upper layer (that is, a data packet exceeding the MTU), and the header of each resulting data packet is computed and filled in by hardware from the original header;

(6-2) the receiving end can perform HASH calculation on the received data packet and write the value back to the descriptor, so that software can classify the data packet conveniently;

(6-3) the receiving end can separate the header and the load of the received data packet according to the descriptor, so that the software processing is facilitated;

(6-4) the sending end can add VLAN control information according to the information in the descriptor;

(6-5) the receiving end is able to remove the VLAN control information and write the information back into the descriptor;

(6-6) the sending end can calculate the checksum of the IPv4 header and fill it into the corresponding position of the data packet; the receiving end can perform IPv4 header checksum comparison;

(6-7) the sending end can calculate the checksum of a UDP data packet and fill it into the corresponding position; the receiving end can perform UDP header checksum comparison;

(6-8) the sending end can calculate the checksum of a TCP data packet and fill it into the corresponding position; the receiving end can perform UDP header checksum comparison;

(6-9) interrupts can be accumulated, and a new interrupt is raised to notify the CPU once a certain number of interrupts has been reached.
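
Items (6-4) and (6-5) can be illustrated with a short sketch of VLAN tag insertion and removal. It assumes the VLAN control information is a standard IEEE 802.1Q tag (TPID 0x8100 followed by a 16-bit TCI placed after the source MAC address); the disclosure does not name the tag format, and the function names are hypothetical.

```python
import struct
from typing import Optional, Tuple

TPID_8021Q = 0x8100   # tag protocol identifier of an IEEE 802.1Q tag

def insert_vlan(frame: bytes, vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Sending end: insert a 4-byte tag after the destination and source MACs."""
    tci = (pcp & 0x7) << 13 | (dei & 0x1) << 12 | (vid & 0xFFF)
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]

def strip_vlan(frame: bytes) -> Tuple[bytes, Optional[int]]:
    """Receiving end: remove the tag and return its TCI for write-back."""
    if len(frame) >= 16 and frame[12:14] == b"\x81\x00":
        (tci,) = struct.unpack("!H", frame[14:16])
        return frame[:12] + frame[16:], tci
    return frame, None    # untagged frame: nothing to remove
```

On the receive side the returned TCI corresponds to the information that the TOE module writes back into the descriptor.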

(7) The TXDMA module is mainly responsible for three functions: reading the sender descriptor, reading the corresponding sending data according to the read-back descriptor and storing the corresponding sending data into the TX FIFO, and modifying the descriptor status bit after the transmission is finished.

(8) The RXDMA module is mainly responsible for three functions: reading the receiving end descriptor, moving the data from the RX FIFO to the corresponding system memory according to the read-back descriptor, and modifying the descriptor status bit after the transmission is finished.

(9) The ARBITER module is mainly responsible for processing the priority problem between 8 descriptor rings at the transmitting end and 8 descriptor rings at the receiving end.

There are two processing modes:

(9-1) polling, in which the descriptor rings are processed in turn;

(9-2) priority setting, in which high-priority rings are processed first and low-priority rings are processed last.
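
The two ARBITER modes can be sketched as two selection functions over the rings that currently have work pending. This is a software illustration only; the representation of pending rings as a list of booleans and of priorities as a list of integers (larger meaning more urgent) is an assumption.

```python
from typing import List, Optional

def pick_round_robin(pending: List[bool], last_served: int) -> Optional[int]:
    """Polling mode: serve the next pending ring after the one served last."""
    n = len(pending)
    for step in range(1, n + 1):
        ring = (last_served + step) % n
        if pending[ring]:
            return ring
    return None

def pick_priority(pending: List[bool], priority: List[int]) -> Optional[int]:
    """Priority mode: serve the pending ring with the highest priority value."""
    candidates = [r for r in range(len(pending)) if pending[r]]
    if not candidates:
        return None
    return max(candidates, key=lambda r: priority[r])
```

Polling gives every ring a turn, whereas the priority mode can starve low-priority rings while higher-priority rings stay busy.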

(10) The AXI Master module is mainly responsible for read and write-back operations. According to the operation instruction sent by the ARBITER, it reads data from system memory into the network acceleration controller during a read operation and writes data back to system memory during a write operation.

(11) The MDC module is mainly responsible for controlling the network PHY; software performs read and write operations on the network PHY by reading and writing registers.

(12) The RGF module is mainly responsible for read-write maintenance of the registers; software implements each required function by configuring the corresponding registers.

(13) The AHB Slave module is the interface through which the CPU accesses the registers. When the CPU needs to write data into the network acceleration controller, the AHB Slave module executes a write operation, writes the data into the corresponding address in the RGF and returns a status to inform the CPU that the write succeeded; when the CPU reads data from the network acceleration controller, the AHB Slave module executes a read operation, reads the value of the corresponding register in the RGF according to the address and returns a status to inform the CPU that the read succeeded.

The acceleration method using the network acceleration controller specifically comprises the following steps:

a sending end:

When the software issues the processed initial data packet (at most 64 KB) to the network layer, it needs to fill in the descriptor ring and, by operating a register, inform the network acceleration controller to proceed with the next operation.

The operation flow of the sending end is as follows:

(A) the software sends the processed initial data packet (up to 64 KB) to the network layer, then modifies the descriptor ring and, by operating a register of the network acceleration controller, informs the network acceleration controller to carry out the data reading operation. The eight descriptor rings are provided so that a multi-core CPU can later drive the network acceleration controller, and each ring can be used independently by a single core.

(B) The network acceleration controller starts to process the descriptor rings, handling them in order according to the preset priority (through the ARBITER module) and passing the request to the AXI Master module;

(C) the AXI Master module fetches the content of the descriptor from the address given in the instruction sent by the ARBITER module;

(D) the network acceleration controller decides the following further operation according to the content of the received descriptor:

(D-1) judging whether GSOEN is valid, and if not, performing no segmentation;

(D-2) if GSOEN is valid, performing segmentation, which mainly consists of determining the lengths of the MAC layer header, the IP layer header and the L4 layer header according to the MAC LEN, the IPLEN and the L4LEN in the descriptor, storing the headers and modifying the headers of the resulting fragments.

For IPv4, the fields to be modified are the total length, the identification (whether it changes can be determined by the descriptor), the fragment offset and the header checksum;

for IPv6, the field to be modified is the payload length; for UDP, the fields to be modified are the total length and the checksum;

for TCP, the fields to be modified are the sequence number and the checksum.

If the packet is only an IP packet or a UDP/IP packet, fragmentation is performed at the IP layer; if it is a TCP/IP packet, segmentation is performed at the TCP layer.

(D-3) judging whether VLAN control information needs to be inserted or not and whether checksum calculation needs to be carried out or not;

(D-4) calling the AXI Master module to fetch the data, one packet-sized piece at a time, using the address and the data packet length in the descriptor, then modifying the header (after the first piece, the header is added), calculating the checksum and adding the VLAN control information;

(E) after the data is written into the TX FIFO module, the TXMAC module is informed to fetch the data; the TXMAC module starts to send the data after detecting that the network is idle, and forms a standard Ethernet frame by adding a preamble, an SFD and a CRC to the data packet;

(F) if a collision occurs during sending, whether the set maximum limit has been exceeded is checked; if not, a binary exponential backoff algorithm is used to calculate a delay, after which the data is sent again;

(G) after the transmission is completed, the transmission state is written into the descriptor through the AXI Master;

(H) whether any descriptors are still waiting for transmission is queried, and if so, the flow starts again from (B).
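
The interaction between software and the controller in steps (A) to (H) can be modelled, very roughly, as a producer and a consumer sharing a descriptor ring. In the sketch below only the field names `gsoen`, `maclen`, `iplen` and `l4len` echo the descriptor fields named in the text; the ownership bit, the `done` status, the ring layout and the method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TxDescriptor:
    buffer_addr: int = 0
    length: int = 0
    gsoen: bool = False        # segmentation enable
    maclen: int = 0            # MAC header length
    iplen: int = 0             # IP header length
    l4len: int = 0             # L4 header length
    owned_by_hw: bool = False  # hypothetical ownership bit
    done: bool = False         # hypothetical status written back after sending

class TxRing:
    def __init__(self, size: int) -> None:
        self.desc = [TxDescriptor() for _ in range(size)]
        self.tail = 0          # next slot software fills
        self.head = 0          # next slot the controller processes

    def post(self, buffer_addr: int, length: int, **offload) -> bool:
        """Software side, step (A): fill a descriptor and hand it to hardware."""
        if self.desc[self.tail].owned_by_hw:
            return False       # ring full
        self.desc[self.tail] = TxDescriptor(buffer_addr, length,
                                            owned_by_hw=True, **offload)
        self.tail = (self.tail + 1) % len(self.desc)
        return True

    def hw_process_one(self) -> bool:
        """Controller side, steps (B) to (G): consume one descriptor, write status."""
        d = self.desc[self.head]
        if not d.owned_by_hw:
            return False       # step (H): nothing waiting
        d.owned_by_hw = False
        d.done = True
        self.head = (self.head + 1) % len(self.desc)
        return True
```

With eight such rings, one per core, software on different cores can post packets without sharing ring state.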

Receiving end

When the receiving end receives data, it needs to parse the header information of the received data packet and use part of that information to determine the receiving end descriptor ring.

The receiving end specifically comprises the following steps:

(a) Address filtering and CRC checking are carried out on the received data, and the data is discarded directly if address filtering fails, the CRC check fails, or the received data packet is shorter than 64 bytes.

(b) At the same time, the checksum is verified, the packet type is determined by parsing the data packet header, the useful information is extracted, and a 32-bit hash value is calculated using the Toeplitz algorithm.

The packet types mainly include the following:

(b-1) for an IPv4 packet, the destination IP address and the source IP address are extracted;

(b-2) for an IPv4 + TCP/UDP packet, the destination IP address, the source IP address, the destination port number and the source port number are extracted;

(b-3) for an IPv6 packet, the destination IP address and the source IP address are extracted;

(b-4) for an IPv6 + TCP/UDP packet, the destination IP address, the source IP address, the destination port number and the source port number are extracted;

(c) extracting 7 bits from the 32-bit value to index the table that stores the descriptor ring numbers, and reading from the table a 3-bit value that identifies the descriptor ring;

(d) determining the position of the descriptor ring according to the 3-bit value, and reading the descriptor from the currently available address of the specified descriptor ring (the address in use by each descriptor ring is recorded every time);

(e) writing the data in the RX FIFO module into a corresponding address through the AXI Master according to the address pointed by the descriptor ring;

(f) after the transmission is completed, the descriptor ring is updated.
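
The acceptance test of step (a) can be sketched as follows. The sketch assumes the usual Ethernet conventions: a minimum frame length of 64 bytes including the frame check sequence, and an FCS equal to the CRC-32 computed by `zlib.crc32` stored least significant byte first. Address filtering is reduced to an exact unicast match plus broadcast, which is a simplification; the function names are hypothetical.

```python
import zlib

MIN_FRAME_LEN = 64          # minimum Ethernet frame length in bytes, FCS included
BROADCAST = b"\xff" * 6

def fcs(frame_without_fcs: bytes) -> bytes:
    """Frame check sequence: CRC-32 of the frame, least significant byte first."""
    return zlib.crc32(frame_without_fcs).to_bytes(4, "little")

def accept_frame(frame: bytes, station_mac: bytes) -> bool:
    """Return True if the frame passes the length, CRC and address checks."""
    if len(frame) < MIN_FRAME_LEN:
        return False                      # runt frame
    if frame[-4:] != fcs(frame[:-4]):
        return False                      # CRC error
    dst = frame[:6]
    return dst == station_mac or dst == BROADCAST
```

Frames that fail any of the three checks never reach the RXDMA, so no descriptor is consumed for them.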

In this embodiment, part of the operations originally belonging to TCP/IP protocol stack processing are implemented in hardware. For a large data packet, the protocol stack needs to take part in only one operation and the segmentation is completed by hardware, which effectively reduces the involvement of software in network processing and lowers the load on the CPU. Operations are carried out in hardware through an internal handshake mechanism, and the CPU is notified only after a certain number of interrupts have accumulated, so interrupts need not be sent to the CPU frequently, unnecessary waiting time is reduced, and the processing speed of network information is greatly improved.
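
The interrupt accumulation described above can be illustrated by a simple counter. The sketch assumes a fixed threshold configured by software; the class and attribute names are hypothetical, and the disclosure does not describe a timeout for flushing a partially filled count, so none is modelled.

```python
class InterruptCoalescer:
    """Raise one CPU interrupt for every `threshold` internal events."""

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.pending = 0    # events accumulated since the last CPU interrupt
        self.raised = 0     # CPU interrupts raised so far

    def event(self) -> bool:
        """Record one internal event; return True if the CPU is interrupted."""
        self.pending += 1
        if self.pending >= self.threshold:
            self.pending = 0
            self.raised += 1
            return True
        return False
```

With a threshold of 4, ten completed packets cause two CPU interrupts instead of ten, and two events remain accumulated.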

The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims

1. A network acceleration controller is characterized by comprising a TOE module, a TXMAC module, an RXMAC module, a TX FIFO module and an RX FIFO module, wherein the TXMAC module is used for transmitting data from the TX FIFO to a network, the RXMAC module is used for receiving the data from the network and storing the data to the RX FIFO module, the TOE module is respectively connected with the TXMAC module, the RXMAC module, the TX FIFO module and the RX FIFO module and is used for processing network information of a receiving end and a transmitting end;

the system also comprises a TXDMA module, an RXDMA module, an ARBITER module and an AXI Master module, wherein the TXDMA module is connected with the TOE module and is used for reading a descriptor of a sending end, reading corresponding sending data according to the read descriptor and storing the sending data into the TX FIFO module; the RXDMA module is connected with the TOE module and used for reading a descriptor of a receiving end and moving data from the RX FIFO to a corresponding system memory according to the read descriptor;

the ARBITER module is used for processing the processing priority between a plurality of descriptor rings at a sending end and a plurality of descriptor rings at a receiving end, the AXI Master module is used for reading data from a system memory to a network acceleration controller in a reading operation and writing the data back to the system memory in a writing operation according to an operation instruction sent by the ARBITER module;

the TOE module processes network information of a sending end, and specifically comprises:

the checksum of the TCP data packet can be calculated and filled in the corresponding position; the interrupt can be accumulated, and a new interrupt is started to inform the CPU to process after a certain number of interrupts are reached;

the TOE module processes network information of a receiving end, specifically:

2. The network acceleration controller of claim 1, characterized in that the network acceleration controller further comprises a network transmission interface control module for receiving and transmitting network information, supporting at least three transmission modes of 10 Mbit/s, 100 Mbit/s and 1000 Mbit/s.

3. The network acceleration controller of claim 1, characterized in that the controller further comprises an RGF module and an MDC module, the RGF module being used for read-write maintenance of registers; the MDC module is connected with the RGF module and used for controlling the network PHY, and the read-write operation of the network PHY can be realized through the read-write of the register.

4. The network acceleration controller of claim 3, wherein the controller further comprises an AHB Slave module, and the master control module accesses the register through the AHB Slave module, specifically:

5. A method for implementing network acceleration, wherein the method for controlling the transmitting end by using the network acceleration controller of any one of claims 1 to 4 comprises the following steps:

step 7-4: the network acceleration controller determines the next operation according to the content of the received descriptor;

step 7-7: after the transmission is completed, writing the transmission state into the descriptor through the AXI Master module, querying whether any descriptor is still waiting for transmission, and if so, returning to step 7-2.

6. The method for implementing network acceleration according to claim 5, wherein in the step 7-4, specifically:

judging whether VLAN control information needs to be inserted and whether checksum calculation needs to be carried out; and calling the AXI Master module to fetch the data, one packet-sized piece at a time, using the address and the data packet length in the descriptor, then modifying the data header, calculating the checksum and adding the VLAN control information.

7. A method for implementing network acceleration, wherein the network acceleration controller of any one of claims 1-4 is used to control a receiving end, and the method comprises the following steps:

step 9-6: after the transmission is completed, the descriptor ring is updated.

8. The method for implementing network acceleration according to claim 7, wherein in the step 9-2, the packet type includes:

an IPv4 packet for extracting a destination IP address and a source IP address;

an IPv6 packet for extracting a destination IP address and a source IP address;