CN114221905A - Processing unit and flow control unit, and related methods - Google Patents

Processing unit and flow control unit, and related methods

Info

Publication number
CN114221905A
Authority
CN
China
Prior art keywords
value
flow
data
flow value
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010912936.7A
Other languages
Chinese (zh)
Inventor
郭向东
严青
林英姿
吴炜
肖德宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pingtouge Shanghai Semiconductor Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010912936.7A
Publication of CN114221905A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present disclosure provides a processing unit, a flow control unit, and related methods. The flow control unit is located at a sending end and comprises: a first flow extreme value memory for storing a first flow extreme value; a first flow consumption memory for storing the flow value consumed by the sending end; a comparator for allowing data to be sent to a receiving end when it determines that the size of the data to be sent does not exceed the difference between the first flow extreme value and the consumed flow value; and a second flow value generator for sending the consumed flow value, as a second flow value, to the flow control unit located at the receiving end, so that the receiving-end flow control unit adjusts the flow value it next sends to the first flow extreme value memory according to the difference between the second flow value and the amount of data the receiving end has received from the sending end. Embodiments of the disclosure prevent the sending end from being left without an updated flow value for sending data when data is lost in transit, thereby avoiding dead communication.

Description

Processing unit and flow control unit, and related methods
Technical Field
The present disclosure relates to processing unit interconnects, and more particularly, to a processing unit and flow control unit, and related methods.
Background
Artificial intelligence (AI) chips (e.g., an NPU, a neural network acceleration unit) are dedicated to hardware acceleration of neural network and machine learning processing. When deployed, AI chips often need to be interconnected to form a larger processing network that can handle AI tasks requiring more computing resources.
When AI chips communicate, data overflow occurs if the amount of data sent by the sending-end AI chip exceeds the receive buffer capacity of the receiving-end AI chip. Flow control is therefore required. One prior-art flow control technique relies on advance notification of an allowed flow value. After data in the receive buffer of the receiving-end AI chip has been moved into memory and no longer occupies the receive buffer, the receiving-end AI chip can send the sending-end AI chip a flow value, determined by the size of that data (the amount of data transferred from the receive buffer to the receiving-end memory), indicating how many packets the sending end is allowed to transmit to it. As data continues to be transferred from the receive buffer to memory during operation, the flow value sent to the sending-end AI chip is continuously updated. The sending-end AI chip consumes a corresponding amount of flow value whenever it sends data. Before sending data to the receiving-end AI chip, the sending-end AI chip checks whether the size of the data to be sent is no larger than the difference between the flow extreme value and the consumed flow value; if so, the data can be sent.
This mechanism, based on advance notification of the allowed flow value, reduces data overflow but risks dead communication. If data sent by the sending-end AI chip is lost in transit, the receiving-end AI chip never receives it, so the data is never buffered in the receive buffer and never moved from the receive buffer to memory, and no flow value update to the sending-end AI chip is triggered. The total flow value released to the sending-end AI chip is thus less than expected. As more and more data is lost in transit, the sending-end AI chip eventually receives no further flow value updates and can no longer send data, so communication falls into a dead state.
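The stall described above can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions that are not in the source: every packet costs one flow value unit, the receiver moves each arriving packet to memory immediately, and the function name `simulate_one_way` is hypothetical.

```python
def simulate_one_way(buffer_units, packets, lost):
    """Prior-art style scheme: allowance returns only for data that arrived.

    Returns (packets_sent_before_stall_or_end, packets_delivered).
    """
    limit = buffer_units      # flow extreme value known to the sender
    consumed = 0              # flow value consumed by the sender
    delivered = 0
    for i in range(packets):
        size = 1              # assumption: one flow value unit per packet
        if size > limit - consumed:
            return i, delivered   # no allowance left: dead communication
        consumed += size
        if i in lost:
            continue          # lost in transit: never buffered, never credited back
        delivered += size
        limit += size         # receiver drains the buffer and releases the allowance
    return packets, delivered
```

Each lost packet permanently removes one unit of allowance in this model, so once the number of lost units equals the initial allowance the sender can no longer transmit.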
Disclosure of Invention
In view of this, the present disclosure aims to prevent the situation in which, because data is lost in transit, the sending end has no updated flow value available for sending data, thereby avoiding dead communication.
To achieve this object, according to an aspect of the present disclosure, there is provided a flow control unit at a transmitting end, including:
a first flow extreme value storage, configured to store a first flow extreme value, where the first flow extreme value is updated according to a first flow value sent by a flow control unit located at a receiving end, and the first flow value represents a data amount allowed to be further received by the receiving end;
a first traffic consumption memory, configured to store a traffic value consumed by the sending end, where the consumed traffic value represents a data amount that the sending end has sent to the receiving end;
a comparator, configured to determine that the size of data to be sent does not exceed the difference between the first flow extreme value and the consumed flow value, and allow the data to be sent to the receiving end;
and the second flow generator is used for sending the consumed flow value serving as a second flow value to the flow control unit at the receiving end, so that the flow control unit at the receiving end adjusts the first flow value sent to the first flow extremum memory next time according to the difference between the second flow value and the data volume received by the receiving end from the sending end.
Optionally, the sending end includes a sending end processing unit, and the flow control unit is located in the sending end processing unit; the receiving end comprises a receiving end processing unit, and the flow control unit of the receiving end is positioned in the receiving end processing unit.
Optionally, the flow control unit further comprises:
a first flow value parser for parsing a flow value from a current first flow value notification packet received from the flow control unit at the receiving end;
a previous first flow value buffer for buffering the flow value parsed from a previous first flow value notification packet received from the flow control unit at the receiving end;
a subtractor for subtracting the flow value cached in the previous first flow value buffer from the flow value parsed from the current first flow value notification packet to obtain a flow value difference;
and a first adder for adding the flow value difference to the first flow extreme value stored in the first flow extreme value memory.
Optionally, after the subtractor obtains the flow value difference, the first flow value parser updates the previous first flow value buffer with the flow value parsed from the current first flow value notification packet.
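The parser, buffer, subtractor, and adder described above can be sketched as follows. This is an illustrative model only: the class and attribute names are hypothetical, and counter wrap-around is ignored for simplicity.

```python
class FirstFlowLimit:
    """Sender-side tracking of the first flow extreme value (illustrative)."""

    def __init__(self):
        self.limit = 0       # first flow extreme value memory
        self.previous = 0    # previous first flow value buffer

    def on_notification(self, flow_value):
        """Handle a cumulative flow value parsed from a notification packet."""
        delta = flow_value - self.previous   # subtractor
        self.limit += delta                  # first adder
        self.previous = flow_value           # refresh the previous-value buffer
        return delta
```

Because the notified flow value is cumulative, only the difference from the previously notified value is added to the stored extreme value.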
Optionally, the flow control unit further comprises: a second adder for adding the size of the transmitted data to the consumed flow value stored in the first flow consumption memory after the data is transmitted to the receiving end.
Optionally, the second traffic value generator puts the second traffic value in a second traffic value notification packet, and transmits the second traffic value to the receiving end.
Optionally, the first flow value parser performs a transmission error check on the current first flow value notification packet received from the receiving end, and discards the packet if the check fails.
Optionally, the current first traffic value notification packet includes a first traffic value initial notification packet and a first traffic value update notification packet, and the transmitting-end chip flow control unit further includes a first traffic value initial acknowledgement packet generator, configured to generate a first traffic value initial acknowledgement packet to be transmitted to the receiving end after the first traffic value extremum memory stores the traffic value notified by the first traffic value initial notification packet.
Optionally, the data, the first flow value initial notification packet, the first flow value update notification packet, the first flow value initial acknowledgement packet, and the second flow value notification packet sent by the sending end to the receiving end are transmitted in a flow value message packet format, where the flow value message packet format includes a destination chip address, a source chip address, a packet type, a notified flow value, a padding bit, and a transmission error check bit.
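The field list above can be made concrete with a small encoding sketch. The source names the fields but not their widths or the check algorithm, so everything specific here is an assumption: 2-byte chip addresses, a 1-byte packet type, a 2-byte flow value, 1 padding byte, a CRC-32 check field, and the hypothetical type code `PKT_SECOND_FLOW_VALUE`.

```python
import struct
import zlib

# Hypothetical layout (big-endian): destination address, source address,
# packet type, notified flow value, one padding byte.
HEADER = struct.Struct(">HHBHx")
PKT_SECOND_FLOW_VALUE = 0x04  # hypothetical packet type code


def pack_packet(dst, src, pkt_type, flow_value):
    """Serialize the fields and append a CRC-32 as the error check bits."""
    body = HEADER.pack(dst, src, pkt_type, flow_value)
    return body + struct.pack(">I", zlib.crc32(body))


def unpack_packet(raw):
    """Return (dst, src, pkt_type, flow_value), or None if the check fails."""
    body = raw[:HEADER.size]
    (crc,) = struct.unpack(">I", raw[HEADER.size:])
    if zlib.crc32(body) != crc:
        return None   # transmission error detected: discard the packet
    return HEADER.unpack(body)
```

A packet whose check field does not match is dropped rather than parsed, which mirrors the discard behavior described for the first flow value parser.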
Optionally, the first flow extremum stored in the first flow extremum storage includes a plurality of first flow extremums respectively corresponding to virtual channels of a plurality of receiving ends; the first flow consumption memory stores flow values respectively consumed by a sending end aiming at virtual channels of the plurality of receiving ends; the comparator determines a virtual channel of a receiving end to which data is to be sent, and if the size of the data to be sent does not exceed the difference between a flow extreme value corresponding to the virtual channel of the receiving end and a consumed flow value corresponding to the virtual channel of the receiving end, the data is allowed to be sent to the virtual channel of the receiving end; the second flow value generator takes the flow values respectively consumed by the virtual channels of the plurality of receiving ends as second flow values respectively corresponding to the virtual channels of the plurality of receiving ends to respectively send the second flow values to the plurality of receiving ends; the second traffic value of the notification includes second traffic values corresponding to virtual channels of a plurality of receiving ends.
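The per-virtual-channel variant can be pictured as the same check applied to state keyed by channel. The sketch below is illustrative only; the class name, the `(receiver, channel)` key shape, and the method names are hypothetical.

```python
from collections import defaultdict


class PerChannelGate:
    """Keeps one extreme value and one consumed value per virtual channel."""

    def __init__(self):
        self.limit = defaultdict(int)      # first flow extreme value per channel
        self.consumed = defaultdict(int)   # consumed flow value per channel

    def try_send(self, channel, size):
        """Comparator applied to the channel the data is destined for."""
        if size > self.limit[channel] - self.consumed[channel]:
            return False
        self.consumed[channel] += size
        return True

    def second_flow_values(self):
        """Consumed values to report, one per virtual channel."""
        return dict(self.consumed)
```

Exhausting the allowance of one channel does not affect any other channel, since each comparison uses only that channel's two values.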
Optionally, the unit of the flow value or the second flow value is set to 64 bytes, the initial flow value in the first flow value initial notification packet is set to 1440 flow value units, and the maximum value stored by the first flow extreme value memory and the first flow consumption memory is 4096 flow value units, the storage being reset after the maximum value is exceeded.
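Because the stored values reset after exceeding the maximum, differences between two counters have to be taken with wrap-around in mind. The following sketch assumes the counters behave as modulo-4096 counters (values 0 to 4095); the source only states that storage is reset after the maximum is exceeded, so the exact modulus and the helper names are assumptions.

```python
COUNTER_MODULUS = 4096   # assumption: counters wrap to 0 after 4095


def wrapped_add(value, increment):
    """Accumulate into a counter that resets after passing its maximum."""
    return (value + increment) % COUNTER_MODULUS


def wrapped_diff(newer, older):
    """Difference newer - older that stays correct across one wrap-around."""
    return (newer - older) % COUNTER_MODULUS
```

The difference is unambiguous as long as the true gap between the two counters stays below the modulus, which holds here if the gap never exceeds the 1440-unit initial allowance.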
Optionally, the period for sending the flow value or the second flow value is N times the period at which the sending end sends data to the receiving end, where N is a positive integer greater than or equal to 2.
Optionally, N = 8 or N = 16.
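One way to read the period relationship is that a notification is issued once every N data-sending periods. The sketch below models that reading; the class name and the `tick` method are hypothetical.

```python
class SecondFlowNotifier:
    """Emits the consumed flow value once every N data-sending periods."""

    def __init__(self, n=8):
        if n < 2:
            raise ValueError("N must be an integer >= 2")
        self.n = n
        self.cycles = 0

    def tick(self, consumed):
        """Call once per data-sending period with the current consumed value.

        Returns the value to place in a second flow value notification
        packet on every Nth period, otherwise None.
        """
        self.cycles += 1
        if self.cycles % self.n == 0:
            return consumed
        return None
```

A larger N reduces notification overhead on the link, at the cost of a longer delay before lost data is detected and compensated.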
According to an aspect of the present disclosure, there is provided a processing unit, located at a transmitting end, including: a memory; a direct memory access module for controlling direct access to the memory; a plurality of ports; and a switch module for controlling the connection of the direct memory access module to the plurality of ports, wherein at least one of the plurality of ports comprises the transmitting-end chip flow control unit according to claims 1-13, a MAC layer for performing media access control (MAC), and a serializer/deserializer for serializing data to be transmitted or deserializing received data.
According to an aspect of the present disclosure, there is provided a flow control unit at a receiving end, including:
a receive data buffer for buffering data received from a transmitting end;
a second traffic value memory for storing a second traffic value received from the transmitting end;
a received data size memory for storing a size of data received from the transmitting end;
a difference calculation unit for calculating a difference between the second traffic value stored in the second traffic value memory and the received data size stored in the received data size memory;
and the first flow value distribution unit is used for updating the flow value sent to the sending end according to the difference between the second flow value and the size of the received data and the data quantity sent to the receiving end memory by the received data buffer.
Optionally, the receiving end includes a receiving end processing unit, and the flow control unit is located in the receiving end processing unit; the sending end comprises a sending end processing unit, and the flow control unit of the sending end is positioned in the sending end processing unit.
Optionally, the received data size memory stores an initial value set to 0, and the flow control unit further includes: a third adder for, in response to receiving data from the transmitting end, adding the received data size to the value stored by the received data size memory.
Optionally, the third adder adds the difference to a value stored in the received data size memory after the difference calculation unit calculates the difference.
Optionally, the flow control unit further comprises: and a fifth adder for adding the difference between the second traffic value and the received data size and the amount of data transmitted from the received data buffer to the receiver memory to the traffic value allocated by the first traffic value allocation unit.
Optionally, the flow control unit further comprises: and a first flow value notification packet generator for placing the updated flow value in a first flow value notification packet and transmitting the same to the transmitting end.
Optionally, the flow control unit further comprises: a second flow value parser for parsing a second flow value from a second flow value notification packet received from the transmitting end and storing the second flow value in the second flow value memory.
Optionally, the current first flow value notification packet comprises a first flow value initial notification packet and a first flow value update notification packet; the receiving end chip flow control unit further receives a first flow value initial acknowledgement packet sent by the sending end chip flow control unit after sending the first flow value initial notification packet to the sending end chip flow control unit.
Optionally, the data, the first flow value initial notification packet, the first flow value update notification packet, the first flow value initial acknowledgement packet, and the second flow value notification packet sent by the sending end to the receiving end are transmitted in a flow value message packet format, where the flow value message packet format includes a destination chip address, a source chip address, a packet type, a notified flow value, a padding bit, and a transmission error check bit.
Optionally, the second traffic value stored in the second traffic value storage includes a plurality of second traffic values respectively corresponding to virtual channels of a plurality of transmitting ends; the received data size storage stores data sizes received from virtual channels of the plurality of transmitting ends, respectively; the difference calculation unit calculates differences between a plurality of second traffic values corresponding to virtual channels of a plurality of transmitting ends stored in the second traffic value memory and received data sizes corresponding to virtual channels of the plurality of transmitting ends stored in the received data size memory, respectively; the first traffic value allocation unit updates, according to the difference obtained for the virtual channels of the multiple transmitting ends and the data amount sent to a receiving end memory for the virtual channels of the multiple transmitting ends in the received data buffer, the multiple first traffic values sent to the virtual channels of the multiple transmitting ends respectively, where the notified first traffic values include first traffic values corresponding to the virtual channels of the multiple transmitting ends.
Optionally, the size of the receiving data buffer is set to 720k bits, the unit of the traffic value is set to 64 bytes, the initial traffic value in the first traffic value initial notification packet is set to 1440 units of traffic value, the maximum value of the received data size stored in the received data size memory and the second traffic value stored in the second traffic value memory is 4096 units of traffic value, and the storage is reset after the maximum value is exceeded.
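The figures in this embodiment are mutually consistent, which a short calculation makes visible. The sketch assumes that "720k bits" means 720 × 1024 bits; the constant names are illustrative.

```python
UNIT_BYTES = 64            # one flow value unit
INITIAL_UNITS = 1440       # initial flow value, in units
COUNTER_MAX_UNITS = 4096   # maximum stored value before reset

# Initial allowance expressed in bytes and in kibibits.
initial_bytes = INITIAL_UNITS * UNIT_BYTES      # 92160 bytes
initial_kibibits = initial_bytes * 8 / 1024     # 720.0, the whole receive buffer

# Data volume represented by a full counter range.
counter_range_bytes = COUNTER_MAX_UNITS * UNIT_BYTES   # 262144 bytes
```

Under this reading the initial flow value grants exactly the capacity of the empty receive data buffer, and the counter range covers more than twice that capacity.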
Optionally, the period for sending the flow value or the second flow value is N times the period at which the sending end sends data to the receiving end, where N is a positive integer greater than or equal to 2.
Optionally, N = 8 or N = 16.
According to an aspect of the present disclosure, there is provided a processing unit, located at a receiving end, including: a memory; a direct memory access module to control direct access of the memory; a plurality of ports; a switch module controlling connection of the direct memory access module with the plurality of ports, wherein at least one of the plurality of ports includes the receiving-side chip flow control unit according to claims 15-27, a MAC layer for performing a medium access control MAC, and a serializer and deserializer serializing data to be transmitted or deserializing received data.
According to an aspect of the present disclosure, a sending-end flow control method is provided, including:
updating a first flow extreme value according to a flow value sent by a receiving end, wherein the sent flow value represents the amount of data that the receiving end currently allows to be further received;
if the size of the data to be sent does not exceed the difference between the first flow extreme value and the consumed flow value, allowing the data to be sent to the receiving end, wherein the consumed flow value is equal to the amount of data the sending end has sent to the receiving end;
and taking the consumed flow value as a second flow value to be sent to the receiving end, so that the receiving end adjusts the flow value sent to the sending end according to the difference between the second flow value and the data volume which is received by the receiving end from the sending end.
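The check-then-send step and the reporting step of this method can be sketched as follows. The class and method names are hypothetical, and sizes are assumed to be expressed in flow value units.

```python
class SenderGate:
    """Sender-side admission check against the first flow extreme value."""

    def __init__(self, first_flow_limit):
        self.first_flow_limit = first_flow_limit   # first flow extreme value
        self.consumed = 0                          # consumed flow value

    def try_send(self, size):
        """Allow the send only if size fits within limit minus consumed."""
        if size > self.first_flow_limit - self.consumed:
            return False
        self.consumed += size   # sent data is added to the consumed value
        return True

    def second_flow_value(self):
        """The consumed flow value, reported to the receiver as-is."""
        return self.consumed
```

For example, with an extreme value of 10 a 6-unit send succeeds, a following 5-unit send is refused, and a 4-unit send then succeeds and exhausts the allowance.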
Optionally, the updating the first flow extremum according to the flow value sent by the receiving end includes:
parsing a flow value from a current first flow value notification packet received from the receiving end;
subtracting the flow value analyzed from the previous first flow value notification packet received from the receiving end from the flow value analyzed from the current first flow value notification packet to obtain a flow value difference value;
and adding the flow value difference to the flow value stored in the first flow extremum storage.
Optionally, after allowing the data to be transmitted to the receiving end, the method further includes: adding the size of the data sent to the consumed flow value.
Optionally, the sending the consumed flow value as a second flow value to the receiving end includes: placing the consumed flow value, as the second flow value, in a second flow value notification packet and sending the packet to the receiving end.
Optionally, before parsing the flow value from the current first flow value notification packet received from the receiving end, the method further includes: performing a transmission error check on the current first flow value notification packet received from the receiving end, and discarding the packet if the check fails.
According to an aspect of the present disclosure, there is provided a receiving end chip flow control method, including:
storing data received from a transmitting end in a received data buffer;
receiving a second traffic value from the transmitting end;
acquiring the size of data received from a sending end chip;
calculating a difference between the second traffic value and the received data size;
and updating the flow value sent to the sending end according to the difference and the data volume transferred from the receiving data buffer to the receiving end memory.
Optionally, the obtaining the size of the data received from the sending end includes:
setting a received data size memory storing an initial value of 0;
in response to receiving data from the sender, accumulating the received data size to a value stored by the received data size memory;
reading the received data size from the received data size memory.
Optionally, after calculating the difference between the second traffic value and the received data size, the method further comprises: the difference is accumulated onto a value stored by the received data size memory.
Optionally, the updating the traffic value sent to the sender according to the difference and the amount of data transferred from the receive data buffer to the receive side memory includes:
adding a difference between the second traffic value and the size of the received data, and the amount of data transferred from the receive data buffer to a receive side memory, to the traffic value allocated for transmission to the transmit side;
and sending the flow value to the sending end chip.
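The receiving-end steps above can be sketched as a small state machine. It assumes in-order delivery, so that a second flow value never lags behind the data it accounts for; the class and method names are hypothetical.

```python
class ReceiverFlowControl:
    """Receiver-side bookkeeping for the bidirectional scheme (illustrative)."""

    def __init__(self, initial_flow_value):
        self.flow_value = initial_flow_value   # first flow value sent to the sender
        self.received = 0                      # received data size memory, starts at 0

    def on_data(self, size):
        """Accumulate the size of data that actually arrived."""
        self.received += size

    def on_drain(self, size):
        """Data moved from the receive buffer to memory frees allowance."""
        self.flow_value += size
        return self.flow_value

    def on_second_flow_value(self, second_flow_value):
        """Compensate for data the sender sent but the receiver never saw."""
        lost = second_flow_value - self.received   # difference calculation
        self.received += lost                      # count each loss only once
        self.flow_value += lost                    # return the lost allowance
        return self.flow_value
```

Adding the difference back to the received size is what keeps a repeated notification of the same second flow value from compensating the same loss twice.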
Optionally, the sending the flow value to the sending end includes: and the flow value is put in a first flow value notification packet and is sent to the sending end.
Optionally, the receiving a second flow value from the transmitting end includes: parsing the second flow value from a second flow value notification packet received from the transmitting end.
Unlike the prior-art flow control mechanism based on a unidirectional flow value, embodiments of the present disclosure provide a flow control mechanism based on bidirectional flow values. Not only does the receiving end send the sending end an allowed flow value, based on how much data has been transferred from its receive data buffer to memory, to inform the sending end how much data it may send; the sending end also sends its consumed flow value (that is, how much data it has sent to the receiving end) to the receiving end as a second flow value. The receiving end can then compare the second flow value with the amount of data it has actually received from the sending end. If the two are inconsistent, data has been lost in transit, and the receiving end adjusts the flow value sent to the sending end according to the difference, keeping the flow values at the sending and receiving ends synchronized. In this way, even if data is lost in transit, the allowed flow value is corrected automatically and dead communication does not occur.
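The self-correcting behavior can be illustrated with a toy end-to-end simulation. It is a sketch under strong assumptions not taken from the source: one flow value unit per packet, immediate transfer from buffer to memory, loss-free notification packets, a first flow value update every tick, and a second flow value notification every `n` ticks. The function name is hypothetical.

```python
def simulate_two_way(buffer_units, packets, lost, n=4):
    """Returns how many packets the sender managed to transmit."""
    limit = buffer_units        # sender: first flow extreme value
    consumed = 0                # sender: consumed flow value
    flow_value = buffer_units   # receiver: first flow value
    received = 0                # receiver: received data size
    sent = 0
    ticks = 0
    while sent < packets and ticks < 10 * packets:
        ticks += 1
        if 1 <= limit - consumed:       # comparator
            consumed += 1
            if sent not in lost:        # packet index `sent` arrives
                received += 1
                flow_value += 1         # drained to memory, allowance released
            sent += 1
        if ticks % n == 0:
            # Second flow value reaches the receiver, which compensates the loss.
            diff = consumed - received
            received += diff
            flow_value += diff
        # First flow value update reaches the sender (equivalent to adding deltas).
        limit = flow_value
    return sent
```

Even when the first four packets are all lost and the initial allowance is four units, the sender recovers its allowance at the next second flow value notification and goes on to transmit every packet.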
Drawings
The foregoing and other objects, features, and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which refers to the accompanying drawings in which:
FIG. 1 is an overall architecture diagram of a processing unit interconnect to which embodiments of the present disclosure are applied;
FIG. 2 is a diagram of the internal structure of two interconnected processing units;
FIG. 3 is an internal structural diagram of a transmitting-end flow control unit and a receiving-end flow control unit of the related art;
FIG. 4 is an internal structural diagram of a transmitting-end flow control unit and a receiving-end flow control unit according to one embodiment of the present disclosure;
FIG. 5 is a flowchart of a flow control method of a transmitting end according to one embodiment of the present disclosure;
FIG. 6 is a flowchart of a flow control method at a receiving end according to one embodiment of the present disclosure;
FIG. 7 is a diagram of a generic format of a flow value message packet according to one embodiment of the present disclosure.
Detailed Description
The present disclosure is described below based on examples, but it is not limited to these examples. In the following detailed description, some specific details are set forth. It will be apparent to those skilled in the art that the present disclosure may be practiced without these specific details. Well-known methods, procedures, and processes have not been described in detail so as not to obscure the present disclosure. The figures are not necessarily drawn to scale.
Interpretation of terms
The following terms are used herein.
Processing unit: a unit that performs information processing by running a program. It includes conventional processing units (central processing units, etc.) that perform general-purpose operations, and acceleration units that speed up tasks the conventional processing unit handles slowly, for example an acceleration unit designed to increase data processing speed in the neural network field, where the conventional processing unit is inefficient. Examples include a central processing unit (CPU), a graphics processing unit (GPU), a general-purpose graphics processing unit (GPGPU), a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), and dedicated intelligent acceleration hardware (e.g., a neural network processor, NPU).
Flow value: the cumulative amount of data that the receiving end has allowed to be received, that is, the accumulated value of the amount of data the receiving end has historically allowed the sending end to transmit. It is not the increment of buffer space newly freed on each occasion, but the cumulative sum of all such increments over time. Data received by the receiving end is first buffered in a receive data buffer and then formally stored in memory. Once data in the receive data buffer has been moved into memory, part of the buffer's capacity is freed for receiving new data. The receiving end continuously accumulates the newly freed amounts of data that the sending end is allowed to send, forming the first flow value.
First flow extreme value: the cumulative amount of data that the receiving end allows to be received, as known to the sending end. That is, the flow value is the cumulative allowed amount as known to the receiving end itself, and the first flow extreme value is the same cumulative allowed amount as known to the sending end; they are the same quantity observed from two different sides. Each time the receiving end sends a flow value to the sending end, the sending end extracts the increment from it, namely the difference between this flow value and the previously received flow value, and adds that difference to the first flow extreme value. The first flow extreme value therefore reflects the sum of all such differences, which is the cumulative amount of data the receiving end allows to be received, seen from the sending end. It reflects the total storage capacity that the receiving end has historically freed. It may be larger than the storage capacity of the receive data buffer at the receiving end, because the same storage space in the receive data buffer is freed many times and is counted repeatedly in the flow value and the first flow extreme value. However, the consumed flow value (i.e., the total amount of data the sending end has already sent to the receiving end) is also accumulated repeatedly: flow value is consumed whenever the sending end sends data to the receiving end, regardless of whether the consumed portion corresponds to the first use or a later reuse of the same storage space. The consumed flow value can therefore be compared with the first flow extreme value to obtain a difference, and that difference determines whether the next data can be sent from the sending end to the receiving end.
Consumed flow value: the amount of data that the sending end has sent to the receiving end. Whenever the sending end sends data to the receiving end, flow value is consumed and the amount of newly sent data is added to the consumed flow value.
Second flow value: the flow value that the sending end has consumed, i.e., the amount of data that the sending end has sent to the receiving end. The second flow value is a cumulative amount, not an increment: it is the total amount of data the sending end has historically sent to the receiving end. The receiving end records the total amount of data it has historically received from the sending end. Subtracting the latter from the former gives the amount of data lost in transmission. Once this amount is compensated, the flow values of the sending end and the receiving end are synchronized.
First flow value notification packet: a packet with which the receiving end notifies the sending end of the flow value allowed for transmission. It is either a first flow value initial notification packet or a first flow value update notification packet. The first flow value initial notification packet notifies the sending end of a flow value before the sending end has sent any data to the receiving end. The first flow value update notification packet notifies the sending end of an updated flow value after the sending end has sent data to the receiving end.
Second flow value notification packet: a packet with which the sending end notifies the receiving end of the second flow value.
Transmission error check: a check performed on a received packet to verify whether an error occurred during transmission, such as a cyclic redundancy check (CRC).
First flow value initial acknowledgement packet: a packet that the sending end sends, after receiving the first flow value initial notification packet from the receiving end, to acknowledge to the receiving end that the initial notification packet was received.
Flow value message packet format: the common packet format used to transmit the data sent by the sending end to the receiving end, the first flow value initial notification packet, the first flow value update notification packet, the first flow value initial acknowledgement packet, and the second flow value notification packet.
Flow value unit: the flow value is recorded in units, and the flow value per unit corresponds to a data amount of a fixed size.
Resetting storage: when the accumulated value held in a memory exceeds a predetermined maximum value, the accumulated value returns to 0, and accumulation and storage continue from there.
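By way of illustration only, resetting storage can be sketched in Python as a counter that wraps around. The class name `WrapCounter`, the helper `wrapped_diff`, and the 8-bit width are hypothetical and are not taken from the disclosure; modelling the reset as modular arithmetic is likewise an assumption. The sketch also shows that the difference between two such counters can still be recovered after one of them has wrapped, provided the true difference is smaller than the counter range.

```python
class WrapCounter:
    """Accumulator that wraps around once it passes its maximum value."""

    def __init__(self, bits=8):
        self.modulus = 1 << bits  # hypothetical counter width
        self.value = 0

    def add(self, amount):
        # Accumulate, wrapping back toward 0 when the range is exceeded.
        self.value = (self.value + amount) % self.modulus
        return self.value


def wrapped_diff(newer, older, modulus):
    """Difference newer - older, valid while the true gap is < modulus."""
    return (newer - older) % modulus


limit = WrapCounter()      # stands in for an accumulated traffic extreme value
consumed = WrapCounter()   # stands in for an accumulated consumed value
limit.add(250)
consumed.add(240)
limit.add(20)              # 270 wraps to 14
consumed.add(10)           # 250, no wrap yet
remaining = wrapped_diff(limit.value, consumed.value, limit.modulus)
```

Here the true totals are 270 and 250, and the modular subtraction still yields their difference even though only one counter has wrapped.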
Dead communication: under the condition that the flow value is taken as a proof for allowing more data to be transmitted, if the transmitting end cannot continuously receive the flow value transmitted by the receiving end, the transmitting end cannot continuously transmit the data to the receiving end, which is called dead communication.
Direct Memory Access (DMA): direct memory access is a high-speed data transfer method in which data is transferred directly between memory and input/output devices over a channel without being processed by the CPU. When a device interface wants to send data (typically a large batch of data) directly to another device over the bus, it first sends a DMA request signal to the CPU. Specifically, the peripheral sends the CPU a bus request to take over bus control through a DMA controller (DMAC), which is the dedicated interface circuit for DMA. After receiving the signal, the CPU responds, once the current bus cycle has finished, according to the priority of the DMA signal and the order in which DMA requests were issued. When the CPU responds to the DMA request of a device interface, it gives up bus control. The peripheral and the memory then exchange data directly under the management of the DMA controller, without CPU intervention. After the data has been transferred, the device interface sends a DMA end signal to the CPU and returns bus control.
Medium Access Control (MAC): a mechanism that determines how the right to use a channel is allocated when use of a shared channel in a local area network gives rise to contention. It defines how data frames are transmitted over the medium. On links sharing the same bandwidth, access to the connection medium is "first come, first served". Physical addressing is defined at this layer, as is the logical topology (the path of a signal through the physical topology). Line control, error notification (without correction), frame delivery order, and optional flow control are also implemented at this layer.
Serialization: parallel data is converted into serial data for transmission between processing units.
Deserializing: because the data transmitted between the processing units is serial data, the processing units need to decode the serial data into parallel data after receiving the data, and then process and store the parallel data, and the process of decoding the serial data into the parallel data is called deserialization.
Background and general network architecture generated by embodiments of the present disclosure
The processing unit is a unit that performs information processing by running a program. Processing units include conventional processing units (central processing units and the like) that execute general-purpose operations, and acceleration units that speed up tasks which conventional processing units handle slowly; for example, because conventional processing units are inefficient at neural network computation, acceleration units are designed to raise the data processing speed in the neural network field. Processing units thus include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a General-Purpose Graphics Processing Unit (GPGPU), a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), and dedicated intelligent acceleration hardware (for example, a neural network processor, NPU). In the field of artificial intelligence, the processing unit is often embodied as an Artificial Intelligence (AI) chip.
In the field of AI, processing units may be interconnected to form a large processing network to handle AI tasks that require more computing power, as shown in fig. 1. Fig. 1 shows an interconnection of a plurality of processing units 100. Each processing unit 100 has four ports 120, P0-P3, for interconnecting with different other processing units 100 in four directions. Corresponding to the four ports 120 are four memories 110 for storing data required for communication with the different other processing units 100. For example, data received from four different other processing units 100 may be stored in the four memories 110, respectively. Although four ports 120 and four memories 110 are shown, those skilled in the art will appreciate that other numbers of ports 120 and memories 110 are possible.
In FIG. 1, processing unit A may transfer data from internal memory M0 to neighboring processing unit B through port P0, and store to processing unit B's memory M2 through processing unit B's port P2. Processing unit B may move data from processing unit B's memory M3 to neighboring processing unit E through port P3, store to processing unit E's memory M1 through processing unit E's port P1, and so on.
One existing approach to building processing unit interconnects is to build an interconnect network using PCIe interfaces and introducing PCIe switches. However, the PCIe switch has low throughput and long latency, and cannot meet the requirements of AI applications.
In the prior art, in order to improve throughput and latency performance, the architecture of fig. 2 is used to implement processing unit interconnects. As shown in fig. 2, the processing unit 100 includes 4 ports 120, a switch module 140, a Direct Memory Access (DMA) module 130, and a memory 110. Although 4 ports 120 are shown, those skilled in the art will appreciate that other numbers of ports 120 may be provided. Each port 120 includes a serializer and deserializer 121, a Media Access Control (MAC) layer 122, a transmitting-end flow control unit 123, and a receiving-end flow control unit 124.
The serializer and deserializer 121 converts data to be transmitted into serial data for high-speed transmission between the processing units, and converts serial data received from other processing units into parallel data for processing and storage. Converting parallel data into serial data for transmission between processing units is called serialization. Converting serial data back into parallel data is called deserialization.
The MAC layer 122 mainly determines how the right to use a channel is allocated when use of a shared channel in a local area network gives rise to contention. It defines how data frames are transmitted over the medium. On links sharing the same bandwidth, access to the connection medium is "first come, first served". Physical addressing is defined at this layer, as is the logical topology (the path of a signal through the physical topology).
The transmitting-end flow control unit 123 is a unit that controls when data is sent out when the processing unit sends data to the outside. The receiving-end flow control unit 124 is a unit that controls data reception when the processing unit receives data. Control of the traffic value is accomplished in these two units. They are also the main units by which the embodiments of the present disclosure avoid dead communication through control of the second traffic value.
The switching module 140 is a module for realizing data exchange between each port 120 and the memory 110. In fig. 2, the switch module 140 has a pair of input and output ports facing each of the four ports 120 (these are internal ports of the switch module 140 that interface with the four ports 120 described above, not the four ports 120 themselves), and a further pair of input and output ports facing the direct memory access module 130.
The Direct Memory Access (DMA) module 130 is a module through which entities outside the processing unit 100 perform DMA communication with the memory 110 inside the processing unit 100. DMA is a high-speed data transfer method in which data is transferred directly between memory and input/output devices over a channel without being processed by a CPU. When a device interface wants to send data (typically a large batch of data) directly to another device over the bus, it first sends a DMA request signal to the CPU. Specifically, the peripheral sends the CPU a bus request to take over bus control through a DMA controller (DMAC), which is the dedicated interface circuit for DMA. After receiving the signal, the CPU responds, once the current bus cycle has finished, according to the priority of the DMA signal and the order in which DMA requests were issued. When the CPU responds to the DMA request of a device interface, it gives up bus control. The peripheral and the memory then exchange data directly under the management of the DMA controller, without CPU intervention. Since the communication link between the processing units is not 100% reliable, data has to be retransmitted in some cases. Therefore, even after the transmitting end has sent data, the DMA module 130 of the transmitting end keeps the data locked in the memory 110 until an acknowledgement (ACK) signal is received from the receiving end. If an ACK signal is received, indicating that the data has been successfully received by the receiving end, the DMA module 130 may free the address space for reallocation. If a non-acknowledgement (NAK) signal is received, the DMA module 130 retransmits the data. The purpose of this arrangement is to dispense with a dedicated retransmission buffer.
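The lock-until-acknowledged behaviour just described can be sketched as follows. This is only an illustrative Python model under stated assumptions: the class `DmaSendTracker`, the sequence numbers, and the `link` callable are hypothetical and do not appear in the disclosure, which does not say how locked data is identified.

```python
class DmaSendTracker:
    """Keeps sent data locked in memory until the receiving end acknowledges it."""

    def __init__(self):
        self.locked = {}  # hypothetical: sequence number -> data still held in memory

    def send(self, seq, data, link):
        self.locked[seq] = data   # lock the region before transmitting
        link(seq, data)

    def on_ack(self, seq):
        # ACK: the data arrived, so the address space may be freed and reallocated.
        self.locked.pop(seq, None)

    def on_nak(self, seq, link):
        # NAK: retransmit straight from the locked memory, no separate retry buffer.
        link(seq, self.locked[seq])


sent = []                                  # records everything put on the link
link = lambda seq, data: sent.append((seq, data))

dma = DmaSendTracker()
dma.send(0, b"abc", link)
dma.on_nak(0, link)                        # first attempt failed, resend
dma.on_ack(0)                              # second attempt succeeded, release
```

Because the retransmission reads from the same locked region, no copy of the data is kept anywhere else, which is the saving the text refers to.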
The memory 110 is a portion of the processing unit 100 that stores data. The data received by each port 120 of the processing unit 100 is finally entered into the memory 110 for storage. Data sent by each port 120 of the processing unit 100 also comes from the memory 110.
In the configuration of fig. 2, each port 120 supports 800Gbps of data communication. Thus, each processing unit of FIG. 2 supports transmitting and receiving 3200Gbps worth of data to 4 other processing units through 4 ports.
The processing unit 100 that transmits data is referred to as the transmitting-end processing unit 1001, and the processing unit 100 that receives data is referred to as the receiving-end processing unit 1002. Data received by the receiving-end processing unit 1002 must be buffered in a received data buffer before being transferred to the memory 110, so if the amount of data sent by the transmitting-end processing unit 1001 exceeds the receive buffer capacity of the receiving-end processing unit 1002, data overflow may occur. The prior art therefore uses the approach of fig. 3 for flow control. Fig. 3 shows only the transmitting-end flow control unit 123 of the transmitting-end processing unit 1001 and the receiving-end flow control unit 124 of the receiving-end processing unit 1002. In practice, the transmitting-end processing unit 1001 also has a receiving-end flow control unit 124, and the receiving-end processing unit 1002 also has a transmitting-end flow control unit 123; however, because the discussion here concerns data sent from the transmitting-end processing unit 1001 to the receiving-end processing unit 1002, those two units play no role and are not shown. In addition, since the embodiments of the present disclosure are implemented mainly by the transmitting-end flow control unit 123 and the receiving-end flow control unit 124, the serializer and deserializer 121, the MAC layer 122, the switching module 140, the DMA module 130, and the memory 110 of fig. 2 are omitted from fig. 3.
To prevent overflow, the transmitting-end processing unit 1001 must send data according to the amount of data that the receiving-end processing unit 1002 can receive. It therefore uses a mechanism in which the receiving end announces in advance how much traffic is allowed to be sent. When data in the received data buffer 12401 of the receiving end is transferred to the memory 110 and no longer occupies the received data buffer 12401, the receiving end can send to the transmitting-end processing unit 1001 a traffic value, determined by the size of that data (the size of the data newly transferred from the received data buffer 12401 to the receiving-end memory), indicating how much data the transmitting-end processing unit 1001 is allowed to send. As the received data buffer 12401 keeps transferring data to the receiving-end memory during operation, the traffic value sent to the transmitting end is continually updated. The transmitting-end processing unit 1001 consumes the corresponding traffic value every time it sends data. Each time before sending data to the receiving-end processing unit 1002, the transmitting-end processing unit 1001 checks whether the size of the data to be sent is no larger than the difference between the traffic extreme value and the consumed traffic value; if so, the data can be sent without overflowing the received data buffer 12401 of the receiving end.
As shown in fig. 3, when the receiving-end processing unit 1002 has not yet received data from the transmitting-end processing unit 1001, the received data buffer 12401 holds no data, so the size of data that the transmitting-end processing unit 1001 is allowed to send to it is the size of the entire received data buffer 12401. The first traffic value allocation unit 12402 therefore allocates an initial first traffic value equal to the size of the received data buffer 12401. The first traffic value notification packet generator 12403 places this initial first traffic value in a first traffic value initial notification packet, which is sent to the transmitting-end flow control unit 123 through the transmitter 12405.
The receiver 12311 of the transmitting-end processing unit 1001 receives the first traffic value initial notification packet. The first traffic value parser 12305 parses the traffic value from the first traffic value initial notification packet received from the receiving-end flow control unit 124 and stores it in the first traffic extreme value storage 12302.
The first traffic extreme value storage 12302 stores the first traffic extreme value, which is the accumulated value of the traffic values that the transmitting end has received from the receiving end. It reflects how much traffic value the transmitting-end processing unit 1001 has received from the receiving-end processing unit 1002 since counting began, i.e., the total storage capacity that the receiving end has historically made available. The initial traffic value that the transmitting end receives from the receiving end is the entire storage capacity of the received data buffer 12401. The transmitting end sends data according to the initial traffic value and fills the received data buffer 12401. Part of the data in the received data buffer 12401 is then transferred to the memory 110, freeing part of the storage space. The updated traffic value that the transmitting end receives from the receiving end grows by exactly the amount of storage space freed by that transfer. Each time the received data buffer 12401 transfers data to the receiving-end memory, the receiving-end processing unit 1002 adds the amount of transferred data to the traffic value and sends the traffic value to the transmitting-end processing unit 1001, where the increase is added to the recorded first traffic extreme value. The first traffic extreme value is therefore generally larger than the storage capacity of the received data buffer 12401, because the same storage space of the received data buffer 12401 is freed repeatedly and is accumulated into the traffic value and the first traffic extreme value each time.
After storing the initial traffic value in the first traffic extreme value storage 12302, the first traffic value initial acknowledgement packet generator 12319 of the transmitting end transmits the first traffic value initial acknowledgement packet to the receiving end through the transmitter 12310 of the transmitting end.
The first traffic value consumption memory 12303 stores the consumed traffic value, i.e., the total amount of data that the transmitting end has sent to the receiving end. When the transmitting end sends data to the receiving end, a traffic value of the corresponding size is added to the first traffic value consumption memory 12303. The first traffic extreme value stored in the first traffic extreme value storage 12302 is the amount of data that the receiving end allows the transmitting end to send, the consumed traffic value stored in the first traffic value consumption memory 12303 is the amount of data already sent against that allowance, and the difference between the first traffic extreme value and the consumed traffic value is the amount of data that the transmitting end is still allowed to send. The comparator 12304 checks whether the amount of data currently waiting in the transmit data buffer 12301 exceeds this difference. If it does not, the next data may be sent from the transmitting end to the receiving end. If it does, the data cannot be sent yet. However, since the first traffic extreme value stored in the first traffic extreme value storage 12302 will increase later, the transmitting end waits until the amount of data no longer exceeds the difference and then sends the next data to the receiving end.
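The comparison performed by the comparator 12304 can be sketched in a few lines. This is a minimal Python illustration, not the disclosed circuit; the class name `SendGate` and the byte counts are hypothetical.

```python
class SendGate:
    """Models the first traffic extreme value, the consumed value, and the comparator."""

    def __init__(self):
        self.limit = 0      # first traffic extreme value (storage 12302)
        self.consumed = 0   # consumed traffic value (memory 12303)

    def can_send(self, size):
        # Sending is allowed only if the data fits in the remaining allowance.
        return size <= self.limit - self.consumed

    def send(self, size):
        if not self.can_send(size):
            return False        # must wait for the limit to grow
        self.consumed += size   # what the second adder 12309 does after sending
        return True


gate = SendGate()
gate.limit = 64                 # hypothetical initial allowance
results = [gate.send(40), gate.send(40), gate.send(24)]
```

The second attempt is refused because only 24 units of allowance remain, while the third attempt fits exactly and uses the allowance up.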
When a data packet to be transmitted is transmitted to the receiving end through the transmitter 12310 of the transmitting-end flow control unit 123, the transmitted data size is added to the consumed traffic value stored in the first traffic value consumption memory 12303 through the second adder 12309.
The receiving-end flow control unit 124 receives the data packet sent from the transmitting end through its receiver 12406 and stores the data packet in the received data buffer 12401 until the received data buffer 12401 is full. After the received data buffer 12401 is filled, the received data stored in it is transferred to the memory 110 of the receiving end. In this way, the received data buffer 12401 again has free storage capacity and can again receive data from the transmitting end. Accordingly, the fifth adder 12404 adds the amount of data newly transferred from the received data buffer 12401 to the receiving-end memory to the traffic value already allocated by the first traffic value allocation unit 12402. Previously, the traffic value allocated by the first traffic value allocation unit 12402 was the entire storage capacity of the received data buffer 12401; now it becomes the entire storage capacity of the received data buffer 12401 plus the amount of data newly transferred from the received data buffer 12401 to the receiving-end memory, i.e., the total amount of data that the transmitting end has historically been allowed to send to the receiving end. Next, the first traffic value notification packet generator 12403 places the updated traffic value in a first traffic value update notification packet, which is sent to the transmitting-end flow control unit 123 by the transmitter 12405 of the receiving end.
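The receiving-end accumulation can be illustrated as follows. The Python sketch assumes a hypothetical buffer capacity of 64 units and a hypothetical class name `ReceiverAllocator`; packet construction and transmission are left out.

```python
class ReceiverAllocator:
    """Models the first traffic value allocation unit 12402 with the fifth adder 12404."""

    def __init__(self, buffer_capacity):
        # Before any data arrives, the whole buffer is available.
        self.allocated = buffer_capacity

    def initial_notification(self):
        return self.allocated

    def on_drain(self, drained):
        # Space freed by moving data to memory is added to the running total.
        self.allocated += drained
        return self.allocated   # value carried by the update notification packet


rx = ReceiverAllocator(64)
notifications = [rx.initial_notification(), rx.on_drain(64), rx.on_drain(32)]
```

Each notification carries the running total rather than the newly freed amount, so the values only ever grow.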
The first traffic value parser 12305 of the transmitting end receives the first traffic value update notification packet through the receiver 12311 of the transmitting end and parses the updated traffic value from it. The updated traffic value is the total amount of data that the receiving end has historically allowed the transmitting end to send to it. The previous first traffic value buffer 12306 buffers the traffic value parsed from the previous first traffic value notification packet received from the receiving-end flow control unit 124, that is, the total amount of data that the receiving end allowed the transmitting end to send before the current first traffic value update notification packet was received. The subtractor 12307 subtracts the value buffered in the previous first traffic value buffer 12306 from the traffic value parsed from the current first traffic value update notification packet to obtain the amount of data that the current packet newly allows the transmitting end to send, that is, the traffic value difference. The first adder 12308 adds the traffic value difference to the first traffic extreme value stored in the first traffic extreme value storage 12302 to obtain the updated first traffic extreme value. The updated first traffic extreme value reflects the total amount of data that the receiving end has historically allowed the transmitting end to send to it.
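The subtract-then-accumulate update on the transmitting side can be sketched as below. The Python class `LimitUpdater` is hypothetical, and treating the initial notification as an update from a previous value of 0 is a simplifying assumption of the sketch.

```python
class LimitUpdater:
    """Models buffer 12306, subtractor 12307, first adder 12308, and storage 12302."""

    def __init__(self):
        self.previous = 0   # previous first traffic value buffer 12306
        self.limit = 0      # first traffic extreme value storage 12302

    def on_notification(self, cumulative):
        diff = cumulative - self.previous   # subtractor 12307
        self.limit += diff                  # first adder 12308
        self.previous = cumulative          # current value becomes the previous one
        return diff


tx = LimitUpdater()
diffs = [tx.on_notification(value) for value in (64, 128, 160)]
```

After the three notifications the stored limit equals the last cumulative value received, and each returned difference is the allowance newly granted by that packet.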
The transmitting end receives the updated traffic value, derives the first traffic value difference from it, and adds that difference to the first traffic extreme value. This approach is used, rather than having the receiving end transmit the traffic value difference directly (that is, the amount of data newly transferred from the received data buffer 12401 to the receiving-end memory), in order to avoid the effects of transmission loss. If the receiving end sent the traffic value difference directly and that difference were lost in transmission, the first traffic extreme value stored in the first traffic extreme value storage 12302 could never account for it. When the updated traffic value is sent instead, a loss in transmission does no lasting harm: because the traffic value allocated by the first traffic value allocation unit 12402 is always accumulated on top of the previously allocated first traffic value, the lost difference is still contained in every later updated traffic value. When the first traffic value parser 12305 receives the next first traffic value update notification packet, parses out the updated traffic value, and subtracts the value buffered in the previous first traffic value buffer 12306, the result is the sum of the two most recent first traffic value differences issued by the receiving end. The lost traffic value difference is thus correctly accumulated into the first traffic extreme value storage 12302 later on, and the effect of the transmission loss is avoided.
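The contrast between the two notification schemes can be made concrete with a small comparison. The Python sketch below is illustrative only; the function names, the sequence of freed amounts, and the choice of which notification is lost are all hypothetical.

```python
def limit_with_deltas(deltas, lost):
    """Sender's limit if the receiver sends only each newly freed amount."""
    limit = 0
    for index, delta in enumerate(deltas):
        if index not in lost:
            limit += delta          # a lost delta is gone for good
    return limit


def limit_with_cumulative(deltas, lost):
    """Sender's limit if the receiver sends its running total each time."""
    limit = previous = total = 0
    for index, delta in enumerate(deltas):
        total += delta              # receiver keeps accumulating regardless
        if index in lost:
            continue                # this notification never arrives
        limit += total - previous   # the difference covers any lost updates
        previous = total
    return limit


grants = [64, 16, 32, 8]            # hypothetical amounts granted by the receiver
lost = {1}                          # the second notification is lost in transit
delta_result = limit_with_deltas(grants, lost)
cumulative_result = limit_with_cumulative(grants, lost)
```

With difference-only notifications the 16 units from the lost packet never reach the sender, whereas with cumulative notifications the next packet that arrives restores them.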
In addition, the first traffic value parser 12305 may perform a transmission error check on the first traffic value notification packet received from the receiving end. The transmission error check verifies whether an error occurred while the received packet was in transit, and includes a Cyclic Redundancy Check (CRC) and the like. If an error is detected, the first traffic value notification packet is discarded. This improves the reliability of the received traffic value. Discarding a first traffic value notification packet does not risk losing traffic value that should have been received, for the same reason as above: even if the current traffic value is discarded, the next traffic value sent by the receiving-end processing unit 1002 is accumulated on top of the current one. When the first traffic value parser 12305 receives the next first traffic value update notification packet, parses out the updated traffic value, and subtracts the value buffered in the previous first traffic value buffer 12306, the result is the sum of the first traffic value difference of the current (discarded) notification and that of the next notification. The discarded traffic value difference is therefore correctly accumulated into the first traffic extreme value storage 12302 later on, and no traffic value is lost.
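A transmission error check of this kind can be sketched with the CRC-32 routine in the Python standard library. The packet layout used here (an 8-byte big-endian traffic value followed by a 4-byte CRC) is purely a hypothetical assumption; the disclosure does not specify the packet format or the CRC polynomial.

```python
import struct
import zlib


def build_packet(traffic_value):
    """Hypothetical layout: 8-byte traffic value followed by its CRC-32."""
    payload = struct.pack(">Q", traffic_value)
    return payload + struct.pack(">I", zlib.crc32(payload))


def parse_packet(packet):
    """Return the traffic value, or None if the packet fails the check."""
    payload = packet[:8]
    (crc,) = struct.unpack(">I", packet[8:])
    if zlib.crc32(payload) != crc:
        return None                      # corrupted in transit: discard
    return struct.unpack(">Q", payload)[0]


good = build_packet(128)
bad = bytes([good[0] ^ 0x01]) + good[1:]   # flip one bit to simulate an error
```

A parser that receives `None` simply skips the update; the next intact cumulative value makes up for the discarded one.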
After the subtractor 12307 obtains the first flow value difference, the first flow value parser 12305 may update the previous first flow value buffer 12306 with the flow value parsed from the current first flow value notification packet because the current flow value actually becomes the previous flow value in the next calculation when the flow value difference is calculated next time. This is done in order to have the correct premise for the next flow value difference calculation.
When the transmitting end wants to transmit data to the receiving end, the comparator 12304 fetches the data to be transmitted from the transmitting data buffer 12301, compares the data size with the difference between the traffic extreme value stored in the first traffic extreme value memory 12302 and the consumed traffic value stored in the first traffic value consumption memory 12303, and if the data size to be transmitted does not exceed the difference, it indicates that the transmitting of the data will not cause the receiving data buffer 12401 of the receiving end to overflow, and the data can be transmitted. After the transmitter 12310 at the transmitting end transmits the data, the second adder 12309 adds the size of the transmitted data to the consumed traffic value stored in the first traffic value consumption memory 12303, so that the consumed traffic value always reflects the amount of data that the transmitting end has transmitted to the receiving end.
After the transmitting end transmits data to the receiving end, the receiving end flow control unit 124 repeats the same process, thereby implementing continuous transmission of data.
The advantages of the above flow value based flow control over previous techniques include:
(1) Even if a first traffic value notification packet is lost in transit, overflow is never caused. This is because, as described above, when the first traffic value parser 12305 receives the next traffic value that is not lost and subtracts the value buffered in the previous first traffic value buffer 12306, the result is the sum of the traffic value difference of the lost notification and that of the next notification sent by the receiving end. The lost traffic value difference is therefore correctly accumulated into the first traffic extreme value storage 12302 later on, and the effect of the transmission loss is avoided.
(2) The mechanism can accurately know the size of the space transferred from the received data buffer 12401 to the receiving end memory, and fully utilize the space of the received data buffer 12401 instead of reserving some buffer space for the data packet, which causes waste.
(3) Due to the fact that the size of data sent to the receiving end by the sending end is reasonably evaluated, sending efficiency is improved, and time for switching from a data waiting state to a new data sending state is shortened.
(4) The size of the receive data buffer 12401 can be flexibly controlled.
However, since the link is not 100% reliable, flow control based on the traffic value carries a risk of communication interruption, i.e., dead communication. In the above process, if data sent from the transmitting end is lost in transit and never reaches the receiving end, the transmitting end still regards the data as sent, so the consumed traffic value is accumulated in the first traffic value consumption memory 12303. The receiving end, however, never receives the data, so it cannot allocate a new traffic value based on the release of the corresponding space in the received data buffer 12401, and the transmitting end receives no new traffic value with which to update the first traffic extreme value in the first traffic extreme value storage 12302. The first traffic extreme value in the first traffic extreme value storage 12302 thus does not increase while the consumed traffic value in the first traffic value consumption memory 12303 does, so the difference between the two becomes smaller and smaller. Eventually the data waiting in the transmit data buffer 12301 can no longer be sent, which results in dead communication between the transmitting-end processing unit 1001 and the receiving-end processing unit 1002.
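The gradual shrinking of the allowance can be reproduced with a small simulation. The Python sketch below makes several simplifying assumptions that are not in the disclosure: packets have a fixed hypothetical size, the receiving end drains every arriving packet immediately, and the notifications themselves are never lost.

```python
def simulate(buffer_capacity, packet_size, lost_packets, rounds):
    """Return (packets sent, remaining allowance) for one-way credit flow control."""
    limit = buffer_capacity      # sender's first traffic extreme value
    consumed = 0                 # sender's consumed traffic value
    sent_count = 0
    for index in range(rounds):
        if packet_size > limit - consumed:
            break                # nothing more may be sent: dead communication
        consumed += packet_size  # the sender always counts what it sent
        sent_count += 1
        if index in lost_packets:
            continue             # never arrives, so no buffer space is ever freed
        limit += packet_size     # receiver drains the packet and grants it back
    return sent_count, limit - consumed


healthy = simulate(64, 16, set(), 20)
stalled = simulate(64, 16, {0, 2, 4, 6}, 20)
```

Each lost packet permanently removes one packet's worth of allowance, so after four losses the 64-unit allowance is exhausted and the sender stalls even though the receiving buffer is empty.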
Flow control mechanism based on bidirectional flow value of the disclosed embodiments
The flow control mechanism based on bidirectional traffic values means that, in addition to the receiving end sending the transmitting end a traffic value indicating how much data it allows the transmitting end to send, the transmitting end also sends the receiving end a second traffic value indicating how much data it has cumulatively sent to the receiving end. The receiving end compares the received second traffic value with its own internal count of the total data received from the transmitting end; if the two deviate, the deviation is added to the traffic value that is next sent to the transmitting end. The first traffic extreme value stored in the first traffic extreme value storage 12302 can thereby be corrected when dead communication would otherwise occur, so dead communication is avoided.
A flow control mechanism based on bidirectional traffic values according to one embodiment of the present disclosure is shown in fig. 4. On the transmitting side, fig. 4 differs from fig. 3 in that a second traffic value generator 12312 is added to the transmitting-end flow control unit 123. The second traffic value generator 12312 takes the consumed traffic value stored in the first traffic value consumption memory 12303 (in effect, how much data the transmitting end has sent to the receiving end) as the second traffic value to be sent to the receiving-end flow control unit 124. Specifically, the second traffic value generator 12312 places the second traffic value in a second traffic value notification packet, which is sent to the receiving-end flow control unit 124 through the transmitter 12310 of the transmitting end. The other parts of the transmitting-end flow control unit 123 of fig. 4 are the same as in fig. 3; their functions and signal flow relationships have been described in detail with reference to fig. 3 and are not repeated here.
The receiving-end flow control unit 124 of fig. 4 differs from fig. 3 in that a second traffic value parser 12411, a second traffic value memory 12407, a received data size memory 12408, a difference calculation unit 12409, and a third adder 12410 are added, and in that the fifth adder 12404 accumulates different contents from fig. 3; the rest is the same as in fig. 3. The functions and signal flow relationships of the remaining parts have been described in detail with reference to fig. 3 and are not repeated here.
The second traffic value parser 12411 parses the second traffic value from the second traffic value notification packet that the receiver 12406 of the receiving end receives from the transmitting-end flow control unit 123, and stores the second traffic value in the second traffic value memory 12407. The second traffic value memory 12407 is dedicated to storing second traffic values.
The received data size memory 12408 stores the size of data that the receiving end has received from the transmitting end. The second traffic value represents the amount of data that the transmitting end has transmitted to the receiving end. The difference between the second traffic value and the received data size stored in received data size memory 12408 is in effect the amount of data lost en route. The difference calculation unit 12409 is used to calculate the difference. When the difference is 0, it indicates that there is no loss on the way, and the fifth adder 12404 adds the amount of data newly transferred to the receiver memory in the received data buffer 12401 to the traffic value already allocated by the first traffic value allocating unit 12402 as in fig. 3, regardless of the difference. When the difference is not 0, unlike the fifth adder 12404 in fig. 3 which adds only the amount of data newly transferred from the reception data buffer 12401 to the sink memory, the fifth adder 12404 in fig. 4 adds the difference between the second traffic value and the size of the received data and the amount of data newly transferred from the reception data buffer 12401 to the sink memory to the traffic value already allocated by the first traffic value allocation unit 12402. The difference is added to the first traffic value to compensate for the inaccurate amount of the newly vacant storage space of the receive data buffer 12401 caused by the transmission loss, which results in the inaccurate traffic value returned to the transmitting end, and thus causes the dead communication.
When the received data size in the received data size memory 12408 is determined, the initial value stored in the received data size memory 12408 is made to be set to 0. A third adder 12410 adds the received data size to the value stored in the received data size memory 12408 in response to receiving data from the transmitting end. Since the initial value stored in the received data size memory 12408 is 0, the data size is added to the value stored in the received data size memory 12408 every time the receiving end receives data, so that the data size that the receiving end has received from the transmitting end can be directly read out from the received data size memory 12408. In addition, the third adder 12410 adds the difference to the value stored in the received data size memory 12408 after the difference is calculated by the difference calculating unit 12409.
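The receiving-end bookkeeping described above can be sketched as follows. This is a minimal illustration rather than the disclosed hardware: the class and method names are hypothetical, values are plain integers in arbitrary units, and it is assumed that a second traffic value is processed only after all earlier data has arrived.

```python
class ReceiverFlowControl:
    """Hypothetical software model of the fig. 4 receiving-end bookkeeping."""

    def __init__(self, buffer_units):
        # Traffic value allocated to the sender (first traffic value allocation
        # unit 12402); initially the whole receive data buffer is available.
        self.allocated = buffer_units
        # Received data size memory 12408, initial value 0.
        self.received_size = 0

    def on_data(self, size):
        # Third adder 12410: accumulate the size of each piece of received data.
        self.received_size += size

    def on_drain(self, size):
        # Fifth adder 12404, as in fig. 3: data moved from the receive data
        # buffer to the receiver memory frees space, which becomes allowance.
        self.allocated += size

    def on_second_traffic_value(self, second_value):
        # Difference calculation unit 12409: data sent but never received.
        diff = second_value - self.received_size
        # Fifth adder 12404, fig. 4 behaviour: give the lost allowance back.
        self.allocated += diff
        # Third adder 12410: fold the difference in so it is not counted twice.
        self.received_size += diff
        return diff
```

For example, if the receiver has received and drained 10 units when the sender reports 16 units sent, the difference is 6, the allocated traffic value rises from 1440 to 1456 (10 drained plus 6 lost), and a repeated report of 16 yields a difference of 0.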
The first flow value initial notification packet and the first flow value update notification packet are collectively referred to as a first flow value notification packet. A first traffic value notification packet is a packet for notifying the first traffic value, regardless of whether the first traffic value is notified to the transmitting end for the first time or is an update of a previously notified first traffic value.
In the embodiment of the present disclosure, the data sent by the sending end to the receiving end, the first traffic value initial notification packet, the first traffic value update notification packet, the first traffic value initial acknowledgement packet, and the second traffic value notification packet may all be transmitted in a traffic value message packet format. The flow value message packet format is thus a general format shared by the data sent by the sending end chip to the receiving end chip and by all of these special packets. Using one general format for the various special packets of the embodiment of the disclosure helps reduce the processing load and improve the processing efficiency.
The flow value message packet format is shown in fig. 7 and includes a destination chip address (DA) 701, a source chip address (SA) 702, a packet type 703, a notified flow value or second flow value (C0-C3) 704-707, padding bits 708, and transmission error check bits 709.
DA 701 is the address of the receiving-side processing unit 1002 in the processing unit data communication, and SA 702 is the address of the transmitting-side processing unit 1001 in the processing unit data communication. To establish communication between processing units, the addresses of the sending-side processing unit 1001 and the receiving-side processing unit 1002 must first be known, so both are included in the packet. In the example of a total packet length of 64 bytes, 8 bytes may be allocated for each of the two fields.
The packet type 703 indicates whether the message packet is a data packet, a first traffic value initial notification packet, a first traffic value update notification packet, a first traffic value initial acknowledgement packet, or a second traffic value notification packet. In the example of a total packet length of 64 bytes, 2 bytes may be allocated for this field.
The notified traffic value or second traffic value (C0-C3) 704-707 fields carry the notified traffic value or the second traffic value. For the first traffic value initial notification packet, the fields carry the initial traffic value; for the first traffic value update notification packet, the fields carry the updated traffic value; for a first traffic value initial acknowledgement packet, the fields carry the traffic value whose receipt is acknowledged; for a second traffic value notification packet, the fields carry the second traffic value; for data packets, the fields are empty. The reason why the notified traffic value or the second traffic value (C0-C3) 704-707 is set to a plurality of fields C0-C3 instead of one field C0 is described in detail below. In the example of a total packet length of 64 bytes, each of the plurality of fields C0-C3 may be set to 2 bytes.
The padding bits 708 are bits padded to allow the length of the entire flow value message packet to be a prescribed length, e.g., 64 bytes. The padding bits may place a fixed code string for the first traffic value initial notification packet, the first traffic value update notification packet, the first traffic value initial acknowledgement packet, or the second traffic value notification packet. For data packets, the padding bits 708 may be used to carry the data that needs to be communicated. In the example of a total packet length of 64 bytes, 36 bytes may be allocated for this field.
The transmission error check bits 709 are bits used to check whether an error occurred in the flow value message packet during transmission, for example, Cyclic Redundancy Check (CRC) bits. If these bits indicate that a transmission error has occurred, the packet may be dropped, retransmitted, or the like. In the example of a total packet length of 64 bytes, 2 bytes may be allocated for this field.
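The 64-byte example layout above (8 + 8 + 2 + 4 × 2 + 36 + 2 bytes) can be illustrated with a short packing sketch. Several details here are assumptions that the text does not specify: big-endian byte order, the numeric packet type codes, zero-filled padding, and the use of the 16-bit CRC-CCITT from Python's `binascii.crc_hqx` as the transmission error check.

```python
import binascii
import struct

# DA (8 bytes), SA (8 bytes), packet type (2 bytes), C0-C3 (2 bytes each),
# padding or data (36 bytes): 62 bytes ahead of the 2-byte check field.
HEADER = struct.Struct(">8s8sH4H36s")
CHECK = struct.Struct(">H")

# Hypothetical type codes; the source does not assign numeric values.
TYPE_DATA = 0
TYPE_SECOND_VALUE = 4


def pack_packet(da, sa, ptype, values, payload=b""):
    # The "36s" field zero-pads short payloads (and truncates longer ones).
    body = HEADER.pack(da, sa, ptype, *values, payload)
    return body + CHECK.pack(binascii.crc_hqx(body, 0))


def unpack_packet(packet):
    body = packet[:HEADER.size]
    (check,) = CHECK.unpack(packet[HEADER.size:])
    if binascii.crc_hqx(body, 0) != check:
        return None  # transmission error: the caller may drop or retransmit
    da, sa, ptype, c0, c1, c2, c3, payload = HEADER.unpack(body)
    return da, sa, ptype, (c0, c1, c2, c3), payload
```

Packing a second traffic value notification with `values=(16, 0, 0, 0)` produces exactly 64 bytes, and `unpack_packet` returns `None` when any byte of the packet has been corrupted in a way the CRC detects.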
Since the processing unit network may be composed of hundreds of processing units, the content to be transmitted on a plurality of virtual channels may be carried in one message packet to reduce the network load and avoid network transmission congestion. One sending-end processing unit 1001 can be considered to have a virtual channel with each of multiple receiving-end processing units 1002, and the contents to be delivered on these virtual channels may be carried simultaneously in one message packet. Likewise, one receiving-end processing unit 1002 can be considered to have a virtual channel with each of multiple sending-end processing units 1001, and the contents to be transmitted on these virtual channels may be carried simultaneously in one message packet.
For example, a message packet sent by one sender processing unit 1001 is allowed to carry simultaneously the second traffic values or data destined for the virtual channels of multiple (e.g., 4) receiver processing units 1002, occupying, for example, the above-mentioned fields C0-C3, respectively; a virtual channel is formed between the sender processing unit 1001 and each of the receiver processing units 1002. Similarly, a notification packet sent by one receiving-side processing unit 1002 is allowed to carry simultaneously the traffic values (initial or updated first traffic values) destined for the virtual channels of multiple (e.g., 4) transmitting-side processing units 1001, occupying, for example, the above-mentioned fields C0-C3, respectively; a virtual channel is formed between the receiving-side processing unit 1002 and each transmitting-side processing unit 1001. In this way, the network load can be greatly reduced.
In the case where one transmitting-end processing unit 1001 and a plurality of receiving-end processing units 1002 form a plurality of virtual channels, the first flow extremum stored in the first flow extremum storage 12302 includes a plurality of first flow extremums respectively corresponding to the virtual channels of the plurality of receiving-end processing units 1002, and reflects the amount of data that the transmitting-end processing unit 1001 is allowed to transmit to each receiving-end processing unit 1002; the first traffic value consumption memory 12303 stores the traffic values that the transmitting-end processing unit has consumed for the virtual channels of the plurality of receiving-end processing units, respectively, and reflects the amount of data that has been transmitted to each receiving-end processing unit 1002; the comparator 12304 determines, for data to be transmitted in the transmit data buffer 12301, the virtual channel of the receiving-end processing unit 1002 to which the data is to be transmitted, and allows the data to be transmitted to that virtual channel if the size of the data to be transmitted does not exceed the difference between the first traffic extreme value corresponding to that virtual channel and the consumed traffic value corresponding to that virtual channel; the second traffic value generator 12312 takes the traffic values respectively consumed for the virtual channels of the plurality of receiving-end processing units 1002 as second traffic values respectively corresponding to the virtual channels of the plurality of receiving-end processing units 1002, to be respectively transmitted to the receiving-end flow control units 124 of the plurality of receiving-end processing units 1002.
Thus, the C0-C3 fields of the second traffic value notification packet generated by the second traffic value generator 12312 respectively include the second traffic values corresponding to the virtual channels of the multiple receiving-end processing units (e.g., 4), so as to achieve the purpose of simultaneously notifying the multiple receiving-end processing units 1002 of the second traffic values and reducing network congestion.
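The per-virtual-channel bookkeeping at the sending end can be pictured with the sketch below. It is an illustration only: the class and method names are hypothetical, four channels are assumed to match the C0-C3 example, and allowance increases are passed in directly rather than parsed from notification packets.

```python
class MultiChannelSender:
    """Hypothetical per-virtual-channel sender state (one slot per receiver)."""

    def __init__(self, channels=4):
        self.limit = [0] * channels     # first flow extreme value per channel
        self.consumed = [0] * channels  # consumed traffic value per channel

    def grant(self, channel, amount):
        # Allowance increase notified by the receiving end of this channel.
        self.limit[channel] += amount

    def try_send(self, channel, size):
        # Comparator: only the allowance of the target channel is considered.
        if size > self.limit[channel] - self.consumed[channel]:
            return False
        self.consumed[channel] += size
        return True

    def second_traffic_values(self):
        # One value per channel, in the order they would occupy C0-C3.
        return tuple(self.consumed)
```

A channel with no remaining allowance blocks only its own traffic; the other channels continue to send against their own first flow extreme values.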
In the case where one receiving-end processing unit 1002 and a plurality of sending-end processing units 1001 form a plurality of virtual channels, the second traffic value stored in the second traffic value memory 12407 includes a plurality of second traffic values respectively corresponding to the virtual channels of the plurality of sending-end processing units, and respectively reflects data amounts respectively sent by the plurality of sending-end processing units 1001 to the receiving-end processing unit 1002; a received data size memory 12408 stores the size of data received by the receiving-side processing unit 1002 from the virtual channels of the plurality of transmitting-side processing units 1001, respectively; the difference calculation unit 12409 calculates differences between the plurality of second traffic values corresponding to the virtual channels of the plurality of sender processing units 1001 and stored in the second traffic value memory 12407, and the data sizes received from the virtual channels of the plurality of sender processing units 1001 and stored in the received data size memory 12408, respectively, and the calculated difference for each virtual channel of the sender processing unit 1001 is actually the amount of data lost by the sender processing unit 1001 during data transmission to the receiver processing unit 1002; the first traffic value allocation unit 12402 updates the traffic values of the virtual channels transmitted to the plurality of sender processing units 1001, respectively, based on the differences obtained for the virtual channels of the plurality of sender processing units 1001 and the amount of data transferred from the receive data buffer 12401 to the receiver memory for the virtual channels of the plurality of sender processing units 1001. 
In this way, the C0-C3 fields of the first traffic value initial notification packet or the first traffic value update notification packet generated by the first traffic value notification packet generator 12403 respectively include traffic values corresponding to virtual channels of a plurality of sending-end processing units (e.g., 4), so as to achieve the purpose of simultaneously notifying respective traffic values to a plurality of sending-end chips 1001 and reducing network congestion.
If the traffic value or the second traffic value in the C0-C3 fields were expressed directly in bytes or bits, it would occupy a large amount of space in the message packet and reduce the transmission efficiency. Thus, the unit of the flow value or the second flow value may be set equal to 64 bytes, so that a flow value expresses a storage capacity as a number of 64-byte units. In addition, the size of the receive data buffer 12401 may be set to 720 kbits to cover the time it takes for data to make one round trip between two processing units. As described above, the initial traffic value corresponds to the storage space of the entire receive data buffer 12401 (at the start, the receive data buffer 12401 holds no data, so the sending-end processing unit 1001 is allowed to send to the receiving-end processing unit 1002 as much data as the receive data buffer 12401 can hold). The initial traffic value may therefore be 720k/(8 × 64) = 1440 traffic value units (with k = 1024).
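The unit conversion above can be checked with a few lines of arithmetic. The sketch assumes k = 1024, which is the reading that reproduces the figure of 1440; the helper `to_units` and its round-up policy for partial units are hypothetical additions, not part of the disclosure.

```python
UNIT_BYTES = 64            # one traffic value unit, per the text
BUFFER_BITS = 720 * 1024   # 720 kbits, assuming k = 1024


def to_units(num_bytes):
    # Assumed policy: a partially filled unit still occupies a whole unit.
    return -(-num_bytes // UNIT_BYTES)


# Bits to bytes, then bytes to 64-byte units.
initial_value = BUFFER_BITS // 8 // UNIT_BYTES
```

Under these assumptions `initial_value` evaluates to 1440, and a typical 1-kbyte data packet corresponds to `to_units(1024)`, that is, 16 traffic value units.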
In addition, as can be seen from the above process, the first traffic extreme value memory 12302, the first traffic value consumption memory 12303, the second traffic value memory 12407, and the received data size memory 12408 store accumulated values, which grow as new values are accumulated during use and never decrease. The stored values would therefore become larger and larger and waste storage space. To avoid this waste, a maximum value is specified for the values stored in these four memories; once the maximum is reached, accumulation restarts from 0. For example, when the maximum value is set to 5000 flow value units and the value stored in a memory reaches 5000, it is automatically changed back to 0 and accumulation continues from 0, that is, the memory is reset. The embodiment of the present disclosure mainly uses the difference between the value stored in the first flow extreme value memory 12302 and the value stored in the first traffic value consumption memory 12303, and the difference between the value stored in the second flow value memory 12407 and the value stored in the received data size memory 12408. The two values of each compared pair tend to return to zero at relatively close points in time, so the difference remains unchanged; that is, the calculation of the difference is not affected, while storage space is saved. Even if, at a certain point in time, one value of a compared pair has returned to zero and the other has not, so that the obtained difference is negative, the correct difference can be obtained by adding the maximum value (such as 5000) to the negative number.
An empirical formula for this maximum is $2 \times 2^{\mathrm{ceil}(\log_2 a)}$, where a represents the initial flow value, ceil(x) is the ceiling function, which returns the smallest integer greater than or equal to x, and 2^x denotes 2 raised to the power x. When a is 1440, substituting into the formula gives 4096; that is, the maximum value stored in the first flow rate extremum memory 12302, the first flow rate value consumption memory 12303, the second flow rate value memory 12407, and the received data size memory 12408 at this time is 4096 flow value units.
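The wrap-around accumulation and the empirical maximum can be sketched as follows. The function names are hypothetical, and the counters are modelled with modular arithmetic, which is one straightforward way to realize "restart from 0 on reaching the maximum" and "add the maximum to a negative difference".

```python
import math


def counter_max(initial_value):
    # Empirical formula from the text: 2 * 2^(ceil(log2(a))).
    return 2 * 2 ** math.ceil(math.log2(initial_value))


def wrapped_add(value, increment, maximum):
    # Accumulate, restarting from 0 once the maximum is reached.
    return (value + increment) % maximum


def wrapped_diff(a, b, maximum):
    # Difference of a compared pair; a negative result means that only one
    # of the two counters has wrapped, so the maximum is added back.
    diff = a - b
    return diff + maximum if diff < 0 else diff
```

With an initial flow value of 1440 the maximum is 4096. If the first flow extreme value is 4090 and grows by 10, it wraps to 4; compared with a consumed value of 4080, the raw difference is -4076, and adding 4096 recovers the true remaining allowance of 20.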
The first traffic value initial notification packet, the first traffic value update notification packet, the first traffic value initial acknowledgement packet, and the second traffic value notification packet are not necessarily transmitted with the same period as the data. The main cost of the disclosed embodiment is the extra bandwidth required for transmitting these four kinds of packets. A typical data packet size is 1 kbyte, and the typical size of each of these four kinds of packets is 64 bytes. Assume that the period of transmitting these packets is N times the period of transmitting the data packet. If N is 1, the period for transmitting these packets is equal to the period for transmitting the data packet, and the bandwidth cost of transmitting the traffic value related information is 64/1024 = 6.25%. If N = a, the bandwidth cost of transmitting the traffic value related information is 64/(1024 × a) = (6.25/a)%. For example, when N is 8, the bandwidth cost is 0.78%; when N is 16, the bandwidth cost is 0.39%. The size of the receive data buffer 12401 at this time only needs to be 8 KB (when N equals 8) or 16 KB (when N equals 16).
Practice shows that better results can be achieved in the embodiments of the present disclosure when N is a positive integer greater than or equal to 2, and especially when N = 8 or N = 16.
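The bandwidth cost figures can be reproduced with a one-line calculation. The function name and default arguments are illustrative; the packet sizes are the typical values given in the text (64-byte notification packets, 1-kbyte data packets).

```python
def overhead_percent(n, notify_bytes=64, data_bytes=1024):
    # One notification packet is sent for every n data packets.
    return 100.0 * notify_bytes / (data_bytes * n)
```

`overhead_percent(1)` gives 6.25, `overhead_percent(8)` gives 0.78125, and `overhead_percent(16)` gives 0.390625, matching the rounded values of 0.78% and 0.39% quoted for N = 8 and N = 16.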
According to an embodiment of the present disclosure, as shown in fig. 5, there is provided a sending-end flow control method including:
step 510, updating a first flow extreme value according to a flow value sent by a receiving end, wherein the sent flow value represents the amount of data that the receiving end currently allows to be further received;
step 520, if the size of the data to be transmitted does not exceed the difference between the first traffic extreme value and the consumed traffic value, allowing the data to be transmitted to the receiving end, wherein the consumed traffic value is equal to the amount of data transmitted to the receiving end by the transmitting end;
step 530, the consumed flow value is used as a second flow value to be sent to the receiving end, so that the receiving end adjusts the flow value sent to the sending end according to the difference between the second flow value and the data volume which has been received by the receiving end from the sending end.
Optionally, step 510 comprises:
parsing a flow value from a current first flow value notification packet received from the receiving end;
subtracting the flow value analyzed from the previous first flow value notification packet received from the receiving end from the flow value analyzed from the current first flow value notification packet to obtain a flow value difference value;
and adding the flow value difference to the flow value stored in the first flow extremum storage.
Optionally, after step 520, the method further comprises: adding the size of the data sent to the consumed flow value.
Optionally, step 530 comprises: placing the consumed flow value as a second flow value in a second flow value notification packet for transmission to the receiving end.
Optionally, before parsing out the flow value from the current first flow value notification packet received from the receiving end, the method further includes: performing a transmission error check on the current first flow value notification packet received from the receiving end, and discarding the packet if the check fails.
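Steps 510 to 530 and their optional refinements can be sketched in software as follows. The class and method names are hypothetical, the transmission error check is reduced to a boolean flag supplied by the caller, and notified flow values are assumed to be cumulative, which is why only the increase over the previous notification is added in step 510.

```python
class SenderFlowControl:
    """Hypothetical software model of the sending-end flow control method."""

    def __init__(self):
        self.limit = 0      # first flow extreme value
        self.previous = 0   # flow value parsed from the previous notification
        self.consumed = 0   # consumed flow value (data already sent)

    def on_first_flow_value(self, value, check_ok=True):
        if not check_ok:
            return  # the packet failed the transmission error check: discard
        # Step 510: add the increase over the previously notified value.
        self.limit += value - self.previous
        self.previous = value

    def try_send(self, size):
        # Step 520: send only if the data fits in the remaining allowance.
        if size > self.limit - self.consumed:
            return False
        self.consumed += size  # add the size of the data sent
        return True

    def second_flow_value(self):
        # Step 530: the consumed flow value is reported as the second flow value.
        return self.consumed
```

After an initial notification of 1440, a 1000-unit send succeeds and a further 500-unit send is refused; once an update notification of 1500 arrives, the 500-unit send succeeds and the reported second flow value is 1500.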
As shown in fig. 6, according to an embodiment of the present disclosure, there is provided a receiving-end flow control method including:
step 610, storing the data received from the transmitting end in a received data buffer;
step 620, receiving a second flow value from the sending end;
step 630, obtaining the size of the data received from the sending end chip;
step 640, calculating the difference between the second traffic value and the received data size;
step 650, updating the traffic value sent to the sending end according to the difference and the amount of data transferred from the receive data buffer to the receive end memory.
Optionally, step 630 comprises:
setting a received data size memory storing an initial value of 0;
in response to receiving data from the sender, accumulating the received data size to a value stored by the received data size memory;
reading the received data size from the received data size memory.
Optionally, after step 640, the method further comprises: the difference is accumulated onto a value stored by the received data size memory.
Optionally, step 650 comprises:
adding a difference between the second traffic value and the size of the received data, and the amount of data transferred from the receive data buffer to a receive side memory, to the traffic value allocated for transmission to the transmit side;
and sending the flow value to the sending end chip.
Optionally, sending the flow value to the sending end includes: placing the flow value in a first flow value notification packet and sending the packet to the sending end.
Optionally, step 620 comprises: parsing a second flow value from a second flow value notification packet received from the transmitting end.
The implementation details of the sending-end flow control method and the receiving-end flow control method are already described in the foregoing device embodiment section, and are not repeated for brevity.
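To see how the two methods interact when data is lost, the following toy simulation runs both ends in one function. It is a simplified sketch under stated assumptions: all names are hypothetical, the receiver drains every received packet to its memory immediately, and a single second flow value followed by a single update notification is exchanged after the last packet.

```python
def simulate(buffer_units, packets, lost, compensate):
    """Return the allowance left at the sender after the exchange.

    packets: sizes, in flow value units, that the sender tries to send in order.
    lost: indices of packets dropped on the link.
    compensate: True for the fig. 4 behaviour, False for the fig. 3 behaviour.
    """
    allocated = buffer_units  # receiver: flow value sent to the sender
    received = 0              # receiver: received data size
    limit = allocated         # sender: first flow extreme value
    consumed = 0              # sender: consumed flow value

    for index, size in enumerate(packets):
        if size > limit - consumed:
            break                 # step 520: the comparator blocks the send
        consumed += size
        if index in lost:
            continue              # dropped en route; the receiver sees nothing
        received += size          # steps 610 and 630
        allocated += size         # drained at once, so the space is freed

    if compensate:
        diff = consumed - received  # steps 620 and 640
        allocated += diff           # step 650
        received += diff
    limit = allocated               # update notification reaches the sender
    return limit - consumed
```

With a 100-unit buffer and one lost 40-unit packet, `simulate(100, [40, 40, 40], {1}, compensate=False)` leaves the sender with only 60 units even though the receive buffer is empty, whereas `compensate=True` restores the full 100 units.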
Commercial value of the disclosed embodiments
With the flow control mechanism based on bidirectional flow values of the embodiment of the disclosure, the flow values of the sending end and the receiving end stay synchronized and are corrected automatically when some data packets are lost due to transmission errors. Thus, flow control based on flow values can be used with end-to-end DMA and internal switching modules to provide high throughput interconnects over unreliable links. Flow control based on traffic values may reduce transmission delays and improve switching efficiency. Experiments show that the network throughput is improved by 20% and the transmission efficiency between the processing units is improved by more than 30%, so the embodiments have broad market prospects.
It should be understood that the embodiments in this specification are described in a progressive manner, and that the same or similar parts in the various embodiments may be referred to one another, with each embodiment focusing on its differences from the other embodiments.
It should be understood that the above description describes particular embodiments of the present specification. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
It should be understood that describing an element herein in the singular, or showing only one of it in the figures, does not mean that the number of that element is limited to one. Furthermore, modules or elements described or illustrated herein as separate may be combined into a single module or element, and modules or elements described or illustrated herein as single may be split into multiple modules or elements.
It is also to be understood that the terms and expressions employed herein are used as terms of description and not of limitation, and that the embodiment or embodiments of the specification are not limited to those terms and expressions. The use of such terms and expressions is not intended to exclude any equivalents of the features shown and described (or portions thereof), and it is recognized that various modifications may be made within the scope of the claims. Other modifications, variations, and alternatives are also possible. Accordingly, the claims should be looked to in order to cover all such equivalents.

Claims (39)

1. A flow control unit at a transmitting end, comprising:
a first flow extreme value storage, configured to store a first flow extreme value, where the first flow extreme value is updated according to a first flow value sent by a flow control unit located at a receiving end, and the first flow value represents a data amount allowed to be further received by the receiving end;
a first traffic consumption memory, configured to store a traffic value consumed by the sending end, where the consumed traffic value represents a data amount that the sending end has sent to the receiving end;
a comparator, configured to determine that the size of data to be sent does not exceed the difference between the first flow extreme value and the consumed flow value, and allow the data to be sent to the receiving end;
and a second flow value generator, configured to send the consumed flow value as a second flow value to the flow control unit at the receiving end, so that the flow control unit at the receiving end adjusts the flow value to be sent next to the first flow extremum memory according to the difference between the second flow value and the data volume received by the receiving end from the sending end.
2. The flow control unit of claim 1, wherein the sender comprises a sender processing unit, the flow control unit being located in the sender processing unit; the receiving end comprises a receiving end processing unit, and the flow control unit of the receiving end is positioned in the receiving end processing unit.
3. The flow control unit of claim 1, further comprising:
a first flow value parser for parsing a flow value from a current first flow value notification packet received from the flow control unit at the receiving end;
a previous first flow value buffer which buffers a flow value parsed from a previous first flow value notification packet received from the flow control unit at the receiving end;
the subtracter is used for subtracting the flow value cached in the previous first flow value buffer from the flow value analyzed from the current first flow value notification packet to obtain a flow value difference value;
and the first adder is used for adding the flow value difference to the first flow extreme value stored in the first flow extreme value storage.
4. The flow control unit of claim 3, wherein the first flow value parser updates the previous first flow value buffer with the flow value parsed from the current first flow value notification packet after the subtractor obtains the flow value difference.
5. The flow control unit of claim 1, further comprising:
and a second adder for adding the size of the transmitted data to the consumed traffic value stored in the first traffic value consumption memory after transmitting the data to the receiving end.
6. The flow control unit according to claim 1, wherein the second traffic value generator puts the second traffic value in a second traffic value notification packet to be transmitted to the receiving end.
7. The flow control unit according to claim 3, wherein the first flow value parser performs a transmission error check on a current first flow value notification packet received from the receiving end and discards the packet if the check fails.
8. The flow control unit as claimed in claim 3, wherein the current first flow value notification packet includes a first flow value initial notification packet and a first flow value update notification packet, and the transmitting-end chip flow control unit further includes a first flow value initial acknowledgement packet generator for generating the first flow value initial acknowledgement packet to be transmitted to the receiving end after the first flow value extremum storage stores the flow value notified by the first flow value initial notification packet.
9. The flow control unit according to claim 6 or 8, wherein the data transmitted by the transmitting end to the receiving end, the first flow value initial notification packet, the first flow value update notification packet, the first flow value initial acknowledgement packet, the second flow value notification packet are transmitted in a flow value message packet format including a destination chip address, a source chip address, a packet type, a notified flow value, a padding bit, a transmission error check bit.
10. The flow control unit as claimed in claim 9, wherein the first flow extreme value stored by the first flow extreme value storage includes a plurality of first flow extreme values respectively corresponding to virtual channels of a plurality of receiving ends; the first flow consumption memory stores flow values respectively consumed by a sending end aiming at virtual channels of the plurality of receiving ends; the comparator determines a virtual channel of a receiving end to which data is to be sent, and if the size of the data to be sent does not exceed the difference between a flow extreme value corresponding to the virtual channel of the receiving end and a consumed flow value corresponding to the virtual channel of the receiving end, the data is allowed to be sent to the virtual channel of the receiving end; the second flow value generator takes the flow values respectively consumed by the virtual channels of the plurality of receiving ends as second flow values respectively corresponding to the virtual channels of the plurality of receiving ends to respectively send the second flow values to the plurality of receiving ends;
the second traffic value of the notification includes second traffic values corresponding to virtual channels of a plurality of receiving ends.
11. The flow control unit of claim 9, wherein the unit setting of the flow value or the second flow value is equal to 64 bytes, the initial flow value in the first flow value initial notification packet is set to 1440 flow value units, and the first flow extremum memory and the first flow value consumption memory store a maximum of 4096 flow value units, and the memory is reset after exceeding the maximum.
12. The flow control unit according to claim 1, wherein a period of transmitting the traffic value or the second traffic value is N times a period of transmitting data from the transmitting end to the receiving end, N being a positive integer equal to or greater than 2.
13. The flow control unit of claim 1, wherein N = 8 or N = 16.
14. A processing unit at a transmitting end, comprising:
a memory;
a direct memory access module to control direct access of the memory;
a plurality of ports;
a switch module controlling connection of the direct memory access module to the plurality of ports,
wherein at least one of the plurality of ports comprises the transmitting-end chip flow control unit according to any one of claims 1-13, a MAC layer for performing medium access control (MAC), and a serializer and deserializer to serialize data to be transmitted or deserialize received data.
15. A flow control unit at a receiving end, comprising:
a receive data buffer for buffering data received from a transmitting end;
a second traffic value memory for storing a second traffic value received from the transmitting end;
a received data size memory for storing a size of data received from the transmitting end;
a difference calculation unit for calculating a difference between the second traffic value stored in the second traffic value memory and the received data size stored in the received data size memory;
and a first flow value allocation unit for updating the flow value sent to the transmitting end according to the difference between the second flow value and the received data size and according to the amount of data transferred from the receive data buffer to the receiving end memory.
16. The flow control unit of claim 15, wherein the receiving end comprises a receiving end processing unit, the flow control unit being located in the receiving end processing unit; the sending end comprises a sending end processing unit, and the flow control unit of the sending end is positioned in the sending end processing unit.
17. The flow control unit of claim 15, wherein the received data size memory stores an initial value set to 0, the flow control unit further comprising: a third adder for, in response to receiving data from the transmitting end, adding the received data size to the value stored by the received data size memory.
18. The flow control unit of claim 17, wherein the third adder adds the difference to a value stored by the received data size memory after the difference is calculated by the difference calculation unit.
19. The flow control unit of claim 15, further comprising: a fifth adder for adding the difference between the second traffic value and the received data size, together with the amount of data transferred from the receive data buffer to the receiving end memory, to the traffic value allocated by the first traffic value allocation unit.
20. The flow control unit of claim 15, further comprising: a first flow value notification packet generator for placing the updated flow value in a first flow value notification packet and transmitting the packet to the transmitting end.
21. The flow control unit of claim 15, further comprising: a second flow value parser for parsing a second flow value from a second flow value notification packet received from the transmitting end and storing the second flow value in the second flow value memory.
22. The flow control unit according to claim 20, wherein the first flow value notification packet includes a first flow value initial notification packet and a first flow value update notification packet; after sending the first flow value initial notification packet to the sending end chip flow control unit, the receiving end chip flow control unit further receives a first flow value initial acknowledgement packet sent by the sending end chip flow control unit.
23. The flow control unit according to claim 21 or 22, wherein the data transmitted by the transmitting end to the receiving end, the first flow value initial notification packet, the first flow value update notification packet, the first flow value initial acknowledgement packet, the second flow value notification packet are transmitted in a flow value message packet format including a destination chip address, a source chip address, a packet type, a notified flow value, a padding bit, a transmission error check bit.
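The packet fields listed in claim 23 can be sketched in Python as follows. This is an illustration only: the claim names the fields but not their widths, order on the wire, type codes, or check algorithm, so the one-byte addresses, two-byte flow value, one padding byte, CRC-32 check, and the `PKT_*` constants below are all assumptions. The sketch covers notification packets only and omits any data payload.

```python
import struct
import zlib

# Hypothetical packet type codes; the claim names the packet kinds but not their encoding.
PKT_DATA, PKT_INIT_NOTIFY, PKT_UPDATE_NOTIFY, PKT_INIT_ACK, PKT_SECOND_VALUE = range(5)

# Assumed layout: destination chip address, source chip address, packet type,
# notified flow value, one padding byte. Big-endian, no alignment padding.
_HEADER = struct.Struct(">BBBHB")
# CRC-32 stands in for the unspecified transmission error check bits.
_CHECK = struct.Struct(">I")


def build_packet(dst: int, src: int, ptype: int, flow_value: int) -> bytes:
    """Serialize the fields and append the error check over the header."""
    header = _HEADER.pack(dst, src, ptype, flow_value, 0)
    return header + _CHECK.pack(zlib.crc32(header))


def parse_packet(packet: bytes):
    """Return (dst, src, ptype, flow_value), or None if the check fails."""
    header, check = packet[:_HEADER.size], packet[_HEADER.size:]
    if len(check) != _CHECK.size or _CHECK.unpack(check)[0] != zlib.crc32(header):
        return None  # a packet that fails the error check is discarded
    dst, src, ptype, flow_value, _pad = _HEADER.unpack(header)
    return dst, src, ptype, flow_value
```

Returning `None` on a failed check mirrors the discard behaviour that the method claims describe for notification packets with transmission errors.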
24. The flow control unit according to claim 23, wherein the second traffic value memory stores a plurality of second traffic values respectively corresponding to virtual channels of a plurality of transmitting ends; the received data size memory stores the data sizes received from the virtual channels of the plurality of transmitting ends, respectively; the difference calculation unit calculates differences between the plurality of second traffic values corresponding to the virtual channels of the plurality of transmitting ends stored in the second traffic value memory and the received data sizes corresponding to the virtual channels of the plurality of transmitting ends stored in the received data size memory, respectively;
the first traffic value allocation unit updates the plurality of first traffic values to be transmitted to the virtual channels of the plurality of transmitting ends, respectively, based on the differences obtained for the virtual channels of the plurality of transmitting ends and the amount of data to be transmitted to a receiving end memory for the virtual channels of the plurality of transmitting ends in the received data buffer,
the notified first flow value comprises first flow values corresponding to virtual channels of a plurality of sending ends.
25. The flow control unit according to claim 23, wherein the size of the received data buffer is set to 720k bits, the unit of flow value is set equal to 64 bytes, the initial flow value in the first flow value initial notification packet is set to 1440 units of flow value, the maximum value of the received data size stored by the received data size memory and the second flow value stored by the second flow value memory is 4096 units of flow value, and the storage is reset after exceeding the maximum value.
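The figures in claim 25 are mutually consistent, as the short Python check below suggests. It assumes that "720k bits" means 720 × 1024 bits; the claim does not say which convention for "k" is intended.

```python
UNIT_BYTES = 64        # size of one flow value unit, per claim 25
INITIAL_UNITS = 1440   # initial flow value, in units, per claim 25

# Data volume covered by the initial flow value, expressed in bits.
initial_credit_bits = INITIAL_UNITS * UNIT_BYTES * 8

# Receive data buffer size, assuming k = 1024.
buffer_bits = 720 * 1024
```

Under that assumption both quantities equal 737280 bits, so the initial flow value grants the transmitting end exactly one full receive data buffer.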
26. The flow control unit according to claim 15, wherein a period of transmitting the traffic value or the second traffic value is N times a period of transmitting data from the transmitting end to the receiving end, N being a positive integer equal to or greater than 2.
27. The flow control unit of claim 15, wherein N = 8 or N = 16.
28. A processing unit at a receiving end, comprising:
a memory;
a direct memory access module to control direct access of the memory;
a plurality of ports;
a switch module controlling connection of the direct memory access module to the plurality of ports,
wherein at least one of the plurality of ports comprises the receiving-end chip flow control unit according to any one of claims 15-27, a MAC layer for performing medium access control (MAC), and a serializer and deserializer to serialize data to be transmitted or deserialize received data.
29. A sending end flow control method, comprising:
updating a first flow extreme value according to a flow value sent by a receiving end, wherein the sent flow value is the data volume which is allowed to be increased by the receiving end at present;
if the size of the data to be sent does not exceed the difference between the first flow extreme value and the consumed flow value, allowing the data to be sent to the receiving end, wherein the consumed flow value is equal to the amount of data sent to the receiving end by the sending end;
and taking the consumed flow value as a second flow value to be sent to the receiving end, so that the receiving end adjusts the flow value sent to the sending end according to the difference between the second flow value and the data volume which is received by the receiving end from the sending end.
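The three steps of claim 29 can be sketched in Python as a small credit counter. This is a minimal illustration, not the claimed hardware: the class and method names are hypothetical, sizes are counted in abstract units, and the accumulation of sent data into the consumed value follows the wording "the consumed flow value is equal to the amount of data sent".

```python
class SenderFlowControl:
    """Sender-side credit tracking (illustrative names, not from the patent)."""

    def __init__(self) -> None:
        self.limit = 0      # first flow extreme value
        self.consumed = 0   # consumed flow value

    def on_flow_value(self, increase: int) -> None:
        """Raise the limit by the amount the receiving end currently allows."""
        self.limit += increase

    def try_send(self, size: int) -> bool:
        """Allow the send only if the size fits in the remaining credit."""
        if size > self.limit - self.consumed:
            return False
        self.consumed += size
        return True

    def second_flow_value(self) -> int:
        """Value reported back to the receiving end: everything sent so far."""
        return self.consumed
```

For example, after `on_flow_value(1440)` a send of 1000 units is allowed, a further send of 441 units is refused, and a send of 440 units uses up the remaining credit.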
30. The method of claim 29, wherein the updating the first flow extremum based on the flow value transmitted by the receiving end comprises:
parsing a flow value from a current first flow value notification packet received from the receiving end;
subtracting the flow value analyzed from the previous first flow value notification packet received from the receiving end from the flow value analyzed from the current first flow value notification packet to obtain a flow value difference value;
and adding the flow value difference to the flow value stored in the first flow extremum storage.
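The difference-based update in claim 30 can be sketched in Python as follows. The sketch assumes the notified flow values are cumulative and ignores counter wrap-around; the class name and fields are hypothetical.

```python
class FlowLimitUpdater:
    """Applies notified flow values to the stored limit by difference."""

    def __init__(self) -> None:
        self.previous_notified = 0  # value parsed from the previous notification packet
        self.limit = 0              # value held in the first flow extremum memory

    def on_notification(self, notified: int) -> int:
        """Add (current - previous) to the limit and remember the current value."""
        difference = notified - self.previous_notified
        self.previous_notified = notified
        self.limit += difference
        return self.limit
```

In this sketch a notification that repeats the previous value produces a difference of zero, so receiving the same notification twice does not grant credit twice.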
31. The method of claim 29, wherein after allowing the data to be transmitted to the receiving end, the method further comprises:
adding the size of the data sent to the consumed flow value.
32. The method of claim 29, wherein said transmitting the consumed flow value as a second flow value to the receiving end comprises: placing the consumed flow value, as a second flow value, in a second flow value notification packet for transmission to the receiving end.
33. The method of claim 30, wherein prior to parsing the flow value from the current first flow value notification packet received from the receiving end, the method further comprises: performing a transmission error check on the current first flow value notification packet received from the receiving end, and discarding the packet if the check fails.
34. A receiving end flow control method, comprising:
storing data received from a transmitting end in a received data buffer;
receiving a second traffic value from the transmitting end;
acquiring the size of data received from the transmitting end;
calculating a difference between the second traffic value and the received data size;
and updating the flow value sent to the sending end according to the difference and the data volume transferred from the receiving data buffer to the receiving end memory.
35. The method of claim 34, wherein the obtaining the size of the data received from the transmitting end comprises:
setting a received data size memory storing an initial value of 0;
in response to receiving data from the sender, accumulating the received data size to a value stored by the received data size memory;
reading the received data size from the received data size memory.
36. The method of claim 35, wherein after calculating the difference between the second traffic value and the received data size, the method further comprises:
accumulating the difference onto the value stored by the received data size memory.
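The receiving-end steps of claims 34 to 36 can be sketched in Python as follows. On one reading, the difference between the second traffic value and the received data size is data that the transmitting end sent but that never reached the receive data buffer, so the corresponding credit is returned together with the credit for data moved on to the receiving end memory. The sketch assumes the second traffic value arrives after all earlier data, and the class and method names are hypothetical.

```python
class ReceiverFlowControl:
    """Receiver-side accounting (illustrative names, not from the patent)."""

    def __init__(self, initial_flow_value: int) -> None:
        self.flow_value = initial_flow_value  # value notified to the transmitting end
        self.received_size = 0                # received data size memory, initially 0
        self.second_flow_value = 0            # last value reported by the transmitting end

    def on_data(self, size: int) -> None:
        """Accumulate the size of data that actually arrived."""
        self.received_size += size

    def on_second_flow_value(self, value: int) -> None:
        """Store the consumed flow value reported by the transmitting end."""
        self.second_flow_value = value

    def update_flow_value(self, drained: int) -> int:
        """Add the difference and the drained amount to the notified flow value."""
        difference = self.second_flow_value - self.received_size
        # Claim 36: fold the difference into the received size so that the
        # same missing data is not compensated a second time.
        self.received_size += difference
        self.flow_value += difference + drained
        return self.flow_value
```

For instance, if the transmitting end reports 10 units sent while only 7 arrived and all 7 have been moved to memory, the flow value grows by 3 + 7 = 10 units, and a second update with nothing new drained leaves it unchanged.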
37. The method of claim 34, wherein updating the traffic value sent to the sender based on the difference and the amount of data transferred from the receive data buffer to a receive side memory comprises:
adding a difference between the second traffic value and the size of the received data, and the amount of data transferred from the receive data buffer to a receive side memory, to the traffic value allocated for transmission to the transmit side;
and sending the flow value to the transmitting end.
38. The method of claim 37, wherein sending the flow value to the transmitting end comprises: placing the flow value in a first flow value notification packet and sending the packet to the transmitting end.
39. The method of claim 34, wherein the receiving a second traffic value from the transmitting end comprises: parsing the second traffic value from a second traffic value notification packet received from the transmitting end.
CN202010912936.7A 2020-09-03 2020-09-03 Processing unit and flow control unit, and related methods Pending CN114221905A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010912936.7A CN114221905A (en) 2020-09-03 2020-09-03 Processing unit and flow control unit, and related methods

Publications (1)

Publication Number Publication Date
CN114221905A true CN114221905A (en) 2022-03-22

Family

ID=80695574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010912936.7A Pending CN114221905A (en) 2020-09-03 2020-09-03 Processing unit and flow control unit, and related methods

Country Status (1)

Country Link
CN (1) CN114221905A (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001203705A (en) * 2000-01-19 2001-07-27 Nec Corp Device and method for controlling flow and storage medium recording flow control program
CN1996942A (en) * 2006-06-26 2007-07-11 华为技术有限公司 A method and system for traffic control
US20090268747A1 (en) * 2005-10-03 2009-10-29 Hiroshi Kurata Communication apparatus
CN101707789A (en) * 2009-11-30 2010-05-12 中兴通讯股份有限公司 Method and system for controlling flow
JP2014230072A (en) * 2013-05-22 2014-12-08 株式会社リコー Data communication device, data communication apparatus, and data communication method
CN104639298A (en) * 2013-11-08 2015-05-20 腾讯科技(深圳)有限公司 Data transmission method, device and system
CN107948236A (en) * 2016-10-12 2018-04-20 佳能株式会社 Communicator, communication means and storage medium
CN108199925A (en) * 2018-01-30 2018-06-22 网宿科技股份有限公司 A kind of data transmission method for uplink, method of reseptance and device
US20190104218A1 (en) * 2017-10-02 2019-04-04 Canon Kabushiki Kaisha Information processing apparatus and image processing apparatus that perform transmission and reception of data, and method of controlling information processing apparatus
CN110505039A (en) * 2019-09-26 2019-11-26 北京达佳互联信息技术有限公司 A kind of data transfer control method, device, equipment and medium
CN110677355A (en) * 2019-10-08 2020-01-10 香港乐蜜有限公司 Packet loss coping method and device, electronic equipment and storage medium
CN111478826A (en) * 2020-06-09 2020-07-31 北京大米科技有限公司 Packet loss rate determining method, data transmission control method and data transmission system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240223

Address after: 5th Floor, No. 2, Lane 55, Chuanhe Road, No. 366 Shangke Road, Pudong New Area Free Trade Pilot Zone, Shanghai

Applicant after: Pingtouge (Shanghai) semiconductor technology Co.,Ltd.

Country or region after: China

Address before: 847, 4 / F, capital tower 1, Grand Cayman, British Cayman Islands

Applicant before: ALIBABA GROUP HOLDING Ltd.

Country or region before: United Kingdom
