CN112486871B - Routing method and system for on-chip bus - Google Patents

Routing method and system for on-chip bus

Publication number
CN112486871B
Authority
CN
China
Prior art keywords
transmission
port
destination
transmission bandwidth
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011342617.3A
Other languages
Chinese (zh)
Other versions
CN112486871A (en
Inventor
余德君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hygon Information Technology Co Ltd
Original Assignee
Hygon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hygon Information Technology Co Ltd filed Critical Hygon Information Technology Co Ltd
Priority to CN202011342617.3A priority Critical patent/CN112486871B/en
Publication of CN112486871A publication Critical patent/CN112486871A/en
Application granted granted Critical
Publication of CN112486871B publication Critical patent/CN112486871B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller
    • G06F13/1678 Details of memory controller using bus width
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3027 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller
    • G06F13/1684 Details of memory controller using multiple buses
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An embodiment of the present application provides a routing method and system for an on-chip bus. The routing method includes: obtaining the actual transmission bandwidth occupancy of each destination port among at least one destination port on the bus; obtaining the predicted transmission bandwidth occupancy of each of those destination ports over an expected future time period; and determining, according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, a transmission strategy for data to be transmitted from a source port. Some embodiments of the present application monitor the actual and predicted bandwidth occupancy of each destination port during bus transmission and thereby provide a suitable target egress for the data to be transmitted, so that when the transmission direction is upstream the data is sent through the target egress to the host (or on-chip storage unit) connected to the target port.

Description

A routing method and system for an on-chip bus

Technical field

The present application relates to the field of computer I/O, and in particular the embodiments of the present application relate to a routing method and system for an on-chip bus.

Background

Computer I/O technology has always been a key technology in the development of high-performance computing. Its technical characteristics determine a computer's I/O processing capability, and in turn the computer's overall performance and application environment. Fundamentally, both now and in the future, I/O technology constrains the application and development of computer technology, especially in high-end computing.

Computer I/O peripherals mainly include devices connected over PCIE, USB, SATA, and Ethernet. Server applications involve multiple I/O devices accessing memory, as well as P2P (Peer to Peer) accesses between multiple I/O devices. These accesses must meet bandwidth requirements and also need unified, flexible routing management. In a system-on-chip (SoC), this routing management is implemented by a dedicated IO routing module, collectively referred to as the IOHUB. The IOHUB connects the various I/O peripherals to CPU memory, and also routes I/O device and memory accesses from local to remote. As servers demand more I/O devices, server SoC chips need to support more I/O devices and meet greater bandwidth requirements.

Therefore, how to support simultaneous access to CPU memory by more peripherals while improving the routing capability of the IOHUB has become a technical problem in urgent need of a solution.

Summary of the invention

The purpose of the embodiments of the present application is to provide a routing method and system for an on-chip bus; the technical solutions of the embodiments can at least effectively improve the bandwidth and transmission efficiency of the IOHUB. For example, in some embodiments, increasing the number of destination ports (target ports) allows more source ports to be supported, and improvements to the bandwidth monitoring mechanism and to flexible routing raise transmission efficiency.

In a first aspect, some embodiments of the present application provide a routing method for an on-chip bus. The routing method includes: obtaining the actual transmission bandwidth occupancy of each destination port among at least one destination port on the bus; obtaining the predicted transmission bandwidth occupancy of each of those destination ports over an expected future time period; and determining, according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, a transmission strategy for data to be transmitted from a source port.

Some embodiments of the present application monitor the actual and predicted bandwidth occupancy of each destination port during bus transmission and thereby provide a suitable target egress for the data to be transmitted, so that when the transmission direction is upstream the data is sent through the target egress to the host (or on-chip storage unit) connected to the target port.

In some embodiments, the destination port is a port connected to a host or an on-chip memory, and there are multiple destination ports. Determining a transmission strategy for the data to be transmitted from the source port according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy includes: determining, according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, at least one target egress for the data to be transmitted from among the multiple destination ports, where the target egress is used to provide the data to be transmitted to the host or the on-chip memory.

Some embodiments of the present application increase the number of bus interfaces connected to the host or internal memory, and can thereby provide more destination ports for data from more I/O devices. Providing multiple destination ports accommodates the growth in bandwidth demand caused by additional I/O devices, higher I/O interface rates, and system applications, and relieves the bandwidth bottleneck that arises when the IO routing uses a single destination port connected to the host or on-chip memory.

In some embodiments, determining at least one target egress for the data to be transmitted from among the multiple destination ports according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy includes: determining at least one target egress from the multiple destination ports according to a transmission bandwidth threshold and a predicted bandwidth threshold set separately for each of the multiple destination ports.

Some embodiments of the present application set comparison thresholds and compare the monitored bandwidth consumption (for example, the actual transmission bandwidth occupancy and the expected bandwidth occupancy) against those thresholds, which makes it possible to determine whether each destination port can carry the data to be transmitted from the source port.

In some embodiments, determining at least one target egress from the multiple destination ports according to the transmission bandwidth threshold and the predicted bandwidth threshold set separately for each destination port includes: confirming that a first actual transmission bandwidth occupancy corresponding to a first destination port is less than a first transmission bandwidth threshold corresponding to the first destination port, and confirming that a first predicted transmission bandwidth occupancy corresponding to the first destination port is less than a first predicted bandwidth threshold corresponding to the first destination port; and selecting the first destination port as the target egress.

Some embodiments of the present application select, as the target egress for the current data to be transmitted, a destination port that has sufficient bandwidth headroom at the current moment and ample headroom over a predictable period following the current moment, so that the data is output through a target egress with sufficient capacity to the host or memory connected to the destination port. This ensures smooth transmission of the data and improves the utilization of destination port resources.
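
The two-threshold test described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `PortStatus` type, its field names, the first-match selection order, and the `None` fallback are assumptions made for the example and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PortStatus:
    """Monitored state of one destination port (all field names are hypothetical)."""
    name: str
    actual_occupancy: float      # measured occupancy in the current window
    predicted_occupancy: float   # estimated occupancy for the expected window
    actual_threshold: float      # per-port transmission bandwidth threshold
    predicted_threshold: float   # per-port predicted bandwidth threshold


def select_target_egress(ports: Sequence[PortStatus]) -> Optional[PortStatus]:
    """Return the first port whose actual AND predicted occupancy are below threshold."""
    for port in ports:
        if (port.actual_occupancy < port.actual_threshold
                and port.predicted_occupancy < port.predicted_threshold):
            return port
    # No port qualifies; what happens then is not specified in the text.
    return None


ports = [
    PortStatus("port0", actual_occupancy=0.9, predicted_occupancy=0.4,
               actual_threshold=0.8, predicted_threshold=0.8),
    PortStatus("port1", actual_occupancy=0.3, predicted_occupancy=0.5,
               actual_threshold=0.8, predicted_threshold=0.8),
]
chosen = select_target_egress(ports)
print(chosen.name if chosen else "no port available")  # prints "port1"
```

Here `port0` is rejected because its actual occupancy exceeds its threshold, so `port1` becomes the target egress.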

In some embodiments, determining at least one target egress from the multiple destination ports according to the transmission bandwidth threshold and the predicted bandwidth threshold set separately for each destination port includes: according to an attribute characteristic of the data to be transmitted, confirming that at least one of the transmission bandwidth threshold and the predicted bandwidth threshold is used to determine the target egress, where the attribute characteristic characterizes the amount of data to be transmitted.

Some embodiments of the present application also select a more reasonable target egress according to the amount of data to be transmitted, making full use of the bandwidth of each destination port while ensuring that the data is transmitted smoothly.

In some embodiments, confirming, according to the attribute characteristic of the data to be transmitted, that at least one of the transmission bandwidth threshold and the predicted bandwidth threshold is used to determine the target egress includes: confirming that the data amount of the data to be transmitted is greater than a first set threshold, confirming that a second predicted transmission bandwidth occupancy corresponding to a second destination port is less than a second predicted bandwidth threshold corresponding to the second destination port, and selecting the second destination port as the target egress; or confirming that the data amount of the data to be transmitted is less than a second set threshold, confirming that a second actual transmission bandwidth occupancy corresponding to the second destination port is less than a second transmission bandwidth threshold corresponding to the second destination port, and selecting the second destination port as the target egress; where the first set threshold is greater than the second set threshold.

In some embodiments of the present application, when the amount of data to be transmitted is large, a destination port with sufficient predicted bandwidth headroom is preferred (that is, the predicted transmission bandwidth occupancy decides whether a given destination port is selected as the target egress); when the amount of data is small, the currently remaining bandwidth can be given priority (that is, the actual transmission bandwidth occupancy decides whether a given destination port is selected as the target egress). These embodiments combine the characteristics of the data to be transmitted with the remaining bandwidth of each destination port, and can more accurately determine the target egress connected to the host or memory for the data to be transmitted.
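
As a rough illustration of the size-dependent rule, the sketch below checks the predicted threshold for large transfers and the actual threshold for small ones. The tuple layout, the function name, and in particular the handling of sizes that fall between the two set thresholds (here: require both checks) are assumptions; the text does not say what happens in that middle range.

```python
from typing import Optional, Sequence, Tuple

# (name, actual occupancy, actual threshold, predicted occupancy, predicted threshold)
Port = Tuple[str, float, float, float, float]


def select_by_data_size(ports: Sequence[Port], data_bytes: int,
                        first_set_threshold: int,
                        second_set_threshold: int) -> Optional[str]:
    """Pick a target egress using the check that matches the transfer size."""
    if first_set_threshold <= second_set_threshold:
        raise ValueError("the first set threshold must exceed the second")
    for name, actual, actual_thr, predicted, predicted_thr in ports:
        if data_bytes > first_set_threshold:
            # Large transfer: only the predicted occupancy is checked.
            ok = predicted < predicted_thr
        elif data_bytes < second_set_threshold:
            # Small transfer: only the actual occupancy is checked.
            ok = actual < actual_thr
        else:
            # Assumption: in-between sizes must pass both checks.
            ok = actual < actual_thr and predicted < predicted_thr
        if ok:
            return name
    return None


ports = [
    ("p0", 0.2, 0.8, 0.9, 0.8),  # idle now, but heavily booked ahead
    ("p1", 0.9, 0.8, 0.3, 0.8),  # busy now, but little traffic predicted
]
print(select_by_data_size(ports, 4096, 1024, 64))  # large transfer -> "p1"
print(select_by_data_size(ports, 32, 1024, 64))    # small transfer -> "p0"
```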

In some embodiments, the actual transmission bandwidth occupancy is determined according to the transmission bandwidth of the current transmissions within the transmission window.

Some embodiments of the present application determine the actual transmission bandwidth occupancy of each destination port by counting the transmission bandwidth within the transmission window, which improves the accuracy and objectivity of the bandwidth occupancy estimate for each destination port.

In some embodiments, the actual transmission bandwidth occupancy is calculated as follows:

[Formula image: Figure GDA0004199193200000041]

Here, CLK COUNT is the number of system clock cycles counted within the configured transmission window, and TRANS COUNT represents the counted number of data bytes corresponding to all valid transmissions at the current moment.

With the above formula, some embodiments of the present application can determine the actual transmission bandwidth occupancy of each destination port, which improves the objectivity of the bandwidth occupancy estimate.
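
The formula itself survives only as an image reference, so its exact form cannot be quoted here. One reading that is consistent with the two definitions above (bytes counted, divided by clock cycles counted) is sketched below; treat the ratio, the function name, and the bytes-per-cycle unit as assumptions rather than the patent's stated formula.

```python
def actual_bandwidth_occupancy(trans_count_bytes: int, clk_count: int) -> float:
    """Assumed form: bytes transferred per system clock cycle within the window.

    trans_count_bytes corresponds to TRANS COUNT and clk_count to CLK COUNT.
    """
    if clk_count <= 0:
        raise ValueError("CLK COUNT must be positive")
    return trans_count_bytes / clk_count


# 512 bytes observed over a 64-cycle window
print(actual_bandwidth_occupancy(512, 64))  # 8.0 bytes per cycle
```

Dividing this figure by the port's peak bytes per cycle would give a utilization between 0 and 1, which may be a more convenient quantity to compare against a threshold; whether the patent normalizes in this way is not recoverable from the text.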

In some embodiments, the value of the parameter TRANS COUNT is counted by monitoring the transmission valid signal in the channel.

Some embodiments of the present application determine and record the actual transmission bandwidth occupancy by detecting the valid signal of the data transmitted on each channel of the bus, which improves the accuracy of the actual transmission bandwidth occupancy estimate.

In some embodiments, the predicted transmission bandwidth occupancy is determined according to the transmission bandwidth of the expected transmissions within an expected transmission window, where the expected transmission window is the total number of system clock cycles determined from the transmission length and transmission bit width of the expected transmissions.

Some embodiments of the present application determine the predicted transmission bandwidth occupancy of each destination port by counting the expected transmission bandwidth of each destination port within the expected transmission window, which improves the accuracy and objectivity of the bandwidth occupancy estimate for each destination port.

In some embodiments, the predicted transmission bandwidth occupancy is calculated as follows:

[Formula image: Figure GDA0004199193200000051]

Here, CLK COUNT1 is the system clock count value within the expected transmission window, and EST is the count value obtained by accumulating the transmission length of the data carried by the current transmission requests.

With the above formula, some embodiments of the present application can determine the predicted transmission bandwidth occupancy of each destination port, which improves the objectivity of the bandwidth occupancy estimate.
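
The predicted-occupancy formula is likewise only an image reference. The sketch below shows one plausible reading of the definitions above: an expected window measured in clock cycles, derived from transfer lengths and the channel bit width, and a ratio of EST to CLK COUNT1. Both functions, their names, and the assumption that lengths are expressed in bytes are illustrative only, and how the patent actually combines the two quantities cannot be confirmed from the text.

```python
import math
from typing import Iterable


def expected_window_cycles(transfer_lengths_bytes: Iterable[int], bit_width: int) -> int:
    """Clock cycles needed to move the pending transfers over a bit_width-wide channel."""
    if bit_width <= 0 or bit_width % 8 != 0:
        raise ValueError("bit_width must be a positive multiple of 8")
    bytes_per_cycle = bit_width // 8
    # Each transfer occupies a whole number of cycles, so round up per transfer.
    return sum(math.ceil(n / bytes_per_cycle) for n in transfer_lengths_bytes)


def predicted_bandwidth_occupancy(est: int, clk_count1: int) -> float:
    """Assumed form: EST divided by CLK COUNT1."""
    if clk_count1 <= 0:
        raise ValueError("CLK COUNT1 must be positive")
    return est / clk_count1


# Three pending requests of 64, 128 and 32 bytes on a 128-bit channel
print(expected_window_cycles([64, 128, 32], 128))  # 4 + 8 + 2 = 14 cycles
print(predicted_bandwidth_occupancy(224, 32))      # 7.0
```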

In some embodiments, the value of EST is obtained by extracting the information carried by the transmission request signal in the channel.

Some embodiments of the present application determine the value of the predicted transmission bandwidth occupancy by detecting the information carried by the transmission request signal of each channel on the bus, which improves the accuracy of the predicted transmission bandwidth occupancy estimate.

In a second aspect, some embodiments of the present application provide a routing device for an on-chip bus. The routing device includes: a bandwidth occupancy calculation module, configured to obtain the current actual transmission bandwidth occupancy of each destination port among at least one destination port on the bus, and to obtain the predicted transmission bandwidth occupancy of each of those destination ports over a preset future time period; and an arbitration module, configured to determine, according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, a transmission strategy for data to be transmitted from a source port.

In a third aspect, some embodiments of the present application provide a routing system for a system on chip. The routing system includes: a system clock counting module, configured to increment by 1 on every transmission clock cycle once monitoring starts and, on reaching the configured transmission window value, to clear to zero and restart the clock count for the next monitored transmission window; an actual transmission amount counting module, configured to count, according to the transmission valid signal, the actual transmission amount corresponding to valid transmissions on the channel; a predicted transmission amount counting module, configured to count, according to the transmission request signal, the predicted transmission amount on the channel over an expected future time period; a bandwidth calculation module, configured to determine the actual transmission bandwidth occupancy of a destination port from the configured transmission window size and the multiple actual transmission amounts corresponding to that destination port, and to determine the predicted transmission bandwidth occupancy from the predictable transmission window size and the multiple predicted transmission amounts corresponding to that destination port; an arbiter, configured to compare the actual transmission bandwidth occupancy of the destination port with the configured transmission bandwidth threshold to obtain a first comparison result, to compare the predicted transmission bandwidth occupancy of the destination port with the configured predicted transmission bandwidth threshold to obtain a second comparison result, and to generate a target egress selection signal according to the first comparison result and the second comparison result, where the target egress is the egress selected from the multiple destination ports for transmitting the data to be transmitted; and a router, configured to receive the target egress selection signal and send the data to be transmitted to the target egress.

Some embodiments of the present application use multiple counters to count the actual transmission amount, the predicted transmission amount, and the number of system clocks, and then determine the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy from these counts, which improves the accuracy of the bandwidth occupancy estimate.
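
The behaviour of the system clock counting module (increment every cycle, clear on reaching the window value, start the next window) can be modelled in software as below. This is a behavioural sketch only, not hardware description code; the class name and the `windows_completed` bookkeeping field are hypothetical additions for the example.

```python
class WindowClockCounter:
    """Software model of a clock counter that wraps at the transmission window value."""

    def __init__(self, window_cycles: int) -> None:
        if window_cycles <= 0:
            raise ValueError("window_cycles must be positive")
        self.window_cycles = window_cycles
        self.count = 0
        self.windows_completed = 0  # hypothetical: not part of the described module

    def tick(self) -> bool:
        """Advance one clock cycle; return True when a window has just ended."""
        self.count += 1
        if self.count == self.window_cycles:
            self.count = 0  # clear and begin counting the next window
            self.windows_completed += 1
            return True
        return False


counter = WindowClockCounter(window_cycles=4)
window_ends = [counter.tick() for _ in range(9)]
print(window_ends)    # True on the 4th and 8th cycle
print(counter.count)  # 1: one cycle into the third window
```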

In some embodiments, when the transmission valid signal is confirmed to be at a first level, the actual transmission amount counting module is triggered to perform one count, where the first level is a high level or a low level.

By using the transmission valid signal to trigger counting of the actual transmission amount, some embodiments of the present application can improve the accuracy of the actual transmission amount statistics.

In some embodiments, the counting step of that one count is determined according to the transmission bit width of the channel.

Some embodiments of the present application set a uniform step value so that the bandwidth comparison results of multiple destination ports are comparable, which makes the resulting transmission strategy more reasonable.
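
A minimal model of the actual transmission amount counter is shown below: on every cycle in which the valid signal sits at its active level, the counter advances by one step derived from the channel bit width. Expressing the step in bytes (bit width divided by 8) is an assumption made so that the result lines up with a byte count; the function and parameter names are hypothetical.

```python
from typing import Iterable


def count_actual_bytes(valid_samples: Iterable[int], bit_width: int,
                       active_level: int = 1) -> int:
    """Accumulate one bit-width-sized step for each cycle where valid is active."""
    if bit_width <= 0 or bit_width % 8 != 0:
        raise ValueError("bit_width must be a positive multiple of 8")
    step = bit_width // 8  # assumed step: bytes moved per valid cycle
    total = 0
    for level in valid_samples:
        if level == active_level:
            total += step
    return total


# Valid is high on 3 of 5 cycles of a 64-bit channel
print(count_actual_bytes([1, 0, 1, 1, 0], bit_width=64))  # 24
```

Passing `active_level=0` models the case where the first level is a low level.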

In some embodiments, when the transmission request signal is confirmed to be at a second level, the predicted transmission amount counting module is triggered to perform one count, where the second level is a high level or a low level.

By using the transmission request signal to trigger counting of the predicted transmission amount, some embodiments of the present application can improve the accuracy of the predicted transmission amount statistics.

In some embodiments, the counting step of that one count is determined according to the transmission length, carried by the transmission request signal, of the data to be transmitted this time.

Some embodiments of the present application set a uniform step value so that the bandwidth comparison results of multiple destination ports are comparable, which makes the resulting transmission strategy more reasonable.
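
The predicted transmission amount counter differs from the actual one only in its trigger and step: it fires on the request signal and advances by the transfer length the request carries. The sketch below models that; the `(level, length)` pair representation and the names are assumptions made for the example.

```python
from typing import Iterable, Tuple


def count_predicted_bytes(requests: Iterable[Tuple[int, int]],
                          active_level: int = 1) -> int:
    """Sum the transfer lengths carried by cycles whose request signal is active.

    Each item is (request_level, transfer_length_bytes) sampled on one cycle.
    """
    total = 0
    for level, length_bytes in requests:
        if level == active_level:
            total += length_bytes  # step equals the length carried by the request
    return total


# Two requests (64 and 256 bytes) separated by one idle cycle
print(count_predicted_bytes([(1, 64), (0, 0), (1, 256)]))  # 320
```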

In a fourth aspect, some embodiments of the present application provide a data transmission method for a system on chip. The data transmission method includes: before a valid transmission starts, setting the transmission window value corresponding to each of multiple destination ports, and setting the bandwidth comparison thresholds corresponding to each of the multiple destination ports, where the bandwidth comparison thresholds include a transmission bandwidth comparison threshold and a predicted transmission bandwidth threshold; once valid transmission starts, continuously updating the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy of each of the multiple destination ports, where the actual transmission bandwidth occupancy is calculated from the transmission window value and the counted actual transmission amount, and the predicted transmission bandwidth occupancy is calculated from the predicted transmission window value and the counted predicted transmission amount; and, during arbitration, dynamically adjusting the output routing result according to the bandwidth comparison thresholds of each of the multiple destination ports, so that the data to be transmitted from the source port is routed to the adjusted target egress.
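
The three phases of this method (configure, update continuously, arbitrate) can be strung together in a small behavioural model. Everything below is a sketch under assumptions: the `PortMonitor` class, its fields, the bytes-per-cycle occupancy ratio, holding the last computed occupancy across a window reset, and first-match arbitration are all illustrative choices that the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PortMonitor:
    """Per-port monitor state (hypothetical names; occupancy in bytes per cycle)."""
    name: str
    window_cycles: int          # transmission window, set before transfers start
    actual_threshold: float     # transmission bandwidth comparison threshold
    predicted_threshold: float  # predicted transmission bandwidth threshold
    clk_count: int = 0
    trans_count: int = 0
    actual_occupancy: float = 0.0
    predicted_occupancy: float = 0.0

    def on_cycle(self, valid_bytes: int) -> None:
        """Update the actual occupancy on every clock cycle."""
        self.clk_count += 1
        self.trans_count += valid_bytes
        self.actual_occupancy = self.trans_count / self.clk_count
        if self.clk_count == self.window_cycles:
            # Window complete: restart the counters, keep the last occupancy value.
            self.clk_count = 0
            self.trans_count = 0

    def on_prediction(self, est_bytes: int, expected_cycles: int) -> None:
        """Update the predicted occupancy from pending request information."""
        self.predicted_occupancy = est_bytes / expected_cycles


def arbitrate(monitors: Sequence[PortMonitor]) -> Optional[str]:
    """Return the first port that is below both of its thresholds."""
    for m in monitors:
        if (m.actual_occupancy < m.actual_threshold
                and m.predicted_occupancy < m.predicted_threshold):
            return m.name
    return None


# Phase 1: configure windows and thresholds before any transfer.
a = PortMonitor("a", window_cycles=4, actual_threshold=8.0, predicted_threshold=8.0)
b = PortMonitor("b", window_cycles=4, actual_threshold=8.0, predicted_threshold=8.0)

# Phase 2: port a moves 16 bytes per cycle, port b only 2 bytes per cycle.
for _ in range(4):
    a.on_cycle(16)
    b.on_cycle(2)
a.on_prediction(est_bytes=64, expected_cycles=16)
b.on_prediction(est_bytes=64, expected_cycles=16)

# Phase 3: arbitration avoids the saturated port.
print(arbitrate([a, b]))  # "b"

# Once port a goes idle, the routing result shifts back to it.
for _ in range(4):
    a.on_cycle(0)
print(arbitrate([a, b]))  # "a"
```

The second call illustrates the dynamic adjustment the method describes: the same arbitration code returns a different target egress once the monitored occupancy changes.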

In a fifth aspect, some embodiments of the present application provide a system on chip. The system on chip includes a processor, a memory, at least one I/O device, and the routing system according to any implementation of the third aspect above, where the at least one I/O device and the processor are interconnected through the routing system, or the at least one I/O device and the memory are interconnected through the routing system.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can derive other related drawings from these drawings without creative effort.

FIG. 1 is a diagram of the connections between the IO routing of the related art and multiple I/O devices and a host;

FIG. 2 is an internal structure diagram of the IO routing of the related art;

FIG. 3 is a schematic diagram, provided by an embodiment of the present application, illustrating the blocking caused in the related art by the small number of destination ports available for upstream data transmission;

FIG. 4 is a flowchart of a routing method for an on-chip bus provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of transmission window movement provided by an embodiment of the present application;

FIG. 6 is a diagram of the connections between the multi-bus-interface IO routing provided by an embodiment of the present application and multiple I/O devices and a host;

FIG. 7 is a schematic diagram of the connection relationship between source ports and destination ports provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of routing in which upstream data transmission corresponds to two destination ports, provided by an embodiment of the present application;

FIG. 9 is a block diagram of a routing device for an on-chip bus provided by an embodiment of the present application;

FIG. 10 is a block diagram of a routing system for an on-chip bus provided by an embodiment of the present application;

FIG. 11 is an internal structure diagram of the IO routing provided by an embodiment of the present application;

FIG. 12 is a further flowchart of the routing method provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application.

It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In addition, in the description of the present application, the terms "first", "second", and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.

The abbreviations and terms used in this application have the following meanings.

SoC: System-on-a-Chip. An SoC is an integrated circuit with a dedicated purpose that contains a complete system, including all of its embedded software.

P2P: Peer-to-Peer, end-to-end transmission between PCIE devices.

Host: the object with which an I/O device is interconnected for transmission, such as a Data Fabric.

Monitor: a functional module that monitors transmission bandwidth.

Source port: the port, among the client ports of the IO router, that initiates a transmission request.

Target port: the port, among the client ports of the IO router, that receives a transmission request and outputs it.

The IOHUB of the related art, and the technical problems and defects of such IOHUBs, are described below by way of example with reference to FIG. 1 and FIG. 2.

The IOHUB design of the related art interconnects I/O devices with a host through bus interfaces (Bus Interface). Take five I/O devices interconnected with the host through the IOHUB as an example; the interconnection structure is shown in FIG. 1. In the upstream data flow of FIG. 1, the I/O devices 300 send data to the host 100: the five I/O devices are connected to the IOHUB (the IO router 200 in FIG. 1) through bus interfaces BUSIF, and the IO router 200 routes the requests and data from the I/O devices to the host 100 through the bus interface BUSIF connected to the host. In the downstream data flow of FIG. 1, data is sent from the host 100 to the I/O devices 300: the requests and data from the host 100 are routed and distributed to the bus interfaces BUSIF connected to the I/O devices, which gives access to the five I/O devices. In FIG. 1, the five I/O devices share one host 100, and the I/O devices can also access one another P2P through routing. From the point of view of the IOHUB, every port is a client port, and each client port may be either a source port or a destination port (target port), depending on the direction of data transmission.

The IOHUB design of the related art is shown in FIG. 2. Unlike FIG. 1, FIG. 2 replaces the host 100 of FIG. 1 with a memory 110. The memory 110 in FIG. 2 can receive and store data from the I/O devices over the bus, and can also send data to the I/O devices connected to the bus. FIG. 2 therefore shows two bus interface units connected to the memory 110: an upstream interface for the upstream data flow (acting as the destination port) and a downstream interface for the downstream data flow (acting as the source port). In some embodiments of the present application, the bus interfaces for these two directions are drawn as a single bus interface, as in FIG. 1. It should be noted that, where some embodiments of the present application increase the number of bus interfaces connected to the host or the on-chip memory, this may refer to increasing the number of bus interfaces serving as upstream interfaces.

FIG. 2 shows the main functional modules of the IOHUB of the related art: the bus interface BUSIF, the decoder 201, the bus buffer 202 (BUF), the multiplexer 203 (MUX), and the arbiter 204 (arbitration). The core function of each module is described below.

The bus interface BUSIF is the interface of an I/O device or of the memory on the bus, through which the I/O device or the memory accesses the IOHUB. The bus interface implements the bus protocol and handles the command channel, the data channel, and the response channel.

Decoder 201: this module decodes the address of an access request. In the upstream direction, the decoder 201 decodes the range of the request address to determine whether the target of the transmission is the memory, another I/O device, or an invalid address space. If the accessed space belongs to the memory or to a remote I/O device, the target port is the port connected to the host or the memory; if the accessed space belongs to an I/O device on the same hub, the target port is the port connected to that I/O device; if the accessed space is invalid, an invalid response is returned. In the downstream direction, the host port acts as the source port, and its decoder 201 decodes against the address space of each I/O device as configured by software: the target port is the port of the I/O device whose address space contains the access address.
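The range-based decoding described above can be sketched as follows. This is a minimal illustration, not the decoder of the patent: the address map, its ranges, and the port names are hypothetical values chosen only to show how a request address is matched against configured address spaces.

```python
from typing import List, Optional, Tuple

# Hypothetical address map (base, limit, port name); all values are invented.
ADDRESS_MAP: List[Tuple[int, int, str]] = [
    (0x0000_0000, 0x7FFF_FFFF, "host"),      # memory or remote I/O: host-side port
    (0x8000_0000, 0x8FFF_FFFF, "io_dev_1"),  # peer I/O device on the same hub
    (0x9000_0000, 0x9FFF_FFFF, "io_dev_2"),  # another peer I/O device
]


def decode_target_port(address: int) -> Optional[str]:
    """Return the port whose address range contains `address`, or None if invalid."""
    for base, limit, port in ADDRESS_MAP:
        if base <= address <= limit:
            return port
    # No range matched: invalid address space, so an invalid response is returned.
    return None
```

Returning `None` stands in for the invalid response; a real decoder would signal this on the response channel.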

The bus buffer 202 (for example, FIFOs for the different channels of the bus) temporarily stores data or access requests. For example, the FIFOs buffer access requests and data before they are arbitrated and output, so that accesses are not interrupted. The depth and bandwidth of the FIFOs depend on the number of client ports.

The arbiter 204 arbitrates when different source ports access the same target port; at any given moment only one source port is routed to the target port. Arbitration uses round-robin scheduling, that is, requests from the client ports are granted the target port in turn.
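Round-robin arbitration of this kind can be sketched as below. The class name and interface are hypothetical; the sketch only shows the rotation of grants and, like the arbiter of the related art, takes no account of transfer length or bandwidth.

```python
from typing import Optional, Sequence


class RoundRobinArbiter:
    """Grants one requesting source port at a time, rotating the starting point."""

    def __init__(self, num_sources: int) -> None:
        self.num_sources = num_sources
        # Start so that source 0 is the first candidate examined.
        self.last_granted = num_sources - 1

    def grant(self, requests: Sequence[bool]) -> Optional[int]:
        """Return the index of the granted source port, or None if none requests."""
        for offset in range(1, self.num_sources + 1):
            candidate = (self.last_granted + offset) % self.num_sources
            if requests[candidate]:
                self.last_granted = candidate
                return candidate
        return None
```

Because the grant ignores how long each winner holds the target port, a source issuing long transfers can starve one issuing short, frequent transfers.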

The multiplexer 203 selects, according to the results of the decoder 201 and the arbiter 204, which source port is output to the target port.

It should be noted that FIG. 2 has only one destination port in the upstream direction (corresponding to two data flow directions). As the number of I/O devices increases, the bandwidth of this destination port may therefore become the bottleneck limiting communication between the I/O devices and the memory 110.

The technical defects of the related art of FIG. 1 and FIG. 2 are described below by way of example with reference to FIG. 3.

In the related art (for example, FIG. 1 and FIG. 2), multiple I/O devices can access the host or the memory only through a single target port, and the arbitration module of the IO router arbitrates in round-robin fashion. As the number of I/O devices and the demand for bandwidth grow, the following problems arise. First, insufficient bandwidth: as I/O devices are added, the number of client ports on the IOHUB increases; and as the interface rates of the I/O devices rise and system applications demand more bandwidth, the single target port becomes a bandwidth bottleneck. As shown in FIG. 3, path ① and path ② each have a bandwidth of 32GB/s. Because the bus interface BUSIF connected to the host 100 has only one egress, its 32GB/s bandwidth becomes the bottleneck of the transmission bandwidth. For upstream transmission, the egress bandwidth should be calculated as follows:

$$BW_{\mathrm{target}} = \sum_{i} BW_{\mathrm{source},\,i}$$

Here, the left side of the equation represents the bandwidth of the destination port connected to the host (there is only one destination port), and the right side represents the sum of the bandwidths of the source ports connected to the I/O devices. It follows that the more source ports transmit simultaneously, the greater the bandwidth pressure on the target port.

Second, the arbitration mechanism of the related art does not take bandwidth into account. As described above, the arbiter 204 of the related art uses round-robin scheduling and ignores bandwidth. If one client issues a very long transfer and keeps occupying the target port, the next client cannot gain access to the target port until that transfer completes, which affects the transmission of critical data. In one case, as shown in FIG. 3, path ① is transmitting while path ② has real-time data that needs to acquire the BUSIF connected to the host 100 as soon as possible; because the transfer on path ① has not finished and cannot be interrupted, path ② cannot obtain the right to transmit, and its real-time transmission is interrupted. In another case, path ① issues bulk transfers, each with a large data size, while path ② issues small transfers at a high frequency. Path ② then releases the target port quickly each time, whereas path ① holds it for a long time each time, so overall the average bandwidth of path ② cannot be guaranteed.

Because of the above problems in the related art, some embodiments of the present application add a bandwidth monitoring mechanism to the IOHUB of the related art, improve the arbitration mechanism on the basis of bandwidth, and increase the number of target ports so as to raise the egress bandwidth toward the host or the on-chip memory. Further, some embodiments of the present application can flexibly adjust how the data to be transmitted is distributed across multiple target ports according to the bandwidth required by the source port and the monitored bandwidth of the destination ports.

For example, some embodiments of the present application increase the number of bus interfaces interconnected with the host or the memory (each bus interface corresponds to one port), and thereby increase the number of destination ports available when an I/O device is the data sender. Some embodiments of the present application also add a bandwidth monitoring mechanism for the ports and, based on the monitoring results, improve the arbitration and routing of the existing transmission strategy, which makes routing more flexible.

The routing method for an on-chip bus executed by the IOHUB of the embodiments of the present application is described below by way of example with reference to FIG. 4.

Referring to FIG. 4, the routing method for an on-chip bus provided by an embodiment of the present application includes: S101, obtaining the actual transmission bandwidth occupancy of each of at least one destination port on the bus; S102, obtaining the predicted transmission bandwidth occupancy of each of the at least one destination port over an expected future time period; and S103, determining a transmission strategy for the data to be transmitted from a source port according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy.

It should be noted that, for upstream data transmission, the destination port is a port connected to the host or the on-chip memory; for downstream data transmission, the destination port is a port through which the on-chip bus is connected to an I/O device. When the destination port is a port connected to an I/O device, there is one destination port, and the transmission strategy determined from the bandwidth occupancy (the actual and the predicted transmission bandwidth occupancy) addresses how to decide when multiple hosts transmit data to that destination port at the same time. When the destination port is a port connected to the host or the on-chip memory, some embodiments of the present application provide multiple destination ports, and the transmission strategy determined from the bandwidth occupancy is used to select one or more egresses from the multiple destination ports for the data to be transmitted, through which the data can be sent to the host or the on-chip memory.

Taking upstream data transmission as an example, the steps of the scheme of FIG. 4 are described below.

In some embodiments of the present application, the actual transmission bandwidth occupancy of S101 is determined from the transmission bandwidth of the current transmission within the transmission window of a given destination port. For example, the actual transmission bandwidth occupancy of any destination port is calculated as follows:

$$\mathrm{BW\_TRANS} = \frac{\mathrm{TRANS\ COUNT}}{\mathrm{CLK\ COUNT}}$$

Here, BW_TRANS represents the transmission bandwidth of the current transmission within the transmission window, CLK COUNT is the number of system clock cycles counted within the configured transmission window, and TRANS COUNT is the number of bytes counted for all valid transfers on the port.

CLK COUNT is the counted number of clock cycles within the transmission window. The size of the transmission window (the range of cycles over which transfers are counted) can be chosen empirically and is preset through a register.

The calculation window used to compute the actual transmission bandwidth occupancy is shown in FIG. 5. Once software has set the transmission window size, the size does not change as the window moves; the bandwidth, however, changes as the window moves. As shown in FIG. 5, the actual transmission bandwidth occupancy in the first transmission window Cal_Window is greater than that in the second transmission window Cal_Window.
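A monitor computing BW_TRANS over fixed-size windows might look like the following sketch. It rests on assumptions not stated in the original: the per-cycle byte trace, the function names, the use of the full window length as CLK COUNT, and the conversion to bytes per second by multiplying with a clock frequency are all introduced here for illustration.

```python
from typing import List, Sequence


def actual_bandwidth(trans_count_bytes: int, clk_count: int, freq_hz: float) -> float:
    """BW_TRANS scaled to bytes per second: TRANS COUNT / CLK COUNT * clock frequency."""
    if clk_count <= 0:
        raise ValueError("clk_count must be positive")
    return trans_count_bytes / clk_count * freq_hz


def windowed_bandwidth(bytes_per_cycle: Sequence[int], window: int,
                       freq_hz: float) -> List[float]:
    """Bandwidth of each consecutive window of `window` cycles.

    `bytes_per_cycle` is a hypothetical trace of valid bytes seen on the port
    in each clock cycle; the window size stays fixed while the window moves.
    """
    results = []
    for start in range(0, len(bytes_per_cycle) - window + 1, window):
        chunk = bytes_per_cycle[start:start + window]
        results.append(actual_bandwidth(sum(chunk), window, freq_hz))
    return results
```

With a trace that is busy every cycle in the first window and every other cycle in the second, the first window reports twice the bandwidth of the second, which matches the kind of difference between two Cal_Window positions described above.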

In some embodiments, the predicted transmission bandwidth occupancy of S102 is determined from the transmission bandwidth of the expected transmission of any destination port within an expected transmission window, where the expected transmission window is the total number of system clock cycles calculated from the transfer length and the transfer bit width of the expected transmission. For example, the expected transmission window is the total number of system clock cycles obtained by transferring, from the start of the transmission, one bus-width unit of data per system clock cycle until the transfer length of the expected transmission is covered. For example, the predicted transmission bandwidth occupancy is calculated as follows:

BW_TRANS=EST/CLK COUNT1

Here, CLK COUNT1 is the system clock count within the expected transmission window, and EST is the count obtained from the transfer length of the data carried in the current transmission request.

It should be noted that the expected transmission is the amount of data derived from the transfer length: if the transfer length in the transmission request is 256 Byte, the expected transmission is 256 Byte. The expected transmission window is the number of transfer cycles calculated from the transfer length and the transfer bit width: if the transfer length is 256 Byte and the bus width is 32 Byte, the expected transmission window is 8 cycles. Specific values of these parameters are given in the description below and are not repeated here.
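The expected transmission window and the resulting predicted bandwidth can be computed as in the sketch below. The function names are hypothetical, rounding a partial final cycle up to a whole cycle is an assumption, and so is the scaling by a clock frequency to obtain bytes per second.

```python
def expected_window_cycles(transfer_len_bytes: int, bus_width_bytes: int) -> int:
    """Cycles needed to move `transfer_len_bytes` at one bus width per cycle.

    Rounds up so that a partial final transfer still costs one cycle (assumption).
    """
    if transfer_len_bytes <= 0 or bus_width_bytes <= 0:
        raise ValueError("lengths must be positive")
    return -(-transfer_len_bytes // bus_width_bytes)  # integer ceiling division


def predicted_bandwidth(transfer_len_bytes: int, bus_width_bytes: int,
                        freq_hz: float) -> float:
    """EST / CLK COUNT1, scaled to bytes per second by the clock frequency."""
    cycles = expected_window_cycles(transfer_len_bytes, bus_width_bytes)
    return transfer_len_bytes / cycles * freq_hz
```

For the values in the text, `expected_window_cycles(256, 32)` gives the 8 cycles stated above.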

S103 is described below by way of example, taking the bus interfaces connected to the host or the on-chip memory as the destination ports (that is, upstream data transmission).

In the related art only one port is connected to the host or the on-chip memory, so data to be transmitted from the I/O devices cannot always be sent to the host or the memory promptly. To address this, in some embodiments of the present application the host or the on-chip memory is connected to the IO router through multiple bus interfaces, and during data transmission these bus interfaces correspond to multiple ports.

As shown in FIG. 6, unlike FIG. 1 and FIG. 2, the host 100 of FIG. 6 is connected to the IO router through four bus interfaces. Data from the I/O devices can be transmitted to the host 100 through one or more of these four bus interfaces, that is, through one or more of four destination ports, which significantly increases the total bandwidth available for simultaneous transmission to the host or the memory.

As shown in FIG. 6, when the I/O devices act as source ports, the number of target ports increases to four. The egress bandwidth is then four times that of the original scheme, allowing more I/O devices to transmit at the same time. As shown in FIG. 7, arbitrary routing is possible between source ports and target ports, that is, any source port can be routed to any target port. In actual use, when the source port is a bus interface connected to the host or the on-chip memory, the corresponding target port is a port connected to an I/O device. It should be noted that, in some embodiments of the present application, when the data flow is downstream, the route from source port to target port is determined by the address space being accessed; when the source port is an interface connected to an I/O device and the target port is a bus interface connected to the host or the memory, that is, when the data flow is upstream, the route from source port to target port is determined by the bandwidth monitored in some embodiments of the present application.

As an example, when the total bandwidth of the source ports is lower than that of the target ports, a source port can be routed to any target port, the bandwidth requirements of all source ports are met, and target ports that are not used remain in the IDLE state. When the total bandwidth of the source ports is equal to or higher than that of the target ports, the routes from the source ports to the target ports need to be adjusted dynamically according to the bandwidth of the target ports. For example, suppose the first source port Source_0 has a transfer bound for a target port. If, when the transmission request is issued, the first destination port Target_0 and the second destination port Target_1 are both transmitting while the third destination port Target_2 is IDLE, the data to be transmitted from Source_0 is routed to Target_2. If all four target ports are occupied, the data to be transmitted from Source_0 is routed, according to the bandwidth calculation result (the calculated predicted transmission bandwidth occupancy), to the target port with the lowest predicted transmission bandwidth occupancy.
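The selection rule in this example (prefer an idle target port, otherwise take the one with the lowest predicted occupancy) can be sketched as follows. The `TargetPort` structure and the numeric values are hypothetical; picking the first idle port when several are idle is also an assumption, since the text does not specify a tie-break.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TargetPort:
    name: str
    busy: bool            # True while the port is transmitting
    predicted_bw: float   # predicted transmission bandwidth occupancy, e.g. in GB/s


def select_target(ports: List[TargetPort]) -> TargetPort:
    """Pick an idle port if one exists, else the port with the lowest predicted occupancy."""
    idle = [p for p in ports if not p.busy]
    if idle:
        return idle[0]  # assumption: first idle port wins
    return min(ports, key=lambda p: p.predicted_bw)


ports = [
    TargetPort("Target_0", True, 30.0),
    TargetPort("Target_1", True, 20.0),
    TargetPort("Target_2", False, 0.0),
    TargetPort("Target_3", True, 10.0),
]
print(select_target(ports).name)  # Target_2 is idle, so it is selected
```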

An exemplary bandwidth-based routing strategy is described below with reference to an example.

When I/O devices transmit data to the host or the memory through multiple destination ports, the question of how to route sensibly arises. Some embodiments of the present application aim to keep each of the destination ports as close to fully loaded as possible so as to increase the amount of data transmitted to the host or the memory. Accordingly, S103 includes: determining at least one target egress for the data to be transmitted from the multiple destination ports according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, where the target egress is used to provide the data to be transmitted to the host or the on-chip memory. For example, S103 may include determining at least one target egress from the multiple destination ports according to a transmission bandwidth threshold and a predicted bandwidth threshold set for each of the destination ports.

When both the actual and the predicted transmission bandwidth occupancy of a destination port are below their respective thresholds, S103 determines that this destination port can serve as a target egress for the data to be transmitted. In some embodiments of the present application, S103 includes: confirming that the first actual transmission bandwidth occupancy of a first destination port (any one of the multiple destination ports) is less than the first transmission bandwidth threshold of that port, and that the first predicted transmission bandwidth occupancy of the first destination port is less than the first predicted bandwidth threshold of that port; and selecting the first destination port as the target egress.

That is, when the actual transmission bandwidth of a destination port is below the transmission bandwidth threshold and its predicted transmission bandwidth is below the predicted bandwidth threshold, the port can serve as a target egress for the data to be transmitted. It should be noted that if the amount of data to be transmitted from a source port exceeds the bandwidth this destination port can provide, the data from that source port can be distributed across multiple target egresses, in which case this destination port is only one of them.
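The two-threshold eligibility check can be sketched as below. The `PortStatus` structure and its field names are hypothetical, and the sketch returns every eligible port rather than deciding how data is split among them, which the text leaves open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PortStatus:
    name: str
    actual_bw: float        # actual transmission bandwidth occupancy
    predicted_bw: float     # predicted transmission bandwidth occupancy
    trans_threshold: float  # transmission bandwidth threshold for this port
    pred_threshold: float   # predicted bandwidth threshold for this port


def eligible_egresses(ports: List[PortStatus]) -> List[str]:
    """Names of ports whose actual and predicted occupancy are both below threshold."""
    return [
        p.name
        for p in ports
        if p.actual_bw < p.trans_threshold and p.predicted_bw < p.pred_threshold
    ]
```

A port that exceeds either threshold is excluded, so both comparisons must hold for the port to be offered as a target egress.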

An example is shown in FIG. 8, in which three I/O devices access the DDR in the host. Before the transmission strategy is described, assume the devices and links in FIG. 8 have the following initial parameters:

First, three devices (three I/O devices) in FIG. 8 initiate write requests, and the information carried by these requests indicates that the average bandwidth of the data to be transmitted is 32GB/s for each.

Second, the three devices in FIG. 8 are connected to the host through two bus interfaces, so, given the first assumption, this transmission has two destination ports, each with a full-load bandwidth of 48GB/s.

Third, assume the initial default routes in FIG. 8 are S1→T1, S2→T2, and S3→T2.

Fourth, the bus width of each Device is 32 Byte, that is, 32 Byte of valid data per system clock cycle; the bus width of the Host is 48 Byte, that is, 48 Byte of valid data per system clock cycle.

第五,假设工作频率为1GHz,即总线的理想带宽为32GB/s。Fifth, assume that the operating frequency is 1GHz, that is, the ideal bandwidth of the bus is 32GB/s.

第六,假设每个Device发起的写传输长度都为256Byte。Sixth, assume that the length of the write transfer initiated by each Device is 256Byte.

下面结合上述六点假设示例性阐述图8的工作流程如下:The workflow in Figure 8 is illustrated below in combination with the above six assumptions:

S201: set the transmission parameters of the devices and the host; the devices perform write transfers with a transfer length of 256 bytes.

S202: set the transmission bandwidth threshold of T1 to 38 GB/s (80% of the ideal bandwidth) and the predicted bandwidth threshold of T1 to 48 GB/s (the ideal bandwidth).

S203: set the transmission bandwidth threshold of T2 to 38 GB/s (80% of the ideal bandwidth) and the predicted bandwidth threshold of T2 to 48 GB/s (the ideal bandwidth).

S204: set the transmission window size of T1's monitor to 1024 (that is, a transmission window of 1024 cycles), and set the window of T2's monitor to 1024 as well.

S205: start the transmission; the monitor of each destination port begins working.

S206: when the first transmission window ends, perform the following steps:

S206-1: for T1, the actual transmission amount TRANS COUNT is 1000 × 32 bytes and CLK COUNT is 1000, so by the BW_TRANS formula the resulting bandwidth (the actual transmission bandwidth occupancy) is 32 GB/s. The predicted transmission amount EST COUNT of T1 is 256 bytes, and completing a 256-byte transfer takes a CLK COUNT of 8 system clock cycles, so by the EST_TRANS bandwidth formula the resulting bandwidth (the predicted transmission bandwidth occupancy) is 32 GB/s. Both values are below their preset thresholds, meaning that the bandwidth on T1 is not fully occupied and T1 can accept transmissions from other source ports.

S206-2: for T2, the actual transmission amount TRANS COUNT is 1000 × 48 bytes and CLK COUNT is 1000, so by the BW_TRANS formula the resulting bandwidth (the actual transmission bandwidth occupancy) is 48 GB/s.

S206-3: for T2, the predicted transmission amount EST COUNT from S2 is 256 bytes, and completing the 256-byte transfer takes a CLK COUNT of 6 cycles, so the BW_EST of T2 attributable to S2 (the predicted transmission bandwidth occupancy) is 43 GB/s. Likewise, the EST COUNT from S3 is 256 bytes and the required CLK COUNT is 6 cycles, so the BW_EST of T2 attributable to S3 is also 43 GB/s. The total estimated bandwidth demand on port T2 is therefore 86 GB/s.

S206-4: the actual and predicted transmission bandwidth occupancies on T2 therefore both exceed their corresponding thresholds.

S206-5: software changes the route of S2 to T1.

S207: when the second transmission window ends, the bandwidth results for T1 and T2 are the reverse of those in S206, and software changes the route of S2 back to T2.

Averaged over time, the bandwidth of S2 in Figure 8 is split evenly, 16 GB/s each, between destination port T1 and destination port T2.
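The arithmetic of the first transmission window can be checked with a short Python sketch. It assumes that bandwidth is obtained as a byte count divided by a clock-cycle count and scaled by the clock frequency, which matches the figures quoted in S206 but is not a reproduction of the application's own BW_TRANS formula; the function name `bandwidth_gbps` and the variable names are hypothetical.

```python
def bandwidth_gbps(byte_count, clk_count, freq_ghz=1.0):
    """Bytes moved per clock cycle, scaled by the clock frequency, in GB/s."""
    return byte_count / clk_count * freq_ghz


# First transmission window of the Figure 8 example.
t1_actual = bandwidth_gbps(1000 * 32, 1000)   # 32.0
t1_predicted = bandwidth_gbps(256, 8)         # 32.0
t2_actual = bandwidth_gbps(1000 * 48, 1000)   # 48.0
t2_predicted = 2 * bandwidth_gbps(256, 6)     # about 85.3 (S2 plus S3)

TRANS_THRESHOLD = 38.0  # GB/s, transmission bandwidth threshold
EST_THRESHOLD = 48.0    # GB/s, predicted bandwidth threshold

t1_has_headroom = t1_actual < TRANS_THRESHOLD and t1_predicted < EST_THRESHOLD
t2_overloaded = t2_actual >= TRANS_THRESHOLD and t2_predicted >= EST_THRESHOLD
print(t1_has_headroom, t2_overloaded)  # True True
```

Each 256-byte transfer over 6 cycles works out to about 42.7 GB/s, which the example rounds to 43 GB/s; the 86 GB/s total is the sum of the two rounded values.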

Note that some embodiments of this application determine the transmission strategy from characteristics of the data to be transmitted from any source port. For example, if the source port still has a large amount of data to transmit, the predicted transmission threshold is consulted to decide whether a destination port can serve as the target egress; if the source port has only a small amount of data to transmit, the transmission bandwidth threshold can be used instead to decide whether a destination port can serve as the target egress.

As an example, when either the actual transmission bandwidth occupancy or the predicted transmission bandwidth occupancy of a destination port in some embodiments of this application is below its corresponding threshold, S103 may include: determining, according to an attribute feature of the data to be transmitted, that at least one of the transmission bandwidth threshold and the predicted bandwidth threshold is used to determine the target egress, where the attribute feature characterizes the amount of data to be transmitted. For example, this includes: confirming that the amount of data to be transmitted is greater than a first set threshold; confirming that the second predicted transmission bandwidth occupancy of a second destination port is less than the second predicted bandwidth threshold corresponding to the second destination port; and selecting the second destination port as the target egress. Alternatively, it includes: confirming that the amount of data to be transmitted is less than a second set threshold; confirming that the second actual transmission bandwidth occupancy of the second destination port is less than the second transmission bandwidth threshold corresponding to the second destination port; and selecting the second destination port as the target egress. Here the first set threshold is greater than the second set threshold. The first destination port mentioned above is any one of the multiple destination ports.
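A minimal Python sketch of this data-volume-dependent choice follows. The function `may_use_port`, the dictionary keys, and the sample numbers are hypothetical. The text does not say what happens when the data volume lies between the two set thresholds, so the sketch assumes that both comparisons must pass in that case.

```python
def may_use_port(pending_bytes, port, first_threshold, second_threshold):
    """Decide whether `port` may serve as the target egress.

    `port` is a dict with keys bw_trans, bw_est, trans_threshold, est_threshold.
    """
    if first_threshold <= second_threshold:
        raise ValueError("the first set threshold must be greater than the second")
    est_ok = port["bw_est"] < port["est_threshold"]
    trans_ok = port["bw_trans"] < port["trans_threshold"]
    if pending_bytes > first_threshold:
        return est_ok       # large transfer: consult the predicted occupancy
    if pending_bytes < second_threshold:
        return trans_ok     # small transfer: consult the actual occupancy
    return est_ok and trans_ok  # assumption: in-between volumes need both


port = {"bw_trans": 40.0, "bw_est": 30.0, "trans_threshold": 38.0, "est_threshold": 48.0}
print(may_use_port(1_000_000, port, first_threshold=65_536, second_threshold=1_024))  # True
print(may_use_port(128, port, first_threshold=65_536, second_threshold=1_024))        # False
```

The same port is accepted for the large transfer and rejected for the small one because each case consults a different occupancy value.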

Refer to Figure 9, which shows the routing device for a bus provided by an embodiment of this application. It should be understood that this device corresponds to the method embodiment of Figure 4 above and can perform each step of that method embodiment. For the specific functions of the device, see the description above; detailed description is omitted here where appropriate to avoid repetition. The device includes at least one software function module that can be stored in memory in the form of software or firmware, or built into the operating system of the device. The routing device for the bus includes: a bandwidth occupancy calculation module 801, configured to obtain the actual transmission bandwidth occupancy of each of at least one destination port on the bus, and to obtain the predicted transmission bandwidth occupancy of each of the at least one destination port over a future preset time period; and an arbitration module 802, configured to determine, according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, a transmission strategy for the data to be transmitted from a source port.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the device described above can be found in the corresponding process of the Figure 4 method and is not repeated here.

The routing system for an on-chip bus according to an embodiment of this application is described below by way of example with reference to Figure 10.

Some embodiments of this application provide a routing system for an on-chip bus. The routing system includes: a system clock counting module 210, configured to increment by 1 on every transmission clock cycle once monitoring starts and, on reaching the configured transmission window value, to clear itself and restart clock counting for the next monitored transmission window; an actual transmission amount counting module 220, configured to count, according to the transmission valid signal, the actual transmission amount corresponding to valid transmissions on the channel (the TRANSCOUNT described above); a predicted transmission amount counting module 230, configured to count, according to the transmission request signal, the predicted transmission amount on the channel over a future expected time period (the EST described above); a bandwidth calculation module 240, configured to determine the actual transmission bandwidth occupancy of a destination port (BW_TRANS in Figure 10) from the configured transmission window size (REG_WINDOW in Figure 10) and the multiple actual transmission amounts corresponding to that destination port, and to determine the predicted transmission bandwidth occupancy (BW_TRANS_EST in Figure 10) from the predictable transmission window size and the multiple predicted transmission amounts corresponding to that destination port; an arbiter 250, configured to compare the destination port's actual transmission bandwidth occupancy with the configured transmission bandwidth threshold (REG_THRESHOLD in Figure 10) to obtain a first comparison result, to compare the destination port's predicted transmission bandwidth occupancy with the configured predicted transmission bandwidth threshold to obtain a second comparison result, and to generate a target egress selection signal (TRANS_BLOCK in Figure 10) from the first and second comparison results, where the target egress is the egress selected from the multiple destination ports for transmitting the data to be transmitted; and a router 260, configured to receive the target egress selection signal and send the data to be transmitted to the target egress.

In some embodiments of this application, when the transmission valid signal is confirmed to be at a first level, the actual transmission amount counting module is triggered to count once, where the first level is either a high level or a low level. For example, the step of each count is determined by the transmission bit width of the channel.

As an example, the transmission valid signal shown in Figure 10 is Valid_Transfer, which denotes one valid bus transfer. Valid_Transfer is evaluated separately for each channel on the bus; the validity condition is the channel's own transmission valid signal, and each valid transfer increments the counter of the corresponding actual transmission amount counting module 220 once. On the data channel, each increment equals the number of bytes transferred: if the data channel is 256 bits wide and all byte enables of the transfer are valid, the counter is increased by 32 for each valid transfer.
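The counting rule for the actual transmission amount can be illustrated with the Python sketch below. The function `trans_count` and the per-cycle sample format are hypothetical. The text only states the case in which every byte enable is valid (a step of 32 on a 256-bit channel); counting only the enabled bytes of a partial transfer is an assumption made here.

```python
def trans_count(beats, bus_width_bits=256):
    """Sum the bytes of every cycle in which Valid_Transfer is asserted.

    `beats` is an iterable of (valid, byte_enable_mask) pairs, one per cycle.
    """
    full_mask = (1 << (bus_width_bits // 8)) - 1
    total = 0
    for valid, byte_enable in beats:
        if valid:
            # One bit per byte lane; count the enabled lanes.
            total += bin(byte_enable & full_mask).count("1")
    return total


FULL = (1 << 32) - 1  # all 32 byte lanes of a 256-bit channel enabled
beats = [(True, FULL), (False, FULL), (True, FULL), (True, 0xF)]
print(trans_count(beats))  # 68 = 32 + 32 + 4
```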

In some embodiments, when the transmission request signal is confirmed to be at a second level, the predicted transmission amount counting module is triggered to count once, where the second level is either a high level or a low level. For example, the step of each count is determined by the transfer length of the data to be transmitted that is carried by the transmission request signal.

As shown in Figure 10, the transmission request signal is Request. It is used to predict the bandwidth that upcoming transfers will occupy and refers to the transfer command channel on the bus. On the transfer command channel, whenever Request is valid the counter increases by the transfer length in bytes; for example, for a transfer length of 256 bytes the counter increases by 256. The prediction is that, within the next 256 transmission cycles, the corresponding data channel will transfer 256 bytes of data. The predicted transmission amount counting module 230 is used to predict the bandwidth of the data channel but does not account for transmission stalls, so the actual bandwidth should be less than or equal to the predicted bandwidth.
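The request-driven counter can be sketched in the same style. The function `est_count`, the sample format, and the `delivered` figure are hypothetical and serve only to show why the actual amount cannot exceed the predicted amount when transfers stall.

```python
def est_count(request_samples):
    """Add the announced transfer length for every cycle with Request asserted.

    `request_samples` is an iterable of (request_valid, length_bytes) pairs.
    """
    total = 0
    for request_valid, length_bytes in request_samples:
        if request_valid:
            total += length_bytes
    return total


requests = [(True, 256), (False, 0), (True, 256), (False, 0)]
predicted = est_count(requests)
delivered = 384  # hypothetical: the second transfer was partly stalled
print(predicted, delivered <= predicted)  # 512 True
```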

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the device described above can be found in the corresponding process of the Figure 4 method and is not repeated here.

Figure 11 is a schematic structural diagram of the interconnection between the IOHUB, the hosts (or memories), and the I/O devices provided by an embodiment of this application. Unlike Figure 2, there are four bus interfaces between each host or memory and the bus in Figure 11 (one uplink signal plus one downlink signal in Figure 11 corresponds to one bus interface), and each bus interface module in Figure 11 may also contain a monitor. The monitor's output signals (for example, the actual transmission bandwidth occupancy BW_TRANS and the predicted transmission bandwidth occupancy BW_TRANS_EST of Figure 10) are buffered by the buffer BUF and then fed to the arbitration unit for routing arbitration, which selects a target egress with suitable bandwidth for the data sent from the source port. Units that are the same in Figure 11 and Figure 2 are not described again here.

The bandwidth monitor in Figure 11 includes the system clock counting module 210, the actual transmission amount counting module 220, the predicted transmission amount counting module 230, and the bandwidth calculation module 240 of Figure 10. For the functions of these modules, see the description above; they are not repeated here.

Some embodiments of this application provide a data transmission method for a system on chip. The data transmission method includes the following. Before a valid transmission starts: set the transmission window value corresponding to each of multiple destination ports, and set the bandwidth comparison thresholds corresponding to each of the multiple destination ports, where the bandwidth comparison thresholds include a transmission bandwidth comparison threshold and a predicted transmission bandwidth threshold. When the valid transmission starts: continuously update the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy of each of the multiple destination ports, where the actual transmission bandwidth occupancy is calculated from the transmission window value and the counted actual transmission amount, and the predicted transmission bandwidth occupancy is calculated from the predicted transmission window value and the counted predicted transmission amount. During arbitration: dynamically adjust the output routing result according to the bandwidth comparison thresholds of each of the multiple destination ports, so as to route the data to be transmitted from a source port to the adjusted target egress.

That is, the data transmission method of the system on chip in the embodiments of this application is shown in Figure 12. Before transmission starts, software uses registers to set the transmission window size used for bandwidth calculation and the bandwidth comparison thresholds. When transmission starts, the bandwidth is calculated and updated (that is, the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy are calculated). During arbitration, the output routing result is adjusted dynamically according to the bandwidth thresholds of each destination port (target port), and the updated routing result routes the source port to the adjusted destination port. When a transfer ends, the source port stops if it has no further transfers; if it continues transmitting, the preceding bandwidth calculation and routing adjustment are repeated. For arbitration routing, software must set REG_THRESHOLD for the corresponding destination port, that is, the port's bandwidth thresholds (both the transmission bandwidth threshold and the predicted bandwidth threshold). When the actual transmission bandwidth of a destination port is greater than or equal to the configured bandwidth threshold, no other source port is routed to that destination port; otherwise, the destination port can accept routes from other source ports.
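The flow described above (configure the window and threshold, update the bandwidth per window, then block or accept new routes) can be sketched as follows. This is a simplified illustration under stated assumptions: the `PortMonitor` class and `route` function are hypothetical, only the actual-bandwidth comparison is modeled, bandwidth is kept in bytes per cycle (numerically equal to GB/s at the 1 GHz clock of the Figure 8 example), and falling back to the default port when every port is blocked is a choice made here, not one stated in the text.

```python
class PortMonitor:
    """Per-destination-port monitor; bandwidth is kept in bytes per cycle."""

    def __init__(self, reg_window, reg_threshold):
        self.reg_window = reg_window        # transmission window, in cycles
        self.reg_threshold = reg_threshold  # transmission bandwidth threshold
        self.clk_count = 0
        self.trans_count = 0
        self.bw_trans = 0.0

    def tick(self, valid_bytes):
        """Advance one clock cycle, adding the bytes of any valid transfer."""
        self.clk_count += 1
        self.trans_count += valid_bytes
        if self.clk_count == self.reg_window:
            # End of window: update the bandwidth and clear both counters.
            self.bw_trans = self.trans_count / self.reg_window
            self.clk_count = 0
            self.trans_count = 0

    @property
    def trans_block(self):
        """True when no further source port should be routed to this port."""
        return self.bw_trans >= self.reg_threshold


def route(default_port, monitors):
    """Keep the default route unless it is blocked and another port is free."""
    if not monitors[default_port].trans_block:
        return default_port
    for name, monitor in monitors.items():
        if not monitor.trans_block:
            return name
    return default_port  # assumption: stay on the default if all are blocked


monitors = {"T1": PortMonitor(4, 38), "T2": PortMonitor(4, 38)}
for _ in range(4):
    monitors["T1"].tick(32)  # 32 bytes per cycle arriving at T1
    monitors["T2"].tick(48)  # 48 bytes per cycle arriving at T2
print(route("T2", monitors))  # T1
```

A four-cycle window is used only to keep the example short; the Figure 8 example configures 1024 cycles.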

Some embodiments of this application provide a system on chip that includes a processor, a memory, at least one I/O device, and the routing system and device described above, where the at least one I/O device and the processor are interconnected through the routing system or routing device, or the at least one I/O device and the memory are interconnected through the routing system or routing device.

In the embodiments provided in this application, it should be understood that the disclosed devices and methods may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions, and operations of devices, methods, and computer program products according to multiple embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

In addition, the functional modules in the embodiments of this application may be integrated to form a single independent part, each module may exist separately, or two or more modules may be integrated to form a single independent part.

If the functions are implemented as software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

The above are merely embodiments of this application and are not intended to limit its scope of protection; those skilled in the art may make various modifications and changes to this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection. Note that similar reference numerals and letters denote similar items in the drawings, so once an item is defined in one drawing it need not be further defined or explained in subsequent drawings.

The above are only specific implementations of this application, but the scope of protection of this application is not limited to them. Any change or replacement that a person familiar with this technical field could readily conceive within the technical scope disclosed in this application shall fall within its scope of protection. Therefore, the scope of protection of this application shall be subject to the scope of protection of the claims.

Note that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.

Claims (19)

1. A routing method for an on-chip bus, applied to an IO routing module, characterized in that the routing method comprises: obtaining the actual transmission bandwidth occupancy of each of multiple destination ports on the bus, wherein a destination port is a port connected to an on-chip memory or a host; obtaining the predicted transmission bandwidth occupancy of each of the multiple destination ports over a future expected time period; comparing at least one of the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy with a correspondingly set threshold to obtain a comparison result; and selecting, based at least on the comparison result, at least one target egress from the multiple destination ports, wherein the target egress is used to provide the host or the on-chip memory with data to be transmitted from a source port, and the source port is a port connected to various I/O devices.

2. The routing method according to claim 1, characterized in that determining at least one target egress from the multiple destination ports based at least on the comparison result comprises: if the comparison result is that a first actual transmission bandwidth occupancy corresponding to a first destination port is less than a first transmission bandwidth threshold corresponding to the first destination port, and a first predicted transmission bandwidth occupancy corresponding to the first destination port is less than a first predicted bandwidth threshold corresponding to the first destination port, selecting the first destination port as the target egress.

3. The routing method according to claim 1, characterized in that selecting at least one target egress from the multiple destination ports based at least on the comparison result comprises: selecting at least one target egress from the multiple destination ports according to an attribute feature of the data to be transmitted and the comparison result, wherein the attribute feature characterizes the amount of the data to be transmitted.

4. The routing method according to claim 3, characterized in that selecting at least one target egress from the multiple destination ports according to the attribute feature of the data to be transmitted and the comparison result comprises: confirming that the amount of the data to be transmitted is greater than a first set threshold; and, if the comparison result is that a second predicted transmission bandwidth occupancy corresponding to a second destination port is less than a second predicted bandwidth threshold corresponding to the second destination port, selecting the second destination port as the target egress.

5. The routing method according to claim 3, characterized in that selecting at least one target egress from the multiple destination ports according to the attribute feature of the data to be transmitted and the comparison result comprises: confirming that the amount of the data to be transmitted is less than a second set threshold; and, if the comparison result is that a second actual transmission bandwidth occupancy corresponding to a second destination port is less than a second transmission bandwidth threshold corresponding to the second destination port, selecting the second destination port as the target egress.

6. The routing method according to any one of claims 1-5, characterized in that the actual transmission bandwidth occupancy is determined according to the transmission bandwidth of the current transmission within a transmission window.

7. The routing method according to claim 6, characterized in that the actual transmission bandwidth occupancy is calculated by the following formula:
[Formula image QLYQS_1: actual transmission bandwidth occupancy]
where CLK COUNT is the number of system clock cycles counted within the set transmission window, and COUNT represents the counted number of bytes corresponding to all valid transmissions up to the current moment.
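As an illustration of claims 6 to 8, the following minimal Python sketch computes an occupancy figure for one transmission window. The formula itself is an image that is not reproduced in the text, so the ratio of counted bytes to counted clock cycles used here is an assumption, as are the names `actual_occupancy`, `valid_signal`, and `bytes_per_beat`.

```python
def actual_occupancy(valid_signal, bytes_per_beat):
    """Return bytes transferred per clock cycle over one transmission window.

    valid_signal: one boolean per system clock cycle in the window, True when
                  the channel's transmission valid signal is asserted.
    bytes_per_beat: bytes moved by one valid transfer (channel bit width / 8).
    ASSUMPTION: occupancy = counted bytes / counted clock cycles.
    """
    clk_count = len(valid_signal)  # CLK COUNT: cycles counted in the window
    if clk_count == 0:
        return 0.0
    # Byte count: add one beat's worth of bytes for every valid cycle.
    byte_count = sum(bytes_per_beat for valid in valid_signal if valid)
    return byte_count / clk_count


# Hypothetical 8-cycle window on a 128-bit (16-byte) channel.
window = [True, True, False, True, False, False, True, False]
print(actual_occupancy(window, bytes_per_beat=16))
```

Four valid beats of 16 bytes over 8 cycles give 8 bytes per cycle; a real implementation would keep these counts in hardware registers rather than a list.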
8. The routing method according to claim 7, wherein the value of the parameter TRANSCOUNT is counted by monitoring the transmission valid signal in the channel.
9. The routing method according to any one of claims 1-5, wherein the predicted transmission bandwidth occupancy is determined according to the transmission bandwidth of expected transmissions within a predictable transmission window, wherein the predictable transmission window is the total number of system clock cycles determined according to the transmission length and transmission bit width corresponding to the expected transmissions.
10. The routing method according to claim 9, wherein the predicted transmission bandwidth occupancy is calculated by the following formula:
[Formula image QLYQS_2: predicted transmission bandwidth occupancy]
where CLK COUNT1 is the system clock count value within the predictable transmission window, and EST is a count value obtained by accumulating the transmission lengths, carried by the current transmission requests, that characterize the data.
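The quantities in claims 9 to 11 can be sketched in the same way. Because the formula is an image that is not reproduced in the text, the ratio EST / CLK COUNT1 below is an assumption, and the claims do not spell out exactly how the predictable window is accumulated, so it is passed in as a parameter. The helper names `cycles_for` and `predicted_occupancy` are hypothetical.

```python
import math


def cycles_for(length_bytes, bit_width):
    """Clock cycles needed to move length_bytes over a bit_width-bit channel."""
    bytes_per_cycle = bit_width // 8
    return math.ceil(length_bytes / bytes_per_cycle)


def predicted_occupancy(request_lengths, clk_count1):
    """ASSUMPTION: predicted occupancy = EST / CLK COUNT1.

    request_lengths: transmission lengths in bytes carried by pending requests.
    clk_count1: system clock count of the predictable transmission window.
    """
    est = sum(request_lengths)  # EST: accumulated requested length
    return est / clk_count1 if clk_count1 else 0.0


print(cycles_for(256, 128))                     # 256 bytes at 16 bytes per cycle
print(predicted_occupancy([256, 64, 128], 56))  # 448 bytes over 56 cycles
```

Here `cycles_for` shows how a transmission length and bit width translate into clock cycles, which is the ingredient claim 9 names for sizing the predictable window.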
11. The routing method according to claim 10, wherein the value of EST is obtained by extracting the information carried by the transmission request signal in the channel.
12. A routing device for an on-chip bus, wherein the routing device comprises:
a bandwidth occupancy calculation module configured to:
obtain the actual transmission bandwidth occupancy of each of a plurality of destination ports on the bus, wherein each destination port is a port connected to an on-chip memory or a host; and
obtain the predicted transmission bandwidth occupancy of each of the plurality of destination ports within a future expected time period; and
an arbitration module configured to:
compare at least one of the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy with a correspondingly set threshold to obtain a comparison result; and
select at least one target egress from the plurality of destination ports based at least on the comparison result, wherein the target egress is used to provide the host or the on-chip memory with data to be transmitted from a source port, and the source port is a port connected to various I/O devices.
13. A routing system for a system on chip, wherein the routing system comprises:
a system clock counting module configured to increment by 1 on each transmission clock cycle once monitoring starts and, when the set transmission window value is reached, to clear to zero and restart clock counting for the next monitored transmission window;
an actual transmission amount counting module configured to count, according to the transmission valid signal, the actual transmission amount corresponding to valid transmissions;
a predicted transmission amount counting module configured to count, according to the transmission request signal, the predicted transmission amount within a future expected time period;
a bandwidth calculation module configured to:
obtain the actual transmission bandwidth occupancy of each of a plurality of destination ports on the bus, wherein the actual transmission bandwidth occupancy of a destination port is determined according to the set transmission window size and a plurality of the actual transmission amounts corresponding to that destination port; and
obtain the predicted transmission bandwidth occupancy of each of the plurality of destination ports within a future expected time period, wherein the predicted transmission bandwidth occupancy is determined according to the predictable transmission window size and a plurality of the predicted transmission amounts corresponding to the destination port;
an arbiter configured to:
compare the actual transmission bandwidth occupancy of the destination port with a set transmission bandwidth threshold to obtain a first comparison result;
compare the predicted transmission bandwidth occupancy with a set predicted transmission bandwidth threshold to obtain a second comparison result; and
generate a selection signal for a target egress according to the first comparison result and the second comparison result, and select at least one target egress from the plurality of destination ports based on the selection signal, wherein the target egress is an egress selected from the plurality of destination ports for transmitting the data to be transmitted, the target egress is used to provide a host or an on-chip memory with data to be transmitted from a source port, the source port is a port connected to various I/O devices, and the destination port is a port connected to the on-chip memory or the host; and
a router configured to receive the target egress selection signal so as to send the data to be transmitted to the target egress.
14. The routing system for a system on chip according to claim 13, wherein, when the transmission valid signal is confirmed to be at a first level, the actual transmission amount counting module is started to perform one count, wherein the first level is a high level or a low level.
15. The routing system for a system on chip according to claim 14, wherein the counting step of the one count is determined according to the transmission bit width of the channel.
16. The routing system for a system on chip according to claim 13, wherein, when the transmission request signal is confirmed to be at a second level, the predicted transmission amount counting module is started to perform one count, wherein the second level is a high level or a low level.
17. The routing system for a system on chip according to claim 16, wherein the counting step of the one count is determined according to the transmission length of the data to be transmitted carried by the transmission request signal.
18. A data transmission method for a system on chip, wherein the data transmission method comprises:
before a valid transmission begins:
setting the value of the transmission window corresponding to each of a plurality of destination ports, and setting the bandwidth comparison thresholds corresponding to each of the plurality of destination ports, wherein the bandwidth comparison thresholds include a transmission bandwidth comparison threshold and a predicted transmission bandwidth threshold, and each destination port is a port connected to an on-chip memory or a host;
obtaining the actual transmission bandwidth occupancy of each of the plurality of destination ports on the bus;
obtaining the predicted transmission bandwidth occupancy of each of the plurality of destination ports within a future expected time period;
when valid transmission starts, continuously updating the values of the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy of each of the plurality of destination ports, wherein the actual transmission bandwidth occupancy is calculated from the value of the transmission window and the counted actual transmission amount, and the predicted transmission bandwidth occupancy is calculated from the value of the predicted transmission window and the counted predicted transmission amount; and
during arbitration, comparing at least one of the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy with a correspondingly set threshold to obtain a comparison result, and dynamically adjusting the output routing result based at least on the comparison result, so as to route the data to be transmitted from the source port to the adjusted target egress, wherein at least one target egress is selected from the plurality of destination ports according to the actual transmission bandwidth occupancy and the predicted transmission bandwidth occupancy, the target egress is used to provide the host or the on-chip memory with data to be transmitted from the source port, and the source port is a port connected to various I/O devices.
19. A system on chip, wherein the system on chip comprises a processor, a memory, at least one I/O device, and the routing system according to any one of claims 13-17;
wherein the at least one I/O device is interconnected with the processor through the routing system; or
the at least one I/O device is interconnected with the memory through the routing system.
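The arbitration described in the claims can be illustrated with a short Python sketch. It is a software model under stated assumptions, not the patented hardware: the `DestPort` record, the `select_target` function, and the port names are hypothetical, and the rule for data volumes that fall between the two size thresholds (requiring both comparisons to pass) is an assumption the claims do not state.

```python
from dataclasses import dataclass


@dataclass
class DestPort:
    name: str
    actual: float               # actual transmission bandwidth occupancy
    predicted: float            # predicted transmission bandwidth occupancy
    actual_threshold: float     # transmission bandwidth threshold
    predicted_threshold: float  # predicted transmission bandwidth threshold


def select_target(ports, data_bytes, large_threshold, small_threshold):
    """Return the names of destination ports eligible as target egress.

    Large transfers are judged on predicted occupancy, small transfers on
    actual occupancy. ASSUMPTION: mid-sized transfers must pass both checks.
    """
    chosen = []
    for port in ports:
        actual_ok = port.actual < port.actual_threshold           # first comparison
        predicted_ok = port.predicted < port.predicted_threshold  # second comparison
        if data_bytes > large_threshold:
            eligible = predicted_ok
        elif data_bytes < small_threshold:
            eligible = actual_ok
        else:
            eligible = actual_ok and predicted_ok
        if eligible:
            chosen.append(port.name)
    return chosen


ports = [
    DestPort("mem0", actual=9.0, predicted=3.0, actual_threshold=8.0, predicted_threshold=6.0),
    DestPort("mem1", actual=2.0, predicted=7.0, actual_threshold=8.0, predicted_threshold=6.0),
]
print(select_target(ports, 4096, large_threshold=1024, small_threshold=128))
print(select_target(ports, 64, large_threshold=1024, small_threshold=128))
```

With these made-up numbers, the large transfer is steered to `mem0` (low predicted occupancy) and the small transfer to `mem1` (low actual occupancy), which mirrors the split between the two size-based selection claims.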
CN202011342617.3A 2020-11-25 2020-11-25 Routing method and system for on-chip bus Active CN112486871B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011342617.3A CN112486871B (en) 2020-11-25 2020-11-25 Routing method and system for on-chip bus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011342617.3A CN112486871B (en) 2020-11-25 2020-11-25 Routing method and system for on-chip bus

Publications (2)

Publication Number Publication Date
CN112486871A CN112486871A (en) 2021-03-12
CN112486871B true CN112486871B (en) 2023-06-13

Family

ID=74934870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011342617.3A Active CN112486871B (en) 2020-11-25 2020-11-25 Routing method and system for on-chip bus

Country Status (1)

Country Link
CN (1) CN112486871B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113900978B (en) * 2021-10-27 2024-05-10 海光信息技术股份有限公司 Data transmission method, device and chip
CN118642985B (en) * 2024-08-19 2024-11-19 成都电科星拓科技有限公司 Arbitration scheduling system and method for multi-port USB host chip

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6763415B1 (en) * 2001-06-08 2004-07-13 Advanced Micro Devices, Inc. Speculative bus arbitrator and method of operation
CN106708671A (en) * 2015-11-17 2017-05-24 深圳市中兴微电子技术有限公司 Method and device for detecting bus behavior of system on chip
CN106453114B (en) * 2016-10-11 2020-03-17 刘昱 Flow distribution method and device
CN109617806B (en) * 2018-12-26 2021-06-22 新华三技术有限公司 Data traffic scheduling method and device

Also Published As

Publication number Publication date
CN112486871A (en) 2021-03-12

Similar Documents

Publication Publication Date Title
US7624221B1 (en) Control device for data stream optimizations in a link interface
JP5335892B2 (en) High-speed virtual channel for packet-switched on-chip interconnect networks
US8316171B2 (en) Network on chip (NoC) with QoS features
US9444740B2 (en) Router, method for controlling router, and program
CN105247821B (en) Mechanism for controlling the utilization of resources using adaptive routing
JP5122025B2 (en) Repeater and chip circuit
JP5894171B2 (en) Arbitration of bus transactions on the communication bus and associated power management based on the health information of the bus device
US10735335B2 (en) Interface virtualization and fast path for network on chip
KR101478516B1 (en) Providing a fine-grained arbitration system
JP4820466B2 (en) Semiconductor system, repeater and chip circuit
CN112486871B (en) Routing method and system for on-chip bus
CN113271264B (en) Data stream transmission method and device of time-sensitive network
JP2018520434A (en) Method and system for USB 2.0 bandwidth reservation
CN112463673B (en) On-chip bus, and service quality arbitration method and device for on-chip bus
US9154569B1 (en) Method and system for buffer management
JP5715458B2 (en) Information processing system and mediation method
CN101252513B (en) On-chip network bandwidth resource scheduling method with guaranteed quality of service
WO2022160206A1 (en) System-on-chip abnormality processing method and apparatus, and system on chip
US20120278514A1 (en) Systems and Methods for Notification of Quality of Service Violation
Agarwal et al. Storage Stacks Need Congestion Control
US11868292B1 (en) Penalty based arbitration
US7552253B2 (en) Systems and methods for determining size of a device buffer
CN118175111A (en) A data transmission method, DMA controller, device and storage medium
CN100426258C (en) Embedded system and method for determining buffer size thereof
JP2024115216A (en) Transmission Control Circuit

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Address after: Industrial incubation-3-8, North 2-204, No. 18, Haitai West Road, Huayuan Industrial Zone, Binhai New Area, Tianjin 300450

Applicant after: Haiguang Information Technology Co.,Ltd.

Address before: 100082 industrial incubation-3-8, North 2-204, 18 Haitai West Road, Huayuan Industrial Zone, Haidian District, Beijing

Applicant before: Haiguang Information Technology Co.,Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant