CN113014631A - Device cache pushing system and method based on Hlink - Google Patents

Publication number
CN113014631A
Authority
CN
China
Prior art keywords
node
hlink
controller
address
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110188763.3A
Other languages
Chinese (zh)
Inventor
卢飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Qusu Technology Co ltd
Original Assignee
Zhejiang Qusu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Qusu Technology Co ltd filed Critical Zhejiang Qusu Technology Co ltd
Priority to CN202110188763.3A priority Critical patent/CN113014631A/en
Publication of CN113014631A publication Critical patent/CN113014631A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/56Provisioning of proxy services
    • H04L67/568Storing data temporarily at an intermediate stage, e.g. caching
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/54Organization of routing tables

Abstract

The invention discloses an Hlink-based device cache pushing system and method in the field of multi-node processing systems, and addresses how to achieve high-bandwidth, low-latency, efficient communication among multiple nodes. The system comprises at least two nodes; each node comprises a CPU, an Hlink controller and at least one accelerator, and the nodes are communicatively connected through their Hlink controllers. The Hlink controller comprises a bus interface, an external interface, an Hlink protocol stack, a control register, a command register, a routing table, a data check and recovery unit and an address mapping table. The invention enables efficient communication between nodes, eliminates a large number of data copy operations during data transmission, saves memory bandwidth and reduces communication delay.

Description

Device cache pushing system and method based on Hlink
Technical Field
The invention relates to a multi-node processing system, in particular to a system and a method for pushing a device cache based on an Hlink.
Background
At present, data processing nodes are usually connected through an Ethernet switch. Limited by the delay and bandwidth of Ethernet, the nodes can communicate only to a limited extent, and it is difficult to establish tightly coupled communication between the accelerators of two nodes. That is, existing data processing chips can only work alone and cannot be interconnected to form a larger-scale processing system. However, the task requirements placed on current data processing chips are diverse, and it is difficult for a single hardware structure to meet such diverse loads efficiently and at low cost.
For example, referring to fig. 5, when nodes are interconnected through an Ethernet switch, accelerator B of node 2 reads data from accelerator A of node 1 as follows:
1. Accelerator A of node 1 transmits the result data to the memory controller.
2. Accelerator A of node 1 notifies the CPU through an interrupt that the data is ready for transmission.
3. The CPU of node 1 instructs the network card to transmit the data to node 2.
4. The network card of node 1 copies the data from the memory controller to the buffer of the network card through DMA.
5. The network card of node 1 sends the data to the switch over Ethernet.
6. The switch sends the data to the network card of node 2.
7. After receiving the data, the network card of node 2 copies it to the memory controller through DMA.
8. The network card of node 2 notifies the CPU through an interrupt that the data has been received.
9. The CPU parses the data in the Ethernet message and notifies accelerator B.
10. Accelerator B of node 2 transfers the data from the memory controller to its own buffer by DMA, which completes the data read.
This approach has the following disadvantages: the transmission bandwidth is low and the delay is large; a large number of data copy operations are needed; and data transmission is based on the Ethernet protocol, whose overhead is large and whose efficiency is low.
Disclosure of Invention
In order to solve the above-mentioned problems in the prior art, i.e. to solve the problem of how to implement high-bandwidth, low-latency, and efficient communication between multiple nodes, in one aspect of the present invention, there is provided an Hlink-based device cache push system, which includes at least two nodes,
the node comprises a CPU, an Hlink controller and at least one accelerator; the nodes are in communication connection through an Hlink controller;
the Hlink controller comprises a bus interface, an external interface, an Hlink protocol stack, a control register, a command register, a routing table, a data check and recovery unit and an address mapping table;
the bus interface is used for communicating the Hlink controller with a module in a local node;
the external interface is used for being connected with other nodes;
the Hlink protocol stack is used for analyzing and transmitting messages;
the control register is used for configuring and initializing the Hlink controller;
the command register is used for command operation of the Hlink controller;
the routing table is used for specifying a node address for remote data access;
the data check and recovery unit is used for checking and recovering data inside the Hlink controller;
the address mapping table is used for configuring the mapping relation between the local physical address base address of the local node and the remote physical address base address and verifying the access authority of the remote node.
In one embodiment, the address mapping table includes a local device ID, a local physical address base, a remote physical address base, an address size identifier, an internal address identifier, a push support identifier, and a valid bit identifier;
the local device ID is used for identifying an accelerator with remote access authority in the local node;
the local physical address base is used for identifying a physical address of the local node;
the remote physical address base is used for identifying the remote physical address corresponding to the local physical address base;
the address size identifier is used for marking the address interval of the remote node that can be accessed;
the internal address identifier is used for indicating whether the remote physical address base is a memory space address of the remote node or an accelerator address of the remote node;
the push support identifier is used for indicating whether the local node has authority to access the accelerator cache of the remote node;
the valid bit identifier is used for indicating whether the corresponding entry in the address mapping table is valid.
In one embodiment, the Hlink controller has a buffer space for data buffering.
In another aspect of the present invention, a method for pushing an Hlink-based device cache is further provided, where the method is applied to the above system for pushing an Hlink-based device cache, and the method includes:
configuring an address mapping table and a routing table of an Hlink controller of each node;
configuring node addresses and accelerator addresses of a first node and a second node which need to establish remote access, sending the node addresses and the accelerator addresses to an Hlink controller of the first node and the second node, and verifying whether the Hlink controller of the first node and the second node has access authority or not;
initiating, by a control register of the first node, a remote data transfer when having access rights;
the accelerator of the first node transmits data to the Hlink controller cache of the first node;
the control register of the first node sends a start message to the Hlink controller of the second node, wherein the start message comprises a start flag and the data size of the transmitted data;
the Hlink controller of the first node sends the cached data to the Hlink controller of the second node for caching, and sends an end message when the transmission ends;
and when receiving the end message, the Hlink controller of the second node directly transmits the cached data to a buffer of an accelerator of the second node.
In an embodiment, after "the Hlink controller of the first node sends the cached data to the Hlink controller of the second node for caching, and sends an end message when the transmission ends", the method further includes: the first node sends a remote interrupt signal to the CPU of the second node.
In an embodiment, the system further includes an application layer, and after "the Hlink controller of the second node directly transmits the cached data to a buffer of an accelerator of the second node when receiving the end message", the method further includes: the application layer deletes the address mapping table configuration in the first node and the second node.
The invention has the advantages that:
the system and the method for pushing the device cache based on the Hlink can realize high-efficiency communication between nodes, and greatly save the resources of a CPU (central processing unit) because the CPU does not participate in the data transmission process. In the data transmission process, a large amount of data copying operations are reduced, the memory bandwidth is saved, and the communication delay is reduced. Further, the Hlink controller is provided with a remote access authority checking mechanism, so that the safety and accuracy of remote data pushing are guaranteed.
Drawings
Fig. 1 is the main structure of the Hlink-based device cache push system of the present invention.
Fig. 2 is a main structure of the Hlink controller of the present invention.
Fig. 3 is the main contents of the address mapping table of the present invention.
Fig. 4 is a schematic diagram illustrating the effect of implementing node interconnection through an Hlink controller.
Fig. 5 is a schematic diagram of the interconnection of nodes implemented by ethernet switches in the prior art.
Detailed Description
Referring to fig. 1, fig. 1 illustrates the main structure of an Hlink-based device cache push system. As shown in fig. 1, the Hlink-based device cache push system provided by the present invention includes at least two nodes. Each node comprises a CPU, an Hlink controller and at least one accelerator, and the nodes are communicatively connected through their Hlink controllers. In addition, each node may also include a memory controller and a network card. It should be noted that the memory controller and the network card are not essential, and a node may be provided without them. A node may contain one or more accelerators; the number is not limited. In this embodiment, each node has 3 accelerators: accelerator A, accelerator B, and accelerator C.
Referring to fig. 2, fig. 2 illustrates the main structure of an Hlink controller. As shown in fig. 2, the Hlink controller includes a bus interface, an external interface, an Hlink protocol stack, a control register, a command register, a routing table, a data check and recovery unit, and an address mapping table. The bus interface is used for communication between the Hlink controller and the modules in the local node. The external interface is used for connecting to other nodes. The Hlink protocol stack is used for parsing and transmitting messages. The control register is used for configuring and initializing the Hlink controller. The command register is used for command operations of the Hlink controller. The routing table is used to specify the node address for remote data access. The data check and recovery unit is used for data checking and recovery inside the Hlink controller. The address mapping table is used for configuring the mapping between the local physical address base of the local node and the remote physical address base, and for verifying the access authority of the remote node. The Hlink controller also has a cache space for data caching.
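The text states only that the routing table specifies the node address for remote data access; it does not give the table layout. As a minimal sketch, assuming each entry maps a destination node address to a port number on the external interface (the class name `RoutingTable`, the port concept and all values are hypothetical), a lookup could look like this:

```python
from typing import Dict, Optional


class RoutingTable:
    """Maps a destination node address to an external-interface port.

    The entry layout is an assumption made for illustration only.
    """

    def __init__(self) -> None:
        self._routes: Dict[int, int] = {}

    def add_route(self, node_address: int, port: int) -> None:
        self._routes[node_address] = port

    def lookup(self, node_address: int) -> Optional[int]:
        # None means this controller has no route to the requested node.
        return self._routes.get(node_address)


table = RoutingTable()
table.add_route(node_address=2, port=0)
```

Returning `None` for an unknown node lets the caller refuse the remote access instead of sending the message on an arbitrary port.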
Referring to fig. 3, fig. 3 illustrates the main contents of the address mapping table. As shown in fig. 3, the address mapping table contains a local device ID, a local physical address base, a remote physical address base, an address size identifier, an internal address identifier, a push support identifier, and a valid bit identifier. The local device ID identifies the accelerator in the local node that has remote access rights; for example, 0x01 denotes accelerator A of the local node, which may access the cache of an accelerator of the remote node. The local physical address base identifies a physical address of the local node, namely the physical address to which an access is mapped when the remote node accesses the local node. For example, when accelerator A of the remote node accesses node address 0x1000, the Hlink controller translates it to address 0x12341000 through the address mapping table; the remote node sends this address to the Hlink controller of the local node, where it points to an accelerator cache inside the local node. The remote physical address base identifies the remote physical address corresponding to the local physical address base. The address size identifier marks the address interval of the remote node that can be accessed: it identifies the end location of the accelerator address of the remote node, and access is permitted only to addresses between the base address and the end location. The internal address identifier indicates whether the remote physical address base is a memory space address of the remote node or an accelerator address of the remote node. Based on the internal address identifier, the Hlink controller determines whether the "support cache push" bit needs to be verified.
For example, when accessing the memory space of a remote node, the Hlink controller may ignore the "support cache push" bit; when accessing an accelerator address of the remote node, the Hlink controller checks whether the "support cache push" bit is 1 and, if it is not, prohibits the data access. The push support identifier indicates whether the local node has authority to access the accelerator cache of the remote node. The valid bit identifier indicates whether the corresponding entry in the address mapping table is valid: if the valid bit is 0, the address mapping of the entry is invalid; if the valid bit is 1, it is valid.
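The checks described above can be sketched in a few lines. This is an illustration under stated assumptions, not the controller's actual logic: the field names, the polarity of the internal address flag (modelled here as `targets_accelerator`), the treatment of the address size as a window length, and the base values are all hypothetical. The values are chosen only so that the access to 0x1000 reproduces the 0x12341000 translation mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AddressMapEntry:
    local_device_id: int       # accelerator in the local node with remote-access rights
    local_base: int            # local physical address base
    remote_base: int           # remote physical address base
    size: int                  # length of the accessible window (assumed to be in bytes)
    targets_accelerator: bool  # assumed polarity of the internal address flag
    push_supported: bool       # "support cache push" bit
    valid: bool                # valid bit


def translate(entry: AddressMapEntry, device_id: int, address: int) -> Optional[int]:
    """Return the translated address, or None if the access must be refused."""
    if not entry.valid or device_id != entry.local_device_id:
        return None
    offset = address - entry.local_base
    if not 0 <= offset < entry.size:
        return None  # outside the window between the base and the end location
    if entry.targets_accelerator and not entry.push_supported:
        return None  # accelerator targets require the push bit to be 1
    return entry.remote_base + offset


entry = AddressMapEntry(
    local_device_id=0x01, local_base=0x0, remote_base=0x12340000,
    size=0x2000, targets_accelerator=True, push_supported=True, valid=True,
)
print(hex(translate(entry, 0x01, 0x1000)))  # 0x12341000
```

When the target is memory space rather than an accelerator, the push bit is simply not consulted, which mirrors the rule that the controller may ignore it in that case.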
The embodiment of the invention also provides an Hlink-based device cache pushing method, which is applied to the Hlink-based device cache pushing system described above. The method comprises the following steps.
Step S1: configure the address mapping table and the routing table of the Hlink controller of each node.
Specifically, the local device ID, the local physical address base, the remote physical address base, the address size identifier, the internal address identifier, the push support identifier and the valid bit identifier in the address mapping table are configured through API software, and the routing table is configured according to the node address of each node.
Step S2: configure the node addresses and accelerator addresses of the first node and the second node between which remote access is to be established, send them to the Hlink controllers of the first node and the second node, and verify whether the Hlink controllers of the first node and the second node have access rights.
Specifically, the first node and the second node are the nodes between which remote access needs to be established. For example, when the cache data of accelerator A of the first node needs to be pushed to the cache of accelerator B of the second node, the node addresses and accelerator addresses of the first node and the second node are sent to the Hlink controllers of both nodes, and each Hlink controller verifies whether it has the access right by checking its own address mapping table.
Step S3: when the access right is confirmed, the control register of the first node initiates the remote data transfer.
Step S4: the accelerator of the first node transfers the data to the Hlink controller cache of the first node.
Step S5: the control register of the first node sends a start message to the Hlink controller of the second node.
Specifically, the start message includes the start flag and the data size of the transmitted data, so that the Hlink controller of the second node can correctly identify and receive the transmitted data, and the buffer data is guaranteed not to overflow.
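How the receiving controller uses the announced size is not spelled out in the text. One plausible reading, sketched below, is that it compares the size against its free cache space and refuses a transfer that would not fit. The names `StartMessage` and `accept_start`, and the refuse-on-overflow policy itself, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartMessage:
    start_flag: bool
    data_size: int  # number of bytes the sender is about to transmit


def accept_start(msg: StartMessage, free_cache_bytes: int) -> bool:
    """Accept only a well-formed transfer that fits in the receive cache."""
    return msg.start_flag and 0 < msg.data_size <= free_cache_bytes


ok = accept_start(StartMessage(start_flag=True, data_size=4096), free_cache_bytes=8192)
too_big = accept_start(StartMessage(start_flag=True, data_size=16384), free_cache_bytes=8192)
```

A real controller might instead apply flow control and accept the data in pieces; the sketch only shows why knowing the size up front is enough to rule out an overflow.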
Step S6: the Hlink controller of the first node sends the cached data to the Hlink controller of the second node for caching, and sends an end message when the transmission ends.
Specifically, the end message may include a data end flag, data check bits, and an interrupt enable bit. After the data transmission is complete, the Hlink controller can verify whether the data is correct by comparing the data check bits. By configuring the interrupt enable bit, the first node can also send a remote interrupt signal to the CPU of the second node when the transfer is complete.
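The check-bit comparison at the end of a transfer can be illustrated as follows. The text does not say which check code Hlink uses, so CRC-32 from Python's standard `zlib` module stands in for it here; the class and function names are likewise hypothetical.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EndMessage:
    end_flag: bool
    check_bits: int         # check value computed by the sender over the payload
    interrupt_enable: bool  # request a remote interrupt to the receiving CPU


def make_end_message(payload: bytes, interrupt_enable: bool) -> EndMessage:
    # CRC-32 is an assumed stand-in for the unspecified Hlink check code.
    return EndMessage(True, zlib.crc32(payload), interrupt_enable)


def verify_payload(received: bytes, end: EndMessage) -> bool:
    """Recompute the check value on the receiver and compare it with the sender's."""
    return end.end_flag and zlib.crc32(received) == end.check_bits


end = make_end_message(b"result-data", interrupt_enable=True)
intact = verify_payload(b"result-data", end)
corrupted = verify_payload(b"result-dat4", end)
```

Here `intact` is true and `corrupted` is false, because the receiver's recomputed value no longer matches the one carried in the message.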
Step S7: on receiving the end message, the Hlink controller of the second node directly transmits the cached data to a buffer of an accelerator of the second node. Specifically, the data is transmitted directly to accelerator B of the second node.
Further, the device cache pushing system based on the Hlink further comprises an application layer for supervising the data transmission process between the nodes. After step S7, the application layer is configured to delete the contents configured in the address mapping tables in the first node and the second node.
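Steps S2 to S7 and the cleanup by the application layer can be traced end to end with a small simulation. It is a sketch only: the `Node` class, the single `mapping_configured` flag standing in for the address mapping table check, and the `cache_capacity` parameter are all assumptions, and no CPU is modelled because the CPU takes no part in the transfer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    controller_cache: bytearray = field(default_factory=bytearray)
    accelerator_buffer: bytearray = field(default_factory=bytearray)
    interrupts: List[str] = field(default_factory=list)
    mapping_configured: bool = False  # stands in for a valid mapping table entry


def push(src: Node, dst: Node, data: bytes, cache_capacity: int,
         interrupt_enable: bool = False) -> bool:
    # S2/S3: proceed only if both controllers hold a configured mapping.
    if not (src.mapping_configured and dst.mapping_configured):
        return False
    # S4: the source accelerator hands its data to the local controller cache.
    src.controller_cache[:] = data
    # S5: the start message announces the size; refuse what would overflow.
    if len(src.controller_cache) > cache_capacity:
        src.controller_cache.clear()
        return False
    # S6: controller-to-controller transfer, then the end message.
    dst.controller_cache[:] = src.controller_cache
    src.controller_cache.clear()
    if interrupt_enable:
        dst.interrupts.append(f"remote interrupt from {src.name}")
    # S7: on the end message, data goes straight to the accelerator buffer.
    dst.accelerator_buffer[:] = dst.controller_cache
    dst.controller_cache.clear()
    # After S7: the application layer deletes the mapping configuration.
    src.mapping_configured = dst.mapping_configured = False
    return True


node1 = Node("node 1", mapping_configured=True)
node2 = Node("node 2", mapping_configured=True)
delivered = push(node1, node2, b"result", cache_capacity=64, interrupt_enable=True)
```

After the call, `node2.accelerator_buffer` holds the payload and both mapping flags are cleared, so a second push fails until the tables are configured again.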
In practical use, referring to fig. 1, only 3 data copies are needed in the data transmission process, far fewer than in the prior art. In addition, the nodes are interconnected through Hlink controllers; the bandwidth of a single Hlink lane is up to 25 Gb/s, and the delay is as low as 10 ns. According to the address mapping table in its Hlink controller, a node may map multiple accelerators of other nodes into its local space, so that its CPU can directly access and control these virtual accelerators and establish a tightly coupled connection. Fig. 4 shows the effect of interconnecting nodes through Hlink controllers.
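The copy count can be made concrete by listing the buffers the payload passes through on each path. The Hlink path below is inferred from steps S4, S6 and S7, and the Ethernet path from the data-moving steps 1, 4, 5, 6, 7 and 10 of the background section; counting every hop between adjacent buffers as one movement is an assumption of this sketch rather than a definition taken from the text.

```python
from typing import List

HLINK_PATH: List[str] = [
    "accelerator A buffer (node 1)",
    "Hlink controller cache (node 1)",
    "Hlink controller cache (node 2)",
    "accelerator B buffer (node 2)",
]

ETHERNET_PATH: List[str] = [
    "accelerator A buffer (node 1)",
    "memory (node 1)",
    "network card buffer (node 1)",
    "switch",
    "network card buffer (node 2)",
    "memory (node 2)",
    "accelerator B buffer (node 2)",
]


def data_movements(path: List[str]) -> int:
    """Each hop between adjacent buffers is one movement of the payload."""
    return len(path) - 1


hlink_moves = data_movements(HLINK_PATH)        # 3
ethernet_moves = data_movements(ETHERNET_PATH)  # 6
```

Under this counting the Hlink path needs 3 movements, matching the figure given in the text, against 6 for the Ethernet path, and it avoids the interrupt-driven CPU involvement of steps 2, 3, 8 and 9.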
In conclusion, the system and the method for pushing the device cache based on the Hlink can realize efficient communication among the nodes, and the CPU does not participate in the data transmission process, thereby greatly saving the resources of the CPU. In the data transmission process, a large amount of data copying operations are reduced, the memory bandwidth is saved, and the communication delay is reduced.
In the data transmission process, cache consistency does not need to be maintained between the Hlink controller and the accelerator; it can instead be ensured by the accelerator's driver, which reduces the design complexity of the system.
The Hlink controller is provided with a remote access authority checking mechanism, and the safety and the accuracy of remote data pushing are guaranteed.
The above description is of the preferred embodiment of the present invention and the technical principles applied thereto, and it will be apparent to those skilled in the art that any changes and modifications based on the equivalent changes and simple substitutions of the technical solution of the present invention are within the protection scope of the present invention without departing from the spirit and scope of the present invention.

Claims (6)

1. An Hlink-based device cache push system, which is characterized by comprising at least two nodes,
the node comprises a CPU, an Hlink controller and at least one accelerator; the nodes are in communication connection through an Hlink controller;
the Hlink controller comprises a bus interface, an external interface, an Hlink protocol stack, a control register, a command register, a routing table, a data check and recovery unit and an address mapping table;
the bus interface is used for communicating the Hlink controller with a module in a local node;
the external interface is used for being connected with other nodes;
the Hlink protocol stack is used for analyzing and transmitting messages;
the control register is used for configuring and initializing the Hlink controller;
the command register is used for command operation of the Hlink controller;
the routing table is used for specifying a node address for remote data access;
the data check and recovery unit is used for checking and recovering data inside the Hlink controller;
the address mapping table is used for configuring the mapping relation between the local physical address base address of the local node and the remote physical address base address and verifying the access authority of the remote node.
2. The Hlink-based device cache push system of claim 1, wherein said address mapping table contains a local device ID, a local physical address base, a remote physical address base, an address size identifier, an internal address identifier, a push support identifier, and a valid bit identifier;
the local device ID is used for identifying an accelerator with remote access authority in the local node;
the local physical address base is used for identifying a physical address of the local node;
the remote physical address base is used for identifying the remote physical address corresponding to the local physical address base;
the address size identifier is used for marking the address interval of the remote node that can be accessed;
the internal address identifier is used for indicating whether the remote physical address base is a memory space address of the remote node or an accelerator address of the remote node;
the push support identifier is used for indicating whether the local node has authority to access the accelerator cache of the remote node;
the valid bit identifier is used for indicating whether the corresponding entry in the address mapping table is valid.
3. The Hlink-based device cache push system of claim 2, wherein the Hlink controller has a cache space for data caching.
4. An Hlink-based device cache pushing method applied to the Hlink-based device cache pushing system according to any one of claims 1 to 3, wherein the method comprises the following steps:
configuring an address mapping table and a routing table of an Hlink controller of each node;
configuring node addresses and accelerator addresses of a first node and a second node which need to establish remote access, sending the node addresses and the accelerator addresses to an Hlink controller of the first node and the second node, and verifying whether the Hlink controller of the first node and the second node has access authority or not;
initiating, by a control register of the first node, a remote data transfer when having access rights;
the accelerator of the first node transmits data to the Hlink controller cache of the first node;
the control register of the first node sends a start message to the Hlink controller of the second node, wherein the start message comprises a start flag and the data size of the transmitted data;
the Hlink controller of the first node sends the cached data to the Hlink controller of the second node for caching, and sends an end message when the transmission ends;
and when receiving the end message, the Hlink controller of the second node directly transmits the cached data to a buffer of an accelerator of the second node.
5. The method of claim 4, wherein after "the Hlink controller of the first node sends the buffered data to the Hlink controller of the second node and sends an end message at the end", the method further comprises:
the first node sends a remote interrupt signal to the CPU of the second node.
6. The method of claim 4, wherein the system further comprises an application layer, and after "the Hlink controller of the second node directly transmits the cached data to a buffer of an accelerator of the second node when receiving the end message", the method further comprises:
the application layer deletes the address mapping table configuration in the first node and the second node.
CN202110188763.3A 2021-02-19 2021-02-19 Device cache pushing system and method based on Hlink Pending CN113014631A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110188763.3A CN113014631A (en) 2021-02-19 2021-02-19 Device cache pushing system and method based on Hlink

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110188763.3A CN113014631A (en) 2021-02-19 2021-02-19 Device cache pushing system and method based on Hlink

Publications (1)

Publication Number Publication Date
CN113014631A true CN113014631A (en) 2021-06-22

Family

ID=76402956

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110188763.3A Pending CN113014631A (en) 2021-02-19 2021-02-19 Device cache pushing system and method based on Hlink

Country Status (1)

Country Link
CN (1) CN113014631A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5887134A (en) * 1997-06-30 1999-03-23 Sun Microsystems System and method for preserving message order while employing both programmed I/O and DMA operations
CN104202391A (en) * 2014-08-28 2014-12-10 浪潮(北京)电子信息产业有限公司 RDMA (Remote Direct Memory Access) communication method between non-tightly-coupled systems of sharing system address space
JP2015135696A (en) * 2009-09-18 2015-07-27 インテル コーポレイション Providing hardware support for shared virtual memory between local physical memory and remote physical memory
CN105704098A (en) * 2014-11-26 2016-06-22 杭州华为数字技术有限公司 Data transmission method for virtualized networks, node controller and data transmission system for virtualized networks
CN107168810A (en) * 2017-05-10 2017-09-15 郑州云海信息技术有限公司 A kind of calculate node internal memory sharing system and reading and writing operation internal memory sharing method
CN109582611A (en) * 2017-09-29 2019-04-05 英特尔公司 Accelerator structure
CN110892387A (en) * 2017-07-14 2020-03-17 Arm有限公司 Memory node controller

Similar Documents

Publication Publication Date Title
US7587536B2 (en) Method and apparatus for distributing USB hub functions across a network
EP3214550B1 (en) Control of persistent memory via a computer bus
US6603744B2 (en) Connection establishment method, communication method, state change transmission method, state changing method, wireless apparatus, wireless device, and computer
US6421769B1 (en) Efficient memory management for channel drivers in next generation I/O system
KR101720134B1 (en) Bus bridge apparatus
US7705850B1 (en) Computer system having increased PCIe bandwidth
US9219695B2 (en) Switch, information processing apparatus, and communication control method
US20160283422A1 (en) Network interface controller with direct connection to host memory
US7058744B2 (en) Cluster system, computer and program
CN106959935B (en) Method compatible with I2C communication and IPMB communication
US20050132089A1 (en) Directly connected low latency network and interface
WO2002041157A2 (en) Method and apparatus for converting address information between pci bus protocol and a message passing queue-oriented bus protocol
US7469309B1 (en) Peer-to-peer data transfer method and apparatus with request limits
KR102303424B1 (en) Direct memory access control device for at least one processing unit having a random access memory
CN113014631A (en) Device cache pushing system and method based on Hlink
JP2000339267A (en) Pci bus control system
CN111190840A (en) Multi-party central processing unit communication architecture based on field programmable gate array control
JP2011113163A (en) Inter-end point communication control device and method in io access communication system
WO2019124259A1 (en) Configuration management device, configuration management system, configuration management method, and configuration management program
CN104850517A (en) Method and apparatus for transmitting packet data using DMA
KR100599112B1 (en) Equipment and method for communication between agents in PCI system
CN116756078B (en) Notification method and device of pcie data packet and storage medium
CN112948317A (en) Multi-node system based on Hlink and processing method
TWI411922B (en) Universal serial bus host controller and method utilizing the same
JPH054040Y2 (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210622