CN117725011A - Host bridging device - Google Patents

Publication number
CN117725011A
CN117725011A CN202311714543.5A CN202311714543A CN117725011A CN 117725011 A CN117725011 A CN 117725011A CN 202311714543 A CN202311714543 A CN 202311714543A CN 117725011 A CN117725011 A CN 117725011A
Authority
CN
China
Prior art keywords
host
request
memory
target
protocol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311714543.5A
Other languages
Chinese (zh)
Inventor
岳龙
王彦伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd

Abstract

An embodiment of the present application provides a host bridging device. The device comprises a first connector, a protocol converter, and a second connector, wherein the protocol converter is connected between the first connector and the second connector, the first connector is configured to connect to a first host, and the second connector is configured to connect to a second host; the first connector is configured to receive a reference request initiated by the first host; the protocol converter is configured to convert the reference request into a target operation instruction in a target instruction format used by the second host, according to a data operation specification agreed in a target protocol; and the second connector is configured to send the target operation instruction to the second host. The present application addresses the low bridging efficiency between a bridging device and a host and thereby improves that bridging efficiency.

Description

Host bridging device
Technical Field
The embodiments of the present application relate to the field of computers, and in particular to a host bridging device.
Background
RapidIO is an open interconnect standard designed for interconnecting high-performance embedded systems; it offers high reliability, high bandwidth, low latency, and interoperability. At present, RapidIO interfaces are mainly provided by embedded and programmable devices such as ASICs, DSPs, SoCs, and FPGAs, whereas high-performance CPUs rarely provide one. The current mainstream scheme is therefore to use a PCIe-to-RapidIO bridge to interconnect a CPU with devices that have a RapidIO interface. However, a PCIe-to-RapidIO bridge is itself a PCIe device, and PCIe devices currently cannot maintain cache coherence in hardware. If a PCIe-to-RapidIO bridge is used in an interconnected system, cache coherence must be maintained in software, which reduces the operating speed of the devices.
Disclosure of Invention
The embodiments of the present application provide a host bridging device that at least solves the problem in the related art of low bridging efficiency between a bridging device and a host.
According to one embodiment of the present application, there is provided a host bridging device including: a RapidIO controller, a protocol converter, and a Compute Express Link (CXL) controller, wherein,
the protocol converter is connected between the RapidIO controller and the CXL controller, the RapidIO controller is configured to connect to a RapidIO interface of a first host, and the CXL controller is configured to connect to a CXL protocol interface of a second host;
the RapidIO controller is configured to receive a reference request initiated through the RapidIO interface of the first host and to transmit the reference request to the protocol converter, where the reference request requests a target operation on data in a target memory space of the second host, the target operation has the effect of triggering an update of a target cache space of the second host, and the cache space is used to cache data in the memory space;
the protocol converter is configured to convert the reference request into a target operation instruction in a target instruction format used by the second host according to a data operation specification agreed in a target protocol, and to transmit the target operation instruction to the CXL controller, where the target protocol is a protocol capable of keeping the data versions in the memory space and the cache space of the same device consistent;
the CXL controller is configured to send the target operation instruction to the CXL protocol interface of the second host.
Optionally, the protocol converter is further connected to a first device memory managed by the second host;
the protocol converter is configured to receive, through the CXL controller, a candidate operation instruction that the second host generates by executing the target operation instruction and sends from the CXL protocol interface, and to send the candidate operation instruction to the first device memory, where the candidate operation instruction instructs that the target operation be executed on data in the first device memory, and the target memory space includes the first device memory.
Optionally, the protocol converter includes: a coherence agent engine, wherein,
a first request sending port of the coherence agent engine is connected to a cache coherence communication interface of the CXL controller, a first request receiving port of the coherence agent engine is connected to a global shared memory of the RapidIO controller, a device memory connection port of the coherence agent engine is connected to the first device memory, and a second request receiving port of the coherence agent engine is connected to a memory device communication port of the CXL controller;
the RapidIO controller is configured to transmit the reference request to the first request receiving port through the global shared memory;
the coherence agent engine is configured to convert the reference request into the target operation instruction in the target instruction format used by the second host according to the data operation specification agreed in the cache protocol of the CXL protocol, and to send the target operation instruction to the cache coherence communication interface of the CXL controller through the first request sending port;
the CXL controller is further configured to, upon receiving the candidate operation instruction fed back from the CXL protocol interface of the second host, transmit the candidate operation instruction to the second request receiving port of the coherence agent engine through the memory device communication port;
the coherence agent engine is further configured to, when the candidate operation instruction is received at the second request receiving port and operating on the first device memory has the effect of triggering an update of the cache space, convert the candidate operation instruction into a target request conforming to the storage protocol of the CXL protocol, and to send the target request to the first device memory through the device memory connection port.
Optionally, the protocol converter further comprises a first arbiter, wherein,
a third request receiving port of the first arbiter is connected to the memory device communication port of the CXL controller, and a second request sending port of the first arbiter is connected to the second request receiving port of the coherence agent engine;
the first arbiter is configured to, when the candidate operation instruction is received through the third request receiving port, detect a first operated object indicated by the candidate operation instruction, and to transmit the candidate operation instruction to the second request receiving port through the second request sending port when the first operated object is determined to be the first device memory.
Optionally, the protocol converter is further connected to a second device memory managed by the second host;
the protocol converter is configured to transmit the target operation instruction to the second device memory when the target memory space is determined to be the second device memory; and transmitting the target operation instruction to the second host under the condition that the target memory space is determined to be the host memory of the second host.
Optionally, the RapidIO controller includes: a first interface, a second interface, a third interface and a connector, wherein,
the first interface is used to connect the connector to the first host, the connector is connected to the protocol converter through the second interface, and the connector is also connected to the CXL controller through the third interface;
the connector is configured to receive, from the first interface, a request initiated by the first host, where the request is used to request an operation on data in a target memory space of the second host; to transmit the request to the protocol converter as the reference request when the requested operation is the target operation; and to send the request to the second host through the third interface when the requested operation is not the target operation.
Optionally, the host bridge device further includes: a message router, wherein,
the message router is connected between the connector and the third interface, and is also connected to a third device memory managed by the second host;
the message router is configured to transmit the request initiated by the first host to the third device memory when it is determined that the request is used to request an operation on data stored in the third device memory; and to transmit the request to the third interface when it is determined that the request is used to request an operation on data stored in the host memory of the second host.
Optionally, the message router includes: a parsing module and a routing module, wherein,
the parsing module is connected to the connector and to the routing module, and the routing module is connected to the third device memory and to the third interface;
the parsing module is configured to parse the request initiated by the first host to obtain the operation location information of the operation that the request targets, and to send the operation location information and the request to the routing module;
the routing module is configured to transmit the request initiated by the first host to the third device memory when the operation location information indicates a storage location in the third device memory; and to transmit the request to the third interface when the operation location information indicates a storage location in the host memory of the second host.
Optionally, the message router further includes: a second arbiter, wherein,
a fourth request receiving port of the second arbiter is connected to a third request sending port of the routing module, and a fourth request sending port of the second arbiter is connected to the third device memory;
the second arbiter is configured to, when the request initiated by the first host is received through the fourth request receiving port, detect a second operated object indicated by the request, and to transmit the request to the third device memory through the fourth request sending port when the second operated object is determined to be the third device memory.
Optionally, the protocol converter is further configured to: set its running state to a congestion state when the reference request is received, where the congestion state blocks the protocol converter from receiving further requests; and, when a response result returned by the second host in response to the target operation instruction is received, switch the running state from the congestion state to an unobstructed state, where the unobstructed state indicates that the protocol converter may continue to receive requests.
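The congestion and unobstructed states described above can be pictured as a two-state machine. The following is only a minimal sketch under assumptions: the class name `BlockingConverter`, the `RunState` enum, and the choice to reject (rather than queue) a blocked request are all hypothetical and are not specified in the application.

```python
from enum import Enum


class RunState(Enum):
    UNOBSTRUCTED = "unobstructed"
    CONGESTED = "congested"


class BlockingConverter:
    """Hypothetical model: accepts one reference request at a time."""

    def __init__(self):
        self.state = RunState.UNOBSTRUCTED
        self.in_flight = None

    def accept(self, request):
        """Return True if the request was accepted, False if it was blocked."""
        if self.state is RunState.CONGESTED:
            return False
        self.state = RunState.CONGESTED  # block further requests
        self.in_flight = request
        return True

    def on_response(self, response):
        """Handle the second host's response and reopen the converter."""
        completed = self.in_flight
        self.in_flight = None
        self.state = RunState.UNOBSTRUCTED
        return completed, response


converter = BlockingConverter()
first = converter.accept("req-A")    # accepted, converter becomes congested
second = converter.accept("req-B")   # blocked while req-A is outstanding
done = converter.on_response("ok")   # response arrives, converter reopens
third = converter.accept("req-B")    # accepted now
```

Serializing requests this way keeps at most one conversion outstanding, which is one simple way to avoid reordering between a request and its response; a real design might buffer blocked requests instead.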
In the present application, the first host initiates, toward the second host, a reference request that requests a target operation on data in the target memory space of the second host, and the target operation has the effect of updating the target cache space of the second host. A first connector, a second connector, and a protocol converter are therefore provided on the host bridging device, with the protocol converter connected between the first connector and the second connector. After the first connector receives the reference request, it sends the reference request to the protocol converter, which converts the reference request, according to the data operation specification agreed in the target protocol, into a target operation instruction in the target instruction format used by the second host. The first host and the second host are thus bridged by a hardware device, so that a second host that does not support the target protocol gains the function agreed in the target protocol. This solves the problem of low bridging efficiency between the bridging device and the hosts and improves that bridging efficiency.
Drawings
FIG. 1 is a schematic diagram of a host bridge device according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an optional host bridge according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an optional protocol conversion flow according to an embodiment of the present application;
FIG. 4 is a request routing diagram in the related art;
FIG. 5 is a schematic diagram of optional request routing according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an optional host bridge system according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
FIG. 1 is a schematic diagram of a host bridging device according to an embodiment of the present application. As shown in FIG. 1, the device includes: a first connector, a protocol converter and a second connector, wherein,
the protocol converter is connected between the first connector and the second connector, the first connector is used for being connected with a first host, and the second connector is used for being connected with a second host;
The first connector is configured to receive a reference request initiated by the first host, where the reference request is used to request to perform a target operation on data in a target memory space of the second host, and transmit the reference request to the protocol converter, where the target operation has an effect of triggering updating a target cache space of the second host, and the cache space is used to cache the data in the memory space;
the protocol converter is configured to convert the reference request into a target operation instruction in a target instruction format used by the second host according to a data operation specification agreed in a target protocol, and to transmit the target operation instruction to the second connector, where the target protocol is a protocol capable of keeping the data versions in the memory space and the cache space of the same device consistent;
the second connector is configured to send the target operation instruction to the second host.
With the above device, the first host initiates, toward the second host, a reference request that requests a target operation on data in the target memory space of the second host, and the target operation has the effect of updating the target cache space of the second host. The first connector, the second connector, and the protocol converter are therefore provided on the host bridging device, with the protocol converter connected between the first connector and the second connector. After the first connector receives the reference request, it sends the reference request to the protocol converter, which converts the reference request into the target operation instruction according to the data operation specification agreed in the target protocol. Because the target protocol is a protocol that keeps the data versions in the memory space and the cache space of the same device consistent, the memory space and the cache space of the second host remain consistent in data version after the second host executes the operation indicated by the target operation instruction. The first host and the second host are thus bridged by a hardware device, so that a second host that does not support the target protocol gains the function agreed in the target protocol, which solves the problem of low bridging efficiency between the bridging device and the hosts and improves that bridging efficiency.
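The data path of FIG. 1 (first connector, then protocol converter, then second connector) can be modeled in software as follows. This is only an illustrative sketch: the dataclass fields, the `op_table` lookup, and the opcode string `"RD_SHARED"` are hypothetical placeholders, since the application does not define message layouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRequest:
    op: str       # operation requested by the first host (hypothetical label)
    address: int  # location in the second host's target memory space


@dataclass(frozen=True)
class TargetInstruction:
    opcode: str   # opcode in the instruction format used by the second host
    address: int


class ProtocolConverter:
    """Converts a reference request using a hypothetical operation table."""

    def __init__(self, op_table):
        self.op_table = op_table

    def convert(self, request):
        if request.op not in self.op_table:
            raise ValueError(f"unsupported operation: {request.op}")
        return TargetInstruction(self.op_table[request.op], request.address)


class HostBridge:
    """First connector -> protocol converter -> second connector."""

    def __init__(self, converter, send_to_second_host):
        self.converter = converter
        # Callable standing in for the second connector.
        self.send_to_second_host = send_to_second_host

    def on_request_from_first_host(self, request):
        instruction = self.converter.convert(request)
        self.send_to_second_host(instruction)
        return instruction


sent = []  # records what the second connector would transmit
bridge = HostBridge(ProtocolConverter({"read_data": "RD_SHARED"}), sent.append)
bridge.on_request_from_first_host(ReferenceRequest("read_data", 0x1000))
```

After the call, `sent` holds the single converted instruction, mirroring how the second connector forwards only the converter's output to the second host.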
Optionally, in the embodiments of the present application, the reference request is a coherent access request, that is, a request to access the memory space of the second host. The target operations that require the data versions in the memory space and the cache space of the second host to be kept consistent may include, but are not limited to, an instruction read, a data read, an IO read operation, a read ownership operation, a data cache flush operation, and a data cache invalidation operation; the present solution is not limited in this respect.
Optionally, in the embodiments of the present application, the target protocol may be, but is not limited to, the RapidIO protocol or the CXL protocol. The protocol used by the reference request initiated by the first host may be the same as the target protocol or different from it; for example, when the reference request is in the RapidIO protocol format, the target protocol may be the RapidIO protocol or the CXL protocol. FIG. 2 is a schematic diagram of an optional host bridge according to an embodiment of the present application. As shown in FIG. 2, the host bridging device may be used, without limitation, on its own to bridge a host using the RapidIO protocol (the first host) and a host using the CXL protocol (the second host), providing hardware-coherent interconnection between a host device with a CXL interface (a CPU or an accelerator card) and the RapidIO host. A conventional PCIe-to-RapidIO bridge cannot support this, because most CPUs do not provide hardware cache coherence for externally attached IO devices such as PCIe devices. The host bridging device is mainly used to interconnect a CXL host and a RapidIO host, where the CXL host includes a CPU or SoC that supports the CXL protocol, and the RapidIO host includes an integrated SoC, DSP, FPGA, or ASIC endpoint. Both the CXL and RapidIO interfaces provided by the device support coherent memory access, so a CPU/SoC host and a RapidIO endpoint host can perform fast, coherent memory access through the CXL-RapidIO bridging device, which is more efficient than maintaining cache coherence in software.
As an alternative embodiment, the protocol converter is further connected to a first device memory managed by the second host;
the protocol converter is configured to receive, through the second connector, a candidate operation instruction generated by the second host executing the target operation instruction, where the candidate operation instruction is used to instruct to execute the target operation on data in a first device memory, and the target memory space includes the first device memory; and sending the candidate operation instruction to the first equipment memory.
Optionally, in the embodiments of the present application, after receiving the candidate operation instruction, the protocol converter may send it directly to the first device memory, or may first perform protocol conversion on it, converting the candidate operation instruction into the target protocol format and sending the converted operation instruction to the first device memory.
Optionally, in the embodiments of the present application, there may be one or more first device memories. When there are several, the protocol converter may provide one connection interface per first device memory, or may provide a single fixed connection interface through which it connects to all of the first device memories.
With the above arrangement, the protocol converter is connected to the first device memory managed by the second host, so that the first device memory and the second host are bridged in hardware and the second host can reach the first device memory through the host bridging device. Data interaction therefore remains possible when the original data path between the second host and the first device memory fails, and the hardware channel can also speed up data transmission to some extent.
As an alternative embodiment, the protocol converter includes: a conversion module and a control module, wherein,
the conversion module is connected to the control module, and the control module is connected to the first connector, the second connector and the first device memory;
the control module is used for transmitting the reference request to the conversion module;
the conversion module is used for converting the reference request into the target operation instruction in a target instruction format used by the second host according to the data operation specification agreed in the cache protocol of the CXL protocol, and returning the target operation instruction to the control module;
the control module is further configured to transmit the target operation instruction to the second connector, to receive the candidate operation instruction, and to transmit the candidate operation instruction to the conversion module when operating on the first device memory has the effect of triggering an update of the cache space;
the conversion module is further used for converting the candidate operation instruction into a target request of a storage protocol conforming to the CXL protocol and returning the target request to the control module;
the control module is further configured to send the target request to the first device memory.
Optionally, in the embodiments of the present application, when converting the reference request into the target operation instruction, the conversion module may convert the reference request according to the CXL.cache instruction format to obtain a target operation instruction in the CXL.cache format; when converting the candidate operation instruction into the target request, it may convert the candidate operation instruction according to the CXL.mem format to obtain a target request in the CXL.mem format.
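The two conversion passes handled by the control module and the conversion module can be sketched as below. This is a simplified model under assumptions: the `Message` fields and the protocol labels are hypothetical tags used only to show which pass produced which message, and the two lists stand in for the second connector and the first device memory port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    protocol: str  # label only: "RapidIO", "host", "CXL.cache" or "CXL.mem"
    op: str
    address: int


class ConversionModule:
    def to_cache_instruction(self, reference_request):
        # First pass: reference request -> target operation instruction.
        return Message("CXL.cache", reference_request.op, reference_request.address)

    def to_mem_request(self, candidate_instruction):
        # Second pass: candidate operation instruction -> target request.
        return Message("CXL.mem", candidate_instruction.op, candidate_instruction.address)


class ControlModule:
    def __init__(self, conversion, second_connector, first_device_memory):
        self.conversion = conversion
        self.second_connector = second_connector        # list: link to the second host
        self.first_device_memory = first_device_memory  # list: device memory port

    def on_reference_request(self, request):
        self.second_connector.append(self.conversion.to_cache_instruction(request))

    def on_candidate_instruction(self, candidate):
        self.first_device_memory.append(self.conversion.to_mem_request(candidate))


to_host, to_memory = [], []
control = ControlModule(ConversionModule(), to_host, to_memory)
control.on_reference_request(Message("RapidIO", "read_ownership", 0x2000))
control.on_candidate_instruction(Message("host", "read_ownership", 0x2000))
```

Keeping the conversion logic in a separate object, with the control module only moving messages between ports, mirrors the split between the two modules described above.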
FIG. 3 is a schematic diagram of an optional protocol conversion flow according to an embodiment of the present application. As shown in FIG. 3, the reference request is an access request with a cache coherence requirement. The reference requests sent by the first host cover 13 kinds of operations, such as an instruction read, a data read, an IO read operation, a read ownership operation, and a data cache flush operation, each of which involves a cache update operation at the OWNER. For such requests, the conversion module implements the corresponding function using the 16 operation types of the D2H request transaction messages in the CXL.cache sub-protocol.
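A table-driven lookup is one plausible way to organize such a conversion. The sketch below is illustrative only: the left-hand labels are hypothetical names for the operations listed above, the right-hand names (`RdShared`, `RdCurr`, `RdOwn`, `CLFlush`) are CXL.cache D2H request opcode names as they are commonly cited, and the pairing between the two sides is an assumption that does not come from the application.

```python
# Hypothetical pairing of RapidIO-side operations with CXL.cache D2H requests.
# Only a subset is shown; the actual mapping is defined by FIG. 3 of the application.
RIO_TO_CXL_CACHE = {
    "instruction_read": "RdShared",
    "data_read": "RdShared",
    "io_read": "RdCurr",
    "read_ownership": "RdOwn",
    "data_cache_flush": "CLFlush",
}


def convert_op(rio_op):
    """Return the assumed CXL.cache opcode for a RapidIO-side operation label."""
    try:
        return RIO_TO_CXL_CACHE[rio_op]
    except KeyError:
        raise ValueError(f"no CXL.cache mapping for operation: {rio_op}") from None


opcode = convert_op("read_ownership")
```

A many-to-one table like this also shows how 13 source operations could fit within a fixed set of destination operation types, since several source operations may share one destination opcode.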
As an alternative embodiment, the protocol converter further comprises a first detection module, wherein,
the first detection module is connected with the control module;
the first detection module is configured to detect a first operated object indicated by the candidate operation instruction, and to transmit the candidate operation instruction to the control module when the first operated object is determined to be the first device memory.
Optionally, in the embodiments of the present application, the first detection module may, but is not limited to, identify keywords in the operation instruction and determine that the first operated object is the first device memory when a target keyword is detected. The target keyword may be, but is not limited to, a keyword that characterizes the storage address of data in the device memory, or a keyword that characterizes a target port number of the protocol converter, where the target port number is the number of the port connected to the first device memory.
With the above arrangement, the first detection module detects the operated object indicated by each received operation instruction and passes the instruction to the control module only when the operated object is the first device memory. This ensures that the control module receives the correct operation instructions and avoids the unnecessary load on the control module and the conversion module that would result from other operation instructions being passed to the control module by mistake.
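The port-number variant of this check can be sketched as a simple filter. Everything here is assumed for illustration: the dictionary-shaped instruction, the `target_port` key, and the port number 2 are hypothetical and are not taken from the application.

```python
FIRST_DEVICE_MEMORY_PORT = 2  # hypothetical port number on the protocol converter


def targets_first_device_memory(instruction):
    """True when the instruction carries the port keyword of the first device memory."""
    return instruction.get("target_port") == FIRST_DEVICE_MEMORY_PORT


def forward_to_control_module(instructions):
    """Keep only the instructions the control module should receive."""
    return [i for i in instructions if targets_first_device_memory(i)]


incoming = [
    {"op": "write", "target_port": 2},
    {"op": "read", "target_port": 5},
    {"op": "read"},  # no port keyword at all
]
forwarded = forward_to_control_module(incoming)
```

Only the first instruction reaches the control module; the other two are filtered out before they can add load to the control and conversion modules.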
As an alternative embodiment, the protocol converter is further connected to a second device memory managed by the second host;
the protocol converter is configured to transmit the target operation instruction to the second device memory when the target memory space is determined to be the second device memory; and transmitting the target operation instruction to the second host under the condition that the target memory space is determined to be the host memory of the second host.
Optionally, in the embodiment of the present application, the protocol converter may directly transmit the target operation instruction to the second device memory, or may further convert the target operation instruction into a target request according to a storage protocol of the CXL protocol, and send the target request to the second device memory.
Optionally, in the embodiments of the present application, the memory space of the second host includes device memory managed by the second host and host memory locally attached to the second host, and the reference request may request an operation on either the device memory or the host memory. The converted target operation instruction can be handled in two ways. On the one hand, the target operation instruction may be sent directly to the second host; the second host determines whether the object to be operated on is the device memory or the host memory and, when it is the device memory, generates a candidate operation instruction for the device memory and forwards it to the device memory through the protocol converter. On the other hand, after receiving the reference request, the protocol converter may itself parse out the operated object that the reference request targets and, when the operated object is the device memory, send the target operation instruction directly to the device memory.
Optionally, in the embodiments of the present application, when the reference request is received, the protocol converter may parse it to determine the storage address information of the data on which the reference request requests an operation (the address information may include, but is not limited to, a source address or a destination address), and then determine from the address information whether the target memory space is the device memory or the host memory.
With the above arrangement, the protocol converter determines the specific type of the target memory space after receiving the reference request and sends the target operation instruction to the second device memory when the target memory space is determined to be the second device memory. This avoids the overly long data transmission path that results in the related art from sending the target operation instruction directly to the second host, and thereby improves the response speed of the operation.
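The address-based decision described above amounts to a range check. The following sketch assumes a single contiguous window for the second device memory; the base address and size are invented for illustration and do not come from the application.

```python
from typing import NamedTuple


class AddressRange(NamedTuple):
    base: int
    size: int

    def contains(self, address):
        return self.base <= address < self.base + self.size


# Hypothetical window in which the second device memory is mapped.
SECOND_DEVICE_MEMORY = AddressRange(base=0x8000_0000, size=0x1000_0000)


def select_destination(address):
    """Choose where the protocol converter sends the target operation instruction."""
    if SECOND_DEVICE_MEMORY.contains(address):
        return "second_device_memory"  # short path, bypasses the second host
    return "second_host"               # host memory, handled by the second host


inside = select_destination(0x8FFF_FFFF)
outside = select_destination(0x9000_0000)
```

The half-open range check means the last byte of the window routes to the device memory, while the first address past it falls back to the second host.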
As an alternative embodiment, the first connector includes: a first interface, a second interface, a third interface and a controller, wherein,
the first interface is used for connecting the controller to the first host, the controller is connected to the protocol converter through the second interface, and the controller is also connected with the second connector through the third interface;
The controller is configured to receive a request initiated by the first host from the first interface, where the request initiated by the first host is used to request to operate on data in a target memory space of the second host; transmitting a request initiated by the first host as the reference request to the protocol converter in case the operation requested by the first host is the target operation; and sending a request initiated by the first host to the second host through the third interface under the condition that the operation requested by the first host is not the target operation.
Optionally, in the embodiments of the present application, the controller may determine whether the operation requested by the request initiated by the first host is the target operation by identifying a keyword in the request that characterizes the operation type, or by identifying the protocol format of the request.
Optionally, in this embodiment of the present application, between the second host and the third interface, in order to ensure the transmission quality of data or in order for the second host to be able to better identify the operation request, a protocol conversion device may be further configured to convert the reference request into a protocol format supported by the second host.
Through the above, the controller classifies the requests sent by the first host, so that a reference request for executing the target operation is sent to the protocol converter, while a request that does not execute the target operation is sent directly to the second host. Because not every request initiated by the first host is sent to the protocol converter for protocol conversion, the load on the protocol converter is reduced. In addition, the rate at which the protocol converter converts reference requests is indirectly increased, which further ensures the bridging efficiency of the host bridging device between the first host and the second host.
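As an illustrative, non-limiting sketch, the keyword-based classification performed by the controller may be modeled as follows. The keyword names (`GSM_READ`, `GSM_WRITE`, `NWRITE`), the `Request` fields and the `classify` function are assumptions introduced only for this example and are not defined by the present application.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Destination(Enum):
    PROTOCOL_CONVERTER = auto()  # reached through the second interface
    SECOND_HOST = auto()         # reached through the third interface


# Hypothetical keywords marking a "target operation", i.e. an operation
# that triggers an update of the cache space of the second host.
TARGET_OPERATION_KEYWORDS = frozenset({"GSM_READ", "GSM_WRITE"})


@dataclass(frozen=True)
class Request:
    op_keyword: str  # keyword characterizing the operation type
    address: int     # address of the data to be operated on


def classify(request: Request) -> Destination:
    """Decide where the controller forwards a request from the first host."""
    if request.op_keyword in TARGET_OPERATION_KEYWORDS:
        # Target operation: hand over as a reference request for conversion.
        return Destination.PROTOCOL_CONVERTER
    # Any other operation bypasses the protocol converter.
    return Destination.SECOND_HOST
```

In this sketch only requests carrying one of the assumed keywords reach the protocol converter; every other request takes the path through the third interface.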
As an alternative embodiment, the host bridge device further comprises: a message router, wherein,
the message router is connected between the controller and the third interface, and is also connected with a third equipment memory managed by the second host;
the message router is configured to transmit, when it is determined that the request initiated by the first host is used to request to perform an operation on data stored in the third device memory, the request initiated by the first host to the third device memory; and transmitting the request initiated by the first host to the third interface under the condition that the request initiated by the first host is determined to be used for requesting to perform operation on the data stored in the host memory of the second host.
Optionally, in this embodiment, when the request initiated by the first host is used to perform an operation on data stored in the third device memory, the request initiated by the first host may be directly transmitted to the third device memory, or the request initiated by the first host may be converted into a candidate request conforming to a storage protocol of the CXL protocol, and the candidate request may be sent to the third device memory.
Optionally, in the embodiment of the present application, the message router may determine address information (which may, but is not limited to, including a source address or a destination address) of the data for which the operation is requested to be performed by analyzing the request initiated by the first host, and further determine whether the request initiated by the first host is a request for performing the operation on the data stored in the third device memory according to the address information.
Fig. 4 is a schematic diagram of request routing in the related art. As shown in Fig. 4, a request sent by the first host is forwarded by the host bridging device directly to the second host, and the second host determines whether the object of the requested operation is the local memory (host memory) or the HDM (device memory). If the object is the host memory, the request is transmitted directly to the host memory; if the object is the device memory, the request is transmitted back through the host bridging device to the HDM. This clearly lengthens the transmission path of requests whose object is the device memory.
Fig. 5 is an alternative request routing schematic. As shown in Fig. 5, a request sent by the first host is transmitted to the message router in the host bridging device; the message router determines whether the object of the requested operation is the host memory or the device memory, and a request whose object is the device memory is transmitted directly to the device memory. By contrast, conventional schemes have a long processing path for request transactions, and during execution the CXL.IO protocol and the CXL.MEM protocol share bandwidth, so bandwidth utilization is low and throughput suffers. The present scheme optimizes the case in which the address in an IO transaction request falls within the HDM range by bypassing the CXL host, which improves throughput while preserving data read-write integrity.
Through the above steps, when a request does not execute the target operation, the message router detects the object of the requested operation, so that a request initiated by the first host to operate on data stored in the third device memory is sent directly to the third device memory. This avoids the problem in the related art that the request is first sent to the second host, which determines the operated object and only then forwards the request to the third device memory. The request transmission path is thus shortened, the response to the request is faster, and the bridging efficiency of the host bridging device between the first host and the second host is further ensured.
As an alternative embodiment, the message router includes: a parsing module and a routing module, wherein,
the analysis module is respectively connected with the controller and the routing module, and the routing module is respectively connected with the third equipment memory and the third interface;
the analyzing module is used for analyzing the request initiated by the first host to obtain the operation position information requested by the request initiated by the first host, and sending the operation position information and the request initiated by the first host to the routing module;
The routing module is configured to transmit, to the third device memory, a request initiated by the first host, where the operation location information is used to indicate a storage location in the third device memory; and transmitting a request initiated by the first host to the third interface under the condition that the operation position information is used for indicating a storage position in the host memory of the second host.
Optionally, in the embodiment of the present application, the operation location information is used to characterize a storage location of the operated data, and the operation location information may be, but is not limited to, a source address or a destination address of the operated data, which is not limited in this scheme.
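As an illustrative, non-limiting sketch, the cooperation of the parsing module and the routing module may be modeled as follows. The address ranges, the dictionary-based request layout and the function names are assumptions chosen for this example only; in an actual device the ranges would be obtained from configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressRange:
    base: int
    size: int

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + self.size


# Hypothetical address map, for illustration only.
DEVICE_MEMORY = AddressRange(base=0x8000_0000, size=0x1000_0000)
HOST_MEMORY = AddressRange(base=0x0000_0000, size=0x8000_0000)


def parse_location(request: dict) -> int:
    """Parsing module: extract the operation location information.

    Assumes the destination address takes precedence over the source address.
    """
    if "dst" in request:
        return request["dst"]
    return request["src"]


def route(request: dict) -> str:
    """Routing module: choose the output according to the operation location."""
    address = parse_location(request)
    if DEVICE_MEMORY.contains(address):
        return "third_device_memory"
    if HOST_MEMORY.contains(address):
        return "third_interface"
    raise ValueError(f"address {address:#x} is outside every known range")
```

A request whose address lies in the assumed device memory range is delivered without passing through the second host, which corresponds to the shortened path described above.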
As an alternative embodiment, the message router further includes: a second detection module, wherein,
the second detection module is respectively connected with the routing module and the third equipment memory;
the second detection module is configured to detect a second operated object indicated by the request initiated by the first host, and instruct the routing module to transmit the request initiated by the first host to the third device memory if it is determined that the second operated object is the third device memory.
Through the above steps, the second detection module detects the operated object indicated by the received request and, when the operated object is the third device memory, instructs the routing module to transmit the request, which ensures the accuracy of the requests transmitted to the third device memory through the routing module.
As an alternative embodiment, the protocol converter is further configured to: setting an operation state to be a congestion state under the condition that the reference request is received, wherein the congestion state is used for blocking the protocol converter from receiving the request; and under the condition that a response result returned by the second host responding to the target operation instruction is received, converting the running state from the congestion state to an unobstructed state, wherein the unobstructed state is used for indicating the protocol converter to continuously receive a request.
Through the above steps, after the protocol converter receives the reference request, its running state is set to the congestion state, so that the protocol converter cannot receive other requests. Only when the protocol converter receives the response result returned by the second host in response to the target operation instruction is the running state converted to the unobstructed state, after which requests are received again. This preserves the order in which the protocol converter processes requests and prevents disordered request processing.
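As an illustrative, non-limiting sketch, the switching between the congestion state and the unobstructed state may be modeled as a two-state machine. The class and method names are assumptions introduced for this example, and the sketch assumes that at most one reference request is in flight at a time.

```python
from enum import Enum


class State(Enum):
    UNOBSTRUCTED = "unobstructed"
    CONGESTED = "congested"


class ProtocolConverter:
    """Accepts one reference request at a time (illustrative model only)."""

    def __init__(self):
        self.state = State.UNOBSTRUCTED
        self.in_flight = None

    def accept(self, reference_request) -> bool:
        """Return True if the request is accepted, False if it is blocked."""
        if self.state is State.CONGESTED:
            return False
        self.state = State.CONGESTED
        self.in_flight = reference_request
        return True

    def on_response(self, response):
        """Handle the response of the second host and unblock the converter."""
        if self.state is not State.CONGESTED:
            raise RuntimeError("response received with no request in flight")
        completed = self.in_flight
        self.in_flight = None
        self.state = State.UNOBSTRUCTED
        return completed, response
```

In this model a second request offered while the first is outstanding is refused and has to be offered again after the response arrives, which keeps requests processed strictly in order.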
Fig. 6 is a schematic diagram of an alternative host bridge system according to an embodiment of the present application. As shown in Fig. 6, the system includes a RAPID IO controller (corresponding to the first connector in the present application), a bridge controller and a CXL controller (corresponding to the second connector in the present application), where the bridge controller includes a coherence agent module (corresponding to the protocol converter in the present application), a stream data management module, a mapping management module, a message management module, a routing module (corresponding to the routing module in the present application), an arbiter (corresponding to the first detection module and the second detection module in the present application), and a TLP converter.
The CXL protocol controller comprises a Flex Bus physical layer, a data link layer and a transaction layer. Its main function is to implement the CXL protocol: externally it provides a CXL x16 interface, and internally it converts to the buses corresponding to the three sub-protocols CXL.IO, CXL.MEM and CXL.CACHE. The CXL.IO protocol is transmitted and received over an AXI Stream interface bus. The CACHE protocol and the MEM protocol share an interface bus, which comprises an AXIMM Master interface bus on the M port and an AXIMM Slave interface bus on the S port. The AXIMM Master interface bus acts as a master and mainly initiates the H2D requests of the CACHE protocol and the M2S requests of the MEM protocol. The AXIMM Slave interface bus acts as a slave and mainly receives the D2H requests of the CACHE protocol and the S2M requests of the MEM protocol initiated by the coherence agent. The CXL protocol controller supports two modes, CXL Root Port and CXL Endpoint, where the CXL Root Port mode is mainly used for connecting to CXL Endpoint devices, and the CXL Endpoint mode is mainly used for connecting to CXL Root Port (host) devices.
The RAPID IO protocol controller implements the standard RAPID IO protocol in layers, namely the physical layer, the transport layer and the logical layer.
The coherence agent module mainly acts as a proxy for coherent access requests between RAPID IO end devices and CXL end devices. The module holds a cache backup for the CXL host side and a cache backup for the RAPID IO device side. The coherence agent has five interfaces. A is an AXIMM Master interface, responsible for initiating D2H requests and S2M requests to the CACHE/MEM logic of the CXL controller. B is an AXIMM Slave interface, responsible for receiving the M2S requests passed on by the arbiter. CSR is the control register interface; it uses an AXIMM Slave interface bus and is mainly used by the CXL host to read the relevant device attributes. C is an AXIMM Slave interface bus that receives requests initiated by the global shared memory logic block of the RIO controller. D connects to the HDM and provides read-write access to the HDM.
The global shared memory logic of the RAPID IO controller transaction layer communicates with the coherence agent module over an AXI memory-mapped bus. When the coherence agent module receives a global shared memory request over this bus, it first enters a blocking state and then initiates a request (D2H) to the CXL host through interface A using the CXL.CACHE protocol. When the CXL host responds (H2D), the module completes the response on the read channel or write channel of the AXI memory-mapped bus of the RIO transaction layer and then leaves the blocking state.
The stream data management module bridges the stream data interface in the RAPID IO transaction layer IO logic and the routing module. S is an AXI Stream interface bus that receives stream data write messages from the IO logic block of the RIO controller. M is an AXIMM interface bus that converts the data stream into memory read-write transactions and connects to the S3 port of the routing module, which decides how the memory read-write transactions are routed.
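As an illustrative, non-limiting sketch, the conversion of a data stream into memory write transactions may be modeled as splitting the stream payload into bursts written to consecutive addresses. The burst size of 256 bytes, the `MemoryWrite` type and the function name are assumptions made for this example; the present application does not specify how the stream is segmented.

```python
from typing import Iterator, NamedTuple


class MemoryWrite(NamedTuple):
    address: int
    data: bytes


def stream_to_writes(base_address: int, payload: bytes,
                     max_burst: int = 256) -> Iterator[MemoryWrite]:
    """Split a stream payload into memory writes of at most max_burst bytes."""
    if max_burst <= 0:
        raise ValueError("max_burst must be positive")
    for offset in range(0, len(payload), max_burst):
        chunk = payload[offset:offset + max_burst]
        # Each burst targets the next consecutive address window.
        yield MemoryWrite(base_address + offset, chunk)
```

Each resulting transaction carries its own destination address, so a downstream routing stage can decide the path of every burst independently.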
Direct memory access between the RAPID IO host (first host) and the CXL host (second host) is initiated by the RIO device. When the RAPID IO terminal equipment initiates a data stream writing request, the stream data management module analyzes the message and then transmits the message to the S3 port of the routing module. The routing module CSR stores the HPA range of the HDM (device Memory) and the HPA range of the CXL host Memory, so that the data stream is sent to different targets according to the destination address HPA of the data stream.
The mapping management module parses and converts the read-write operations in the RAPID IO transaction layer IO logic. S is an AXI Stream interface bus that receives read operation, write operation and atomic operation messages from the IO logic block of the RAPID IO controller. M is an AXIMM interface bus that converts IO messages into memory read-write transactions and connects to the S2 port of the routing module, which decides how the memory read-write transactions are routed. When a RAPID IO end device initiates a read operation, write operation, write-with-response operation or atomic operation request, the mapping management module parses the message and then passes it to the routing module.
The message management module parses the two kinds of messages of the RAPID IO transaction layer. S is an AXI Stream interface bus that receives doorbell messages and data messages from the message logic block of the RAPID IO controller. M is an AXIMM interface bus that converts data messages into memory read-write transactions and doorbell messages into interrupt transactions, and connects to the S1 port of the routing module. The routing module decides how the memory read-write transactions are routed, and forwards the interrupt corresponding to a doorbell message to the TLP converter.
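As an illustrative, non-limiting sketch, the conversion performed by the message management module may be modeled as a mapping from the two message kinds to two transaction kinds. The mailbox-to-address table, the number of interrupt vectors and all type names are assumptions introduced for this example and are not specified by the present application.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Doorbell:
    info: int  # doorbell information value


@dataclass(frozen=True)
class DataMessage:
    mailbox: int
    payload: bytes


@dataclass(frozen=True)
class InterruptTransaction:
    vector: int


@dataclass(frozen=True)
class MemoryWriteTransaction:
    address: int
    data: bytes


# Hypothetical configuration: one buffer base address per mailbox.
MAILBOX_BASE = {0: 0x2000_0000, 1: 0x2001_0000}
NUM_VECTORS = 32  # hypothetical number of interrupt vectors


def convert(message: Union[Doorbell, DataMessage]):
    """Convert a doorbell into an interrupt and a data message into a write."""
    if isinstance(message, Doorbell):
        return InterruptTransaction(vector=message.info % NUM_VECTORS)
    if isinstance(message, DataMessage):
        return MemoryWriteTransaction(MAILBOX_BASE[message.mailbox],
                                      message.payload)
    raise TypeError(f"unsupported message type: {type(message).__name__}")
```

Under these assumptions the interrupt transaction would continue toward the TLP converter, while the memory write would be routed according to its address.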
The function of the arbiter is to arbitrate access to the HDM area between two masters, namely the CXL host and the RAPID IO device. The S1 port is driven by the M port of the CACHE/MEM logic block and receives M2S requests initiated by the CXL host; the S2 port is driven by the M1 port of the routing module and receives memory access requests initiated by the RAPID IO device.
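As an illustrative, non-limiting sketch, arbitration between the two masters may be modeled with a round-robin policy. The present application does not state which arbitration policy is used; round-robin is an assumption chosen here for simplicity, and the class and method names are hypothetical.

```python
from collections import deque


class RoundRobinArbiter:
    """Grants HDM access alternately to ports that have pending requests."""

    def __init__(self, ports=("S1", "S2")):
        self.queues = {port: deque() for port in ports}
        self.order = deque(ports)  # the port at the front is polled first

    def submit(self, port, request):
        self.queues[port].append(request)

    def grant(self):
        """Return (port, request) for the next grant, or None when idle."""
        for _ in range(len(self.order)):
            port = self.order[0]
            self.order.rotate(-1)  # the polled port moves to the back
            if self.queues[port]:
                return port, self.queues[port].popleft()
        return None
```

With this policy neither the CXL host (S1) nor the RAPID IO device (S2) can starve the other, because a port that has just been polled is placed behind the other port for the next grant.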
The routing module controls the paths of the IO transactions and message transactions initiated by RAPID IO end devices; it can forward the corresponding memory access transaction either to the CXL host through the TLP converter or to the coherence agent through the arbiter. The source address or destination address of an IO transaction (other than MAINTENANCE) initiated by a RAPID IO end device is a host physical address (HPA) of the CXL host, which covers both the memory area on the CXL host side and the HDM area on the conversion device; the two HPA ranges use different physical transmission paths. When the source or destination HPA lies in the local memory area of the host, the routing module forwards the memory access request to the TLP converter through M2, and the request is sent to the CXL host over the CXL.IO protocol. When the source or destination HPA lies in the HDM area, the routing module forwards the memory access request to the arbiter through M1, and the HDM area is accessed directly through the coherence agent. By contrast, conventional schemes have a long processing path for request transactions, and during execution the CXL.IO protocol and the CXL.MEM protocol share bandwidth, so bandwidth utilization is low and throughput suffers. The present scheme optimizes the case in which the address in an IO transaction request falls within the HDM range by bypassing the CXL host, which improves throughput while preserving data read-write integrity.
The TLP converter multiplexes various transactions onto the CXL.IO protocol, including: converting memory reads and writes in the CXL.IO domain into memory reads and writes on the AXIMM Master interface bus (M1); converting memory reads and writes in the AXIMM Slave (S1) domain into memory reads and writes in the CXL.IO domain; and converting MSI/MSI-X into the corresponding TLPs.
Through this embodiment, the coherence agent provides coherent access between RAPID IO devices and units such as CPUs/SoCs with a CXL interface. Compared with existing PCI-to-RAPID IO devices and methods, which must rely on software to maintain coherence, the present scheme provides hardware coherence, reduces software overhead and improves bridging efficiency.
In the above embodiment, the CXL interface supports two modes and can act either as a CXL root port device or as a CXL end device. The CXL interface supports existing standards such as CXL 1.0/2.0/3.0, and supports x16 as well as degraded modes. The RAPID IO interface supports 1/2/4/8/16 lanes and is compatible with SRIO. A plurality of RAPID IO or SRIO hardware interfaces are provided, so that a plurality of RAPID IO or SRIO units can be connected simultaneously.
Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the application described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may be implemented in program code executable by computing devices, so that they may be stored in a storage device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than that shown or described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps of them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the same, but rather, various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principles of the present application should be included in the protection scope of the present application.

Claims (10)

1. A host bridging device, comprising: a fast interconnect input-output controller, a protocol converter, and a computing quick link controller, wherein,
the protocol converter is connected between the fast interconnect input-output controller and the computing quick link controller, the fast interconnect input-output controller is used for being connected with a fast interconnect input-output interface of a first host, and the computing quick link controller is used for being connected with a computing quick link protocol interface of a second host;
the fast interconnect input/output controller is configured to receive a reference request initiated by the fast interconnect input/output interface of the first host, where the reference request is configured to request a target operation on data in a target memory space of the second host, and transmit the reference request to the protocol converter, where the target operation has an effect of triggering updating a target cache space of the second host, and the cache space is configured to cache data in the memory space;
The protocol converter is configured to convert the reference request into a target operation instruction in a target instruction format used by the second host according to a data operation specification agreed in a target protocol, where the target protocol is a protocol that has a function of keeping a memory space and a cache space of the same device consistent with data versions, and transmit the target operation instruction to the computing fast link controller;
the computing quick link controller is configured to send the target operation instruction to the computing quick link protocol interface of the second host.
2. The device of claim 1, wherein the protocol converter is further coupled to a first device memory managed by the second host;
the protocol converter is configured to receive, by the computing quick link controller, a candidate operation instruction generated by the second host executing the target operation instruction from the computing quick link protocol interface, where the candidate operation instruction is configured to instruct to execute the target operation on data in a first device memory, and the target memory space includes the first device memory; and sending the candidate operation instruction to the first equipment memory.
3. The apparatus of claim 2, wherein the protocol converter comprises: a coherence proxy engine, wherein,
the first request sending port of the consistency proxy engine is connected with the cache consistency communication interface of the computing quick link controller, the first request receiving port of the consistency proxy engine is connected with the global shared memory of the quick interconnection input-output controller, the equipment memory connecting port of the consistency proxy engine is connected with the first equipment memory, and the second request receiving port of the consistency proxy engine is connected with the memory equipment communication port of the computing quick link controller;
the fast interconnect input/output controller is configured to transmit the reference request to the request receiving port through the global shared memory;
the coherence agent engine is configured to convert the reference request into the target operation instruction in a target instruction format used by the second host according to a data operation specification agreed in a cache protocol of a computing quick link protocol, and send the target operation instruction to the cache coherence communication interface of the computing quick link controller through the first request sending port;
The computing quick link controller is further configured to, when receiving the candidate operation instruction fed back by the computing quick link protocol interface of the second host, transmit the candidate operation instruction to the second request receiving port of the coherence proxy engine through the memory device communication port;
the coherence agent engine is further configured to, when the candidate operation instruction is received from the second request sending port and the first device memory has the effect of triggering an update of the cache space, convert the candidate operation instruction into a target request conforming to a storage protocol of a computing quick link protocol, and invoke the device memory connection port to send the target request to the first device memory.
4. The apparatus of claim 3, wherein the protocol converter further comprises a first arbiter, wherein,
a third request receiving port of the first arbiter is connected with the memory device communication port of the computing quick link controller, and a second request transmitting port of the first arbiter is connected with the second request receiving port of the coherence agent engine;
The first arbiter is configured to detect a first operated object indicated by the candidate operation instruction when the candidate operation instruction is received through the third request receiving port, and transmit the candidate operation instruction to the second request receiving port through the second request sending port when the first operated object is determined to be the first device memory.
5. The device of claim 1, wherein the protocol converter is further coupled to a second device memory managed by the second host;
the protocol converter is configured to transmit the target operation instruction to the second device memory when the target memory space is determined to be the second device memory; and transmitting the target operation instruction to the second host under the condition that the target memory space is determined to be the host memory of the second host.
6. The apparatus of claim 1, wherein the fast interconnect input-output controller comprises: a first interface, a second interface, a third interface and a connector, wherein,
the first interface is used for connecting the connector to the first host, the connector is connected to the protocol converter through the second interface, and the connector is also connected with the computing quick link controller through the third interface;
The connector is configured to receive a request initiated by the first host from the first interface, where the request initiated by the first host is used to request to operate on data in a target memory space of the second host; transmitting a request initiated by the first host as the reference request to the protocol converter in case the operation requested by the first host is the target operation; and sending a request initiated by the first host to the second host through the third interface under the condition that the operation requested by the first host is not the target operation.
7. The device of claim 6, wherein the host bridge device further comprises: a message router, wherein,
the message router is connected between the connector and the third interface, and is also connected with a third device memory managed by the second host;
the message router is configured to transmit, when it is determined that the request initiated by the first host is used to request to perform an operation on data stored in the third device memory, the request initiated by the first host to the third device memory; and transmitting the request initiated by the first host to the third interface under the condition that the request initiated by the first host is determined to be used for requesting to perform operation on the data stored in the host memory of the second host.
8. The apparatus of claim 7, wherein the message router comprises: a parsing module and a routing module, wherein,
the analysis module is respectively connected with the connector and the routing module, and the routing module is respectively connected with the third equipment memory and the third interface;
the analyzing module is used for analyzing the request initiated by the first host to obtain the operation position information requested by the request initiated by the first host, and sending the operation position information and the request initiated by the first host to the routing module;
the routing module is configured to transmit, to the third device memory, a request initiated by the first host, where the operation location information is used to indicate a storage location in the third device memory; and transmitting a request initiated by the first host to the third interface under the condition that the operation position information is used for indicating a storage position in the host memory of the second host.
9. The apparatus of claim 8, wherein the message router further comprises: a second arbiter, wherein,
The fourth request receiving port of the second arbiter is connected with the third request sending port of the routing module, and the fourth request sending port of the second arbiter is connected with the third equipment memory;
the second arbiter is configured to detect a second operated object indicated by the request initiated by the first host when a request issued by the first host is received from the fourth request receiving port, and transmit the request initiated by the first host to the third device memory through the fourth request sending port when the second operated object is determined to be the third device memory.
10. The apparatus of claim 1, wherein the protocol converter is further configured to: setting an operation state to be a congestion state under the condition that the reference request is received, wherein the congestion state is used for blocking the protocol converter from receiving the request; and under the condition that a response result returned by the second host responding to the target operation instruction is received, converting the running state from the congestion state to an unobstructed state, wherein the unobstructed state is used for indicating the protocol converter to continuously receive a request.
CN202311714543.5A 2023-12-13 2023-12-13 Host bridging device Pending CN117725011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311714543.5A CN117725011A (en) 2023-12-13 2023-12-13 Host bridging device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311714543.5A CN117725011A (en) 2023-12-13 2023-12-13 Host bridging device

Publications (1)

Publication Number Publication Date
CN117725011A true CN117725011A (en) 2024-03-19

Family

ID=90204600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311714543.5A Pending CN117725011A (en) 2023-12-13 2023-12-13 Host bridging device

Country Status (1)

Country Link
CN (1) CN117725011A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination