WO2012119369A1 - Message processing method, device and system based on cc-numa - Google Patents

Message processing method, device and system based on CC-NUMA

Info

Publication number
WO2012119369A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
processor
packet
node controller
message
Prior art date
Application number
PCT/CN2011/077898
Other languages
French (fr)
Chinese (zh)
Inventor
程永波
贺成洪
兰可嘉
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN201180001573.0A priority Critical patent/CN102318275B/en
Priority to PCT/CN2011/077898 priority patent/WO2012119369A1/en
Publication of WO2012119369A1 publication Critical patent/WO2012119369A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols

Definitions

  • The present invention relates to the field of non-uniform memory access technologies, and in particular, to a CC-NUMA-based message processing method, apparatus, and system.
  • Background
  • Cache Coherent Non-Uniform Memory Access (CC-NUMA) is an important system architecture currently used in the design of large-scale parallel computers.
  • In a CC-NUMA system, each node is composed of a node controller and a plurality of central processing units (CPUs), the nodes are interconnected through a network, and each CPU can access both local memory resources and the memory resources of other nodes in the entire system.
  • The system is called a "non-uniform" access system because each processor accesses local memory resources faster than it accesses the memory resources of other nodes.
  • FIG. 1 is a basic structural diagram of a node in the prior art. It can be seen that the node has two processors (CPUs), and each processor is connected to the node controller through a Quick Path Interconnect (QPI) bus. The node controller is provided with network interfaces (NI, Network Interface), and the nodes in the system are interconnected and extended through the network interfaces NI, so that the memory resources in the entire system are shared.
  • When the node controller in a node receives, through a network interface, a packet sent by another node controller, the node controller performs address resolution on the received packet and, according to the parsed packet address, sends the packet to the corresponding processor for processing, to complete access to that processor's memory resource data. Similarly, when a processor needs to access a remote resource, the processor can send the message to the node controller through the QPI bus. The node controller queries the routing table according to the destination node address of the packet and sends the message to the next-hop node through the network interface corresponding to the destination node, so that the message finally reaches the destination node and the resource access is completed.
  • In order to share resources within the system, the single node controller needs to maintain the directory of the address space of the entire node and to route the packets that access both central processors, so system resource access efficiency is low. At the same time, when the QPI bus connection between a processor and the node controller fails, or the node controller itself fails, other nodes may be unable to send messages to the node and cannot access the resources in the node, or the processing of messages in the entire node may be interrupted. This in turn affects resource access across the entire system, so the reliability of system resource access is low.
  • The present invention provides a CC-NUMA-based message processing method, apparatus, and system to improve resource access efficiency and reliability.
  • To this end, the present invention provides the following technical solutions:
  • A CC-NUMA-based packet processing method, in which two node controllers are configured in a node and each node controller maintains a directory of its corresponding address space, where the method includes:
  • when the destination address of a received message is the local node, determining whether the directory of the address space corresponding to the message is maintained by the local node controller.
  • A CC-NUMA-based packet processing method, in which two node controllers are configured in a node and each node controller maintains a directory of its corresponding address space, where the method includes:
  • when the link between the local node controller and the processor is faulty, sending the packet, together with a directory that is maintained by the local node controller and contains at least the address space corresponding to the message, to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
  • A CC-NUMA-based packet processing method, in which two node controllers are configured in a node and each node controller maintains a directory of its corresponding address space, where the method includes:
  • determining the processor to which the address space corresponding to the message belongs, and sending the message to the processor.
  • The present invention also provides a CC-NUMA-based message processing apparatus. Two node controllers are configured in a node, each node controller maintains a directory of its corresponding address space, and the message processing apparatus includes:
  • a message receiving unit, configured to receive a packet sent by another node and perform address resolution on the packet;
  • an address analyzing unit, configured to determine, when the destination address of the packet is the local node, whether the directory of the address space corresponding to the packet is maintained by the local node controller, and if so, to trigger the operation of the processor address determining unit;
  • a processor address determining unit, configured to determine the processor to which the address space corresponding to the message belongs;
  • a message sending unit, configured to send the message to the processor through a fast channel interconnect bus connected to the processor, so that the processor processes the message.
  • A CC-NUMA-based message processing apparatus, in which two node controllers are configured in a node and each node controller maintains a directory of its corresponding address space, where the message processing apparatus includes:
  • a link failure acquiring unit, configured to obtain link failure information of the fast channel interconnect bus in the system;
  • a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
  • an address analyzing unit, configured to determine, when the destination address of the message is the local node, whether the directory of the address space corresponding to the message is maintained by the local node controller, and if so, to trigger the operation of the processor address determining unit;
  • a processor address determining unit, configured to determine the processor to which the address space corresponding to the message belongs;
  • a fault processing unit, configured to: when the link between the local node controller and the processor is faulty, send the message, together with a directory that is maintained by the local node controller and includes at least the address space corresponding to the packet, to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
  • A CC-NUMA-based message processing apparatus, in which two node controllers are configured in a node and each node controller maintains a directory of its corresponding address space, where the message processing apparatus includes:
  • a controller fault acquiring unit, configured to obtain fault information of the other node controller;
  • an address space data obtaining unit, configured to acquire, by broadcast snooping, the directory of the address space maintained by the other node controller;
  • a message receiving unit, configured to receive a packet sent by another node and perform address resolution on the packet;
  • an address analyzing unit, configured to determine, when the destination address of the packet is the local node, the processor to which the address space corresponding to the packet belongs;
  • a message sending unit, configured to send the message to the processor.
  • A CC-NUMA-based message processing system includes two node controllers and at least two processors;
  • the node controllers and the processors are connected by fast channel interconnect buses;
  • the two node controllers are connected through a network interface;
  • each node controller has a built-in CC-NUMA-based message processing apparatus as described above.
  • The embodiments of the present invention disclose a CC-NUMA-based message processing method, apparatus, and system. When a node controller receives a message, sent by another node, whose destination address is the local node, and the address space corresponding to the message belongs to the address space maintained by that node controller, the node controller sends the message to the processor through the fast channel interconnect bus. If the address space corresponding to the message does not belong to the address space maintained by the node controller, the node controller forwards the message to the other node controller so that the other node controller processes the message. Because the two node controllers in the node each maintain the data of a part of the address space, both node controllers can receive messages sent by other nodes and perform address resolution on the received messages, so that the messages are sent to the corresponding processors to complete resource access. This increases resource access speed and improves system performance.
  • In addition, the node controller can obtain QPI link failure information. When the link between the node controller and the processor to which the address space corresponding to the received message belongs has failed, the node controller can forward the packet and the directory information of the address space corresponding to the packet to the other node controller, and the other node controller processes the packet and sends it to the corresponding processor, so that communication interruption due to link failure is avoided.
  • FIG. 1 is a schematic diagram of the basic structure of a node in the prior art;
  • FIG. 2 is a schematic diagram of the basic structure of a node in the present invention;
  • FIG. 3 is a flowchart of an embodiment of a CC-NUMA-based message processing method according to an embodiment of the present invention;
  • FIG. 4 is a flowchart of another embodiment of a CC-NUMA-based packet processing method according to an embodiment of the present invention;
  • FIG. 5 is a flowchart of another embodiment of a CC-NUMA-based message processing method according to an embodiment of the present invention;
  • FIG. 6 is a flowchart of another embodiment of a CC-NUMA-based packet processing method according to an embodiment of the present invention;
  • FIG. 7 is a schematic structural diagram of a CC-NUMA-based message processing apparatus according to an embodiment of the present invention;
  • FIG. 8 is a schematic structural diagram of a CC-NUMA-based message processing apparatus according to an embodiment of the present invention;
  • FIG. 9 is a schematic structural diagram of a CC-NUMA-based message processing apparatus according to an embodiment of the present invention.
  • The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of the present invention.
  • The present invention configures two node controllers (NCs) in one node, and each node controller maintains a directory of its corresponding address space.
  • The address space is also called the system address and is used to locate the specific position of a resource in the CC-NUMA system.
  • Each node in the system is assigned a region of the address space, and the address space of the node is in turn divided among the processors in the node. For example, suppose that each node has two processors (CPUs). Node 1 in the system is allocated the address space 0-1 TB, and processor 1 and processor 2 in node 1 are each allocated 512 GB of that address space; node 2 is allocated the address space 1 TB-2 TB, of which processor 1 and processor 2 of node 2 are each allocated 512 GB; the address space of the other nodes in the system is allocated similarly.
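  • As an illustration only, the allocation example above can be sketched in Python. The function name `locate` is hypothetical, and the sketch assumes that the regions are contiguous and that processor 1 owns the lower 512 GB of each node's region, which the text does not state.

```python
from typing import Tuple

TB = 1 << 40
GB = 1 << 30

NODE_SPAN = 1 * TB     # each node is assigned 1 TB, as in the example above
CPU_SPAN = 512 * GB    # each of the two processors is assigned 512 GB


def locate(address: int) -> Tuple[int, int]:
    """Return (node number, processor number), both counted from 1."""
    node = address // NODE_SPAN + 1
    # Assumption: processor 1 owns the lower half of the node's region.
    cpu = (address % NODE_SPAN) // CPU_SPAN + 1
    return node, cpu
```

  • Under these assumptions, an address just above 1 TB resolves to processor 1 of node 2.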
  • FIG. 2 is a schematic diagram of the basic structure of a node in the present invention.
  • The node includes a node controller NC1, a node controller NC2, a first processor CPU1, and a second processor CPU2, where each processor is connected to the node controller NC1 and the node controller NC2 via a first fast channel interconnect bus QPI0 and a second fast channel interconnect bus QPI1, respectively.
  • The node controller NC1 and the node controller NC2 are connected through a network interface NI.
  • Each node controller has several network interfaces NI, and the node controllers in the node can be interconnected with the node controllers in other nodes through the network interfaces NI.
  • The invention divides the address space of the node into two regions, and each node controller maintains the directory of the address space of one region. For example, the address space of the node can be divided into a first address space and a second address space, where one node controller maintains the directory of the first address space and the other node controller maintains the directory of the second address space.
  • The first address space may include both a part of the address space allocated to the first processor and a part of the address space allocated to the second processor, and the second address space may likewise include both a part of the address space allocated to the first processor and a part of the address space allocated to the second processor.
  • Maintaining the directory of an address space means that the node controller records the data access activity on that address space and the data state of that address space. For example, when a processor in another node needs to access a resource held by a processor in the local node, the processor of the other node sends a resource access request in the form of a packet. After receiving the packet, the node controller of the local node records in the directory of the address space the processing status of the resource data that the packet requests to access (for example, the data state of the requested address space may be a modified state, an exclusive state, a shared state, and so on), together with information such as the node ID and processor ID of the sender of the message.
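  • A minimal sketch of such a directory follows. The class and field names (`LineState`, `DirectoryEntry`, `Directory`, `record`) are hypothetical; the text only says that the data state and the requesting node ID and processor ID are recorded, so the per-address dictionary is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set, Tuple


class LineState(Enum):
    MODIFIED = "modified"
    EXCLUSIVE = "exclusive"
    SHARED = "shared"


@dataclass
class DirectoryEntry:
    state: LineState
    # (node ID, processor ID) of every requester recorded for this address
    requesters: Set[Tuple[int, int]] = field(default_factory=set)


class Directory:
    """Directory kept by one node controller for its part of the address space."""

    def __init__(self) -> None:
        self.entries: Dict[int, DirectoryEntry] = {}

    def record(self, address: int, state: LineState, node_id: int, cpu_id: int) -> None:
        # Create the entry on first access, then store the latest state and requester.
        entry = self.entries.setdefault(address, DirectoryEntry(state))
        entry.state = state
        entry.requesters.add((node_id, cpu_id))
```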
  • Each node is provided with two node controllers, and each node controller is provided with a plurality of network interfaces NI connected to the node controllers of other nodes. Compared with the prior art, the number of NI interfaces that each node in the CC-NUMA system provides for connecting to other nodes is twice the number provided by each node in the prior art. Therefore, when the nodes of the CC-NUMA system are connected to each other to form a larger topology network, the average number of intermediate-node hops on the path of a packet from the source node to the destination node is reduced, which improves system performance.
  • FIG. 3 is a schematic flowchart of an embodiment of a CC-NUMA-based packet processing method according to the present invention. In this embodiment, two node controllers are configured in one node, the two node controllers are interconnected through a network interface, and each node controller maintains a directory of its corresponding address space. The method includes:
  • Step 301: The first node controller receives a packet sent by another node and performs address resolution on the packet.
  • When a processor of another node needs to access a resource in the local node, it may send a message to the local node, where the packet includes the address of the destination node to be accessed (destination node ID), the address of the destination processor (processor ID), the data request for the address space to be accessed, and so on.
  • After receiving a packet sent by another node through the network interface NI, the node controller in the node generally performs a cyclic redundancy check (CRC) and, after the CRC passes, performs address resolution on the received message, that is, decodes the address of the packet to determine whether the destination address of the packet is the local node.
  • If the destination address of the packet is not the local node, the node controller queries the routing table and forwards the packet to a node controller in another node. If the destination address of the packet is the local node, the node controller needs to perform the corresponding processing, so that the packet is finally sent to the processor in the local node that corresponds to the address space data the packet requests to access.
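  • The receive path of Step 301 can be sketched as follows. The frame layout (one byte of node ID, one byte of processor ID, a 64-bit address, and a CRC-32 trailer), the function names, and the choice to drop a frame that fails the check are all assumptions for illustration; the patent does not specify a message format or the handling of CRC failures.

```python
import struct
import zlib

# Hypothetical layout: destination node ID, destination processor ID, 64-bit address.
HEADER = struct.Struct(">BBQ")


def encode(dest_node: int, dest_cpu: int, address: int) -> bytes:
    """Build a frame and append its CRC-32 as a 4-byte trailer."""
    body = HEADER.pack(dest_node, dest_cpu, address)
    return body + struct.pack(">I", zlib.crc32(body))


def receive(frame: bytes, local_node: int):
    """Return ("drop" | "forward" | "local", decoded header or None)."""
    body, trailer = frame[:-4], frame[-4:]
    if len(frame) != HEADER.size + 4 or zlib.crc32(body) != struct.unpack(">I", trailer)[0]:
        return "drop", None  # CRC failed; dropping is an assumption
    dest_node, dest_cpu, address = HEADER.unpack(body)
    # Address resolution: is this node the destination, or only an intermediate hop?
    action = "local" if dest_node == local_node else "forward"
    return action, (dest_node, dest_cpu, address)
```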
  • Step 302: When the destination address of the packet is the local node, the first node controller determines whether the directory of the address space corresponding to the packet is maintained by the first node controller; if yes, the process proceeds to step 303; otherwise, the process proceeds to step 304.
  • When the first node controller determines, according to the parsed address, that the destination address of the packet is the local node, the first node controller needs to determine, according to the parsed address, the address space corresponding to the packet (that is, the address space that the packet needs to access) and whether the directory of that address space is maintained by the first node controller.
  • To determine which node controller maintains the directory of an address space, a parity bit of the address space may be preset, together with the correspondence between the value of the parity bit and the node controllers. Which bit of the address is used as the parity bit can be set as needed. For example, the sixth bit of the address can be set as the parity bit: when the parity bit is "0", the directory of the corresponding address space is maintained by the first node controller; when the parity bit is "1", the directory of the corresponding address space is maintained by the second node controller.
  • Alternatively, one of the 6th to 10th bits of the address may be selected as the parity bit.
  • The correspondence may also be defined over more than one bit; for example, when the values of the fifth and sixth bits of the address differ, the directory of the corresponding address space is maintained by the second node controller.
  • There are other ways to distinguish the directories of the address space maintained by the first node controller and the second node controller, which are not enumerated here.
  • Correspondingly, the first node controller may determine, according to the preset correspondence between the parity bit and the node controllers, whether the parity bit of the address space corresponding to the packet corresponds to the first node controller; if so, the directory of the address space corresponding to the message is maintained by the first node controller. For example, after receiving the packet, the first node controller performs address resolution on the packet; if it determines that the parity bit (such as the fifth bit of the address) in the address of the data resource that the packet requests to access is "0", the directory of the address space that the message needs to access is maintained by the first node controller; otherwise it is maintained by the second node controller.
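  • The parity-bit rule can be sketched as below. The function names are hypothetical, and the sketch assumes that bit positions are counted from 1 starting at the least significant bit, which the text does not specify. In the two-bit variant, mapping equal bits to the first node controller is an inference from there being only two controllers.

```python
PARITY_BIT = 6  # configurable; the text also mentions the fifth bit and bits 6 to 10


def bit(address: int, position: int) -> int:
    """Value of the bit at a 1-based position, counted from the least significant bit."""
    return (address >> (position - 1)) & 1


def owner_by_parity(address: int, position: int = PARITY_BIT) -> str:
    # Parity bit 0 -> first node controller, 1 -> second node controller.
    return "NC1" if bit(address, position) == 0 else "NC2"


def owner_by_two_bits(address: int) -> str:
    # Variant: fifth and sixth bits differ -> second node controller.
    return "NC2" if bit(address, 5) != bit(address, 6) else "NC1"
```

  • Because only one or two address bits are inspected, consecutive blocks of addresses alternate between the two controllers, which is consistent with each controller's region containing parts of both processors' address spaces.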
  • Step 303: The first node controller determines the processor to which the address space corresponding to the packet belongs, and sends the packet to that processor through the fast channel interconnect bus connected to the processor, so that the processor processes the packet.
  • The first node controller may determine, according to the result of the address resolution of the packet, the processor to which the address space that the packet needs to access belongs. After determining the processor in which the requested address space is located, the first node controller can query the routing table and, according to the pre-configured port routing path, send the message to the processor through the fast channel interconnect bus QPI that connects the first node controller to the processor, so that the processor processes the message.
  • Step 304: The first node controller forwards the packet to the second node controller in the local node, so that the second node controller determines the processor to which the address space corresponding to the packet belongs and sends the packet to that processor for packet processing.
  • The first node controller forwards the message to the second node controller through the network interface NI between the first node controller and the second node controller.
  • After receiving the packet, the second node controller performs address resolution on the packet and determines which processor in the node owns the address space that the packet needs to access. The second node controller then, based on the configured port path, forwards the message to that processor through the fast channel interconnect bus between the second node controller and the processor, and the processor processes the message.
  • In this embodiment, after receiving a packet sent by another node, a node controller in the node performs address resolution on the packet. If the directory of the address space corresponding to the packet is maintained by this node controller, the node controller further determines the processor to which the address space corresponding to the message belongs, and sends the message to the processor through the fast channel interconnect bus connected to the processor; if the directory of the address space corresponding to the message is not maintained by this node controller, the node controller forwards the message to the other node controller in the node, so that the other node controller processes the message.
  • Because the directory of the address space in the node is divided into two parts and each node controller maintains the data of one part of the address space, other nodes in the CC-NUMA system that need to access resources in a processor of the node can select, according to the routing path, which node controller in the node to send the message to. The two node controllers in the node can receive different message requests, perform address resolution of the messages separately, and each maintain the directory of its corresponding address space.
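  • The dispatch decision of Steps 302 to 304 can be sketched as follows. The `NodeController` class, its `log` list, and the use of bit 6 (counted from 1 at the least significant bit) as the parity bit are assumptions for illustration; determining the target processor is left out and the processor name is passed in directly.

```python
from typing import List, Optional


class NodeController:
    def __init__(self, name: str, parity: int) -> None:
        self.name = name
        self.parity = parity  # parity-bit value this controller is responsible for
        self.peer: Optional["NodeController"] = None
        self.log: List[str] = []

    def owns(self, address: int) -> bool:
        # Assumption: bit 6 (1-based, from the least significant bit) is the parity bit.
        return ((address >> 5) & 1) == self.parity

    def handle(self, address: int, cpu: str) -> str:
        """Deliver over QPI if this controller owns the directory, else hand over via NI."""
        if self.owns(address):
            self.log.append(f"{self.name} -> {cpu} over QPI")   # Step 303
            return self.name
        assert self.peer is not None, "peer controller must be wired up first"
        self.log.append(f"{self.name} -> {self.peer.name} over NI")  # Step 304
        return self.peer.handle(address, cpu)
```

  • The two controllers must be created with different parity values and pointed at each other through `peer`; otherwise a message whose directory neither controller owns would bounce between them indefinitely.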
  • After the first node controller sends the packet to the processor corresponding to the packet, the processor processes the packet according to the resource in the address space that the packet needs to access, and sends the processed packet to the first node controller, so that the first node controller returns the processed packet to the source node that sent the packet, that is, the node that sent the resource access request.
  • Similarly, when the packet was delivered by the second node controller, the processor returns the processed message to the second node controller, and the second node controller can query the routing table and select a network interface NI through which to return the processed message to the source node that initiated the resource access request.
  • FIG. 4 is a flowchart of another embodiment of a CC-NUMA-based message processing method according to an embodiment of the present invention. In this embodiment, the node is configured with two node controllers, and each node controller maintains the directory of its corresponding address space. The method in this embodiment includes:
  • Step 401: The first node controller receives a packet sent by another node and performs address resolution on the packet.
  • Step 402: When the destination address of the packet is the local node, the first node controller determines whether the directory of the address space corresponding to the packet is maintained by the first node controller; if yes, the process proceeds to step 403; otherwise, the process proceeds to step 404.
  • Step 403: The first node controller determines the processor to which the address space corresponding to the packet belongs, and sends the packet to the processor through a fast channel interconnect bus connected to the processor.
  • Step 404: The first node controller forwards the packet to the second node controller; the second node controller receives the packet, performs address resolution on the packet to determine the processor to which the address space corresponding to the packet belongs, maintains the directory of the address space corresponding to the message, and sends the message to the processor.
  • Step 405: After receiving the packet sent by the first node controller or the second node controller, the processor processes the packet and determines whether the directory of the address space corresponding to the packet is maintained by the first node controller. If yes, the processor sends the processed message to the first node controller, so that the first node controller returns the processed message to the source node; if not, the processor sends the processed message to the second node controller, so that the second node controller returns the processed message to the source node.
  • After receiving the packet, the processor processes the packet according to the data of the address space requested by the packet. Because the node controller needs to record information such as the data state of the address space accessed by the packet, the processed message still needs to be returned to the corresponding node controller, so that the node controller maintains the directory of the address space requested by the message. Therefore, after the processor finishes processing the packet, it needs to determine which node controller maintains the directory of the address space corresponding to the packet, and return the processed packet to that node controller. Finally, the node controller returns the message processed by the processor to the source node that sent the message.
  • After a node controller receives a message, it needs to determine whether the directory of the address space corresponding to the packet is maintained by the local node controller, and only if so does the local node controller send the message to the processor corresponding to the address space of the message. Therefore, the processor can also directly return the processed message to the node controller in the node that sent it the message. That is, when the first node controller sends the message to a processor in the node, the processor processes the message and returns the processed message to the first node controller; when the second node controller sends the message to a processor in the node, the processor processes the packet and returns the processed packet to the second node controller.
  • Step 406: After receiving the processed message returned by the processor, the first node controller or the second node controller updates the directory of the address space corresponding to the processed message and returns the processed message to the source node that sent the message.
  • In practice, when a node controller receives a packet sent by another node, it determines whether the directory of the address space corresponding to the packet is maintained by the local node controller; if so, the node controller needs to update the directory of the address space corresponding to the packet, for example, by recording the source node that sent the request. The node controller also needs to record the data state of the address space requested by the packet, and so on. Therefore, once the node controller has determined that the directory of the address space corresponding to a message sent by another node is maintained by the local node controller, and before the node controller returns the processed message to the source node, the node controller needs to maintain the directory of the address space corresponding to the packet.
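  • The return path of Steps 405 and 406 can be sketched as follows. The `ReturnPath` class is a hypothetical model of one node; the choice of bit 6 as the parity bit and the shape of a directory entry (state plus source node ID) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


class ReturnPath:
    """Hypothetical model of Steps 405-406 for one node with two node controllers."""

    def __init__(self) -> None:
        # One directory per node controller: address -> (data state, source node ID)
        self.directories: Dict[str, Dict[int, Tuple[str, int]]] = {"NC1": {}, "NC2": {}}
        self.sent: List[Tuple[str, int, int]] = []  # (controller, source node, address)

    @staticmethod
    def maintainer(address: int) -> str:
        # Assumption: bit 6 (1-based) is the parity bit; 0 -> NC1, 1 -> NC2.
        return "NC1" if ((address >> 5) & 1) == 0 else "NC2"

    def processor_reply(self, address: int, source_node: int, state: str) -> str:
        nc = self.maintainer(address)                         # Step 405: pick the controller
        self.directories[nc][address] = (state, source_node)  # Step 406: update the directory
        self.sent.append((nc, source_node, address))          # Step 406: return to the source
        return nc
```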
  • The above two embodiments take as an example a packet with which another node requests access to a resource of the local node, and describe the processing performed by the node controllers and the processor in the local node after the packet is received.
  • In practice, a processor in the node may also generate a resource access request packet, query the routing table, and select an optimal routing path. The optimal routing path determines whether the generated packet is sent to the first node controller or the second node controller. After the first node controller or the second node controller receives the packet sent by a processor in the node, it also queries its own routing table to determine through which externally connected NI interface of the node controller the message is forwarded.
  • In practice, the fast channel interconnect bus QPI between a node controller and a processor in a node may fail. When the link fails, messages cannot be transmitted over it for processing, and the communication for resource access is interrupted.
  • FIG. 5 is a schematic flowchart of another embodiment of a CC-NUMA-based packet processing method according to the present invention. This embodiment applies to the case in which the fast channel interconnect bus between a node controller and a processor fails. As in the above embodiments, two node controllers are configured in the node, and each node controller maintains the directory of its corresponding address space. The method includes:
  • Step 501: The first node controller receives a packet sent by another node and performs address resolution on the packet.
  • Step 502: When the first node controller determines that the destination address of the packet is this node and that the address space corresponding to the packet is within the address space maintained by the first node controller, it determines the processor to which the address space corresponding to the packet belongs.
  • Step 503: If the link of the QPI bus between the first node controller and the processor has failed, the first node controller sends the packet, together with the directory it maintains that contains at least the address space corresponding to the packet, to the second node controller, so that the second node controller maintains the directory of the address space corresponding to the packet and forwards the packet to the processor to which that address space belongs.
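  • Steps 501 to 503 can be sketched as follows. This is a hedged illustration only: the class `NodeController`, the constant `BLOCK_SIZE`, the address-to-processor mapping, and the choice to remove the directory entry from the first controller on handover are all assumptions made for the example, and the sketch assumes the peer controller's own QPI link to the processor is healthy.

```python
from dataclasses import dataclass, field
from typing import Optional

BLOCK_SIZE = 0x1000  # assumption: each processor owns one contiguous block


@dataclass
class NodeController:
    name: str
    directory: dict = field(default_factory=dict)  # address -> directory entry
    qpi_up: dict = field(default_factory=dict)     # processor id -> link healthy
    peer: Optional["NodeController"] = None
    delivered: list = field(default_factory=list)  # (processor id, address)

    def owner_of(self, address):
        """Step 502: find the processor that owns the address (hypothetical map)."""
        return address // BLOCK_SIZE

    def receive(self, address, payload):
        """Handle a packet already known to target this node and to fall in
        the address space this controller maintains."""
        cpu = self.owner_of(address)
        if self.qpi_up.get(cpu, False):
            self.delivered.append((cpu, address))
            return self.name
        # Step 503: the QPI link is down, so hand the packet and the
        # matching directory entry to the peer controller over the NI.
        entry = self.directory.pop(address, None)
        return self.peer.accept_handover(address, payload, entry)

    def accept_handover(self, address, payload, entry):
        """Peer side: take over directory maintenance, then forward."""
        if entry is not None:
            self.directory[address] = entry
        self.delivered.append((self.owner_of(address), address))
        return self.name
```

  • The return value names the controller that finally delivered the packet, which makes the failover visible to a caller or a test.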
  • The node controller in the node needs to learn of the link failure of the QPI bus between the node controller and the processor.
  • both processors in the node can obtain information about the link failure.
  • The interrupt source of the fault reports the fault information to the processor, and the processor notifies the node controllers in the node of the QPI link failure.
  • The node controller receives the QPI link fault information sent by the processor. For example, when the QPI bus between the first node controller and a processor in the node fails, the message data that the processor receives from the first node controller contains errors. When the number of erroneous data packets exceeds a preset value, the processor determines that the QPI bus between the first node controller and the processor is faulty, and notifies the first node controller and the second node controller of the detected link failure.
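  • The threshold-based detection in this example can be sketched as follows. The class name `QpiLinkMonitor`, the listener callbacks, and the event string are hypothetical; the text only states that a failure is declared once the number of erroneous packets exceeds a preset value and that both node controllers are then notified.

```python
class QpiLinkMonitor:
    """Counts erroneous packets on one QPI link and reports a failure once."""

    def __init__(self, threshold, listeners):
        self.threshold = threshold  # the preset value from the text
        self.listeners = listeners  # callables standing in for both node controllers
        self.error_count = 0
        self.failed = False

    def record_packet(self, ok):
        """Record one received packet; notify listeners when the link is judged faulty."""
        if ok or self.failed:
            return
        self.error_count += 1
        if self.error_count > self.threshold:
            self.failed = True
            for notify in self.listeners:
                notify("qpi-link-failed")
```

  • Latching `failed` ensures that each node controller is notified only once per failure, even if further corrupted packets arrive.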
  • The node controller can also itself detect a failure of a QPI link connected to it and report the fault to a processor in the node; the processor then notifies the other processors and the other node controller of the link failure. Alternatively, the node controller that detects the QPI link failure may notify the other node controller of the failure.
  • The processor or the first node controller invokes the basic input/output system (BIOS) program to reconfigure the port routing path over which the first node controller sends messages to the processor: the path between the first node controller and the processor is changed to a routing path through the network interface (NI) between the first node controller and the second node controller.
  • After the routing path between the first node controller and the first processor in the node has been reconfigured, suppose the first node controller receives a packet and determines that the directory of the address space corresponding to the packet is maintained by itself and that the packet is to be sent to the first processor. Because the QPI bus between the first node controller and the first processor has failed, the first node controller executes an interrupt handler and, according to the reconfigured port routing path, sends the packet and the directory it maintains that contains at least the address space corresponding to the packet to the second node controller through the network interface NI between the two node controllers.
  • After the second node controller receives the packet from the first node controller, it also performs address resolution on the packet to determine which processor in the node the address space of the packet belongs to, and sends the packet to that processor through the QPI bus connected to it, so that the processor processes the packet. After the processor finishes processing the packet, it returns the processed packet to the second node controller. Throughout the processing of the packet, the second node controller maintains the directory of the address space corresponding to the packet, and finally returns the processed packet from the processor to the source node that sent the data request.
  • Alternatively, the first node controller may send only the message to the second node controller. The second node controller performs address resolution on the packet and sends it to the processor that the request accesses.
  • A directory modification request is then sent to the first node controller through the NI interface with the first node controller, and the first node controller maintains the directory of the address space corresponding to the message.
  • When a QPI link fails, the node controller can obtain the QPI link fault information. When the first node controller receives a packet and determines the processor to which the address space corresponding to the packet belongs, and the QPI bus between that processor and the first node controller has failed, the first node controller may forward the packet and the directory information of the address space corresponding to the packet to the second node controller. The second node controller performs address resolution on the packet, determines which processor in the node the address space corresponding to the packet belongs to, and sends the packet to that processor through the QPI bus connected to it. Therefore, even if a QPI link in the node fails, a new routing path can be selected and the packet still reaches the corresponding processor, which avoids the communication interruption that the link failure would otherwise cause.
  • When a node controller in the node fails, it may be unable to establish a directory of the address space. For example, if the memory that stores the address space directory in the node controller fails, the node controller cannot reacquire the directory of the address space. In this case, other nodes may be unable to access the data resources held by the processors of the node, and the node cannot receive, process, or forward packets sent by other nodes, which affects resource access across the entire CC-NUMA system.
  • FIG. 6 is a schematic flowchart diagram of another embodiment of a CC-NUMA-based packet processing method according to the present invention.
  • Two node controllers are configured in a node, and each node controller maintains a directory of its corresponding address space.
  • the method in this embodiment includes:
  • Step 601 The first node controller acquires fault information of the second node controller.
  • When the second node controller's own directory is lost, or the memory inside the second node controller that stores the address space directory fails, the second node controller can no longer maintain the directory of the address space. The fault interrupt source then sends the fault information of the second node controller to a processor in the node, and the processor notifies the other processors in the node and the first node controller of the fault information of the second node controller.
  • the processor also sends the fault information of the second node controller to other nodes in the system. After the other node obtains the fault information of the second node controller, the packet routing path sent to the node is re-configured, and the second node controller will not receive the packet information sent by other nodes.
  • the second node controller when the second node controller detects its own fault and cannot establish or maintain a directory of the address space, the second node controller can directly notify the first node controller of its own fault information.
  • Step 602 The first node controller acquires a directory of an address space maintained by the second node controller by using broadcast snooping.
  • The first node controller broadcasts snoop requests to the other nodes in the system to obtain the data information of the address space maintained by the second node controller. The processors in the other nodes return the corresponding data information to the first node controller, so that the first node controller establishes the directory information of the address space originally maintained by the second node controller.
  • Step 603 When the first node controller receives the packet sent by the other node, the address resolution is performed on the packet.
  • Step 604 When the destination address of the packet is the node, the first node controller determines the processor to which the address space corresponding to the packet belongs, and sends the packet to the address space corresponding to the packet. processor.
  • The first node controller acquires, by broadcast snooping, the directory of the address space originally maintained by the second node controller. From this point on, the first node controller maintains the directory for the entire node.
  • When the first node controller receives a packet sent by another node, it no longer needs to determine whether the address space corresponding to the packet is maintained by the local node controller. It can directly determine the processor to which the address space corresponding to the packet belongs and send the packet to that processor through the QPI bus connected to it.
  • After a node controller in the node fails, the other node controller can obtain that node controller's fault information, obtain through broadcast snooping the data information of the address space it maintained, and process all messages sent to this node. This avoids interrupting communication across the entire CC-NUMA system because of the failure of a node controller within the node.
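  • The directory rebuild by broadcast snooping can be sketched as follows. The data shapes are assumptions made for illustration: `remote_nodes` stands in for the snoop responses of the other nodes, each reporting the cache state it holds per address, and `failed_range` stands in for the address space formerly maintained by the failed node controller. The function name `rebuild_directory` is hypothetical.

```python
def rebuild_directory(failed_range, remote_nodes):
    """Rebuild directory entries for the address space of a failed controller.

    failed_range: range of addresses the failed controller used to maintain.
    remote_nodes: dict mapping node id -> {address: cache state}, standing in
                  for the responses to a broadcast snoop.
    Returns a dict mapping address -> {node id: cache state}.
    """
    directory = {}
    for node_id, cached_lines in remote_nodes.items():
        for address, state in cached_lines.items():
            # Keep only the lines that belonged to the failed controller.
            if address in failed_range:
                directory.setdefault(address, {})[node_id] = state
    return directory
```

  • Addresses that no remote node reports are simply absent from the result, which corresponds to lines that are not cached outside the node.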
  • FIG. 7 is a schematic structural diagram of an embodiment of a CC-NUMA-based message processing apparatus according to the present invention.
  • Two node controllers are configured in a node, and each node controller maintains the directory of its corresponding address space.
  • the message processing device may be a node controller or a part of a node controller.
  • the message processing device includes: a message receiving unit 701, an address analyzing unit 702, a processor address determining unit 703, and a message sending unit 704.
  • the message receiving unit 701 is configured to receive a packet sent by another node, and perform address resolution on the packet.
  • the address analyzing unit 702 is configured to determine, when the destination address of the packet is the node, whether the directory of the address space corresponding to the packet is maintained by the local node controller, and if yes, perform the operation of the processor address determining unit.
  • the processor address determining unit 703 is configured to determine the processor to which the address space corresponding to the message belongs.
  • the message sending unit 704 is configured to send the message to the processor through a fast channel interconnect bus connected to the processor, so that the processor processes the message.
  • The operations of the message receiving unit and the address analyzing unit may be performed by the Rbox module, or by part of the Rbox module, in the node controller.
  • The packet may then be passed to the processor address determining unit, so that the operations of the processor address determining unit and the message sending unit are performed.
  • the CCM (Cache Coherence Module)
  • The corresponding QPI bus can be selected to send the message to the corresponding processor. Sending the message to the corresponding processor is completed by the QPI link layer management module (QPIL) connected to that processor, which selects the corresponding QPI bus and sends the message to the processor.
  • the address analysis unit 702 may include:
  • An address analysis sub-unit, configured to determine, according to a preset correspondence between the parity bit of an address space and a node controller, whether the parity bit of the address space corresponding to the packet corresponds to the local node controller; if so, the directory of the address space corresponding to the message is maintained by the local node controller.
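  • The parity-based ownership check can be sketched as follows. The text does not say which address bit serves as the parity bit, so this example assumes 64-byte cache lines and uses the lowest bit of the cache-line index; `CACHE_LINE_BITS`, `parity_of`, and `maintained_locally` are hypothetical names.

```python
CACHE_LINE_BITS = 6  # assumption: 64-byte cache lines


def parity_of(address):
    """Return 0 for even cache-line indexes and 1 for odd ones."""
    return (address >> CACHE_LINE_BITS) & 1


def maintained_locally(address, local_parity):
    """True if the directory for this address belongs to the local controller.

    local_parity is the preset parity (0 or 1) assigned to this controller.
    """
    return parity_of(address) == local_parity
```

  • With this split, consecutive cache lines alternate between the two node controllers, so each one maintains roughly half of the node's address space.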
  • the message processing device of the present embodiment further includes: an address space maintenance unit, configured to maintain the directory of the address space corresponding to the received message.
  • FIG. 8 is a schematic structural diagram of another embodiment of a CC-NUMA-based message processing apparatus according to the present invention.
  • The apparatus of this embodiment applies to the case where the QPI bus between a node controller and a processor fails.
  • Two node controllers are configured in the node, and each node controller maintains the directory of its corresponding address space.
  • the packet processing device includes: a link fault obtaining unit 801, a message receiving unit 802, an address analyzing unit 803, a processor address determining unit 804, and a fault processing unit 805.
  • the link fault obtaining unit 801 is configured to acquire a link connection fault of the fast channel interconnect bus in the system.
  • the message receiving unit 802 is configured to receive a packet sent by another node, and perform address resolution on the packet.
  • the address analyzing unit 803 is configured to: when the destination address of the packet is the node, determine whether the directory of the address space corresponding to the packet is maintained by the node controller, and if yes, perform the operation of the processor address determining unit.
  • the processor address determining unit 804 is configured to determine the processor to which the address space corresponding to the message belongs.
  • the fault processing unit 805 is configured to: when the link between the local node controller and the processor is faulty, send the message, together with the directory maintained by the local node controller that contains at least the address space corresponding to the message, to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
  • Depending on how the link fault is obtained, the link fault obtaining unit 801 includes: a link fault information receiving unit, configured to receive, from the processor, link fault information of the QPI bus between the node controller and the processor;
  • or a link fault detecting unit, configured to monitor the link of the QPI bus connected to the processor and detect a link fault of that bus.
  • FIG. 9 is a schematic structural diagram of another embodiment of a CC-NUMA-based message processing apparatus according to the present invention.
  • the message processing apparatus includes: a controller fault acquiring unit 901, an address space data acquiring unit 902, a message receiving unit 903, an address analyzing unit 904, and a message sending unit 905.
  • the controller fault acquiring unit 901 is configured to acquire fault information of another node controller.
  • the address space data obtaining unit 902 is configured to obtain, by broadcast snooping, a directory of an address space maintained by the another node controller.
  • the message receiving unit 903 is configured to receive a message sent by another node, and perform address resolution on the message.
  • the address analyzing unit 904 is configured to determine, when the destination address of the message is this node, the processor to which the address space corresponding to the message belongs.
  • the message sending unit 905 is configured to send the message to the processor to which the address space corresponding to the message belongs.
  • the controller fault acquiring unit 901 includes: a first fault information receiving unit, configured to receive fault information of the other node controller sent by a processor;
  • or a second fault information receiving unit, configured to receive fault information sent by the other node controller itself.
  • the present invention further provides a CC-NUMA-based message processing system, including two node controllers and at least two processors. Each node controller is connected to each processor through a QPI bus, and the two node controllers are connected to each other through a network interface.
  • Each node controller has a built-in CC-NUMA-based message processing apparatus as described in the above embodiments of the present invention.
  • The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
  • Where the description is relatively brief, refer to the relevant part of the method description.
  • the steps of a method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, a software module executed by a processor, or a combination of both.
  • The software module can be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Abstract

Disclosed are a message processing method, device and system based on CC-NUMA. A node controller in the present node performs address resolution on a message sent by another node. If the destination address of the message is the present node and the directory of the address space corresponding to the message is maintained by the node controller, the node controller determines the processor to which that address space belongs. If the link of the Quick Path Interconnect (QPI) bus between the node controller and the processor is normal, the node controller sends the message to the processor via that bus, so that the processor processes the message. If the link of the QPI bus between the node controller and the processor has failed, or the directory of the address space corresponding to the message is not maintained by the node controller, the node controller forwards the message to another node controller in the present node, and that node controller processes the message. The method can improve the efficiency and reliability of accessing system resources.

Description

CC-NUMA-based message processing method, apparatus, and system
Technical field
The present invention relates to the field of non-uniform memory access technologies, and in particular to a CC-NUMA-based message processing method, apparatus, and system.
Background
A cache coherence non-uniform memory access (CC-NUMA) system is an important architecture currently used in the design of large-scale parallel computers. In a CC-NUMA system, each node consists of a node controller and multiple central processing units (CPUs), and the nodes are interconnected through a network. Each CPU can access local memory resources as well as resources on other nodes in the system. Because each CPU accesses local memory resources faster than it accesses the memory resources of other nodes, the system is called a "non-uniform" access system.
In a CC-NUMA system, each node contains a node controller, which handles distributed memory sharing and cache coherence maintenance. FIG. 1 shows the basic structure of a node in the prior art. The node controller in the node serves two processors (CPUs), each of which is connected to the node controller through a Quick Path Interconnect (QPI) bus. The node controller is provided with network interfaces (NI), and the system is expanded by interconnecting the nodes through these network interfaces, so that memory resources are shared across the entire system.
When the node controller in a node receives, through a network interface, a message sent by another node controller, it performs address resolution on the received message and, according to the resolved address, sends the message to the corresponding processor for processing, which completes the access to that processor's memory resource data. Similarly, when a processor needs to access a remote resource, it can send a message to the node controller over the QPI bus. The node controller queries its routing table according to the destination node address of the message and sends the message to the next-hop node through the network interface corresponding to the destination node, until the message reaches the destination node and the resource access is completed.
To share resources within the system, this node controller must maintain the directory of the address space of the entire node and route the messages that access both CPUs, so system resource access efficiency is low. In addition, when the QPI bus connection between a processor and the node controller fails, or the node controller itself fails, other nodes may be unable to send messages to the node or access the resources in the node, or message processing for the entire node may be interrupted. This affects resource access across the entire system, so the reliability of system resource access is low.
Summary of the invention
In view of this, the present invention provides a CC-NUMA-based message processing method, apparatus, and system to improve the efficiency and reliability of resource access.
To achieve the above object, the present invention provides the following technical solutions:
A CC-NUMA-based message processing method, in which two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the method comprising:
receiving a message sent by another node, and performing address resolution on the message;
when the destination address of the message is this node, determining whether the directory of the address space corresponding to the message is maintained by this node controller;
if so, determining the processor to which the address space corresponding to the message belongs, and sending the message to the processor through the QPI bus connected to the processor, so that the processor processes the message;
if not, forwarding the message to the other node controller in this node, so that the other node controller determines the processor to which the address space corresponding to the message belongs and sends the message to that processor for processing.
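The decision in this method can be sketched as follows. This is a minimal illustration under stated assumptions: address spaces are modeled as Python `range` objects, the action labels are invented for the example, and `handle_local_packet` is a hypothetical name. The sketch assumes address resolution has already shown that the destination is this node.

```python
def handle_local_packet(address, maintained_ranges, cpu_ranges):
    """Decide what a node controller does with a packet addressed to its node.

    maintained_ranges: address ranges whose directory this controller maintains.
    cpu_ranges: dict mapping processor id -> address range it owns.
    Returns (action, processor id or None).
    """
    # Directory not maintained here: hand the packet to the other controller.
    if not any(address in r for r in maintained_ranges):
        return ("forward_to_peer_nc", None)
    # Directory maintained here: find the owning processor and use its QPI bus.
    for cpu, owned in cpu_ranges.items():
        if address in owned:
            return ("send_over_qpi", cpu)
    raise LookupError(f"no processor owns address {address:#x}")
```

Because both node controllers run the same check, either one can accept a packet from another node, which is how the method spreads the directory work across the two controllers.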
A CC-NUMA-based message processing method, in which two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the method comprising:
receiving a message sent by another node, and performing address resolution on the message;
when the destination address of the message is this node, determining whether the address space corresponding to the message is within the address space maintained by this node controller;
if so, determining the processor to which the address space corresponding to the message belongs, and, when the link of the QPI bus between this node controller and the processor has failed, sending the message, together with the directory maintained by this node controller that contains at least the address space corresponding to the message, to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
A CC-NUMA-based message processing method, in which two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the method comprising:
acquiring fault information of the other node controller;
acquiring, by broadcast snooping, the directory of the address space maintained by the other node controller;
receiving a message sent by another node, and performing address resolution on the message;
when the destination address of the message is this node, determining the processor to which the address space corresponding to the message belongs, and sending the message to the processor.
The present invention also provides a CC-NUMA-based message processing apparatus, in which two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the message processing apparatus comprising:
a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
an address analyzing unit, configured to determine, when the destination address of the message is this node, whether the directory of the address space corresponding to the message is maintained by this node controller, and if so, to trigger the operation of the processor address determining unit;
a processor address determining unit, configured to determine the processor to which the address space corresponding to the message belongs;
a message sending unit, configured to send the message to the processor through the QPI bus connected to the processor, so that the processor processes the message.
A CC-NUMA-based message processing apparatus, in which two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the message processing apparatus comprising:
a link fault obtaining unit, configured to obtain link connection faults of the QPI bus in the system;
a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
an address analyzing unit, configured to determine, when the destination address of the message is this node, whether the directory of the address space corresponding to the message is maintained by this node controller, and if so, to trigger the operation of the processor address determining unit;
a processor address determining unit, configured to determine the processor to which the address space corresponding to the message belongs;
a fault processing unit, configured to send, when the link between this node controller and the processor has failed, the message together with the directory maintained by this node controller that contains at least the address space corresponding to the message to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
A CC-NUMA-based message processing apparatus, in which two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the message processing apparatus comprising:
a controller fault acquiring unit, configured to acquire fault information of the other node controller;
an address space data acquiring unit, configured to acquire, by broadcast snooping, the directory of the address space maintained by the other node controller;
a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
an address analyzing unit, configured to determine, when the destination address of the message is this node, the processor to which the address space corresponding to the message belongs;
a message sending unit, configured to send the message to the processor.
A CC-NUMA-based message processing system, comprising two node controllers and at least two processors, wherein:
the node controllers are connected to the processors through QPI buses;
the two node controllers are connected to each other through a network interface;
each node controller has a built-in CC-NUMA-based message processing apparatus as described above.
As can be seen from the above technical solutions, the embodiments of the present invention disclose a CC-NUMA-based message processing method, apparatus, and system. In the method, when a node controller receives a message from another node whose destination address is this node, it determines whether the directory of the address space corresponding to the message is maintained by this node controller. If so, this node controller determines the processor to which the address space corresponding to the message belongs and sends the message to that processor through the QPI bus connected to it. If the address space corresponding to the message is not within the address space maintained by this node controller, the node controller forwards the message to the other node controller, which processes it. Because the two node controllers in the node each maintain the data of part of the address space, and both can receive messages sent by other nodes and resolve the addresses of the received messages in order to send them to the corresponding processors and complete the resource access, resource access becomes faster and system performance improves.
In addition, when a QPI link fails, the node controllers can obtain the QPI link failure information. When the QPI bus between the receiving node controller and the processor that owns the address space corresponding to a received message has failed, that node controller can forward the message, together with the directory information of the address space corresponding to the message, to the other node controller. The other node controller then processes the message and sends it to the corresponding processor, which avoids a communication interruption caused by the link failure.
When one node controller in the node fails, the other node controller can obtain the failure information of that node controller, acquire through broadcast snooping the directory information of the address space that the failed node controller maintained, and then process all messages sent to the node. This avoids a communication interruption of the entire CC-NUMA system caused by the failure of a node controller within the node.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the basic structure of a node in the prior art;
FIG. 2 is a schematic diagram of the basic structure of a node in the present invention;
FIG. 3 is a flowchart of one embodiment of a CC-NUMA-based message processing method disclosed in an embodiment of the present invention;
FIG. 4 is a flowchart of another embodiment of a CC-NUMA-based message processing method disclosed in an embodiment of the present invention;
FIG. 5 is a flowchart of another embodiment of a CC-NUMA-based message processing method disclosed in an embodiment of the present invention;
FIG. 6 is a flowchart of another embodiment of a CC-NUMA-based message processing method disclosed in an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a CC-NUMA-based message processing apparatus disclosed in one embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a CC-NUMA-based message processing apparatus disclosed in one embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a CC-NUMA-based message processing apparatus disclosed in one embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To solve the problems that may arise when a prior-art CC-NUMA system performs resource access, the present invention configures two node controllers (NC, Network Controller) in one node, and each node controller maintains the directory of its corresponding address space. The address space, also called the system address, is used to locate the specific position of a resource in the CC-NUMA system.
In a CC-NUMA system, each node is allocated one region of the address space, and the address space of a node is in turn divided among the processors in that node. For example, assume that each node contains two processors (CPUs). Node 1 in the system is allocated the address space from 0 to 1 TB, of which processor 1 and processor 2 in node 1 are each allocated 512 GB. Node 2 is allocated the address space from 1 TB to 2 TB, of which processor 1 and processor 2 of node 2 are each allocated 512 GB. The address spaces of the other nodes in the system are allocated in the same way.
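The allocation in this example can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `locate` is hypothetical, and the text does not say which half of a node's range goes to which processor, so the sketch assumes processor 1 receives the lower 512 GB.

```python
TB = 1 << 40
GB = 1 << 30

NODE_SPAN = 1 * TB    # each node owns 1 TB, as in the example above
CPU_SPAN = 512 * GB   # each of the two processors owns 512 GB


def locate(addr):
    """Return (node, processor), both counted from 1, for a system address.

    Assumption: processor 1 owns the lower half of its node's range.
    """
    if addr < 0:
        raise ValueError("address must be non-negative")
    node = addr // NODE_SPAN + 1
    cpu = (addr % NODE_SPAN) // CPU_SPAN + 1
    return node, cpu
```

For instance, `locate(512 * GB)` falls in node 1, processor 2 under these assumptions.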
In the prior art, one node contains one node controller, and that node controller has to maintain the directory of the address space of the entire node. In the present invention, one node contains two node controllers. FIG. 2 is a schematic diagram of the basic structure of a node in the present invention, taking a node with two processors as an example. The node includes node controller NC1, node controller NC2, a first processor CPU1, and a second processor CPU2. Each processor is connected to node controller NC1 and node controller NC2 through a first QuickPath Interconnect bus QPI0 and a second QuickPath Interconnect bus QPI1, respectively, and node controller NC1 and node controller NC2 are connected to each other through a network interface NI. Each node controller has several network interfaces NI, through which the node controllers in this node can be interconnected with the node controllers in other nodes. The present invention divides the address space of the node into two regions, and each node controller maintains the directory of one region. For example, the address space of the node may be divided into a first address space and a second address space, where one node controller maintains the directory of the first address space and the other node controller maintains the directory of the second address space.
It should be noted that the first address space may contain both part of the address space allocated to the first processor and part of the address space allocated to the second processor. Likewise, the second address space may contain both part of the address space allocated to the first processor and part of the address space allocated to the second processor.
Here, maintaining the directory of an address space means that the node controller records how the data in that address space is accessed and what state the data is in. For example, when a processor in another node needs to access a resource held by a processor in this node, the processor of the other node sends a resource access request in the form of a message. After the node controller of this node receives the message, it records in the directory of the address space the processing state of the resource data that the message requests (for example, the data state of the requested address space may be modified, exclusive, or shared), together with information such as the node ID and processor ID of the sender.
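A directory of this kind could be modeled as follows. This is a minimal sketch, not the layout used by the invention: the class names `LineState`, `DirectoryEntry`, and `Directory` and the method names are hypothetical, and only the fields named in the text (data state, sender node ID, sender processor ID) are represented.

```python
from dataclasses import dataclass, field
from enum import Enum


class LineState(Enum):
    MODIFIED = "modified"
    EXCLUSIVE = "exclusive"
    SHARED = "shared"


@dataclass
class DirectoryEntry:
    state: LineState
    # (node_id, processor_id) pairs of the requesters seen so far
    requesters: set = field(default_factory=set)


class Directory:
    """Per-controller directory keyed by system address (hypothetical layout)."""

    def __init__(self):
        self._entries = {}

    def record(self, addr, state, node_id, processor_id):
        # Create the entry on first access, then update state and requester.
        entry = self._entries.setdefault(addr, DirectoryEntry(state))
        entry.state = state
        entry.requesters.add((node_id, processor_id))
        return entry

    def lookup(self, addr):
        return self._entries.get(addr)
```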
In the present invention, each node has two node controllers, and each node controller has several network interfaces NI connected to the node controllers of other nodes. Each node in the CC-NUMA system therefore provides twice as many NI interfaces for connecting to other nodes as a node in the prior art. Consequently, when the nodes of the CC-NUMA system are interconnected into a larger topology and different nodes communicate, the average number of intermediate-node hops that a message traverses from the source node to the destination node decreases, which improves system performance.
Based on the node structure of the present invention, FIG. 3 is a schematic flowchart of one embodiment of a CC-NUMA-based message processing method of the present invention. In the present invention, two node controllers are configured in one node, the two node controllers are interconnected through a network interface, and each node controller maintains the directory of its corresponding address space. The method includes:
Step 301: The first node controller receives a message sent by another node and performs address resolution on the message.
When a processor in another node needs to access a resource in a processor of this node, the processor of the other node can send a message to this node. The message contains the address of the destination node to be accessed (destination node ID), the address of the destination processor (processor ID), the data request for the address space to be accessed, and so on. After a node controller in this node receives, through a network interface NI, a message sent by another node, it generally performs a cyclic redundancy check (CRC). If the CRC passes, it performs address resolution on the received message, that is, it decodes the address of the message to determine whether the destination address of the message is this node. If the destination address is not this node, the node controller looks up its routing table and forwards the message to a node controller in another node accordingly. If the destination address is this node, the node controller must process the message so that it is finally delivered to the processor in this node that corresponds to the address space data the message requests.
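The checks in step 301 can be sketched as below. Several parts are assumptions made for illustration: the `Message` fields and the function names are hypothetical, CRC-32 from Python's `zlib` stands in for whatever link-level CRC the hardware uses, and dropping a message that fails the CRC is assumed because the text does not say what happens in that case.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Message:
    dest_node: int    # destination node ID
    dest_cpu: int     # destination processor ID
    addr: int         # requested system address
    payload: bytes
    crc: int


def make_message(dest_node, dest_cpu, addr, payload):
    return Message(dest_node, dest_cpu, addr, payload, zlib.crc32(payload))


def step_301(local_node, msg, routing_table):
    """Return ("drop" | "forward" | "local", detail) for a received message."""
    if zlib.crc32(msg.payload) != msg.crc:
        return ("drop", None)                      # assumed handling of a CRC failure
    if msg.dest_node != local_node:
        # Not for this node: forward according to the routing table.
        return ("forward", routing_table[msg.dest_node])
    return ("local", None)                         # continue with step 302
```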
Step 302: When the destination address of the message is this node, the first node controller determines whether the directory of the address space corresponding to the message is maintained by the first node controller. If so, the method proceeds to step 303; if not, it proceeds to step 304.
When the first node controller determines from the resolved address that the destination address of the message is this node, it must further determine, from the resolved address, the address space corresponding to the message, that is, the address space the message needs to access, and whether the directory of that address space is maintained by the first node controller.
To determine whether the address space corresponding to a message is maintained by the first node controller or by the second node controller, a parity bit of the address space can be preset, together with the correspondence between the parity bit and the node controllers. Which bit of the address serves as the parity bit can be chosen as required. For example, the 6th bit of the address may be set as the parity bit, with the rule that when the parity bit is "0" the directory of the corresponding address space is maintained by the first node controller, and when the parity bit is "1" the directory of the corresponding address space is maintained by the second node controller. In that case, the directory of the first 32 consecutive addresses in the address space is maintained by the first node controller, the directory of the next 32 consecutive addresses is maintained by the second node controller, and the directories of the subsequent addresses likewise alternate between the first node controller and the second node controller.
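The single-bit rule can be sketched as follows. The sketch assumes that bits are counted from one, so the 6th bit has zero-based index 5 and toggles every 2^5 = 32 addresses, which matches the alternating blocks of 32 described above. The function name is hypothetical.

```python
SELECT_BIT = 5  # zero-based index of "the 6th bit" (assumes bits are counted from 1)


def owning_controller(addr, select_bit=SELECT_BIT):
    """Return 1 if the first node controller maintains the directory, else 2."""
    return 1 if (addr >> select_bit) & 1 == 0 else 2
```

Addresses 0 to 31 map to the first node controller, 32 to 63 to the second, 64 to 95 to the first again, and so on.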
Which bit of the address serves as the parity bit can be chosen as required. To distribute the directories maintained by the first and second node controllers reasonably, one bit from the 6th to the 10th bit of the address may be selected as the parity bit. Alternatively, two consecutive bits of the address may be selected as the parity bits. For example, the fifth and sixth bits of the address may be used together: when the fifth and sixth bits are both "0" or both "1", the directory of the corresponding address space is maintained by the first node controller, and when the fifth and sixth bits differ, the directory of the corresponding address space is maintained by the second node controller. Other ways of distinguishing the directories maintained by the first and second node controllers are also possible and are not enumerated here.
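The two-bit variant amounts to comparing two adjacent address bits. As a hedged sketch, again assuming bits are counted from one (so the fifth and sixth bits have zero-based indices 4 and 5) and using a hypothetical function name:

```python
def owning_controller_two_bits(addr, low_bit=4):
    """Return 1 when the two selected bits are equal, 2 when they differ.

    low_bit=4 selects zero-based bits 4 and 5, i.e. the "fifth and sixth"
    bits if bits are counted from 1 (an assumption).
    """
    b_low = (addr >> low_bit) & 1
    b_high = (addr >> (low_bit + 1)) & 1
    return 1 if b_low == b_high else 2
```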
After the first node controller has resolved the address of the received message, it can use the correspondence between the parity bit of the address space and the node controllers to determine whether the parity bit of the address space corresponding to the message corresponds to the first node controller. If it does, the directory of the address space corresponding to the message is maintained by the first node controller. For example, after receiving a message, the first node controller resolves its address; if the parity bit (for example, the fifth bit of the address) in the address of the data resource that the message requests is "0", the directory of the address space to be accessed is maintained by the first node controller, and otherwise it is maintained by the second node controller.
Step 303: The first node controller determines the processor to which the address space corresponding to the message belongs and sends the message to that processor through the QPI bus connected to the processor, so that the processor processes the message.
When the directory of the address space that the message requests is maintained by the first node controller, the first node controller can determine, from the result of resolving the message address, the processor to which the requested address space belongs. Once that processor is known, the first node controller looks up its routing table and, following the preconfigured port routing path, sends the message to the processor over the QPI bus that connects the first node controller to that processor, so that the processor processes the message.
Step 304: The first node controller forwards the message to the second node controller in this node, so that the second node controller determines the processor to which the address space corresponding to the message belongs and sends the message to that processor for processing.
If the directory of the address space that the message needs to access is not maintained by the first node controller, the first node controller forwards the message to the second node controller through the network interface NI between the two node controllers. After receiving the message, the second node controller resolves its address and determines which processor in this node owns the address space to be accessed. Following the preconfigured port path, the second node controller then forwards the message to that processor over the QPI bus between the second node controller and the processor, and the processor processes the message.
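Steps 302 to 304 can be condensed into one dispatch decision. The following is a sketch under assumptions rather than the controller's actual logic: the function `dispatch` and the callable `cpu_of` (which maps an address to its owning processor) are hypothetical, and ownership is decided with the single-bit rule using zero-based bit 5.

```python
def dispatch(nc_id, addr, cpu_of, select_bit=5):
    """Decide what controller `nc_id` (1 or 2) does with a local-node message.

    Returns ("qpi", cpu) when this controller maintains the directory and
    sends the message to the owning processor (step 303), or
    ("ni_to_peer", peer_id) when it forwards the message over NI (step 304).
    """
    owner = 1 if (addr >> select_bit) & 1 == 0 else 2
    if owner == nc_id:
        return ("qpi", cpu_of(addr))
    return ("ni_to_peer", owner)
```

The same function describes the second node controller's behavior once the forwarded message arrives: called with `nc_id=2`, it resolves to the QPI path.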
In the method of this embodiment, after a node controller in this node receives a message sent by another node, it resolves the address of the message. When the directory of the address space corresponding to the message is within the address space maintained by that node controller, the node controller further determines the processor to which that address space belongs and sends the message to the processor through the QPI bus connected to it. If the directory of the address space corresponding to the message is not maintained by that node controller, the node controller forwards the message to the other node controller in this node so that the other node controller processes it. Because the directory of the address space in a node is divided into two parts, with each node controller maintaining the data of one part, other nodes in the CC-NUMA system that need to access resources in a processor of this node can choose, according to the routing path, which node controller in this node to send the message to. The two node controllers in this node can receive different message requests, and each resolves message addresses, maintains the directory of its corresponding address space, and sends messages to the corresponding processors. Because the two node controllers work simultaneously to complete resource access, and each maintains a smaller directory than in the prior art, the node controllers process messages faster, which increases the speed of system resource access and thus improves system performance.
It should be noted that, in step 303, after the first node controller sends the message to the processor where the corresponding address space resides, the processor processes the message according to the data resources of the address space the message needs to access and sends the processed message to the first node controller, so that the first node controller returns the processed message to the source node that sent the message, that is, the node that issued the resource access request. Likewise, after the second node controller sends a message to the processor where the corresponding address space resides, the processor returns the processed message to the second node controller, and the second node controller can look up its routing table and select a network interface NI through which to return the processed message to the source node that initiated the resource access request.
FIG. 4 is a flowchart of another embodiment of a CC-NUMA-based message processing method disclosed in an embodiment of the present invention. Two node controllers are configured in the node of the present invention, and each node controller maintains the directory of its corresponding address space. The method of this embodiment includes:
Step 401: The first node controller receives a message sent by another node and performs address resolution on the message.
Step 402: When the destination address of the message is this node, the first node controller determines whether the directory of the address space corresponding to the message is maintained by the first node controller. If so, the method proceeds to step 403; if not, it proceeds to step 404.
Step 403: The first node controller determines the processor to which the address space corresponding to the message belongs and sends the message to that processor through the QPI bus connected to the processor.
Step 404: The first node controller forwards the message to the second node controller. The second node controller receives the message, resolves its address, determines the processor to which the address space corresponding to the message belongs, maintains the directory of that address space, and sends the message to the processor.
Step 405: After receiving the message sent by the first node controller or the second node controller, the processor processes the message and determines whether the directory of the address space corresponding to the message is maintained by the first node controller. If so, the processor sends the processed message to the first node controller, so that the first node controller returns the processed message to the source node. If not, the processor sends the processed message to the second node controller, so that the second node controller returns the processed message to the source node.
After receiving the message, the processor processes it according to the data of the address space the message requests. Because the node controller must record information such as the data state of the address space the message accesses, the processed message still has to be returned to the corresponding node controller so that the node controller can maintain the directory of the requested address space. Therefore, after finishing processing, the processor must determine which node controller maintains the directory of the address space corresponding to the message and return the processed message to that node controller. Finally, the node controller returns the processed message to the source node that sent the message.
It should be noted that a node controller, after receiving a message, determines whether the directory of the address space corresponding to the message is maintained by itself and, if so, sends the message to the processor where that address space resides. The processor can therefore return the message directly to the node controller in this node that sent it the message. In other words, when the first node controller sends a message to a processor in this node, the processor processes the message and returns the processed message to the first node controller; when the second node controller sends a message to a processor in this node, the processor processes the message and returns the processed message to the second node controller.
Step 406: After receiving the processed message returned by the processor, the first node controller or the second node controller updates the directory of the address space corresponding to the processed message and returns the processed message to the source node that sent the message.
It should be noted that, in the present invention, after a node controller receives a message sent by another node, it determines whether the directory of the address space corresponding to the message is maintained by this node controller. If so, this node controller must also update that directory, for example by recording information such as the source node that sent the request. When the processor that the message requests returns the processed message, this node controller must also record the data state requested by the message, and so on. Therefore, after a node controller determines that the directory of the address space corresponding to a message received from another node is maintained by itself, and before it returns the processed message to the source node, it must maintain the directory of the address space corresponding to the message.
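The directory bookkeeping around steps 405 and 406 can be illustrated with a toy controller. Everything here is a simplifying assumption: the class and method names are hypothetical, the directory is a plain dictionary, and the response is assumed to go back to the most recent requester of the address.

```python
class NodeController:
    """Toy model of the directory updates on the request and response paths."""

    def __init__(self, nc_id):
        self.nc_id = nc_id
        self.directory = {}  # addr -> {"state": ..., "requesters": [...]}

    def on_request(self, addr, src_node, src_cpu):
        # Record who asked before handing the message to the processor.
        entry = self.directory.setdefault(addr, {"state": None, "requesters": []})
        entry["requesters"].append((src_node, src_cpu))

    def on_response(self, addr, new_state):
        # Step 406: update the data state, then return the message to the source node.
        entry = self.directory[addr]
        entry["state"] = new_state
        return entry["requesters"][-1][0]  # node ID to send the response to
```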
The two embodiments above take as their example the case in which this node receives a message from another node requesting resource access, and describe how the node controllers and processors in this node handle the message after receiving it. When a processor in this node needs to access a resource on a processor in another node, that processor can likewise generate a resource access request message, look up its routing table, select the optimal routing path, and determine from that path whether to send the generated message to the first node controller or the second node controller. After the first or second node controller receives a message sent by a processor in this node, it also looks up its own routing table to determine through which of its externally connected NI interfaces to forward the message. This process is the same as the prior-art process by which a processor of a node sends a message to access resources in a processor of another node, and is not described again here.
In a CC-NUMA system, the QPI bus between a node controller and a processor within a node may fail. In the prior art, when the QPI bus between the node controller and a processor fails and the link becomes unusable, messages can no longer be sent or processed, and the communication for resource access is interrupted.
To avoid the communication interruption caused by a QPI link failure, FIG. 5 shows a schematic flowchart of another embodiment of a CC-NUMA-based message processing method of the present invention. This embodiment applies to the case in which the QPI bus between a node controller and a processor has failed. As in the embodiments above, two node controllers are configured in the node, and each node controller maintains the directory of its corresponding address space. The message processing method of this embodiment includes:
Step 501: The first node controller receives a packet sent by another node and performs address resolution on the packet.
Step 502: When the first node controller determines that the destination address of the packet is the local node and that the address space corresponding to the packet is within the address space maintained by the first node controller, it determines the processor to which the address space corresponding to the packet belongs.
The operations of steps 501 and 502 are the same as the corresponding operations described in the embodiments above and are not described again here.
Step 503: If the link of the QPI bus between the first node controller and that processor has failed, the first node controller sends the packet, together with the directory it maintains that at least covers the address space corresponding to the packet, to the second node controller, so that the second node controller maintains the directory of the address space corresponding to the packet and forwards the packet to the processor to which that address space belongs.
To determine that a link has failed, the node controllers in the node need to obtain the link failure of the QPI bus between a node controller and a processor. When any QPI link connecting a node controller and a processor within the node fails, both processors in the node can obtain the information that the link has failed.
When a QPI bus in the node fails, the interrupt source of the failure reports the failure information to a processor, the processor notifies the node controllers in the local node of the QPI link failure, and each node controller receives the QPI link failure information sent by the processor. For example, when the QPI bus between the first node controller and a processor in the node fails, the packet data that the processor receives from the first node controller contains errors. When the number of erroneous data packets exceeds a preset value, the processor determines that the QPI bus between the first node controller and the processor has failed, and it notifies both the first node controller and the second node controller of the detected link failure.
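The error-count rule described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `QpiLinkMonitor`, the threshold of 3, and the callback-style notification are hypothetical and do not come from the patent, which does not specify how the preset value is chosen or how the notification is delivered.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class QpiLinkMonitor:
    """Counts corrupted packets on one QPI link and reports a failure once."""
    link_id: str
    error_threshold: int  # the "preset value"; the concrete number is an assumption
    listeners: List[Callable[[str], None]] = field(default_factory=list)
    error_count: int = 0
    failed: bool = False

    def on_packet(self, corrupted: bool) -> None:
        # Ignore clean packets, and ignore everything after the link is marked failed.
        if self.failed or not corrupted:
            return
        self.error_count += 1
        # The link is declared failed when the count EXCEEDS the preset value.
        if self.error_count > self.error_threshold:
            self.failed = True
            for notify in self.listeners:
                notify(self.link_id)


# Hypothetical usage: the processor notifies both node controllers.
notified = []
monitor = QpiLinkMonitor(
    link_id="NC1-CPU1",
    error_threshold=3,
    listeners=[
        lambda link: notified.append(("NC1", link)),
        lambda link: notified.append(("NC2", link)),
    ],
)
for corrupted in [True, False, True, True, True, True]:
    monitor.on_packet(corrupted)
```

In this sketch the fourth corrupted packet trips the threshold, and both node controllers are notified exactly once.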
Of course, when a link connected to a node controller fails, the node controller may also detect the failure of the QPI link connected to it and report the failure to a processor in the node, and that processor notifies the other processors and the other node controller of the link failure. Alternatively, the node controller that detects the QPI link failure may notify the other node controller of the failure directly.
When the link between the first node controller and the processor fails, the processor or the first node controller calls a Basic Input/Output System (BIOS) routine to configure the port routing path. This changes the port routing path by which the first node controller sends packets to the processor: the path between the first node controller and the processor is changed to a routing path that passes through the network interface (NI) between the first node controller and the second node controller.
For example, when the QPI bus between the first node controller and the first processor in the local node fails, the routing path from the first node controller to the first processor is reconfigured. After receiving a packet, the first node controller determines that the directory of the address space corresponding to the packet is maintained by itself and that the packet is destined for the first processor. Because the QPI bus between the first node controller and the first processor has failed, the first node controller executes an interrupt handler and, following the reconfigured port routing path, sends the packet, together with the directory it maintains that at least covers the address space corresponding to the packet, to the second node controller through the network interface (NI) between them.
After receiving the packet forwarded by the first node controller, the second node controller also performs address resolution on the packet, determines which processor in the local node the packet's address space belongs to, and sends the packet to that processor over the QPI bus connected to it so that the processor can process the packet. After processing the packet, the processor returns the processed packet to the second node controller. Throughout the processing of the packet, the second node controller maintains the directory of the address space corresponding to the packet, and it finally returns the processed packet to the source node that sent the data request packet.
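The hand-over in step 503 can be sketched as a small model. Everything named here is hypothetical: the `NodeController` class, the address-to-processor split at `0x8000`, and the representation of a directory entry as a set of requesting nodes are assumptions made for illustration, not details given in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Packet:
    source_node: str
    address: int


@dataclass
class NodeController:
    name: str
    # address -> nodes that requested it (a simplified stand-in for a directory entry)
    directory: Dict[int, Set[str]] = field(default_factory=dict)
    # processors this controller can no longer reach over QPI
    failed_links: Set[str] = field(default_factory=set)
    peer: Optional["NodeController"] = field(default=None, repr=False, compare=False)
    delivered: List[Tuple[str, int]] = field(default_factory=list)

    def owner_of(self, address: int) -> str:
        # Assumed layout: lower addresses belong to CPU0, the rest to CPU1.
        return "CPU0" if address < 0x8000 else "CPU1"

    def receive(self, packet: Packet) -> None:
        # Record the requester in the directory entry this controller maintains.
        self.directory.setdefault(packet.address, set()).add(packet.source_node)
        cpu = self.owner_of(packet.address)
        if cpu in self.failed_links and self.peer is not None:
            # QPI link is down: send packet plus its directory entry over the NI.
            entry = self.directory.pop(packet.address)
            self.peer.accept_forwarded(packet, entry)
        else:
            self.delivered.append((cpu, packet.address))

    def accept_forwarded(self, packet: Packet, entry: Set[str]) -> None:
        # The peer takes over the directory entry and delivers via its own QPI link.
        self.directory.setdefault(packet.address, set()).update(entry)
        self.delivered.append((self.owner_of(packet.address), packet.address))


nc1, nc2 = NodeController("NC1"), NodeController("NC2")
nc1.peer = nc2
nc1.failed_links.add("CPU1")           # NC1 <-> CPU1 QPI link has failed
nc1.receive(Packet("node-B", 0x9000))  # rerouted through NC2
nc1.receive(Packet("node-B", 0x1000))  # NC1 <-> CPU0 still works
```

After these two calls, the entry for `0x9000` lives only in `nc2.directory`, while the packet for `0x1000` is delivered by `nc1` as usual.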
It should be noted that when the link of the QPI bus between the first node controller and the processor fails, the first node controller may instead send only the packet to the second node controller. The second node controller performs address resolution on the packet and sends it to the processor whose resource is requested. When the second node controller needs to read, modify, or write back the directory content for the packet's address space, it sends the directory modification request to the first node controller over the NI interface between them, and the first node controller maintains the directory of the address space corresponding to the packet.
In this embodiment, when a QPI link within the node fails, the node controllers can obtain the QPI link failure information. When the first node controller receives a packet, it determines the processor to which the address space corresponding to the packet belongs. If the QPI bus between that processor and the first node controller has failed, the first node controller can forward the packet, together with the directory information of the address space corresponding to the packet, to the second node controller. The second node controller performs address resolution on the packet, determines which processor in the local node the corresponding address space belongs to, and sends the packet to that processor over the QPI bus connected to it. Therefore, even after a QPI link within the node fails, a new routing path can still be selected to deliver the packet to the corresponding processor, which avoids the communication interruption that the link failure would otherwise cause.
In the prior art, when a node controller in the local node fails, that node controller cannot establish the directory of its address space. For example, if the memory in the node controller that stores the address space directory fails, the node controller cannot reacquire the directory. In this case, other nodes may be unable to access the data resources held by the processors of the local node, and the local node cannot receive, process, or forward packets sent by other nodes, which affects resource access across the entire CC-NUMA system. To solve this problem, refer to FIG. 6, which is a schematic flowchart of another embodiment of the CC-NUMA-based packet processing method of the present invention. In this embodiment, two node controllers are configured in the node, and each node controller maintains the directory of its corresponding address space. This embodiment applies to the case in which one of the node controllers in the node has failed. The method of this embodiment includes:
Step 601: The first node controller obtains the failure information of the second node controller.
When the second node controller loses its own directory, or the memory inside the second node controller that stores the address space directory fails, so that the second node controller can no longer maintain the directory of the address space, the fault interrupt source reports the failure information of the second node controller to a processor in the local node. That processor notifies the other processors in the node and the first node controller of the failure. The processor also sends the failure information of the second node controller to the other nodes in the system. After obtaining this failure information, the other nodes reconfigure the routing paths of packets destined for this node, so that the second node controller no longer receives packets sent by other nodes.
Of course, when the second node controller detects its own failure and cannot establish or maintain the directory of the address space, it may also notify the first node controller of its failure information directly.
Step 602: The first node controller obtains, through broadcast snooping, the directory of the address space maintained by the second node controller.
So that the failure of the second node controller does not affect resource access requests from other nodes, the first node controller uses broadcast snooping to send the other nodes in the system a request for the data information of the address space maintained by the second node controller. The processors in the other nodes return the corresponding data information to the first node controller, so that the first node controller can establish the directory information of the address space originally maintained by the second node controller.
Step 603: When the first node controller receives a packet sent by another node, it performs address resolution on the packet.
Step 604: When the destination address of the packet is the local node, the first node controller determines the processor to which the address space corresponding to the packet belongs and sends the packet to that processor.
After the second node controller fails, the first node controller obtains, through broadcast snooping, the directory of the address space originally maintained by the second node controller. From then on, the first node controller maintains the directories of all address spaces in the node. When it receives a packet sent by another node, it no longer needs to determine whether the address space corresponding to the packet is maintained by this node controller. It can directly determine the processor to which the address space corresponding to the packet belongs and send the packet to that processor over the QPI bus connected to it.
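As a rough illustration of step 602, the directory rebuild can be modeled as collecting snoop responses and keeping only the addresses that fall in the failed controller's range. The function name `rebuild_directory`, the contiguous address range, and the shape of the snoop responses are assumptions for this sketch. The patent does not define the snoop message format.

```python
from typing import Dict, Iterable, Set


def rebuild_directory(
    failed_range: range,
    snoop_responses: Dict[str, Iterable[int]],
) -> Dict[int, Set[str]]:
    """Rebuild address -> holder-node entries from broadcast snoop responses.

    snoop_responses maps each remote node to the addresses it reports holding.
    Only addresses inside failed_range (the failed controller's address space,
    assumed contiguous here) are recorded.
    """
    directory: Dict[int, Set[str]] = {}
    for node, held_addresses in snoop_responses.items():
        for address in held_addresses:
            if address in failed_range:
                directory.setdefault(address, set()).add(node)
    return directory


# The surviving controller already tracks its own range...
own_directory = {0x0040: {"node-B"}}
# ...and rebuilds the failed controller's range from the responses.
responses = {"node-B": [0x0040, 0x8040], "node-C": [0x8040, 0x8080]}
recovered = rebuild_directory(range(0x8000, 0x10000), responses)
# After merging, one controller covers every address space in the node.
merged = {**own_directory, **recovered}
```

Once `merged` is in place, the ownership check of the earlier embodiments becomes unnecessary, which matches the simplified steps 603 and 604.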
In this embodiment, after one of the node controllers fails, the other node controller can obtain that node controller's failure information, obtain through broadcast snooping the data information of the address space it maintained, and then process all packets destined for the local node. This avoids an interruption of communication across the entire CC-NUMA system caused by the failure of a node controller within a node.
Refer to FIG. 7, which shows an embodiment of a CC-NUMA-based packet processing apparatus of the present invention. In this embodiment, two node controllers are configured in the node, and each node controller maintains its corresponding address space. The packet processing apparatus may be a node controller or part of a node controller. The packet processing apparatus includes: a packet receiving unit 701, an address analysis unit 702, a processor address determining unit 703, and a packet sending unit 704.
The packet receiving unit 701 is configured to receive a packet sent by another node and perform address resolution on the packet.
The address analysis unit 702 is configured to, when the destination address of the packet is the local node, determine whether the directory of the address space corresponding to the packet is maintained by this node controller and, if so, trigger the operation of the processor address determining unit.
The processor address determining unit 703 is configured to determine the processor to which the address space corresponding to the packet belongs.
The packet sending unit 704 is configured to send the packet to the processor over the QPI bus connected to the processor, so that the processor processes the packet.
In a node controller, the operations of the packet receiving unit and the address analysis unit may be performed by the Rbox module of the node controller or by part of the Rbox module. Once the node controller determines that the directory of the address space corresponding to the received packet is maintained by this node controller, the packet can be passed to the processor address determining unit, and the operations of the processor address determining unit and the packet sending unit are performed. In the node controller, these two units may be implemented by the cache coherence module (CCM) or by part of the CCM. Once the processor corresponding to the packet has been determined, the corresponding QPI bus can be selected to send the packet to that processor. To complete the transfer, the QPI link layer management module (QPIL) connected to the processor selects the corresponding QPI bus and sends the packet to the processor.
It should be noted that there are several specific ways to determine which node controller maintains the directory of the address space corresponding to a packet. Correspondingly, the address analysis unit 702 may include:
An address analysis subunit, configured to determine, according to a preset correspondence between parity bits of address spaces and node controllers, whether the parity bit of the address space corresponding to the packet corresponds to this node controller; if so, the directory of the address space corresponding to the packet is maintained by this node controller.
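One possible reading of the parity-bit correspondence is sketched below. The patent does not say how the parity bit is derived from an address, so this sketch assumes it is the XOR of the bits of the cache-line index, with 64-byte lines. The names `parity_bit`, `maintained_by`, and `handle` and the mapping table are hypothetical. A simple odd/even interleave of line indices would be an equally plausible interpretation.

```python
LINE_SHIFT = 6  # assumed 64-byte cache lines; not specified in the patent

# Preset correspondence between parity bit and node controller (assumed values).
PARITY_TO_CONTROLLER = {0: "NC1", 1: "NC2"}


def parity_bit(address: int) -> int:
    """Return the XOR of all bits of the cache-line index (0 or 1)."""
    line_index = address >> LINE_SHIFT
    return bin(line_index).count("1") & 1


def maintained_by(address: int) -> str:
    """Look up which node controller maintains the directory for this address."""
    return PARITY_TO_CONTROLLER[parity_bit(address)]


def handle(controller: str, address: int) -> str:
    """Deliver locally if this controller owns the directory, else forward to the peer."""
    return "deliver" if maintained_by(address) == controller else "forward"


decisions = [handle("NC1", a) for a in (0x0000, 0x0040, 0x00C0)]
```

Here line indices 0 and 3 have even parity and map to `NC1`, while line index 1 has odd parity and maps to `NC2`, so `NC1` forwards only the second packet.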
When a node controller receives a packet whose destination address is the local node and determines that the directory of the address space corresponding to the packet is maintained by this node controller, it also needs to record the address of the source node that sent the packet to access the address space, update the data state of the address space, and so on. Therefore, the packet processing apparatus of this embodiment further includes an address space maintenance unit, configured to maintain the directory of the address space corresponding to the received packet.
Further, after this node controller sends the packet to the processor corresponding to the packet, the processor processes the packet according to the request information in the packet and returns the processed packet to the node controller. Correspondingly, the packet processing apparatus further includes a packet return unit, configured to receive the processed packet returned by the processor and send the processed packet to the source node corresponding to the packet.
Refer to FIG. 8, which is a schematic structural diagram of another embodiment of a CC-NUMA-based packet processing apparatus of the present invention. The apparatus of this embodiment applies after the QPI bus between a node controller and a processor has failed. In this embodiment, two node controllers are configured in the node, and each node controller maintains its corresponding address space. The packet processing apparatus includes: a link failure obtaining unit 801, a packet receiving unit 802, an address analysis unit 803, a processor address determining unit 804, and a failure handling unit 805.
The link failure obtaining unit 801 is configured to obtain link connection failures of the QPI buses in the system.
The packet receiving unit 802 is configured to receive a packet sent by another node and perform address resolution on the packet.
The address analysis unit 803 is configured to, when the destination address of the packet is the local node, determine whether the directory of the address space corresponding to the packet is maintained by this node controller and, if so, trigger the operation of the processor address determining unit.
The processor address determining unit 804 is configured to determine the processor to which the address space corresponding to the packet belongs.
The failure handling unit 805 is configured to, when the link between this node controller and the processor has failed, send the packet, together with the directory maintained by this node controller that at least covers the address space corresponding to the packet, to the other node controller, so that the other node controller forwards the packet to the processor and maintains the directory of the address space corresponding to the packet.
Depending on how the link failure is obtained, the link failure obtaining unit 801 includes:
A link failure information receiving unit, configured to receive the link failure information, sent by a processor, for the QPI bus between a node controller and that processor.
A link failure detection unit, configured to monitor the link of the QPI bus connected to a processor and obtain the link failure of that QPI bus.
It should be noted that after the other node controller receives the packet forwarded by this node controller, it also performs address resolution on the packet, determines which processor in the local node the packet's address space belongs to, and sends the packet to that processor over the QPI bus connected to it so that the processor can process the packet. After processing the packet, the processor returns the processed packet to the other node controller, which maintains the directory of the address space corresponding to the packet and returns the processed packet to the source node that sent the packet.
Refer to FIG. 9, which is a schematic structural diagram of another embodiment of a CC-NUMA-based packet processing apparatus of the present invention. The apparatus of this embodiment applies when a node controller has failed. In this embodiment, two node controllers are configured in the node, and each node controller maintains its corresponding address space. The packet processing apparatus includes: a controller failure obtaining unit 901, an address space data obtaining unit 902, a packet receiving unit 903, an address analysis unit 904, and a packet sending unit 905.
The controller failure obtaining unit 901 is configured to obtain the failure information of the other node controller.
The address space data obtaining unit 902 is configured to obtain, through broadcast snooping, the directory of the address space maintained by the other node controller.
The packet receiving unit 903 is configured to receive a packet sent by another node and perform address resolution on the packet.
The address analysis unit 904 is configured to, when the destination address of the packet is the local node, determine the processor to which the address space corresponding to the packet belongs.
The packet sending unit 905 is configured to send the packet to the processor to which the address space corresponding to the packet belongs.
Depending on how the controller failure is obtained, the controller failure obtaining unit 901 includes:
A first failure information receiving unit, configured to receive the failure information of the other node controller sent by a processor.
A second failure receiving unit, configured to receive the failure information sent by the other node controller itself.
In addition, the present invention further provides a CC-NUMA-based packet processing system, including two node controllers and at least two processors. Each node controller is connected to each processor through a QPI bus, and the two node controllers are connected to each other through a network interface.
Each node controller incorporates the CC-NUMA-based packet processing apparatus described in the embodiments above.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A CC-NUMA-based packet processing method, wherein two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the method comprising:
receiving a packet sent by another node, and performing address resolution on the packet;
when the destination address of the packet is the local node, determining whether the directory of the address space corresponding to the packet is maintained by this node controller;
if so, determining the processor to which the address space corresponding to the packet belongs, and sending the packet to the processor over the QPI bus connected to the processor, so that the processor processes the packet;
if not, forwarding the packet to the other node controller in the local node, so that the other node controller determines the processor to which the address space corresponding to the packet belongs and sends the packet to that processor for processing.
2. The method according to claim 1, wherein the determining whether the directory of the address space corresponding to the packet is maintained by this node controller comprises:
determining, according to a preset correspondence between parity bits of address spaces and node controllers, whether the parity bit of the address space corresponding to the packet corresponds to this node controller, and if so, the directory of the address space corresponding to the packet is maintained by this node controller.
3. The method according to claim 1, wherein, after it is determined that the address space corresponding to the packet is maintained by this node controller, the method further comprises: maintaining the directory information of the address space corresponding to the packet.
4. The method according to claim 1, further comprising, after the local node controller sends the message to the processor:
receiving the processed message returned by the processor, and sending the processed message to the source node corresponding to the message.
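The dispatch decision recited in claims 1 to 3 can be sketched as follows. This is only an illustrative model, not the claimed implementation: the class and field names are hypothetical, and the "parity bit" of claim 2 is interpreted here as odd/even interleaving on the cache-line index with 64-byte lines, which is an assumption the claims do not state.

```python
from dataclasses import dataclass

CACHE_LINE_BITS = 6  # assumption: 64-byte cache lines


@dataclass(frozen=True)
class Message:
    src_node: int
    dst_node: int
    address: int


class NodeController:
    """Hypothetical model of one of the two node controllers in a node."""

    def __init__(self, node_id, nc_id, cpu_ranges):
        self.node_id = node_id
        self.nc_id = nc_id            # 0 or 1 within the node
        self.cpu_ranges = cpu_ranges  # {cpu name: range of addresses it homes}
        self.peer = None              # the other node controller in the node
        self.directory = {}           # address -> set of requesting nodes

    def owns_directory(self, address):
        # Assumed reading of claim 2: the lowest bit of the cache-line
        # index selects which controller maintains the directory.
        return ((address >> CACHE_LINE_BITS) & 1) == self.nc_id

    def home_cpu(self, address):
        for cpu, addresses in self.cpu_ranges.items():
            if address in addresses:
                return cpu
        raise LookupError(f"no local processor homes {address:#x}")

    def receive(self, msg):
        if msg.dst_node != self.node_id:
            return None  # not addressed to this node; outside this sketch
        if not self.owns_directory(msg.address):
            # Claim 1, "if not" branch: hand the message to the peer.
            return self.peer.receive(msg)
        # Claim 3: update the directory for the addressed line.
        self.directory.setdefault(msg.address, set()).add(msg.src_node)
        # Claim 1, "if so" branch: deliver to the home processor.
        return (self.nc_id, self.home_cpu(msg.address))
```

Because exactly one of the two controllers owns any given address under this mapping, a forwarded message is always handled by the peer and never bounces back.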
5. A CC-NUMA-based message processing method, wherein two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the method comprising:
receiving a message sent by another node, and performing address resolution on the message;
when the destination address of the message is the local node, determining whether the address space corresponding to the message is within the address space maintained by the local node controller;
if so, determining the processor to which the address space corresponding to the message belongs, and, when the link of the QuickPath Interconnect (QPI) bus between the local node controller and the processor fails, sending the message, together with a directory that is maintained by the local node controller and that contains at least the address space corresponding to the message, to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
6. The method according to claim 5, wherein obtaining the link failure of the QuickPath Interconnect (QPI) bus between the local node controller and the processor comprises:
receiving, from the processor, link failure information for the QPI bus between the node controller and the processor.
7. The method according to claim 5, wherein obtaining the link failure of the QuickPath Interconnect (QPI) bus between the local node controller and the processor comprises:
checking the link of the QPI bus to the processor to obtain the link failure of the QPI bus.
8. The method according to claim 5, wherein the other node controller forwarding the message to the processor comprises:
the other node controller performing address resolution on the message, determining the processor to which the address space of the message belongs, and sending the message to the processor over the QuickPath Interconnect (QPI) bus connected to the processor, so that the processor processes the message.
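The link-failure handover of claims 5 to 8 can be illustrated with a small model. It is a sketch under stated assumptions: `LinkAwareController`, `report_link_failure`, `deliver`, and `take_over` are hypothetical names, the directory is reduced to a plain dictionary, and only the single directory entry for the addressed line is handed over (the claim requires "at least" that entry).

```python
class LinkAwareController:
    """Hypothetical node controller that tracks per-processor link state."""

    def __init__(self, name, directory=None):
        self.name = name
        self.directory = dict(directory or {})  # address -> directory entry
        self.link_up = {}                       # cpu name -> bool
        self.peer = None                        # the other node controller

    def report_link_failure(self, cpu):
        # Claims 6 and 7: the failure may be reported by the processor
        # or detected by the controller itself; both end up here.
        self.link_up[cpu] = False

    def deliver(self, address, cpu):
        if self.link_up.get(cpu, True):
            return f"{self.name}->{cpu}"
        # Claim 5: the link is down, so pass the message and the
        # matching directory entry to the other node controller.
        entry = self.directory.pop(address, None)
        return self.peer.take_over(address, entry, cpu)

    def take_over(self, address, entry, cpu):
        # The peer now maintains the directory entry and forwards the
        # message over its own link to the processor (claim 8).
        self.directory[address] = entry
        return f"{self.name}->{cpu}"
```

In this model the failing controller gives up ownership of the entry it hands over, so only one controller maintains a given directory entry at any time.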
9. A CC-NUMA-based message processing method, wherein two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the method comprising:
obtaining failure information of the other node controller;
obtaining, by broadcast snooping, the directory of the address space maintained by the other node controller;
receiving a message sent by another node, and performing address resolution on the message;
when the destination address of the message is the local node, determining the processor to which the address space corresponding to the message belongs, and sending the message to the processor.
10. The method according to claim 9, wherein the obtaining failure information of the other node controller comprises:
receiving failure information of the other node controller sent by a processor in the local node.
11. The method according to claim 9, wherein the obtaining failure information of the other node controller comprises:
receiving failure information of the other node controller sent by the other node controller itself.
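The controller-failure case of claims 9 to 11 can be sketched in the same style. The names below are hypothetical, and two modelling assumptions are made that the claims do not spell out: directory ownership is split by the odd/even cache-line index, and a "broadcast snoop" is modelled as asking every caching agent which lines it currently holds.

```python
CACHE_LINE_BITS = 6  # assumption: 64-byte cache lines


class CachingAgent:
    """Hypothetical agent (e.g. a remote node) that answers a snoop."""

    def __init__(self, name, lines):
        self.name = name
        self.lines = set(lines)

    def snoop(self):
        return set(self.lines)


class SurvivingController:
    """Hypothetical node controller that outlives its peer."""

    def __init__(self, node_id, nc_id, cpu_ranges):
        self.node_id = node_id
        self.nc_id = nc_id
        self.cpu_ranges = cpu_ranges  # {cpu name: range of addresses}
        self.directory = {}           # address -> set of holder names
        self.peer_failed = False

    def _peer_owned(self, address):
        return ((address >> CACHE_LINE_BITS) & 1) != self.nc_id

    def on_peer_failure(self, agents):
        # Claim 9: after learning of the failure, rebuild the peer's
        # directory from the answers to a broadcast snoop.
        self.peer_failed = True
        for agent in agents:
            for address in agent.snoop():
                if self._peer_owned(address):
                    self.directory.setdefault(address, set()).add(agent.name)

    def receive(self, dst_node, address):
        if dst_node != self.node_id:
            return None
        if self._peer_owned(address) and not self.peer_failed:
            return "forward-to-peer"
        # With the peer down, every local address is handled here.
        for cpu, addresses in self.cpu_ranges.items():
            if address in addresses:
                return cpu
        raise LookupError(f"no local processor homes {address:#x}")
```

Once the peer is marked as failed, the ownership check no longer diverts messages, which mirrors the way claim 9 omits the directory-ownership test that claim 1 performs.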
12. A CC-NUMA-based message processing apparatus, wherein two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the message processing apparatus comprising:
a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
an address analysis unit, configured to determine, when the destination address of the message is the local node, whether the directory of the address space corresponding to the message is maintained by the local node controller, and if so, to trigger the operation of the processor address determining unit;
a processor address determining unit, configured to determine the processor to which the address space corresponding to the message belongs; and
a message sending unit, configured to send the message to the processor over the QuickPath Interconnect (QPI) bus connected to the processor, so that the processor processes the message.
13. The apparatus according to claim 12, wherein the address analysis unit comprises:
an address analysis subunit, configured to determine, according to a preset correspondence between parity bits of address spaces and node controllers, whether the parity bit of the address space corresponding to the message corresponds to the local node controller; if so, the directory of the address space corresponding to the message is maintained by the local node controller.
14. The apparatus according to claim 12, further comprising: an address space maintenance unit, configured to maintain the directory of the address space corresponding to the message.
15. The apparatus according to claim 12, further comprising:
a message returning unit, configured to receive the processed message returned by the processor and send the processed message to the source node corresponding to the message.
16. A CC-NUMA-based message processing apparatus, wherein two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the message processing apparatus comprising:
a link failure obtaining unit, configured to obtain link connection failures of the QuickPath Interconnect (QPI) bus in the system;
a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
an address analysis unit, configured to determine, when the destination address of the message is the local node, whether the directory of the address space corresponding to the message is maintained by the local node controller, and if so, to trigger the operation of the processor address determining unit;
a processor address determining unit, configured to determine the processor to which the address space corresponding to the message belongs; and
a failure handling unit, configured to, when the link between the local node controller and the processor fails, send the message, together with a directory that is maintained by the local node controller and that contains at least the address space corresponding to the message, to the other node controller, so that the other node controller forwards the message to the processor and maintains the directory of the address space corresponding to the message.
17. The apparatus according to claim 16, wherein the link failure obtaining unit comprises:
a link failure information receiving unit, configured to receive, from the processor, link failure information for the QuickPath Interconnect (QPI) bus between the node controller and the processor.
18. The apparatus according to claim 16, wherein the link failure obtaining unit comprises:
a link failure detecting unit, configured to check the link of the QuickPath Interconnect (QPI) bus to the processor to obtain the link failure of the QPI bus.
19. A CC-NUMA-based message processing apparatus, wherein two node controllers are configured in a node and each node controller maintains the directory of its corresponding address space, the message processing apparatus comprising:
a controller failure obtaining unit, configured to obtain failure information of the other node controller;
an address space data obtaining unit, configured to obtain, by broadcast snooping, the directory of the address space maintained by the other node controller;
a message receiving unit, configured to receive a message sent by another node and perform address resolution on the message;
an address analysis unit, configured to determine, when the destination address of the message is the local node, the processor to which the address space corresponding to the message belongs; and
a message sending unit, configured to send the message to the processor.
20. The apparatus according to claim 19, wherein the controller failure obtaining unit comprises:
a first failure information receiving unit, configured to receive failure information of the other node controller sent by a processor.
21. The apparatus according to claim 19, wherein the controller failure obtaining unit comprises: a second failure information receiving unit, configured to receive failure information sent by the other node controller.
22. A CC-NUMA-based message processing system, comprising two node controllers and at least two processors, wherein:
the node controllers are connected to the processors through QuickPath Interconnect (QPI) buses;
the two node controllers are connected to each other through a network interface; and
each node controller incorporates the CC-NUMA-based message processing apparatus according to any one of claims 12 to 21.
PCT/CN2011/077898 2011-08-02 2011-08-02 Message processing method, device and system based on cc-numa WO2012119369A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201180001573.0A CN102318275B (en) 2011-08-02 2011-08-02 Method, device, and system for processing messages based on CC-NUMA
PCT/CN2011/077898 WO2012119369A1 (en) 2011-08-02 2011-08-02 Message processing method, device and system based on cc-numa

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/077898 WO2012119369A1 (en) 2011-08-02 2011-08-02 Message processing method, device and system based on cc-numa

Publications (1)

Publication Number Publication Date
WO2012119369A1 (en) 2012-09-13

Family

ID=45429434

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/077898 WO2012119369A1 (en) 2011-08-02 2011-08-02 Message processing method, device and system based on cc-numa

Country Status (2)

Country Link
CN (1) CN102318275B (en)
WO (1) WO2012119369A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11057302B2 (en) 2016-11-30 2021-07-06 New H3C Technologies Co., Ltd. Sending packet

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708190B (en) * 2012-05-15 2016-09-28 浪潮电子信息产业股份有限公司 A kind of method of node control chip catalogue Cache in CC-NUMA system
CN103181132B (en) * 2012-10-15 2015-11-25 华为技术有限公司 Request message processing method and sending method, node and system
CN103092807B (en) * 2012-12-24 2015-09-09 杭州华为数字技术有限公司 Node Controller, parallel computation server system and method for routing
CN104753753B (en) * 2013-12-31 2018-11-16 杭州华为数字技术有限公司 A kind of transmission method, equipment and the computer system of QPI message
CN104935530B (en) * 2015-04-29 2017-12-19 浪潮电子信息产业股份有限公司 A kind of method, interchanger and the system of intercomputer data exchange
CN104899160B (en) * 2015-05-30 2019-02-19 华为技术有限公司 A kind of cache data control method, Node Controller and system
JP6536677B2 (en) * 2015-12-29 2019-07-03 華為技術有限公司Huawei Technologies Co.,Ltd. CPU and multi CPU system management method
CN105808499A (en) * 2016-04-01 2016-07-27 浪潮电子信息产业股份有限公司 CPU interconnection device and multichannel server CPU interconnection topological structure
WO2018032519A1 (en) * 2016-08-19 2018-02-22 华为技术有限公司 Resource allocation method and device, and numa system
CN107239432A (en) * 2017-08-08 2017-10-10 郑州云海信息技术有限公司 A kind of server with novel topological structure
CN107749825B (en) * 2017-10-24 2021-03-09 盛科网络(苏州)有限公司 Flow control method and device based on source chip ID in cross-chip forwarding
CN114094209B (en) * 2021-10-18 2023-10-20 华人运通(江苏)技术有限公司 Battery management system, communication control method and device thereof and vehicle

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6480973B1 (en) * 1999-09-30 2002-11-12 Bull Information Systems Inc. Gate close failure notification for fair gating in a nonuniform memory architecture data processing system
US6769017B1 (en) * 2000-03-13 2004-07-27 Hewlett-Packard Development Company, L.P. Apparatus for and method of memory-affinity process scheduling in CC-NUMA systems
CN1664784A (en) * 2005-03-30 2005-09-07 中国人民解放军国防科学技术大学 Large-scale parallel computer system sectionalized parallel starting method
CN1991795A (en) * 2005-12-28 2007-07-04 国际商业机器公司 System and method for information processing
CN101216815A (en) * 2008-01-07 2008-07-09 浪潮电子信息产业股份有限公司 Double-wing extendable multi-processor tight coupling sharing memory architecture
CN101273332A (en) * 2005-09-30 2008-09-24 英特尔公司 Thread-data affinity optimization using compiler
US20080294832A1 (en) * 2007-04-26 2008-11-27 Hewlett-Packard Development Company, L.P. I/O Forwarding Technique For Multi-Interrupt Capable Devices

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8578130B2 (en) * 2003-03-10 2013-11-05 International Business Machines Corporation Partitioning of node into more than one partition
CN101651559B (en) * 2009-07-13 2011-07-06 浪潮电子信息产业股份有限公司 Failover method of storage service in double controller storage system

Also Published As

Publication number Publication date
CN102318275A (en) 2012-01-11
CN102318275B (en) 2015-01-07

Similar Documents

Publication Publication Date Title
WO2012119369A1 (en) Message processing method, device and system based on cc-numa
JP4290730B2 (en) Tree-based memory structure
US9256500B2 (en) Physical domain error isolation and recovery in a multi-domain system
US8346997B2 (en) Use of peripheral component interconnect input/output virtualization devices to create redundant configurations
US20190235777A1 (en) Redundant storage system
US11330071B2 (en) Inter-process communication fault detection and recovery system
CN114787781A (en) System and method for enabling high availability managed failover services
US10785350B2 (en) Heartbeat in failover cluster
WO2015035574A1 (en) Failure processing method, computer system, and apparatus
JPH09185594A (en) Direct bulk data transfer
US9100443B2 (en) Communication protocol for virtual input/output server (VIOS) cluster communication
WO2015135383A1 (en) Data migration method, device, and computer system
EP3796615B1 (en) Fault tolerance processing method, device, and server
WO2022155919A1 (en) Fault handling method and apparatus, and system
CN107209725A (en) Method, processor and the computer of processing write requests
WO2023072048A1 (en) Network storage method, storage system, data processing unit, and computer system
US11841793B2 (en) Switch-based free memory tracking in data center environments
US8305883B2 (en) Transparent failover support through pragmatically truncated progress engine and reversed complementary connection establishment in multifabric MPI implementation
WO2019119269A1 (en) Network fault detection method and control center device
WO2024051410A1 (en) Data access method and apparatus, network interface card, readable medium, and electronic device
US20220035742A1 (en) System and method for scalable hardware-coherent memory nodes
JP2006053896A (en) Software transparent expansion of number of fabrics covering multiple processing nodes in computer system
CN113626139B (en) High-availability virtual machine storage method and device
WO2023029485A1 (en) Data processing method and apparatus, computer device, and computer-readable storage medium
TWI571077B (en) Integration network device and service integration method thereof

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase. Ref document number: 201180001573.0; Country of ref document: CN
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 11860127; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 11860127; Country of ref document: EP; Kind code of ref document: A1