WO2015035882A1 - Node controller-based request response method and apparatus - Google Patents

Node controller-based request response method and apparatus

Publication number
WO2015035882A1
Authority
WO
WIPO (PCT)
Prior art keywords
node controller
processor
node
packet
information
Prior art date
Application number
PCT/CN2014/085969
Other languages
English (en)
French (fr)
Inventor
王工艺
陈奔
赵亚飞
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP17200376.6A (patent EP3355181B1)
Priority to EP14843838.5A (patent EP3046035B1)
Publication of WO2015035882A1
Priority to US15/066,623 (patent US10324646B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0813Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • G06F12/0817Cache consistency protocols using directory methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • G06F12/0831Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1048Scalability

Definitions

  • The present invention relates to the field of communications technologies, and in particular, to a node controller-based request response method and apparatus.
  • Modern advanced features of system architecture give the processor (CPU) error reporting and error correction capabilities and support CPU hot-plug technology.
  • Some manufacturers already support hot plugging of non-uniform memory access (NUMA) hardware, namely the insertion and removal of physical nodes.
  • This advanced feature requires the kernel to be able to remove a CPU that is in use when necessary. For example, to meet the needs of Remote Access Service (RAS), a CPU that executes malicious code must be kept out of the system execution path. CPU hot plugging therefore needs to be supported in the Linux kernel.
  • The operating system (OS) takes the CPU logically offline and no longer uses the offlined CPU threads. The processes and interrupts originally bound to those threads are migrated to other threads.
  • Table 1 shows the case in which a CPU on NC3 holds NC0 memory address Addr0 exclusively, so the NC0 directory records the E state with NC3 as the exclusive owner. If that CPU has modified the data at this address, then when NC3 is logically removed the data is written back to the memory of a CPU on NC0 and the directory information is updated to the I state.
  • Table 2 shows the case in which a CPU on NC3 holds NC0 memory address Addr0 exclusively, so the NC0 directory records the E state with NC3 as the exclusive owner. If that CPU has not modified the data at this address, then when NC3 is logically removed the data is not written back to the memory of a CPU on NC0, and the directory information still indicates that NC3 holds the data at Addr0 exclusively.
  • Table 3 shows the case in which a CPU on NC3 shares NC0 memory address Addr0, so the NC0 directory records the S state, shared by NC3 and NC1. When the NC node is logically removed, the data is not written back to the memory of a CPU on NC0, and the directory information still indicates that NC3 and NC1 share the data at Addr0.
  • In the latter two cases, if the directory information on NC0 is not refreshed and CPU0 in NC0 wants exclusive access to the data at Addr0, a snoop message is sent to NC3 according to the cache coherence (CC) protocol. If NC3 has already been physically removed, the snoop message gets no response and the system hangs.
  • NC3 Directory Status Information
  • Embodiments of the present invention provide a request response method and apparatus in which, when a node controller (NC) is removed, the information directories of the other NCs can be configured so that the record of which nodes occupy each node controller's memory addresses is updated without a memory flush on the CPUs of each node. The current NC can thus itself handle a snoop or request addressed to the removed node, directly converting the snoop request into an invalid response.
  • an embodiment of the present invention provides a node controller-based request response method, where the method includes:
  • the first node controller receives a first packet, where the first packet is a snoop message from a CPU interface or a request message from another node controller interface;
  • the method further includes:
  • the method further includes:
  • the method further includes:
  • the invalid response message is sent to the first processor interface.
  • the method further includes:
  • when the first packet is a data or status request message from a third node controller interface, sending the request message to the first processor interface, so that the first processor records the requested information.
  • an embodiment of the present invention provides a request response method based on a node controller, where the method includes:
  • the first node controller receives a snoop message sent by a second node controller interface, where the destination node number (DNID) of the snoop message is a first processor;
  • the processor presence information includes information about whether each processor located on the current node exists;
  • the invalid response message is sent to the second node controller interface, where the DNID of the invalid response message is the source node number (SNID) of the first packet, and the SNID of the invalid response message is the DNID of the first packet.
  • the method further includes:
  • the method further includes:
  • when the first processor is removed, the record for the first processor in the processor presence information is updated to indicate that the first processor does not exist.
  • an embodiment of the present invention provides a request response apparatus, where the apparatus includes:
  • a receiving unit, configured to receive a first packet, where the first packet is a snoop message from a CPU interface or a data or status request message from another node controller interface;
  • an obtaining unit, configured to acquire an information directory, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers;
  • a first identifying unit, configured to query, in the information directory, whether the memory address requested by the first packet is occupied by a second node controller;
  • a second identifying unit, configured to query node presence information to determine whether the second node controller exists;
  • a processing unit, configured to: when it is determined that the second node controller does not exist, generate and send an invalid response message, where the destination node number (DNID) of the invalid response message is the source node number (SNID) of the first packet, and the SNID of the invalid response message is the DNID of the first packet.
  • the apparatus further includes an information directory management unit, configured to change, in the information directory, the information indicating that the memory address of the first node controller is occupied by the second node controller, release the memory address occupied by the second node controller, and update the node presence information so that the second node controller is recorded as not existing.
  • the processing unit is further configured to:
  • if the memory address requested by the first packet is not occupied by the second node controller and the first packet is a data or status request message from a third node controller interface, send the request message to the first processor interface, so that the first processor records the requested information.
  • an embodiment of the present invention provides a request response apparatus, where the apparatus includes:
  • a receiving unit, configured to receive a snoop message sent by the interface of a second node controller, where the destination node number (DNID) of the snoop message is a first processor;
  • an obtaining unit, configured to acquire processor presence information, where the processor presence information includes information about whether each processor on the current node exists;
  • an identifying unit, configured to identify, according to the processor presence information, whether the first processor exists;
  • a processing unit, configured to: if the first processor does not exist, generate an invalid response message and send it to the second node controller interface, where the DNID of the invalid response message is the source node number (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
  • the processing unit is further configured to:
  • the apparatus further includes a processor presence information management unit, configured to update, when the first processor is removed, the record for the first processor in the processor presence information to indicate that the first processor does not exist.
  • With the request response method and apparatus of the embodiments of the present invention, the information directory is queried to determine whether the memory address requested by a snoop message from the CPU interface, or by a data or status request message from another node controller interface, is occupied by another node controller, and thereby whether the snoop or request message is addressed to a removed node. The current NC thus itself handles snoops or requests addressed to the removed node, directly converting them into an invalid response, which greatly improves system performance and reliability.
  • FIG. 1 is a schematic diagram of multi-node interconnection according to Embodiment 1 of the present invention.
  • FIG. 3 is a flowchart of a method for request response based on a node controller according to Embodiment 1 of the present invention
  • FIG. 4 is a schematic diagram of multi-node interconnection according to Embodiment 2 of the present invention.
  • FIG. 5 is a swimming lane diagram of message transmission and response according to Embodiment 2 of the present invention.
  • FIG. 6 is a flowchart of a method for request response based on a node controller according to Embodiment 2 of the present invention.
  • FIG. 7 is a schematic structural diagram of a request response apparatus according to Embodiment 3 of the present invention.
  • FIG. 8 is a schematic structural diagram of a request response apparatus according to Embodiment 4 of the present invention.
  • The node controller NC0 receives the first packet sent by CPU0 and determines, by querying the information directory, whether the memory address requested by the first packet is occupied by NC3. If it is, NC0 queries the node presence information to determine whether the NC3 node exists. If the NC3 node has been removed, NC0 directly generates an invalid response message and returns it to CPU0. The current node controller NC0 thus itself handles the first packet addressed to the removed node NC3, directly converting the snoop request into an invalid response and sending it to CPU0, which greatly improves system performance and reliability.
  • the specific node controller-based request response method is as shown in FIG. 3, and includes the following steps:
  • Step 310: The first node controller receives the first packet, where the first packet is a snoop message from a CPU interface or a request message from another node controller interface.
  • The snoop message is used to query whether other, external nodes occupy data at a memory address of this node; the request message is used to obtain data at a memory address of the requested node.
  • In this example, CPU0 is a processor on NC0. NC0 receives a snoop from CPU0 and obtains a SnpInvOwn message after protocol processing; the DNID of the SnpInvOwn message corresponds to the node controller NC3.
  • Step 320: Obtain an information directory, and query, in the information directory, whether the memory address requested by the first packet is occupied by the second node controller.
  • NC0 obtains the information directory and queries in it whether the memory address snooped by the SnpInvOwn message is occupied by NC3.
  • The information directory includes information about which memory addresses of the current node controller are occupied by other node controllers; in this example, the NC0 information directory records the occupancy of NC0 memory by NC1, NC2, and NC3.
  • Step 330 Determine whether the memory address is occupied by the second node controller.
  • Step 360 Query node in-position information to determine whether the second node controller exists.
  • The node presence information is information recorded in the NC about whether the other nodes in the NC domain are present. By querying it, NC0 can determine whether NC3 exists. If NC3 exists, step 370 is performed; otherwise steps 380 and 390 are performed.
  • Step 370 Process the first packet, and generate a second packet to send to the second node controller interface.
  • NC0 sends the corresponding NC-domain snoop message to the NC3 interface according to the received SnpInvOwn message.
  • Step 380: Generate an invalid response message, where the DNID of the invalid response message is the SNID of the first packet, and the SNID of the invalid response message is the DNID of the first packet.
  • NC0 directly generates an invalid response message RspI according to the received snoop request and sends it to CPU0, notifying CPU0 that NC3, the node to which the request was addressed, is not present.
  • The DNID in the invalid response message RspI is the source node number (SNID) in the SnpInvOwn message; the SNID in RspI is the DNID in the SnpInvOwn message.
  • Step 390: Change, in the information directory, the information indicating that the memory address of the first node controller is occupied by the second node controller, release the memory address occupied by the second node controller, and update the node presence information so that the second node controller is recorded as not existing.
  • That is, the node presence information is updated to record that NC3 has been removed, the information indicating that NC0 memory addresses are occupied by NC3 is changed in the information directory, and those memory addresses are released.
  • the method further includes:
  • Step 340: Generate an invalid response message and send it to the first processor interface.
  • NC0 directly generates an invalid response message RspI according to the received snoop request and sends it to CPU0, notifying CPU0 that NC3, the node to which the request was addressed, is not present.
  • The DNID in the invalid response message RspI is the source node number (SNID) in the SnpInvOwn message; the SNID in RspI is the DNID in the SnpInvOwn message.
  • the method further includes:
  • Step 350: Send the request to the first processor interface of the first node controller.
  • If the received SnpInvOwn message is a request message sent by NC1, it is an occupancy request for the memory address space of CPU0 on the NC0 node. If, after the query, NC0 determines that the requested memory address is not occupied by NC3, the request is sent directly to CPU0.
  • In this embodiment, information about whether the other NCs are present is simply configured in the node controller. When a node is removed, the node presence information of the other NC nodes records the removal, and the occupancy information of the removed node is updated in the directory information. No memory flush on the CPUs of each node in the system is needed when a node is removed; the NC itself handles messages addressed to the removed node and directly converts the snoop request into an invalid response, which greatly improves system performance and reliability.
  • The node controller-based request response method provided by Embodiment 2 of the present invention is described in detail below with reference to FIG. 4, FIG. 5, and FIG. 6.
  • The method provided in Embodiment 2 of the present invention can be applied to the case in which a processor on a node is hot removed in a multi-node interconnection system.
  • FIG. 4 is a schematic diagram of multi-node interconnection according to Embodiment 2 of the present invention, FIG. 5 is a swimlane diagram of message transmission and response according to Embodiment 2, and FIG. 6 is a flowchart of the node controller-based request response method according to Embodiment 2.
  • In the multi-node interconnection system of this embodiment, CPU_b on the NC3 node is removed.
  • By querying the processor presence information, NC3 can determine that a snoop message is addressed to the removed processor CPU_b. The node controller NC3 thus itself handles the snoop message addressed to CPU_b, directly converting the snoop request into an invalid response and sending the invalid response message to NC0, which greatly improves system performance and reliability.
  • Step 610: The first node controller receives a snoop message sent by the interface of the second node controller; the destination node number (DNID) of the snoop message is the first processor.
  • NC0 generates a SnpInvOwn message after protocol processing of the snoop sent by CPU0, and sends the message to NC3.
  • The DNID of the SnpInvOwn message corresponds to CPU_b on the node controller NC3; that is, the SnpInvOwn message is addressed to CPU_b.
  • NC3 acquires the processor presence information during protocol processing of the received snoop.
  • The processor presence information is pre-configured information that records whether each processor on the current node is present.
  • Step 630: Determine, according to the processor presence information, whether the first processor exists.
  • If the first processor does not exist, step 640 is performed;
  • otherwise, step 650 is performed.
  • Step 640: Send an invalid response message to the second node controller interface, where the DNID of the invalid response message is the source node number (SNID) of the first packet, and the SNID of the invalid response message is the DNID of the first packet.
  • NC3 recognizes from the processor presence information that CPU_b does not exist.
  • NC3 directly generates an invalid response message RspI according to the received snoop request and sends it to NC0. NC0 processes it and sends it on to CPU0, notifying CPU0 that CPU_b, the processor it asked to snoop, does not exist.
  • The destination node number (DNID) in the invalid response message RspI is the source node number (SNID) in the SnpInvOwn message; the SNID in RspI is the DNID in the SnpInvOwn message.
  • Step 650: Send the snoop message to the first processor interface.
  • NC3 sends the snoop to the CPU_b interface.
  • The method further includes: when a processor on the node is hot removed, updating the processor presence information on the node to record that the processor has been removed, so that the record for that processor indicates that it does not exist.
  • In this embodiment, information about whether the processors on the node are present is simply configured in the node controller, and when a processor is removed the corresponding processor presence information is updated. No memory flush on the CPUs of each node in the system is needed when a processor is removed; the NC itself handles messages addressed to the removed processor and directly converts the snoop request into an invalid response, which greatly improves system performance and reliability.
  • The present invention also discloses a request response apparatus.
  • The request response apparatus of this embodiment includes: a receiving unit 710, an obtaining unit 720, a first identifying unit 730, a second identifying unit 740, and a processing unit 750.
  • The request response apparatus of this embodiment may be a node controller in a multi-node interconnection system. Specifically, it may be a node controller chip or a circuit board with a node controller chip.
  • The receiving unit 710 is configured to receive a first packet, where the first packet is a snoop message from a CPU interface or a data or status request message from another node controller interface;
  • The obtaining unit 720 is configured to obtain an information directory, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers;
  • The first identifying unit 730 is configured to query, in the information directory, whether the memory address requested by the first packet is occupied by the second node controller;
  • The second identifying unit 740 is configured to: if the memory address requested by the first packet is occupied by the second node controller, query the node presence information to determine whether the second node controller exists;
  • The processing unit 750 is configured to: when it is determined that the second node controller does not exist, generate and send an invalid response message, where the destination node number (DNID) of the invalid response message is the source node number (SNID) of the first packet, and the SNID of the invalid response message is the DNID of the first packet.
  • The request response apparatus further includes an information directory management unit 760, configured to change, in the information directory, the information indicating that the memory address of the first node controller is occupied by the second node controller, release the memory address occupied by the second node controller, and update the node presence information so that the second node controller is recorded as not existing.
  • The processing unit 750 is further configured to:
  • if the memory address requested by the first packet is not occupied by the second node controller and the first packet is a data or status request message from a third node controller interface, send the request message to the first processor interface, so that the first processor records the requested information.
  • The request response apparatus has node presence information configured in it and updates that information when an NC node in the system is removed. When a message is addressed to the removed node, the snoop request can be converted directly into an invalid response, without a memory flush on the CPUs of each node in the system to record the node removal, which greatly improves system performance and reliability.
  • The present invention also discloses another request response apparatus.
  • The request response apparatus of this embodiment includes: a receiving unit 810, an obtaining unit 820, an identifying unit 830, and a processing unit 840.
  • The request response apparatus of this embodiment may be a node controller in a multi-node interconnection system. Specifically, it may be a node controller chip or a circuit board with a node controller chip.
  • The receiving unit 810 is configured to receive a snoop message sent by the interface of the second node controller; the destination node number (DNID) of the snoop message is the first processor;
  • The obtaining unit 820 is configured to acquire processor presence information, where the processor presence information includes information about whether each processor located on the current node exists;
  • The identifying unit 830 is configured to identify, according to the processor presence information, whether the first processor exists;
  • The processing unit 840 is configured to: if the first processor does not exist, generate an invalid response message and send it to the second node controller interface, where the DNID of the invalid response message is the source node number (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
  • The processing unit 840 is further configured to:
  • The request response apparatus has presence information for the processors on the current node configured in it and updates that information when a processor on the current node is removed. When a snoop is addressed to the removed processor, the snoop request can be converted directly into an invalid response, without a memory flush on the CPUs of each node in the system to record the removal, which greatly improves system performance and reliability.
  • The steps of a method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of both.
  • The software module can reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
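The processor-level flow of Embodiment 2 (steps 610 to 650) can be illustrated with a small software model. This is only a sketch under stated assumptions: the patent describes node controller hardware, and the names `Snoop`, `handle_snoop`, and `hot_remove`, the string node IDs, and the use of "NC0" as the SNID of the incoming snoop are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snoop:
    """Hypothetical message model: only the fields the text mentions."""
    kind: str  # "SnpInvOwn" for the snoop, "RspI" for the invalid response
    snid: str  # source node number (SNID)
    dnid: str  # destination node number (DNID)


def hot_remove(cpu_present: dict, cpu: str) -> None:
    """Record in the processor presence information that a CPU was removed."""
    cpu_present[cpu] = False


def handle_snoop(snoop: Snoop, cpu_present: dict):
    """Model of steps 620-650 at the receiving node controller.

    Returns an (action, message) pair. If the target processor is absent,
    the snoop is answered with RspI whose DNID/SNID are the snoop's
    SNID/DNID; otherwise the snoop is passed on to the processor.
    """
    if cpu_present.get(snoop.dnid, False):
        return "forward_to_cpu", snoop                      # step 650
    reply = Snoop("RspI", snid=snoop.dnid, dnid=snoop.snid)  # step 640
    return "reply_to_nc", reply


# Example: CPU_b on NC3 is hot removed, then NC0 snoops it.
present_on_nc3 = {"CPU_a": True, "CPU_b": True}
hot_remove(present_on_nc3, "CPU_b")
action, message = handle_snoop(Snoop("SnpInvOwn", "NC0", "CPU_b"), present_on_nc3)
```

In this model the only state touched on removal is one presence flag, which mirrors the text's point that no memory flush on other nodes is needed.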

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

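Embodiments of the present invention disclose a request response method and apparatus based on a node controller. The method includes: a first node controller receives a first message; obtains an information directory and looks up in it whether the memory address requested by the first message is occupied by a second node controller; if the requested memory address is occupied by the second node controller, queries node presence information to determine whether the second node controller exists; and, when it is determined that the second node controller does not exist, generates and sends an invalid response message, where the destination node ID (DNID) of the invalid response message is the source node ID (SNID) of the first message, and the SNID of the invalid response message is the DNID of the first message.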

Description

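Request response method and apparatus based on a node controller
This application claims priority to Chinese Patent Application No. 201310410556.3, filed with the Chinese Patent Office on September 10, 2013 and entitled "Request response method and apparatus based on a node controller", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a request response method and apparatus based on a node controller.
Background
Modern advanced features of system architectures give processors (CPUs) error reporting and error correction capabilities and support CPU hot plug. Some OEMs already support hot plug of non-uniform memory access (NUMA) hardware, that is, the insertion and removal of physical nodes. Such advanced features require the kernel to be able to remove a CPU that is in use when necessary. For example, for the purposes of remote access service (RAS), a CPU that is executing malicious code must be kept off the system execution path. The Linux kernel therefore needs to support CPU hot plug. When the operating system (OS) takes a CPU logically offline, it stops using the threads of that CPU, and the processes and interrupts that were bound to them are migrated to other threads.
In a multi-node interconnected system, the node controller (NC) of a node, or a single CPU, can be hot-removed. If an NC needs to be removed both logically and physically, then in addition to the CPU removal operations described above, the OS also takes the memory in the node offline: it migrates the data in use in the node's address space to the memory of other nodes and no longer allocates new memory from that address range. Assume that the system contains NC0, NC1, NC2, and NC3, and that NC3 is to be removed. After all services of all CPUs on the NC3 node have been migrated, nothing runs on the CPUs of the NC3 node, the other nodes no longer use memory on the NC3 node, and the NC3 node no longer accesses memory on other nodes. However, because each NC holds directory information, the record that NC3 previously occupied memory data on other nodes may remain.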
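Assume that the data at a memory address Addr0 on NC0 is occupied by NC3. When NC3 is logically removed, the following cases can occur:
Figure PCTCN2014085969-appb-000001
Table 1
Table 1 shows the case in which a CPU on NC3 holds NC0 memory address Addr0 exclusively, so the NC0 directory records the E state with NC3 as the exclusive holder. If that CPU on NC3 has modified the data at this address, the data is written back to the memory of a CPU on NC0 when NC3 is logically removed, and the directory information is updated to the I state.
Figure PCTCN2014085969-appb-000002
Table 2
Table 2 shows the case in which a CPU on NC3 holds NC0 memory address Addr0 exclusively, so the NC0 directory records the E state with NC3 as the exclusive holder, but that CPU has not modified the data at this address. When NC3 is logically removed, the data is not written back to the memory of a CPU on NC0, and the directory information still indicates that NC3 exclusively holds the data at Addr0.
Table 3
Table 3 shows the case in which a CPU on NC3 shares NC0 memory address Addr0, so the NC0 directory records the S state with NC3 and NC1 as sharers. When the NC node is logically removed, the data is not written back to the memory of a CPU on NC0, and the directory information still indicates that NC3 and NC1 share the data at Addr0.
In the latter two cases, if the directory information on NC0 is not refreshed and CPU0 in NC0 wants exclusive ownership of the data at Addr0, a snoop message is sent to NC3 in accordance with the cache coherence (CC) protocol. If NC3 has already been physically removed at that point, the snoop message can never be answered and the system hangs.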
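The three cases of Tables 1 to 3 can be illustrated with a small model. This is only a sketch under stated assumptions: the `DirEntry` class, the `logical_remove` and `would_hang` helpers, and the `modified` flag are hypothetical names introduced for illustration and do not appear in the source; only the E/S/I states and the node names come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class DirEntry:
    """One directory entry on NC0 for a single address (hypothetical model)."""
    state: str                                  # "E", "S" or "I"
    holders: set = field(default_factory=set)   # names of NCs recorded as holders


def logical_remove(entry, node, modified):
    """Model the effect of logically removing `node` on one directory entry."""
    if node in entry.holders and entry.state == "E" and modified:
        # Table 1: modified data is written back, so the entry becomes invalid.
        entry.state = "I"
        entry.holders.clear()
    # Tables 2 and 3: nothing is written back, so the entry still names `node`.
    return entry


def would_hang(entry, node, physically_removed):
    """A snoop sent to a recorded holder that is physically gone is never answered."""
    return physically_removed and node in entry.holders
```

In this model only the Table 1 case leaves a clean directory; the Table 2 and Table 3 entries still point at NC3 after logical removal, which is the stale state that leads to the hang described above.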
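The existing solution is as follows: before NC3 is physically removed, the CPUs in every other NC node issue exclusive requests to the remote nodes for the entire memory address space of their own node. Once all memory addresses have been refreshed in this way, the directory states of Table 2 and Table 3 become those shown in Table 4 and Table 5, respectively:
Figure PCTCN2014085969-appb-000004
Table 4
Figure PCTCN2014085969-appb-000005
Table 5
The other nodes no longer hold any directory state information indicating occupation by NC3; all such entries become invalid. NC3 can then be physically removed without hanging the system.
However, with this method every other node has to refresh all of its local memory when a node is removed. This consumes a great deal of OS time, makes the system respond very slowly, and severely degrades system performance. In measurements, refreshing 256Gb of memory in a single node, with the BIOS taking 60-70% of the CPU time slices, takes about 20 minutes, during which the OS responds so slowly that it is essentially unacceptable to users. Moreover, the larger the memory of a single node and the larger the system, the longer the memory refresh takes.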
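Summary
Embodiments of the present invention provide a snoop response method and apparatus. When a node controller (NC) is removed, the information directories of the other NCs can be configured so that the record of which memory addresses of each node controller are occupied by the removed NC node is updated without refreshing the memory of the CPUs of every node. The current NC thus handles snoops or requests addressed to the removed node itself and directly converts the snoop request into an invalid response, which greatly improves system performance and reliability.
According to a first aspect, an embodiment of the present invention provides a request response method based on a node controller, the method including:
receiving, by a first node controller, a first message, where the first message is a snoop message from a CPU interface or a request message from another node controller interface;
obtaining an information directory, and looking up in the information directory whether the memory address requested by the first message is occupied by a second node controller, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers;
if the memory address requested by the first message is occupied by the second node controller, querying node presence information to determine whether the second node controller exists; and
when it is determined that the second node controller does not exist, generating and sending an invalid response message, where the destination node ID (DNID) of the invalid response message is the source node ID (SNID) of the first message, and the SNID of the invalid response message is the DNID of the first message.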
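The SNID/DNID swap in the invalid response can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Message` type, its `opcode` and `addr` fields, the integer node IDs, and the `make_invalid_response` helper are assumptions made for the example. The opcode names `SnpInvOwn` and `RspI` are taken from the embodiments later in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message layout; real packet formats are not given in the source."""
    opcode: str   # e.g. "SnpInvOwn" or "RspI"
    snid: int     # source node ID
    dnid: int     # destination node ID
    addr: int     # memory address the message refers to (assumed field)


def make_invalid_response(first):
    """Build the invalid response by swapping the IDs of the first message."""
    return Message(opcode="RspI", snid=first.dnid, dnid=first.snid, addr=first.addr)
```

Because the response carries the removed node's ID as its SNID, the requester sees a reply that looks as if it came from the node it addressed.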
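In a first possible implementation, after the invalid response message is generated and sent when it is determined that the second node controller does not exist, the method further includes:
changing, in the information directory, the information that a memory address of the first node controller is occupied by the second node controller, releasing the memory address occupied by the second node controller, and updating the node presence information to record the second node controller as not existing.
In a second possible implementation, the method further includes:
if the second node controller exists, processing the first message to generate a second message, and sending the second message to the second node controller interface.
In a third possible implementation, if the memory address requested by the first message is not occupied by the second node controller, the method further includes:
when the first message is a snoop message sent by a first processor interface on the first node controller, generating an invalid response message and sending it to the first processor interface.
In a fourth possible implementation, if the memory address requested by the first message is not occupied by the second node controller, the method further includes:
when the first message is a data or status request message from a third node controller interface, sending the request message to the first processor interface so that the first processor records the requested information.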
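According to a second aspect, an embodiment of the present invention provides a request response method based on a node controller, the method including:
receiving, by a first node controller, a snoop message sent by a second node controller interface, where the destination node ID (DNID) of the snoop message identifies a first processor;
obtaining processor presence information, where the processor presence information includes information about whether each processor on the current node exists; and
if the first processor does not exist, generating an invalid response message and sending it to the second node controller interface, where the DNID of the invalid response message is the source node ID (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
In a first possible implementation, the method further includes:
if the first processor exists, sending the snoop message to the first processor interface.
In a second possible implementation, the method further includes:
when the first processor is removed, updating the record of the first processor in the processor presence information to mark the first processor as not existing.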
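According to a third aspect, an embodiment of the present invention provides a request response apparatus, the apparatus including:
a receiving unit, configured to receive a first message, where the first message is a snoop message from a CPU interface or a data or status request message from another node controller interface;
an obtaining unit, configured to obtain an information directory, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers;
a first identifying unit, configured to look up in the information directory whether the memory address requested by the first message is occupied by a second node controller;
a second identifying unit, configured to: if the memory address requested by the first message is occupied by the second node controller, query node presence information to determine whether the second node controller exists; and
a processing unit, configured to: when it is determined that the second node controller does not exist, generate and send an invalid response message, where the destination node ID (DNID) of the invalid response message is the source node ID (SNID) of the first message, and the SNID of the invalid response message is the DNID of the first message.
In a first possible implementation, the apparatus further includes an information directory management unit, configured to change, in the information directory, the information that a memory address of the first node controller is occupied by the second node controller, release the memory address occupied by the second node controller, and update the node presence information to record the second node controller as not existing.
In a second possible implementation, the processing unit is further configured to:
if the second node controller exists, process the first message to generate a second message, and send the second message to the second node controller interface.
In a third possible implementation, the processing unit is further configured to:
if the memory address requested by the first message is not occupied by the second node controller, and the first message is a snoop message sent by a first processor interface on the first node controller, generate an invalid response message and send it to the first processor interface.
In a fourth possible implementation, the processing unit is further configured to:
if the memory address requested by the first message is not occupied by the second node controller, and the first message is a data or status request message from a third node controller interface, send the request message to the first processor interface so that the first processor records the requested information.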
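According to a fourth aspect, an embodiment of the present invention provides a request response apparatus, the apparatus including:
a receiving unit, configured to receive a snoop message sent by a second node controller interface, where the destination node ID (DNID) of the snoop message identifies a first processor;
an obtaining unit, configured to obtain processor presence information, where the processor presence information includes information about whether each processor on the current node exists;
an identifying unit, configured to identify, according to the processor presence information, whether the first processor exists; and
a processing unit, configured to: if the first processor does not exist, generate an invalid response message and send it to the second node controller interface, where the DNID of the invalid response message is the source node ID (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
In a first possible implementation, the processing unit is further configured to:
if the first processor exists, send the snoop message to the first processor interface.
In a second possible implementation, the apparatus further includes a processor presence information management unit, configured to: when the first processor is removed, update the record of the first processor in the processor presence information to mark the first processor as not existing.
The snoop response method and apparatus in the embodiments of the present invention look up in the information directory whether the memory address requested by a snoop message from a CPU interface, or by a data or status request message from another node controller interface, is occupied by another node controller, and thereby determine whether the snoop or request message is addressed to a removed node. The current NC thus handles snoops or requests addressed to the removed node itself and directly converts them into invalid responses, which greatly improves system performance and reliability.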
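Brief Description of Drawings
FIG. 1 is a schematic diagram of multi-node interconnection according to Embodiment 1 of the present invention;
FIG. 2 is a swimlane diagram of message sending and response according to Embodiment 1 of the present invention;
FIG. 3 is a flowchart of a request response method based on a node controller according to Embodiment 1 of the present invention;
FIG. 4 is a schematic diagram of multi-node interconnection according to Embodiment 2 of the present invention;
FIG. 5 is a swimlane diagram of message sending and response according to Embodiment 2 of the present invention;
FIG. 6 is a flowchart of a request response method based on a node controller according to Embodiment 2 of the present invention;
FIG. 7 is a schematic structural diagram of a request response apparatus according to Embodiment 3 of the present invention;
FIG. 8 is a schematic structural diagram of a request response apparatus according to Embodiment 4 of the present invention.
The technical solutions of the embodiments of the present invention are described in further detail below with reference to the drawings and embodiments.
Detailed Description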
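The request response method based on a node controller provided in Embodiment 1 of the present invention is described in detail below using FIG. 1 and FIG. 2 together with FIG. 3 as an example. The method provided in Embodiment 1 can be applied when a node is hot-removed in a multi-node interconnected system. FIG. 1 is a schematic diagram of multi-node interconnection according to Embodiment 1; FIG. 2 is a swimlane diagram of message sending and response according to Embodiment 1; FIG. 3 is a flowchart of the request response method based on a node controller according to Embodiment 1. In the multi-node interconnected system shown in FIG. 1, node controller NC0 receives a first message sent by CPU0, and NC0 queries the information directory to determine whether the memory address that the first message asks to snoop is occupied by NC3. If it is, NC0 queries the node presence information to determine whether the NC3 node exists. If the NC3 node has been removed, NC0 itself generates an invalid response message and returns it to CPU0. The current node controller NC0 thus handles the first message addressed to the removed node NC3, converts the snoop request directly into an invalid response, and sends the invalid response message to CPU0, which greatly improves system performance and reliability.
As shown in FIG. 3, the request response method based on a node controller includes the following steps:
Step 310: A first node controller receives a first message, where the first message is a snoop message from a CPU interface or a request message from another node controller interface.
Specifically, a snoop message is used to query whether another, external node holds the data at a memory address of the local node; a request message is used to obtain the data at a memory address of the requested node.
In one example, as shown in FIG. 2, CPU0 is a processor on NC0, and NC0 receives a SnpInvOwn message from CPU0 after protocol processing; the DNID of the SnpInvOwn message corresponds to node controller NC3.
Step 320: Obtain an information directory, and look up in the information directory whether the memory address requested by the first message is occupied by the second node controller.
Specifically, while performing protocol processing on the snoop request, NC0 obtains the information directory and looks up in it whether the memory address snooped by the SnpInvOwn message is occupied by NC3. The information directory includes information about which memory addresses of the current node controller are occupied by other node controllers; in this example, the information directory of NC0 records which memory addresses of NC0 are occupied by NC1, NC2, and NC3.
Step 330: Determine whether the memory address is occupied by the second node controller.
Specifically, whether the memory address that CPU0 asks to snoop is occupied by NC3 is determined from the information recorded in the information directory.
If NC3 occupies the memory address, step 360 is performed.
Step 360: Query node presence information to determine whether the second node controller exists.
Specifically, the node presence information is information configured in the NC that records whether the other nodes in the NC domain exist; by querying it, whether NC3 exists can be determined. If NC3 exists, step 370 is performed; otherwise, steps 380 and 390 are performed.
Step 370: Process the first message to generate a second message, and send the second message to the second node controller interface.
Specifically, if NC3 exists, NC0 generates a corresponding NC-domain snoop message from the received SnpInvOwn message and sends it to the NC3 interface.
Step 380: Generate an invalid response message, where the DNID of the invalid response message is the SNID of the first message, and the SNID of the invalid response message is the DNID of the first message.
Specifically, when the node presence information shows that NC3 does not exist, NC0 generates the invalid response message RspI directly from the received snoop request and sends it to CPU0, notifying CPU0 that the NC3 it asked to snoop is not present. The DNID in RspI is the SNID in SnpInvOwn, and the SNID in RspI is the DNID in SnpInvOwn.
Step 390: Change, in the information directory, the information that the memory address of the first node controller is occupied, release the memory address occupied by the second node controller, and update the node presence information to record the second node controller as not existing.
Specifically, the node presence information is updated to record that NC3 has been removed, and the information in the information directory that a memory address of NC0 is occupied by NC3 is changed so that the memory address is released.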
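In addition, when the memory address is not occupied by the second node controller and the first message received by the first node controller is a snoop message from the first processor, the method further includes, after step 330:
Step 340: Generate an invalid response message and send it to the first processor interface.
Specifically, NC0 generates the invalid response message RspI directly from the received snoop request and sends it to CPU0, notifying CPU0 that the NC3 it asked to snoop is not present. The DNID in RspI is the SNID in SnpInvOwn, and the SNID in RspI is the DNID in SnpInvOwn.
In addition, when the memory address is not occupied by the second node controller and the first message received by the first node controller is a request message from another node controller, the method further includes, after step 330:
Step 350: Send the request to the first processor interface of the first node controller.
In another example, the received SnpInvOwn message is a request message sent by NC1, namely an occupation request for the memory address space of CPU0 on the NC0 node. If NC0 determines after the lookup that the requested memory address is not occupied by NC3, it sends the request directly to CPU0.
In addition, when an NC node in the NC domain is removed at some other time, the node presence information of the other NC nodes records the removed node accordingly.
With the request response method based on a node controller provided in Embodiment 1, information about whether the other NCs are present is simply configured in the node controller. When an NC node is removed, the corresponding node presence information is updated, and the occupation information of the removed node is updated in the directory information. The memory of the CPUs of every node in the system therefore need not be refreshed when a node is removed; the NC handles messages addressed to the removed node and directly returns an invalid response to the snoop request, which greatly improves system performance and reliability.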
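Steps 310 to 390 of Embodiment 1 can be sketched as one decision function. This is a minimal sketch under assumptions, not the actual node controller logic: the `Msg` tuple, the `DirectoryNC` class, the dictionary-based directory and presence table, the integer node IDs, and the returned action labels (`"reply"`, `"to_cpu"`, `"to_nc"`) are all hypothetical. In step 370 the real NC generates an NC-domain second message; the sketch simply forwards the original message as a placeholder.

```python
from collections import namedtuple

# Hypothetical message layout: opcode, source node ID, destination node ID, address.
Msg = namedtuple("Msg", "opcode snid dnid addr")


class DirectoryNC:
    """Sketch of the first node controller (NC0 in the example)."""

    def __init__(self, directory, node_present):
        self.directory = directory          # addr -> set of NC IDs occupying it
        self.node_present = node_present    # NC ID -> True if the node exists

    def handle(self, msg, target_nc, from_cpu):
        # Steps 320/330: is the requested address occupied by the target NC?
        holders = self.directory.get(msg.addr, set())
        if target_nc not in holders:
            if from_cpu:
                # Step 340: answer the local CPU's snoop with an invalid response.
                return ("reply", Msg("RspI", msg.dnid, msg.snid, msg.addr))
            # Step 350: pass the other NC's request on to the local CPU.
            return ("to_cpu", msg)
        # Step 360: consult the node presence information.
        if self.node_present.get(target_nc, False):
            # Step 370: forward towards the target NC (placeholder for the second message).
            return ("to_nc", msg)
        # Step 380: build the invalid response with swapped SNID/DNID.
        rsp = Msg("RspI", msg.dnid, msg.snid, msg.addr)
        # Step 390: release the address and record the node as absent.
        holders.discard(target_nc)
        self.node_present[target_nc] = False
        return ("reply", rsp)
```

A side effect worth noting in this sketch is that the stale directory entry is cleaned up lazily, on the first message that touches it, rather than by a bulk refresh of all memory.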
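The request response method based on a node controller provided in Embodiment 2 of the present invention is described in detail below using FIG. 4 and FIG. 5 together with FIG. 6 as an example. The method provided in Embodiment 2 can be applied when a processor on a node is hot-removed in a multi-node interconnected system. FIG. 4 is a schematic diagram of multi-node interconnection according to Embodiment 2; FIG. 5 is a swimlane diagram of message sending and response according to Embodiment 2; FIG. 6 is a flowchart of the request response method based on a node controller according to Embodiment 2. In the multi-node interconnected system shown in FIG. 4, CPU_b on the NC3 node has been removed. When node controller NC3 receives from NC0 a snoop message addressed to CPU_b, NC3 queries the processor presence information and determines that the snoop message is addressed to the removed processor CPU_b. Node controller NC3 thus handles the snoop message addressed to the removed processor CPU_b, converts the snoop request directly into an invalid response, and sends the invalid response message to NC0, which greatly improves system performance and reliability.
As shown in FIG. 6, the request response method based on a node controller includes the following steps:
Step 610: A first node controller receives a snoop message sent by a second node controller interface, where the destination node ID (DNID) of the snoop message identifies a first processor.
Specifically, in one example, as shown in FIG. 5, NC0 performs protocol processing on a snoop sent by CPU0 to produce a SnpInvOwn message and sends it to NC3. The DNID of the SnpInvOwn message corresponds to CPU_b on node controller NC3; that is, the SnpInvOwn message is addressed to CPU_b.
Step 620: Obtain processor presence information, where the processor presence information includes information about whether each processor of the current node controller exists.
Specifically, while performing protocol processing on the snoop, NC3 obtains the processor presence information. The processor presence information is preconfigured and records which processors in the current node are present.
Step 630: Determine, according to the processor presence information, whether the first processor exists.
Specifically, if CPU_b does not exist in NC3, step 640 is performed;
if CPU_b exists in NC3, step 650 is performed.
Step 640: Generate an invalid response message and send it to the second node controller interface, where the DNID of the invalid response message is the source node ID (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
Specifically, in this example, NC3 determines from the processor presence information that CPU_b does not exist. NC3 generates the invalid response message RspI directly from the received snoop request and sends it to NC0, which processes it and sends it to CPU0, notifying CPU0 that the CPU_b it asked to snoop does not exist. The DNID in RspI is the SNID in the SnpInvOwn message, and the SNID in RspI is the DNID in the SnpInvOwn message.
Step 650: Send the snoop message to the first processor interface.
Specifically, if NC3 determines from the processor presence information that CPU_b exists, NC3 sends the snoop to the interface of CPU_b.
The method further includes: when a processor on a node is hot-removed, updating the processor presence information of that node to record that the processor has been removed, marking the processor as not existing.
With the snoop response method provided in Embodiment 2, information about whether each processor on the node is present is simply configured in the node controller. When a processor is removed, the corresponding processor presence information is updated, so the memory of the CPUs of every node in the system need not be refreshed when a processor is removed. The NC handles messages addressed to the removed processor and directly returns an invalid response to the snoop request, which greatly improves system performance and reliability.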
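Steps 610 to 650 of Embodiment 2, together with the presence update on hot removal, can be sketched in the same style. As before this is an illustration under assumptions: the `Msg` tuple, the `PresenceNC` class, the `on_cpu_removed` hook, the integer processor IDs, and the action labels are hypothetical names, and the DNID is assumed to identify the target processor directly.

```python
from collections import namedtuple

# Hypothetical message layout: opcode, source node ID, destination node ID, address.
Msg = namedtuple("Msg", "opcode snid dnid addr")


class PresenceNC:
    """Sketch of the node controller that owns the removed processor (NC3 in the example)."""

    def __init__(self, cpu_present):
        self.cpu_present = dict(cpu_present)   # processor ID -> True if present

    def on_cpu_removed(self, cpu_id):
        # Hot removal: record the processor as not existing.
        self.cpu_present[cpu_id] = False

    def handle_snoop(self, snoop):
        # Steps 620/630: look up the target processor in the presence information.
        if self.cpu_present.get(snoop.dnid, False):
            # Step 650: deliver the snoop to the processor interface.
            return ("to_cpu", snoop)
        # Step 640: reply to the sending NC with swapped SNID/DNID.
        return ("to_nc", Msg("RspI", snoop.dnid, snoop.snid, snoop.addr))
```

For example, a snoop addressed to a processor that was just removed is answered by the NC itself, while a snoop to a processor that is still present is passed through unchanged.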
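Correspondingly, the present invention further discloses a request response apparatus. As shown in FIG. 7, the request response apparatus of this embodiment includes a receiving unit 710, an obtaining unit 720, a first identifying unit 730, a second identifying unit 740, and a processing unit 750. The request response apparatus in this embodiment of the present invention may be a node controller in a multi-node interconnected network; specifically, it may be a node controller chip or a circuit board carrying a node controller chip.
The receiving unit 710 is configured to receive a first message, where the first message is a snoop message from a CPU interface or a data or status request message from another node controller interface.
The obtaining unit 720 is configured to obtain an information directory, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers.
The first identifying unit 730 is configured to look up in the information directory whether the memory address requested by the first message is occupied by the second node controller.
The second identifying unit 740 is configured to: if the memory address requested by the first message is occupied by the second node controller, query node presence information to determine whether the second node controller exists.
The processing unit 750 is configured to: when it is determined that the second node controller does not exist, generate and send an invalid response message, where the destination node ID (DNID) of the invalid response message is the source node ID (SNID) of the first message, and the SNID of the invalid response message is the DNID of the first message.
Further, the request response apparatus includes an information directory management unit 760, configured to change, in the information directory, the information that a memory address of the first node controller is occupied by the second node controller, release the memory address occupied by the second node controller, and update the node presence information to record the second node controller as not existing.
Further, the processing unit 750 is configured to:
if the second node controller exists, process the first message to generate a second message, and send the second message to the second node controller interface.
Further, the processing unit 750 is configured to:
if the memory address requested by the first message is not occupied by the second node controller, and the first message is a snoop message sent by a first processor interface on the first node controller, generate an invalid response message and send it to the first processor interface.
Further, the processing unit 750 is configured to:
if the memory address requested by the first message is not occupied by the second node controller, and the first message is a data or status request message from a third node controller interface, send the request message to the first processor interface so that the first processor records the requested information.
The request response apparatus provided in this embodiment configures node presence information in the apparatus and updates it when an NC node in the system is removed. When a message is addressed to the removed node, the snoop request can be answered directly with an invalid response, and the memory of the CPUs of every node in the system need not be refreshed to record the node removal, which greatly improves system performance and reliability.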
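Correspondingly, the present invention further discloses another snoop response apparatus. As shown in FIG. 8, the request response apparatus of this embodiment includes a receiving unit 810, an obtaining unit 820, an identifying unit 830, and a processing unit 840. The request response apparatus in this embodiment of the present invention may be a node controller in a multi-node interconnected network; specifically, it may be a node controller chip or a circuit board carrying a node controller chip.
The receiving unit 810 is configured to receive a snoop message sent by a second node controller interface, where the destination node ID (DNID) of the snoop message identifies a first processor.
The obtaining unit 820 is configured to obtain processor presence information, where the processor presence information includes information about whether each processor on the current node exists.
The identifying unit 830 is configured to identify, according to the processor presence information, whether the first processor exists.
The processing unit 840 is configured to: if the first processor does not exist, generate an invalid response message and send it to the second node controller interface, where the DNID of the invalid response message is the source node ID (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
Further, the processing unit 840 is configured to:
if the first processor exists, send the snoop message to the first processor interface.
Further, the apparatus includes a processor presence information management unit 850, configured to: when the first processor is removed, update the record of the first processor in the processor presence information to mark the first processor as not existing.
The request response apparatus provided in this embodiment configures processor presence information for the current node in the apparatus and updates it when a processor on the current node is removed. When a snoop is addressed to the removed processor, the snoop request can be answered directly with an invalid response, so the memory of the CPUs of every node in the system need not be refreshed to record the update when a processor is removed, which greatly improves system performance and reliability.
A person skilled in the art may further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The specific implementations described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely specific implementations of the present invention and is not intended to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.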

Claims (16)

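  1. A request response method based on a node controller, characterized in that the method comprises:
    receiving, by a first node controller, a first message, where the first message is a snoop message from a CPU interface or a request message from another node controller interface;
    obtaining an information directory, and looking up in the information directory whether the memory address requested by the first message is occupied by a second node controller, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers;
    if the memory address requested by the first message is occupied by the second node controller, querying node presence information to determine whether the second node controller exists; and
    when it is determined that the second node controller does not exist, generating and sending an invalid response message, where the destination node ID (DNID) of the invalid response message is the source node ID (SNID) of the first message, and the SNID of the invalid response message is the DNID of the first message.
  2. The method according to claim 1, characterized in that, after the invalid response message is generated and sent when it is determined that the second node controller does not exist, the method further comprises:
    changing, in the information directory, the information that a memory address of the first node controller is occupied by the second node controller, releasing the memory address occupied by the second node controller, and updating the node presence information to record the second node controller as not existing.
  3. The method according to claim 1, characterized in that the method further comprises:
    if the second node controller exists, processing the first message to generate a second message, and sending the second message to the second node controller interface.
  4. The method according to claim 1, characterized in that, if the memory address requested by the first message is not occupied by the second node controller, the method further comprises:
    when the first message is a snoop message sent by a first processor interface on the first node controller, generating an invalid response message and sending it to the first processor interface.
  5. The method according to claim 1, characterized in that, if the memory address requested by the first message is not occupied by the second node controller, the method further comprises:
    when the first message is a data or status request message from a third node controller interface, sending the request message to the first processor interface of the first node controller so that the first processor records the requested information.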
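  6. A request response method based on a node controller, characterized in that the method comprises:
    receiving, by a first node controller, a snoop message sent by a second node controller interface, where the destination node ID (DNID) of the snoop message identifies a first processor;
    obtaining processor presence information, where the processor presence information includes information about whether each processor on the current node exists; and
    if the first processor does not exist, generating an invalid response message and sending it to the second node controller interface, where the DNID of the invalid response message is the source node ID (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
  7. The method according to claim 6, characterized in that the method further comprises:
    if the first processor exists, sending the snoop message to the first processor interface.
  8. The method according to claim 6, characterized in that the method further comprises:
    when the first processor is removed, updating the record of the first processor in the processor presence information to mark the first processor as not existing.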
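  9. A request response apparatus, characterized in that the apparatus comprises:
    a receiving unit, configured to receive a first message, where the first message is a snoop message from a CPU interface or a data or status request message from another node controller interface;
    an obtaining unit, configured to obtain an information directory, where the information directory includes information about which memory addresses of the current node controller are occupied by other node controllers;
    a first identifying unit, configured to look up in the information directory whether the memory address requested by the first message is occupied by a second node controller;
    a second identifying unit, configured to: if the memory address requested by the first message is occupied by the second node controller, query node presence information to determine whether the second node controller exists; and
    a processing unit, configured to: when it is determined that the second node controller does not exist, generate and send an invalid response message, where the destination node ID (DNID) of the invalid response message is the source node ID (SNID) of the first message, and the SNID of the invalid response message is the DNID of the first message.
  10. The apparatus according to claim 9, characterized in that the apparatus further comprises an information directory management unit, configured to change, in the information directory, the information that a memory address of the first node controller is occupied by the second node controller, release the memory address occupied by the second node controller, and update the node presence information to record the second node controller as not existing.
  11. The apparatus according to claim 9, characterized in that the processing unit is further configured to:
    if the second node controller exists, process the first message to generate a second message, and send the second message to the second node controller interface.
  12. The apparatus according to claim 9, characterized in that the processing unit is further configured to:
    if the memory address requested by the first message is not occupied by the second node controller, and the first message is a snoop message sent by a first processor interface on the first node controller, generate an invalid response message and send it to the first processor interface.
  13. The apparatus according to claim 9, characterized in that the processing unit is further configured to:
    if the memory address requested by the first message is not occupied by the second node controller, and the first message is a data or status request message from a third node controller interface, send the request message to the first processor interface so that the first processor records the requested information.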
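  14. A request response apparatus, characterized in that the apparatus comprises:
    a receiving unit, configured to receive a snoop message sent by a second node controller interface, where the destination node ID (DNID) of the snoop message identifies a first processor;
    an obtaining unit, configured to obtain processor presence information, where the processor presence information includes information about whether each processor on the current node exists;
    an identifying unit, configured to identify, according to the processor presence information, whether the first processor exists; and
    a processing unit, configured to: if the first processor does not exist, generate an invalid response message and send it to the second node controller interface, where the DNID of the invalid response message is the source node ID (SNID) of the snoop message, and the SNID of the invalid response message is the DNID of the snoop message.
  15. The apparatus according to claim 14, characterized in that the processing unit is further configured to:
    if the first processor exists, send the snoop message to the first processor interface.
  16. The apparatus according to claim 14, characterized in that the apparatus further comprises a processor presence information management unit, configured to: when the first processor is removed, update the record of the first processor in the processor presence information to mark the first processor as not existing.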
PCT/CN2014/085969 2013-09-10 2014-09-05 Request response method and apparatus based on a node controller WO2015035882A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP17200376.6A EP3355181B1 (en) 2013-09-10 2014-09-05 Method and apparatus for responding to request based on node controller
EP14843838.5A EP3046035B1 (en) 2013-09-10 2014-09-05 Request response method and device based on a node controller
US15/066,623 US10324646B2 (en) 2013-09-10 2016-03-10 Node controller and method for responding to request based on node controller

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310410556.3 2013-09-10
CN201310410556.3A CN103488606B (zh) 2013-09-10 2013-09-10 基于节点控制器的请求响应方法和装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/066,623 Continuation US10324646B2 (en) 2013-09-10 2016-03-10 Node controller and method for responding to request based on node controller

Publications (1)

Publication Number Publication Date
WO2015035882A1 true WO2015035882A1 (zh) 2015-03-19

Family

ID=49828850

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/085969 WO2015035882A1 (zh) 2013-09-10 2014-09-05 Request response method and apparatus based on a node controller

Country Status (4)

Country Link
US (1) US10324646B2 (zh)
EP (2) EP3355181B1 (zh)
CN (1) CN103488606B (zh)
WO (1) WO2015035882A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
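CN103488606B (zh) * 2013-09-10 2016-08-17 华为技术有限公司 Request response method and apparatus based on a node controller
CN105009086B (zh) * 2014-03-10 2019-01-18 华为技术有限公司 Method, computer, and switching apparatus for implementing processor switching
CN106708551B (zh) * 2015-11-17 2020-01-17 华为技术有限公司 Configuration method and system for hot-adding a central processing unit (CPU)
CN111917656B (zh) * 2017-07-27 2023-11-07 超聚变数字技术有限公司 Method and device for transmitting data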

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102439571A (zh) * 2011-10-27 2012-05-02 华为技术有限公司 Method for preventing node controller deadlock, and node controller
US20120151458A1 (en) * 2010-12-08 2012-06-14 Oracle International Corporation System and method for removal of arraycopies in java by cutting the length of arrays
CN103488606A (zh) * 2013-09-10 2014-01-01 华为技术有限公司 Request response method and apparatus based on a node controller

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6266743B1 (en) * 1999-02-26 2001-07-24 International Business Machines Corporation Method and system for providing an eviction protocol within a non-uniform memory access system
US7234029B2 (en) * 2000-12-28 2007-06-19 Intel Corporation Method and apparatus for reducing memory latency in a cache coherent multi-node architecture
US6754782B2 (en) * 2001-06-21 2004-06-22 International Business Machines Corporation Decentralized global coherency management in a multi-node computer system
JP4401899B2 (ja) * 2004-08-26 2010-01-20 パイオニア株式会社 Node presence confirmation method and node presence confirmation apparatus
US7222200B2 (en) * 2004-10-14 2007-05-22 Dell Products L.P. Method for synchronizing processors in SMI following a memory hot plug event
US20060253662A1 (en) * 2005-05-03 2006-11-09 Bass Brian M Retry cancellation mechanism to enhance system performance
JP4848771B2 (ja) * 2006-01-04 2011-12-28 株式会社日立製作所 Cache coherency control method, chipset, and multiprocessor system
US20080005486A1 (en) * 2006-06-29 2008-01-03 Mannava Phanindra K Coordination of snoop responses in a multi-processor system
CN101414268A (zh) * 2007-10-15 2009-04-22 南京大学 Method for managing processor hot plug on an ARM MPCore processor
FR2927437B1 (fr) * 2008-02-07 2013-08-23 Bull Sas Multiprocessor computer system
US8006135B2 (en) * 2009-01-14 2011-08-23 International Business Machines Corporation Method and system for remote node debugging using an embedded node controller
CN102577247B (zh) * 2009-10-23 2016-04-06 瑞典爱立信有限公司 Transferring a media session from a first local network UE to a second local network UE using a UE connected to an external network
WO2012008008A1 (ja) * 2010-07-12 2012-01-19 富士通株式会社 Information processing system
JP5505516B2 (ja) * 2010-12-06 2014-05-28 富士通株式会社 Information processing system and information transmission method
CN102023898A (zh) * 2010-12-21 2011-04-20 中兴通讯股份有限公司 Method and apparatus for implementing CPU hot plug
JPWO2012124094A1 (ja) * 2011-03-16 2014-07-17 富士通株式会社 Directory cache control device, directory cache control circuit, and directory cache control method
JP5929420B2 (ja) * 2012-03-29 2016-06-08 富士通株式会社 Arithmetic processing device, control method for arithmetic processing device, and information processing device
CN102662770B (zh) * 2012-04-28 2014-02-19 中国人民解放军国防科学技术大学 Node synchronization method in a distributed virtual test system
CN103020004B (zh) * 2012-12-14 2015-09-09 杭州华为数字技术有限公司 Access method and apparatus for a cache-coherent non-uniform memory access system
CN103150264B (zh) * 2013-01-18 2014-09-17 浪潮电子信息产业股份有限公司 Multi-level coherence domain simulation verification and test method based on an extended cache coherence protocol

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120151458A1 (en) * 2010-12-08 2012-06-14 Oracle International Corporation System and method for removal of arraycopies in java by cutting the length of arrays
CN102439571A (zh) * 2011-10-27 2012-05-02 华为技术有限公司 Method for preventing node controller deadlock, and node controller
CN103488606A (zh) * 2013-09-10 2014-01-01 华为技术有限公司 Request response method and apparatus based on a node controller

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3046035A4 *

Also Published As

Publication number Publication date
CN103488606A (zh) 2014-01-01
EP3355181B1 (en) 2020-04-22
US20160196087A1 (en) 2016-07-07
EP3355181A1 (en) 2018-08-01
US10324646B2 (en) 2019-06-18
EP3046035A4 (en) 2016-10-05
CN103488606B (zh) 2016-08-17
EP3046035B1 (en) 2018-04-25
EP3046035A1 (en) 2016-07-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14843838

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2014843838

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014843838

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE