WO2014059576A1 - Request message processing method, sending method, node and system - Google Patents

Request message processing method, sending method, node and system Download PDF

Info

Publication number
WO2014059576A1
Authority
WO
WIPO (PCT)
Prior art keywords
request message
node
address
memory address
request
Prior art date
Application number
PCT/CN2012/082955
Other languages
English (en)
French (fr)
Inventor
杨宝川
王工艺
程永波
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201280002077.1A priority Critical patent/CN103181132B/zh
Priority to PCT/CN2012/082955 priority patent/WO2014059576A1/zh
Publication of WO2014059576A1 publication Critical patent/WO2014059576A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Definitions

  • Embodiments of the present invention relate to computer technologies, and in particular to a request message processing method, a sending method, a node, and a system for a parallel computer system. Background technique
  • Parallel computer systems built on the Cache-Coherent Non-Uniform Memory Access (CC-NUMA) architecture are a commonly used parallel computing technology.
  • In such systems, the processors of all nodes share one memory space; that is, the physical memory of each node can be accessed by the other nodes.
  • When the node controller (NC) of the local node receives and processes multiple request messages from remote nodes, it usually processes them first-come, first-served and temporarily stores the remaining request messages. As a result, the home agent controller in the NC of the local node must cache all request messages and therefore needs a rather complex logic scale; at the same time, because a large number of request messages sit in the home agent controller's cache, the local node easily becomes a system bottleneck, or even deadlocks, rendering it unusable.
  • In summary, because the nodes of existing parallel computer systems process requests first-come, first-served and must store all request messages, a node's hardware resource consumption is large, the design of its NC architecture is complicated, and its cost is high; moreover, the heavy resource consumption limits the expansion of the parallel computer system, caps the number of nodes that can be added, and easily creates system bottlenecks.
  • The embodiments of the present invention provide a request message processing method, a sending method, a node, and a system, which can reduce the hardware resources a node consumes when processing request messages.
  • In a first aspect, an embodiment of the present invention provides a request message processing method, including: receiving a request message sent by a remote node; determining whether the request message is a winner message; if yes, processing the request message; otherwise, rejecting the request message;
  • where the winner message refers to a request message that the local node can preferentially process.
  • In a second aspect, an embodiment of the present invention further provides a node, including:
  • a request message receiving module, configured to receive a request message sent by a remote node;
  • a request message processing module, configured to determine whether the request message is a winner message and, if yes, to process the request message, and otherwise to reject the request message;
  • where the winner message refers to a request message that the local node can preferentially process.
  • In a third aspect, an embodiment of the present invention provides a request message processing system, including a local node and a remote node, where the local node is configured to process the request messages sent by the remote node and includes the node provided by the foregoing embodiments of the present invention.
  • In a fourth aspect, an embodiment of the present invention provides a request message sending method, including:
  • when there is a request message to send, acquiring the memory address to be requested by the request message; and, when the memory address is found in the overheated address table to be an overheated address, canceling the sending of the request message, where the overheated address table stores memory addresses that are overheated addresses.
  • In a fifth aspect, an embodiment of the present invention provides a node, including:
  • a memory address acquiring module, configured to acquire, when there is a request message to send, the memory address to be requested by the request message;
  • a request message sending processing module, configured to cancel the sending of the request message when the memory address is an overheated address, where the overheated address table stores memory addresses that are overheated addresses.
  • In these embodiments, when request messages are processed, only the request message sent by a remote node that is the winner message is processed, and the other, non-winner request messages are rejected. This effectively reduces the node hardware resources occupied by request message processing, reduces the complexity of the NC architecture design of nodes in the parallel computer system, and reduces node cost. At the same time, because each node occupies fewer hardware resources, the number of nodes added when the parallel computer system is expanded is not restricted by excessive hardware resource demands, improving the convenience of node expansion in a parallel computer system.
  • FIG. 1 is a schematic flowchart of a request message processing method according to Embodiment 1 of the present invention;
  • FIG. 2 is a schematic flowchart of a request message processing method according to Embodiment 2 of the present invention;
  • FIG. 3 is a schematic flowchart of a request message processing method according to Embodiment 3 of the present invention;
  • FIG. 4 is a schematic flowchart of a request message processing method according to Embodiment 4 of the present invention;
  • FIG. 5 is a schematic structural diagram of a node according to Embodiment 5 of the present invention.
  • FIG. 6 is a schematic structural diagram of a node according to Embodiment 6 of the present invention.
  • FIG. 7 is a schematic structural diagram of a node according to Embodiment 7 of the present invention.
  • FIG. 8 is a schematic structural diagram of a node according to Embodiment 8 of the present invention.
  • FIG. 9 is a schematic structural diagram of a node according to Embodiment 9 of the present invention.
  • FIG. 10 is a schematic structural diagram of a request packet processing system according to Embodiment 10 of the present invention.
  • The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
  • FIG. 1 is a schematic flowchart of a request message processing method according to Embodiment 1 of the present invention.
  • This embodiment can be applied to the processing by a local node of requests from remote nodes in a parallel computer system.
  • As shown in FIG. 1, the request message processing method in this embodiment may include the following steps:
  • Step 101: The local node receives a request message sent by a remote node;
  • Step 102: The local node determines whether the request message is a winner message; if yes, step 103 is performed; otherwise, step 104 is performed, where the winner message refers to a request message that the local node can preferentially process;
  • Step 103: The local node processes the request message, and the process ends;
  • Step 104: The local node rejects the request message.
  • In this embodiment, when a local node in a parallel computer system receives request messages sent by remote nodes, it only needs to process the winner message and rejects the non-winner messages.
  • The local node therefore only needs to cache the winner message rather than a large number of request messages, which greatly saves node hardware resources: the NC architecture of the node needs only a small buffer space to satisfy the processing needs of request messages, effectively reducing the complexity of the node's NC architecture design and the node's hardware cost.
  • Because the hardware resource occupation is small, the number of nodes in the parallel computer system can be expanded without restriction, without concern that expansion will fail because node hardware resources are insufficient. In addition, this also avoids the problem of the node deadlocking, or even becoming unusable, because of excessive resource consumption.
  • In summary, the request message processing method of this embodiment processes only the winner message among the request messages sent by remote nodes and rejects the other, non-winner request messages, thereby effectively reducing the node hardware resources occupied by request message processing, reducing the complexity of the NC architecture design of nodes in the parallel computer system, and reducing node cost. At the same time, because a node occupies fewer hardware resources for message processing, the number of nodes added when the parallel computer system is expanded is not restricted by excessive hardware resource demands, which effectively improves the convenience of node expansion in parallel computer systems.
  • FIG. 2 is a schematic flowchart of a request message processing method according to Embodiment 2 of the present invention.
  • In this embodiment, the local node may determine the winner message according to whether a request message is the first to arrive, or according to the number of times a request message has been rejected.
  • As shown in FIG. 2, the request message processing method in this embodiment may include the following steps:
  • Step 201: The local node receives a request message sent by a remote node;
  • Step 202: Determine whether the local node has a request message being processed; if yes, perform step 203; otherwise, perform step 204;
  • Step 203: The local node rejects the request message and records the number of times the request message has been rejected, and the process ends;
  • Step 204: The local node determines whether the request message is the first to arrive; if yes, step 205 is performed; otherwise, step 206 is performed;
  • Step 205: Mark the request message as the winner message, and perform step 208;
  • Step 206: Mark the request message as the winner message if it is the most-rejected request message;
  • Step 207: Determine whether the request message is the winner message; if yes, go to step 208; otherwise, go to step 209;
  • Step 208: Process the request message, and the process ends;
  • Step 209: Reject the request message and record the number of times it has been rejected, and the process ends.
  • In this embodiment, the request message that first arrives at the local node may serve as the winner message, and the other request messages serve as contender messages. All contender messages are rejected and their rejection counts recorded. Once the winner message has been processed and the local node has no request message in flight, the request message with the most rejections can be made the winner message, so that it will be processed the next time it is received.
  • A person skilled in the art will understand that in a parallel computer system, a remote node whose request message is rejected by the local node keeps resending it, so the request message that was sent to the local node first also accumulates a large rejection count. Determining the winner message by rejection count therefore satisfies the goal of prioritizing the earliest-sent message and avoids excessively long request-processing delays.
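The first-arrival / most-rejected selection described above can be sketched in code. This is an illustrative model only: the names (`LocalNode`, `receive`, `finish`) and the per-message rejection table are assumptions made for the sketch, not structures prescribed by the patent.

```python
from collections import defaultdict

class LocalNode:
    """Sketch of the Embodiment-2 flow (FIG. 2)."""

    def __init__(self):
        self.busy = False                   # a request message is being processed
        self.winner = None                  # id of the current winner message
        self.rejections = defaultdict(int)  # per-message rejection counts

    def receive(self, msg_id):
        # Steps 202/203: while busy, reject and record the rejection.
        if self.busy:
            self.rejections[msg_id] += 1
            return "rejected"
        # Steps 204-206: the first arrival wins; otherwise the most-rejected
        # contender is promoted to winner.
        if self.winner is None:
            most = max(self.rejections, key=self.rejections.get, default=None)
            self.winner = msg_id if most is None else most
        # Steps 207-209: process the winner; reject and count everyone else.
        if msg_id == self.winner:
            self.busy = True                # Step 208: begin processing
            return "processed"
        self.rejections[msg_id] += 1        # Step 209
        return "rejected"

    def finish(self):
        # Processing done: clear state so the next winner can be chosen.
        self.busy = False
        self.rejections.pop(self.winner, None)
        self.winner = None
```

Here `finish` models completion of the winner's processing, after which the most-rejected contender is promoted on its next arrival.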
  • In other embodiments, the winner message may also be determined according to the type or priority of the request message, either alone or together with the number of times the request message has been rejected; the embodiments of the present invention place no particular limitation on this.
  • The number of winner messages can be determined according to the hardware resources of the node, for example the pipeline depth and cache resources of the local node; this embodiment imposes no specific limit, as long as the local node can process the corresponding number of winner messages.
  • The processing above may apply to request messages for the same memory address of the local node; request messages for the other memory addresses of the local node can be handled in the same way as described for a single memory address.
  • Request messages for each memory address are thus handled with the same priority, which avoids the problem of request messages for some memory addresses going unprocessed for a long time.
  • In this embodiment, the local node stores the rejection count of every request message, so that the winner message can be determined from those counts.
  • Alternatively, the local node may store only the rejection count of the most-rejected request message. After receiving a request message, the local node can then decide whether it is the winner message from the rejection count carried in the message itself, so the local node needs only minimal hardware resources to store and manage rejection counts, further reducing resource consumption during request message processing. This is explained below with a specific example.
  • FIG. 3 is a schematic flowchart of a request message processing method according to Embodiment 3 of the present invention. Specifically, as shown in FIG. 3, the request message processing method in this embodiment may include the following steps:
  • Step 301: The remote node sends a request message to the local node, where the request message carries the number of times it has been rejected by the local node;
  • Step 302: The local node receives the request message sent by the remote node and determines whether it has a request message being processed; if yes, step 303 is performed; otherwise, step 304 is performed;
  • Step 303: The local node determines whether the rejection count carried in the request message equals the maximum rejection count in the local node's records; if yes, it rejects the request message and increments the recorded maximum by one, and the process ends; otherwise, it rejects the request message, and the process ends;
  • Step 304: The local node determines whether the request message is the first to arrive; if yes, step 305 is performed; otherwise, step 306 is performed;
  • Step 305: Mark the request message as the winner message, and perform step 308;
  • Step 306: Determine whether the rejection count carried in the request message equals the maximum rejection count recorded by the local node; if yes, mark the request message as the winner message and go to step 308; otherwise, go to step 307;
  • Step 307: Reject the request message, and the process ends;
  • Step 308: Clear the recorded maximum rejection count, process the request message, and the process ends.
  • The rejection count carried in the request message sent by the remote node may be carried in the rejection response message returned by the local node when it rejects the request, or the remote node may compute it from the rejection responses it has received; the embodiments of the present invention place no particular limitation on this.
  • Because the rejection count is carried in the request message, the local node only needs to record the maximum rejection count, which further reduces the hardware resources the local node consumes when processing request messages.
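The count-carrying variant can be sketched as follows; the single `max_rejections` counter is the only rejection state the node keeps. The class and method names and the exact tie-handling are illustrative assumptions, not taken from the patent text.

```python
class LocalNode:
    """Sketch of the Embodiment-3 flow (FIG. 3): requests carry their own
    rejection counts, so the node stores only the maximum it has seen."""

    def __init__(self):
        self.busy = False
        self.first_seen = False
        self.max_rejections = 0  # the only rejection state the node keeps

    def receive(self, carried_rejections):
        # Steps 302/303: while busy, reject; bump the recorded maximum when
        # the carried count ties it, so the front-runner keeps its lead.
        if self.busy:
            if carried_rejections == self.max_rejections:
                self.max_rejections += 1
            return "rejected"
        # Steps 304/305: the first message ever to arrive is processed
        # outright. Step 306: otherwise only a message whose carried count
        # equals the recorded maximum becomes the winner.
        if not self.first_seen or carried_rejections == self.max_rejections:
            self.first_seen = True
            self.busy = True
            self.max_rejections = 0  # Step 308: clear the recorded maximum
            return "processed"
        return "rejected"            # Step 307

    def finish(self):
        self.busy = False
```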
  • In this embodiment, the local node may also notify remote nodes to stop sending requests for an overheated address, so as to avoid an excessive number of requests for that address and save bandwidth. Specifically, the embodiment may further include the following steps: when, among all request messages for a memory address other than the winner message, the rejection count of the most-rejected request message is greater than a first preset threshold, the memory address is sent to each remote node with a notification that it is an overheated address, so that the remote nodes no longer initiate requests for that memory address;
  • when, among all request messages for the memory address other than the winner message, the rejection count of the most-rejected request message is less than a second preset threshold, the memory address is sent to each remote node with a notification that it is a non-overheated address, so that the remote nodes may again initiate requests for it. In this way, when too many requests target a certain memory address, the remote nodes can be told to suspend requests for that address, avoiding the heavy consumption of the parallel computer system's bandwidth that excessive request messages and their processing would cause.
  • Correspondingly, each remote node may maintain an overheated address table that stores the memory addresses other nodes have announced as overheated. Before sending a new request message, the remote node checks whether the memory address the new request message is to request is an overheated address; if yes, it stops sending the request message; otherwise, the request message is arbitrated and sent.
  • The first preset threshold and the second preset threshold may be set to appropriate values as required, for example according to the network communication conditions of the local node.
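The two-threshold notification logic might be sketched as follows. The threshold values are assumed (the text leaves them configurable), and `mark_hot`/`mark_cold` stand in for whatever notification messages the interconnect would actually carry; `StubRemote` is a test double, not part of the described system.

```python
FIRST_THRESHOLD = 8   # assumed value for the first preset threshold
SECOND_THRESHOLD = 2  # assumed value for the second preset threshold

class StubRemote:
    """Minimal stand-in for a remote node's overheated address table."""
    def __init__(self):
        self.table = set()
    def mark_hot(self, address):
        self.table.add(address)
    def mark_cold(self, address):
        self.table.discard(address)

class HomeAgent:
    """Sketch of the local node's overheated-address notifications."""
    def __init__(self, remote_nodes):
        self.remote_nodes = remote_nodes
        self.hot = set()

    def update(self, address, most_rejected_count):
        # Crossing the first threshold: broadcast the address as overheated
        # so remote nodes stop requesting it.
        if most_rejected_count > FIRST_THRESHOLD and address not in self.hot:
            self.hot.add(address)
            for node in self.remote_nodes:
                node.mark_hot(address)
        # Falling below the second threshold: broadcast that the address is
        # no longer overheated so remote nodes may request it again.
        elif most_rejected_count < SECOND_THRESHOLD and address in self.hot:
            self.hot.remove(address)
            for node in self.remote_nodes:
                node.mark_cold(address)
```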
  • In this embodiment, after a request message is determined to be the winner message, the local node may first check whether the memory address it requests is available, and process the request only if it is. Specifically, the embodiment may further include: querying whether the state of the memory address to be requested by the winner message is available; rejecting the request when the state is unavailable; and initiating the data request for the memory address, and processing the request message, only when the state is available.
  • When the local node finds that the memory address requested by the request message is unavailable, for example because it is occupied by another request, it can monitor the state of the memory address and process the request message once the address is no longer occupied.
  • The NC of the local node can keep a directory recording the state of every memory address under its jurisdiction. The directory indicates whether each memory address is occupied: an unoccupied address can be recorded in the invalid state, while an occupied address is recorded in a non-invalid state. The NC of the local node can monitor the state of each memory address in real time and update the directory accordingly.
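A minimal sketch of such a directory follows. The two-state model (invalid means unoccupied, any other state means occupied) comes from the text; the class, the state names other than "invalid", and the method names are illustrative assumptions.

```python
class Directory:
    """Sketch of the NC directory of memory-address states."""

    def __init__(self):
        # address -> "invalid" (unoccupied) or a non-invalid state
        self.state = {}

    def is_available(self, address):
        # An address absent from the directory, or recorded as invalid,
        # is unoccupied and may be granted to the winner message.
        return self.state.get(address, "invalid") == "invalid"

    def occupy(self, address, new_state="exclusive"):
        # Record a non-invalid state while a request holds the address.
        self.state[address] = new_state

    def release(self, address):
        # The request completed: the address returns to the invalid state.
        self.state[address] = "invalid"
```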
  • In the embodiments of the present invention, the local node and the remote node may each include a home agent and a cache agent (CA). The home agent processes the request messages received from outside and may cache them, while the CA handles the requests to be sent by the core processors (cores) in the node and dispatches them to the corresponding nodes.
  • FIG. 4 is a schematic flowchart of a request message processing method according to Embodiment 4 of the present invention.
  • This embodiment can be applied to the sending of request messages by a remote node to the local node in a parallel computer system. Specifically, as shown in FIG. 4, when the remote node needs to send a request message, the following steps may be included:
  • Step 401: When there is a request message to send, obtain the memory address to be requested by the request message;
  • Step 402: When the memory address is found in the overheated address table to be an overheated address, cancel the sending of the request message, where the overheated address table stores the memory addresses that are overheated.
  • In this embodiment, the remote node may receive the overheated addresses announced by other nodes, including the local node, and record them in its overheated address table, so that it can send request messages based on that table.
  • Specifically, a request message to be sent by a core processor in the node is first arbitrated to determine whether it may be sent to its destination address: a request whose memory address matches an overheated address in the overheated address table does not take part in arbitration and is not sent;
  • requests whose memory addresses do not match any overheated address are sent by polling according to the round-robin fairness principle.
  • A request message that passes arbitration reaches the node controller of the local node through the inter-chip interconnection network for processing by the local node.
  • On receipt, the local node may process it according to the scheme shown in FIG. 1 or FIG. 2, rejecting it while it is a non-winner message until it is marked as the winner message.
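The sending-side filtering and round-robin arbitration can be sketched as follows. The class and method names are assumptions for illustration; a real NC would implement this in hardware, not software.

```python
class RemoteNode:
    """Sketch of the Embodiment-4 sending flow (FIG. 4)."""

    def __init__(self):
        self.hot_table = set()  # overheated addresses learned from other nodes
        self.rr = 0             # round-robin pointer for fair arbitration

    def arbitrate(self, pending):
        """pending: list of (core_id, memory_address) requests awaiting send.
        Returns the request chosen to be sent, or None if all were cancelled."""
        # Steps 401/402: requests for overheated addresses are withheld
        # from arbitration and not sent.
        eligible = [req for req in pending if req[1] not in self.hot_table]
        if not eligible:
            return None
        # Round-robin fairness over the surviving requests.
        chosen = eligible[self.rr % len(eligible)]
        self.rr += 1
        return chosen
```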
  • In the embodiments of the present invention, the local node and the remote node each include a node controller and multiple core processors; the node controller may include a home agent controller and a cache agent controller, and may otherwise have the same or a similar structure as the nodes in existing parallel computer systems, which is not described again here.
  • FIG. 5 is a schematic structural diagram of a node according to Embodiment 5 of the present invention.
  • The node in this embodiment may be a node in a parallel computer system and may implement the method shown in FIG. 1, FIG. 2, or FIG. 3.
  • As shown in FIG. 5, the node in this embodiment may include a request message receiving module 11 and a request message processing module 12, where:
  • the request message receiving module 11 is configured to receive a request message sent by a remote node;
  • the request message processing module 12 is configured to determine whether the request message is a winner message and, if yes, to process the request message, and otherwise to reject it, where the winner message refers to a request message that the local node can preferentially process.
  • The node may process received request messages according to the method steps shown in FIG. 1, FIG. 2, or FIG. 3; for the specific implementation, refer to the description of Embodiment 1, 2, or 3 of the method of the present invention, which is not repeated here.
  • The node of this embodiment can also implement the other functions of a node in an existing parallel computer system, which are not described here.
  • FIG. 6 is a schematic structural diagram of a node according to Embodiment 6 of the present invention.
  • This embodiment can implement the method of the embodiment shown in FIG. 2. Specifically, building on the technical solution of the embodiment shown in FIG. 5,
  • the node in this embodiment may further include a winner message determining module 13, configured to determine, when the local node has no request message being processed, whether the request message is the first to arrive; if yes, the request message is marked as the winner message; otherwise, the request message is marked as the winner message if it is the most-rejected request message.
  • The request message processing module 12 of the node of this embodiment is further configured to reject the request message when the local node has a request message being processed, and to record the number of times the request message has been rejected.
  • The request message processing module 12 of the node in this embodiment may be further configured to query, when the request message is the winner message, whether the state of the memory address requested by the request message is available, and to reject the request message when the state is unavailable.
  • FIG. 7 is a schematic structural diagram of a node according to Embodiment 7 of the present invention.
  • In this embodiment, the request message sent by the remote node carries its rejection count, and the local node records the maximum rejection count among all request messages. Specifically, building on the technical solution of the embodiment shown in FIG. 5,
  • the request message processing module 12 may include a determining unit 121 and a first request message processing unit 122, where:
  • the determining unit 121 is configured to determine whether the local node has a request message being processed; and the first request message processing unit 122 is configured to determine, when the local node has no request message being processed, whether the rejection count carried in the request message equals the maximum rejection count among all request messages recorded by the local node; if yes, it takes the request message as the winner message, processes it, and clears the recorded maximum rejection count; otherwise, it rejects the request message.
  • The request message processing module 12 may further include a second request message processing unit 123, configured to determine, when the local node has a request message being processed, whether the rejection count carried in the request message equals the maximum rejection count among all request messages recorded by the local node; if yes, it increments the recorded maximum by one and rejects the request message; otherwise, it directly rejects the request message.
  • This embodiment can implement the method of the embodiment shown in FIG. 3; for the specific implementation process, refer to the description of Embodiment 3 of the method of the present invention, which is not repeated here.
  • FIG. 8 is a schematic structural diagram of a node according to Embodiment 8 of the present invention.
  • Building on the foregoing embodiments, the node in this embodiment may further include an overheated address notification module 15, configured to send, when the rejection count of the most-rejected request message for a memory address (other than the winner message) is greater than a first preset threshold, the memory address to each remote node with a notification that it is an overheated address, so that the remote nodes no longer initiate request messages for that memory address.
  • The node may further include a non-overheated address notification module 16, configured to send, when the rejection count of the most-rejected request message for the memory address (other than the winner message) is less than a second preset threshold, the memory address to each remote node with a notification that it is a non-overheated address, so that the remote nodes may again initiate requests for it.
  • FIG. 9 is a schematic structural diagram of a node according to Embodiment 9 of the present invention.
  • The node may be a node of a parallel computer system and may send request messages to other nodes.
  • As shown in FIG. 9, the node in this embodiment may include a memory address obtaining module 21 and a request message sending processing module 22, where:
  • the memory address obtaining module 21 is configured to obtain, when there is a request message to send, the memory address to be requested by the request message;
  • the request message sending processing module 22 is configured to cancel the sending of the request message when the memory address is an overheated address, where the overheated address table stores the memory addresses that are overheated.
  • The node in this embodiment may further include an overheated address receiving processing module 23, configured to receive a memory address announced by another node as an overheated address and add it to the overheated address table,
  • and a non-overheated address receiving processing module 24, configured to receive a memory address announced by another node as a non-overheated address and delete it from the overheated address table.
  • Before request messages are sent, the node can thus filter out the request messages for overheated addresses and arbitrate only the request messages for non-overheated addresses, avoiding the sending of large numbers of request messages that cannot be processed.
  • For the specific implementation, refer to the description of the method embodiment shown in FIG. 4, which is not repeated here.
  • FIG. 10 is a schematic structural diagram of a request message processing system according to Embodiment 10 of the present invention.
  • The request message processing system of this embodiment may be a parallel computer system, including a local node 10 and a remote node 20. The local node 10 can be used to process the request messages sent by the remote node 20, and specifically may include the node shown in FIG. 5, FIG. 6, FIG. 7, or FIG. 8.
  • The remote node 20 may include the node shown in FIG. 9; for its specific structure, refer to FIG. 9 above, which is not described again here.
  • In this embodiment, the local node 10 and the remote node 20 may each include both the node of FIG. 5, FIG. 6, FIG. 7, or FIG. 8 and the node shown in FIG. 9; that is, the local node 10 and the remote node 20 may have the same structure and functions, so that both can receive request messages sent by other nodes and can also send request messages to other nodes.
  • For the specific implementation, refer to the foregoing method embodiments and device embodiments of the present invention, which are not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a request message processing method, a sending method, a node, and a system. The method includes: receiving a request message sent by a remote node; determining whether the request message is a winner message, and if so, processing the request message, otherwise rejecting the request message; where the winner message is a request message that the local node may process with priority. In the technical solutions of the embodiments of the present invention, a node need only process winner messages and rejects all other, non-winner messages, thereby reducing the hardware resources a node consumes when processing request messages.

Description

Request message processing method, sending method, node, and system

TECHNICAL FIELD

Embodiments of the present invention relate to computer technologies, and in particular, to a request message processing method, a sending method, a node, and a system in a parallel computer system.

BACKGROUND

With the development of information technology, there is growing demand for applications involving massive data processing, cloud storage, and cloud computing, and parallel computing technology can serve such applications well. Among parallel computing technologies, a parallel computer system using the Cache-Coherent Non-Uniform Memory Access (CC-NUMA) architecture is in common use.

At present, in a Non-Uniform Memory Access (NUMA) parallel computer system, the processors of the nodes share the memory space; that is, the physical memory of each node can be accessed by other nodes. When the Node Controller (NC) of a local node receives and processes multiple request messages from remote nodes, it usually adopts a first-come-first-served method that buffers the other pending requests. This requires the Home Agent controller in the local node's NC to buffer all request messages, so the home agent controller needs a relatively complex logic scale; meanwhile, because a large number of request messages are stored in the home agent controller's buffer, the local node easily becomes a system bottleneck or even deadlocks, rendering the local node unusable.

In summary, in the first-come-first-served, buffer-the-rest request message processing method used by nodes of existing parallel computer systems, a node needs to store all request messages. This consumes a large amount of the node's hardware resources, makes the node's NC architecture design complex, and raises cost. Moreover, because node resource consumption is high, expansion of the parallel computer system is restricted: the number of nodes that can be added is limited, easily creating system bottlenecks.

SUMMARY
Embodiments of the present invention provide a request message processing method, a sending method, a node, and a system, which can reduce the hardware resources a node consumes when processing request messages.

In a first aspect, an embodiment of the present invention provides a request message processing method, including: receiving a request message sent by a remote node;

determining whether the request message is a winner message, and if so, processing the request message; otherwise, rejecting the request message;

where the winner message is a request message that the local node may process with priority.

In a second aspect, an embodiment of the present invention further provides a node, including:

a request message receiving module, configured to receive a request message sent by a remote node; and

a request message processing module, configured to determine whether the request message is a winner message, and if so, process the request message; otherwise, reject the request message;

where the winner message is a request message that the local node may process with priority.

In a third aspect, an embodiment of the present invention provides a request message processing system, including a local node and a remote node, where the local node is configured to process request messages sent by the remote node and includes the node provided by the foregoing embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention provides a request message sending method, including:

when a request message is to be sent, obtaining the memory address requested by the request message; and when the memory address is found in an overheated address table to be an overheated address, canceling the sending of the request message, where the overheated address table stores memory addresses that are overheated addresses.

In a fifth aspect, an embodiment of the present invention provides a node, including:

a memory address obtaining module, configured to obtain, when a request message is to be sent, the memory address requested by the request message; and

a request message sending processing module, configured to cancel the sending of the request message when the memory address is found in an overheated address table to be an overheated address, where the overheated address table stores memory addresses that are overheated addresses.

When processing messages, the embodiments of the present invention need only process the request messages sent by remote nodes that are winner messages, and reject the request messages that are non-winner messages. This effectively reduces the node hardware resources occupied when processing request messages, lowers the complexity of the NC architecture design of nodes in a parallel computing system, and lowers node cost. Meanwhile, because message processing occupies few node hardware resources, when the number of nodes in a parallel computer system is expanded, the expansion is not limited by excessive hardware resource consumption for processing request messages, which effectively improves the convenience of node expansion in a parallel computer system.

BRIEF DESCRIPTION OF THE DRAWINGS

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.
FIG. 1 is a schematic flowchart of a request message processing method according to Embodiment 1 of the present invention;

FIG. 2 is a schematic flowchart of a request message processing method according to Embodiment 2 of the present invention;

FIG. 3 is a schematic flowchart of a request message processing method according to Embodiment 3 of the present invention;

FIG. 4 is a schematic flowchart of a request message processing method according to Embodiment 4 of the present invention;

FIG. 5 is a schematic structural diagram of a node according to Embodiment 5 of the present invention;

FIG. 6 is a schematic structural diagram of a node according to Embodiment 6 of the present invention;

FIG. 7 is a schematic structural diagram of a node according to Embodiment 7 of the present invention;

FIG. 8 is a schematic structural diagram of a node according to Embodiment 8 of the present invention;

FIG. 9 is a schematic structural diagram of a node according to Embodiment 9 of the present invention;

FIG. 10 is a schematic structural diagram of a request message processing system according to Embodiment 10 of the present invention.

DETAILED DESCRIPTION

To make the objectives, technical solutions, and advantages of the present invention clearer, the following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
FIG. 1 is a schematic flowchart of a request message processing method according to Embodiment 1 of the present invention. This embodiment may be applied to the processing, by a local node in a parallel computer system, of request messages from remote nodes. As shown in FIG. 1, the request message processing method of this embodiment may include the following steps:

Step 101: The local node receives a request message sent by a remote node.

Step 102: The local node determines whether the request message is a winner message; if so, Step 103 is performed; otherwise, Step 104 is performed, where the winner message is a request message that the local node may process with priority.

Step 103: The local node processes the request message; end.

Step 104: Reject the request message.

In this embodiment, when the local node in a parallel computer system receives request messages sent by remote nodes, it only needs to process winner messages and rejects all non-winner messages. In this way, the local node only needs to buffer the winner message rather than a large number of request messages, which greatly saves the node's hardware resources: the node's NC architecture needs only a small amount of buffer space to satisfy request message processing, which effectively lowers the complexity of the node's NC architecture design and its hardware cost. Meanwhile, because the node only needs to process winner messages and occupies few hardware resources, the number of nodes in the parallel computer system can be expanded without restriction, without worrying that system expansion becomes impossible due to insufficient node hardware resources; in addition, this also prevents the node from deadlocking, or even becoming unusable, because of excessive resource consumption.

In summary, the request message processing method provided by this embodiment of the present invention only needs to process request messages sent by remote nodes that are winner messages and rejects all other non-winner request messages, thereby effectively reducing the node hardware resources occupied by request message processing, lowering the complexity of the NC architecture design of nodes in a parallel computing system, and lowering node cost. Meanwhile, because message processing occupies few node hardware resources, when the number of nodes in the parallel computer system is expanded, the expansion is not limited by excessive hardware resource consumption for processing request messages, which effectively improves the convenience of node expansion in a parallel computer system.
FIG. 2 is a schematic flowchart of a request message processing method according to Embodiment 2 of the present invention. In this embodiment, the local node may determine the winner message according to whether a request message is the first-arrived message or according to the number of times a request message has been rejected. Specifically, as shown in FIG. 2, the request message processing method of this embodiment may include the following steps:

Step 201: The local node receives a request message sent by a remote node.

Step 202: Determine whether the local node has a request message being processed; if so, perform Step 203; otherwise, perform Step 204.

Step 203: The local node rejects the request message and records the number of times the request message has been rejected; end.

Step 204: The local node determines whether the message is the first-arrived message; if so, perform Step 205; otherwise, perform Step 206.

Step 205: Mark the request message as the winner message; perform Step 208.

Step 206: When the request message is the request message that has been rejected the most times, mark the request message as the winner message.

Step 207: Determine whether the request message is a winner message; if so, perform Step 208; otherwise, perform Step 209.

Step 208: Process the request message; end.

Step 209: Reject the request message and record the number of rejections of the request message; end.

In this embodiment, when the local node processes request messages sent by the remote nodes, the request message that arrives at the local node first may be treated as the winner message, and all other request messages may be treated as competitor messages: all competitor messages are rejected, and the number of times each competitor message has been rejected is recorded. When the winner message has been processed and the local node has no request message being processed, the request message with the most rejections may be made the winner message, so that it can be processed the next time it is received.
A person skilled in the art can understand that, in a parallel computer system, after a request message sent by a remote node to the local node is rejected, the remote node will keep resending it. Thus, the request message sent to the local node first usually also accumulates the largest number of rejections, so determining the winner message by rejection count satisfies the requirement of giving priority to the earliest-sent messages and avoids excessively long processing delays for request messages.

A person skilled in the art can understand that, in practice, the winner message may also be determined according to the type of the request message, the priority of the request message, and so on, possibly in combination with the number of times the request message has been rejected; this embodiment of the present invention places no particular restriction on this.

A person skilled in the art can understand that, in practice, besides treating the single most-rejected message as the winner message, two or more messages with relatively many rejections may also be treated as winner messages. The number of winner messages may be determined according to the node's hardware resources, for example, according to the local node's pipeline depth and buffer resources. This embodiment places no particular restriction on this, as long as the local node can process the corresponding number of winner messages.

In this embodiment, the described processing of request messages may refer to the processing of request messages for a same memory address of the local node; request messages for other memory addresses of the local node may be processed in the same way as described above for a single memory address. In this way, request messages for every memory address are processed according to the same priority scheme, which avoids request messages for some memory addresses remaining unprocessed for a long time.

A person skilled in the art can understand that, in the embodiment of FIG. 2 above, the local node stores the rejection count of each request message and can therefore determine from these counts whether a request message may become the winner message. In practice, the local node may instead store only the rejection count of the most-rejected request message: after receiving a request message, the local node can determine whether it is the winner message according to the rejection count carried in the request message itself. In this way, the local node needs only a small amount of hardware resources to store and manage rejection counts, further reducing the hardware resource consumption of request message processing. This is explained below with a specific example.
FIG. 3 is a schematic flowchart of a request message processing method according to Embodiment 3 of the present invention. Specifically, as shown in FIG. 3, the request message processing method of this embodiment may include the following steps:

Step 301: A remote node sends a request message for the local node, the request message carrying the number of times it has been rejected by the local node.

Step 302: The local node receives the request message sent by the remote node and determines whether the local node has a request message being processed; if so, perform Step 303; otherwise, perform Step 304.

Step 303: The local node determines whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node; if so, reject the request message and increase the largest rejection count recorded by the local node by 1; end. Otherwise, reject the request message; end.

Step 304: The local node determines whether the message is the first-arrived message; if so, perform Step 305; otherwise, perform Step 306.

Step 305: Mark the request message as the winner message; perform Step 308.

Step 306: Determine whether the rejection count carried in the request message is equal to the largest rejection count recorded by the local node; if so, mark the request message as the winner message and perform Step 308; otherwise, perform Step 307.

Step 307: Reject the request message; end.

Step 308: Clear to zero the largest rejection count among all request messages recorded by the local node, and process the request message; end.

In this embodiment, the local node may store the rejection count of the most-rejected request message, denoted a, with its initial value set to a = 0. After receiving a request message, the local node can compare a with the rejection count carried in the request message itself to determine whether the request message may serve as the winner message, or whether a needs to be increased by 1.

A person skilled in the art can understand that the rejection count carried in a request message sent by a remote node may be carried by the local node in its rejection response messages, or may be counted by the remote node itself from the rejection responses it receives; this embodiment of the present invention places no particular restriction on this.

It can be seen that, in this embodiment, by carrying the rejection count in the request message, the local node only needs to record the largest rejection count, which further reduces the hardware resource consumption of the local node when processing request messages.
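As a rough illustration of Embodiment 3, the following sketch keeps only a single counter, the largest rejection count (the value a above), rather than per-request counts; the names (LocalNode, on_request, finish) are hypothetical and not from the patent.

```python
class LocalNode:
    """Embodiment 3 sketch: each request carries its own rejection count,
    so the node stores only the largest count seen (a, initially 0)."""

    def __init__(self):
        self.max_rejections = 0    # 'a' in the text
        self.busy = False

    def on_request(self, carried_rejections, first_arrival=False):
        if self.busy:
            # Step 303: while busy, a request whose carried count matches
            # the recorded maximum bumps the maximum, then is rejected.
            if carried_rejections == self.max_rejections:
                self.max_rejections += 1
            return "rejected"
        # Steps 304-308: first arrival, or a carried count equal to the
        # recorded maximum, makes this request the winner.
        if first_arrival or carried_rejections == self.max_rejections:
            self.max_rejections = 0   # Step 308: clear the record
            self.busy = True
            return "processed"
        return "rejected"

    def finish(self):
        self.busy = False
```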
On the basis of the technical solutions of the embodiments shown in FIG. 1, FIG. 2, or FIG. 3 above, the local node may further notify the remote nodes to stop sending requests for an overheated address, to avoid an excessive number of request messages for that address and to save bandwidth resources. Specifically, this embodiment may further include the following step: when, for a memory address, the rejection count of the most-rejected request message among all request messages other than the winner message is greater than a first preset threshold, the memory address is sent to each remote node to notify each remote node that the memory address is an overheated address, so that the remote nodes no longer initiate request messages for that memory address. Further, after the local node sends the overheated memory address to the remote nodes, when the rejection count of the most-rejected request message among all request messages for that memory address other than the winner message falls below a second preset threshold, the memory address may be sent to each remote node to notify each remote node that the memory address is a non-overheated address, so that the remote nodes may again initiate request messages for that memory address. It can be seen that, when requests for a certain memory address are rejected too many times, the remote nodes can be notified to suspend requests to that address, avoiding excessive consumption of the parallel computer system's bandwidth resources by large numbers of request messages that cannot be processed.

In practice, each remote node may maintain an overheated address table storing the memory addresses reported as overheated by the various nodes. In this way, before a remote node sends a new request message, it can first query whether the memory address requested by that message is an overheated address; if so, it stops sending the request message, and otherwise the message is sent through polling arbitration. The specific implementation is described later. A person skilled in the art can understand that, in practice, the first preset threshold and the second preset threshold may be set to suitable values as needed, for example, according to the network communication conditions of the local node.
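The two-threshold notification logic above behaves like a hysteresis loop and can be sketched as follows. The threshold values are illustrative assumptions; the patent does not prescribe concrete numbers.

```python
T_HOT = 100    # first preset threshold: mark the address overheated
T_COOL = 10    # second preset threshold (< T_HOT): mark it non-overheated

def check_address_heat(max_rejections, currently_hot):
    """Return the notification to broadcast to remote nodes, if any.

    max_rejections: rejection count of the most-rejected non-winner
    request message for this memory address.
    """
    if not currently_hot and max_rejections > T_HOT:
        return "overheated"        # remote nodes stop requesting this address
    if currently_hot and max_rejections < T_COOL:
        return "non-overheated"    # remote nodes may request it again
    return None                    # no state change, nothing to broadcast
```

Because the second threshold is strictly below the first, an address does not oscillate between the two notifications while its rejection count hovers near a single cutoff.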
In the foregoing embodiments of the present invention, when a request message is determined to be a winner message, it may first be determined whether the memory address requested by the request message is available, and the message is processed only when the address is available. Specifically, this embodiment may further include the following steps: querying whether the state of the memory address requested by the winner request message is available, and rejecting the request message when the memory address state is unavailable; a data request for the memory address is initiated, and the request message processed, only when the memory address state is available. In addition, when the local node finds that the memory address requested by the request message is unavailable (for example, it is occupied by another request), the local node may snoop the state of the memory address and process the request message only once the memory address is no longer occupied.

In practice, the NC of the local node may keep a directory recording the states of all memory addresses under its management. The directory indicates whether each memory address is occupied, that is, whether the memory address is valid: an unusable address may be recorded in the invalid state, and otherwise it is recorded in a non-invalid state. The NC of the local node may snoop the state of each memory address in real time and update the directory in real time.

A person skilled in the art can understand that both the local node and the remote nodes described above may include a home agent and a Cache Agent (CA), where the home agent can process received external request messages and buffer them, and the CA can process the requests to be sent by the core controllers (cores) in the node and send them to the corresponding nodes.
FIG. 4 is a schematic flowchart of a request message processing method according to Embodiment 4 of the present invention. This embodiment may be applied to the sending of request messages from a remote node to the local node in a parallel computer system. Specifically, as shown in FIG. 4, when a remote node needs to send a request message, the method may include the following steps:

Step 401: When a request message is to be sent, obtain the memory address requested by the request message.

Step 402: When the memory address is found in an overheated address table to be an overheated address, cancel the sending of the request message, where the overheated address table stores overheated memory addresses.

In this embodiment, the remote node may receive overheated addresses sent by other nodes, including the local node described above, and record them in the overheated address table. In this way, when sending request messages, the remote node can arbitrate, based on the overheated address table, the request messages to be sent by the core processors in the node, to determine whether to send a request message to the destination address. Specifically, request messages whose requested memory addresses match overheated addresses in the overheated address table are not arbitrated and are not sent; requests whose memory addresses do not match an overheated address may be arbitrated by polling according to the round-robin fairness principle and then sent.
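A minimal sketch of this sender-side arbitration follows, assuming a simple mapping of pending per-core requests; the names and the use of Python's itertools.cycle as the round-robin pointer are illustrative, not from the patent.

```python
from itertools import cycle

class RequestSender:
    """Filters out requests to overheated addresses, then grants the
    remaining per-core requests in round-robin order."""

    def __init__(self, cores):
        self.hot_addresses = set()       # overheated address table
        self.rr = cycle(range(cores))    # round-robin pointer over cores

    def on_hot_notice(self, addr):
        self.hot_addresses.add(addr)     # notified: address is overheated

    def on_cool_notice(self, addr):
        self.hot_addresses.discard(addr) # notified: address cooled down

    def arbitrate(self, pending):
        """pending: dict core_id -> requested memory address (or None)."""
        eligible = {c: a for c, a in pending.items()
                    if a is not None and a not in self.hot_addresses}
        for _ in range(len(pending)):
            core = next(self.rr)
            if core in eligible:
                return core, eligible[core]   # send this core's request
        return None                           # nothing eligible to send
```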
In this embodiment, the arbitrated request messages may reach the foregoing local node controller through the inter-chip interconnection network, to be processed by the local node.

In this embodiment, after receiving a request message, the local node may process it according to the solutions shown in FIG. 1 or FIG. 2, rejecting non-winner messages until the request message is designated the winner message.

A person skilled in the art can understand that, in this embodiment, both the local node and the remote nodes include a node controller, multiple core processors, a home agent, and a cache agent, and the node controller may include a home agent controller and a cache agent controller. Their specific structures may be the same as or similar to those of nodes in existing parallel computer systems, and are not described again here.

FIG. 5 is a schematic structural diagram of a node according to Embodiment 5 of the present invention. The node in this embodiment may be a node in a parallel computer system and may implement the methods shown in FIG. 1, FIG. 2, or FIG. 3. Specifically, as shown in FIG. 5, the node in this embodiment may include a request message receiving module 11 and a request message processing module 12, where:

the request message receiving module 11 is configured to receive a request message sent by a remote node; and

the request message processing module 12 is configured to determine whether the request message is a winner message, and if so, process the request message; otherwise, reject the request message, where the winner message is a request message that the local node may process with priority.

This embodiment may process received request messages according to the method steps shown in FIG. 1, FIG. 2, or FIG. 3 above; for specific implementations, refer to the descriptions of method Embodiments 1, 2, and 3 of the present invention, which are not repeated here.

A person skilled in the art can understand that, besides processing request messages, the node of this embodiment may also implement the other functions of nodes in existing parallel computer systems, which are not described again here.

FIG. 6 is a schematic structural diagram of a node according to Embodiment 6 of the present invention. This embodiment may implement the method of the embodiment shown in FIG. 2. Specifically, on the basis of the technical solution of the embodiment shown in FIG. 5, as shown in FIG. 6, the node of this embodiment may further include a winner message determining module 13, configured to: when the local node has no request message being processed, determine whether the request message is the first-arrived message, and if so, mark the request message as the winner message; otherwise, if the request message is the request message that has been rejected the most times, mark the request message as the winner message.

Further, the request message processing module 12 of the node in this embodiment is also configured to: when the local node has a request message being processed, reject the request message and record the number of times the request message has been rejected.

In addition, the request message processing module 12 of the node in this embodiment may also be configured to: when the request message is determined to be a winner message, query whether the state of the memory address requested by the request message is available, and reject the request message when the memory address state is unavailable.

FIG. 7 is a schematic structural diagram of a node according to Embodiment 7 of the present invention. In this embodiment, the request message sent by a remote node carries the number of times it has been rejected, and the local node records the largest rejection count among all request messages. Specifically, on the basis of the technical solution of the embodiment shown in FIG. 5, as shown in FIG. 7, in the node of this embodiment, the request message processing module 12 may include a determining unit 121 and a first request message processing unit 122, where:

the determining unit 121 is configured to determine whether the local node has a request message being processed; and

the first request message processing unit 122 is configured to: when the local node has no request message being processed, determine whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node; if so, treat the request message as the winner message, process the request message, and clear to zero the largest rejection count among all request messages recorded by the local node; otherwise, reject the request message.

In addition, as shown in FIG. 7, the request message processing module 12 may further include a second request message processing unit 123, configured to: when the local node has a request message being processed, determine whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node; if so, increase the largest rejection count by 1 and reject the request message; otherwise, directly reject the request message.

The node of this embodiment may implement the method of the embodiment shown in FIG. 3 above; for its specific implementation process, refer to the description of method Embodiment 3 of the present invention, which is not repeated here.

FIG. 8 is a schematic structural diagram of a node according to Embodiment 8 of the present invention. On the basis of FIG. 5, FIG. 6, or FIG. 7 above, as shown in FIG. 8, the node of this embodiment may further include an overheated address notification module 15, configured to: when, for a memory address, the rejection count of the most-rejected request message among the request messages other than the winner message is greater than a first preset threshold, send the memory address to each remote node to notify each remote node that the memory address is an overheated address, so that the remote nodes no longer initiate request messages for the memory address. The node may further include a non-overheated address notification module 16, configured to: when, for the memory address, the rejection count of the most-rejected request message among the other request messages besides the winner message is less than a second preset threshold, send the memory address to each remote node to notify each remote node that the memory address is a non-overheated address, so that the remote nodes may initiate request messages for the memory address.

FIG. 9 is a schematic structural diagram of a node according to Embodiment 9 of the present invention. The node of this embodiment may be any node of a parallel computer system and may send request messages to other nodes. Specifically, as shown in FIG. 9, the node of this embodiment may include a memory address obtaining module 21 and a request message sending processing module 22, where:

the memory address obtaining module 21 is configured to obtain, when a request message is to be sent, the memory address requested by the request message; and

the request message sending processing module 22 is configured to cancel the sending of the request message when the memory address is found in an overheated address table to be an overheated address, where the overheated address table stores overheated memory addresses.

As shown in FIG. 9, the node of this embodiment may further include an overheated address receiving processing module 23, configured to receive memory addresses that other nodes report as overheated addresses and add them to the overheated address table, and a non-overheated address receiving processing module 24, configured to receive memory addresses that other nodes report as non-overheated addresses and delete them from the overheated address table.

Before sending request messages, the node of this embodiment can filter out request messages targeting overheated addresses and arbitrate and send only request messages for non-overheated addresses. This avoids the sending of large numbers of request messages that cannot be processed and reduces the processing burden on the nodes that receive the request messages; for specific implementations, refer to the description of the method embodiment shown in FIG. 4 above, which is not repeated here.

FIG. 10 is a schematic structural diagram of a request message processing system according to Embodiment 10 of the present invention. As shown in FIG. 10, the request message processing system of this embodiment is a parallel computer system including a local node 10 and a remote node 20. The local node 10 may be configured to process request messages sent by the remote node 20 and may specifically include the node shown in FIG. 5, FIG. 6, FIG. 7, or FIG. 8; for its specific structure and functions, refer to the descriptions of FIG. 5, FIG. 6, FIG. 7, or FIG. 8 above, which are not repeated here. In this embodiment, the foregoing remote node 20 may include the node shown in FIG. 9; for its specific structure, refer to FIG. 9 above, and details are not described herein again.

In this embodiment, the local node 10 and the remote node 20 may each include the node of FIG. 5, FIG. 6, FIG. 7, or FIG. 8 above as well as the node shown in FIG. 9; that is, the local node 10 and the remote node 20 may have the same structure and functions, so that both the local node and the remote node can receive request messages sent by other nodes and can also send request messages to other nodes. For specific implementations, refer to the foregoing method embodiments and device embodiments of the present invention, which are not repeated here.

A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system and nodes described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described again here.

A person of ordinary skill in the art can understand that all or some of the steps of the foregoing method embodiments may be implemented by a program instructing relevant hardware. The foregoing program may be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of the present invention rather than to limit them. Although the present invention is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some or all of their technical features, without making the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

CLAIMS

1. A request message processing method, comprising:

receiving a request message sent by a remote node; and

determining whether the request message is a winner message, and if so, processing the request message; otherwise, rejecting the request message;

wherein the winner message is a request message that the local node may process with priority.

2. The request message processing method according to claim 1, wherein before the determining whether the request message is a winner message, the method further comprises:

when the local node has no request message being processed, determining whether the request message is the first-arrived message, and if so, marking the request message as the winner message; otherwise, if the request message is the request message that has been rejected the most times, marking the request message as the winner message.

3. The request message processing method according to claim 2, wherein after the receiving a request message sent by a remote node, the method further comprises:

when the local node has a request message being processed, rejecting the request message and recording the number of times the request message has been rejected.

4. The request message processing method according to claim 2 or 3, wherein the marking the request message as the winner message if the request message is the request message that has been rejected the most times comprises:

determining whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node, and if so, marking the request message as the winner message and clearing to zero the largest rejection count among all request messages recorded by the local node.

5. The request message processing method according to claim 4, further comprising: when the local node has a request message being processed, determining whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node, and if so, increasing the largest rejection count by 1.

6. The request message processing method according to any one of claims 1 to 5, further comprising:

when, for a memory address, the rejection count of the most-rejected request message among all request messages other than the winner message is greater than a first preset threshold, sending the memory address to each remote node to notify each remote node that the memory address is an overheated address, so that the remote nodes no longer initiate request messages for the memory address.

7. The request message processing method according to claim 6, wherein after the sending the memory address to each remote node, the method further comprises:

when, for the memory address, the rejection count of the most-rejected request message among all request messages other than the winner message is less than a second preset threshold, sending the memory address to each remote node to notify each remote node that the memory address is a non-overheated address, so that the remote nodes may initiate request messages for the memory address, the second preset threshold being less than the first preset threshold.

8. The request message processing method according to claim 1, wherein before the processing the request message, the method further comprises:

querying whether the state of the memory address requested by the request message is available, and rejecting the request message when the memory address state is unavailable.

9. The request message processing method according to claim 1, wherein the method is a method for processing request messages for a same memory address of the local node.

10. A node, comprising:

a request message receiving module, configured to receive a request message sent by a remote node; and

a request message processing module, configured to determine whether the request message is a winner message, and if so, process the request message; otherwise, reject the request message;

wherein the winner message is a request message that the local node may process with priority.

11. The node according to claim 10, further comprising:

a winner message determining module, configured to: when the local node has no request message being processed, determine whether the request message is the first-arrived message, and if so, mark the request message as the winner message; otherwise, if the request message is the request message that has been rejected the most times, mark the request message as the winner message.

12. The node according to claim 11, wherein the request message processing module is further configured to: when the local node has a request message being processed, reject the request message and record the number of times the request message has been rejected.

13. The node according to claim 11 or 12, wherein the request message processing module comprises:

a determining unit, configured to determine whether the local node has a request message being processed; and a first request message processing unit, configured to: when the local node has no request message being processed, determine whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node; if so, mark the request message as the winner message, process the request message, and clear to zero the largest rejection count among all request messages recorded by the local node; otherwise, reject the request message.

14. The node according to claim 13, wherein the request message processing module further comprises:

a second request message processing unit, configured to: when the local node has a request message being processed, determine whether the rejection count carried in the request message is equal to the largest rejection count among all request messages recorded by the local node; if so, increase the largest rejection count by 1 and reject the request message; otherwise, reject the request message.

15. The node according to any one of claims 10 to 14, further comprising: an overheated address notification module, configured to: when, for a memory address, the rejection count of the most-rejected request message among all request messages other than the winner message is greater than a first preset threshold, send the memory address to each remote node to notify each remote node that the memory address is an overheated address, so that the remote nodes no longer initiate request messages for the memory address.

16. The node according to claim 15, further comprising:

a non-overheated address notification module, configured to: when, for the memory address, the rejection count of the most-rejected request message among all request messages other than the winner message is less than a second preset threshold, send the memory address to each remote node to notify each remote node that the memory address is a non-overheated address, so that the remote nodes may initiate request messages for the memory address, the second preset threshold being less than the first preset threshold.

17. The node according to claim 10, wherein the request message processing module is further configured to: when determining that the request message is a winner message, query whether the state of the memory address requested by the request message is available, and reject the request message when the memory address state is unavailable.

18. A request message sending method, comprising:

when a request message is to be sent, obtaining the memory address requested by the request message; and when the memory address is found in an overheated address table to be an overheated address, canceling the sending of the request message, wherein the overheated address table stores memory addresses that are overheated addresses.

19. The request message sending method according to claim 18, further comprising:

receiving memory addresses that other nodes report as overheated addresses, and adding the received overheated memory addresses to the overheated address table.

20. The request message sending method according to claim 18 or 19, further comprising:

receiving memory addresses that other nodes report as non-overheated addresses, and deleting the non-overheated memory addresses from the overheated address table.

21. A node, comprising:

a memory address obtaining module, configured to obtain, when a request message is to be sent, the memory address requested by the request message; and

a request message sending processing module, configured to cancel the sending of the request message when the memory address is found in an overheated address table to be an overheated address, wherein the overheated address table stores memory addresses that are overheated addresses.

22. The node according to claim 21, further comprising:

an overheated address receiving processing module, configured to receive memory addresses that other nodes report as overheated addresses, and add the received overheated memory addresses to the overheated address table.

23. The node according to claim 21 or 22, further comprising: a non-overheated address receiving processing module, configured to receive memory addresses that other nodes report as non-overheated addresses, and delete the non-overheated memory addresses from the overheated address table.

24. A request message processing system, comprising a local node and a remote node, wherein the local node is configured to process request messages sent by the remote node and comprises the node according to any one of claims 10 to 17.

25. The request message processing system according to claim 24, wherein the remote node comprises the node according to any one of claims 21 to 23.
PCT/CN2012/082955 2012-10-15 2012-10-15 请求报文处理方法以及发送方法、节点和系统 WO2014059576A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201280002077.1A CN103181132B (zh) 2012-10-15 2012-10-15 Request message processing method, sending method, node, and system
PCT/CN2012/082955 WO2014059576A1 (zh) 2012-10-15 2012-10-15 Request message processing method, sending method, node, and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/082955 WO2014059576A1 (zh) 2012-10-15 2012-10-15 Request message processing method, sending method, node, and system

Publications (1)

Publication Number Publication Date
WO2014059576A1 true WO2014059576A1 (zh) 2014-04-24

Family

ID=48639400

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/082955 WO2014059576A1 (zh) 2012-10-15 2012-10-15 Request message processing method, sending method, node, and system

Country Status (2)

Country Link
CN (1) CN103181132B (zh)
WO (1) WO2014059576A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105812266B (zh) * 2014-12-31 2018-10-23 Beijing Dongtu Technology Co., Ltd. Hardware configuration processing method and apparatus for request messages

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6266743B1 (en) * 1999-02-26 2001-07-24 International Business Machines Corporation Method and system for providing an eviction protocol within a non-uniform memory access system
CN101635678A (zh) * 2009-06-15 2010-01-27 ZTE Corporation Method and system for P2P terminal traffic control
CN102318275A (zh) * 2011-08-02 2012-01-11 Huawei Technologies Co., Ltd. CC-NUMA-based message processing method, apparatus, and system
CN102594669A (zh) * 2012-02-06 2012-07-18 Fujian Star-net Ruijie Networks Co., Ltd. Data message processing method, apparatus, and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7305492B2 (en) * 2001-07-06 2007-12-04 Juniper Networks, Inc. Content service aggregation system
CN101908953A (zh) * 2009-06-02 2010-12-08 ZTE Corporation Method and apparatus for scheduling retransmission data
CN102594691B (zh) * 2012-02-23 2019-02-15 ZTE Corporation Method and apparatus for processing messages


Also Published As

Publication number Publication date
CN103181132B (zh) 2015-11-25
CN103181132A (zh) 2013-06-26

Similar Documents

Publication Publication Date Title
US9925492B2 (en) Remote transactional memory
WO2018076793A1 NVMe data read/write method and NVMe device
WO2013078875A1 Method, apparatus, and system for content management
WO2022007470A1 Data transmission method, chip, and device
WO2019127915A1 Data reading method and apparatus based on a distributed consensus protocol
TW201543218A Chip device and method for multi-core network processor interconnection with multi-node connections
WO2011107046A2 Memory access monitoring method and apparatus
WO2015027806A1 Read/write processing method and apparatus for memory data
WO2018223789A1 Transaction identifier operation method, system, and computer-readable storage medium
CA2987807C (en) Computer device and method for reading/writing data by computer device
WO2015039569A1 Replica storage apparatus and replica storage method
CN111404931B Remote data transmission method based on persistent memory
WO2014075255A1 Method, apparatus, and system for communication based on PCIe Switch
US11231964B2 (en) Computing device shared resource lock allocation
JP4554139B2 Method and apparatus for processing remote requests
US20110246667A1 (en) Processing unit, chip, computing device and method for accelerating data transmission
EP2568379B1 (en) Method for preventing node controller deadlock and node controller
EP3788492B1 (en) Separating completion and data responses for higher read throughput and lower link utilization in a data processing network
WO2016201998A1 Cache allocation, data access, and data sending methods, processor, and system
WO2015067004A9 Method and apparatus for processing concurrent access requests
WO2014101502A1 Memory access processing method based on memory chip interconnection, memory chip, and system
WO2015090194A1 Method and apparatus for implementing device sharing
WO2014059576A1 Request message processing method, sending method, node, and system
CN115114042A Storage data access method and apparatus, electronic device, and storage medium
US20140036929A1 (en) Phase-Based Packet Prioritization

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12886667; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 12886667; Country of ref document: EP; Kind code of ref document: A1)