WO2015135385A1 - Server and data access method - Google Patents

Server and data access method

Info

Publication number
WO2015135385A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor
identifier
target
node
access request
Prior art date
Application number
PCT/CN2015/070453
Other languages
English (en)
French (fr)
Inventor
王工艺
贺成洪
赵亚飞
常胜
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2015135385A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17356Indirect interconnection networks

Definitions

  • the present invention relates to the field of computer technologies, and in particular, to a server and a data access method.
  • SMP: Symmetric Multi-Processor
  • NUMA: Non-Uniform Memory Access
  • MPP: Massive Parallel Processing
  • An SMP server is one in which multiple central processing units (CPUs) work symmetrically, with no primary/secondary or master/slave relationship; every CPU shares the same physical memory, and the time required to access any address in memory is the same. The drawback of SMP is limited scalability. A NUMA server has multiple CPU modules; each CPU module consists of several CPUs (for example, four) and has independent local memory, I/O slots, and so on, and CPU modules are connected and exchange information through interconnect modules (such as a crossbar switch). Each CPU accesses its local memory much faster than remote memory (the memory of other CPU modules in the system), and server performance cannot increase linearly as the number of CPUs increases.
  • An MPP server is formed by connecting multiple SMP servers through a node interconnection network. Each SMP node can run its own operating system, database, and so on, but the CPUs in one node cannot access the memory of another node; information exchange between nodes is implemented through the node interconnection network.
  • There are currently three processor interconnect architectures. The first is a single-cube interconnect architecture, the largest processor interconnect architecture recommended by Intel. It supports interconnection of eight CPUs, but can only be extended to an 8P system at most, so more CPUs cannot be connected and scalability is limited.
  • In the second processor interconnect architecture, two or four CPUs within a node are interconnected with one node controller (NC), and NCs are interconnected with one another to form a larger system. The disadvantage of this architecture is that the NC's external links become a bandwidth bottleneck, because all CPUs in the node must go through the same NC for transaction processing and bandwidth.
  • In the third processor interconnect architecture, two or four CPUs in a node are interconnected with two NCs; in this topology, nodes are interconnected through the two NCs, and the two NCs share the transaction processing and bandwidth load according to address space, which better satisfies bandwidth requirements. This topology has a small delay in a 4P system, but for an 8P or larger system, when a CPU in one node accesses memory in another node it must cross two NCs, the delay is large, and delay has a great influence on NUMA system performance.
  • In view of this, the technical problem to be solved by the present invention is how to reduce server latency while ensuring server bandwidth.
  • In a first aspect, the present invention provides a server, including a processor interconnect node;
  • the processor interconnect node includes at least one node controller and at least two base nodes, each of the base nodes including at least four processors;
  • the node controller is connected to the basic node, and configured to manage a transaction of the processor according to an address space of the processor;
  • The node controller is further configured to receive an access request of a source processor and a source processor identifier, and to send the access request and the node controller identifier to a target processor according to the target address carried in the access request, where the source processor and the target processor are located in different basic nodes, and the target address is the address of the target processor.
  • The node controller is further configured to receive a data response from the target processor, and to send the data response to the source processor according to the source processor identifier.
  • the node controller includes a control chip, a local proxy LP, and a remote proxy RP;
  • The control chip is configured to receive the source processor identifier and the access request from the source processor, obtain an RP identifier from the access request, and send the access request and the source processor identifier to the RP pointed to by the RP identifier;
  • The RP is configured to obtain the target address from the access request, decode the target address to obtain an LP identifier, and send the access request to the LP pointed to by the LP identifier; and to receive the data response from the LP and send the data response to the source processor corresponding to the source processor identifier;
  • The LP is configured to record the RP identifier, obtain the target address from the access request, and send the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier; to receive the data response from the target processor; and to send the data response to the RP pointed to by the RP identifier.
  • The node controller is further configured to: when the target processor receives a new access request indicating that the data at the target address is to be accessed, receive a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address; send the snoop message to the source processor according to the source processor identifier; and receive a snoop response returned by the source processor and send the snoop response to the target processor according to the target address.
  • The LP is further configured to receive the snoop message from the target processor; obtain the RP identifier from second directory information and send the snoop message to the RP pointed to by the RP identifier, where the second directory information is directory information saved in the LP; and send the snoop response to the target processor according to the target address;
  • The RP is further configured to send the snoop message to the source processor pointed to by the source processor identifier, and to send the snoop response to the LP pointed to by the node controller identifier.
  • The processor interconnect node includes a first basic node, a second basic node, and two node controllers, where the first basic node and the second basic node each include at least four processors.
  • In a second aspect, the present invention provides a data access method, applicable to the server of the first aspect or any one of the possible implementations of the first aspect; when the source processor needs to access the target processor, the data access method includes:
  • the node controller receives the access request of the source processor and the source processor identifier, where the access request carries a target address, and the target address is the address of the target processor;
  • the node controller sends the access request and the node controller identifier to the target processor according to the target address;
  • the node controller receives a data response from the target processor and sends the data response to the source processor according to the source processor identifier.
  • The node controller includes a control chip, a local proxy LP, and a remote proxy RP, and the sending, by the node controller according to the target address, of the access request and the node controller identifier to the target processor includes:
  • the control chip receives the source processor identifier and the access request from the source processor, obtains an RP identifier from the access request, and sends the access request and the source processor identifier to the RP pointed to by the RP identifier;
  • the RP obtains the target address from the access request, decodes the target address to obtain an LP identifier, and sends the access request to the LP pointed to by the LP identifier;
  • the LP records the RP identifier, obtains the target address from the access request, and sends the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier;
  • The receiving, by the node controller, of a data response from the target processor and the sending of the data response to the source processor according to the source processor identifier includes:
  • the LP receives the data response from the target processor and sends the data response to the RP pointed to by the RP identifier;
  • the RP sends the data response to the source processor corresponding to the source processor identifier.
  • When the target processor receives a new access request indicating that the data at the target address needs to be accessed, the data access method further includes:
  • the node controller receives a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address;
  • the node controller sends the snoop message to the source processor according to the source processor identifier;
  • the node controller receives a snoop response returned by the source processor and sends the snoop response to the target processor according to the target address.
  • The sending, by the node controller, of the snoop message to the source processor according to the source processor identifier includes:
  • the LP receives the snoop message from the target processor;
  • the LP obtains the RP identifier from the second directory information and sends the snoop message to the RP pointed to by the RP identifier, where the second directory information is the directory information saved in the LP;
  • the RP sends the snoop message to the source processor pointed to by the source processor identifier;
  • The receiving, by the node controller, of the snoop response returned by the source processor and the sending of the snoop response to the target processor according to the target address includes:
  • the control chip receives the snoop response from the source processor, obtains the RP identifier from the snoop response, and sends the snoop response to the RP pointed to by the RP identifier;
  • the RP sends the snoop response to the LP pointed to by the node controller identifier;
  • the LP sends the snoop response to the target processor according to the target address.
  • In the server of these embodiments, at least one NC guarantees the bandwidth of the server; further, processors in the same basic node can interconnect directly and access each other's data, and when processors in different basic nodes of the same processor interconnect node access data, there is no need to cross the link between NCs, which reduces server latency.
  • FIG. 1a is a block diagram showing the structure of a server according to an embodiment of the present invention;
  • FIG. 1b is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention;
  • FIG. 1c is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention;
  • FIG. 2 is a flow chart of a data access method according to an embodiment of the present invention;
  • FIG. 3 is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention;
  • FIG. 4 is a flow chart of a data access method according to another embodiment of the present invention;
  • FIG. 5 is a block diagram showing the structure of a server according to another embodiment of the present invention.
  • FIG. 1a shows a block diagram of a server in accordance with an embodiment of the present invention.
  • the server 100 may specifically include:
  • the processor interconnect node 110 includes at least one node controller 120 and at least two base nodes 130, each of the base nodes 130 including at least four processors 140;
  • the node controller 120 is coupled to the base node 130 for managing transactions of the processor 140 in accordance with an address space of the processor 140.
  • server 100 may include a processor interconnect node 110, which may include at least one node controller 120. Further, the processor interconnect node 110 may further include at least two basic nodes 130, and each of the basic nodes 130 may include at least four processors 140.
  • the node controller 120 is connected to the base node 130, and the transactions of the processor 140 can be managed in accordance with the address spaces of different processors 140 in the base node 130.
  • The node controller 120 can also be connected to node controllers in other processor interconnect nodes, so that a processor can access processors in other processor interconnect nodes through the node controllers and the links between them, satisfying the bandwidth requirements of the server.
  • The node controller 120 is further configured to receive an access request of the source processor and a source processor identifier, and to send the access request and the node controller identifier to the target processor according to the target address carried in the access request, where the source processor and the target processor are located in different basic nodes and the target address is the address of the target processor; the node controller 120 is further configured to receive a data response from the target processor and send the data response to the source processor according to the source processor identifier.
  • For the processor interconnect node 110, processors 140 in the same basic node 130 can communicate with each other directly through the communication modules in the processors 140 to access each other, while processors 140 in different basic nodes 130 communicate with each other through the node controller 120 to access each other.
  • When the source processor needs to access data in the target processor and the source processor and the target processor are located in different basic nodes, during the process in which the source processor sends the access request to the target processor, the node controller 120 may receive the access request of the source processor and the source processor identifier and send the access request and the node controller identifier to the target processor according to the target address carried in the access request.
  • During the process in which the target processor returns a data response to the source processor, the target processor may send the data response to the corresponding node controller 120 according to the node controller identifier, and after receiving the data response, the node controller 120 may send the data response to the source processor according to the source processor identifier. In this communication process there is no need to cross the link between NCs, which can reduce server delay.
  • Specifically, the node controller 120 may include a control chip, a local proxy LP, and a remote proxy RP.
  • In the process in which the source processor requests access to data in the target processor, the above components of the node controller 120 may respectively perform the following actions:
  • The control chip is configured to receive the source processor identifier and the access request from the source processor, obtain an RP identifier from the access request, and send the access request and the source processor identifier to the RP pointed to by the RP identifier.
  • The RP is configured to obtain the target address from the access request, decode the target address to obtain an LP identifier, and send the access request to the LP pointed to by the LP identifier; and to receive the data response from the LP and send the data response to the source processor corresponding to the source processor identifier.
  • The LP is configured to record the RP identifier, obtain the target address from the access request, and send the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier; to receive the data response from the target processor; and to send the data response to the RP pointed to by the RP identifier.
  • In a possible implementation, when the target processor receives a new access request indicating that the data at the target address is to be accessed, the node controller 120 may receive a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address; send the snoop message to the source processor according to the source processor identifier; and receive a snoop response returned by the source processor and send the snoop response to the target processor according to the target address.
  • In the server of this embodiment, at least one NC guarantees the bandwidth of the server; processors in the same basic node can interconnect directly and access each other's data, and when processors in different basic nodes of the same processor interconnect node access data, there is no need to cross the link between NCs, which reduces server latency.
  • FIG. 1b is a block diagram showing the structure of a processor interconnect node according to an embodiment of the invention.
  • the processor interconnection node 200 may specifically include: a first basic node 210, a second basic node 220, and two node controllers 230, where the first basic node 210 includes at least four processors 240.
  • the second base node 220 includes at least four processors 250.
  • Specifically, four processors form a 4P node, which may be referred to as a basic node. Each processor has its own memory and communication module; the processors can communicate with one another through their communication modules, and each can access data in its own memory as well as data in each other's memory.
  • Each processor interconnect node may be composed of eight processors, formed by interconnecting two basic nodes as described above through two node controllers (NCs). The two NCs may each be responsible for the plane of a different address space, that is, each is responsible for the transactions of the processors in one of two different address spaces; this can also be adjusted as needed, which is not limited by the present invention.
  • FIG. 1c is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention. As shown in FIG. 1c, node controller 1 is responsible for the transactions of processor 0 and processor 1 in basic node 0 and of processor 4 and processor 5 in basic node 1, and node controller 0 is responsible for the transactions of processor 2 and processor 3 in basic node 0 and of processor 6 and processor 7 in basic node 1.
  • the two NCs can each be interconnected with other NCs to form a server with a larger bandwidth, that is, the NC can be connected to other NCs through the interconnection interface.
  • In this processor interconnect node, the dual NCs can guarantee the bandwidth of the server while processors perform cross-NC accesses. In addition, when a processor in basic node 0 accesses memory in basic node 1, the access can be made directly through node controller 0 or node controller 1, without crossing the link between NCs; for example, processor 2 in basic node 0 can directly access the memory of processor 6 in basic node 1 through node controller 0, which keeps the server latency caused by cross-basic-node accesses within the processor interconnect node low.
  • In summary, the server provided by the present invention includes at least one node controller and at least two basic nodes that communicate through the node controller, which can ensure server bandwidth while reducing the server latency incurred when processors in different basic nodes of the same processor interconnect node access each other.
  • As shown in FIG. 2, the data access method may be applied to the server in the foregoing embodiments of the present invention; when the source processor needs to access the target processor, the data access method may mainly include:
  • Step 300 The node controller receives an access request of the source processor and a source processor identifier, where the access request carries a target address, where the target address is an address of the target processor.
  • Specifically, in the processor interconnect node of the foregoing embodiments, when the source processor needs to access the target processor, the NC that processes the source processor's transactions may be determined according to the address of the data to be accessed, that is, the target address, and the access request and the source processor identifier are sent to that NC.
  • the target address can be included in the access request.
  • In a possible implementation, the transactions of the source processor and the transactions of the target processor may be managed respectively by two NCs in the processor interconnect node; the two NCs manage different address spaces and can share the bandwidth pressure of the processor interconnect node, but there is no interconnection between the two NCs and they cannot communicate directly.
  • In this case, the source processor needs to first send the access request and the source processor identifier to an intermediate processor; the intermediate processor and the source processor belong to the same basic node and can communicate directly without going through an NC, and the transactions of the intermediate processor and of the target processor are managed by the same NC, through which the intermediate processor can communicate with the target processor. Through the forwarding of the intermediate processor, the access request and the source processor identifier can be sent to the target processor.
  • For example, FIG. 3 is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention. As shown in FIG. 3, if CPU5 is the source processor, CPU2 is the target processor, and CPU5 needs to access data of CPU2, then CPU5 needs to send the access request and the identifier of CPU5 to CPU6 or CPU7, and CPU6 or CPU7 sends the access request and the identifier of CPU5 to CPU2 through the right NC.
  • the selection of CPU 6 and CPU 7 can be determined by the routing configuration of the processor interconnect node.
  • Step 310 The node controller sends the access request and the node controller identifier to the target processor according to the target address.
  • the NC may record the source processor identifier for subsequently determining which processor occupies the data of the target address.
  • The NC can determine, by the target address, the target processor where the data to be accessed is located, and send the access request to the target processor; the NC can also send its own identifier, that is, the node controller identifier, to the target processor, so that the target processor can correctly return a data response.
  • Step 320 The node controller receives a data response from the target processor, and sends the data response to the source processor according to the source processor identifier.
  • Specifically, after receiving the data access request, the target processor may return a data response. In the process of returning the data response, the NC that can forward the data response may first be determined by the node controller identifier; after receiving the data response, that NC may determine, from the recorded source processor identifier, the source processor that requested access to the data and send the data response to the source processor, completing the data access.
  • the NC may include a local proxy (Local Proxy, LP) and a remote proxy (Remote Proxy, RP).
  • The LP can be used to complete the protocol processing between the CPUs inside the basic node and the NCs outside the basic node. Seen from a CPU inside the basic node, the LP has a cache agent (Cache Agent, CA) function, that is, the CPU inside the basic node considers that the LP has processor cores, although the processor cores are not on the LP but in the processors of the remote node; seen from an NC outside the basic node, the LP has a memory agent (Home Agent, HA) function, that is, the NC outside the basic node considers that the LP has memory, although the memory is not on the LP but attached to the processors inside the basic node.
  • The RP likewise completes the protocol processing between the CPUs inside the basic node and the NCs outside the basic node. Seen from a CPU inside the basic node, the RP has the HA function, that is, the CPU inside the basic node considers that the RP has memory, although the memory is not on the RP but attached to processors outside the basic node; seen from an NC outside the basic node, the RP has the CA function, that is, the NC outside the basic node considers that the RP has processor cores, although the processor cores are not on the RP but in the processors inside the basic node.
  • an LP can be responsible for HA transactions located in two processors, where the HA transaction is the process of requesting access to memory.
  • the RP can interleave requests for eight CPUs by low-order address interleaving.
  • the existence of LP and RP allows the processors inside and outside the basic node to access the data in the internal and external memory of the basic node without data inconsistency.
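  • The low-order address interleaving mentioned above can be sketched as follows, assuming that the address bits [A7, A6] select one of four RPs (the bit positions used in the later FIG. 3 example); real hardware may use a different layout, and the names are illustrative.

```c
#include <stdint.h>
#include <stdio.h>

/* Low-order interleaving: bits A7..A6 of the target address pick RP0..RP3,
 * spreading the request traffic of the eight CPUs across the RPs. */
static unsigned rp_for_address(uint64_t addr)
{
    return (unsigned)((addr >> 6) & 0x3);
}

int main(void)
{
    uint64_t target_addr = 0x80ULL;                /* [A7,A6] = 10 -> RP2 */
    printf("RP%u\n", rp_for_address(target_addr));
    return 0;
}
```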
  • In a possible implementation, before step 300, the source processor may determine, according to the target address, whether the access request needs to be sent to an NC. If the source processor and the target processor where the data to be accessed is located belong to the same basic node (for example, basic node 0), the access request does not need to be sent to an NC, because CPUs in the same basic node can access each other directly through their communication modules; if the source processor and the target processor where the data to be accessed is located do not belong to the same basic node (for example, the source processor belongs to basic node 0 and the target processor belongs to basic node 1), the access request needs to be sent to an NC, because processors in different basic nodes must access each other through an NC. When the access request needs to be sent to an NC, the source processor may further determine, according to the target address, which NC in the processor interconnect node the access request is sent to, as sketched below.
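  • The following sketch illustrates that routing decision under the example address layout of FIG. 3, in which bit A41 selects the basic node that owns the address and bit A40 selects which of the two NCs proxies it; the function names and the exact bit layout are illustrative assumptions.

```c
#include <stdint.h>
#include <stdio.h>

typedef enum { ROUTE_DIRECT, ROUTE_LEFT_NC, ROUTE_RIGHT_NC } route_t;

/* A41 selects the basic node owning the address, A40 selects the proxying NC. */
static route_t route_request(uint64_t target_addr, int src_basic_node)
{
    int target_basic_node = (int)((target_addr >> 41) & 0x1);
    if (target_basic_node == src_basic_node)
        return ROUTE_DIRECT;               /* same basic node: no NC needed */
    return ((target_addr >> 40) & 0x1) ? ROUTE_RIGHT_NC : ROUTE_LEFT_NC;
}

int main(void)
{
    /* CPU5 (basic node 1) accessing an address of CPU2: A41 = 0, A40 = 1. */
    uint64_t addr = 1ULL << 40;
    printf("%d\n", route_request(addr, 1));   /* prints 2, i.e. ROUTE_RIGHT_NC */
    return 0;
}
```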
  • In a possible implementation, the NC may further include a control chip, and step 310 may specifically include:
  • Step 311: The control chip receives the source processor identifier and the access request from the source processor, obtains an RP identifier from the access request, and sends the access request and the source processor identifier to the RP pointed to by the RP identifier.
  • the target address may be included in the access request to indicate the address of the data that the source processor needs to access.
  • After receiving the source processor identifier and the access request from the source processor, the control chip may obtain the RP identifier from the target address included in the access request; the RP identifier indicates which RP in the NC the control chip sends the source processor identifier and the access request to.
  • The control chip may send the source processor identifier and the access request to the RP pointed to by the RP identifier, where the RP identifier may be the address bits [A7, A6] of the target address, and the source processor identifier may be used in the RP's directory information to record the source processor that occupies the data at the target address in memory. For example, as shown in FIG. 3, if the address bits [A7, A6] of the target address are 10, CPU5 can determine that the access request is to be transmitted to the right NC through the intermediate processor CPU6 or CPU7; the control chip of the right NC receives the access request, obtains the RP identifier from it, and can send the access request to RP2 according to the RP identifier.
  • Step 312 The RP obtains the target address from the access request, decodes the target address to obtain an LP identifier, and sends the access request to the LP pointed by the LP identifier.
  • Specifically, the RP may store directory information, referred to as first directory information, which records that a certain processor occupies the data at a certain address in memory, where the processor can be recorded by its identifier.
  • According to the MESI cache-coherence and memory-consistency protocol, each cache line can be marked as one of four states: Modified, Exclusive, Shared, or Invalid. When a cache line is marked Invalid, the line is invalid, that is, it is an empty line; an invalid line must be fetched from memory and placed in the Shared or Exclusive state before a read request can be satisfied.
  • The state of the cache line may also be recorded in the first directory information of the RP. When the RP finds that the target address is recorded as invalid in the first directory information, the RP may determine, from the address bits [A45, A42] of the target address, which processor interconnect node the received access request needs to be sent to; if it is the processor interconnect node where the RP is located, the RP can determine, from address bits A41 and A6 of the target address, which LP the request is sent to. For example, as shown in FIG. 3, A41 = 0, and LP0 and LP1 proxy processor addresses with A41 = 0 while LP3 and LP4 proxy processor addresses with A41 = 1, so the RP can decode the target address to obtain the LP identifier, that is, determine from A41 = 0 that the received access request is to be sent to LP0 or LP1; further, the HA transactions proxied by LP0 have A6 = 0 and those proxied by LP1 have A6 = 1, so the RP can determine from A6 = 0 that the access request is to be sent to LP0.
  • Step 313: The LP records the RP identifier, obtains the target address from the access request, and sends the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier.
  • the LP may record the identifier of the RP, that is, the RP identifier, for subsequently returning a data response to the RP.
  • The LP can also obtain the target address in the access request and determine, according to address bit A39 of the target address, which processor, that is, the target processor, the access request is sent to. When sending the access request, the LP may also send the node controller identifier, that is, the LP identifier, to the target processor, for the target processor to subsequently return the data response to this LP; for example, as shown in FIG. 3, if address bit A39 = 0 in the target address, LP0 may send the access request and the node controller identifier to the corresponding target processor CPU2.
  • In a possible implementation, if the source processor and the target processor do not belong to the same processor interconnect node, then, referring to the server of the above embodiments of the present invention, different processor interconnect nodes are connected through the links between NCs. In this case, the data that the source processor requests to access in the target processor must cross the link between NCs, and the sending and receiving of the access request may be carried out by the LP and RP of different NCs, that is, the RP and the LP do not belong to the same NC: the RP belongs to an NC of the processor interconnect node where the source processor is located, and the LP belongs to an NC of the processor interconnect node where the target processor is located.
  • In this case, besides recording the RP identifier, the LP may also record the NC where the RP is located. When the data response is subsequently returned, the LP may first determine the NC where the RP is located according to the recorded information, and then determine the RP according to the RP identifier.
  • step 320 may specifically include:
  • Step 321 The LP receives the data response from the target processor, and sends the data response to the RP pointed to by the RP identifier.
  • Step 322 The RP sends the data response to the source processor corresponding to the source processor identifier.
  • Specifically, after receiving the access request and the node controller identifier, the target processor needs to return a data response to the source processor that requested the data, but the target processor does not record which specific processor needs to access the data at that address. The target processor can therefore determine the corresponding LP by the node controller identifier and return the data response to that LP.
  • After receiving the data response, the LP may forward the data response to the RP corresponding to the previously recorded RP identifier, that is, the address bits [A7, A6] of the target address.
  • the LP may also first determine the NC where the RP is located according to the previously recorded information.
  • For example, as shown in FIG. 3, CPU2 may also receive the node controller identifier while receiving the access request. CPU2 can send a data response to LP0, pointed to by the node controller identifier, and the target address can be included in the data response; after receiving the data response, LP0 can obtain the recorded RP identifier, [A7, A6] = 10, and thereby determine that the data response is to be sent to RP2. The RP records the source processor identifier, and after receiving the data response, the RP may forward the data response to the source processor corresponding to the recorded source processor identifier, so that the source processor receives the data at the target address and the data access process is completed.
  • In the data access process, the target processor can save directory information recording that an external processor occupies the data at the target address in its memory, but without recording which external processor occupies it; the LP can save directory information recording the NC where the RP is located, and the NC where the RP recorded by the LP is located may be different from the NC where the LP is located; the first directory information may be saved in the RP, recording that the source processor occupies the data at the target address in memory. For example, the first directory information of RP2 records that the source processor occupying the data at the target address is CPU5, so RP2 can send the data response to CPU5, and CPU5 can thus access the data at the target address on CPU2.
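  • The return path of steps 321 and 322 can be summarized with the following sketch, in which the LP replays its recorded RP identifier and the RP replays its recorded source processor identifier; all names and values are illustrative.

```c
#include <stdio.h>

/* State recorded on the request path and reused on the response path. */
typedef struct { int recorded_rp_id;   } lp_state_t;   /* LP keeps the RP identifier */
typedef struct { int recorded_src_cpu; } rp_state_t;   /* RP keeps the source CPU id */

static void return_data_response(const lp_state_t *lp, const rp_state_t *rp)
{
    printf("target CPU -> LP (via node controller identifier)\n");
    printf("LP -> RP%d (via recorded RP identifier)\n", lp->recorded_rp_id);
    printf("RP -> CPU%d (via recorded source processor identifier)\n",
           rp->recorded_src_cpu);
}

int main(void)
{
    lp_state_t lp = { .recorded_rp_id = 2 };     /* e.g. RP2 in the FIG. 3 example */
    rp_state_t rp = { .recorded_src_cpu = 5 };   /* e.g. CPU5                      */
    return_data_response(&lp, &rp);
    return 0;
}
```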
  • It should be noted that the processors in the same basic node may be interconnected in any manner and communicate through the processors' communication modules, which is not limited by the present invention. Which processors' transactions an NC manages can be divided according to different address spaces and can also be adjusted adaptively according to requirements; the present invention does not limit this either.
  • For example, as shown in FIG. 3, the right NC manages the transactions of CPU2, CPU3, CPU6, and CPU7, and the left NC manages the transactions of CPU0, CPU1, CPU4, and CPU5; cross management can also be performed as required, for example the right NC managing the transactions of CPU1, CPU2, CPU4, and CPU7 and the left NC managing the transactions of CPU0, CPU3, CPU5, and CPU6.
  • With the data access method of this embodiment, processors in the same basic node can interconnect directly and access each other's data, and data access between processors in different basic nodes does not require crossing the links between NCs, which reduces server latency while maintaining server bandwidth.
  • FIG. 4 shows a flow chart of a data access method according to another embodiment of the present invention.
  • After the target processor sends the data response to the source processor according to the access request, there may be another processor that needs to access the data at the target address and that sends a new access request to the target processor. In this case, a lookup of the directory information in the target processor finds that the data at the target address is already occupied by an external processor, and the target processor can initiate a snoop.
  • As shown in FIG. 4, the data access method may then mainly include:
  • Step 400: The node controller receives a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address.
  • Specifically, the target processor may determine, according to its saved directory information, that the data at the target address is occupied by an external processor, but it cannot determine which external processor occupies it.
  • Therefore, the target processor may initiate the snoop to the NC, that is, send the snoop message, and may also send to the NC the node controller identifier that it received in the foregoing steps, so that the control chip of the NC can determine which LP the snoop message is sent to.
  • The target address can also be included in the snoop message.
  • Step 410: The node controller sends the snoop message to the source processor according to the source processor identifier.
  • Step 420: The node controller receives a snoop response returned by the source processor, and sends the snoop response to the target processor according to the target address.
  • Specifically, after receiving the snoop message, the NC may send the snoop message to the source processor according to the source processor identifier. After receiving the snoop message, the source processor may return a snoop response to the NC; in doing so, the source processor may first determine, by the target address, the NC that can forward the snoop response. The NC can likewise determine the target processor by the target address and send the snoop response to the target processor, completing the snoop.
  • In a possible implementation, each HA transaction of a processor corresponds to one LP, and the target processor may send the snoop message to the NC according to the correspondence between the HA transaction and the LP.
  • In a possible implementation, the target processor may send the snoop message to multiple NCs at the same time, and an NC that does not proxy the target processor's transactions may return an invalid response (Response Invalid, RSPI) to the target processor after receiving the snoop message.
  • RSPI: Response Invalid
  • For example, as shown in FIG. 3, the HA0 transaction corresponds to LP0. When CPU2 sends a snoop message, it sends it to the left and right NCs simultaneously; after the left NC receives the snoop message, since it does not proxy CPU2's transactions, it can return RSPI directly to CPU2.
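  • A minimal sketch of this snoop fan-out, assuming a hypothetical per-NC check of whether it proxies the snooping CPU's transactions; the function and type names are not from the patent.

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { REPLY_FORWARD_TO_LP, REPLY_RSPI } snoop_reply_t;

/* An NC that does not proxy the snooping CPU's transactions answers RSPI at once. */
static snoop_reply_t nc_handle_snoop(int nc_id, int snooping_cpu, bool proxies_cpu)
{
    if (!proxies_cpu) {
        printf("NC%d: not proxying CPU%d, reply RSPI\n", nc_id, snooping_cpu);
        return REPLY_RSPI;
    }
    printf("NC%d: forward snoop from CPU%d to its LP\n", nc_id, snooping_cpu);
    return REPLY_FORWARD_TO_LP;
}

int main(void)
{
    nc_handle_snoop(0, 2, false);   /* left NC: returns RSPI to CPU2 */
    nc_handle_snoop(1, 2, true);    /* right NC: forwards to its LP  */
    return 0;
}
```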
  • In a possible implementation, the transactions of the source processor and the transactions of the target processor may be managed respectively by two NCs in the processor interconnect node, and there is no interconnection between the two NCs, so they cannot communicate directly. In this case, the target processor needs to first send the snoop message to an intermediate processor; the intermediate processor and the target processor belong to the same basic node and can communicate directly without going through an NC, and the transactions of the intermediate processor are managed by the NC through which the snoop message can reach the source processor. Through the forwarding of the intermediate processor, the snoop message of the target processor can be sent to the source processor.
  • For example, as shown in FIG. 3, if CPU5 is the source processor and CPU0 is the target processor, then when CPU0 initiates the snoop it needs to send the snoop message to CPU2 or CPU3, and CPU2 or CPU3 sends the snoop message to the right NC.
  • the selection of CPU 2 and CPU 3 can be determined by the routing configuration of the processor interconnect node.
  • the step 410 may further include:
  • Step 411: The LP receives the snoop message from the target processor.
  • Step 412: The LP obtains the RP identifier from the second directory information, and sends the snoop message to the RP pointed to by the RP identifier, where the second directory information is the directory information saved in the LP.
  • Specifically, while receiving the snoop message, the NC may also receive the node controller identifier sent by the target processor, and the control chip may send the snoop message to the LP pointed to by the node controller identifier.
  • The LP may determine, according to the second directory information it saves, which NC the snoop message is to be sent to; further, the LP may determine, according to the RP identifier, which RP in that NC the snoop message is to be sent to. For example, as shown in FIG. 3, by searching the second directory information, LP0 of the right NC can determine that the snoop message needs to be sent to RP0; since RP0 and LP0 belong to the same NC, the snoop message is sent to RP0 of the right NC.
  • Step 413: The RP sends the snoop message to the source processor pointed to by the source processor identifier.
  • Specifically, the RP may record the node controller identifier, for subsequently returning the snoop response correctly to the target processor.
  • The source processor that occupies the data at the target address is recorded in the first directory information, so the RP may send the snoop message to the source processor according to the recorded source processor identifier. For example, as shown in FIG. 3, the first directory information of RP0 records that CPU5 occupies the data at the target address; by looking up the first directory information, RP0 can determine CPU5 and send the snoop message to CPU5, as sketched below.
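  • A minimal sketch of this first-directory lookup in step 413, with an illustrative entry layout; a real RP directory would be indexed in hardware rather than scanned linearly.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct { uint64_t addr; int owner_cpu; } rp_dir_entry_t;

/* Return the source processor recorded as occupying the snooped address. */
static int rp_snoop_target(const rp_dir_entry_t *dir, int n, uint64_t snoop_addr)
{
    for (int i = 0; i < n; i++)
        if (dir[i].addr == snoop_addr)
            return dir[i].owner_cpu;
    return -1;   /* no recorded owner */
}

int main(void)
{
    rp_dir_entry_t dir[] = { { 0x1000, 5 } };    /* CPU5 occupies address 0x1000 */
    printf("send snoop to CPU%d\n", rp_snoop_target(dir, 1, 0x1000));
    return 0;
}
```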
  • In a possible implementation, if the NC that receives the snoop message does not handle the transactions of the source processor, the NC needs to first send the snoop message to an intermediate processor whose transactions are managed by that NC and which belongs to the same basic node as the source processor; through the forwarding of the intermediate processor, the snoop message of the target processor can be sent to the source processor.
  • the choice of the intermediate processor can be determined by the routing configuration of the processor interconnect node.
  • In a possible implementation, if the source processor and the target processor do not belong to the same processor interconnect node, then, referring to the server of the foregoing embodiments of the present invention, different processor interconnect nodes are connected through the links between NCs. In this case, the snoop message sent by the target processor to the source processor needs to cross the link between NCs, and the sending and receiving of the snoop message may be carried out by the LP and RP of different NCs, that is, the RP and the LP do not belong to the same NC: the RP belongs to an NC of the processor interconnect node where the source processor is located, and the LP belongs to an NC of the processor interconnect node where the target processor is located. In this case, the LP can also record the NC where the RP is located; when the snoop response is subsequently returned, the LP may first determine the NC where the RP is located according to the recorded information, and then determine the RP according to the RP identifier.
  • step 420 may specifically include:
  • Step 421: The control chip receives the snoop response from the source processor, obtains the RP identifier from the snoop response, and sends the snoop response to the RP pointed to by the RP identifier;
  • Step 422: The RP sends the snoop response to the LP pointed to by the node controller identifier.
  • Specifically, after receiving the snoop message, the source processor may return a snoop response to the target processor through the NC, where the target address may be included in the snoop response. The control chip may obtain the RP identifier from the snoop response and send the snoop response to the RP pointed to by the RP identifier.
  • Step 423: The LP sends the snoop response to the target processor according to the target address.
  • Specifically, the node controller identifier is the LP identifier, so the RP may determine the LP through the node controller identifier and send the snoop response to that LP.
  • The LP may determine, according to the target address included in the snoop response, the target processor that initiated the current snoop.
  • In a possible implementation, referring to the foregoing description of this embodiment, the LP may also first determine the NC where the RP is located according to the previously recorded information. For example, as shown in FIG. 3, if the node controller identifier recorded by RP0 when the snoop message was sent to it is LP0, then RP0 can send the snoop response to LP0.
  • In a possible implementation, the snoop message may further include a snoop type, and after the target processor completes the current snoop, the right to use the data at the target address may be obtained.
  • In addition, the directory information saved in the target processor, the LP, and the RP can be modified according to the snoop type in the snoop message.
  • If the snoop is an exclusive-type snoop, the information about the target address in the directory information saved in the target processor, the LP, and the RP may be cleared, and the source processor can no longer continue to occupy the data at the target address; if the snoop is a shared-type snoop, the information about the target address in the directory information saved in the target processor, the LP, and the RP is changed to the shared state, and the source processor and the processor that sent the new access request to the target processor can share the data at the target address at the same time.
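  • The resulting directory update can be sketched as follows, assuming just the two snoop types named above; the state and type names are illustrative.

```c
#include <stdio.h>

typedef enum { SNOOP_EXCLUSIVE, SNOOP_SHARED } snoop_type_t;
typedef enum { DIR_INVALID, DIR_EXCLUSIVE, DIR_SHARED } dir_state_t;

/* Exclusive snoop: clear the entry (the old owner gives the line up entirely).
 * Shared snoop: leave the line recorded as shared by both processors. */
static dir_state_t update_after_snoop(snoop_type_t type)
{
    return (type == SNOOP_EXCLUSIVE) ? DIR_INVALID : DIR_SHARED;
}

int main(void)
{
    printf("%d %d\n",
           update_after_snoop(SNOOP_EXCLUSIVE),
           update_after_snoop(SNOOP_SHARED));
    return 0;
}
```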
  • It should be noted that the processors in the same basic node may be interconnected in any manner and communicate through the processors' communication modules, which is not limited by the present invention. Which processors' transactions an NC manages can be divided according to different address spaces and can also be adjusted adaptively according to requirements; the present invention does not limit this either.
  • With the data access method of this embodiment, processors in the same basic node can interconnect directly and access each other's data, and data access between processors in different basic nodes does not require crossing the links between NCs, which reduces server latency while maintaining server bandwidth.
  • FIG. 5 is a block diagram showing the structure of a server according to another embodiment of the present invention.
  • the server 500 may be a host server having a computing capability, a personal computer PC, or a portable computer or terminal that can be carried.
  • the specific embodiments of the present invention do not limit the specific implementation of the computing node.
  • the server 500 includes a processor 510, a communications interface 520, a memory 530, a bus 540, and a node controller 550.
  • The processor 510, the communication interface 520, the node controller 550, and the memory 530 communicate with one another through the bus 540.
  • Communication interface 520 is for communicating with network devices, including, for example, a virtual machine management center, shared storage, and the like.
  • the node controller 550 is used to execute programs.
  • the processor 510 may be a central processing unit CPU, or an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
  • ASIC: Application Specific Integrated Circuit
  • the memory 530 is used to store files.
  • the memory 530 may include a high speed RAM memory and may also include a non-volatile memory such as at least one disk memory.
  • Memory 530 can also be a memory array.
  • Memory 530 may also be partitioned, and the blocks may be combined into virtual volumes according to certain rules.
  • The above program may be program code including computer operating instructions. This program may be used to execute the foregoing data access method, in which:
  • the node controller includes a control chip, a local proxy LP, and a remote proxy RP, and the sending, by the node controller according to the target address, of the access request and the node controller identifier to the target processor includes:
  • the control chip receives the source processor identifier and the access request from the source processor, obtains an RP identifier from the access request, and sends the access request and the source processor identifier to the RP pointed to by the RP identifier;
  • the RP obtains the target address from the access request, decodes the target address to obtain an LP identifier, and sends the access request to the LP pointed to by the LP identifier;
  • the LP records the RP identifier, obtains the target address from the access request, and sends the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier;
  • the receiving, by the node controller, of a data response from the target processor and the sending of the data response to the source processor according to the source processor identifier includes:
  • the LP receives the data response from the target processor and sends the data response to the RP pointed to by the RP identifier;
  • the RP sends the data response to the source processor corresponding to the source processor identifier.
  • The program may be further used to perform the snoop procedure, in which the sending, by the node controller, of the snoop message to the source processor according to the source processor identifier includes:
  • the LP receives the snoop message from the target processor;
  • the LP obtains the RP identifier from the second directory information and sends the snoop message to the RP pointed to by the RP identifier, where the second directory information is the directory information saved in the LP;
  • the RP sends the snoop message to the source processor pointed to by the source processor identifier;
  • the receiving, by the node controller, of the snoop response returned by the source processor and the sending of the snoop response to the target processor according to the target address includes:
  • the control chip receives the snoop response from the source processor, obtains the RP identifier from the snoop response, and sends the snoop response to the RP pointed to by the RP identifier;
  • the RP sends the snoop response to the LP pointed to by the node controller identifier;
  • the LP sends the snoop response to the target processor according to the target address.
  • If the functions are implemented in the form of computer software and sold or used as a stand-alone product, it can be considered, to some extent, that all or part of the technical solution of the present invention (for example, the part contributing to the prior art) is embodied in the form of a computer software product. The computer software product is typically stored in a computer-readable non-volatile storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods in the embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Multi Processors (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A server and a data access method. The server (100) includes a processor interconnect node (110); the processor interconnect node (110) includes at least one node controller (120) and at least two basic nodes (130), and each basic node (130) includes at least four processors (140); the node controller (120) is connected to the basic nodes (130) and is configured to manage the transactions of the processors (140) according to the address spaces of the processors (140); it is further configured to receive an access request of a source processor and a source processor identifier, and to send the access request and the node controller identifier to a target processor according to the target address carried in the access request. At least one NC guarantees the bandwidth of the server; processors in the same basic node can interconnect directly and access each other, and when processors in different basic nodes of the same processor interconnect node access data, there is no need to cross the link between NCs, which reduces server latency.

Description

Server and data access method
This application claims priority to Chinese Patent Application No. 201410091090.X, filed with the Chinese Patent Office on March 12, 2014 and entitled "Server and data access method", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a server and a data access method.
Background
In terms of system architecture, current commercial servers can generally be divided into three categories: symmetric multi-processor (Symmetric Multi-Processor, SMP), non-uniform memory access (Non-Uniform Memory Access, NUMA), and massive parallel processing (Massive Parallel Processing, MPP).
An SMP server is one in which multiple central processing units (Central Processing Unit, CPU) work symmetrically, with no primary/secondary or master/slave relationship; every CPU shares the same physical memory, and the time required to access any address in memory is the same. The drawback of SMP is limited scalability. A NUMA server has multiple CPU modules; each CPU module consists of several CPUs (for example, four) and has independent local memory, I/O slots, and so on, and CPU modules can be connected through interconnect modules (such as a crossbar switch) to exchange information. Each CPU accesses its local memory much faster than remote memory (the memory of other CPU modules in the system), and server performance cannot increase linearly as the number of CPUs increases. An MPP server is formed by connecting multiple SMP servers through a node interconnection network; each SMP node can run its own operating system, database, and so on, but the CPUs in one node cannot access the memory of another node, and information exchange between nodes is implemented through the node interconnection network.
There are currently three processor interconnect architectures. The first is a single-cube interconnect architecture, the largest processor interconnect architecture recommended by Intel; it supports interconnection of eight CPUs but can only be extended to an 8P system at most, so more CPUs cannot be connected and scalability is affected.
In the second processor interconnect architecture, two or four CPUs within a node are interconnected with one node controller (Node Controller, NC), and NCs are interconnected with one another to form a larger system. The disadvantage of this architecture is that the NC's external links become a bandwidth bottleneck, and all CPUs within the node must go through the same NC for transaction processing and bandwidth.
In the third processor interconnect architecture, two or four CPUs within a node are interconnected with two NCs; in this topology, nodes are interconnected through the two NCs, and the two NCs share the transaction processing and bandwidth load according to address space, which better satisfies bandwidth requirements. This topology has a small delay in a 4P system, but for an 8P or larger system, when a CPU in one node accesses memory in another node it must cross two NCs, the delay is large, and delay has a great influence on NUMA system performance.
In summary, how to reduce server latency while ensuring server bandwidth is a problem that currently needs to be solved.
Summary of the Invention
Technical Problem
In view of this, the technical problem to be solved by the present invention is how to reduce server latency while ensuring server bandwidth.
Solution
To solve the above technical problem, in a first aspect, the present invention provides a server, including:
a processor interconnect node;
the processor interconnect node includes at least one node controller and at least two basic nodes, and each basic node includes at least four processors;
the node controller is connected to the basic nodes and is configured to manage the transactions of the processors according to the address spaces of the processors;
the node controller is further configured to receive an access request of a source processor and a source processor identifier, and to send the access request and a node controller identifier to a target processor according to a target address carried in the access request, where the source processor and the target processor are located in different basic nodes, and the target address is the address of the target processor.
With reference to the first aspect, in a first possible implementation of the first aspect, the node controller is further configured to receive a data response from the target processor and send the data response to the source processor according to the source processor identifier.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the node controller includes a control chip, a local proxy LP, and a remote proxy RP;
the control chip is configured to receive the source processor identifier and the access request from the source processor, obtain an RP identifier from the access request, and send the access request and the source processor identifier to the RP pointed to by the RP identifier;
the RP is configured to obtain the target address from the access request, decode the target address to obtain an LP identifier, and send the access request to the LP pointed to by the LP identifier; and to receive the data response from the LP and send the data response to the source processor corresponding to the source processor identifier;
the LP is configured to record the RP identifier, obtain the target address from the access request, and send the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier; to receive the data response from the target processor; and to send the data response to the RP pointed to by the RP identifier.
With reference to the first aspect and the first and second possible implementations of the first aspect, in a third possible implementation of the first aspect, the node controller is specifically further configured to: when the target processor receives a new access request indicating that the data at the target address is to be accessed, receive a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address; send the snoop message to the source processor according to the source processor identifier; and receive a snoop response returned by the source processor and send the snoop response to the target processor according to the target address.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the LP is further configured to receive the snoop message from the target processor; obtain the RP identifier from second directory information and send the snoop message to the RP pointed to by the RP identifier, where the second directory information is directory information saved in the LP; and send the snoop response to the target processor according to the target address;
the RP is further configured to send the snoop message to the source processor pointed to by the source processor identifier, and to send the snoop response to the LP pointed to by the node controller identifier.
With reference to the first aspect and any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, the processor interconnect node includes a first basic node, a second basic node, and two node controllers, and the first basic node and the second basic node each include at least four processors.
In a second aspect, the present invention provides a data access method, applied to the server of the first aspect or any one of the possible implementations of the first aspect; when a source processor needs to access a target processor, the data access method includes:
a node controller receives the access request of the source processor and a source processor identifier, where the access request carries a target address, and the target address is the address of the target processor;
the node controller sends the access request and a node controller identifier to the target processor according to the target address;
the node controller receives a data response from the target processor and sends the data response to the source processor according to the source processor identifier.
With reference to the second aspect, in a first possible implementation of the second aspect, the node controller includes a control chip, a local proxy LP, and a remote proxy RP, and the sending, by the node controller according to the target address, of the access request and the node controller identifier to the target processor includes:
the control chip receives the source processor identifier and the access request from the source processor, obtains an RP identifier from the access request, and sends the access request and the source processor identifier to the RP pointed to by the RP identifier;
the RP obtains the target address from the access request, decodes the target address to obtain an LP identifier, and sends the access request to the LP pointed to by the LP identifier;
the LP records the RP identifier, obtains the target address from the access request, and sends the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier;
the receiving, by the node controller, of the data response from the target processor and the sending of the data response to the source processor according to the source processor identifier includes:
the LP receives the data response from the target processor and sends the data response to the RP pointed to by the RP identifier;
the RP sends the data response to the source processor corresponding to the source processor identifier.
With reference to the second aspect and the first possible implementation of the second aspect, in a second possible implementation of the second aspect, when the target processor receives a new access request indicating that the data at the target address needs to be accessed, the data access method further includes:
the node controller receives a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address;
the node controller sends the snoop message to the source processor according to the source processor identifier;
the node controller receives a snoop response returned by the source processor and sends the snoop response to the target processor according to the target address.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the sending, by the node controller, of the snoop message to the source processor according to the source processor identifier includes:
the LP receives the snoop message from the target processor;
the LP obtains the RP identifier from second directory information and sends the snoop message to the RP pointed to by the RP identifier, where the second directory information is directory information saved in the LP;
the RP sends the snoop message to the source processor pointed to by the source processor identifier;
the receiving, by the node controller, of the snoop response returned by the source processor and the sending of the snoop response to the target processor according to the target address includes:
the control chip receives the snoop response from the source processor, obtains the RP identifier from the snoop response, and sends the snoop response to the RP pointed to by the RP identifier;
the RP sends the snoop response to the LP pointed to by the node controller identifier;
the LP sends the snoop response to the target processor according to the target address.
Beneficial Effects
In the server of these embodiments, at least one NC guarantees the bandwidth of the server; further, processors in the same basic node can interconnect directly and access each other's data, and when processors in different basic nodes of the same processor interconnect node access data, there is no need to cross the link between NCs, which reduces server latency.
Other features and aspects of the present invention will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which are included in and constitute a part of the specification, together with the specification illustrate exemplary embodiments, features, and aspects of the present invention, and are used to explain the principles of the present invention.
FIG. 1a is a block diagram showing the structure of a server according to an embodiment of the present invention;
FIG. 1b is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention;
FIG. 1c is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data access method according to an embodiment of the present invention;
FIG. 3 is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention;
FIG. 4 is a flow chart of a data access method according to another embodiment of the present invention;
FIG. 5 is a block diagram showing the structure of a server according to another embodiment of the present invention.
Detailed Description
Various exemplary embodiments, features, and aspects of the present invention are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise specified.
The word "exemplary" used herein means "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as superior to or better than other embodiments.
In addition, to better describe the present invention, numerous specific details are given in the following detailed description. Those skilled in the art should understand that the present invention can also be implemented without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present invention.
FIG. 1a is a block diagram showing the structure of a server according to an embodiment of the present invention. The server 100 may specifically include:
a processor interconnect node 110;
the processor interconnect node 110 includes at least one node controller 120 and at least two basic nodes 130, and each basic node 130 includes at least four processors 140;
the node controller 120 is connected to the basic nodes 130 and is configured to manage the transactions of the processors 140 according to the address spaces of the processors 140.
Specifically, the server 100 may include a processor interconnect node 110, and the processor interconnect node 110 may include at least one node controller 120. Further, the processor interconnect node 110 may also include at least two basic nodes 130, and each basic node 130 may include at least four processors 140.
In the server 100, the node controller 120 is connected to the basic nodes 130 and can manage the transactions of the processors 140 according to the address spaces of the different processors 140 in the basic nodes 130. The node controller 120 can also be connected to node controllers in other processor interconnect nodes, so that a processor can access processors in other processor interconnect nodes through the node controllers and the links between them, satisfying the bandwidth requirements of the server.
Further, the node controller 120 is also configured to receive an access request of a source processor and a source processor identifier, and to send the access request and a node controller identifier to a target processor according to the target address carried in the access request, where the source processor and the target processor are located in different basic nodes and the target address is the address of the target processor; the node controller 120 is also configured to receive a data response from the target processor and send the data response to the source processor according to the source processor identifier.
For the processor interconnect node 110, processors 140 in the same basic node 130 can communicate with each other directly through the communication modules in the processors 140 to access each other, while processors 140 in different basic nodes 130 can communicate with each other through the node controller 120 to access each other. Moreover, when the source processor needs to access data in the target processor and the source processor and the target processor are located in different basic nodes, during the process in which the source processor sends an access request to the target processor, the node controller 120 can receive the access request of the source processor and the source processor identifier and send the access request and the node controller identifier to the target processor according to the target address carried in the access request; during the process in which the target processor returns a data response to the source processor, the target processor can send the data response to the corresponding node controller 120 according to the node controller identifier, and after receiving the data response, the node controller 120 can send the data response to the source processor according to the source processor identifier. In this communication process there is no need to cross the link between NCs, which can reduce server delay.
Specifically, the node controller 120 may include a control chip, a local proxy LP, and a remote proxy RP. In the process in which the source processor requests access to data in the target processor, the above components of the node controller 120 may respectively perform the following actions:
the control chip is configured to receive the source processor identifier and the access request from the source processor, obtain an RP identifier from the access request, and send the access request and the source processor identifier to the RP pointed to by the RP identifier;
the RP is configured to obtain the target address from the access request, decode the target address to obtain an LP identifier, and send the access request to the LP pointed to by the LP identifier; and to receive the data response from the LP and send the data response to the source processor corresponding to the source processor identifier;
the LP is configured to record the RP identifier, obtain the target address from the access request, and send the access request and the node controller identifier to the target processor pointed to by the target address, where the node controller identifier is the LP identifier; to receive the data response from the target processor; and to send the data response to the RP pointed to by the RP identifier.
In a possible implementation, when the target processor receives a new access request indicating that the data at the target address is to be accessed, the node controller 120 can receive a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address; send the snoop message to the source processor according to the source processor identifier; and receive a snoop response returned by the source processor and send the snoop response to the target processor according to the target address.
In the server of this embodiment, at least one NC guarantees the bandwidth of the server; processors in the same basic node can interconnect directly and access each other's data, and when processors in different basic nodes of the same processor interconnect node access data, there is no need to cross the link between NCs, which reduces server latency.
FIG. 1b is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention. As shown in FIG. 1b, the processor interconnect node 200 may specifically include: a first basic node 210, a second basic node 220, and two node controllers 230, where the first basic node 210 includes at least four processors 240 and the second basic node 220 includes at least four processors 250.
Specifically, four processors form a 4P node, which may be referred to as a basic node; each processor has its own memory and communication module, the processors can communicate with one another through their communication modules, and each can access data in its own memory as well as data in each other's memory. Each processor interconnect node may be composed of eight processors, formed by interconnecting two basic nodes as described above through two node controllers (NCs). The two NCs may each be responsible for the plane of a different address space, that is, each is responsible for the transactions of the processors in one of two different address spaces; this can also be adjusted as needed, which is not limited by the present invention. FIG. 1c is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention. As shown in FIG. 1c, node controller 1 is responsible for the transactions of processor 0 and processor 1 in basic node 0 and of processor 4 and processor 5 in basic node 1, and node controller 0 is responsible for the transactions of processor 2 and processor 3 in basic node 0 and of processor 6 and processor 7 in basic node 1. Further, the two NCs can each be interconnected with other NCs to form a server with larger bandwidth, that is, an NC can be connected to other NCs through its interconnect interface. In this processor interconnect node, the dual NCs can guarantee the bandwidth of the server while processors perform cross-NC accesses. In addition, when a processor in basic node 0 accesses memory in basic node 1, the access can be made directly through node controller 0 or node controller 1 without crossing the link between NCs; for example, processor 2 in basic node 0 can directly access the memory of processor 6 in basic node 1 through node controller 0, which keeps the server latency caused by cross-basic-node accesses within the processor interconnect node low.
In summary, the server provided by the present invention includes at least one node controller and at least two basic nodes that communicate through the node controller, which can ensure server bandwidth while reducing the server latency incurred when processors in different basic nodes of the same processor interconnect node access each other.
FIG. 2 is a flow chart of a data access method according to an embodiment of the present invention. As shown in FIG. 2, the data access method may be applied to the server in the foregoing embodiments of the present invention; when a source processor needs to access a target processor, the data access method may mainly include:
Step 300: A node controller receives the access request of the source processor and a source processor identifier, where the access request carries a target address, and the target address is the address of the target processor.
Specifically, in the processor interconnect node of the foregoing embodiments, when the source processor needs to access the target processor, the NC that processes the source processor's transactions can be determined according to the address of the data to be accessed, that is, the target address, and the access request and the source processor identifier are sent to that NC. The target address can be included in the access request.
In a possible implementation, in the processor interconnect node of the foregoing embodiments, the transactions of the source processor and the transactions of the target processor may be managed respectively by two NCs in the processor interconnect node; the two NCs manage different address spaces and can share the bandwidth pressure of the processor interconnect node, but there is no interconnection between the two NCs and they cannot communicate directly. In this case, the source processor needs to first send the access request and the source processor identifier to an intermediate processor; the intermediate processor and the source processor belong to the same basic node and can communicate directly without going through an NC, and the transactions of the intermediate processor and the transactions of the target processor are managed by the same NC, through which the intermediate processor can communicate with the target processor. Through the forwarding of the intermediate processor, the access request and the source processor identifier of the source processor can be sent to the target processor. For example, FIG. 3 is a block diagram showing the structure of a processor interconnect node according to an embodiment of the present invention. As shown in FIG. 3, if CPU5 is the source processor, CPU2 is the target processor, and CPU5 needs to access data of CPU2, then CPU5 needs to send the access request and the identifier of CPU5 to CPU6 or CPU7, and CPU6 or CPU7 sends the access request and the identifier of CPU5 to CPU2 through the right NC. The selection between CPU6 and CPU7 can be determined by the routing configuration of the processor interconnect node.
Step 310: the node controller sends the access request and the node controller identifier to the target processor according to the target address.
Specifically, after receiving the access request and the source processor identifier, the NC can record the source processor identifier, which is later used to determine which processor occupies the data at the target address. The NC can determine, from the target address, the target processor where the data to be accessed resides, and send the access request to the target processor; at the same time the NC can also send its own identifier, i.e., the node controller identifier, to the target processor, so that the target processor can correctly return the data response.
Step 320: the node controller receives a data response from the target processor and sends the data response to the source processor according to the source processor identifier.
Specifically, after receiving the data access request, the target processor can return a data response. In the process of returning the data response, the target processor first determines, from the node controller identifier, the NC that can forward the data response; after receiving the data response, the NC can determine, from the recorded source processor identifier, the source processor that requested the data and send the data response to the source processor, completing the data access.
In a possible implementation, the NC may include a local proxy (Local Proxy, LP) and a remote proxy (Remote Proxy, RP). The LP handles the protocol processing between the CPUs inside the basic node and the NCs outside the basic node: seen from a CPU inside the basic node, the LP has a cache agent (Cache Agent, CA) function, i.e., the CPU inside the basic node regards the LP as having processor cores, although the cores are not on the LP but in processors of remote nodes; seen from an NC outside the basic node, the LP has a home agent (Home Agent, HA) function, i.e., the NC outside the basic node regards the LP as having memory, although the memory is not on the LP but is attached to processors inside the basic node. The RP likewise handles the protocol processing between the CPUs inside the basic node and the NCs outside the basic node: seen from a CPU inside the basic node, the RP has an HA function, i.e., the CPU inside the basic node regards the RP as having memory, although the memory is not on the RP but is attached to processors outside the basic node; seen from an NC outside the basic node, the RP has a CA function, i.e., the NC outside the basic node regards the RP as having processor cores, although the cores are not on the RP but in processors inside the basic node. In the data access between CPUs, one LP can be responsible for the HA transactions located in two processors, where an HA transaction is a request to access memory. The RP can manage the requests of the eight CPUs through low-order address interleaving. The presence of the LP and RP allows processors both inside and outside the basic node to access data in memory both inside and outside the basic node without data inconsistency.
In a possible implementation, before step 300, the source processor can determine, from the target address, whether the access request needs to be sent to an NC. If the source processor and the target processor where the data to be accessed resides belong to the same basic node (for example, basic node 0), the access request does not need to be sent to an NC, because CPUs within the same basic node can access one another directly through their communication modules; if the source processor and the target processor do not belong to the same basic node (for example, the source processor belongs to basic node 0 and the target processor belongs to basic node 1), the access request needs to be sent to an NC, because processors in different basic nodes access one another through an NC. When the access request needs to be sent to an NC, the source processor can further determine, from the target address, to which NC of the processor interconnect node the access request should be sent. As shown in FIG. 3, CPU0 to CPU3 belong to one basic node and address bit A41 = 0 in their addresses, while CPU4 to CPU7 belong to the other basic node and address bit A41 = 1 in their addresses; address bit A40 = 0 in the addresses of the processors proxied by the left NC, and A40 = 1 in the addresses of the processors proxied by the right NC. If CPU5 requests access to memory data of CPU2, CPU5 can determine, from address bit A41 = 0 of CPU2 where the data to be accessed resides, that CPU2 and CPU5 do not belong to the same basic node and that the access must go through an NC. CPU5 then determines, from address bit A40 = 1 of CPU2, that the access request should be sent to the right NC.
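The routing decision in the preceding paragraph can be pictured with a short C sketch. It assumes a 64-bit physical address and the bit layout of the FIG. 3 example (A41 selects the basic node, A40 selects the NC plane); the helper and function names are hypothetical.

#include <stdint.h>
#include <stdbool.h>

static inline int addr_bit(uint64_t addr, int bit)
{
    return (int)((addr >> bit) & 1u);
}

/* Returns true if the access must go through an NC, i.e. the target address
 * lies in the other basic node; *nc_out is then 0 for the left NC (A40 = 0)
 * or 1 for the right NC (A40 = 1). */
bool route_needs_nc(uint64_t target_addr, int src_basic_node, int *nc_out)
{
    int dst_basic_node = addr_bit(target_addr, 41);   /* A41: which basic node */
    if (dst_basic_node == src_basic_node)
        return false;                                 /* direct CPU-to-CPU access */
    *nc_out = addr_bit(target_addr, 40);              /* A40: which NC plane */
    return true;
}

In the FIG. 3 example this yields: CPU5 (basic node with A41 = 1) accessing an address of CPU2 (A41 = 0, A40 = 1) obtains true with *nc_out == 1, i.e., the right NC.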
In a possible implementation, the NC may further include a control chip, and step 310 may specifically include:
Step 311: the control chip receives the source processor identifier and the access request from the source processor, obtains an RP identifier from the access request, and sends the access request and the source processor identifier to the RP indicated by the RP identifier.
Specifically, the access request may include the target address, which indicates the address of the data the source processor needs to access. After receiving the source processor identifier and the access request from the source processor, the control chip can obtain the RP identifier from the target address included in the access request; the RP identifier indicates to which RP of the NC the control chip should send the source processor identifier and the access request. The control chip can send the source processor identifier and the access request to the RP indicated by the RP identifier, where the RP identifier can be the address bits [A7, A6] of the target address, and the source processor identifier can be used to record, in the directory information of the RP, the source processor that occupies the data at the target address in memory. For example, as shown in FIG. 3, if the address bits [A7, A6] of the target address are 10, CPU5 can determine that the access request is to be sent to the right NC through the intermediate processor CPU6 or CPU7; the control chip of the right NC receives the access request, obtains the RP identifier from it, and can determine from the RP identifier that the access request is to be sent to RP2.
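Under the same assumptions as the earlier sketch, the RP selection by the control chip reduces to reading address bits [A7, A6]; the routine below is illustrative only and its name is an assumption.

#include <stdint.h>

/* RP identifier taken from target address bits [A7, A6]; for example,
 * [A7, A6] = 10 yields 2, i.e. RP2 in the FIG. 3 example. */
int decode_rp(uint64_t target_addr)
{
    return (int)((target_addr >> 6) & 0x3u);
}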
Step 312: the RP obtains the target address from the access request, decodes the target address to obtain an LP identifier, and sends the access request to the LP indicated by the LP identifier.
Specifically, the RP can keep a piece of directory information, i.e., first directory information, in which it can be recorded that a certain processor occupies the data at a certain address in memory, the processor being recorded by its identifier. According to the MESI cache-coherence and memory-coherence protocol, every cache line can be marked as one of the following four states: Modified, Exclusive, Shared, or Invalid. When a cache line is marked as the invalid state, the line is invalid, i.e., an empty line; an invalid line must be fetched from memory and brought to the shared or exclusive state before a read request can be served.
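A minimal sketch of such a "first directory information" entry, with assumed field and type names, might look as follows in C; the patent does not prescribe this layout.

#include <stdint.h>

enum mesi_state { MESI_MODIFIED, MESI_EXCLUSIVE, MESI_SHARED, MESI_INVALID };

struct rp_dir_entry {
    uint64_t        target_addr;   /* address of the cache line */
    int             src_cpu_id;    /* source processor recorded as occupying the data */
    enum mesi_state state;         /* an Invalid line must first be fetched from memory */
};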
The first directory information of the RP can also record the state of the cache line. When the RP finds that the target address is recorded as invalid in the first directory information, the RP can judge, from address bits [A45, A42] of the target address, to which processor interconnect node the received access request should be sent; if it is the processor interconnect node where the RP resides, the RP can judge from address bits A41 and A6 of the target address to which LP it should be sent. For example, as shown in FIG. 3, A41 = 0; the processors proxied by LP0 and LP1 have address bit A41 = 0, and the processors proxied by LP3 and LP4 have address bit A41 = 1, so the RP can decode the target address to obtain the LP identifier, i.e., determine from address bit A41 = 0 of the target address that the received access request is to be sent to LP0 or LP1. Further, the HA transactions proxied by LP0 have A6 = 0 while those proxied by LP1 have A6 = 1, so the RP can determine from address bit A6 = 0 of the target address that the received access request is to be sent to LP0, i.e., the LP.
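The LP selection described above could, under the FIG. 3 assumptions, be sketched as the following illustrative C routine; the numbering of the returned LP is hypothetical, and the [A45..A42] check for the owning processor interconnect node is omitted.

#include <stdint.h>

/* Picks the LP within this processor interconnect node from target address
 * bits A41 (which basic node the LP pair proxies) and A6 (which HA plane
 * within the pair). In the FIG. 3 example, A41 = 0 and A6 = 0 select LP0. */
int decode_lp(uint64_t target_addr)
{
    int a41 = (int)((target_addr >> 41) & 1u);
    int a6  = (int)((target_addr >> 6)  & 1u);
    return a41 * 2 + a6;   /* 0..3: one LP per (A41, A6) combination */
}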
Step 313: the LP records the RP identifier, obtains the target address from the access request, and sends the access request and a node controller identifier to the target processor indicated by the target address, where the node controller identifier is the LP identifier.
Specifically, after receiving the access request, the LP can record the identifier of the RP, i.e., the RP identifier, for returning the data response to that RP later. The LP can also obtain the target address from the access request and determine, from address bit A39 of the target address, to which processor, i.e., the target processor, the access request should be sent. When sending the access request, the LP can also send the node controller identifier, i.e., the LP identifier, to the target processor, for the target processor to return the data response to that LP later. For example, as shown in FIG. 3, if address bit A39 of the target address is 0, LP0 can send the access request and the node controller identifier to the corresponding target processor CPU2.
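The last hop can likewise be sketched: within the (at most two) processors whose HA transactions a given LP proxies, address bit A39 picks the target processor. The lookup table passed in below is purely illustrative and not part of the patent.

#include <stdint.h>

/* lp_cpus[0] / lp_cpus[1] are the two processors proxied by this LP;
 * in the FIG. 3 example, A39 = 0 steers LP0 to CPU2. */
int lp_target_cpu(const int lp_cpus[2], uint64_t target_addr)
{
    return lp_cpus[(int)((target_addr >> 39) & 1u)];
}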
In a possible implementation, the source processor and the target processor may not belong to the same processor interconnect node; referring to the server of the foregoing embodiments of the present invention, different processor interconnect nodes can be connected through links between NCs. In this case, the source processor's request to access the target processor's data has to cross a link between NCs, and the sending and receiving of the access request may be carried out by the LP and RP of different NCs, i.e., the RP and the LP do not belong to the same NC: the RP belongs to the NC of the processor interconnect node where the source processor resides, and the LP belongs to the NC of the processor interconnect node where the target processor resides. In this case, in addition to recording the RP identifier, the LP can also record the NC where the RP resides. When the data response is returned later, the LP can first determine the NC where the RP resides from the recorded information and then determine the RP from the RP identifier.
In a possible implementation, step 320 may specifically include:
Step 321: the LP receives the data response from the target processor and sends the data response to the RP indicated by the RP identifier;
Step 322: the RP sends the data response to the source processor corresponding to the source processor identifier.
Specifically, after receiving the access request and the node controller identifier, the target processor needs to return a data response to the source processor that requested the data, but the target processor does not record which specific processor needs to access the data at that address. The target processor can determine the corresponding LP from the node controller identifier and return the data response to that LP; after receiving the data response, the LP can forward it, according to the previously recorded RP identifier, i.e., address bits [A7, A6] of the target address, to the RP corresponding to that RP identifier. In a possible implementation, referring to the above description of this embodiment, the LP can also first determine the NC where the RP resides from the previously recorded information. For example, as shown in FIG. 3, when CPU2 receives the access request it also receives the node controller identifier. CPU2 can send the data response, which may include the target address, to LP0 indicated by the node controller identifier. After receiving the data response, LP0 can obtain the RP identifier, i.e., [A7, A6] = 10, from its recorded information and determine from it that the data response is to be sent to RP2.
In step 312 above, the RP recorded the source processor identifier. After receiving the data response, the RP can forward it, according to the recorded source processor identifier, to the source processor corresponding to that identifier; after receiving the data response, the source processor can access the data at the target address, which completes the data access. The target processor can keep directory information recording that an external processor occupies the data at the target address in memory, but it does not record which external processor it is; the LP can keep directory information recording that the RP of the NC where the LP resides occupies the data at the target address in memory, and further, referring to the related description of the foregoing embodiments, the NC where the RP recorded by the LP resides may differ from the NC where the LP resides; the RP can keep the first directory information recording that the source processor occupies the data at the target address in memory. For example, as shown in FIG. 3, the first directory information of RP2 records that the source processor accessing the data at the target address is CPU5, so RP2 can send the data response to CPU5, and CPU5 can then access the data at the target address on CPU2.
It should be noted that, in the processor interconnect node of the present invention, processors in the same basic node can be interconnected in any manner and communicate through their communication modules, and the present invention does not limit this. In addition, in the processor interconnect node of the present invention, which processors' transactions an NC manages can be divided according to different address spaces and can also be adapted as required; the present invention likewise does not limit this. As shown in FIG. 3, according to the division of address spaces, the right NC may manage the transactions of CPU2, CPU3, CPU6 and CPU7 and the left NC may manage the transactions of CPU0, CPU1, CPU4 and CPU5; the management can also be interleaved as required, for example the right NC manages the transactions of CPU1, CPU2, CPU4 and CPU7 and the left NC manages the transactions of CPU0, CPU3, CPU5 and CPU6.
In the data access method of this embodiment, within the same processor interconnect node, processors in the same basic node can be directly interconnected and access one another's data, and data access between processors in different basic nodes does not need to cross a link between NCs, which reduces server latency while guaranteeing server bandwidth.
FIG. 4 shows a flowchart of a data access method according to another embodiment of the present invention. In a possible implementation, after the target processor sends the data response to the source processor according to the access request, another processor may need to access the data at the target address and send a new access request to the target processor. A lookup of the directory information in the target processor then reveals that the data at the target address is already occupied by an external processor, and the target processor can initiate a snoop. As shown in FIG. 4, the data access method mainly includes:
Step 400: the node controller receives a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address.
Specifically, when the target processor receives a new access request and that new access request indicates that the data at the target address is needed, the target processor can determine from its stored directory information that the data at the target address is already occupied by an external processor, but cannot determine by which external processor. The target processor can then initiate a snoop toward the NC, i.e., send a snoop message, and at the same time send to the NC the node controller identifier it received in step 310 above, so that the control chip of the NC can determine to which LP the snoop message should be sent. The snoop message may also include the target address.
Step 410: the node controller sends the snoop message to the source processor according to the source processor identifier;
Step 420: the node controller receives a snoop response returned by the source processor and sends the snoop response to the target processor according to the target address.
Specifically, after receiving the snoop message and the node controller identifier, the NC can send the snoop message to the source processor according to the source processor identifier. After receiving the snoop message, the source processor can return a snoop response to the NC; in doing so, the source processor can first determine, from the target address, the NC that can forward the snoop response, and after receiving the snoop response, that NC can determine the target processor from the target address and send the snoop response to the target processor, completing the snoop.
In a possible implementation, in the processor interconnect node, each HA transaction of a processor corresponds to one LP. Before step 400, the target processor can send the snoop message to the NCs according to the correspondence between HA transactions and LPs. The target processor may therefore send the snoop message to multiple NCs at the same time, and an NC that does not proxy the target processor can return an invalid response (Response Invalid, RSPI) to the target processor after receiving the snoop message. For example, as shown in FIG. 3, HA0 transactions correspond to LP0; when CPU2 sends a snoop message, it is sent simultaneously to both the left and the right NC, and after the left NC receives the snoop message, since it does not proxy CPU2's transactions, it can directly return an RSPI to CPU2.
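A purely illustrative way to express this rule in C: an NC that does not proxy the snooping processor's transactions answers with RSPI instead of forwarding the snoop to an LP. The names used here are assumptions, not taken from the patent.

#include <stdbool.h>

enum nc_snoop_action { NC_FORWARD_TO_LP, NC_RSPI };

/* nc_proxies_snooping_cpu: whether this NC manages the transactions of the
 * processor that issued the snoop (e.g. the left NC in FIG. 3 does not proxy
 * CPU2, so it returns RSPI). */
enum nc_snoop_action nc_on_snoop(bool nc_proxies_snooping_cpu)
{
    return nc_proxies_snooping_cpu ? NC_FORWARD_TO_LP : NC_RSPI;
}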
In a possible implementation, referring to the related description of the data access method in the foregoing embodiments, the transactions of the source processor and the transactions of the target processor may be managed by two different NCs of the processor interconnect node, and these two NCs are not interconnected and cannot communicate directly. In this case, the target processor first needs to send the snoop message to an intermediate processor that belongs to the same basic node as the target processor and can therefore communicate with it directly without going through an NC, and whose transactions are managed by the same NC as the transactions of the source processor, so that the intermediate processor can communicate with the source processor through that NC. Through forwarding by the intermediate processor, the snoop message of the target processor can be sent to the source processor. As shown in FIG. 3, if CPU5 is the source processor and CPU0 is the target processor, then when CPU0 initiates a snoop, CPU0 needs to send the snoop message to CPU2 or CPU3, and CPU2 or CPU3 sends the snoop message to CPU5 through the right NC. The choice between CPU2 and CPU3 can be determined by the routing configuration of the processor interconnect node.
In this data access method, step 410 may specifically further include:
Step 411: the LP receives the snoop message from the target processor;
Step 412: the LP obtains the RP identifier from second directory information and sends the snoop message to the RP indicated by the RP identifier, where the second directory information is the directory information stored in the LP.
Specifically, while receiving the snoop message sent by the target processor, the NC can also receive the node controller identifier sent by the target processor, and the control chip can send the snoop message to the LP indicated by the node controller identifier. After receiving the snoop message, the LP can determine, from the second directory information stored in the LP, to which NC the snoop message should be sent; further, the LP can determine from the RP identifier to which RP of that NC the snoop message should be sent. For example, as shown in FIG. 3, if the address bits [A7, A6] of the target address are 00, LP0 of the right NC can determine that the snoop message needs to be sent to RP0; by looking up the second directory information, the LP determines that the RP and LP0 belong to the same NC and therefore sends the snoop message to RP0 of the right NC.
Step 413: the RP sends the snoop message to the source processor indicated by the source processor identifier.
Specifically, after receiving the snoop message, the RP can record the node controller identifier, for correctly returning the snoop response to the target processor later. The first directory information records the source processor that occupies the data at that address, so the RP can send the snoop message to that source processor according to the recorded source processor identifier. For example, as shown in FIG. 3, the first directory information of RP0 records that CPU5 occupies the data at the target address; by looking up the first directory information, RP0 can determine CPU5 and send the snoop message to CPU5.
In a possible implementation, referring to the related description of the data access method in the foregoing embodiments, the NC receiving the snoop message may not handle the source processor's transactions. In this case, the NC first needs to send the snoop message to an intermediate processor whose transactions are managed by that NC and which belongs to the same basic node as the source processor; through forwarding by the intermediate processor, the snoop message of the target processor can be sent to the source processor. The choice of the intermediate processor can be determined by the routing configuration of the processor interconnect node.
In a possible implementation, the source processor and the target processor may not belong to the same processor interconnect node; referring to the server of the foregoing embodiments of the present invention, different processor interconnect nodes can be connected through links between NCs. In this case, the snoop message sent from the target processor to the source processor has to cross a link between NCs, and the sending and receiving of the snoop message may be carried out by the LP and RP of different NCs, i.e., the RP and the LP do not belong to the same NC: the RP belongs to the NC of the processor interconnect node where the source processor resides, and the LP belongs to the NC of the processor interconnect node where the target processor resides. In this case, in addition to recording the RP identifier, the LP can also record the NC where the RP resides. When the snoop response is returned later, the LP can first determine the NC where the RP resides from the recorded information and then determine the RP from the RP identifier.
In this data access method, step 420 may specifically include:
Step 421: the control chip receives the snoop response from the source processor, obtains the RP identifier from the snoop response, and sends the snoop response to the RP indicated by the RP identifier;
Step 422: the RP sends the snoop response to the LP indicated by the node controller identifier.
Specifically, after receiving the snoop message, the source processor can return a snoop response to the target processor through the NC, and the snoop response may include the target address. After receiving the snoop response, the control chip can obtain the RP identifier from the snoop response and send the snoop response to the RP indicated by the RP identifier, where the RP identifier is the address bits [A7, A6] of the target address. For example, as shown in FIG. 3, if the address bits [A7, A6] of the target address are 00, CPU5 can determine that the snoop response needs to be sent to RP0.
Step 423: the LP sends the snoop response to the target processor according to the target address.
Specifically, through step 413 above, the node controller identifier, i.e., the LP identifier, was recorded in the RP. After receiving the snoop response, the RP can determine the LP from the node controller identifier and send the snoop response to that LP. After receiving the snoop response, the LP can determine, from the target address included in the snoop response, the target processor that initiated this snoop. In a possible implementation, referring to the above description of this embodiment, the LP can also first determine the NC where the RP resides from the previously recorded information. For example, as shown in FIG. 3, if RP0 recorded the node controller identifier of the LP that sent it the snoop message, i.e., LP0, then RP0 can send the snoop response to LP0. LP0 can determine from address bit A39 = 0 of the target address that the snoop response is to be sent to CPU2.
In a possible implementation, the snoop message may also include a snoop type. After completing this snoop, the target processor can obtain the right to use the data at the target address. At the same time, the directory information stored in the target processor, the LP and the RP can be modified according to the snoop type in the snoop message. For example, if the snoop is an exclusive-type snoop, the information about the target address in the directory information stored in the target processor, the LP and the RP can be cleared, and the source processor can no longer occupy the data at the target address; if the snoop is a shared-type snoop, the information about the target address in the directory information stored in the target processor, the LP and the RP is changed to the shared state, and the source processor and the processor that sent the new access request to the target processor can then share the data at the target address.
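Reusing the illustrative rp_dir_entry / mesi_state sketch given earlier (again an assumption rather than the patent's own structure), the directory update on snoop completion could be sketched as:

enum snoop_type { SNOOP_EXCLUSIVE, SNOOP_SHARED };

void dir_apply_snoop(struct rp_dir_entry *entry, enum snoop_type type)
{
    if (type == SNOOP_EXCLUSIVE)
        entry->state = MESI_INVALID;  /* entry cleared: source processor may no longer occupy the data */
    else
        entry->state = MESI_SHARED;   /* both processors may now share the data at the target address */
}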
It should be noted that, in the processor interconnect node of the present invention, processors in the same basic node can be interconnected in any manner and communicate through their communication modules, and the present invention does not limit this. In addition, in the processor interconnect node of the present invention, which processors' transactions an NC manages can be divided according to different address spaces and can also be adapted as required; the present invention likewise does not limit this.
In the data access method of this embodiment, within the same processor interconnect node, processors in the same basic node can be directly interconnected and access one another's data, and data access between processors in different basic nodes does not need to cross a link between NCs, which reduces server latency while guaranteeing server bandwidth.
FIG. 5 shows a structural block diagram of a server according to another embodiment of the present invention. The server 500 may be a host server with computing capability, a personal computer PC, a portable computer, a terminal, or the like. The specific embodiments of the present invention do not limit the specific implementation of the computing node.
The server 500 includes a processor (processor) 510, a communications interface (Communications Interface) 520, a memory (memory) 530, a bus 540 and a node controller 550. The processor 510, the communications interface 520, the node controller 550 and the memory 530 communicate with one another through the bus 540.
The communications interface 520 is configured to communicate with network devices, where the network devices include, for example, a virtual machine management center, shared storage, and the like.
The node controller 550 is configured to execute a program. The processor 510 may be a central processing unit CPU, an application-specific integrated circuit ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 530 is configured to store files. The memory 530 may include a high-speed RAM memory and may also include a non-volatile memory, for example at least one disk memory. The memory 530 may also be a memory array. The memory 530 may also be partitioned into blocks, and the blocks may be combined into virtual volumes according to certain rules.
In a possible implementation, the above program may be program code including computer operation instructions. The program may specifically be used to:
receive an access request and a source processor identifier from the source processor, where the access request carries a target address and the target address is the address of the target processor;
send the access request and a node controller identifier to the target processor according to the target address;
receive a data response from the target processor, and send the data response to the source processor according to the source processor identifier.
In a possible implementation, the node controller includes a control chip, a local proxy LP and a remote proxy RP, and the sending, by the node controller, of the access request and the node controller identifier to the target processor according to the target address includes:
the control chip receives the source processor identifier and the access request from the source processor, obtains an RP identifier from the access request, and sends the access request and the source processor identifier to the RP indicated by the RP identifier;
the RP obtains the target address from the access request, decodes the target address to obtain an LP identifier, and sends the access request to the LP indicated by the LP identifier;
the LP records the RP identifier, obtains the target address from the access request, and sends the access request and a node controller identifier to the target processor indicated by the target address, where the node controller identifier is the LP identifier;
and the receiving, by the node controller, of the data response from the target processor and the sending of the data response to the source processor according to the source processor identifier includes:
the LP receives the data response from the target processor and sends the data response to the RP indicated by the RP identifier;
the RP sends the data response to the source processor corresponding to the source processor identifier.
In a possible implementation, when the target processor receives a new access request indicating that the data at the target address needs to be accessed, the program may further be used to:
receive a snoop message and the node controller identifier sent by the target processor, where the snoop message includes the target address;
send the snoop message to the source processor according to the source processor identifier;
receive a snoop response returned by the source processor, and send the snoop response to the target processor according to the target address.
In a possible implementation, the sending of the snoop message to the source processor according to the source processor identifier includes:
the LP receives the snoop message from the target processor;
the LP obtains the RP identifier from second directory information and sends the snoop message to the RP indicated by the RP identifier, where the second directory information is the directory information stored in the LP;
the RP sends the snoop message to the source processor indicated by the source processor identifier;
and the receiving, by the node controller, of the snoop response returned by the source processor and the sending of the snoop response to the target processor according to the target address includes:
the control chip receives the snoop response from the source processor, obtains the RP identifier from the snoop response, and sends the snoop response to the RP indicated by the RP identifier;
the RP sends the snoop response to the LP indicated by the node controller identifier;
the LP sends the snoop response to the target processor according to the target address.
Those of ordinary skill in the art will appreciate that the exemplary units and algorithm steps in the embodiments described herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. A person skilled in the art may choose different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of the present invention.
If the functions are implemented in the form of computer software and sold or used as an independent product, it can to some extent be considered that all or part of the technical solution of the present invention (for example, the part contributing to the prior art) is embodied in the form of a computer software product. The computer software product is usually stored in a computer-readable non-volatile storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that can readily be conceived by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

  1. A server, comprising:
    a processor interconnect node;
    wherein the processor interconnect node comprises at least one node controller and at least two basic nodes, and each basic node comprises at least four processors;
    the node controller is connected to the basic nodes and is configured to manage transactions of the processors according to address spaces of the processors;
    the node controller is further configured to receive an access request and a source processor identifier from a source processor, and send the access request and a node controller identifier to a target processor according to a target address carried in the access request, wherein the source processor and the target processor are located in different basic nodes and the target address is an address of the target processor.
  2. The server according to claim 1, wherein the node controller is further configured to receive a data response from the target processor and send the data response to the source processor according to the source processor identifier.
  3. The server according to claim 2, wherein the node controller comprises a control chip, a local proxy LP and a remote proxy RP;
    the control chip is configured to receive the source processor identifier and the access request from the source processor, obtain an RP identifier from the access request, and send the access request and the source processor identifier to the RP indicated by the RP identifier;
    the RP is configured to obtain the target address from the access request, decode the target address to obtain an LP identifier, and send the access request to the LP indicated by the LP identifier; and to receive the data response from the LP and send the data response to the source processor corresponding to the source processor identifier;
    the LP is configured to record the RP identifier, obtain the target address from the access request, and send the access request and the node controller identifier to the target processor indicated by the target address, wherein the node controller identifier is the LP identifier; to receive the data response from the target processor; and to send the data response to the RP indicated by the RP identifier.
  4. The server according to any one of claims 1 to 3, wherein the node controller is further configured to: when the target processor receives a new access request indicating access to the data at the target address, receive a snoop message and the node controller identifier sent by the target processor, wherein the snoop message comprises the target address; send the snoop message to the source processor according to the source processor identifier; and receive a snoop response returned by the source processor and send the snoop response to the target processor according to the target address.
  5. The server according to claim 4, wherein the LP is further configured to receive the snoop message from the target processor; obtain the RP identifier from second directory information and send the snoop message to the RP indicated by the RP identifier, wherein the second directory information is directory information stored in the LP; and send the snoop response to the target processor according to the target address;
    the RP is further configured to send the snoop message to the source processor indicated by the source processor identifier, and to send the snoop response to the LP indicated by the node controller identifier.
  6. The server according to any one of claims 1 to 5, wherein the processor interconnect node comprises a first basic node, a second basic node and two node controllers, and the first basic node and the second basic node each comprise at least four processors.
  7. A data access method, applied to the server according to any one of claims 1 to 6, wherein when a source processor needs to access a target processor, the data access method comprises:
    receiving, by a node controller, an access request and a source processor identifier from the source processor, wherein the access request carries a target address and the target address is an address of the target processor;
    sending, by the node controller, the access request and a node controller identifier to the target processor according to the target address;
    receiving, by the node controller, a data response from the target processor, and sending the data response to the source processor according to the source processor identifier.
  8. The data access method according to claim 7, wherein the node controller comprises a control chip, a local proxy LP and a remote proxy RP, and the sending, by the node controller, the access request and the node controller identifier to the target processor according to the target address comprises:
    receiving, by the control chip, the source processor identifier and the access request from the source processor, obtaining an RP identifier from the access request, and sending the access request and the source processor identifier to the RP indicated by the RP identifier;
    obtaining, by the RP, the target address from the access request, decoding the target address to obtain an LP identifier, and sending the access request to the LP indicated by the LP identifier;
    recording, by the LP, the RP identifier, obtaining the target address from the access request, and sending the access request and the node controller identifier to the target processor indicated by the target address, wherein the node controller identifier is the LP identifier;
    and the receiving, by the node controller, a data response from the target processor and sending the data response to the source processor according to the source processor identifier comprises:
    receiving, by the LP, the data response from the target processor, and sending the data response to the RP indicated by the RP identifier;
    sending, by the RP, the data response to the source processor corresponding to the source processor identifier.
  9. The data access method according to claim 7 or 8, wherein when the target processor receives a new access request indicating that the data at the target address needs to be accessed, the data access method further comprises:
    receiving, by the node controller, a snoop message and the node controller identifier sent by the target processor, wherein the snoop message comprises the target address;
    sending, by the node controller, the snoop message to the source processor according to the source processor identifier;
    receiving, by the node controller, a snoop response returned by the source processor, and sending the snoop response to the target processor according to the target address.
  10. The data access method according to claim 9, wherein the sending, by the node controller, the snoop message to the source processor according to the source processor identifier comprises:
    receiving, by the LP, the snoop message from the target processor;
    obtaining, by the LP, the RP identifier from second directory information, and sending the snoop message to the RP indicated by the RP identifier, wherein the second directory information is directory information stored in the LP;
    sending, by the RP, the snoop message to the source processor indicated by the source processor identifier;
    and the receiving, by the node controller, a snoop response returned by the source processor and sending the snoop response to the target processor according to the target address comprises:
    receiving, by the control chip, the snoop response from the source processor, obtaining the RP identifier from the snoop response, and sending the snoop response to the RP indicated by the RP identifier;
    sending, by the RP, the snoop response to the LP indicated by the node controller identifier;
    sending, by the LP, the snoop response to the target processor according to the target address.