WO2022241718A1 - Data access method, interconnection system and device - Google Patents

Data access method, interconnection system and device

Info

Publication number
WO2022241718A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
target address
exclusive processing
address
message
Prior art date
Application number
PCT/CN2021/094911
Other languages
English (en)
French (fr)
Inventor
夏晶
信恒超
袁思睿
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2021/094911 (WO2022241718A1)
Priority to CN202180098267.7A (CN117355823A)
Publication of WO2022241718A1



Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems

Definitions

  • the present application relates to the technical field of data storage, and in particular to a data access method, interconnection system and device.
  • in the face of steadily increasing system performance requirements, performance goals can be achieved by expanding the system.
  • Common system expansion methods include vertical expansion (also known as scaling up, or Scale-up) and horizontal expansion (also known as scaling out, or Scale-out). Both methods expand the system with the node as the basic unit.
  • a node usually includes processing capabilities, storage capabilities, and communication capabilities, and a chip is a typical node.
  • vertical expansion mainly builds on the existing resources of the node itself.
  • the number of processors on the chip can be increased to improve the computing and processing capabilities.
  • for example, CPU 2 and CPU 3 can be added to node 0, which originally contains only CPU 0 and CPU 1, to increase the number of processors;
  • the capacity of the memory can also be increased to improve the storage capacity, as shown in the direction of the vertical expansion arrow in Figure 1,
  • the memory in node 0 can be expanded from 2MB to 4MB; input/output (I/O) interfaces can also be optimized to improve interaction bandwidth, etc.
  • horizontal expansion (Scale-out) expands the system by increasing the number of nodes. As shown by the scale-out arrow in Figure 1, the number of nodes grows from two nodes (node 0 and node 1 in the figure) to four nodes (nodes 0 to 3 in the figure).
  • the increase in the number of nodes means the improvement of the overall performance of the system.
  • a typical scale-out system is the multi-chip interconnection architecture used in high-performance computing (HPC). Compared with vertical expansion (Scale-up), horizontal expansion (Scale-out) is more flexible in system composition: nodes can be added or removed at any time.
  • the interaction efficiency between nodes has a significant impact on system performance.
  • the nodes in the system adopt multi-path networking.
  • Multi-path networking can improve interconnection efficiency, but how to preserve, at low overhead, the ordering of data access request messages whose execution results must become visible in a required order, and how to enforce that visible order of execution results within the system, is a problem that currently needs to be resolved.
  • the embodiments of the present application provide a data access method, an interconnection system and a device, which are used to control, with relatively small overhead, the visible order of the execution results of message sequences in interconnection systems interconnected in a horizontal expansion (Scale-out) manner.
  • in a first aspect, a data access method is provided, applied to an interconnection system. The interconnection system includes at least two nodes interconnected in a horizontal expansion manner, and the at least two nodes include a first node and a second node. The method comprises:
  • the first node receives a message sequence from an external node, where the messages in the message sequence are used to request data access to a target address, the target address belongs to the storage space managed by the second node, the external node is a node outside the interconnection system, and the external node has stricter constraints on the visible order of execution results than the first node;
  • the first node acquires the exclusive processing authority of the target address; and
  • the first node performs a data access operation on the target address based on the exclusive processing authority of the target address, where the visible order of the execution results of the data access operation satisfies the external node's constraint on the visible order of execution results.
  • after the first node receives the message sequence from the external node (that is, a node outside the interconnection system), it obtains from the second node the exclusive processing authority of the corresponding target address (that is, the target address corresponding to each message in the message sequence), which corresponds to the E state in storage consistency. A node that holds the exclusive processing authority of the target addresses corresponding to a message sequence has the ability to control the visible order of the execution results of that message sequence, so the first node can control the visible order of the execution results of the message sequence. Because this implementation requires neither serial communication nor a sequence-number-based interaction process with a sequence-number-based reorder mechanism, the visible order of the execution results can be controlled with less overhead.
  • obtaining the exclusive processing authority of the target address by the first node includes: in response to the operation of receiving the message sequence, the first node obtains the exclusive processing authority of the target address from the second node.
  • in this way, after the first node receives the message sequence, it obtains the exclusive processing authority of the corresponding address according to the target address corresponding to the message sequence; the authority is acquired in a targeted manner, which reduces system overhead.
  • the first node acquiring the exclusive processing authority of the target address from the second node includes: after receiving the message sequence, the first node sends a permission acquisition request to the second node, where the permission acquisition request carries the target address corresponding to at least one message in the message sequence; and the first node receives a permission acquisition response from the second node, where the permission acquisition response is used to indicate that the exclusive processing authority of the target address corresponding to the at least one message is migrated to the first node.
  • in this implementation, the exclusive processing authority is obtained at the granularity of "the target address corresponding to at least one message"; that is, the permission acquisition request is allowed to carry the target addresses corresponding to the entire message sequence, and the acquisition granularity is allowed to exceed the cache line (cacheline) size. In other words, compared with acquiring authority per cache line, this implementation can obtain the exclusive processing authority of the target address at a larger granularity. Since a permission acquisition request can carry the target addresses corresponding to more than one message, or even the target addresses corresponding to all messages in the entire message sequence, system overhead can be reduced and the efficiency of permission acquisition can be improved.
  • sending the permission acquisition request to the second node includes: after receiving the message sequence, the first node sends the permission acquisition request through a first path between the first node and the second node, where the first path is selected by the first node according to the congestion state of each path between the first node and the second node; and the first node receives the permission acquisition response through a second path between the first node and the second node, where the second path is the same as or different from the first path.
  • since the first node can select an appropriate path to transmit the permission acquisition request based on the congestion control mechanism, communication efficiency can be improved and bandwidth occupation can be reduced.
  • the exclusive processing right of the target address is obtained by the first node from the second node before receiving the message sequence.
  • that is, the first node can obtain the exclusive processing authority of the target address in advance, so that after receiving the message sequence it can directly perform data access operations on the target address corresponding to the message sequence based on the exclusive processing authority obtained in advance, which reduces the delay of data access operations.
  • the method further includes: before receiving the message sequence, the first node acquires from the second node the exclusive processing authority of a specified address range, where the specified address range includes the target address corresponding to the messages in the message sequence.
  • if the first node does not perform a data access operation on a first address in the specified address range within a set period of time after obtaining the exclusive processing authority of the specified address range, it releases the exclusive processing authority of the first address, where the first address is any address within the specified address range.
  • that is, after the first node obtains the exclusive processing authority of a target address (such as the first address), if it does not perform a data access operation on that address within the set period of time, this indicates that the exclusive processing authority of the address is not currently needed. In this case, the first node releases the exclusive processing authority of the address, so that other nodes can obtain the exclusive processing authority of the address for data access processing.
  • obtaining the exclusive processing authority of the specified address range from the second node includes: the first node determines the specified address range according to the addresses, among the target addresses corresponding to its historical data access operations, that belong to the storage space managed by the second node; and the first node obtains the exclusive processing authority of the specified address range from the second node.
  • with the above implementation, the exclusive processing authority that the first node obtains in advance has a high probability of being used when subsequent message sequences are processed, so the exclusive processing authority of the target address can be obtained in advance in a targeted manner, thereby reducing system overhead.
  • the first node performing a data access operation on the target address based on the exclusive processing authority of the target address includes: the first node obtains from the second node a cache address corresponding to the target address; and the first node performs the data access operation on the cache address corresponding to the target address based on the exclusive processing authority of the target address.
  • obtaining the cache address corresponding to the target address from the second node by the first node includes:
  • the first node sends an address acquisition request through a first path between the first node and the second node, where the address acquisition request carries a first target address, the first target address includes the target address corresponding to at least one message in the message sequence, and the first path is selected by the first node according to the congestion state of each path between the first node and the second node;
  • the first node receives an address acquisition response through a second path between the first node and the second node, where the address acquisition response carries a first cache address corresponding to the first target address, and the second path is the same as or different from the first path.
  • since the first node can select an appropriate path to transmit the address acquisition request based on the congestion control mechanism, communication efficiency can be improved and bandwidth occupation can be reduced.
  • after the first node performs the data access operation on the target address based on the exclusive processing authority of the target address, the method further includes: the first node releases the exclusive processing authority of the target address.
  • that is, after the first node performs data access to the target address corresponding to a message in the message sequence, it releases the exclusive processing authority of that target address, so that other nodes can obtain the exclusive processing authority of the address and perform data access operations on it.
  • the method further includes: after the first node obtains the exclusive processing authority of the target address, if no data access operation is performed on the target address within a set period of time, the exclusive processing authority of the target address is released, so as to prevent the first node from occupying the exclusive processing authority of the address for a long time.
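  • A simple way to picture this release-if-unused behavior is a per-address validity timer that is reset on each data access and checked on expiry. The C++ sketch below is purely illustrative: the class name AuthorityTracker and the timing details are assumptions, not the validity-period scheme of FIG. 9a to FIG. 9c.

```cpp
#include <chrono>
#include <cstdint>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Illustrative tracker for exclusive processing authority with a per-address validity period.
class AuthorityTracker {
public:
    explicit AuthorityTracker(std::chrono::milliseconds validity) : validity_(validity) {}

    void grant(uint64_t address) { lastUse_[address] = Clock::now(); }  // authority acquired
    void touch(uint64_t address) { lastUse_[address] = Clock::now(); }  // data access performed

    // Returns true (and forgets the address) if the authority should be released
    // because no data access happened within the set period of time.
    bool releaseIfExpired(uint64_t address) {
        auto it = lastUse_.find(address);
        if (it == lastUse_.end()) return false;
        if (Clock::now() - it->second >= validity_) {
            lastUse_.erase(it);
            return true;  // caller returns the E state to the node that manages the address
        }
        return false;
    }

private:
    std::chrono::milliseconds validity_;
    std::unordered_map<uint64_t, Clock::time_point> lastUse_;
};
```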
  • the method further includes: before the first node performs data access to the first target address corresponding to the first message in the message sequence, if a permission acquisition request for the exclusive processing authority of the first target address is received from another node, the first node releases the exclusive processing authority of the first target address so that the other node can perform data access operations on the first target address.
  • the data access request of the other node to the first target address may have a higher priority than the corresponding message in the sequence of messages received by the first node.
  • the method further includes: the first node sends the execution results of the data access operations on the target address to the external node according to the external node's constraint on the visible order of execution results.
  • the method further includes: before the first node performs data access to the first target address corresponding to the first message in the message sequence, if a permission acquisition request for the exclusive processing authority of the first target address is received, the exclusive processing authority of the first target address is released.
  • the first node and the second node are each a system-on-chip (SoC) chip, and the external node is an input/output (I/O) device.
  • in a second aspect, an interconnection system is provided, including at least two nodes interconnected in a horizontal expansion manner, where the at least two nodes include a first node and a second node. The first node is configured to:
  • receive a message sequence from an external node, where the messages in the message sequence are used to request data access to a target address, the target address belongs to the storage space managed by the second node, the external node is a node outside the interconnection system, and the external node has stricter constraints on the visible order of the execution results of the data access operations than the first node;
  • acquire the exclusive processing authority of the target address; and perform a data access operation on the target address based on the exclusive processing authority of the target address, where the visible order of the execution results of the data access operation satisfies the external node's constraint on the visible order of execution results.
  • the first node is specifically configured to: in response to an operation of receiving the message sequence, obtain an exclusive processing right of the target address from the second node.
  • the first node is specifically configured to send a permission acquisition request to the second node after receiving the message sequence, where the permission acquisition request carries the target address corresponding to at least one message in the message sequence;
  • the second node is configured to send a permission acquisition response to the first node after receiving the permission acquisition request, where the permission acquisition response is used to indicate that the exclusive processing authority of the target address corresponding to the at least one message is migrated to the first node.
  • the first node is specifically configured to send the permission acquisition request through a first path between the first node and the second node after receiving the message sequence, where the first path is selected by the first node according to the congestion state of each path between the first node and the second node;
  • the second node is specifically configured to send the permission acquisition response through a second path between the second node and the first node, where the second path is the same as or different from the first path.
  • the exclusive processing right of the target address is obtained by the first node from the second node before receiving the message sequence.
  • the first node is further configured to: before receiving the message sequence, obtain from the second node the exclusive processing authority of a specified address range in the storage space managed by the second node, where the first address is an address within the specified address range.
  • if no data access operation is performed on the first address within a set period of time after the exclusive processing authority is obtained, the exclusive processing authority of the first address is released.
  • the first node is specifically configured to: determine the specified address range according to the addresses, among the target addresses corresponding to its historical data access operations, that belong to the storage space managed by the second node; and obtain the exclusive processing authority of the specified address range from the second node.
  • the first node is specifically configured to: acquire a cache address corresponding to the target address from the second node; and perform the data access operation on the cache address corresponding to the target address based on the exclusive processing authority of the target address.
  • the first node is further configured to: release the exclusive processing right of the target address after performing a data access operation on the target address based on the exclusive processing right of the target address .
  • the first node is further configured to: after obtaining the exclusive processing authority of the target address, if no data access operation is performed on the target address for a set period of time, Then the exclusive processing right of the target address is released.
  • the first node is further configured to: send the execution results of the data access operations on the target address to the external node according to the external node's constraint on the visible order of execution results.
  • the first node is further configured to: before performing data access to the first target address corresponding to the first message in the message sequence, if a permission acquisition request for the exclusive processing authority of the first target address is received, release the exclusive processing authority of the first target address.
  • the second node is configured to: after the first node obtains the exclusive processing authority of the first target address, if the first node releases the exclusive processing authority of the first target address, regain the exclusive processing authority of the first target address, where the first target address is the target address corresponding to the first message in the message sequence.
  • the first node and the second node are respectively SoC chips, and the external nodes are I/O devices.
  • an SoC chip is provided, including one or more processors and one or more memories, where one or more computer programs are stored in the one or more memories, and the one or more computer programs include instructions that, when executed by the one or more processors, cause the SoC chip to perform the method described in any one of the above first aspects.
  • a computer-readable storage medium is provided, including a computer program, where, when the computer program is run on a computing device, the computing device is caused to execute the method described in any one of the above first aspects.
  • a computer program product is provided.
  • when the computer program product is invoked by a computer, the computer executes the method described in any one of the above first aspects.
  • Figure 1 is a schematic diagram of the system expansion method
  • Fig. 2 is a schematic diagram of multi-path networking in a horizontal expansion (Scale-out) system
  • Fig. 3 is a schematic diagram of order preservation using serial processing in a horizontal expansion (Scale-out) system
  • Fig. 4 is a schematic diagram of order preservation using sequence numbers in a horizontal expansion (Scale-out) system
  • FIG. 5 is a schematic structural diagram of a scale-out system provided by an embodiment of the present application.
  • FIG. 6 is a block diagram of the data access process provided by the embodiment of the present application.
  • FIG. 7 is an interactive schematic diagram of the data access process in the embodiment of the present application.
  • FIG. 8a and FIG. 8b are respectively schematic diagrams of the first node obtaining the exclusive processing authority of the target address corresponding to each message based on multipath in the embodiment of the present application;
  • Fig. 9a, Fig. 9b and Fig. 9c are respectively schematic diagrams of the validity period timing processing of the exclusive processing authority of the first target address and the processing after timeout expires in the embodiment of the present application.
  • Data access operations, also known as data read and write operations, include write operations and read operations.
  • write operation refers to storing data in the target address
  • read operation refers to reading data from the target address
  • the globally visible order of execution results of read and write operations is referred to as the visible order of execution results for short.
  • compared with the memory, the processor is a fast device.
  • when the processor reads from or writes to the memory, if it waits for the operation to complete before processing other tasks, the processor will block and its working efficiency will be reduced. Therefore, a cache (much faster than memory but smaller in capacity) can be configured for each processor.
  • when the processor writes data to the memory, the data can be written into the cache first so that other tasks can be processed, and the data is then stored to the memory by a direct memory access (DMA) device; similarly, when the processor reads data from the memory, the DMA device first moves the data from the memory to the cache, and then the processor reads the data from the cache.
  • Cache-coherent devices comply with the MESI protocol, which defines four states for a cache line (the smallest unit of the cache): exclusive (Exclusive, also known as the E state), modified (Modified, also known as the M state), shared (Shared, also known as the S state) and invalid (Invalid, also known as the I state).
  • the E state indicates that the cache line is valid, the data in the cache is consistent with the data in the memory, and the data only exists in this cache
  • the M state indicates that the cache line is valid, the data has been modified, the data in the cache is inconsistent with the data in the memory, and the data only exists in this cache
  • the S state indicates that the cache line is valid, the data in the cache is consistent with the data in the memory, and the data exists in multiple caches
  • the I state indicates that the cache line is invalid.
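  • As an illustration of the four states above, the following minimal C++ sketch models a cache line state and whether a node holding a line in that state may write it locally without first coordinating with other caches; the names LineState and canWriteLocally are hypothetical and chosen only for this example.

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical representation of the four MESI cache-line states described above.
enum class LineState : uint8_t {
    Exclusive,  // E: line valid, cache data == memory data, present only in this cache
    Modified,   // M: line valid, data modified, cache data != memory data, only in this cache
    Shared,     // S: line valid, cache data == memory data, present in multiple caches
    Invalid     // I: line invalid
};

// A node may write a line locally without notifying other caches only when it
// holds the line exclusively (E) or has already modified it (M).
bool canWriteLocally(LineState s) {
    return s == LineState::Exclusive || s == LineState::Modified;
}

int main() {
    std::cout << canWriteLocally(LineState::Exclusive) << '\n';  // prints 1
    std::cout << canWriteLocally(LineState::Shared) << '\n';     // prints 0
}
```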
  • storage consistency models include the sequential consistency (SC) model, the total store order (TSO) model, the relaxed model (RM), and so on.
  • the SC model requires that the operation sequence of reading and writing shared memory on the hardware be strictly consistent with the operation sequence required by the software instructions;
  • the TSO model introduces a caching mechanism on the basis of the SC model and relaxes the ordering requirement on write-read (write first, then read) operations, that is, the later read operation is allowed to complete before the earlier write operation;
  • the RM model is the most relaxed, and does not impose sequence constraints on any read and write operations.
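  • As a rough illustration of how the three models differ, the following sketch (under the simplifying assumption that only the relative order of one earlier and one later operation matters) reports whether hardware may let the later operation become globally visible before the earlier one: never under SC, only for write-then-read pairs under TSO, and always under RM. The function name reorderAllowed is hypothetical.

```cpp
enum class Model { SC, TSO, RM };
enum class Op { Read, Write };

// Hypothetical check: may the hardware make `later` globally visible before `earlier`?
bool reorderAllowed(Model m, Op earlier, Op later) {
    switch (m) {
        case Model::SC:  return false;                                     // strict program order
        case Model::TSO: return earlier == Op::Write && later == Op::Read; // only write->read relaxed
        case Model::RM:  return true;                                      // no ordering constraints
    }
    return false;
}
```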
  • the nodes adopt a multi-path networking mode so that the system can support multi-path communication.
  • FIG. 2 shows a schematic diagram of nodes adopting multipath networking in a system adopting a scale-out mode.
  • the first node (Node0 in the figure) sends a message sequence to the second node (Node3 in the figure), and the messages in the message sequence are used to request data access to target addresses in the memory storage space of the second node.
  • the system can select a path to transmit the messages in the message sequence according to the congestion degree (that is, the degree of busyness) of each path.
  • for example, the system selects a currently relatively idle path, Node0-Node1-Node3, to transmit message 1, so that communication resources are fully utilized, congestion control is achieved, and interconnection efficiency is improved.
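  • The congestion-based path choice described above can be sketched as simply picking the least busy candidate path between the two nodes; the Path structure and its congestion field below are illustrative assumptions rather than the required implementation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative: a candidate path between two nodes and its current congestion level.
struct Path {
    std::string hops;   // e.g. "Node0-Node1-Node3"
    double congestion;  // degree of busyness; lower means more idle
};

// Pick the currently least congested path among all paths between the two nodes
// (assumes `paths` is non-empty).
const Path& selectPath(const std::vector<Path>& paths) {
    return *std::min_element(paths.begin(), paths.end(),
                             [](const Path& a, const Path& b) {
                                 return a.congestion < b.congestion;
                             });
}
```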
  • a system that adopts the scale-out networking supports out-of-order propagation on its paths, but it must also interface with external systems (such as I/O devices) that have in-order requirements on the visible order of execution results. Because the external system requires compliance with certain ordering constraints (such as those of the SC model or TSO model), for a message sequence from the external system it is necessary to ensure that the visible order of the execution results of the corresponding data access operations in the Scale-out system complies with those ordering constraints, so as to meet the expectations of the external system; the Scale-out system can then return the execution results of the data access operations corresponding to the messages (that is, the data access responses) to the external system according to the ordering constraints.
  • a current practice in the industry is to use serial processing on a single path: the sending node sends messages in order, and after sending a message it waits for the message to be processed by the receiving node before sending the next message.
  • FIG. 3 shows a schematic diagram of a scale-out system adopting a serial processing manner to ensure that the visible order of the execution results of the message sequence complies with the order constraint.
  • the message sequence (message 0, message 1, ..., message N) sent by the external I/O device is ultimately sent to the second node (Node1 in the figure) for processing.
  • the first node (Node0) receives the message sequence sent by the I/O device.
  • the I/O device has strict requirements on the visible order of the execution results.
  • the first node (Node0), as the upstream master node (Master), needs to send the messages in order and transmit them over an out-of-order communication path to the downstream slave node (Slave), that is, the second node (Node1), and the second node (Node1) processes the messages in the order in which they are received.
  • after the second node (Node1) finishes processing a received message, it needs to complete a handshake with the upstream first node (Node0) (that is, reply with a response message), and only after receiving the response message does the first node (Node0) send the next message to the downstream second node (Node1). This ensures that the visible order of the execution results of the message sequence is consistent with the order of the messages in the message sequence, thereby satisfying the I/O device's requirement on the visible order of the execution results of the data access operations.
  • for example, the first node (Node0) receives a message sequence (message 0, message 1, ..., message N) from the I/O device. It first sends message 0 to the second node (Node1) so that the second node performs the corresponding data access operation; after receiving response 0 returned by the second node (Node1), it sends message 1 to the second node (Node1) so that the second node performs the corresponding data access operation; after receiving response 1, it sends the next message, and so on, until message N has been sent to the second node (Node1) and response N has been received. In this way, the first node (Node0) obtains globally visible execution results of the data access operations (response 0, response 1, ..., response N), whose order is consistent with the order of the messages in the message sequence, which satisfies the I/O device's requirement on the visible order of the execution results of the data access operations.
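  • The serial, handshake-per-message scheme of FIG. 3 can be summarized by the sketch below: the upstream node sends one message, blocks until the downstream response arrives, and only then sends the next, so responses come back in message order. The Channel interface is a hypothetical stand-in for the actual interconnect.

```cpp
#include <vector>

struct Message  { int id; };
struct Response { int id; };

// Hypothetical stand-in for the Node0 <-> Node1 interconnect.
struct Channel {
    virtual Response sendAndWait(const Message& m) = 0;  // blocks until the peer replies
    virtual ~Channel() = default;
};

// Serial order preservation: message k+1 is sent only after response k is received,
// so the visible order of execution results matches the order of the message sequence.
std::vector<Response> sendSerially(Channel& ch, const std::vector<Message>& seq) {
    std::vector<Response> results;
    for (const Message& m : seq) {
        results.push_back(ch.sendAndWait(m));  // handshake completes before the next message
    }
    return results;
}
```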
  • in order to reduce processing delay, another current industry practice is to use sequence numbers to preserve the execution order.
  • a corresponding set of sequence numbers is used to mark a message sequence. Messages carrying sequence numbers can be delivered out of order on each path, and the downstream node restores the corresponding order according to the sequence numbers carried in the messages.
  • FIG. 4 shows a schematic diagram of using sequence numbers in a scale-out system to ensure that the visible sequence of the execution results of the message sequence complies with the sequence constraint.
  • Messages carrying sequence numbers can travel out of order on the path.
  • after the second node (Node1) receives a message, it uses a mechanism similar to a reorder buffer (ROB) to restore the order according to the sequence number carried in the message.
  • the response message returned by the second node (Node1) to the first node (Node0) may also be transmitted out of sequence.
  • the first node (Node0) can determine the order of the response messages according to the recorded message sequence numbers to ensure the visible order of the execution results of the data access operations.
  • for example, response 0 is the execution result of the data access operation corresponding to message 0, response 1 is the execution result of the data access operation corresponding to message 1, ..., and response N is the execution result of the data access operation corresponding to message N.
  • the first node can determine the order of response 0 to response N and then return the responses to the I/O device in order.
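  • The sequence-number scheme of FIG. 4 can be sketched as a small reorder buffer: out-of-order arrivals are parked by sequence number and released only once all earlier sequence numbers have been delivered. The ReorderBuffer class below is an illustrative assumption, not the ROB of any particular implementation.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Illustrative reorder buffer keyed by per-flow sequence numbers.
class ReorderBuffer {
public:
    // Accept a message that arrived (possibly out of order) and return every
    // message that can now be delivered in sequence-number order.
    std::vector<std::string> accept(uint32_t seq, std::string payload) {
        pending_[seq] = std::move(payload);
        std::vector<std::string> deliverable;
        // Release the head of the sequence as long as the next expected number is present.
        while (!pending_.empty() && pending_.begin()->first == next_) {
            deliverable.push_back(std::move(pending_.begin()->second));
            pending_.erase(pending_.begin());
            ++next_;
        }
        return deliverable;
    }

private:
    uint32_t next_ = 0;                        // next sequence number expected
    std::map<uint32_t, std::string> pending_;  // out-of-order arrivals parked here
};
```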
  • however, in this approach each process between nodes (a process corresponds to a message sequence) needs to be configured with a corresponding set of sequence numbers, which is highly complex and occupies more resources. In addition, since every path can be used for the communication of a process, every path needs to support the corresponding sequence numbers, and the resulting overhead grows rapidly with the number of paths, which is unacceptable for the system.
  • the embodiments of the present application provide a data access method, interconnection system and device.
  • the embodiments of the present application can be applied to a system that adopts scale-out and multi-path networking, and solve the order-preservation problem in the multi-path scenario with relatively small overhead.
  • the embodiments of the present application expand the cache coherence domain (Cache Coherence Domain, CC range) in the scale-out system and include the junction node between the scale-out system and the external system into the CC range of the downstream node, so that the control operations for the visible order of execution results are migrated from the downstream node to the junction node; that is, the visible order of the execution results of data access operations is controlled at the junction node.
  • in this way, data access operations can still be processed in parallel within the scale-out system, preserving bandwidth, while the visible order of the execution results of the data access operations satisfies the requirements of the external system and system overhead is saved.
  • the junction node between the Scale-out system and the external system refers to the node in the Scale-out system that receives the message sequence sent by the external system (such as an external I/O device); for example, the first node in Figure 2, Figure 3 or Figure 4 may be referred to as a junction node.
  • FIG. 5 is a schematic diagram of an architecture of a scale-out system in an embodiment of the present application.
  • the system includes a first node (Node0 in the figure), a second node (Node3 in the figure), a third node (Node1 in the figure) and a fourth node (Node2 in the figure).
  • Each node is interconnected to form a multi-path network structure. Taking the path between the first node and the second node as an example, there are three paths between the first node and the second node: Node0-Node3, Node0-Node1-Node3, and Node0-Node2-Node3. It should be noted that the number of nodes included in the system in actual application may be more or less than the number of nodes shown in FIG. 5 , and the embodiment of the present application does not limit the number of nodes in the system.
  • a typical application of the above-mentioned system architecture is an HPC chip interconnection system.
  • the above-mentioned nodes may be system-on-chip (System on Chip, SoC) chips.
  • the nodes in the HPC chip interconnection system can exchange information with the nodes outside the system, wherein the nodes outside the system can be I/O devices.
  • for example, the I/O device and the SoC chip may be connected through a high-speed serial computer expansion bus (PCIE) interface, in which case the I/O device can be a PCIE board;
  • alternatively, the I/O device and the SoC chip may be connected through a network transmission protocol, in which case the I/O device can be an Ethernet interface.
  • a junction node (or upstream node) can receive a message sequence from an external system, where the messages in the message sequence are data access request messages used to request data access operations on addresses in the storage space managed by a target node (or downstream node) in the scale-out system.
  • the storage space managed by the target node refers to the storage space of the memory in the target node.
  • any node may become a junction node (or upstream node) or a target node (or downstream node).
  • for example, if the first node receives a message sequence sent by an I/O device outside the system, the first node is the junction node; if the target addresses corresponding to the data access request messages in the message sequence belong to the storage space managed by the second node, the second node is the target node corresponding to the message sequence; similarly, if the target addresses belong to the storage space managed by the third node, the third node is called the target node corresponding to the message sequence.
  • the embodiments of the present application migrate the exclusive processing authority (also called the E state) of the target address from the downstream node to the junction node (upstream node) by expanding the cache coherence range (CC range), so that the control operations for the visible order of the execution results of data access operations are migrated from the downstream node to the junction node.
  • in this way, the junction node can take over the control of the visible order of the execution results of data access operations, which ensures that the visible order of the execution results meets the requirements of the external system.
  • the first node receives a message sequence sent by an I/O device outside the system, and the message sequence includes N data access request messages (Req0, Req1,... ,ReqN), N is an integer greater than or equal to 2.
  • the N data access request messages are used to request data access to addresses in the storage space managed by the second node (Node3) in the system, that is, the target addresses corresponding to the N data access request messages belong to The address space of the memory on the second node.
  • the first node (Node0) obtains the exclusive processing authority of the corresponding addresses from the second node (Node3), so that the exclusive processing authority of those addresses is migrated from the second node (Node3) to the first node (Node0). As shown in the figure, the addresses originally belong only to the CC range of the second node (Node3) (as shown in part a of Figure 5); after adopting this embodiment of the present application, the exclusive processing authority of the addresses can be migrated (that is, E-state migration) so that the first node (Node0) is included in the CC range (as shown in part b of Figure 5).
  • the exclusive processing authority of the target address is migrated from the downstream node to the junction node, so that the junction node can perform data access operations based on the storage consistency requirements and control the visible order of the execution results of the data access operations, without additional sequential processing by the downstream node; in this way, multi-path networking of the scale-out system can be realized with small overhead.
  • nodes in the system can select a path based on the congestion mechanism, and transmit messages to other nodes through the selected path. Messages between nodes can be transmitted out of order through multiple paths to reduce system delay and improve bandwidth utilization.
  • a typical application scenario is the interaction between I/O devices (or I/O systems) and HPC systems.
  • the message sequence sent by the I/O device to the HPC system usually has requirements on the visible order of the execution results, while the HPC system is interconnected in a multi-path scale-out manner, and messages are transmitted out of order on the paths between nodes and may be transmitted over multiple paths. Therefore, for a message sequence sent by an I/O device, after it enters the scale-out system it is necessary to ensure that the visible order of the execution results meets the requirements of the I/O device.
  • the following describes the data access process provided by the embodiment of the present application with reference to FIG. 6 and FIG. 7 .
  • the process can be applied to an interconnection system, where the interconnection system includes at least two nodes interconnected in a scale-out manner, and the at least two nodes include a first node and a second node; for example, the first node and the second node are both SoC chips.
  • the following process is described by taking the first node as a junction node and the second node as a target node as an example.
  • FIG. 6 is an overall block diagram of the data access process provided by the embodiment of the present application.
  • FIG. 7 is an interactive schematic diagram of the data access process in a specific application scenario in the embodiment of the present application.
  • the first node in the interconnection system receives a message sequence sent by a node (such as an I/O device) outside the system. The message sequence includes a first message and a second message: the first message is a write request for requesting a write operation on a first target address, and the second message is a write request for requesting a write operation on a second target address, where the first target address and the second target address are addresses in the storage space managed by the second node in the interconnection system.
  • the visible order of the execution results of the message sequence needs to comply with the sequence constraints. For example, the visible order from front to back is: the data access execution result corresponding to the first message, and the data access execution result corresponding to the second message.
  • FIG. 7 uses a message sequence containing only two write requests as an example. For messages of other types of data access operations (such as read requests), or for a message sequence containing more messages, the implementation can follow the principle of the flow shown in FIG. 7.
  • the data access process provided by the embodiment of the present application may include the following steps:
  • the first node receives a message sequence from an external node.
  • the external node is a node outside the interconnection system, such as an I/O device.
  • the external node obeys the execution result order constraint.
  • the external node has stricter constraints on the visible order of the execution result of the data access operation than the first node.
  • for example, the external node complies with the execution-result visibility order requirements of the SC model, the TSO model, or other types of storage consistency models.
  • the message sequence includes at least two data access request messages, and the data access request messages are used to request data access to the target address; for example, the message sequence may include a write request for writing data to the target address, and may also include a read request for reading data from the target address.
  • the target address belongs to the storage space managed by the second node in the system, that is, the target address is an address in the storage space managed by the second node.
  • the target address is a physical address of a memory on the second node.
  • the message sequence may include a write request for requesting to write data to the memory of the second node, and/or a read request for requesting to read data from the memory of the second node.
  • the access operation or read-write operation involved in the embodiment of the present application can support write-write (write first and then write), write-read (write first and then read), read-write (read first and then write), read-read ( Read first, then read) and other operations.
  • the message types of all the messages in the message sequence may be the same, for example, they are all write requests or all of them are read requests, or they may be different, for example, some messages are write requests, while others are read requests.
  • the target addresses corresponding to the messages in the message sequence may be the same or different, or the target addresses corresponding to some messages are the same, and the target addresses corresponding to other part messages are different.
  • a node outside the system sends a message sequence to the first node in the interconnection system, and the message sequence includes a first message (write request 1) and a second message (write request 2).
  • the first message (write request 1) carries the data to be written 1 and the first target address, and is used to request that the data to be written 1 be stored in the first target address;
  • the second message (write request 2) carries the data to be written 2 and the second target address, and is used to request that the data to be written 2 be stored in the second target address. Both the first target address and the second target address belong to the storage space managed by the second node in the system.
  • the first node obtains the exclusive processing authority of the above-mentioned target address.
  • the exclusive processing authority of the target address can be understood as the E state in cache coherence, which means that the node holds the data access operation authority for this address; a node that holds the exclusive processing authority of the target addresses corresponding to the message sequence (such as the first node in this process) has the ability to control the visible order of the execution results of this message sequence.
  • the message sequence includes the first message and the second message, and the target address corresponding to the first message is the first target address, and the target address corresponding to the second message is the second target address.
  • in this way, the CC range is extended from the second node to the first node, so that the first node participates in the management of cache coherence, and other nodes (such as the second node) cannot perform data access operations requiring the operation authority on the target address; the control of the visible order of the execution results is thus transferred to the first node, that is, the first node controls the visible order of the execution results of the data access operations.
  • the first node may adopt one of the first acquisition method and the second acquisition method to acquire the exclusive processing authority of the target address.
  • the first acquisition manner refers to initiating the authority acquisition process after receiving the message sequence, that is, the operation of receiving the message sequence triggers the authority acquisition process;
  • the second acquisition manner is to obtain the exclusive processing authority from other nodes (such as the second node) in advance.
  • the first acquisition manner and the second acquisition manner will be described in detail below respectively.
  • in the first acquisition manner, after the first node receives the message sequence from the node outside the system, it determines that the target addresses corresponding to the messages in the message sequence belong to the storage space managed by the second node, and then acquires the exclusive processing authority of those target addresses from the second node.
  • the process for the first node to acquire the exclusive processing authority of the target address from the second node may include: after the first node receives the message sequence from the node outside the system, it sends a permission acquisition request to the second node, where the permission acquisition request carries a target address, and the target address includes the target address corresponding to at least one message in the message sequence; the second node sends a permission acquisition response to the first node, where the permission acquisition response is used to indicate that the first node is allowed to obtain the exclusive processing authority of the corresponding target address. Further, after receiving the permission acquisition response sent by the second node, the first node may send a confirmation message to the second node to inform the second node that the first node has obtained the exclusive processing authority of the corresponding target address.
  • the first node may acquire the authority in units of a cache line (cacheline). Taking the cache line size (linesize) as 64 bytes as an example, a permission acquisition request is used to obtain an exclusive processing permission for an address range with a capacity of 64 bytes.
  • the first node can also acquire permissions at a greater granularity, that is, a permission acquisition request can acquire an exclusive processing permission for an address range whose capacity is larger than the size of a cache line.
  • for example, the exclusive processing authority can be obtained in units of 4 KB pages. In this way, the E-state acquisition for the target addresses corresponding to the entire message sequence can be accomplished through a single permission acquisition request, which reduces complexity and also reduces system overhead.
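  • A permission acquisition request that covers more than one cache line can be sketched as carrying an address range instead of a single line address, so that one request covers the target addresses of an entire message sequence. The structure and function names below, and the 64-byte / 4 KB constants, follow the example granularities mentioned above but are otherwise hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

constexpr uint64_t kCacheLineSize = 64;        // bytes, per-line granularity example
constexpr uint64_t kPageSize      = 4 * 1024;  // bytes, page granularity example

// Illustrative permission acquisition request covering a whole address range,
// so one request can cover the target addresses of an entire message sequence.
struct PermissionAcquisitionRequest {
    uint64_t baseAddress;  // start of the range whose E state is requested
    uint64_t length;       // range size in bytes; may exceed one cache line
};

// Build a single request that covers all target addresses of a message sequence,
// rounded out to page granularity (assumes the address list is non-empty and all
// addresses are managed by the same target node).
PermissionAcquisitionRequest buildRequest(const std::vector<uint64_t>& targetAddresses) {
    uint64_t lo = targetAddresses.front(), hi = targetAddresses.front();
    for (uint64_t a : targetAddresses) { lo = std::min(lo, a); hi = std::max(hi, a); }
    uint64_t base = lo & ~(kPageSize - 1);                                       // round down to page
    uint64_t end  = (hi + kCacheLineSize + kPageSize - 1) & ~(kPageSize - 1);    // round up to page
    return {base, end - base};
}
```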
  • there are no strict requirements on the path through which the first node sends the permission acquisition requests or on the order in which they are sent.
  • different permission acquisition requests may be sent from different paths or the same path, and may be sent out of order or in parallel.
  • the embodiment of the present application does not have strict requirements on the path through which the second node sends the permission acquisition response and the order of sending.
  • different permission acquisition responses can be sent from different paths or the same path, and can be sent out of order or in parallel.
  • the paths passed by the permission acquisition request and the corresponding permission acquisition response may be the same or different.
  • the first node may select an appropriate path based on the congestion control mechanism to send a permission acquisition request to the second node to reduce delay and improve communication efficiency; the second node may also use the congestion control mechanism to An appropriate path is selected to send a permission acquisition response to the first node.
  • after receiving the message sequence, the first node sends a permission acquisition request through a first path between the first node and the second node, where the first path is selected by the first node according to the congestion status of each path between the first node and the second node.
  • the first path may be the path with the lowest current busyness among all the paths between the first node and the second node.
  • the second node may send the corresponding permission acquisition response through a second path between the second node and the first node, where the second path is selected by the second node according to the congestion status of each path between the second node and the first node; for example, the second path may be the currently least busy path among all the paths between the second node and the first node.
  • the first path and the second path may be the same or different.
  • FIG. 8a and FIG. 8b show schematic diagrams in which the first node obtains the exclusive processing authority of the destination address corresponding to each message based on multipath.
  • the message sequence includes message 0, message 1, ..., message N, where the permission acquisition request corresponding to message 0 (Req0: Get_E) is transmitted over the direct path (Node0-Node3) between the first node and the second node, the permission acquisition request corresponding to message 1 (Req1: Get_E) is transmitted over the indirect path (Node0-Node2-Node3) between the first node and the second node, and the permission acquisition request corresponding to message 2 (Req2: Get_E) is transmitted over the indirect path (Node0-Node1-Node3) between the first node and the second node.
  • the first node may select an appropriate path based on the congestion control mechanism to transmit the permission acquisition request corresponding to each message.
  • the permission acquisition response (Req0: Res) corresponding to the permission acquisition request (Req0: Get_E) is transmitted over the indirect path (Node3-Node1-Node0) between the second node and the first node, and the permission acquisition response (Req1: Res) corresponding to the permission acquisition request (Req1: Get_E) is transmitted over the direct path (Node3-Node0) between the second node and the first node.
  • the second node may select an appropriate path based on the congestion control mechanism to transmit the permission acquisition response corresponding to each message. Responses to acquiring permissions may also be sent in parallel.
  • the first node sends a first permission acquisition request (GET_E1) and a second permission acquisition request (GET_E2) to the second node; the first permission acquisition request and the second permission acquisition request may be GET_E messages.
  • the first permission acquisition request includes the first target address, which is used to request the exclusive processing permission of the first target address
  • the second permission acquisition request message includes the second target address, and is used to request the exclusive processing permission of the second target address .
  • the order in which the first node sends the first permission acquisition request and the second permission acquisition request to the second node is not limited, and the paths traversed by the first permission acquisition request and the second permission acquisition request may be paths selected by the first node based on the congestion control mechanism.
  • after the second node receives the first permission acquisition request, it sends a first permission acquisition response (RSP1) to the first node; after receiving the second permission acquisition request, the second node sends a second permission acquisition response (RSP2) to the first node.
  • the first authority acquisition response is a response message (RSP1) to the first authority acquisition request, and is used to indicate that the exclusive processing authority of the first target address is transferred to the first node;
  • the second authority acquisition response is a response message to the second authority acquisition request (RSP2), for instructing to transfer the exclusive processing right of the second target address to the first node.
  • the order in which the second node sends the first permission acquisition response and the second permission acquisition response to the first node is not limited, and the paths traversed by the first permission acquisition response and the second permission acquisition response may be paths selected by the second node based on the congestion control mechanism.
  • after the first node receives the first permission acquisition response, it sends a first confirmation message (ACK1) to the second node to inform the second node that the first node has obtained the exclusive processing authority of the first target address; after the first node receives the second permission acquisition response, it sends a second confirmation message (ACK2) to the second node to inform the second node that the first node has obtained the exclusive processing authority of the second target address.
  • the order in which the first node sends the first confirmation message and the second confirmation message to the second node is not limited, and the paths traversed by the first confirmation message and the second confirmation message may be paths selected by the first node based on the congestion control mechanism.
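  • The GET_E / RSP / ACK exchange described above can be summarized by the following sketch of the junction node's side of the handshake for a single target address; the message structs and the Link interface are hypothetical stand-ins for the real interconnect protocol, and path selection is left to the Link.

```cpp
#include <cstdint>

// Hypothetical wire messages for the E-state migration handshake.
struct GetE { uint64_t targetAddress; };           // permission acquisition request (GET_E)
struct Rsp  { uint64_t targetAddress; bool ok; };  // permission acquisition response (RSP)
struct Ack  { uint64_t targetAddress; };           // confirmation message (ACK)

// Hypothetical stand-in for a (possibly multi-path, congestion-aware) link to the target node.
struct Link {
    virtual void send(const GetE&) = 0;
    virtual Rsp  receiveRsp() = 0;
    virtual void send(const Ack&) = 0;
    virtual ~Link() = default;
};

// Junction-node side of the handshake: request the E state of one target address,
// wait for the response indicating the authority has migrated, then confirm with an ACK.
bool acquireExclusiveAuthority(Link& link, uint64_t targetAddress) {
    link.send(GetE{targetAddress});   // e.g. GET_E1 for the first target address
    Rsp rsp = link.receiveRsp();      // e.g. RSP1: authority migrated to this node
    if (!rsp.ok || rsp.targetAddress != targetAddress) return false;
    link.send(Ack{targetAddress});    // e.g. ACK1: confirm receipt of the authority
    return true;
}
```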
  • the first node may obtain the exclusive processing authority of the address in the storage space managed by other nodes (such as the second node) in advance.
  • the first node may obtain from the second node the exclusive processing authority of the address in the storage space managed by the second node according to a set period or a set time, or when it is idle.
  • the first node can, according to the addresses involved in historical data access operations (that is, the target addresses corresponding to historical data access operations), determine which of those target addresses belong to the storage space managed by the second node, determine a specified address range accordingly, and obtain from the second node the exclusive processing permission of the addresses within the specified address range.
  • the specified address range matches the target addresses involved in the historical data access operations; for example, it may be the same as the address range involved in the historical data access operations, or it may include those target addresses and further expand the address range appropriately on that basis (a small sketch of deriving such a range follows below).
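As a rough illustration of how such a specified address range could be derived, the sketch below filters historical target addresses down to those inside the second node's storage space, takes their span, and widens it by a margin. The margin value, the address bounds and the function name are assumptions made for the example, not values taken from this application.

    // Sketch: derive a "specified address range" for advance E-state acquisition.
    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct Range { uint64_t lo, hi; };

    Range specified_range(const std::vector<uint64_t>& history,
                          Range second_node_space, uint64_t margin) {
        uint64_t lo = UINT64_MAX, hi = 0;
        for (uint64_t a : history) {
            // Keep only target addresses managed by the second node.
            if (a < second_node_space.lo || a > second_node_space.hi) continue;
            lo = std::min(lo, a);
            hi = std::max(hi, a);
        }
        // Expand moderately, but stay inside the space managed by the second node.
        lo = std::max(second_node_space.lo, lo > margin ? lo - margin : 0);
        hi = std::min(second_node_space.hi, hi + margin);
        return {lo, hi};
    }

    int main() {
        std::vector<uint64_t> history = {0x401000, 0x403040, 0x7fff0000, 0x402080};
        Range r = specified_range(history, {0x400000, 0x4fffff}, 0x1000);
        std::cout << std::hex << "prefetch E-state for [0x" << r.lo
                  << ", 0x" << r.hi << "]\n";
        return 0;
    }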
  • the process for the first node to acquire the exclusive processing permission of the target address from the second node may include: the first node sends a permission acquisition request to the second node, the permission acquisition request carrying the target address; the second node sends a permission acquisition response to the first node, the permission acquisition response indicating that the exclusive processing permission of the target address is transferred to the first node.
  • the target address may be an address within the specified address range determined by the above method.
  • there are no strict requirements on the paths through which the first node sends the permission acquisition requests or on the order in which they are sent.
  • different permission acquisition requests may be sent from different paths or the same path, and may be sent out of order or in parallel.
  • the embodiment of the present application does not have strict requirements on the path through which the second node sends the permission acquisition response and the order of sending.
  • different permission acquisition responses can be sent from different paths or the same path, and can be sent out of order or in parallel.
  • the paths passed by the permission acquisition request and the corresponding permission acquisition response may be the same or different.
  • the first node may acquire the authority in units of a cache line (cacheline). Taking the cache line size (linesize) as 64 bytes as an example, a permission acquisition request is used to obtain an exclusive processing permission for an address range with a capacity of 64 bytes.
  • the first node can also acquire permissions at a greater granularity, that is, a permission acquisition request can acquire exclusive processing permissions for an address range whose capacity is greater than the size of a cache line.
  • for example, exclusive processing permission can be obtained in units of 4 KB pages, i.e., for an address range of that capacity. In this way, E-state acquisition for a larger address range can be achieved through a single permission acquisition request, which reduces complexity and also reduces system overhead; the sketch below compares the two granularities.
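The effect of the acquisition granularity can be made concrete with a small calculation: the sketch below counts how many permission acquisition requests are needed to cover one address range at 64-byte cache-line granularity versus 4 KB page granularity. The helper function is purely illustrative and not part of any real interconnect API.

    // Sketch: number of GET_E requests needed for one range at two granularities.
    #include <cstdint>
    #include <iostream>

    uint64_t requests_needed(uint64_t base, uint64_t length, uint64_t granule) {
        uint64_t first = base / granule;
        uint64_t last  = (base + length - 1) / granule;
        return last - first + 1;
    }

    int main() {
        uint64_t base = 0x401000, length = 64 * 1024;   // a 64 KB target range
        std::cout << "cacheline (64 B) granularity: "
                  << requests_needed(base, length, 64)   << " GET_E requests\n";
        std::cout << "page (4 KB) granularity:      "
                  << requests_needed(base, length, 4096) << " GET_E requests\n";
        return 0;
    }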
  • the exclusive processing permission of a migrated address has a validity period, which indicates the maximum length of time for which the exclusive processing permission of the address may remain migrated to another node (such as the first node above); the validity period can be set in advance.
  • in practice, the upstream node (such as the first node in the embodiment of this application) may fail; for example, the first node may be hot-swapped and removed from the system, so that the first node is unable to return the exclusive processing permission of the target address obtained from the second node, causing a consistency error when the system subsequently performs data access operations on the target address.
  • the above-mentioned problem can be solved by setting the validity period of the authority. For the convenience of description, take the transfer of the exclusive processing authority of the first target address as an example.
  • after the exclusive processing permission of the first target address is transferred from the second node to the first node, if that permission has not been returned to the second node within the set time period (for example, if no message from the first node indicating the return of the exclusive processing permission of the first target address is received within that period), the second node regains the exclusive processing permission of the first target address.
  • the set duration is the duration corresponding to the validity period.
  • after the first node obtains the exclusive processing permission of the first target address, if no data access operation is performed on the first target address within the set time period, this indicates that the exclusive processing permission for the first target address has timed out or expired; the first node then releases the exclusive processing permission of the first target address, that is, returns it to the second node, for example by sending a message to the second node indicating the return of the exclusive processing permission of the first target address.
  • the first node may start a first timer to time the validity period of the exclusive processing right of the first target address
  • the second node may start a second timer to time the validity period of the exclusive processing right of the first target address.
  • the first node may start a first timer for the exclusive processing permission of the first target address when it sends the permission acquisition request requesting that permission, or when it receives the corresponding permission acquisition response returned by the second node.
  • the second node may start a second timer for the exclusive processing permission of the first target address when it receives the above-mentioned permission acquisition request sent by the first node, or when it sends the corresponding permission acquisition response to the first node.
  • at the first node, if the first node performs a data access operation on the first target address while the first timer is running, the first timer is released; at the second node, if an indication to return the exclusive processing permission of the first target address is received while the second timer is running, the second timer is released.
  • in some embodiments, the durations of the first timer and the second timer are the same, and that duration is the duration value of the validity period.
  • in other embodiments, the durations of the first timer and the second timer differ, the duration of the first timer being shorter than the duration of the second timer; this accounts for the delay between the second node sending the permission acquisition response and the first node receiving it, and helps keep the state of the exclusive processing permission consistent on the two nodes.
  • the first node may also send a confirmation message to the second node to indicate that the exclusive processing permission of the first target address has been obtained.
  • when the second node receives the confirmation message returned by the first node, confirming that the first node has obtained the exclusive processing permission of the first target address, it can start the second timer for that permission; in this way, the validity-period timing at the first node and at the second node will be relatively close, which helps keep the exclusive processing permission of the first target address consistent on the first node and the second node.
  • the duration of the first timer and the duration of the second timer can be set to be substantially the same.
  • FIG. 9a, FIG. 9b and FIG. 9c respectively show the validity-period timing of the exclusive processing right of the first target address and the handling after the timeout expires.
  • as shown in FIG. 9a, the first node sends a first permission acquisition request (GET_E1) to the second node to request the exclusive processing permission of the first target address; after receiving it, the second node sends a first permission acquisition response (RSP1) to the first node to transfer the exclusive processing permission of the first target address to the first node, and starts a second timer of duration Th to time how long that permission remains migrated; after the first node receives the first permission acquisition response, it starts a first timer of duration Tm to time how long the permission remains migrated, with Tm < Th.
  • at a time Tf within the running period of the first and second timers, the first node is removed without having returned the exclusive processing permission of the first target address, and therefore cannot send the second node an indication to return that permission; at the second node, when the second timer expires, the second node regains the exclusive processing permission of the first target address.
  • as shown in FIG. 9b, the first node sends a first permission acquisition request (GET_E1) to the second node to request the exclusive processing permission of the first target address; after receiving it, the second node sends a first permission acquisition response (RSP1) to the first node to transfer the exclusive processing permission of the first target address to the first node, and starts a second timer of duration Th to time how long that permission remains migrated; after the first node receives the first permission acquisition response, it starts a first timer of duration Tm to time how long the permission remains migrated, with Tm < Th.
  • during the running period of the first timer, the first node does not perform any data access operation on the first target address, so the first timer expires; when it expires, the first node sends the second node an indication to return the exclusive processing permission of the first target address and releases the first timer; after receiving the indication, the second node regains the exclusive processing permission of the first target address and releases the second timer.
  • as shown in FIG. 9c, the first node sends a first permission acquisition request (GET_E1) to the second node to request the exclusive processing permission of the first target address; after receiving it, the second node sends a first permission acquisition response (RSP1) to the first node to transfer the exclusive processing permission of the first target address to the first node, and starts a second timer of duration Th to time how long that permission remains migrated; after the first node receives the first permission acquisition response, it starts a first timer of duration Tm to time how long the permission remains migrated, with Tm < Th.
  • at a time Tf within the running period of the first and second timers, the first node performs a data access operation on the first target address and sends the second node an indication to return the exclusive processing permission of the first target address; after receiving the indication, the second node regains the exclusive processing permission of the first target address.
  • the data access operation of the first node on the first target address may succeed or fail; regardless of whether it succeeds, the first node can send the second node an indication to return the exclusive processing permission of the first target address. A small sketch of these three timer scenarios follows below.
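The three scenarios of FIG. 9a, 9b and 9c reduce to a small decision about when the home node regains the E-state, given the two timer durations Tm < Th. The sketch below encodes only that decision logic, under the simplifying assumption that message latencies are ignored; the structure and function names are invented for the illustration.

    // Sketch of the validity-period outcomes for the three scenarios.
    #include <iostream>
    #include <optional>

    struct ValidityPeriod {
        double Tm;   // first timer, at the requester (first node)
        double Th;   // second timer, at the home (second node), Th > Tm
    };

    // Time at which the second node regains the exclusive processing permission.
    double estate_returned_at(const ValidityPeriod& vp,
                              std::optional<double> access_time,   // when the first node accesses the address
                              bool requester_removed) {
        if (requester_removed) return vp.Th;                       // FIG. 9a: reclaimed when Th expires
        if (!access_time || *access_time > vp.Tm) return vp.Tm;    // FIG. 9b: idle, returned when Tm expires
        return *access_time;                                       // FIG. 9c: returned right after the access
    }

    int main() {
        ValidityPeriod vp{5.0, 8.0};   // Tm = 5, Th = 8, Tm < Th
        std::cout << "9a (first node hot-removed): " << estate_returned_at(vp, std::nullopt, true)  << "\n";
        std::cout << "9b (no access before Tm):    " << estate_returned_at(vp, std::nullopt, false) << "\n";
        std::cout << "9c (access at t = 2):        " << estate_returned_at(vp, 2.0, false)          << "\n";
        return 0;
    }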
  • the first node performs a data access operation on the target address based on the exclusive processing authority of the target address.
  • the visible order of the execution results of the data access operation satisfies the above-mentioned constraint of the external nodes on the visible order of the execution results.
  • after the first node obtains the exclusive processing permissions of the first target address and the second target address, it can perform a data access operation on the first target address based on the exclusive processing permission of the first target address, and perform a data access operation on the second target address based on the exclusive processing permission of the second target address.
  • the order between the data access operation on the first target address and the data access operation on the second target address only needs to meet the storage consistency requirement. For example, if the first target address and the second target address are different addresses, the data caching operation may be performed on the first target address according to the first message and then on the second target address according to the second message,
  • or the data caching operation may be performed on the second target address according to the second message first and then on the first target address according to the first message, or the data caching operations on the first target address and the second target address may be performed in parallel. As another example, if the first target address and the second target address are the same address, the data caching operation must first be performed on that address according to the first message, and then according to the second message.
  • the visible order of the execution results of the data access operations needs to satisfy the above-mentioned constraint that the external node places on the visible order of execution results. For example, if the visible order required by the external node is, from front to back, the execution result corresponding to the first message followed by the execution result corresponding to the second message, then on the first node the order of the execution results is: the execution result of the data access operation on the first target address, followed by the execution result of the data access operation on the second target address, and this order is globally visible.
  • the first node may obtain a cache address corresponding to the target address from the second node, and perform data access to the cache address.
  • for example, the first node may send an address acquisition request to the second node carrying the target address (such as the first target address and/or the second target address), and after receiving the address acquisition request the second node sends an address acquisition response to the first node carrying the cache address corresponding to the target address.
  • the first node can send the address acquisition requests corresponding to the messages according to the visible order of execution results required by the above-mentioned external node, and can record the sending order of the address acquisition requests as the globally visible order of the execution results. For example, if, according to the external node's constraint, the visible order of execution results should be consistent with the order of the message sequence, the first node sends the corresponding address acquisition requests to the second node in the order of the message sequence.
  • the first node may select an appropriate path based on the congestion control mechanism and send the address acquisition request to the second node over it, to reduce delay and improve communication efficiency; the second node may likewise select an appropriate path based on the congestion control mechanism and send the address acquisition response to the first node over it.
  • for example, the first node sends the address acquisition request through a first path to the second node, the first path being selected by the first node according to the congestion status of each path between the first node and the second node; the first path may be the currently least busy path among all the paths between the first node and the second node.
  • the second node can send the corresponding address acquisition response through a second path to the first node, the second path being selected by the second node according to the congestion status of each path between the second node and the first node; the second path may be the currently least busy path among all the paths between the second node and the first node.
  • the first path and the second path may be the same or different; a sketch of this least-busy path selection follows below.
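A possible reading of the congestion-based path choice is simply "pick the currently least busy path towards the peer". The sketch below does exactly that over the three Node0-to-Node3 paths of the example topology; the queued-message count used as the busyness metric is an assumption made for illustration only.

    // Sketch: choose the currently least busy path towards the peer node.
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Path {
        std::string hops;     // e.g. "Node0-Node3"
        int queued;           // current congestion indicator
    };

    const Path& pick_path(const std::vector<Path>& paths) {
        return *std::min_element(paths.begin(), paths.end(),
                                 [](const Path& a, const Path& b) {
                                     return a.queued < b.queued;
                                 });
    }

    int main() {
        std::vector<Path> to_second_node = {
            {"Node0-Node3", 12},          // direct path, currently busy
            {"Node0-Node1-Node3", 3},     // indirect but idle
            {"Node0-Node2-Node3", 7},
        };
        std::cout << "send address acquisition request via "
                  << pick_path(to_second_node).hops << "\n";
        return 0;
    }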
  • as shown in FIG. 7, after the first node obtains the exclusive processing permission of the first target address, it sends a first address acquisition request to the second node, which may carry the first target address and the data access operation type (in this example, a write operation) and is used to request the cache address corresponding to the first target address; after the first node obtains the exclusive processing permission of the second target address, it can send a second address acquisition request to the second node, which may carry the second target address and the data access operation type (in this example, a write operation) and is used to request the cache address corresponding to the second target address.
  • the first address acquisition request and the second address acquisition request may be writeback (WriteBack) messages.
  • the order in which the first node sends the first address acquisition request and the second address acquisition request to the second node is not limited, and the paths taken by the two requests may be paths selected by the first node based on the congestion control mechanism.
  • after the second node receives the first address acquisition request, it sends a first address acquisition response (RSP3) to the first node indicating the cache address corresponding to the first target address; after receiving the second address acquisition request, it sends a second address acquisition response (RSP4) to the first node indicating the cache address corresponding to the second target address.
  • the order in which the second node sends the first address acquisition response and the second address acquisition response to the first node is not limited, and the paths taken by the two responses may be paths selected by the second node based on the congestion control mechanism.
  • the first node obtains the cache address corresponding to the first target address, and after writing the data to be written 1 into the first cache address it can further send the second node an indication to return the exclusive processing permission of the first target address, according to which the second node regains that permission; the first node obtains the cache address corresponding to the second target address, and after writing the data to be written 2 into the second cache address it can further send the second node an indication to return the exclusive processing permission of the second target address, according to which the second node regains that permission.
  • after the first node completes the data access operations, it can return the exclusive processing permission of the corresponding addresses to the second node, so that the second node or other nodes can perform data access operations on those addresses; a sketch of this per-address write flow follows below.
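Putting the preceding items together, the per-address write flow at the first node is: already holding the E-state, ask the second node for the cache address (the WriteBack-style address acquisition request), write the data, then return the E-state. The sketch below strings these steps together; the SecondNode and FirstNode structures, the address map and the function names are assumptions made for the example, not an actual implementation of this application.

    // Sketch of the write flow: WriteBack request -> cache address -> write -> return E-state.
    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <vector>

    struct SecondNode {
        std::map<uint64_t, uint64_t> cache_addr_of;                 // target address -> cache address
        std::vector<uint8_t> cache = std::vector<uint8_t>(1 << 16); // backing cache storage
        std::map<uint64_t, bool> e_state_at_home;

        uint64_t writeback_request(uint64_t target) { return cache_addr_of.at(target); }
        void return_e_state(uint64_t target) { e_state_at_home[target] = true; }
    };

    struct FirstNode {
        SecondNode& peer;
        // Precondition: the E-state of `target` has already been migrated to this node.
        void write(uint64_t target, uint8_t data) {
            uint64_t c = peer.writeback_request(target);   // address acquisition response carries the cache address
            peer.cache[c] = data;                           // perform the write
            peer.return_e_state(target);                    // hand the exclusive processing permission back
        }
    };

    int main() {
        SecondNode second;
        second.cache_addr_of = {{0x1000, 0x40}, {0x2040, 0x80}};
        FirstNode first{second};
        first.write(0x1000, 0xAA);   // first message, first target address
        first.write(0x2040, 0xBB);   // second message, second target address
        std::cout << std::hex << "cache[0x40] = 0x" << int(second.cache[0x40])
                  << ", cache[0x80] = 0x" << int(second.cache[0x80]) << "\n";
        return 0;
    }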
  • S604: the first node sends the execution results of the data access operations on the target addresses to the external node according to the external node's constraint on the visible order of execution results.
  • for example, if the external node imposes a strict ordering constraint and the required visible order of execution results is, from front to back, the execution result corresponding to the first message followed by the execution result corresponding to the second message,
  • then the first node sequentially sends a first response and a second response to the external node, the first response being the response message to the first message and the second response being the response message to the second message; a sketch of this ordered release of responses follows below.
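One way to realize this ordered release of responses is a small publisher at the first node that holds back completed results until every earlier result in the required order has been sent. The sketch below shows that idea; the OrderedPublisher type and its interface are invented for the illustration and are not part of this application.

    // Sketch: release responses to the external node strictly in the required order.
    #include <iostream>
    #include <set>
    #include <vector>

    struct OrderedPublisher {
        std::vector<int> required;   // required visible order of message ids
        std::set<int> done;          // completions seen so far
        size_t next = 0;             // index of the next result allowed to become visible

        void complete(int id) {
            done.insert(id);
            while (next < required.size() && done.count(required[next])) {
                std::cout << "response for message " << required[next]
                          << " sent to the external node\n";
                ++next;
            }
        }
    };

    int main() {
        OrderedPublisher pub{{1, 2}, {}, 0};
        pub.complete(2);   // second message finishes first internally: held back
        pub.complete(1);   // now both responses go out, in order 1 then 2
        return 0;
    }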
  • before the first node performs data access on the target address corresponding to a message (such as the first message) in the message sequence, if it receives a permission acquisition request requesting the exclusive processing permission of the first target address, it releases the exclusive processing permission of the first target address so that other nodes can obtain it.
  • the other nodes may be other nodes in the system where the first node is located except the first node, and the priority of the other nodes may be higher than that of the first node, or the data access request of the other nodes to the first target address may have a higher priority than the corresponding message in the sequence of messages received by the first node.
  • the control of the visible order of the execution results of the message sequence is completed at the upstream node, without interaction between the upstream node and the downstream node and without using a sequence-number-based reordering (reorder) mechanism to ensure the controllability of the visible order of execution results, thereby reducing system overhead.
  • the embodiment of the present application also provides an SoC chip, which may include one or more processors and one or more memories, where the one or more memories store one or more computer programs, the one or more computer programs include instructions, and when the instructions are executed by the one or more processors, the SoC chip is caused to execute the methods provided in the foregoing embodiments.
  • an embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium includes a computer program, and when the computer program runs on a computing device, the computing device executes the methods provided in the foregoing embodiments.
  • an embodiment of the present application further provides a computer program product, which, when invoked by a computer, causes the computer to execute the method provided in the foregoing embodiments.
  • the sequence numbers of the above-mentioned processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and in actual implementation there may be other division methods.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • when implemented using a software program, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (such as coaxial cable, optical fiber, or Digital Subscriber Line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (Solid State Disk, SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

A data access method, interconnection system, and apparatus, applied to the field of storage technology. In the method, a first node in an interconnection system receives a message sequence from an external node, obtains the exclusive processing permission of the target addresses corresponding to the message sequence, and performs data access operations on the target addresses based on that exclusive processing permission, where the visible order of the execution results of the data access operations satisfies the external node's constraint on the visible order of execution results. The messages in the message sequence are used to request data access to target addresses belonging to the storage space managed by the second node, and the external node's constraint on the visible order of execution results is stricter than that of the first node. With this method, the visible order of the execution results of a message sequence can be controlled with low overhead in an interconnection system whose nodes are interconnected in a scale-out manner.

Description

一种数据存取方法、互联系统及装置 技术领域
本申请涉及数据存储技术领域,尤其涉及一种数据存取方法、互联系统及装置。
背景技术
面对逐渐增长的系统性能需求,可通过对系统进行扩展来达到性能目标。常见的系统扩展方式有纵向扩展(也称向上扩展或Scale-up)和横向扩展(也称向外扩展或Scale-out)。两种方式均以节点(node)为单位进行系统扩展。一个节点通常包括处理能力、存储能力和通信能力等,一块芯片就是一个典型的节点。
纵向扩展(Scale-up)的方式主要基于节点自身的现有基础进行扩展,比如可以增加芯片上的处理器的数量以提高计算处理能力,如图1中的纵向扩展箭头方向所示,可在仅包含CPU 0和CPU 1的节点0中增加CPU2和CPU3,以增加处理器的数量;还可以增大存储器(memory)的容量来提高存储能力,如图1中的纵向扩展箭头方向所示,可将节点0中的存储器从2MB扩容到4MB;还可以优化输入输出(input/output,I/O)接口提升交互带宽等。横向扩展(Scale-out)的方式则是通过增加节点的数量来扩展系统规模,如图1中的横向扩展箭头方向所示,节点数量从两个节点(如图中的节点0和节点1)增加至4个节点(如图中的节点0~3)。在保证节点间通信效率的情况下,节点数量的增加就意味着系统整体性能的提升,典型的采用横向扩展(Scale-out)方式的系统如高性能计算(high performance computing,HPC)中的多芯片互联架构。横向扩展(Scale-out)相较于纵向扩展(Scale-up),系统组成方式更为灵活,随时可以增加或者删减节点。
在采用横向扩展(Scale-out)方式的系统中,节点之间的交互效率对于系统性能有显著影响。为了提高系统性能,系统中的节点采用多路径组网方式。多路径组网方式可以提高系统互联效率,但如何以较小的开销来保证对于执行结果可见顺序有要求的数据存取请求消息序列,在系统内部进行执行结果可见顺序的有序处理,是目前需要解决的问题。
发明内容
本申请实施例提供了一种数据存取方法、互联系统和装置,用以针对采用横向扩展方式互联的互联系统,以较小的开销控制消息序列的执行结果的可见顺序。
第一方面,提供一种数据存取方法,应用于互联系统,所述互联系统包括至少两个采用横向扩展方式互联的节点,所述至少两个采用横向扩展方式互联的节点包括第一节点和第二节点,所述方法包括:
所述第一节点接收来自于外部节点的消息序列,所述消息序列中的消息用于请求对目标地址进行数据存取,所述目标地址归属于所述第二节点管理的存储空间,所述外部节点为所述互联系统外部的节点,所述外部节点对执行结果的可见顺序的约束比第一节点严格;
所述第一节点获取所述目标地址的独占处理权限;
所述第一节点基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作,其中,所述数据存取操作的执行结果的可见顺序满足所述外部节点遵守的执行结果顺序约束。
上述实现方式中,第一节点接收到来自于外部节点(即该互联系统外部的节点)的消息序列后,从第二节点获取相应目标地址(即该消息序列中的各消息对应的目标地址)的独占处理权限(即存储一致性中的E态),从而将目标地址的独占处理权限从第二节点迁移到第一节点,由于一个节点拥有消息序列所对应的目标地址的独占处理权限,即代表该节点针对该消息序列具有对执行结果的可见顺序的处理能力,因此第一节点能够控制消息序列的执行结果的可见顺序。由于上述实现方式中,无需采用串行通信方式,也无需引用基于序列号的交互过程以及基于序列号的重排序(reorder)机制,因此可以以较小的开销对执行结果的可见顺序进行控制。
在一种可能的实现方式中,所述第一节点获取所述目标地址的独占处理权限,包括:所述第一节点响应于接收所述消息序列的操作,从所述第二节点获取所述目标地址的独占处理权限。
上述实现方式中,当第一节点接收到消息序列后,根据该消息序列对应的目标地址获取对应地址的独占处理权限,可以有针对性的获取相应地址的独占处理权限,以减少系统开销。
在一种可能的实现方式中,所述第一节点响应于接收所述消息序列的操作,从所述第二节点获取所述目标地址的独占处理权限,包括:所述第一节点接收所述消息序列后,向所述第二节点发送权限获取请求,所述权限获取请求携带所述消息序列中的至少一个消息对应的目标地址;所述第一节点接收来自于所述第二节点的权限获取响应,所述权限获取响应用于指示将所述至少一个消息对应的目标地址的独占处理权限迁移到所述第一节点。
上述实现方式中,在权限获取过程,获取粒度为“至少一个消息对应的目标地址”的独占处理权限,即该实现方式允许该权限获取请求中携带整个消息序列所对应的目标地址且允许超过缓存线(cacheline)大小,即,相较于缓存线大小,该实现方式可采用更大粒度来获取目标地址的独占处理权限。由于权限获取请求中可以携带不止一个消息所对应的目标地址,甚至可以携带整个消息序列中的所有消息对应的目标地址,从而可以减少系统开销,提高权限的获取效率。
在一种可能的实现方式中,所述第一节点接收所述消息序列后,向所述第二节点发送权限获取请求,包括:所述第一节点接收所述消息序列后,通过与所述第二节点之间的第一路径发送所述权限获取请求,其中,所述第一路径是所述第一节点根据与所述第二节点之间各路径的拥塞状态选取的;所述第一节点从与所述第二节点之间的第二路径接收所述权限获取响应,所述第二路径和所述第一路径相同或不同。
上述实现方式中,一方面,由于第一节点可以基于拥塞控制机制选取合适的路径传输权限获取请求,因此可以提高通信效率,减少带宽占用。
在一种可能的实现方式中,所述目标地址的独占处理权限,是所述第一节点在接收所述消息序列之前从所述第二节点获取的。
上述实现方式中,第一节点可以提前获取目标地址的独立处理权限,这样,当接收到消息序列后,可以直接基于提前获取到的目标地址的独立处理权限对该消息序列对应的目标地址进行数据存取操作,从而可以减少数据存取操作的时延。
在一种可能的实现方式中,所述方法还包括:所述第一节点在接收所述消息序列之前,从所述第二节点获取指定地址范围的独占处理权限,所述指定地址范围包括所述消息序列中的消息对应的目标地址。
可选的,若所述第一节点在获取到所述指定地址范围的独占处理权限后,在设定时长内未对所述指定地址范围内的第一地址进行数据存取操作,则释放所述第一地址的独占处理权限,其中,所述第一地址为所述指定地址范围内的任一地址。
上述实现方式中,第一节点获取到目标地址(如第一地址)的独立处理权限后,若在设定时长内未对该目标地址进行数据存取操作,则表明该目标地址的独立处理权限失效,此种情况下,第一节点释放该目标地址的独立处理权限,以便其他节点能够获得该目标地址的独立处理权限以进行数据存取处理。
在一种可能的实现方式中,所述从所述第二节点获取指定地址范围的独占处理权限,包括:所述第一节点根据历史数据存取操作所对应的目标地址中,归属于所述第二节点管理的存储空间的地址,确定所述指定地址范围;所述第一节点从所述第二节点获取所述指定地址范围的独占处理权限。
上述实现方式中,基于统计,历史上对某些地址进行数据存取操作较为频繁的情况下,未来对该地址进行数据存取的概率也会较大,因此采用上述实现方式,可以使得第一节点提前获取到的目标地址的独立处理权限,有较大概率在后续的消息序列处理过程中被用到,因此可以有针对性地提前获取目标地址的独立处理权限,从而可以降低系统开销。
在一种可能的实现方式中,所述第一节点基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作,包括:所述第一节点从所述第二节点获取与所述目标地址对应的缓存地址;所述第一节点基于所述目标地址的独占处理权限,对所述目标地址对应的缓存地址进行数据存取操作。
在一种可能的实现方式中,所述第一节点从所述第二节点获取与所述目标地址对应的缓存地址,包括:
所述第一节点通过与所述第二节点之间的第一路径发送地址获取请求,所述地址获取请求携带第一目标地址,所述第一目标地址包括所述消息序列中至少一个消息对应的目标地址,其中,所述第一路径是所述第一节点根据与所述第二节点之间各路径的拥塞状态选取的;所述第一节点从与所述第二节点之间的第二路径接收地址获取响应,所述地址获取响应携带与所述第一目标地址对应的第一缓存地址,所述第二路径和所述第一路径相同或不同。
上述实现方式中,由于第一节点可以基于拥塞控制机制选取合适的路径传输地址获取请求,因此可以提高通信效率,减少带宽占用。
在一种可能的实现方式中,所述第一节点基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作之后,还包括:所述第一节点释放所述目标地址的独占处理权限。
上述实现方式中,第一节点对消息序列中的消息对应的目标地址进行数据存取之后,释放相应目标地址进行数据存取的独立处理权限,从而可以使得其他节点获得该地址的独立处理权限以便对该地址进行数据存取操作。
在一种可能的实现方式中,所述方法还包括:所述第一节点获取到所述目标地址的独占处理权限后,若在设定长时间内未对所述目标地址进行数据存取操作,则释放所述目标地址的独占处理权限,以避免第一节点长时间占用地址的独占处理权限。
在一种可能的实现方式中,所述方法还包括:所述第一节点对所述消息序列中的第一消息对应的第一目标地址进行数据存取之前,若接收到用于请求获取所述第一目标地址的 独占处理权限的权限获取请求,则第一节点释放所述第一目标地址的独占处理权限,以便其他节点对该第一目标地址进行数据存取操作,该其他节点的优先级可能高于第一节点,或者该其他节点对第一目标地址的数据存取请求的优先级可能高于第一节点所接收到的消息序列中相应消息的优先级。
在一种可能的实现方式中,还包括:所述第一节点根据所述外部节点对执行结果的可见顺序的约束,向所述外部节点发送对所述目标地址进行数据存取操作的执行结果。
在一种可能的实现方式中,还包括:所述第一节点对所述消息序列中的第一消息对应的第一目标地址进行数据存取之前,若接收到用于请求获取所述第一目标地址的独占处理权限的权限获取请求,则释放所述第一目标地址的独占处理权限。
在一种可能的实现方式中,所述第一节点和所述第二节点分别为片上系统SoC芯片,所述外部节点为输入输出I/O设备。
第二方面,提供一种互联系统,所述互联系统包括至少两个采用横向扩展方式互联的节点,所述至少两个采用横向扩展方式互联的节点包括第一节点和第二节点,所述第一节点,用于:
接收来自于外部节点的消息序列,所述消息序列中的消息用于请求对目标地址进行数据存取,所述目标地址归属于所述第二节点管理的存储空间,所述外部节点为所述互联系统外部的节点,所述外部节点对数据存取操作执行结果的可见顺序的约束比第一节点严格;
获取所述目标地址的独占处理权限;
基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作,其中,所述数据存取操作的执行结果的可见顺序满足所述外部节点对执行结果的可见顺序的约束。
在一种可能的实现方式中,所述第一节点,具体用于:响应于接收所述消息序列的操作,从所述第二节点获取所述目标地址的独占处理权限。
在一种可能的实现方式中,所述第一节点,具体用于接收所述消息序列后,向所述第二节点发送权限获取请求,所述权限获取请求携带所述消息序列中的至少一个消息对应的目标地址;所述第二节点,用于接收所述权限获取请求后,向所述第一节点发送权限获取响应,所述权限获取响应用于指示将所述至少一个消息对应的目标地址的独占处理权限迁移到所述第一节点。
在一种可能的实现方式中,所述第一节点,具体用于接收所述消息序列后,通过与所述第二节点之间的第一路径发送所述权限获取请求,其中,所述第一路径是所述第一节点根据与所述第二节点之间各路径的拥塞状态选取的;
所述第二节点,具体用于从与所述第一节点之间的第二路径发送所述权限获取响应,所述第二路径和所述第一路径相同或不同。
在一种可能的实现方式中,所述目标地址的独占处理权限,是所述第一节点在接收所述消息序列之前从所述第二节点获取的。
在一种可能的实现方式中,所述第一节点,还用于:在接收所述消息序列之前,从所述第二节点获取对所述第二节点管理的存储空间内指定地址范围的地址进行数据存取的操作权限,其中,所述第一地址为所述指定地址范围内的地址。
可选的,若在获取到对第一地址进行数据存取的操作权限后,在设定时长内未对所述第一地址进行数据存取,则释放对所述第一地址进行数据存取的操作权限。
在一种可能的实现方式中,所述第一节点,具体用于:根据历史数据存取操作所对应 的目标地址中,归属于所述第二节点管理的存储空间的地址,确定所述指定地址范围;从所述第二节点获取所述指定地址范围的独占处理权限。
在一种可能的实现方式中,所述第一节点,具体用于:从所述第二节点获取与所述目标地址对应的缓存地址;基于所述目标地址的独占处理权限,对所述目标地址对应的缓存地址进行数据存取操作。
在一种可能的实现方式中,所述第一节点,还用于:基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作之后,释放所述目标地址的独占处理权限。
在一种可能的实现方式中,所述第一节点,还用于:获取到所述目标地址的独占处理权限后,若在设定长时间内未对所述目标地址进行数据存取操作,则释放所述目标地址的独占处理权限。
在一种可能的实现方式中,所述第一节点还用于:根据所述外部节点对执行结果的可见顺序的约束,向所述外部节点发送对所述目标地址进行数据存取操作的执行结果。
在一种可能的实现方式中,所述第一节点,还用于:对所述消息序列中的第一消息对应的第一目标地址进行数据存取之前,若接收到用于请求获取所述第一目标地址的独占处理权限的权限获取请求,则释放所述第一目标地址的独占处理权限。
在一种可能的实现方式中,所述第二节点,用于:在第一节点获取到第一目标地址的独占处理权限后,若在设定时长内所述第一节点未归还所述第一目标地址的独占处理权限,则重新获得所述第一目标地址的独占处理权限,其中,所述第一目标地址为所述消息序列中的第一消息对应的目标地址。
在一种可能的实现方式中,所述第一节点和所述第二节点分别为SoC芯片,所述外部节点为I/O设备。
第三方面,提供一种SoC芯片,包括:一个或多个处理器,以及一个或多个存储器;其中,所述一个或多个存储器存储有一个或多个计算机程序,所述一个或多个计算机程序包括指令,当所述指令被所述一个或多个处理器执行时,使得所述SoC芯片执行如上述第一方面中任一项所述的方法。
第四方面,提供一种计算机可读存储介质,所述计算机可读存储介质包括计算机程序,当计算机程序在计算设备上运行时,使得所述计算设备执行如上述第一方面中任一项所述的方法。
第五方面,提供一种计算机程序产品,所述计算机程序产品在被计算机调用时,使得所述计算机执行如上述第一方面中任一项所述的方法。
以上第二方面到第五方面的有益效果,请参见第一方面的有益效果,不重复赘述。
附图说明
图1为系统扩展方式的示意图;
图2为横向扩展(Scale-out)系统的多路径组网示意图;
图3为横向扩展(Scale-out)系统中采用串行处理方式保序的示意图;
图4为横向扩展(Scale-out)系统中采用序列号保序的示意图;
图5为本申请实施例提供的一种横向扩展(Scale-out)系统的架构示意图;
图6为本申请实施例提供的数据存取流程的框图;
图7为本申请实施例中的数据存取流程的交互示意图;
图8a和图8b分别为本申请实施例中第一节点基于多路径获取各消息对应的目标地址的独占处理权限的示意图;
图9a、图9b和图9c分别为本申请实施例中第一目标地址的独占处理权限的有效期计时处理以及超时失效后的处理情况示意图。
具体实施方式
首先对本申请涉及的一些概念进行描述:
数据存取操作:也称数据读写操作,包括写操作和读操作。其中,写操作是指将数据存储到目标地址中,读操作是指从目标地址中读取数据。
存储一致性:指硬件执行读写操作后,其他节点或外部系统对读写操作的执行结果(是否执行了读写操作)的全局可见顺序有一定要求,例如某一节点先后对两个地址分别进行一次读操作(即执行了两次读写操作),或者,某一节点先后对一个地址进行两次写操作,此种情况下,如果其他节点或外部系统不仅获知已经执行了两次读写操作,并且获知(即全局可见)这两次读写操作的执行结果的顺序符合软件预期,则表明满足了存储一致性的要求。
需要说明的是,为描述方便,下文中将读写操作的执行结果的全局可见的顺序,简称为执行结果的可见顺序。
缓存一致性:处理器相对于存储器是快速运行的设备,处理器对存储器进行读写操作时,如果等待操作完成再处理其他任务,将造成处理器阻塞,降低处理器的工作效率。因此,可以针对每个处理器配置一个缓存(缓存的速度远快于存储器但容量小于存储器)。当处理器向存储器中写数据时,可以将数据先写入缓存然后就可以处理其他任务,由直接存储器访问(direct memory access,DMA)器件来将数据存储至存储器;同理,当处理器读存储器中的数据时,由DMA器件先将数据从存储器存储至缓存,再由处理器从缓存中读取数据。当不同处理器通过各自的缓存对存储器中同一地址进行读写操作时,对读写操作的执行顺序有严格要求,即前一个读写操作完成前阻塞后一个读写操作,以防止同时进行读写操作而引起缓存中数据与存储器中数据不一致。
缓存一致性的设备遵守MESI协议,在MESI协议中规定了缓存线(cache line)(缓存中的最小缓存单位)的四种状态,包括:独占(Exclusive,也称E态)、修改(Modified,也称M态)、共享(Shared,也称S态)和失效(Invalid,也称I态)。其中,E态表示该缓存线有效,缓存中数据和存储器中数据一致,数据只存在于本缓存中;M态表示该缓存线有效,数据被修改了,缓存中数据和存储器中数据不一致,数据只存在于本缓存中;S态表示该缓存线有效,缓存中数据和存储器中数据一致,数据存在于多个缓存中;I态表示该缓存线无效。
存储一致性模型按照所要求的执行结果可见顺序的严格程度从强到弱包括:顺序一致性(sequential consistency,SC)模型、完全存储定序(total store order,TSO)模型、宽松模型(relax model,RM)等。SC模型要求硬件上读写共享内存的操作顺序与软件指令要求的操作顺序严格保持一致;TSO模型,在SC模型的基础上引入了缓存机制,放松了对于写-读(先写后读)操作的顺序约束,即写-读操作中的读操作可以先于写操作完成;RM模型最为宽松,不对任何读写操作进行顺序约束。
在采用横向扩展(Scale-out)方式的系统中,为了提高节点之间的交互效率,从而提 高系统性能,节点采用多路径组网方式,使得系统可以支持多路径通信。
示例性的,图2示出了一种在采用横向扩展(Scale-out)方式的系统中,节点采用多路径组网的示意图。第一节点(如图中的Node0)向第二节点(如图中的Node3)发送消息序列,该消息序列中的消息用于请求对第二节点中的存储器存储空间内的目标地址进行数据存取。由于采用多路径组网,系统可以根据各路径的拥塞程度(即繁忙程度)选择路径来对该消息序列中的消息进行传输,比如,在T0时刻选择当前最空闲的Node0-Node3的直达路径传输消息,而在T1时刻由于Node0-Node3的直达路径繁忙程度较高,处于被阻塞状态,因此系统选择一条当前较为空闲的路径Node0-Node1-Node3传输消息1,从而可以充分利用通信资源,达到拥塞控制的目的,提高互联效率。
采用横向扩展(Scale-out)方式组网的系统(以下可简称Scale-out系统),在路径上支持乱序(out-of-order)的传播方式,但是在与对执行结果可见顺序具有有序(in-order)要求的外部系统,比如I/O设备对接时,由于该外部系统要求遵守一定的顺序约束(比如要求遵守SC模型或TSO模型的顺序约束),因此对于来自外部系统的消息序列,需要保证对应的数据存取操作在Scale-out系统内部的执行结果的可见顺序遵守相应顺序约束以符合外部系统的预期,Scale-out系统可根据该顺序约束向外部系统返回相应数据存取消息所对应的数据存取操作执行结果(即数据存取响应)。
为了保证来自于外部系统的消息序列的执行结果的可见顺序遵守该外部系统的顺序约束,目前业界的一种做法是在单条路径上采用串行处理,发送节点按照顺序发送消息,发送一个消息后,等待该消息在接收节点处理完成后再发送下一个消息。
示例性的,图3示出了一种Scale-out系统中采用串行处理方式以保证消息序列的执行结果的可见顺序遵守顺序约束的示意图。如图所示,Scale-out系统中的第一节点(如图中的Node0)收到外部I/O设备发送的消息序列(消息0,消息1,……,消息N)后,将该消息序列的消息发送到第二节点(如图中的Node1)进行处理。其中,第一节点(Node0)收到I/O设备发送来的消息序列,该I/O设备对执行结果的可见顺序有严格要求,为保证该消息序列的执行结果的可见顺序满足该I/O设备的要求,第一节点(Node0)作为上游主节点(Master节点)需要按照顺序发送消息,通过支持乱序(out-of-order)的通信路径传输到作为下游辅节点(Slave)的第二节点(Node1),并在第二节点(Node1)上按照消息接收顺序进行处理。在上述过程中,第二节点(Node1)对接收到的消息处理完成后,需要与作为上游节点的第一节点(Node0)完成一次握手(即回复一个响应消息),第一节点(Node0)在接收到响应消息后,才向作为下游节点的第二节点(Node1)发送下一个消息,以保证该消息序列的执行结果的可见顺序与该消息序列中的消息顺序一致,从而满足该I/O设备对数据存取操作执行结果的可见顺序的要求。
比如如图3所示,第一节点(Node0)接收到来自于I/O设备的消息序列(消息0,消息1,……,消息N),首先将消息0发送给第二节点(Node1)以使第二节点进行相应的数据存取操作,在接收到第二节点(Node1)返回的响应0后,将消息1发送给第二节点(Node1)以使第二节点进行相应的数据存取操作,并在接收到响应1后将下一个消息发送给第二节点,以此类推,直到将消息N发送给第二节点(Node1)并接收第二节点返回的响应N,这样,第一节点(Node0)上可得到全局可见的数据存取操作执行结果(响应0,响应1,……,响应N),该执行结果的顺序与消息序列中的消息顺序一致,符合该I/O设备对数据存取操作执行结果的可见顺序的要求。
采用上述串行处理方式,由于需要等待一个消息处理完成后才能发送下一个消息,导致处理时延过长,带宽受到严重影响,互联效率较低。
为了降低处理时延,目前业界的另一种做法是采用序列号(sequence number)来保证执行顺序。该方法中,对于一个消息序列使用对应的一套序列号(sequence number)对消息进行标记,携带序列号的消息可以在各路径上乱序传递,在下游节点按照消息中的序列号还原对应的顺序。
示例性的,图4示出了一种Scale-out系统中采用序列号以保证消息序列的执行结果的可见顺序遵守顺序约束的示意图。如图所示,Scale-out系统中的第一节点(如图中的Node0)收到外部I/O设备发送的消息序列(消息0,消息1,……,消息N)后,由于该消息序列的执行结果的可见顺序需要遵守严格顺序约束,因此第一节点(Node0)作为上游节点在向作为下游节点的第二节点(Node1)发送消息前,按照顺序给该消息标记上对应的序列号(sequence number),图4中的序列号与消息序列中的消息序号一致,比如,消息0被标记的序列号记为0(即Seq Num=0),消息1被标记的序列号记为1(即Seq Num=1),以此类推,消息N被标记的序列号记为N(即Seq Num=N)。携带序列号的各消息可在路径上乱序传播。第二节点(Node1)收到消息后,采用类似于重新排序缓冲区(reorder buffer,ROB)的机制,根据消息中携带的序列号来还原顺序。第二节点(Node1)向第一节点(Node0)返回的响应消息也可以乱序传输。第一节点(Node0)可根据记录的消息序列号来确定响应消息的顺序,以保证数据存取操作执行结果的可见顺序,比如,响应0是消息0对应的数据存取操作执行结果,响应1是消息0对应的数据存取操作执行结果,响应N是消息N对应的数据存取操作执行结果,第一节点(Node0)可以根据消息0至消息N分别对应的序列号来确定响应0至响应N的顺序,进而按顺序向I/O设备返回响应。
采用上述添加序列号的方式,由于需要在消息中额外携带序列号,上下游节点需要提供对应的处理机制,且节点间的每个进程(一个进程对应一个消息序列)都需要对应配置一套序列号(sequence number),复杂度高,需占据较多资源。另外,很难扩展到多路径场景中使用,对于同一进程,由于每一条路径都可以用于该进程的通信,因此每一条路径上都需要支持对应的序列号(sequence number),开销随路径数量增长而急剧增长,对于系统而言是无法接受的。
为了解决上述问题,本申请实施例提供了数据存取方法、互联系统和装置。本申请实施例可应用于采用横向扩展(Scale-out)且多路径组网的系统,以较小的开销实现多路径场景下的保序问题。
本申请实施例对Scale-out系统中的缓存一致性范围(Cache Coherence Domain,CC范围)进行扩展,将Scale-out系统和外部系统的交界节点纳入下游节点的CC范围,从而可以将针对执行结果可见顺序的控制操作,从下游节点迁移到交界节点上,即在交界节点上对数据存取操作执行结果的可见顺序进行控制,一方面可以保证数据存取操作在Scale-out系统内能够并行处理,保证带宽,另一方面仅需要通过获取一致性中的地址操作权限来迁移顺序处理,与上述采用序列号的方案相比,可以在满足数据存取操作执行结果的可见顺序符合外部系统要求的基础上,节省系统开销。
其中,Scale-out系统和外部系统间的交界节点是指Scale-out系统中接收到外部系统(如外部I/O设备)发送的消息序列的节点,比如,图2、图3或图4中的第一节点可被称为交界节点。
下面结合附图对本申请实施例进行详细说明。
参见图5,为本申请实施例中的横向扩展(Scale-out)系统的架构示意图。
如图所示,该系统中包括第一节点(如图中所示的Node0)、第二节点(如图中所示的Node3)、第三节点(如图中所示的Node1)、第四节点(如图中所示的Node2)。各节点间互联,形成多路径组网结构。以第一节点和第二节点间的路径为例,第一节点和第二节点间包括三条路径:Node0-Node3、Node0-Node1-Node3、Node0-Node2-Node3。需要说明的是,实际应用中的系统所包含的节点数量可以比图5所示的节点数量更多或更少,本申请实施例对该系统中的节点数量不做限制。
上述系统架构的一种典型应用为HPC芯片互联系统,在HPC芯片互联系统中,上述节点可以是片上系统(System on Chip,SoC)芯片。HPC芯片互联系统中的节点可以与系统外部的节点进行信息交互,其中,所述系统外部的节点可以是I/O设备,示例性的,当I/O设备与SoC芯片通过高速串行计算机扩展总线标准(peripheral component interconnect express,PCIE)连接时,所述I/O设备可以是PCIE板卡,当I/O设备与SoC芯片通过网络传输协议连接时,该I/O设备可以是以太网接口。
为描述方便,下文描述中,将上述横向扩展(Scale-out)系统中,与外部系统进行信息交互的节点称为交界节点(或称为上游节点),交界节点可接收来自于外部系统的消息序列,该消息序列中的消息为数据存取请求消息,用于请求对该横向扩展(Scale-out)系统中的目标节点(或称下游节点)所管理的存储空间内的地址进行数据存取操作。可理解的,目标节点所管理的存储空间是指该目标节点中的存储器的存储空间。可理解的,上述横向扩展(Scale-out)系统中,任何节点都可能成为交界节点(或称上游节点)或目标节点(或称下游节点)。
比如,如图5所示,在第一节点接收到系统外部的I/O设备发送的消息序列的场景下,该第一节点即为交界节点,该消息序列中的数据存取请求消息对应的目标地址归属于第二节点管理的存储空间,则第二节点即为该消息序列对应的目标节点。如果第二节点接收到系统外部I/O设备发送的消息序列,则在该场景下,第二节点即为交界节点,该消息序列中的数据存取请求消息对应的目标地址归属于第三节点管理的存储空间,则第三节点称为该消息序列对应的目标节点。
基于上述系统架构,本申请实施例通过扩展缓存一致性范围(CC范围),将目标地址的独占处理权限(也称E态)从下游节点迁移到交界节点(上游节点),从而将针对数据存取操作执行结果的可见顺序的控制操作从下游节点迁移到交界节点上来。这样,对于交界节点接收到的来自于外部系统的消息序列,由于目标地址的独占处理权限已经被迁移到该交界节点,因此交界节点可以承担对数据存取操作执行结果的可见顺序的控制操作,从而保证执行结果的可见顺序满足外部系统的要求。
以图5为例,第一节点(Node0)接收到系统外部的I/O设备发送的消息序列,该消息序列中包括N个数据存取请求消息(如图中所示的Req0,Req1,…,ReqN),N为大于或等于2的整数。该N个数据存取请求消息用于请求对该系统中的第二节点(Node3)所管理的存储空间内的地址进行数据存取,即,该N个数据存取请求消息对应的目标地址归属于第二节点上的存储器的地址空间。第一节点(Node0)从第二节点(Node3)获取相应地址的独占处理权限,使得相应地址的独占处理权限从第二节点(Node3)迁移到第一节点(Node0),如图所示,原来仅在第二节点(Node3)的CC范围(如图5中的a所示),采 用本申请实施例后,可通过地址的独占处理权限迁移(即E态迁移),将第一节点(Node0)纳入CC范围(如图5中的b所示)。
由于本申请实施例中,将目标地址的独占处理权限从下游节点迁移到交界节点,使交界节点可以基于存储一致性要求执行数据存取操作,并可以对数据存取操作执行结果的可见顺序进行控制,无需下游节点做额外的顺序处理,从而可以以较小的开销实现Scale-out系统的多路径组网。
进一步的,在横向扩展(Scale-out)系统内部,节点间的信息交互仍可支持多路径通信。可选的,系统中的节点可以基于拥塞机制进行路径选择,通过选择的路径传输消息到其他节点,节点间的消息可通过多路径进行乱序传输,以降低系统时延,提高带宽利用率。
基于上述架构,比较典型的应用场景是I/O设备(或I/O系统)和HPC系统的交互。I/O设备发送给HPC系统的消息序列通常对执行结果的可见顺序是有要求的,而HPC系统采用多路径的横向扩展(Scale-out)方式互联,消息在各节点间的路径上是乱序传输的,且可以多路径传输。因此对于I/O设备发送的消息序列,在进入横向扩展(Scale-out)系统后,需要保证执行结果的可见顺序满足该I/O设备的要求。
基于以上系统架构,下面结合图6和图7,对本申请实施例提供的数据存取流程进行说明。该流程可应用于互联系统,该互联系统中可包括至少两个采用横向扩展方式(Scale-out)互联的节点,该至少两个采用横向扩展方式(Scale-out)互联的节点中包括第一节点和第二节点,比如第一节点和第二节点均为SoC芯片。以下流程以第一节点为交界节点,第二节点为目标节点为例描述。
其中,图6为本申请实施例提供的数据存取流程的总体框图。图7为本申请实施例中一个具体应用场景下的数据存取流程的交互示意图,在该应用场景中,互联系统中的第一节点接收到系统外部的节点(如I/O设备)发送的消息序列,该消息序列中包括第一消息和第二消息,第一消息为写请求,用于请求对第一目标地址进行写操作,第二消息为写请求,用于请求对第二目标地址进行写操作,其中第一目标地址和第二目标地址为互联系统内的第二节点管理的存储空间内的地址。该消息序列的执行结果的可见顺序需要遵守顺序约束,比如,该可见顺序从前到后为:第一消息对应的数据存取执行结果、第二消息对应的数据存取执行结果。
需要说明的是,图7仅以消息序列中包括两个写请求为例描述,对于其他数据存取操作类型的消息(比如读请求),或者消息序列中包含有更多消息的情况,可以基于图7所示流程的原理执行。
如图6所示,本申请实施例提供的数据存取流程可包括以下步骤:
S601:第一节点接收来自于外部节点的消息序列。
所述外部节点为互联系统外部的节点,比如I/O设备,该外部节点遵守执行结果顺序约束,该外部节点对数据存取操作执行结果的可见顺序的约束比第一节点严格,比如该外部节点遵守SC模型或TSO模型或其他类型的存储一致性模型所要求的执行结果可见顺序要求。
该消息序列中包括至少两个数据存取请求消息,所述数据存取请求消息用于请求对目标地址进行数据存取,比如该消息序列中可包括用于向目标地址写入数据的写请求,也可包括用于从目标地址读取数据的读请求。所述目标地址归属于该系统中的第二节点所管理的存储空间,也就是说,所述目标地址为第二节点管理的存储空间内的地址。比如,所述 目标地址为第二节点上的存储器的物理地址。可理解的,该消息序列中可包括用于请求向第二节点的存储器写入数据的写请求,和/或用于请求从第二节点的存储器读取数据的请求。
本申请实施例涉及的存取操作或读写操作,可以支持写-写(先写后写)、写-读(先写后读)、读-写(先读后写)、读-读(先读后读)等操作。消息序列中的所有消息的消息类型可以相同,比如均为写请求或均为读请求,也可以不同,比如部分消息为写请求,另外部分消息为读请求。消息序列中的各消息所对应的目标地址可以相同也可以不同,或者部分消息对应的目标地址相同,另外部分消息对应的目标地址不同。
示例性的,如图7所示,系统外部的节点(比如I/O设备)向互联系统中的第一节点发送消息序列,该消息序列中包括第一消息(写请求1)和第二消息(写请求2)。第一消息(写请求1)携带待写数据1和第一目标地址,用于请求将该待写数据1存入该第一目标地址;第二消息(写请求2)携带待写数据2第二目标地址,用于请求将该待写数据2存入该第二目标地址。该第一目标地址和第二目标地址均归属于系统内的第二节点管理的存储空间。
S602:第一节点获取上述目标地址的独占处理权限。
其中,目标地址的独占处理权限可以指缓存一致性中的E态,表示节点对于该地址拥有数据存取的操作权限,拥有消息序列所对应的目标地址的独占处理权限的节点(如本流程中的第一节点),针对该消息序列具有对执行结果的可见顺序的处理能力。以消息序列中包括第一消息和第二消息,且第一消息对应的目标地址为第一目标地址、第二消息对应的目标地址为第二目标地址为例,第一节点可以从第二节点获取第一目标地址的独占处理权限(即E态)和第二目标地址的独占处理权限(即E态)。
第一节点从第二节点获取目标地址的独占处理权限后,CC范围从第二节点扩展到第一节点,使得第一节点参与到缓存一致性的管理,其他节点(例如第二节点)不能对该目标地址进行需要操作权限的数据存取操作,对执行结果的可见顺序的控制被转移到第一节点,即,由第一节点对数据存取操作的执行结果的可见顺序进行控制。
本申请实施例中,第一节点可以采用第一获取方式和第二获取方式中的一种方式来获取目标地址的独占处理权限。其中,第一获取方式是指在接收到消息序列后发起获取权限的过程,即消息序列的接收操作可触发权限获取过程;第二获取方式为提前从其他节点(比如第二节点)获取独占处理权限。下面分别对第一获取方式和第二获取方式进行详细说明。
第一获取方式:
采用第一获取方式的情况下,第一节点接收到来自于系统外部节点的消息序列后,确定该消息序列中的消息所对应的目标地址归属于第二节点管理的存储空间,则从第二节点获取该目标地址的独占处理权限。
可选的,第一节点从第二节点获取目标地址的独占处理权限的过程可包括:第一节点接收到来自于系统外部的节点的消息序列后,向第二节点发送权限获取请求,该权限获取请求携带目标地址,该目标地址包括该消息序列中的至少一个消息所对应的目标地址;第二节点向第一节点发送权限获取响应,该权限获取响应用于指示允许第一节点获得相应目标地址的独占处理权限。进一步的,第一节点接收到第二节点发送的权限获取响应后,可向第二节点发送确认消息,以向第二节点告知第一节点已经获取到相应目标地址的独占处理权限。
上述权限获取过程中,第一节点可以以缓存线(cacheline)为单位进行权限的获取。 以缓存线大小(linesize)为64字节为例,一个权限获取请求用于针对容量为64字节大小的地址范围获取独占处理权限。在另外的实施例中,第一节点还可以以更大粒度进行权限的获取,也就是说,一个权限获取请求可以针对容量大于缓存线大小的地址范围获取独占处理权限。举例来说,可以以4KB大小的页为单位,获取该容量大小的地址范围的独占处理权限,这样,通过一个权限获取请求可以实现整个消息序列所对应的目标地址的E态获取,从而可以降低复杂度,也可以降低系统开销。
上述权限获取过程中,对第一节点发送权限获取请求所经过的路径以及发送的先后顺序没有严格要求。比如,不同的权限获取请求可以从不同路径或相同路径发送,可以乱序发送或者并行发送。本申请实施例对第二节点发送权限获取响应所经过的路径以及发送的先后顺序也没有严格要求。比如,不同的权限获取响应可以从不同路径或相同路径发送,可以乱序发送或者并行发送。进一步的,权限获取请求和对应的权限获取响应所经过的路径可能相同也可能不同。
可选的,在具体实施时,第一节点可基于拥塞控制机制,选择合适的路径向第二节点发送权限获取请求,以降低时延,提高通信效率;第二节点也可以基于拥塞控制机制,选择合适的路径向第一节点发送权限获取响应。举例来说,第一节点接收到消息序列后,通过与第二节点之间的第一路径发送权限获取请求,其中,第一路径是第一节点根据与第二节点之间各路径的拥塞状态选取的,第一路径可以是第一节点和第二节点之间的所有路径中当前繁忙程度最低的路径。第二节点可通过与第一节点之间的第二路径发送对应的权限获取响应,该第二路径是第二节点根据与第一节点之间各路径的拥塞状态选取的,第二路径可以是第二节点和第一节点之间的所有路径中当前繁忙程度最低的路径。其中,第一路径和第二路径可能相同,也可能不同。
示例性的,以图5所示的系统架构为例,图8a和图8b示出了第一节点基于多路径获取各消息对应的目标地址的独占处理权限的示意图。如图8a所示,消息序列中包括消息0、消息1,……,消息N,其中,消息0对应的权限获取请求(Req0:Get_E)通过第一节点和第二节点间的直通路径(Node0-Node3)传输,消息1对应的权限获取请求(Req1:Get_E)通过第一节点和第二节点间的间接路径(Node0-Node2-Node3)传输,消息2对应的权限获取请求(Req2:Get_E)通过第一节点和第二节点间的间接路径(Node0-Node1-Node3)传输。第一节点可基于拥塞控制机制选取合适的路径传输各消息对应的权限获取请求。消息0、消息1和消息2各自对应的权限获取请求也可以并行发送。
如图8b所示,权限获取请求(Req0:Get_E)对应的权限获取响应(Req0:Res)通过第二节点和第一节点间的间接路径(Node3-Node1-Node0)传输,权限获取请求(Req1:Get_E)对应的权限获取响应(Req1:Res)通过第二节点和第一节点间的直通路径(Node3-Node0)传输,权限获取请求(Req2:Get_E)对应的权限获取响应(Req2:Res)通过第二节点和第一节点间的间接路径(Node3-Node2-Node0)传输。第二节点可基于拥塞控制机制选取合适的路径传输各消息对应的权限获取响应。各权限获取响应也可以并行发送。
示例性的,如图7所示,第一节点向第二节点发送第一权限获取请求(GET_E1)和第二权限获取请求(GET_E2),第一权限获取请求和第二权限获取请求可以为GET_E消息。第一权限获取请求中包括第一目标地址,用于请求获取第一目标地址的独占处理权限,第二权限获取请求消息中包括第二目标地址,用于请求获取第二目标地址的独占处理权限。第一节点向第二节点发送第一权限获取请求和第二权限获取请求的顺序不做限制,第一权 限获取请求和第二权限获取请求所经过的路径可以是第一节点基于拥塞控制机制选取的路径。
第二节点接收到第一权限获取请求后向第一节点发送第一权限获取响应(RSP1),第二节点在接收到第二权限获取请求后向第一节点发送第二权限获取响应(RSP2)。第一权限获取响应是第一权限获取请求的响应消息(RSP1),用于指示将第一目标地址的独占处理权限转移至第一节点;第二权限获取响应是第二权限获取请求的响应消息(RSP2),用于指示将第二目标地址的独占处理权限转移至第一节点。第二节点向第一节点发送第一权限获取响应和第二权限获取响应的顺序不做限制,第一权限获取响应和第二权限获取响应所经过的路径可以是第二节点基于拥塞控制机制选取的路径。
第一节点接收到第一权限获取响应后,向第二节点发送第一确认消息(ACK1),以向第二节点告知第一节点已经获取到第一目标地址的独占处理权限;第一节点接收到第二权限获取响应后,向第二节点发送第二确认消息(ACK2),以向第二节点告知第一节点已经获取到第二目标地址的独占处理权限。第一节点向第二节点发送第一确认消息和第二确认消息的顺序不做限制,第一确认消息和第二确认消息所经过的路径可以是第一节点基于拥塞控制机制选取的路径。
第二获取方式:
采用第二获取方式的情况下,第一节点可以提前获取到其他节点(比如第二节点)所管理的存储空间内的地址的独占处理权限。比如,第一节点可按照设定周期或设定时间或者在空闲时,从第二节点获取第二节点所管理的存储空间内的地址的独占处理权限。
考虑到第二节点所管理的存储空间可能较大,为了减少系统开销以及对其他节点进行数据存储操作的影响,本申请实施例中,第一节点可根据历史数据存取操作所涉及的地址(即历史数据存取操作对应的目标地址),确定历史数据存取操作对应的目标地址中归属于第二节点管理的存储空间的地址,并据此确定指定地址范围,从第二节点获取该指定地址范围内的地址的独占处理权限。其中,该指定地址范围与历史数据存取操作所涉及的目标地址相匹配,比如可以与历史数据存取操作所涉及的目标地址范围相同,也可以包括历史数据存取操作所涉及的目标地址,并进一步在该基础上适当扩大地址范围。
可选的,采用第二获取方式的情况下,第一节点从第二节点获取目标地址的独占处理权限的过程可包括:第一节点向第二节点发送权限获取请求,该权限获取请求携带目标地址;第二节点向第一节点发送权限获取响应,该权限获取响应用于指示将该目标地址的独占处理权限转移至第一节点。其中,所述目标地址可以是采用上述方法确定出的指定地址范围内的地址。
上述权限获取过程中,对第一节点发送权限获取请求所经过的路径以及发送的先后顺序没有严格要求。比如,不同的权限获取请求可以从不同路径或相同路径发送,可以乱序发送或者并行发送。本申请实施例对第二节点发送权限获取响应所经过的路径以及发送的先后顺序也没有严格要求。比如,不同的权限获取响应可以从不同路径或相同路径发送,可以乱序发送或者并行发送。进一步的,权限获取请求和对应的权限获取响应所经过的路径可能相同也可能不同。
上述权限获取过程中,第一节点可以以缓存线(cacheline)为单位进行权限的获取。以缓存线大小(linesize)为64字节为例,一个权限获取请求用于针对容量为64字节大小的地址范围获取独占处理权限。在另外的实施例中,第一节点还可以以更大粒度进行权限 的获取,也就是说,一个权限获取请求可以针对容量大于缓存线大小的地址范围获取独占处理权限。举例来说,可以以4KB大小的页为单位,获取该容量大小的地址范围的独占处理权限,这样,通过一个权限获取请求可以实现较大地址范围的E态获取,从而可以降低复杂度,也可以降低系统开销。
可选的,本申请实施例中,在采用上述第一获取方式或第二获取方式或其他可能的获取方式以获取权限的情况下,被迁移的地址独占处理权限具有有效期,该有效期表示该地址的独占处理权限被迁移到其他节点(比如上述第一节点)上的最大时长,该有效期的大小可预先设置。
在实际应用中,上游节点(比如本申请实施例中的第一节点)可能出现失效的情况,比如第一节点发生了热插拔,被从系统中移除了,这样第一节点将无法将从第二节点获取到的目标地址的独占处理权限归还给第二节点,导致系统后续对该目标地址进行数据存取操作时发生一致性错误。本申请实施例通过设置权限的有效期,可以解决上述问题。为描述方便,以第一目标地址的独占处理权限转移为例,当第一目标地址的独占处理权限从第二节点转移到第一节点后,若在设定时长内该第一目标地址的独占处理权限未被归还给第二节点,比如在该设定时长内未接收到第一节点发送的用于指示归还该第一目标地址的独占处理权限的消息,则第二节点重新获得该第一目标地址的独占处理权限。其中,所述设定时长即为有效期对应的时长。
仍以第一目标地址的独占处理权限从第二节点转移到第一节点为例,在另一种情况下,当第一节点获取到第一目标地址的独占处理权限后,若在设定时长内未对该第一目标地址进行数据存取操作,则表明对第一目标地址进行数据存取的独占处理权限超时或失效,第一节点将释放该第一地址的独占处理权限,即,将该第一目标地址的独占处理权限归还给第二节点,比如可通过向第二节点发送消息以指示归还该第一目标地址的独占处理权限。
可选的,仍以第一目标地址的独占处理权限从第二节点转移到第一节点为例,第一节点可启动第一定时器来对第一目标地址的独占处理权限的有效期进行计时,第二节点可启动第二定时器来对第一目标地址的独占处理权限的有效期进行计时。第一节点可在发送权限获取请求以请求获取第一目标地址的独占处理权限时,针对该第一目标地址的独占处理权限启动第一定时器,或者在接收到第二节点针对该权限获取请求返回的权限获取响应时,针对该第一目标地址的独占处理权限启动第一定时器。第二节点可在接收到第一节点发送的上述权限获取请求时,针对该第一目标地址的独占处理权限启动第二定时器,或者在第二节点向第一节点发送对应的权限获取响应时,针对该第一目标地址的独占处理权限启动第二定时器。在第一节点处,当在第一定时器的运行期间内,若第一节点对第一目标地址进行了数据存取操作,则第一定时器被释放;在第二节点处,当在第二定时器的运行期间内,若接收到归还第一目标地址的独占处理权限的指示,则第二定时器被释放。
可选的,在一些实施例中,上述第一定时器和第二定时器的时长相同,该时长即为有效期的时长值。在另一些实施例中,上述第一定时器和第二定时器的时长不相同,其中,第一定时器的时长小于第二定时器的时长。考虑到权限获取响应在第二节点的发送时刻,与该权限获取响应在第一节点的接收时刻之间存在一定时延,将第一定时器的时长设置为小于第二定时器的时长,可以保证同一地址的独占处理权限在第一节点和第二节点上的一致性,即可以避免同一地址的独占处理权限在第一节点和第二节点上均有效的情况。
可选的,在另一些实施例中,第一节点在接收到第二节点发送的权限获取响应后,还 可以向第二节点发送确认消息,以表明已经获取到第一目标地址的独占处理权限。相应的,第二节点可在收到第一节点返回的该确认消息以确认第一节点已经获取到第一目标地址的独占处理权限时,针对该第一目标地址的独占处理权限启动第二定时器,这样,第一节点和第二节点对第一目标地址的独占处理权限的有效期计时会比较接近,可以有助于第一目标地址的独占处理权限在第一节点和第二节点上的一致性。此种情况下,可将第一定时器的时长和第二定时器的时长设置为基本相同。
示例性的,图9a、图9b和图9c分别示出了第一目标地址的独占处理权限的有效期计时处理以及超时失效后的处理情况。
如图9a所示,第一节点向第二节点发送第一权限获取请求(GET_E1)以请求获取第一目标地址的独占处理权限;第二节点接收到该第一权限获取请求后,向第一节点发送第一权限获取响应(RSP1)以将第一目标地址的独占处理权限转移给第一节点,并启动第二定时器对第一目标地址的独占处理权限的被迁移的时长进行计时,第二定时器的时长为Th;第一节点接收到该第一权限获取响应后,启动第一定时器以对第一目标地址的独占处理权限的被迁移的时长进行计时,该第一定时器的时长为Tm,Tm<Th。
在第一定时器和第二定时器运行期间内的Tf时刻,第一节点在尚未归还第一目标地址的独占处理权限的情况下被移除,无法向第二节点发送归还第一目标地址的独占处理权限的指示;在第二节点上,当第二定时器超时的时候,第二节点重新获得第一目标地址的独占处理权限。
如图9b所示,第一节点向第二节点发送第一权限获取请求(GET_E1)以请求获取第一目标地址的独占处理权限;第二节点接收到该第一权限获取请求后,向第一节点发送第一权限获取响应(RSP1)以将第一目标地址的独占处理权限转移给第一节点,并启动第二定时器对第一目标地址的独占处理权限的被迁移的时长进行计时,第二定时器的时长为Th;第一节点接收到该第一权限获取响应后,启动第一定时器以对第一目标地址的独占处理权限的被迁移的时长进行计时,该第一定时器的时长为Tm,Tm<Th。
在第一定时器运行期间内,第一节点未对第一目标地址执行数据存取操作,第一定时器超时,第一节点在第一定时器超时时,向第二节点发送归还第一目标地址的独占处理权限的指示,并进一步释放第一定时器;第二节点接收到该指示后,获得第一目标地址的独占处理权限,并进一步释放第二定时器。
如图9c所示,第一节点向第二节点发送第一权限获取请求(GET_E1)以请求获取第一目标地址的独占处理权限;第二节点接收到该第一权限获取请求后,向第一节点发送第一权限获取响应(RSP1)以将第一目标地址的独占处理权限转移给第一节点,并启动第二定时器对第一目标地址的独占处理权限的被迁移的时长进行计时,第二定时器的时长为Th;第一节点接收到该第一权限获取响应后,启动第一定时器以对第一目标地址的独占处理权限的被迁移的时长进行计时,该第一定时器的时长为Tm,Tm<Th。
在第一定时器和第二定时器运行期间内的Tf时刻,第一节点对第一目标地址进行数据存取操作,向第二节点发送归还第一目标地址的独占处理权限的指示;第二节点接收到该指示后,重新获得第一目标地址的独占处理权限。其中,第一节点对第一目标地址进行数据存取操作可能成功也可能失败,无论对第一目标地址的数据存取操作是否成功,第一节点都可以向第二节点发送归还第一目标地址的独占处理权限的指示。
S603:第一节点基于上述目标地址的独占处理权限,对该目标地址进行数据存取操作。 其中,所述数据存取操作的执行结果的可见顺序满足上述外部节点对执行结果的可见顺序的约束。
该步骤中,以消息序列中包括第一消息和第二消息为例,第一节点获得第一目标地址和第二目标地址后,可基于第一目标地址的独占处理权限,对第一目标地址进行数据存取操作,基于第二目标地址的独占处理权限,对第二目标地址进行数据存取操作。其中,对第一目标地址进行数据存取操作,与对第二目标地址进行数据存取操作的先后顺序,只要满足存储一致性要求即可。比如,如果第一目标地址和第二目标地址为不同的地址,则可先根据第一消息对第一目标地址进行数据缓存操作,再根据第二消息对第二目标地址进行数据缓存操作,也可先根据第二消息对第二目标地址进行数据缓存操作,再根据第一消息对第一目标地址进行数据缓存操作,还可并行的对第一目标地址进行数据缓存操作以及对第二目标地址进行数据缓存操作。再比如,如果第一目标地址和第二目标地址为相同的地址,则需要先根据第一消息对该目标地址进行数据缓存操作,再根据第二消息对该目标地址进行数据缓存操作。
该步骤中,第一节点上,数据存取操作的执行结果的可见顺序,需要满足上述外部节点遵守的数据存储操作执行结果的可见顺序约束。比如,如果外部节点要求的执行结果可见顺序为:从前到后依次为第一消息对应的执行结果、第二消息对应的执行结果,则在第一节点上,数据存取操作执行结果的顺序依次为:对第一目标地址进行数据存取操作的执行结果、对第二目标地址进行数据存取操作的执行结果,且该顺序全局可见。
该步骤中,可选的,第一节点可从第二节点获取与该目标地址对应的缓存地址,并对缓存地址进行数据存取。
举例来说,第一节点可向第二节点发送地址获取请求,其中携带该目标地址(比如第一目标地址和/或第二目标地址),第二节点接收到该地址获取请求后向第一节点发送地址获取响应,其中携带与该目标地址对应的缓存地址。
可选的,在一些实施例中,第一节点可根据上述外部节点要求的执行结果可见顺序来发送各消息对应的地址获取请求,并可记录地址获取请求的发送顺序,以作为执行结果的全局可见顺序。举例来说,根据外部节点对执行结果可见顺序的约束,执行结果的可见顺序应与消息序列的顺序一致,因此第一节点按照消息序列的顺序,依次向第二节点发送相应消息对应的地址获取请求。
可选的,在另一些实施例中,第一节点可基于拥塞控制机制,选择合适的路径向第二节点发送地址获取请求,以降低时延,提高通信效率;第二节点也可以基于拥塞控制机制,选择合适的路径向第一节点发送地址获取响应。举例来说,第一节点通过与第二节点之间的第一路径发送地址获取请求,其中,第一路径是第一节点根据与第二节点之间各路径的拥塞状态选取的,第一路径可以是第一节点和第二节点之间的所有路径中当前繁忙程度最低的路径。第二节点可通过与第一节点之间的第二路径发送对应的地址获取响应,该第二路径是第二节点根据与第一节点之间各路径的拥塞状态选取的,第二路径可以是第二节点和第一节点之间的所有路径中当前繁忙程度最低的路径。其中,第一路径和第二路径可能相同,也可能不同。
示例性的,如图7所示,第一节点在获取到第一目标地址的独占处理权限后,向第二节点发送第一地址获取请求,其中可携带第一目标地址和数据存取操作类型(本例子中该数据存取操作类型为写操作),用于请求获取与第一目标地址对应的缓存地址;第一节点 在获取到第二目标地址的独占处理权限后,可向第二节点发送第二地址获取请求,其中可携带第二目标地址和数据存取操作类型(本例子中该数据存取操作类型为写操作),用于请求获取与第二目标地址对应的缓存地址。第一地址获取请求和第二地址获取请求可以是回写(WriteBack)消息。第一节点向第二节点发送第一地址获取请求和第二地址获取请求的顺序不做限制,第一地址获取请求和第二地址获取请求所经过的路径可以是第一节点基于拥塞控制机制选取的路径。
第二节点接收到第一地址获取请求后,向第一节点发送第一地址获取响应(RSP3),用于指示第一目标地址对应的第二缓存地址;第二节点接收到第二地址获取请求后,向第一节点发送第二地址获取响应(RSP4),用于指示第二目标地址对应的第一缓存地址。第二节点向第一节点发送第一地址获取响应和第二地址获取响应的顺序不做限制,第一地址获取响应和第二地址获取响应所经过的路径可以是第二节点基于拥塞控制机制选取的路径。
第一节点获得第一目标地址对应的缓存地址,将待写数据1写入第一缓存地址后,可进一步向第二节点发送归还第一目标地址的独占处理权限的指示,第二节点可根据该指示重新获得第一目标地址的独占处理权限;第一节点获得第二目标地址对应的缓存地址,将待写数据2写入第二缓存地址后,可进一步向第二节点发送归还第二目标地址的独占处理权限的指示,第二节点可根据该指示重新获得第二目标地址的独占处理权限。通过该步骤,第一节点在完成数据存取操作后,可将对应地址的独占处理权限归还给第二节点,以便第二节点或其他节点对该地址进行数据存取操作。
可选的,在上述图6所示流程的S603之后,还可包括以下步骤:
S604:第一节点根据上述外部节点对执行结果的可见顺序的约束,向该外部节点发送对目标地址进行数据存取操作的执行结果。
示例性的,如图7所示,如果外部节点对执行结果具有严格顺序约束要求,其要求的执行结果可见顺序为:从前到后依次为第一消息对应的执行结果、第二消息对应的执行结果,则第一节点依次向该外部节点发送第一响应和第二响应,其中,第一响应是第一消息的响应消息,第二响应是第二消息的响应消息。
可选的,本申请的一些实施例中,在第一节点对消息序列中的一个消息(比如第一消息)对应的目标地址进行数据存取之前,若接收到针对该第一目标地址的独占处理权限获取请求(该请求用于请求获取第一目标地址的独占处理权限),则释放该第一目标地址的独占处理权限,以使得其他节点能够获取该第一目标地址的独占处理权限。其中,所述其他节点可以是第一节点所在系统内除第一节点以外的其他节点,该其他节点的优先级可能高于第一节点,或者该其他节点对第一目标地址的数据存取请求的优先级可能高于第一节点所接收到的消息序列中相应消息的优先级。
通过以上描述可以看出,通过将下游节点(第二节点)中消息序列中的消息(Req0-ReqN)对应的目标地址的独占处理权限(即E态)迁移到交界节点(第一节点),由于一个节点拥有消息序列所对应的目标地址的E态,即代表该节点针对该消息序列,具有对执行结果的可见顺序的处理能力,从而可以在上游节点完成执行结果的可见顺序的控制操作,从而保证消息序列的执行结果的可见顺序符合外部系统预期。本申请实施例中,在上游节点完成消息序列的执行结果的可见顺序控制操作,无需在上游节点和下游节点间进行交互,以及无需采用基于序列号(sequence number)的重排序(reorder)机制来保证执行结果的可 见顺序的可控性,从而可以减少系统开销。
此外,本申请实施例还提供了一种SoC芯片,该SoC芯片可包括:一个或多个处理器,以及一个或多个存储器;其中,所述一个或多个存储器存储有一个或多个计算机程序,所述一个或多个计算机程序包括指令,当所述指令被所述一个或多个处理器执行时,使得所述SoC芯片执行前述实施例提供的方法。
此外,本申请实施例还提供了一种计算机可读存储介质,所述计算机可读存储介质包括计算机程序,当计算机程序在计算设备上运行时,使得所述计算设备执行前述实施例提供的方法。
此外,本申请实施例还一种计算机程序产品,所述计算机程序产品在被计算机调用时,使得所述计算机执行前述实施例提供的方法。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、设备和方法,可以通过其它的方式实现。例如,以上所描述的设备实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,设备或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件程序实现时,可以全部或部分地以计算机程序产品的形式来实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或者数据中心通过有线(例如同轴电缆、光纤、数字用户线(Digital Subscriber Line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可 以是计算机能够存取的任何可用介质或者是包含一个或多个可以用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带),光介质(例如,DVD)、或者半导体介质(例如固态硬盘(Solid State Disk,SSD))等。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (28)

  1. 一种数据存取方法,应用于互联系统,所述互联系统包括至少两个采用横向扩展方式互联的节点,所述至少两个采用横向扩展方式互联的节点包括第一节点和第二节点,其特征在于,所述方法包括:
    所述第一节点接收来自于外部节点的消息序列,所述消息序列中的消息用于请求对目标地址进行数据存取,所述目标地址归属于所述第二节点管理的存储空间,所述外部节点为所述互联系统外部的节点,所述外部节点对执行结果的可见顺序的约束比第一节点严格;
    所述第一节点获取所述目标地址的独占处理权限;
    所述第一节点基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作,其中,所述数据存取操作的执行结果的可见顺序满足所述外部节点对执行结果的可见顺序的约束。
  2. 如权利要求1所述的方法,其特征在于,所述第一节点获取所述目标地址的独占处理权限,包括:
    所述第一节点响应于接收所述消息序列的操作,从所述第二节点获取所述目标地址的独占处理权限。
  3. 如权利要求2所述的方法,其特征在于,所述第一节点响应于接收所述消息序列的操作,从所述第二节点获取所述目标地址的独占处理权限,包括:
    所述第一节点接收所述消息序列后,向所述第二节点发送权限获取请求,所述权限获取请求携带所述消息序列中的至少一个消息对应的目标地址;
    所述第一节点接收来自于所述第二节点的权限获取响应,所述权限获取响应用于指示将所述至少一个消息对应的目标地址的独占处理权限迁移到所述第一节点。
  4. 如权利要求3所述的方法,其特征在于,所述第一节点接收所述消息序列后,向所述第二节点发送权限获取请求,包括:
    所述第一节点接收所述消息序列后,通过与所述第二节点之间的第一路径发送所述权限获取请求,其中,所述第一路径是所述第一节点根据与所述第二节点之间各路径的拥塞状态选取的;
    所述第一节点从与所述第二节点之间的第二路径接收所述权限获取响应,所述第二路径和所述第一路径相同或不同。
  5. 如权利要求1所述的方法,其特征在于,所述目标地址的独占处理权限,是所述第一节点在接收所述消息序列之前从所述第二节点获取的。
  6. 如权利要求5所述的方法,其特征在于,所述方法还包括:
    所述第一节点在接收所述消息序列之前,从所述第二节点获取指定地址范围的独占处理权限,所述指定地址范围包括所述消息序列中的消息对应的目标地址。
  7. 如权利要求6所述的方法,其特征在于,所述从所述第二节点获取指定地址范围的独占处理权限,包括:
    所述第一节点根据历史数据存取操作所对应的目标地址中,归属于所述第二节点管理的存储空间的地址,确定所述指定地址范围;
    所述第一节点从所述第二节点获取所述指定地址范围的独占处理权限。
  8. 如权利要求1-7任一项所述的方法,其特征在于,所述第一节点基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作,包括:
    所述第一节点从所述第二节点获取与所述目标地址对应的缓存地址;
    所述第一节点基于所述目标地址的独占处理权限,对所述目标地址对应的缓存地址进行数据存取操作。
  9. 如权利要求1-8任一项所述的方法,其特征在于,所述第一节点基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作之后,还包括:
    所述第一节点释放所述目标地址的独占处理权限。
  10. 如权利要求1-9任一项所述的方法,其特征在于,还包括:
    所述第一节点获取到所述目标地址的独占处理权限后,若在设定长时间内未对所述目标地址进行数据存取操作,则释放所述目标地址的独占处理权限。
  11. 如权利要求1-10任一项所述的方法,其特征在于,还包括:
    所述第一节点根据所述外部节点对执行结果的可见顺序的约束,向所述外部节点发送对所述目标地址进行数据存取操作的执行结果。
  12. 如权利要求1-11任一项所述的方法,其特征在于,所述第一节点和所述第二节点分别为片上系统SoC芯片,所述外部节点为输入输出I/O设备。
  13. 一种互联系统,所述互联系统包括至少两个采用横向扩展方式互联的节点,所述至少两个采用横向扩展方式互联的节点包括第一节点和第二节点,其特征在于:
    所述第一节点,用于:
    接收来自于外部节点的消息序列,所述消息序列中的消息用于请求对目标地址进行数据存取,所述目标地址归属于所述第二节点管理的存储空间,所述外部节点为所述互联系统外部的节点,所述外部节点对数据存取操作执行结果的可见顺序的约束比第一节点严格;
    获取所述目标地址的独占处理权限;
    基于所述目标地址的独占处理权限,对所述目标地址进行数据存取操作,其中,所述数据存取操作的执行结果的可见顺序满足所述外部节点对执行结果的可见顺序的约束。
  14. 如权利要求13所述的互联系统,其特征在于,所述第一节点,具体用于:
    响应于接收所述消息序列的操作,从所述第二节点获取所述目标地址的独占处理权限。
  15. 如权利要求14所述的互联系统,其特征在于:
    所述第一节点,具体用于接收所述消息序列后,向所述第二节点发送权限获取请求,所述权限获取请求携带所述消息序列中的至少一个消息对应的目标地址;
    所述第二节点,用于接收所述权限获取请求后,向所述第一节点发送权限获取响应,所述权限获取响应用于指示将所述至少一个消息对应的目标地址的独占处理权限迁移到所述第一节点。
  16. 如权利要求15所述的互联系统,其特征在于:
    所述第一节点,具体用于接收所述消息序列后,通过与所述第二节点之间的第一路径发送所述权限获取请求,其中,所述第一路径是所述第一节点根据与所述第二节点之间各路径的拥塞状态选取的;
    所述第二节点,具体用于从与所述第一节点之间的第二路径发送所述权限获取响应,所述第二路径和所述第一路径相同或不同。
  17. 如权利要求13所述的互联系统,其特征在于,所述目标地址的独占处理权限,是所述第一节点在接收所述消息序列之前从所述第二节点获取的。
  18. 如权利要求17所述的互联系统,其特征在于,所述第一节点,还用于:
    在接收所述消息序列之前,从所述第二节点获取对所述第二节点管理的存储空间内指定地址范围的地址进行数据存取的操作权限,其中,所述第一地址为所述指定地址范围内的地址。
  19. The interconnection system according to claim 18, wherein the first node is specifically configured to:
    determine the specified address range according to the addresses, among target addresses corresponding to historical data access operations, that belong to the storage space managed by the second node; and
    obtain the exclusive processing permission for the specified address range from the second node.
  20. The interconnection system according to any one of claims 13 to 19, wherein the first node is specifically configured to:
    obtain a cache address corresponding to the target address from the second node; and
    perform a data access operation on the cache address corresponding to the target address based on the exclusive processing permission for the target address.
  21. The interconnection system according to any one of claims 13 to 20, wherein the first node is further configured to:
    release the exclusive processing permission for the target address after performing the data access operation on the target address based on the exclusive processing permission for the target address.
  22. The interconnection system according to any one of claims 13 to 21, wherein the first node is further configured to:
    after obtaining the exclusive processing permission for the target address, release the exclusive processing permission for the target address if no data access operation is performed on the target address within a set period of time.
  23. The interconnection system according to any one of claims 13 to 22, wherein the first node is further configured to:
    send, to the external node, an execution result of the data access operation performed on the target address according to the constraint imposed by the external node on the visible order of execution results.
  24. The interconnection system according to any one of claims 13 to 23, wherein the second node is configured to:
    after the first node obtains exclusive processing permission for a first target address, regain the exclusive processing permission for the first target address if the first node does not return it within a set duration, wherein the first target address is the target address corresponding to a first message in the message sequence.
  25. The interconnection system according to any one of claims 13 to 24, wherein the first node and the second node are each a system-on-chip (SoC) chip, and the external node is an input/output (I/O) device.
  26. A system-on-chip (SoC) chip, comprising one or more processors and one or more memories, wherein the one or more memories store one or more computer programs, the one or more computer programs comprise instructions, and when the instructions are executed by the one or more processors, the SoC chip is caused to perform the method according to any one of claims 1 to 12.
  27. A computer-readable storage medium, wherein the computer-readable storage medium comprises a computer program, and when the computer program runs on a computing device, the computing device is caused to perform the method according to any one of claims 1 to 12.
  28. A computer program product, wherein when the computer program product is invoked by a computer, the computer is caused to perform the method according to any one of claims 1 to 12.
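
The flow recited in claims 1 to 3, 9 and 11 can be pictured with a small software model. The sketch below is illustrative only and is not part of the claims or the described embodiments: every identifier (Message, FirstNode, SecondNode, grant_exclusive, handle_sequence, and so on) is invented for this example, the real nodes are hardware such as SoC chips, and a real implementation would operate on a local cache copy of the data rather than on the peer node's memory directly. Under those simplifications, the first node processes the external node's message sequence strictly in order, obtains exclusive processing permission for each target address from the second node, performs the access, makes the result visible in request order, and then releases the permission.

// Minimal illustrative sketch (C++17); all names are invented, not an API of any real system.
#include <cstdint>
#include <map>
#include <set>
#include <vector>

struct Message {
    std::uint64_t target_address;  // belongs to the storage space managed by the second node
    bool          is_write;
    std::uint64_t write_value;
};

struct SecondNode {
    std::map<std::uint64_t, std::uint64_t> memory;    // storage space managed by the second node
    std::set<std::uint64_t>                migrated;  // addresses whose exclusive permission has moved away

    // Permission acquisition response: migrate exclusive processing permission to the requester.
    bool grant_exclusive(std::uint64_t addr) {
        if (migrated.count(addr)) return false;  // currently held elsewhere
        migrated.insert(addr);
        return true;
    }
    // Called when the holder releases the permission after its access.
    void take_back(std::uint64_t addr) { migrated.erase(addr); }
};

struct FirstNode {
    SecondNode* peer;

    // Handle one message sequence received from the external node.
    std::vector<std::uint64_t> handle_sequence(const std::vector<Message>& seq) {
        std::vector<std::uint64_t> results;
        for (const Message& m : seq) {  // processed strictly in arrival order
            // Acquire exclusive processing permission for the target address; in this
            // single-requester model the grant always succeeds, a real system would wait
            // here until the current holder returns the permission.
            while (!peer->grant_exclusive(m.target_address)) {
            }
            std::uint64_t& cell      = peer->memory[m.target_address];
            std::uint64_t  old_value = cell;
            if (m.is_write) cell = m.write_value;   // data access under exclusive permission

            results.push_back(old_value);           // results become visible in request order
            peer->take_back(m.target_address);      // release the permission after the access
        }
        return results;
    }
};

In this model the ordering guarantee comes simply from processing the sequence in one loop; holding the exclusive permission is what lets the first node do so without coordinating with the second node on every individual access.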
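
Claims 6 to 7 (and their system counterparts 18 to 19) add pre-acquisition of permission for a specified address range derived from historical accesses, and claims 10, 22 and 24 add release or reclamation of a permission that stays unused for a set length of time. The following sketch, under the same caveats as above (all names and data structures are invented simplifications, not part of the claims), shows one way such bookkeeping could look on the first node.

// Illustrative permission-lifetime bookkeeping (C++17); names are invented for this sketch.
#include <chrono>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

struct PermissionEntry {
    Clock::time_point last_used;
    bool              held = false;
};

class PermissionCache {  // kept by the first node
public:
    // Derive the specified address range from historical target addresses; the caller is
    // assumed to pass only addresses that belong to the second node's storage space.
    static std::pair<std::uint64_t, std::uint64_t>
    range_from_history(const std::vector<std::uint64_t>& history) {
        std::uint64_t lo = UINT64_MAX, hi = 0;   // degenerate if history is empty
        for (std::uint64_t a : history) {
            if (a < lo) lo = a;
            if (a > hi) hi = a;
        }
        return {lo, hi};
    }

    // Record that exclusive permission for an address was granted ahead of the message sequence.
    void on_granted(std::uint64_t addr) { table_[addr] = {Clock::now(), true}; }

    // Record a data access so the idle timer restarts.
    void on_access(std::uint64_t addr) { table_[addr].last_used = Clock::now(); }

    // Release permissions that saw no data access within the set duration; the returned
    // addresses would be reported back to the second node, which may also reclaim them itself.
    std::vector<std::uint64_t> release_idle(std::chrono::milliseconds idle_limit) {
        std::vector<std::uint64_t> released;
        const auto now = Clock::now();
        for (auto& [addr, entry] : table_) {
            if (entry.held && now - entry.last_used > idle_limit) {
                entry.held = false;
                released.push_back(addr);
            }
        }
        return released;
    }

private:
    std::map<std::uint64_t, PermissionEntry> table_;
};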
PCT/CN2021/094911 2021-05-20 2021-05-20 Data access method, interconnection system and device WO2022241718A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/094911 WO2022241718A1 (zh) 2021-05-20 2021-05-20 Data access method, interconnection system and device
CN202180098267.7A CN117355823A (zh) 2021-05-20 2021-05-20 Data access method, interconnection system and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/094911 WO2022241718A1 (zh) 2021-05-20 2021-05-20 Data access method, interconnection system and device

Publications (1)

Publication Number Publication Date
WO2022241718A1 true WO2022241718A1 (zh) 2022-11-24

Family

ID=84140111

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/094911 WO2022241718A1 (zh) 2021-05-20 2021-05-20 Data access method, interconnection system and device

Country Status (2)

Country Link
CN (1) CN117355823A (zh)
WO (1) WO2022241718A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090193192A1 (en) * 2008-01-29 2009-07-30 International Business Machines Corporation Method and Process for Expediting the Return of Line Exclusivity to a Given Processor Through Enhanced Inter-node Communications
CN104202391A (zh) * 2014-08-28 2014-12-10 浪潮(北京)电子信息产业有限公司 RDMA communication method between non-tightly-coupled systems sharing a system address space
CN105450555A (zh) * 2014-09-26 2016-03-30 杭州华为数字技术有限公司 Network-on-chip system and method for establishing a network-on-chip communication link
CN106815318A (zh) * 2016-12-24 2017-06-09 上海七牛信息技术有限公司 Clustering method and system for a time-series database

Also Published As

Publication number Publication date
CN117355823A (zh) 2024-01-05

Similar Documents

Publication Publication Date Title
EP3028162B1 (en) Direct access to persistent memory of shared storage
TWI318737B (en) Method and apparatus for predicting early write-back of owned cache blocks, and multiprocessor computer system
US7743191B1 (en) On-chip shared memory based device architecture
US20110004732A1 (en) DMA in Distributed Shared Memory System
TWI547870B (zh) Method and system for ordering I/O accesses in a multi-node environment
US20150253997A1 (en) Method and Apparatus for Memory Allocation in a Multi-Node System
US9244877B2 (en) Link layer virtualization in SATA controller
TW201543218A (zh) Chip device and method for a multi-core network processor interconnect with multi-node connections
US9372800B2 (en) Inter-chip interconnect protocol for a multi-chip system
CN114756388B (zh) RDMA-based method for on-demand memory sharing between nodes of a cluster system
US11741034B2 (en) Memory device including direct memory access engine, system including the memory device, and method of operating the memory device
CN110119304B (zh) Interrupt processing method, apparatus and server
JP2004213435A (ja) Storage device system
WO2012140670A2 (en) Multi-host sata controller
JP2021522608A (ja) Data processing network with flow compression for streaming data transfer
WO2022241718A1 (zh) Data access method, interconnection system and device
CN114356839B (zh) Method, device, processor and device-readable storage medium for processing write operations
KR20050080704A (ko) Apparatus and method for data transfer between processors
US10084724B2 (en) Technologies for transactional synchronization of distributed objects in a fabric architecture
US20230385220A1 (en) Multi-port memory link expander to share data among hosts
CN115495433A (zh) Distributed storage system, data migration method and storage apparatus
WO2022205130A1 (zh) Read/write operation execution method and SoC chip
US20110191638A1 (en) Parallel computer system and method for controlling parallel computer system
US20240045588A1 (en) Hybrid memory system and accelerator including the same
US20230315636A1 (en) Multiprocessor system cache management with non-authority designation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21940173

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180098267.7

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21940173

Country of ref document: EP

Kind code of ref document: A1