WO2021097802A1 - Method, buffer, and node for processing non-cached write data requests - Google Patents

Method, buffer, and node for processing non-cached write data requests

Info

Publication number
WO2021097802A1
Authority
WO
WIPO (PCT)
Prior art keywords
buffer
data
node
address
write data
Prior art date
Application number
PCT/CN2019/120252
Other languages
English (en)
French (fr)
Inventor
吴文宇
夏晶
信恒超
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2019/120252 (WO2021097802A1)
Priority to CN201980102202.8A (CN114731282B)
Priority to EP19953302.7A (EP4054140A4)
Publication of WO2021097802A1
Priority to US17/749,612 (US11789866B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0611Improving I/O performance in relation to response time
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • G06F12/0817Cache consistency protocols using directory methods
    • G06F12/0824Distributed directories, e.g. linked lists of caches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0888Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using selective caching, e.g. bypass
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656Data buffering arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/084Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0868Data transfer between cache memory and other subsystems, e.g. storage devices or host systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1016Performance improvement
    • G06F2212/1024Latency reduction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1048Scalability
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/15Use in a specific computing environment
    • G06F2212/154Networked environment
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/31Providing disk cache in a specific location of a storage system
    • G06F2212/311In host system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/50Control mechanisms for virtual memory, cache or TLB
    • G06F2212/507Control mechanisms for virtual memory, cache or TLB using speculative control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60Details of cache memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/62Details of cache specific to multiprocessor cache arrangements
    • G06F2212/621Coherency control relating to peripheral accessing, e.g. from DMA or I/O device

Definitions

  • This application relates to the field of buffers, and in particular to a method, a buffer, and a node for processing non-cached write data requests.
  • Each data write request requires interaction and a handshake to complete.
  • Typical application scenarios include remote direct memory access (RDMA) operations, for example high-bit-width operations on the same address and low-bit-width operations on the same address.
  • A complete data write operation includes the following steps:
  • The CPU sends a non-cached write data request to the node through the buffer, and the node writes the data corresponding to the non-cached write data request.
  • The data can also be sent on to the next child node.
  • The node may be a bus node or a controller.
  • The controller may be a storage controller, a universal serial bus (USB) controller, or the like. The node can form a storage system together with its child nodes.
  • After the node receives the first non-cached write data request, the node allocates a data buffer ID (DBID) to the CPU and sends the DBID to the CPU through the buffer. After receiving the DBID, the CPU sends the first data to be written to the node through the buffer. The node determines the data buffer corresponding to the DBID and writes the first data into that data buffer. After completing the write, the node sends a completion (COMP) response to the CPU through the buffer.
  • In this process, the buffer must receive the DBID from the node before it can request from the CPU, according to the DBID, the first data corresponding to the first non-cached write data request. Only after receiving the first data does the buffer send it to the node, so data writing is inefficient.
  • This application provides a method and a buffer for processing non-cached write data requests.
  • The buffer can obtain the first data from the processor without waiting to receive the first data buffer number from the node. Once the first data buffer number is received from the node, the first data can be sent to the node immediately, which speeds up the data writing process.
  • The first aspect of the present application provides a method for processing a non-cached write data request.
  • The method includes: a buffer receives a first non-cached write data request from a first processor and sends the first non-cached write data request to a node, where the first non-cached write data request includes a first address and is used to apply to the node for a first data buffer number corresponding to the first address; if the buffer determines that the first address is stored in the buffer, the buffer obtains, from the first processor, first data corresponding to the first non-cached write data request; when the buffer receives the first data buffer number from the node, the buffer sends the first data to the node.
  • After the buffer receives the first non-cached write data request, if it determines that the first address is stored locally, it can immediately obtain the first data from the first processor, instead of waiting to receive the first data buffer number allocated by the node before requesting the first data from the first processor. Once the buffer receives the first data buffer number, it can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the data writing process.
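  • The ordering described above can be sketched as a small simulation. This is only an illustrative model, not the application's implementation: the class names `Processor`, `Node`, and `Buffer`, the method names, and the log strings are all hypothetical, and the node's reply is modelled as a synchronous call rather than a message that arrives later.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical processor holding the data it wants to write, keyed by address."""
    pending: dict

    def send_data(self, address):
        return self.pending[address]


@dataclass
class Node:
    """Hypothetical node that allocates data buffer numbers and stores written data."""
    next_number: int = 0
    data_buffers: dict = field(default_factory=dict)

    def allocate(self, address):
        number = self.next_number
        self.next_number += 1
        return number

    def write(self, number, data):
        self.data_buffers[number] = data


@dataclass
class Buffer:
    """Hypothetical buffer sitting between the processor and the node."""
    stored_addresses: set
    log: list = field(default_factory=list)

    def handle_write(self, processor, node, address):
        # The request is forwarded first; the node's answer is consumed further down.
        self.log.append("forward request to node")
        data = None
        if address in self.stored_addresses:
            # Address is stored locally: fetch the data before the number arrives.
            data = processor.send_data(address)
            self.log.append("fetch data from processor (early)")
        number = node.allocate(address)
        self.log.append("receive data buffer number")
        if data is None:
            # Fallback to the conventional order: fetch only once the number is known.
            data = processor.send_data(address)
            self.log.append("fetch data from processor (late)")
        node.write(number, data)
        self.log.append("send data to node")
        return number


cpu = Processor(pending={0x1000: b"payload"})
node = Node()
buf = Buffer(stored_addresses={0x1000})
number = buf.handle_write(cpu, node, 0x1000)

miss_buf = Buffer(stored_addresses=set())
miss_buf.handle_write(cpu, Node(), 0x1000)
```

  • In this sketch, `buf.log` records the early fetch ahead of the data buffer number, while `miss_buf.log` shows the conventional order for an address that is not stored in the buffer.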
  • The buffer obtaining the first data corresponding to the first non-cached write data request from the first processor includes: the buffer sends a second data buffer number to the first processor to instruct the first processor to send the first data to the buffer, where the second data buffer number is allocated by the buffer to the first processor according to the first non-cached write data request; and the buffer receives the first data from the first processor.
  • Since the first processor is not aware of which device assigned a data buffer number, the buffer can assign the second data buffer number to the first processor and instruct the first processor to send the first data. In this way, the first processor can send the first data in advance, which speeds up the data writing process.
  • The method further includes: the buffer determines that the first address stored in the buffer is marked with an exclusive mark; the buffer sends a first completion response to the first processor to instruct the first processor to release the buffer area occupied by the first data.
  • the buffer may send a first completion response to the first processor while sending the second data buffer number to the first processor.
  • the first processor may immediately release the buffer area occupied by the first data after sending the first data. The pressure on the first processor can be relieved. If the first processor has a continuous data write request, it can immediately send the next non-cached data write request after receiving the first completion response. This can improve the efficiency of writing data.
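  • A minimal sketch of the early completion response follows, under stated assumptions: `TagEntry`, `Buffer.respond`, and `Processor.write` are hypothetical names, and the completion response is modelled as a boolean returned together with the second data buffer number. It only illustrates that the processor can release its staged copy right after sending the data when the address carries an exclusive mark.

```python
from dataclasses import dataclass, field


@dataclass
class TagEntry:
    """Hypothetical tag entry for an address stored in the buffer."""
    exclusive: bool = False


@dataclass
class Buffer:
    tags: dict = field(default_factory=dict)  # address -> TagEntry
    next_local_number: int = 0

    def respond(self, address):
        """Return (second data buffer number, early completion) or None on a miss."""
        entry = self.tags.get(address)
        if entry is None:
            return None  # address not stored: the conventional flow would apply
        number = self.next_local_number
        self.next_local_number += 1
        # The completion response accompanies the number only for exclusive addresses.
        return number, entry.exclusive


@dataclass
class Processor:
    staged: dict = field(default_factory=dict)  # address -> data awaiting completion

    def write(self, buffer, address, data):
        self.staged[address] = data
        reply = buffer.respond(address)
        if reply is None:
            return None
        number, completed = reply
        payload = self.staged[address]  # data sent to the buffer
        if completed:
            # Completion already received: release the staging area immediately.
            del self.staged[address]
        return number, payload


buf = Buffer(tags={0x2000: TagEntry(exclusive=True), 0x3000: TagEntry()})
cpu = Processor()
first = cpu.write(buf, 0x2000, b"a")
second = cpu.write(buf, 0x3000, b"b")
```

  • After both calls, only the write to the address without an exclusive mark still occupies `cpu.staged`; in the described method that entry would be released later, when a completion response arrives from the node.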
  • Before the buffer receives the first non-cached write data request from the first processor, the method further includes: the buffer receives a second non-cached write data request from a second processor and sends the second non-cached write data request to the node, where the second non-cached write data request includes the first address, and the first address is used by the node to determine whether the first address stored in the buffer is marked with an exclusive mark; the buffer receives a third data buffer number and a second completion response from the node, where the second completion response includes an exclusive mark; if the first address stored in the buffer is not marked with an exclusive mark, the buffer marks the first address with the exclusive mark. When the first address stored in the buffer is not marked with the exclusive mark, the buffer may request the exclusive mark from the node and mark the first address with it. After the buffer marks the first address with the exclusive mark, the buffer can directly send a completion response to the processor.
  • The method further includes: the buffer receives a delete instruction from the node; and the buffer deletes the exclusive mark of the first address and sends a delete confirmation instruction to the node.
  • The buffer may delete the exclusive mark of the first address and send a delete confirmation instruction to the node, and the node may delete the correspondence between the buffer and the first address according to the delete confirmation instruction.
  • the node can then record the correspondence between other buffers and the first address, which can increase the diversity of the scheme.
  • A second aspect of the present application provides a method for processing a non-cached write data request. The method includes: a node receives a first non-cached write data request from a first buffer, where the first non-cached write data request includes a first address; the node allocates a first data buffer number to a first processor according to the first non-cached write data request and sends the first data buffer number to the first buffer; when the node receives, from the first buffer, first data corresponding to the first non-cached write data request, the node writes the first data into the data buffer corresponding to the first data buffer number, where the first data is obtained by the first buffer from the first processor when the first buffer determines that the first address is stored in the first buffer.
  • After the first buffer receives the first non-cached write data request, if it determines that the first address is stored locally, it can immediately obtain the first data from the first processor, without waiting for the first data buffer number allocated by the node before requesting the first data from the first processor. When the first buffer receives the first data buffer number, it can directly send the first data to the node. After receiving the first data, the node can write the first data into the data buffer corresponding to the first data buffer number. This improves the efficiency of data writing and speeds up the data writing process.
  • The method further includes: the node determines the correspondence between the number of the first buffer and the first address, where the correspondence indicates that the first address stored in the first buffer is marked with an exclusive mark.
  • The first buffer can assign a second data buffer number to the first processor and instruct the first processor to send the first data, so that the first processor sends the first data in advance, which speeds up the data writing process.
  • Before the node receives the first non-cached write data request from the first buffer, the method further includes: the node receives a second non-cached write data request from the first buffer, where the second non-cached write data request includes the first address; the node stores the correspondence between the number of the first buffer and the first address; the node allocates a third data buffer number according to the second non-cached write data request and sends the third data buffer number and a second completion response to the first buffer, where the second completion response includes an exclusive mark.
  • the node may send the second completion response containing the exclusive mark to the first buffer, and the first buffer may mark the first address stored locally with the exclusive mark after receiving the second completion response.
  • the first buffer may send a first completion response to the first processor while sending the second data buffer number to the first processor.
  • the first processor may immediately release the buffer area occupied by the first data after sending the first data. The pressure on the first processor can be relieved. If the first processor has a continuous data write request, it can immediately send the next non-cached data write request after receiving the first completion response. This can improve the efficiency of writing data.
  • The method further includes: the node receives a third non-cached write data request from a second buffer, where the third non-cached write data request includes the first address and the second buffer is different from the first buffer; if the node determines that it stores the correspondence between the number of the first buffer and the first address, the node sends a delete instruction to the first buffer, where the delete instruction instructs the first buffer to delete the exclusive mark of the first address and to send a delete confirmation instruction to the node; the node receives the delete confirmation instruction from the first buffer; the node deletes the correspondence between the number of the first buffer and the first address according to the delete confirmation instruction.
  • The node may delete the correspondence between the number of the first buffer and the first address and then record the correspondence between the number of the second buffer and the first address, which can increase the diversity of the scheme.
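  • The node-side bookkeeping for the exclusive mark can be sketched as follows. This is a simplified model under assumptions that are not stated in the text: the names `Buffer`, `Node`, `owner`, `on_request`, `on_completion`, and `on_delete` are hypothetical, messages are modelled as direct method calls, and the completion response sent to the newly recorded buffer is assumed to carry the exclusive mark.

```python
class Buffer:
    """Hypothetical buffer that tracks which stored addresses carry an exclusive mark."""

    def __init__(self, number):
        self.number = number
        self.exclusive = set()

    def on_completion(self, address, exclusive_mark):
        # Completion response from the node; mark the address if the mark is included.
        if exclusive_mark:
            self.exclusive.add(address)

    def on_delete(self, address):
        # Delete instruction from the node: drop the mark and confirm.
        self.exclusive.discard(address)
        return True  # stands in for the delete confirmation instruction


class Node:
    """Hypothetical node that records which buffer number corresponds to an address."""

    def __init__(self, buffers):
        self.buffers = {b.number: b for b in buffers}
        self.owner = {}  # address -> number of the buffer holding the exclusive mark

    def on_request(self, buffer_number, address):
        current = self.owner.get(address)
        if current is not None and current != buffer_number:
            # Another buffer is recorded: ask it to delete its mark, then forget it.
            if self.buffers[current].on_delete(address):
                del self.owner[address]
        if address not in self.owner:
            # Record the requester and grant the mark (assumed behaviour).
            self.owner[address] = buffer_number
            self.buffers[buffer_number].on_completion(address, exclusive_mark=True)


b0, b1 = Buffer(0), Buffer(1)
node = Node([b0, b1])
node.on_request(0, 0x40)
marked_after_first = 0x40 in b0.exclusive
node.on_request(1, 0x40)
```

  • After the second request, the sketch leaves the correspondence pointing at buffer 1 and buffer 0 without the mark, which mirrors the delete instruction and delete confirmation exchange described above.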
  • A third aspect of the present application provides a buffer, which includes: a management module, configured to receive a first non-cached write data request from a first processor and send the first non-cached write data request to a node, where the first non-cached write data request includes a first address and is used to apply to the node for a first data buffer number corresponding to the first address; a data processing module, configured to obtain, from the first processor, first data corresponding to the first non-cached write data request when a tag storage module determines that the first address is stored; the data processing module is further configured to send the first data to the node when a data buffer number receiving module receives the first data buffer number.
  • After the management module receives the first non-cached write data request, if the tag storage module determines that the first address is stored locally, the data processing module can immediately obtain the first data from the first processor. After the data buffer number receiving module receives the first data buffer number, the data processing module can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the data writing process.
  • The management module is further configured to send a second data buffer number to the first processor to instruct the first processor to send the first data to the buffer; the data processing module is further configured to receive the first data from the first processor.
  • The tag storage module is further configured to determine that the first address is marked with an exclusive mark; the management module is further configured to send a first completion response to the first processor to instruct the first processor to release the buffer area occupied by the first data.
  • The management module is further configured to receive a second non-cached write data request from a second processor and send the second non-cached write data request to the node, where the second non-cached write data request includes the first address, and the first address is used by the node to determine whether the first address stored in the buffer is marked with an exclusive mark; the data buffer number receiving module is configured to receive a third data buffer number from the node; the tag storage module is configured to receive a second completion response from the node, where the second completion response includes an exclusive mark; the tag storage module is further configured to mark the first address with the exclusive mark when the first address stored in the tag storage module does not have an exclusive mark.
  • The tag storage module is further configured to receive a delete instruction from the node; the tag storage module is further configured to delete the exclusive mark of the first address and send a delete confirmation instruction to the node.
  • A fourth aspect of the present application provides a node, which includes: a request processing module, configured to receive a first non-cached write data request from a first buffer, where the first non-cached write data request includes a first address; a data buffer module, configured to allocate a first data buffer number to a first processor according to the first non-cached write data request received by the request processing module, and to instruct a management module to send the first data buffer number to the first buffer; the data buffer module is further configured to write first data into the data buffer corresponding to the first data buffer number when a data processing module receives the first data corresponding to the first non-cached write data request, where the first data is obtained by the first buffer from the first processor when the first buffer determines that the first address is stored in the first buffer.
  • After the first buffer receives the first non-cached write data request, if it determines that the first address is stored locally, it can immediately obtain the first data from the first processor, without waiting for the first data buffer number allocated by the node before requesting the first data from the first processor. When the first buffer receives the first data buffer number, it can directly send the first data to the node. After receiving the first data, the node can write the first data into the data buffer corresponding to the first data buffer number. This improves the efficiency of data writing and speeds up the data writing process.
  • The management module is further configured to determine the correspondence between the number of the first buffer and the first address, where the correspondence indicates that the first address stored in the first buffer is marked with an exclusive mark.
  • The request processing module is further configured to receive a second non-cached write data request from the first buffer, where the second non-cached write data request includes the first address; the management module is further configured to store the correspondence between the number of the first buffer and the first address; the data buffer module is further configured to allocate a third data buffer number according to the second non-cached write data request received by the request processing module; the management module is further configured to send the third data buffer number and a second completion response to the first buffer, where the second completion response includes an exclusive mark.
  • The request processing module is further configured to receive a third non-cached write data request from a second buffer, where the third non-cached write data request includes the first address and the second buffer is different from the first buffer; the management module is further configured to send a delete instruction to the first buffer when it determines that the correspondence between the number of the first buffer and the first address is stored, where the delete instruction instructs the first buffer to delete the exclusive mark of the first address and to send a delete confirmation instruction to the node; the management module is further configured to receive the delete confirmation instruction from the first buffer; the management module is further configured to delete the stored correspondence between the number of the first buffer and the first address according to the delete confirmation instruction.
  • This application provides a method for processing a non-cached write data request.
  • The method includes: a buffer receives a first non-cached write data request from a first processor and sends the first non-cached write data request to a node, where the first non-cached write data request includes a first address and is used to apply to the node for a first data buffer number corresponding to the first address; if the buffer determines that the first address is stored in the buffer, the buffer obtains, from the first processor, first data corresponding to the first non-cached write data request; when the buffer receives the first data buffer number from the node, the buffer sends the first data to the node.
  • After the buffer receives the first non-cached write data request, if it determines that the first address is stored locally, it can immediately obtain the first data from the processor, instead of waiting to receive the first data buffer number before requesting the first data from the processor. Once the buffer receives the first data buffer number, it can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the data writing process.
  • FIG. 1 is a schematic diagram of an embodiment of an information processing system provided by this application;
  • FIG. 2 is a schematic diagram of an embodiment of a method for processing non-cached write data requests provided by this application;
  • FIG. 3 is a schematic diagram of an embodiment of a method for processing non-cached write data requests provided by this application;
  • FIG. 4 is a schematic diagram of an embodiment of a method for processing non-cached write data requests provided by this application;
  • FIG. 5 is a schematic diagram of an embodiment of a method for processing non-cached write data requests provided by this application;
  • FIG. 6 is a schematic diagram of an embodiment of a buffer provided by this application;
  • FIG. 7 is a schematic diagram of an embodiment of a node provided by this application.
  • Each non-cached data write requires interaction and a handshake to complete.
  • Typical application scenarios include high-bit-width operations on the same address, for example work queue element (WQE) operations, and low-bit-width sequential operations on the same address, for example doorbell tasks.
  • FIG. 1 shows an information processing system provided by this application.
  • The system includes multiple buffers, such as the buffer 200, the buffer 210, ... in FIG. 1.
  • Each buffer has a different number, and each buffer manages one or more processors; for example, the buffer 200 manages the processor 101, the processor 102, ..., the processor 10N.
  • The N processors can be connected through the system bus. All N processors managed by the buffer 200 can interact with the buffer 200.
  • When a processor managed by the buffer has data to write, the processor sends a first non-cached write data request in step 201.
  • After the buffer receives the first non-cached write data request, the buffer forwards it in step 202.
  • After the node receives the first non-cached write data request, it allocates a first data buffer number according to the request.
  • The node then sends the first data buffer number.
  • After the buffer receives the first data buffer number, the buffer forwards it in step 204.
  • After the processor receives the first data buffer number, the processor sends the first data in step 205.
  • After the buffer receives the first data, the buffer forwards the first data in step 206.
  • In step 207, the node writes the first data into the node.
  • In step 208, the node sends a first completion response.
  • After the buffer receives the first completion response, it forwards the first completion response in step 209.
  • After the processor receives the first completion response, it releases the buffer area occupied by the first data in step 210.
  • In this flow, the buffer must receive the first data buffer number from the node before it can request from the processor, according to that number, the first data corresponding to the first non-cached write data request. Only after receiving the first data does the buffer send it to the node, so data writing is inefficient.
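  • The cost of this serialization can be illustrated with a small worked calculation. It rests on an assumption that is not in the text: every message hop (processor to buffer, buffer to node, and the reverse directions) takes the same unit latency `HOP`, and allocation and write times are ignored. The function names are hypothetical. The second function models the approach proposed in this application, in which the buffer fetches the data from the processor while the request to the node is still in flight.

```python
HOP = 1  # assumed uniform latency of one message hop (illustrative unit)


def baseline_latency(hop=HOP):
    """Hops until the first data reaches the node in the serialized flow."""
    request = 2 * hop  # processor -> buffer -> node
    number = 2 * hop   # node -> buffer -> processor (data buffer number)
    data = 2 * hop     # processor -> buffer -> node (first data)
    return request + number + data


def early_fetch_latency(hop=HOP):
    """Hops until the first data reaches the node when the buffer fetches early."""
    request_to_buffer = hop
    number_round_trip = 2 * hop  # buffer -> node -> buffer
    data_round_trip = 2 * hop    # buffer -> processor -> buffer
    data_to_node = hop
    # The two round trips overlap, so only the slower one is on the critical path.
    return request_to_buffer + max(number_round_trip, data_round_trip) + data_to_node


baseline = baseline_latency()
early = early_fetch_latency()
```

  • Under these assumptions the serialized flow needs 6 hops before the node can write the data and the overlapped flow needs 4; real latencies depend on the interconnect and are not given in the text.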
  • In Figure 1, the N processors managed by any cache are located on the same chip as that cache, and the non-cached write data requests sent by any of the N processors managed by the same cache have the same source address.
  • The cache and the node may be on the same chip or on different chips.
  • When the cache and the node are not on the same chip, protocol adapters (PA) may also exist between the cache and the node.
  • When the cache receives a non-cached write data request from any processor, the cache first sends the request to the source PA located on the same chip as the cache; the source PA forwards the request to the destination PA located on the same chip as the node, and the destination PA then forwards the request to the node.
  • A non-cached write data request includes the attributes of the data to be written, the destination address to be written to, and other basic information.
  • The non-cached write data request is needed while the node writes the data.
  • An embodiment of this application provides a method for processing a non-cached write data request. Referring to Figure 3, the method includes the following steps.
  • 301. The first processor sends a first non-cached write data request to the cache.
  • When the first processor needs to write data, it sends a first non-cached write data request to the cache.
  • The first processor is any one of the processors managed by the cache.
  • The cache receives the first non-cached write data request from the first processor.
  • The first non-cached write data request includes a first address, and the first address points to a data buffer area of the node.
  • The data buffer area includes multiple data buffers, and each data buffer has a data buffer number.
  • The first non-cached write data request is used to apply to the node for one data buffer in the data buffer area.
  • 302. The cache forwards the first non-cached write data request to the node.
  • The cache forwards the first non-cached write data request to the node.
  • 303. The cache determines that it stores the first address marked with an exclusive mark.
  • The cache determines that the first address is stored locally and that the first address is marked as exclusive.
  • The MESI protocol defines the modified, exclusive, shared, and invalid states. This solution borrows from the MESI protocol: here, a cache line in the cache has only two states for the first address in the node, exclusive or invalid. When the state is exclusive, the data in the cache is clean, that is, consistent with the data in the node.
  • Only after the cache receives the exclusive mark from the node does the state of the cache line in the cache for the first address become exclusive.
  • The first address can be held exclusively by the cache line of only one cache at a time.
  • When the cache determines that it stores the first address marked with an exclusive mark and it receives a non-cached write data request containing the first address from any processor, the cache can directly send a data buffer number and a completion response to that processor, so that the processor can release the buffer occupied by the data immediately after sending the data corresponding to the write data request.
  • 304. The cache sends a second data buffer number and a first completion response to the first processor.
  • After the cache determines in step 303 that the first address is stored locally, the cache sends a second data buffer number to the first processor; the second data buffer number is allocated by the cache to the first processor.
  • The second data buffer number is used to instruct the first processor to send the first data corresponding to the first non-cached write data request to the cache.
  • The cache may send the first completion response to the first processor at the same time as the second data buffer number. The first completion response is used to instruct the first processor to release the buffer occupied by the first data after sending the first data to the cache.
  • The second data buffer number is allocated by the cache; the processor is not aware of which device allocates a data buffer number.
  • 305. The first processor sends the first data to the cache.
  • After the first processor receives the second data buffer number sent by the cache in step 304, the first processor sends the first data to the cache.
  • 306. The first processor releases the buffer occupied by the first data.
  • The first processor can release the buffer occupied by the first data immediately after sending the first data to the cache in step 305.
  • 307. The node determines that it stores the correspondence between the number of the cache and the first address.
  • The node checks its local directory for the correspondence between the number of the cache and the first address. When the node determines that this correspondence is stored in the directory, the node can determine that the first address stored in the cache is marked with an exclusive mark.
  • Step 307 comes after step 302, but there is no chronological order between step 307 and steps 303 to 306.
  • 308. The node sends the first data buffer number to the cache.
  • After determining that the correspondence between the number of the cache and the first address is stored in the directory, the node allocates the first data buffer number to the first processor according to the first non-cached write data request sent by the cache in step 302, and sends the first data buffer number to the cache.
  • While sending the first data buffer number to the cache, the node may also send a completion response to the cache, and the completion response may carry the exclusive mark.
  • 309. The cache sends the first data to the node.
  • When the cache receives the first data buffer number from the node, the cache sends the first data received from the first processor to the node.
  • 310. The node writes the first data into the node according to the first data buffer number.
  • The node writes the first data into the node according to the first data buffer number. Specifically, the node determines the data buffer in the node that corresponds to the first data buffer number, and then writes the first data into that data buffer.
  • 311. The node instructs the cache to delete the exclusive mark of the first address.
  • The node may instruct the cache to delete the exclusive mark of the first address.
  • For details, refer to Figure 4. The first cache described in steps 3111 to 3115 is the cache of Figure 3 and steps 301 to 311.
  • A specific implementation of the node instructing the cache to delete the locally recorded exclusive mark may be as follows:
  • 3111. The node receives a third non-cached write data request from a second cache, and the third non-cached write data request includes the first address.
  • The node receives the third non-cached write data request from a second cache, which is different from the first cache.
  • The number of the second cache is different from the number of the first cache.
  • The third non-cached write data request includes the first address.
  • 3112. The node sends a delete instruction to the first cache.
  • When the node determines that the correspondence between the number of the first cache and the first address is stored in the directory, the node sends a delete instruction to the first cache.
  • The delete instruction is used to instruct the first cache to delete the exclusive mark of the first address.
  • 3113. The first cache locally deletes the exclusive mark of the first address.
  • The first cache deletes the exclusive mark of the first address.
  • 3114. The first cache sends a delete confirmation instruction to the node.
  • The first cache sends a delete confirmation instruction to the node, where the delete confirmation instruction is used to instruct the node to delete the correspondence, stored in the node, between the number of the first cache and the first address.
  • 3115. The node deletes the correspondence between the number of the first cache and the first address from the directory.
  • After receiving the delete confirmation instruction, the node deletes the correspondence between the number of the first cache and the first address from its local directory, and then records the correspondence between the number of the second cache and the first address.
  • The node may also send a data buffer number and a completion response to the second cache, where the completion response is used to instruct the second cache to mark the locally stored first address with an exclusive mark.
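The ownership handoff in steps 3111 to 3115 can be sketched as a small directory model. This is a minimal illustration under stated assumptions, not the application's implementation: the class names `CacheStub` and `Directory`, the method names, and the use of a boolean return value as the delete confirmation are all hypothetical.

```python
class CacheStub:
    """Stand-in for a cache: remembers which addresses carry the exclusive mark."""

    def __init__(self):
        self.exclusive = set()

    def delete_exclusive_mark(self, address):
        # Step 3113: delete the mark locally.
        self.exclusive.discard(address)
        # Step 3114: the return value plays the role of the delete confirmation.
        return True


class Directory:
    """Node-side directory: address -> number of the cache recorded for it."""

    def __init__(self, caches):
        self.caches = caches  # cache number -> CacheStub
        self.owner = {}

    def on_request(self, cache_number, address):
        current = self.owner.get(address)
        if current is not None and current != cache_number:
            # Step 3112: instruct the currently recorded cache to delete its mark.
            confirmed = self.caches[current].delete_exclusive_mark(address)
            if not confirmed:
                raise RuntimeError("no delete confirmation received")
            # Step 3115: delete the old correspondence.
            del self.owner[address]
        # Record the requesting cache; the completion response carries the mark.
        self.owner[address] = cache_number
        self.caches[cache_number].exclusive.add(address)


caches = {1: CacheStub(), 2: CacheStub()}
directory = Directory(caches)
directory.on_request(1, 0x1000)  # first cache obtains the exclusive mark
directory.on_request(2, 0x1000)  # second cache requests the same address
print(directory.owner[0x1000], 0x1000 in caches[1].exclusive)
```

The sketch keeps at most one recorded cache per address, which mirrors the statement that the first address can be held exclusively by only one cache at a time.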
  • The embodiment of this application provides a method for processing a non-cached write data request. After the cache receives the first non-cached write data request, if it determines that the first address is stored locally, it can immediately obtain the first data from the processor, without waiting to receive the first data buffer number before requesting the first data from the processor. After the cache receives the first data buffer number, it can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the write process.
  • When the cache determines that the locally stored first address is marked with an exclusive mark, the cache sends a first completion response to the first processor, and the first processor can release the buffer occupied by the first data immediately after sending the first data, which relieves pressure on the first processor. If the first processor has consecutive data to write, it can issue the next non-cached write data request immediately after sending the first data, which speeds up the write process.
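Because step 307 and step 308 have no fixed order relative to steps 303 to 306, the cache may receive the first data and the node's first data buffer number in either order. The following minimal sketch shows one way to hold whichever arrives first and forward the data as soon as both are present. The class `WritePipeline` and its method names are assumptions made for illustration only.

```python
class WritePipeline:
    """Cache-side bookkeeping for one in-flight write (hypothetical model)."""

    def __init__(self):
        self.data = None          # first data obtained early from the processor
        self.node_number = None   # data buffer number allocated by the node
        self.sent = None          # what was forwarded to the node, if anything

    def on_data_from_processor(self, data):
        self.data = data
        return self._try_send()

    def on_number_from_node(self, number):
        self.node_number = number
        return self._try_send()

    def _try_send(self):
        # Forward exactly once, and only when both pieces are available.
        ready = self.data is not None and self.node_number is not None
        if ready and self.sent is None:
            self.sent = (self.node_number, self.data)
            return self.sent
        return None


pipeline = WritePipeline()
pipeline.on_data_from_processor(b"payload")  # data arrives first, nothing sent yet
print(pipeline.on_number_from_node(7))       # number arrives, data is forwarded
```

In the baseline flow the data request only starts after the node's number arrives; here the data is typically already waiting in the cache at that point.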
  • Before the cache can determine that it locally stores the first address marked as exclusive, the cache first needs to obtain the exclusive mark from the node and mark the locally stored first address with it.
  • The method further includes:
  • 401. The second processor sends a second non-cached write data request to the cache.
  • The second processor sends a second non-cached write data request to the cache, where the second non-cached write data request includes the first address.
  • The second processor and the foregoing first processor may be the same processor or different processors. If they are different, both are processors managed by the cache.
  • When the cache receives a non-cached write data request including the first address for the first time, the first address is stored locally.
  • 402. The cache sends the second non-cached write data request to the node.
  • The cache sends the second non-cached write data request to the node.
  • 403. The cache determines that the locally stored first address does not have an exclusive mark.
  • The cache determines that the locally stored first address is not marked as exclusive.
  • 404. The cache sends a fourth data buffer number to the second processor.
  • The cache sends a fourth data buffer number to the second processor, where the fourth data buffer number is allocated by the cache to the second processor for the second non-cached write data request.
  • Because the cache determines in step 403 that the locally stored first address is not marked with an exclusive mark, the cache can send only the fourth data buffer number to the second processor and cannot send a completion response to the second processor.
  • 405. The second processor sends the second data to the cache.
  • After the second processor receives the fourth data buffer number, it sends the second data corresponding to the second non-cached write data request to the cache.
  • 406. The node stores the correspondence between the number of the cache and the first address in the directory.
  • After the node receives the second non-cached write data request sent by the cache in step 402, when the node determines that no correspondence for the first address is stored in the directory, the node stores the correspondence between the number of the cache and the first address in the directory.
  • After receiving the second non-cached write data request, when the node determines that a correspondence between the number of another cache and the first address is stored in the directory, the node first needs to delete that correspondence and then record the correspondence between the number of this cache and the first address. For details, refer to steps 3111 to 3115; they are not repeated here.
  • Step 406 comes after step 402, but there is no chronological order between step 406 and steps 403 to 405.
  • 407. The node sends a third data buffer number and a second completion response to the cache, where the second completion response includes an exclusive mark.
  • After the node stores the correspondence between the number of the cache and the first address in the directory, the node allocates a third data buffer number to the second processor according to the second non-cached write data request, and sends the third data buffer number and the second completion response carrying the exclusive mark to the cache.
  • The exclusive mark is a character carried in the completion response.
  • 408. The cache sends the second data to the node.
  • After the cache receives the third data buffer number and the second completion response, the cache sends the second data to the node.
  • 409. The cache marks the locally stored first address with an exclusive mark.
  • The cache marks the locally stored first address with an exclusive mark.
  • The exclusive mark is a character; the cache appends the character to the end of the locally stored first address to identify the cache's exclusive right to the first address.
  • 410. The cache sends a third completion response to the second processor.
  • After the cache marks the locally stored first address with the exclusive mark, the cache sends a third completion response to the second processor; the third completion response is used to instruct the second processor to release the buffer occupied by the second data.
  • 411. The second processor releases the buffer occupied by the second data.
  • The second processor may release the buffer occupied by the second data.
  • 412. The node writes the second data according to the third data buffer number.
  • Step 412 comes after step 408, and there is no chronological order between step 412 and steps 409 to 411.
  • The embodiment of this application provides a method for processing a non-cached write data request.
  • The cache can apply to the node for the exclusive mark of the first address and then mark the locally stored first address with the exclusive mark. Afterwards, on receiving a non-cached write data request from a processor, the cache can directly send a data buffer number and a completion response to the processor, instructing the processor to send the data corresponding to the request and release the buffer occupied by that data, which relieves pressure on the processor. When the processor needs to write data continuously, it can issue the next non-cached write data request immediately after sending the data, which speeds up the write process.
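The difference between the two cases (step 304 when the address already carries the exclusive mark, steps 404 to 410 when it does not) can be sketched from the cache's point of view. This is a minimal sketch under stated assumptions: the class `Cache`, the dictionary reply format, and the simple counter used to allocate data buffer numbers are hypothetical and not taken from the application.

```python
class Cache:
    """Decides what to send to a processor for a non-cached write data request."""

    def __init__(self):
        self.exclusive = set()   # addresses marked with the exclusive mark
        self.awaiting_mark = {}  # address -> processor still owed a completion
        self.next_number = 0     # assumed allocator for cache-side buffer numbers

    def on_request(self, processor, address):
        number = self.next_number
        self.next_number += 1
        if address in self.exclusive:
            # Step 304: data buffer number and completion response together.
            return {"data_buffer_number": number, "completion": True}
        # Step 404: data buffer number only; the completion must wait.
        self.awaiting_mark[address] = processor
        return {"data_buffer_number": number, "completion": False}

    def on_node_completion_with_mark(self, address):
        # Step 409: mark the locally stored address as exclusive.
        self.exclusive.add(address)
        # Step 410: report which processor should now get its completion response.
        return self.awaiting_mark.pop(address, None)


cache = Cache()
first = cache.on_request("processor 2", 0x1000)   # no mark yet: number only
owed = cache.on_node_completion_with_mark(0x1000)  # node grants the mark
second = cache.on_request("processor 1", 0x1000)   # marked: number plus completion
print(first["completion"], owed, second["completion"])
```

Only the first request for an address pays the round trip to the node before the processor can release its buffer; later requests for the same address are answered by the cache directly, for as long as the mark is held.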
  • An embodiment of this application provides a cache. Refer to Figure 6.
  • The cache includes a management module 501, a mark storage module 502, a data buffer number receiving module 503, and a data processing module 504.
  • The management module 501 is used to send a data buffer number and a completion response to the processor, to receive a non-cached write data request from the processor, and to send a non-cached write data request to the node.
  • The management module 501 communicates with the mark storage module 502.
  • The mark storage module 502 is used to receive a completion response and a delete instruction from the node, and to send a delete confirmation instruction to the node.
  • The data buffer number receiving module 503 is used to receive a data buffer number from the node.
  • The data processing module 504 communicates with the data buffer number receiving module 503, the management module 501, and the mark storage module 502.
  • The data processing module 504 is used to receive data from the processor and send data to the node.
  • The functions of the modules in the cache are as follows:
  • The management module 501 is configured to receive a first non-cached write data request from the first processor and send the first non-cached write data request to the node. The first non-cached write data request includes a first address and is used to apply to the node for the first data buffer number corresponding to the first address.
  • The management module 501 is also configured to send the second data buffer number to the first processor to instruct the first processor to send the first data to the cache.
  • The management module 501 is further configured to send a first completion response to the first processor to instruct the first processor to release the buffer occupied by the first data.
  • The management module 501 is further configured to receive a second non-cached write data request from the second processor and send the second non-cached write data request to the node. The second non-cached write data request includes the first address, and the first address is used by the node to determine whether the first address stored in the cache is marked with an exclusive mark.
  • The mark storage module 502 is used to determine that the first address is marked with an exclusive mark.
  • The mark storage module 502 is further configured to receive a second completion response from the node, where the second completion response includes an exclusive mark.
  • The mark storage module 502 is also configured to mark the first address with the exclusive mark when the first address stored by the mark storage module 502 is not yet marked with an exclusive mark.
  • The mark storage module 502 is also used to receive a delete instruction from the node.
  • The mark storage module 502 is also used to delete the exclusive mark of the first address and send a delete confirmation instruction to the node.
  • The data buffer number receiving module 503 is configured to receive the third data buffer number from the node.
  • The data processing module 504 is configured to obtain, from the first processor, the first data corresponding to the first non-cached write data request when the mark storage module 502 determines that the first address is stored.
  • The data processing module 504 is further configured to send the first data to the node when the data buffer number receiving module 503 receives the first data buffer number.
  • The data processing module 504 is also configured to receive the first data from the first processor.
  • An embodiment of this application provides a node. Refer to Figure 7.
  • The node includes a request processing module 601, a management module 602, a data processing module 603, a data buffer module 604, and a request buffer module 605.
  • The request processing module 601 is connected to the data buffer module 604 and to the request buffer module 605.
  • The request processing module 601 is configured to receive a non-cached write data request from the cache and send the request to the data buffer module 604 and the request buffer module 605.
  • The data buffer module 604 is configured to allocate a data buffer number according to the non-cached write data request.
  • The request buffer module 605 is configured to store the non-cached write data request.
  • The management module 602 communicates with the data buffer module 604. After the data buffer module 604 allocates a data buffer number, the management module 602 can send the data buffer number to the cache.
  • The management module 602 is used to send the data buffer number, the completion response, and the delete instruction to the cache.
  • The management module 602 is also used to receive a delete confirmation instruction from the cache.
  • The management module 602 communicates with the request processing module 601 and with the data processing module 603.
  • The data processing module 603 is used to receive data from the cache and send the data to the data buffer module 604.
  • The data buffer module 604 stores the data in a data buffer.
  • The data buffer module 604 and the request buffer module 605 can also write the data to a lower-level node according to the stored data and the stored non-cached write data request.
  • The functions of the modules in the node are as follows:
  • The request processing module 601 is configured to receive a first non-cached write data request from the first cache, where the first non-cached write data request includes a first address.
  • The request processing module 601 is further configured to receive a second non-cached write data request from the first cache, where the second non-cached write data request includes the first address.
  • The request processing module 601 is further configured to receive a third non-cached write data request from the second cache. The third non-cached write data request includes the first address, and the second cache is different from the first cache.
  • The management module 602 is configured to determine that the correspondence between the number of the first cache and the first address is stored; the correspondence indicates that the first address stored in the first cache is marked with an exclusive mark.
  • The management module 602 is also used to store the correspondence between the number of the first cache and the first address.
  • The management module 602 is further configured to send a third data buffer number and a second completion response to the first cache, where the second completion response includes an exclusive mark.
  • The management module 602 is further configured to send a delete instruction to the first cache when it determines that the correspondence between the number of the first cache and the first address is stored. The delete instruction is used to instruct the first cache to delete the exclusive mark of the first address and send a delete confirmation instruction to the node.
  • The management module 602 is also used to receive the delete confirmation instruction from the first cache.
  • The management module 602 is further configured to delete the stored correspondence between the number of the first cache and the first address according to the delete confirmation instruction.
  • The data processing module 603 is configured to receive the first data corresponding to the first non-cached write data request.
  • The data buffer module 604 is configured to allocate a first data buffer number to the first processor according to the first non-cached write data request received by the request processing module 601, and to instruct the management module 602 to send the first data buffer number to the first cache.
  • The data buffer module 604 is further configured to write the first data into the data buffer corresponding to the first data buffer number when the data processing module 603 receives the first data corresponding to the first non-cached write data request, where the first data is obtained by the first cache from the first processor when the first cache determines that the first address is stored in the first cache.
  • The data buffer module 604 is further configured to allocate a third data buffer number according to the second non-cached write data request received by the request processing module 601.
  • The cache can be a shared cache, such as an L3 cache, and the node can be a storage controller, a bus controller, and so on. Because the cache and the node are mainly implemented by hardware circuits, the modules included in the cache and the node in the above embodiments are all circuit modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

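A method for processing a non-cached write data request, a cache, and a node. The method includes: a cache receives a first non-cached write data request from a first processor and sends the first non-cached write data request to a node, where the first non-cached write data request contains a first address; if the cache determines that the first address is stored in the cache, the cache obtains, from the first processor, first data corresponding to the first non-cached write data request; and when the cache receives a first data buffer number from the node, the cache sends the first data to the node. After receiving the first non-cached write data request, the cache can obtain the first data from the processor if it determines that the first address is stored locally. After the cache receives the first data buffer number, it can send the first data to the node, which improves the efficiency of data writing and speeds up the write process.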

Description

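Method for processing a non-cached write data request, cache, and node
Technical field
This application relates to the field of caches, and in particular to a method for processing a non-cached write data request, a cache, and a node.
Background
As chipsets grow larger, the cost of communication between chips also grows. In a multi-core, multi-chip system, every write data request requires interaction and handshaking to complete. Typical application scenarios include high-bit-width same-address operations and low-bit-width same-address operations in remote direct memory access (RDMA). A complete write data operation includes the following steps:
When a central processing unit (CPU) needs to write data, the CPU can send a non-cached write data request to a node through a cache. The node can write the data corresponding to the non-cached write data request, or send the data on to a next child node. The node can be a bus node or a controller; for example, the controller can be a storage controller, a universal serial bus (USB) controller, or the like. The node and its child nodes can form a storage system. After the cache receives a first non-cached write data request sent by the CPU, the cache forwards the first non-cached write data request to the node. After the node receives the first non-cached write data request, the node allocates a data buffer number (data buffer ID, DBID) to the CPU and sends the DBID to the CPU through the cache. After receiving the DBID, the CPU sends the first data to be written to the node through the cache. The node determines the data buffer corresponding to the DBID and writes the first data into that data buffer. After the write is complete, the node sends a completion (COMP) response to the CPU through the cache.
The cache can request the first data corresponding to the first non-cached write data request from the CPU according to the DBID only after it has received the DBID sent by the node. Only after receiving the first data does the cache send the first data to the node, so data is written inefficiently.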
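Summary
This application provides a method for processing a non-cached write data request, and a cache. In the method, the cache can obtain the first data from the processor without waiting to receive the first data buffer number sent by the node, and can send the first data to the node immediately after receiving the first data buffer number from the node, which speeds up the write process.
In view of this, a first aspect of this application provides a method for processing a non-cached write data request. The method includes: a cache receives a first non-cached write data request from a first processor and sends the first non-cached write data request to a node, where the first non-cached write data request contains a first address and is used to apply to the node for a first data buffer number corresponding to the first address; if the cache determines that the first address is stored in the cache, the cache obtains, from the first processor, first data corresponding to the first non-cached write data request; and when the cache receives the first data buffer number from the node, the cache sends the first data to the node. After receiving the first non-cached write data request, the cache can, if it determines that the first address is stored locally, immediately obtain the first data from the first processor, without waiting to receive the first data buffer number allocated by the node before requesting the first data from the first processor. After the cache receives the first data buffer number, it can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the write process.
Optionally, with reference to the first aspect, in a first possible implementation of the first aspect, the cache obtaining, from the first processor, the first data corresponding to the first non-cached write data request includes: the cache sends a second data buffer number to the first processor to instruct the first processor to send the first data to the cache, where the second data buffer number is allocated by the cache to the first processor according to the first non-cached write data request; and the cache receives the first data from the first processor. Because the first processor is not aware of which device allocated a data buffer number, the cache can allocate the second data buffer number to the first processor to instruct the first processor to send the first data. This lets the first processor send the first data earlier and speeds up the write process.
Optionally, with reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the method further includes: the cache determines that the first address stored in the cache is marked with an exclusive mark; and the cache sends a first completion response to the first processor to instruct the first processor to release the buffer occupied by the first data. When the cache determines that the first address is marked with the exclusive mark, it can send the first completion response to the first processor at the same time as it sends the second data buffer number. The first processor can then release the buffer occupied by the first data immediately after sending the first data, which relieves pressure on the first processor. If the first processor needs to write data continuously, it can send the next non-cached write data request immediately after receiving the first completion response, which improves write efficiency.
Optionally, with reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, before the cache receives the first non-cached write data request from the first processor, the method further includes: the cache receives a second non-cached write data request from a second processor and sends the second non-cached write data request to the node, where the second non-cached write data request contains the first address, and the first address is used by the node to determine whether the first address stored in the cache is marked with an exclusive mark; the cache receives a third data buffer number and a second completion response from the node, where the second completion response includes the exclusive mark; and if the first address stored in the cache is not marked with the exclusive mark, the cache marks the first address with the exclusive mark. When the first address stored in the cache is not marked with the exclusive mark, the cache can request the exclusive mark from the node and mark the first address with it. After marking the first address with the exclusive mark, the cache can send completion responses directly to the processor.
Optionally, with reference to the second or third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the method further includes: the cache receives a delete instruction from the node; and the cache deletes the exclusive mark of the first address and sends a delete confirmation instruction to the node. After receiving the delete instruction, the cache can delete the exclusive mark of the first address and send the delete confirmation instruction to the node, and the node can delete the correspondence between the cache and the first address according to the delete confirmation instruction. The node can then record the correspondence between another cache and the first address, which increases the diversity of the solution.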
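A second aspect of this application provides a method for processing a non-cached write data request. The method includes: a node receives a first non-cached write data request from a first cache, where the first non-cached write data request contains a first address; the node allocates a first data buffer number to a first processor according to the first non-cached write data request, and sends the first data buffer number to the first cache; and when the node receives, from the first cache, first data corresponding to the first non-cached write data request, the node writes the first data into the data buffer corresponding to the first data buffer number, where the first data is obtained by the first cache from the first processor when the first cache determines that the first address is stored in the first cache. After receiving the first non-cached write data request, the first cache can, if it determines that the first address is stored locally, immediately obtain the first data from the first processor, without waiting to receive the first data buffer number allocated by the node before requesting the first data from the first processor. When the first cache receives the first data buffer number, it can send the first data directly to the node. After receiving the first data, the node can write it into the data buffer corresponding to the first data buffer number. This improves the efficiency of data writing and speeds up the write process.
Optionally, with reference to the second aspect, in a first possible implementation of the second aspect, after the node receives the first non-cached write data request from the first cache, the method further includes: the node determines that a correspondence between the number of the first cache and the first address is stored, where the correspondence indicates that the first address stored in the first cache is marked with an exclusive mark. When the first address stored in the first cache is marked with the exclusive mark, the first cache can allocate a second data buffer number to the first processor to instruct the first processor to send the first data, which lets the first processor send the first data earlier and speeds up the write process.
Optionally, with reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, before the node receives the first non-cached write data request from the first cache, the method further includes: the node receives a second non-cached write data request from the first cache, where the second non-cached write data request contains the first address; the node stores the correspondence between the number of the first cache and the first address; and the node allocates a third data buffer number according to the second non-cached write data request and sends the third data buffer number and a second completion response to the first cache, where the second completion response includes the exclusive mark. The node can send the second completion response containing the exclusive mark to the first cache, and after receiving it the first cache can mark the locally stored first address with the exclusive mark. Afterwards, the first cache can send the first completion response to the first processor at the same time as it sends the second data buffer number. The first processor can then release the buffer occupied by the first data immediately after sending the first data, which relieves pressure on the first processor. If the first processor needs to write data continuously, it can send the next non-cached write data request immediately after receiving the first completion response, which improves write efficiency.
Optionally, with reference to the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the method further includes: the node receives a third non-cached write data request from a second cache, where the third non-cached write data request includes the first address and the second cache is different from the first cache; if the node determines that it stores the correspondence between the number of the first cache and the first address, the node sends a delete instruction to the first cache, where the delete instruction is used to instruct the first cache to delete the exclusive mark of the first address and send a delete confirmation instruction to the node; the node receives the delete confirmation instruction from the first cache; and the node deletes the correspondence between the number of the first cache and the first address according to the delete confirmation instruction. When the node receives the third non-cached write data request from the second cache, it can first delete the correspondence between the number of the first cache and the first address and then record the correspondence between the number of the second cache and the first address, which increases the diversity of the solution.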
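A third aspect of this application provides a cache. The cache includes: a management module, configured to receive a first non-cached write data request from a first processor and send the first non-cached write data request to a node, where the first non-cached write data request contains a first address and is used to apply to the node for a first data buffer number corresponding to the first address; and a data processing module, configured to obtain, from the first processor, first data corresponding to the first non-cached write data request when a mark storage module determines that the first address is stored. The data processing module is further configured to send the first data to the node when a data buffer number receiving module receives the first data buffer number. After the management module receives the first non-cached write data request, if the mark storage module determines that the first address is stored locally, the data processing module can immediately obtain the first data from the first processor. After the data buffer number receiving module receives the first data buffer number, the data processing module can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the write process.
Optionally, with reference to the third aspect, in a first possible implementation of the third aspect, the management module is further configured to send a second data buffer number to the first processor to instruct the first processor to send the first data to the cache; and the data processing module is further configured to receive the first data from the first processor.
Optionally, with reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the mark storage module is further configured to determine that the first address is marked with an exclusive mark; and the management module is further configured to send a first completion response to the first processor to instruct the first processor to release the buffer occupied by the first data.
Optionally, with reference to the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the management module is further configured to receive a second non-cached write data request from a second processor and send the second non-cached write data request to the node, where the second non-cached write data request contains the first address, and the first address is used by the node to determine whether the first address stored in the cache is marked with an exclusive mark; the data buffer number receiving module is configured to receive a third data buffer number from the node; the mark storage module is configured to receive a second completion response from the node, where the second completion response includes the exclusive mark; and the mark storage module is further configured to mark the first address with the exclusive mark when the first address stored by the mark storage module does not have the exclusive mark.
Optionally, with reference to the second or third possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the mark storage module is further configured to receive a delete instruction from the node; and the mark storage module is further configured to delete the exclusive mark of the first address and send a delete confirmation instruction to the node.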
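A fourth aspect of this application provides a node. The node includes: a request processing module, configured to receive a first non-cached write data request from a first cache, where the first non-cached write data request contains a first address; and a data buffer module, configured to allocate a first data buffer number to a first processor according to the first non-cached write data request received by the request processing module, and to instruct a management module to send the first data buffer number to the first cache. The data buffer module is further configured to write first data into the data buffer corresponding to the first data buffer number when a data processing module receives the first data corresponding to the first non-cached write data request, where the first data is obtained by the first cache from the first processor when the first cache determines that the first address is stored in the first cache. After receiving the first non-cached write data request, the first cache can, if it determines that the first address is stored locally, immediately obtain the first data from the first processor, without waiting to receive the first data buffer number allocated by the node before requesting the first data from the first processor. When the first cache receives the first data buffer number, it can send the first data directly to the node. After receiving the first data, the node can write it into the data buffer corresponding to the first data buffer number. This improves the efficiency of data writing and speeds up the write process.
Optionally, with reference to the fourth aspect, in a first possible implementation of the fourth aspect, the management module is further configured to determine that a correspondence between the number of the first cache and the first address is stored, where the correspondence indicates that the first address stored in the first cache is marked with an exclusive mark.
Optionally, with reference to the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the request processing module is further configured to receive a second non-cached write data request from the first cache, where the second non-cached write data request contains the first address; the management module is further configured to store the correspondence between the number of the first cache and the first address; the data buffer module is further configured to allocate a third data buffer number according to the second non-cached write data request received by the request processing module; and the management module is further configured to send the third data buffer number and a second completion response to the first cache, where the second completion response includes the exclusive mark.
Optionally, with reference to the first or second possible implementation of the fourth aspect, in a third possible implementation of the fourth aspect, the request processing module is further configured to receive a third non-cached write data request from a second cache, where the third non-cached write data request includes the first address and the second cache is different from the first cache; the management module is further configured to send a delete instruction to the first cache when it determines that the correspondence between the number of the first cache and the first address is stored, where the delete instruction is used to instruct the first cache to delete the exclusive mark of the first address and send a delete confirmation instruction to the node; the management module is further configured to receive the delete confirmation instruction from the first cache; and the management module is further configured to delete the stored correspondence between the number of the first cache and the first address according to the delete confirmation instruction.
This application provides a method for processing a non-cached write data request. The method includes: a cache receives a first non-cached write data request from a first processor and sends the first non-cached write data request to a node, where the first non-cached write data request contains a first address and is used to apply to the node for a first data buffer number corresponding to the first address; if the cache determines that the first address is stored in the cache, the cache obtains, from the first processor, first data corresponding to the first non-cached write data request; and when the cache receives the first data buffer number from the node, the cache sends the first data to the node. After receiving the first non-cached write data request, the cache can, if it determines that the first address is stored locally, immediately obtain the first data from the processor, without waiting to receive the first data buffer number before requesting the first data from the processor. After the cache receives the first data buffer number, it can immediately send the first data to the node, which improves the efficiency of data writing and speeds up the write process.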
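Brief description of the drawings
Figure 1 is a schematic diagram of an embodiment of an information processing system provided by this application;
Figure 2 is a schematic diagram of an embodiment of a method for processing a non-cached write data request provided by this application;
Figure 3 is a schematic diagram of an embodiment of a method for processing a non-cached write data request provided by this application;
Figure 4 is a schematic diagram of an embodiment of a method for processing a non-cached write data request provided by this application;
Figure 5 is a schematic diagram of an embodiment of a method for processing a non-cached write data request provided by this application;
Figure 6 is a schematic diagram of an embodiment of a cache provided by this application;
Figure 7 is a schematic diagram of an embodiment of a node provided by this application.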
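Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", and so on in the specification, claims, and accompanying drawings of this application are used to distinguish between similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way is interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to the steps or modules expressly listed, but may include other steps or modules that are not expressly listed or that are inherent to the process, method, product, or device.
As chipsets grow larger, the cost of communication between chips also grows. In a multi-core, multi-chip system, every non-cached data write requires interaction and handshaking to complete. In a high-performance processor architecture, typical application scenarios include high-bit-width same-address operations, for example on work queue elements (WQE), and low-bit-width same-address sequential operations, for example doorbell tasks.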
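Fig. 1 shows an information processing system provided in this application. The system includes a plurality of caches, such as cache 200 and cache 210 in Fig. 1. Each cache has a different number and manages one or more processors; for example, cache 200 manages processor 101, processor 102, ..., processor 10N. The N processors may be connected through a system bus. Each of the N processors managed by cache 200 can interact with cache 200.
Referring to Fig. 2, when a processor managed by the cache needs to write data, the processor sends a first non-cache data write request in step 201. When the cache receives the request, it forwards the request in step 202. After receiving the first non-cache data write request, the node allocates a first data buffer number based on the request and sends that number in step 203. After receiving the first data buffer number, the cache forwards it in step 204. On receiving the first data buffer number, the processor sends first data in step 205. On receiving the first data, the cache forwards it in step 206. After receiving the first data, the node writes it into the node in step 207. After the write completes, the node sends a first completion response in step 208. After receiving the first completion response, the cache forwards it in step 209. On receiving the first completion response, the processor releases the buffer occupied by the first data in step 210.
In this flow, the cache can request the first data corresponding to the first non-cache data write request from the processor only after it has received the first data buffer number sent by the node. Only after receiving the first data does the cache send it on to the node, so data writing is inefficient.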
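The serial dependency in the Fig. 2 flow can be made concrete with a minimal sketch. The tuple layout and the function names below are illustrative assumptions, not part of the application; only the step numbers and message directions come from the description above.

```python
def baseline_write_sequence():
    """Ordered (step, sender, receiver, message) tuples of the Fig. 2 flow."""
    return [
        (201, "processor", "cache", "write_request"),
        (202, "cache", "node", "write_request"),
        (203, "node", "cache", "data_buffer_number"),
        (204, "cache", "processor", "data_buffer_number"),
        (205, "processor", "cache", "data"),
        (206, "cache", "node", "data"),
        (207, "node", "node", "write_data"),
        (208, "node", "cache", "completion_response"),
        (209, "cache", "processor", "completion_response"),
        (210, "processor", "processor", "release_buffer"),
    ]


def messages_before_processor_sends_data(sequence):
    """Count the steps that must complete before the processor sends data."""
    for count, (_, sender, _, message) in enumerate(sequence):
        if sender == "processor" and message == "data":
            return count
    raise ValueError("the sequence never sends data")
```

In this model the processor cannot send the first data until four earlier messages, including a full round trip to the node, have been delivered; that round trip is the delay the method of Fig. 3 is meant to hide.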
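It should be noted that in Fig. 1 the N processors managed by any one cache are located on the same chip as that cache, and non-cache data write requests sent by any of the N processors managed by the same cache have the same source address. The cache and the node may be on the same chip or on different chips. When they are on different chips, a protocol adapter (PA) may also sit between the cache and the node. In that case, when the cache receives a non-cache data write request from any processor, it first sends the request to the source PA located on the same chip as the cache; the source PA forwards the request to the destination PA located on the same chip as the node; and the destination PA forwards the request to the node.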
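The forwarding path just described can be sketched as a small function. This is an illustration only: the function name and the chip identifiers are hypothetical, and the sketch assumes that a PA pair is always present when the cache and the node are on different chips, whereas the description only says a PA may exist in that case.

```python
def request_path(cache_chip, node_chip):
    """Return the hops a non-cache data write request takes from cache to node.

    Assumption: a source PA and a destination PA are present whenever the
    cache and the node are on different chips.
    """
    if cache_chip == node_chip:
        return ["cache", "node"]
    return ["cache", "source_pa", "destination_pa", "node"]
```

For example, `request_path(0, 1)` yields the four-hop cross-chip path, while `request_path(0, 0)` yields the direct path.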
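It should be pointed out that a non-cache data write request includes the attributes of the data to be written, the destination address to be written to, and other basic information. The node needs the non-cache data write request while it writes the data.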
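An embodiment of this application provides a method for processing a non-cache data write request. Referring to Fig. 3, the method specifically includes:
301. A first processor sends a first non-cache data write request to a cache.
When the first processor needs to write data, it sends the first non-cache data write request to the cache. The first processor is any one of the processors managed by the cache. The cache receives the first non-cache data write request from the first processor. The request includes a first address, which points to a data buffer region of the node. The region contains multiple data buffers, each with its own data buffer number. The first non-cache data write request is used to apply to the node for one data buffer in that region.
302. The cache forwards the first non-cache data write request to the node.
The cache forwards the first non-cache data write request to the node.
303. The cache determines that it stores the first address carrying an exclusive mark.
The cache determines that the first address is stored locally and that the first address carries an exclusive mark.
It should be noted that the MESI protocol defines the modified, exclusive, shared, and invalid states. This solution borrows from the MESI protocol: here, a cache line in the cache has only two states with respect to the first address in the node, exclusive or invalid. When the state is exclusive, the data in the cache is clean, that is, consistent with the data in the node.
Only after the cache receives the exclusive mark from the node is the state of the cache's cache line for the first address exclusive. At any moment, the first address can be held exclusively by the cache line of only one cache. When the cache determines that it stores the first address carrying the exclusive mark, then on receiving a non-cache data write request containing the first address from any processor, the cache can directly send that processor a data buffer number and a completion response, so that the processor can release the buffer occupied by the data immediately after sending the data corresponding to the write request.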
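304. The cache sends a second data buffer number and a first completion response to the first processor.
When the cache determines in step 303 that the first address is stored locally, it sends the second data buffer number to the first processor. The second data buffer number is allocated to the first processor by the cache and instructs the first processor to send the cache the first data corresponding to the first non-cache data write request.
If the cache determines that the locally stored first address carries the exclusive mark, the cache can send the first completion response to the first processor together with the second data buffer number. The first completion response instructs the first processor to release the buffer occupied by the first data after sending the first data to the cache.
It should be noted that the second data buffer number is allocated by the cache; the processor is not aware of which device allocated a given data buffer number.
305. The first processor sends the first data to the cache.
After receiving the second data buffer number sent by the cache in step 304, the first processor sends the first data to the cache.
306. The first processor releases the buffer occupied by the first data.
If the first processor received the first completion response together with the second data buffer number, it can release the buffer occupied by the first data immediately after sending the first data to the cache in step 305.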
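307. The node determines that it stores a correspondence between the number of the cache and the first address.
The node determines that its local directory stores the correspondence between the number of the cache and the first address. When the node determines that the directory stores this correspondence, it can conclude that the first address stored in the cache carries the exclusive mark.
It should be noted that step 307 follows step 302, but there is no required order between step 307 and steps 303 to 306.
308. The node sends a first data buffer number to the cache.
After determining that the directory stores the correspondence between the number of the cache and the first address, the node allocates the first data buffer number to the first processor based on the first non-cache data write request sent by the cache in step 302, and sends the first data buffer number to the cache.
When sending the first data buffer number to the cache, the node may also send a completion response, which may carry the exclusive mark.
309. The cache sends the first data to the node.
When the cache receives the first data buffer number from the node, it sends the node the first data received from the first processor.
310. The node writes the first data into the node based on the first data buffer number.
The node writes the first data into the node based on the first data buffer number. Specifically, the node determines the data buffer in the node that corresponds to the first data buffer number, and then writes the first data into that data buffer.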
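311. The node instructs the cache to delete the exclusive mark of the first address.
Optionally, the node may instruct the cache to delete the exclusive mark of the first address.
For details, see Fig. 4. The first cache in Fig. 4 and in steps 3111 to 3115 is the cache in Fig. 3 and in steps 301 to 311. The node may instruct the cache to delete the locally recorded exclusive mark as follows:
3111. The node receives a third non-cache data write request from a second cache, where the third non-cache data write request includes the first address.
The node receives the third non-cache data write request from the second cache. The second cache is different from the first cache, and the two caches have different numbers. The third non-cache data write request includes the first address.
3112. The node sends a deletion instruction to the first cache.
When the node determines that the directory stores the correspondence between the first cache and the first address, it sends a deletion instruction to the first cache. The deletion instruction instructs the first cache to delete the exclusive mark of the first address.
3113. The first cache deletes the exclusive mark of the first address locally.
The first cache deletes the exclusive mark of the first address.
3114. The first cache sends a deletion confirmation instruction to the node.
The first cache sends the deletion confirmation instruction to the node. The deletion confirmation instruction instructs the node to delete the correspondence, stored in the node, between the number of the first cache and the first address.
3115. The node deletes the correspondence between the number of the first cache and the first address from the directory.
After receiving the deletion confirmation instruction, the node deletes the correspondence between the number of the first cache and the first address from its local directory, and then records the correspondence between the number of the second cache and the first address.
It can be understood that the node may also send a data buffer number and a completion response to the second cache, where the completion response instructs the second cache to put the exclusive mark on its locally stored first address.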
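The ownership handoff in steps 3111 to 3115 can be sketched as follows. This is a simplified model under stated assumptions: the class and method names are hypothetical, messages are modeled as direct method calls rather than bus transactions, and the node always grants the exclusive mark to the new requester, which the description presents only as something the node may do.

```python
class Cache:
    def __init__(self, number):
        self.number = number
        self.exclusive = set()  # addresses that carry the exclusive mark

    def on_delete_instruction(self, address):
        # Step 3113: drop the mark. Step 3114: confirm to the node.
        self.exclusive.discard(address)
        return ("delete_confirm", self.number, address)

    def on_completion_response(self, address, exclusive_mark):
        if exclusive_mark:
            self.exclusive.add(address)


class Node:
    def __init__(self, caches):
        self.caches = {cache.number: cache for cache in caches}
        self.directory = {}  # address -> number of the cache holding the mark

    def on_write_request(self, requester_number, address):
        owner = self.directory.get(address)
        if owner is not None and owner != requester_number:
            # Step 3112: ask the current owner to delete its mark.
            confirm = self.caches[owner].on_delete_instruction(address)
            # Step 3115: remove the old correspondence once confirmed.
            if confirm == ("delete_confirm", owner, address):
                del self.directory[address]
        # Record the requester and grant it the mark (assumed always granted).
        self.directory[address] = requester_number
        self.caches[requester_number].on_completion_response(address, True)
```

The invariant the sketch preserves is the one stated for step 303: at any moment at most one cache holds the exclusive mark for a given address, and the directory names that cache.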
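This embodiment of the application provides a method for processing a non-cache data write request. After receiving the first non-cache data write request, if the cache determines that the first address is stored locally, it can immediately obtain the first data from the processor, without waiting until it has received the first data buffer number. Once the cache receives the first data buffer number, it can immediately send the first data to the node. This improves the efficiency of data writing and speeds up the write process.
In addition, when the cache determines that the locally stored first address carries the exclusive mark, it sends the first completion response to the first processor, and the first processor can release the buffer occupied by the first data immediately after sending the first data. This relieves pressure on the first processor. If the first processor needs to write data continuously, it can issue the next non-cache data write request right after sending the first data, which speeds up the write process.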
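The cache-side decision in steps 302 to 304 can be summarized in a short sketch. The function name, the message labels, and the use of sets for the cache's local state are assumptions made for illustration. The case where the address is not stored locally is left as forward-only here, which is also an assumption, since the Fig. 3 description does not spell that case out.

```python
def cache_actions_on_request(address, stored, exclusive):
    """Messages the cache can issue as soon as a write request arrives.

    stored:    set of addresses the cache has stored locally
    exclusive: set of stored addresses that carry the exclusive mark
    """
    # Step 302: the request is always forwarded to the node.
    actions = [("node", "write_request")]
    if address in stored:
        # Step 304: fetch the data early with a cache-allocated buffer number.
        actions.append(("processor", "data_buffer_number"))
        if address in exclusive:
            # The processor may release its buffer right after sending data.
            actions.append(("processor", "completion_response"))
    return actions
```

Compared with the Fig. 2 flow, the buffer number reaches the processor without waiting for the node's reply, so the transfer of data from processor to cache overlaps with the round trip between cache and node.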
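In step 303 above, the cache needs to determine that it locally stores the first address carrying the exclusive mark. Before that, the cache must first obtain the exclusive mark from the node and put it on the locally stored first address. Referring to Fig. 5, before step 301 of the method above, the method further includes:
401. A second processor sends a second non-cache data write request to the cache.
The second processor sends the second non-cache data write request, which includes the first address, to the cache. The second processor may be the same processor as the first processor or a different one. If they are different, both are processors managed by the cache.
It should be noted that the cache stores the first address locally after it first receives a non-cache data write request containing the first address.
402. The cache sends the second non-cache data write request to the node.
The cache sends the second non-cache data write request to the node.
403. The cache determines that the locally stored first address does not carry the exclusive mark.
The cache determines that the locally stored first address has not been given the exclusive mark.
404. The cache sends a fourth data buffer number to the second processor.
The cache sends the fourth data buffer number to the second processor. The fourth data buffer number is allocated by the cache to the second processor for the second non-cache data write request.
It should be noted that when the cache determines in step 403 that the locally stored first address does not carry the exclusive mark, the cache can send the second processor only the fourth data buffer number and cannot send it a completion response.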
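405. The second processor sends second data to the cache.
After receiving the fourth data buffer number, the second processor sends the cache the second data corresponding to the second non-cache data write request.
406. The node stores the correspondence between the number of the cache and the first address in the directory.
After the node receives the second non-cache data write request sent by the cache in step 402, if the node determines that the directory stores no correspondence for the first address, the node stores the correspondence between the number of the cache and the first address in the directory.
It should be pointed out that if, after receiving the second non-cache data write request, the node determines that the directory stores a correspondence between the number of another cache and the first address, the node must first delete that correspondence and then record the correspondence between the number of this cache and the first address. For details, see steps 3111 to 3115; they are not repeated here.
Step 406 follows step 402, but there is no required order between step 406 and steps 403 to 405.
407. The node sends a third data buffer number and a second completion response to the cache, where the second completion response includes the exclusive mark.
After storing the correspondence between the number of the cache and the first address in the directory, the node allocates the third data buffer number to the second processor based on the second non-cache data write request, and sends the third data buffer number and the second completion response carrying the exclusive mark to the cache. For example, the exclusive mark is a character carried in the completion response.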
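408. The cache sends the second data to the node.
After receiving the third data buffer number and the second completion response, the cache sends the second data to the node.
409. The cache puts the exclusive mark on the locally stored first address.
The cache puts the exclusive mark on the locally stored first address. For example, if the exclusive mark is a character, the cache appends that character to the end of the locally stored first address to indicate that the cache holds exclusive rights to the first address.
410. The cache sends a third completion response to the second processor.
After putting the exclusive mark on the locally stored first address, the cache sends the third completion response to the second processor. The third completion response instructs the second processor to release the buffer occupied by the second data.
411. The second processor releases the buffer occupied by the second data.
After receiving the third completion response, the second processor can release the buffer occupied by the second data.
412. The node writes the second data based on the third data buffer number.
After receiving the second data, the node writes the second data into the data buffer corresponding to the third data buffer number. Step 412 follows step 408, and there is no required order between step 412 and steps 409 to 411.
This embodiment of the application provides a method for processing a non-cache data write request in which the cache can apply to the node for the exclusive mark of the first address and then put the exclusive mark on the locally stored first address. As a result, after receiving a non-cache data write request from a processor, the cache can directly send the processor a data buffer number and a completion response, instructing the processor to send the data corresponding to the request and release the buffer occupied by that data. This relieves pressure on the processor. When the processor needs to write data continuously, it can issue the next non-cache data write request right after sending the data, which speeds up the write process.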
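The ordering freedom in steps 401 to 412 can be expressed as a set of precedence constraints and checked against a candidate schedule. In the sketch below, only two constraints are stated outright in the description (step 406 after step 402, step 412 after step 408); the remaining pairs are inferred from the "after receiving" wording of each step and should be read as assumptions. The names are hypothetical.

```python
# Each pair (a, b) means step a must complete before step b starts.
MUST_PRECEDE = [
    (401, 402), (401, 403),
    (402, 406),              # stated explicitly in the description
    (403, 404), (404, 405),
    (406, 407),
    (405, 408), (407, 408),
    (407, 409), (409, 410), (410, 411),
    (408, 412),              # stated explicitly in the description
]


def is_valid_schedule(order):
    """Check that a list of step numbers respects every precedence pair."""
    position = {step: index for index, step in enumerate(order)}
    return all(position[a] < position[b] for a, b in MUST_PRECEDE)
```

Under these constraints the node-side steps 406 and 407 may run before, between, or after the cache-side steps 403 to 405, which is the overlap that the description relies on.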
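An embodiment of this application provides a cache. Referring to Fig. 6, the cache includes a management module 501, a mark storage module 502, a data buffer number receiving module 503, and a data processing module 504.
The management module 501 is configured to send data buffer numbers and completion responses to processors, to receive non-cache data write requests from processors, and to send non-cache data write requests to the node. The management module 501 communicates with the mark storage module 502. The mark storage module 502 is configured to receive completion responses and deletion instructions from the node, and to send deletion confirmation instructions to the node.
The data buffer number receiving module 503 is configured to receive data buffer numbers from the node. The data processing module 504 communicates with the data buffer number receiving module 503, the management module 501, and the mark storage module 502. The data processing module 504 is configured to receive data from processors and to send data to the node.
Specifically, the functions of the modules in the cache are as follows:
The management module 501 is configured to receive a first non-cache data write request from a first processor and send the first non-cache data write request to the node, where the first non-cache data write request includes a first address and is used to apply to the node for a first data buffer number corresponding to the first address.
The management module 501 is further configured to send a second data buffer number to the first processor, to instruct the first processor to send first data to the cache.
The management module 501 is further configured to send a first completion response to the first processor, to instruct the first processor to release the buffer occupied by the first data.
The management module 501 is further configured to receive a second non-cache data write request from a second processor and send the second non-cache data write request to the node, where the second non-cache data write request includes the first address, and the first address is used by the node to determine whether the first address stored in the cache carries an exclusive mark.
The mark storage module 502 is configured to determine that the first address carries the exclusive mark.
The mark storage module 502 is further configured to receive a second completion response from the node, where the second completion response includes the exclusive mark.
The mark storage module 502 is further configured to put the exclusive mark on the first address when the first address stored in the mark storage module 502 does not carry the exclusive mark.
The mark storage module 502 is further configured to receive a deletion instruction from the node.
The mark storage module 502 is further configured to delete the exclusive mark of the first address and send a deletion confirmation instruction to the node.
The data buffer number receiving module 503 is configured to receive a third data buffer number from the node.
The data processing module 504 is configured to obtain, from the first processor, the first data corresponding to the first non-cache data write request when the mark storage module 502 determines that the first address is stored.
The data processing module 504 is further configured to send the first data to the node when the data buffer number receiving module 503 receives the first data buffer number.
The data processing module 504 is further configured to receive the first data from the first processor.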
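An embodiment of this application provides a node. Referring to Fig. 7, the node includes a request processing module 601, a management module 602, a data processing module 603, a data buffer module 604, and a request buffer module 605.
The request processing module 601 is connected to the data buffer module 604 and to the request buffer module 605. The request processing module 601 is configured to receive non-cache data write requests from a cache and to send them to the data buffer module 604 and the request buffer module 605. The data buffer module 604 is configured to allocate a data buffer number based on the non-cache data write request, and the request buffer module 605 is configured to store the non-cache data write request. The management module 602 communicates with the data buffer module 604; after the data buffer module 604 allocates a data buffer number, the management module 602 can send that number to the cache. The management module 602 is configured to send data buffer numbers, completion responses, and deletion instructions to the cache, and to receive deletion confirmation instructions from the cache. The management module 602 also communicates with the request processing module 601 and the data processing module 603. The data processing module 603 is configured to receive data from the cache and send it to the data buffer module 604, which stores the data in a buffer. The data buffer module 604 and the request buffer module 605 can also write the data to a lower-level node based on the stored data and the stored non-cache data write request.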
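Specifically, the functions of the modules in the node are as follows:
The request processing module 601 is configured to receive a first non-cache data write request from a first cache, where the first non-cache data write request includes a first address.
The request processing module 601 is further configured to receive a second non-cache data write request from the first cache, where the second non-cache data write request includes the first address.
The request processing module 601 is further configured to receive a third non-cache data write request from a second cache, where the third non-cache data write request includes the first address and the second cache is different from the first cache.
The management module 602 is configured to determine that a correspondence between the number of the first cache and the first address is stored, where the correspondence indicates that the first address stored in the first cache carries an exclusive mark.
The management module 602 is further configured to store the correspondence between the number of the first cache and the first address.
The management module 602 is further configured to send a third data buffer number and a second completion response to the first cache, where the second completion response includes the exclusive mark.
The management module 602 is further configured to: when determining that the correspondence between the number of the first cache and the first address is stored, send a deletion instruction to the first cache, where the deletion instruction instructs the first cache to delete the exclusive mark of the first address and to send a deletion confirmation instruction to the node.
The management module 602 is further configured to receive the deletion confirmation instruction from the first cache.
The management module 602 is further configured to delete, based on the deletion confirmation instruction, the stored correspondence between the number of the first cache and the first address.
The data processing module 603 is configured to receive first data corresponding to the first non-cache data write request.
The data buffer module 604 is configured to allocate a first data buffer number to a first processor based on the first non-cache data write request received by the request processing module, and to instruct the management module to send the first data buffer number to the first cache.
The data buffer module 604 is further configured to: when the data processing module receives the first data corresponding to the first non-cache data write request, write the first data into the data buffer corresponding to the first data buffer number, where the first data is obtained by the first cache from the first processor when the first cache determines that the first address is stored in the first cache.
The data buffer module 604 is further configured to allocate a third data buffer number based on the second non-cache data write request received by the request processing module.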
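The role of the data buffer module 604 (allocate a data buffer number, hold the data, and later write it to a lower-level node) can be illustrated with a small software model. The application describes circuit modules and does not specify an allocation policy; the fixed-size pool, the first-in-first-out reuse order, and all names below are assumptions made for illustration.

```python
from collections import deque


class DataBufferModule:
    """Toy model of a pool of numbered data buffers (policy is assumed)."""

    def __init__(self, count):
        self.free = deque(range(count))  # buffer numbers available to allocate
        self.contents = {}               # allocated buffer number -> data

    def allocate(self):
        """Return a free data buffer number, or None if the pool is exhausted."""
        if not self.free:
            return None
        number = self.free.popleft()
        self.contents[number] = None
        return number

    def write(self, number, data):
        """Store data in a buffer that was previously allocated."""
        if number not in self.contents:
            raise KeyError(number)
        self.contents[number] = data

    def drain(self, number):
        """Hand the data to the lower-level node and free the buffer number."""
        data = self.contents.pop(number)
        self.free.append(number)
        return data
```

A request that arrives while the pool is exhausted gets `None` in this model; how real hardware would stall or retry such a request is outside what the description covers.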
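It should be noted that in this application the cache may be a shared cache, such as an L3 cache, and the node may be a memory controller, a bus controller, or the like. Because the cache and the node are implemented mainly by hardware circuits, the modules they contain in the embodiments above (such as the management module 501, the mark storage module 502, the data buffer number receiving module 503, the data processing module 504, the request processing module 601, the management module 602, the data processing module 603, the data buffer module 604, and the request buffer module 605) are all circuit modules. For a person skilled in the art, designing the corresponding circuit structure is straightforward once the function of a circuit module is understood, so this application does not describe the structure of these circuit modules further.
The embodiments above are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to those embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the embodiments can still be modified, or some of their technical features can be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.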

Claims (18)

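  1. A method for processing a non-cache data write request, wherein the method comprises:
    receiving, by a cache, a first non-cache data write request from a first processor and sending the first non-cache data write request to a node, wherein the first non-cache data write request comprises a first address, and the first non-cache data write request is used to apply to the node for a first data buffer number corresponding to the first address;
    if the cache determines that the first address is stored in the cache, obtaining, by the cache from the first processor, first data corresponding to the first non-cache data write request; and
    when the cache receives the first data buffer number from the node, sending, by the cache, the first data to the node.
  2. The method for processing a non-cache data write request according to claim 1, wherein the obtaining, by the cache from the first processor, first data corresponding to the first non-cache data write request comprises:
    sending, by the cache, a second data buffer number to the first processor, to instruct the first processor to send the first data to the cache, wherein the second data buffer number is allocated by the cache to the first processor based on the first non-cache data write request; and
    receiving, by the cache, the first data from the first processor.
  3. The method for processing a non-cache data write request according to claim 2, wherein the method further comprises:
    determining, by the cache, that the first address stored in the cache carries an exclusive mark; and
    sending, by the cache, a first completion response to the first processor, to instruct the first processor to release a buffer occupied by the first data.
  4. The method for processing a non-cache data write request according to claim 3, wherein before the cache receives the first non-cache data write request from the first processor, the method further comprises:
    receiving, by the cache, a second non-cache data write request from a second processor and sending the second non-cache data write request to the node, wherein the second non-cache data write request comprises the first address, and the first address is used by the node to determine whether the first address stored in the cache carries the exclusive mark;
    receiving, by the cache, a third data buffer number and a second completion response from the node, wherein the second completion response comprises the exclusive mark; and
    if the first address stored in the cache does not carry the exclusive mark, putting, by the cache, the exclusive mark on the first address.
  5. The method for processing a non-cache data write request according to claim 3 or 4, wherein the method further comprises:
    receiving, by the cache, a deletion instruction from the node; and
    deleting, by the cache, the exclusive mark of the first address, and sending a deletion confirmation instruction to the node.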
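  6. A method for processing a non-cache data write request, wherein the method comprises:
    receiving, by a node, a first non-cache data write request from a first cache, wherein the first non-cache data write request comprises a first address;
    allocating, by the node, a first data buffer number to a first processor based on the first non-cache data write request, and sending the first data buffer number to the first cache; and
    when the node receives, from the first cache, first data corresponding to the first non-cache data write request, writing, by the node, the first data into a data buffer corresponding to the first data buffer number, wherein the first data is obtained by the first cache from the first processor when the first cache determines that the first address is stored in the first cache.
  7. The method for processing a non-cache data write request according to claim 6, wherein after the node receives the first non-cache data write request from the first cache, the method further comprises:
    determining, by the node, that a correspondence between a number of the first cache and the first address is stored, wherein the correspondence indicates that the first address stored in the first cache carries an exclusive mark.
  8. The method for processing a non-cache data write request according to claim 7, wherein before the node receives the first non-cache data write request from the first cache, the method further comprises:
    receiving, by the node, a second non-cache data write request from the first cache, wherein the second non-cache data write request comprises the first address;
    storing, by the node, the correspondence between the number of the first cache and the first address; and
    allocating, by the node, a third data buffer number based on the second non-cache data write request, and sending the third data buffer number and a second completion response to the first cache, wherein the second completion response comprises the exclusive mark.
  9. The method for processing a non-cache data write request according to claim 7 or 8, wherein the method further comprises:
    receiving, by the node, a third non-cache data write request from a second cache, wherein the third non-cache data write request comprises the first address, and the second cache is different from the first cache;
    if the node determines that the node stores the correspondence between the number of the first cache and the first address, sending, by the node, a deletion instruction to the first cache, wherein the deletion instruction instructs the first cache to delete the exclusive mark of the first address and to send a deletion confirmation instruction to the node;
    receiving, by the node, the deletion confirmation instruction from the first cache; and
    deleting, by the node based on the deletion confirmation instruction, the correspondence between the number of the first cache and the first address.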
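  10. A cache, wherein the cache comprises:
    a management module, configured to receive a first non-cache data write request from a first processor and send the first non-cache data write request to a node, wherein the first non-cache data write request comprises a first address, and the first non-cache data write request is used to apply to the node for a first data buffer number corresponding to the first address; and
    a data processing module, configured to obtain, from the first processor, first data corresponding to the first non-cache data write request when a mark storage module determines that the first address is stored, wherein
    the data processing module is further configured to send the first data to the node when a data buffer number receiving module receives the first data buffer number.
  11. The cache according to claim 10, wherein
    the management module is further configured to send a second data buffer number to the first processor, to instruct the first processor to send the first data to the cache; and
    the data processing module is further configured to receive the first data from the first processor.
  12. The cache according to claim 11, wherein
    the mark storage module is configured to determine that the first address carries an exclusive mark; and
    the management module is further configured to send a first completion response to the first processor, to instruct the first processor to release a buffer occupied by the first data.
  13. The cache according to claim 12, wherein
    the management module is further configured to receive a second non-cache data write request from a second processor and send the second non-cache data write request to the node, wherein the second non-cache data write request comprises the first address, and the first address is used by the node to determine whether the first address stored in the cache carries the exclusive mark;
    the data buffer number receiving module is configured to receive a third data buffer number from the node;
    the mark storage module is further configured to receive a second completion response from the node, wherein the second completion response comprises the exclusive mark; and
    the mark storage module is further configured to put the exclusive mark on the first address when the first address stored in the mark storage module does not carry the exclusive mark.
  14. The cache according to claim 12 or 13, wherein
    the mark storage module is further configured to receive a deletion instruction from the node; and
    the mark storage module is further configured to delete the exclusive mark of the first address and send a deletion confirmation instruction to the node.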
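  15. A node, wherein the node comprises:
    a request processing module, configured to receive a first non-cache data write request from a first cache, wherein the first non-cache data write request comprises a first address; and
    a data buffer module, configured to allocate a first data buffer number to a first processor based on the first non-cache data write request received by the request processing module, and to instruct a management module to send the first data buffer number to the first cache, wherein
    the data buffer module is further configured to: when a data processing module receives first data corresponding to the first non-cache data write request, write the first data into a data buffer corresponding to the first data buffer number, wherein the first data is obtained by the first cache from the first processor when the first cache determines that the first address is stored in the first cache.
  16. The node according to claim 15, wherein
    the management module is configured to determine that a correspondence between a number of the first cache and the first address is stored, wherein the correspondence indicates that the first address stored in the first cache carries an exclusive mark.
  17. The node according to claim 16, wherein
    the request processing module is further configured to receive a second non-cache data write request from the first cache, wherein the second non-cache data write request comprises the first address;
    the management module is further configured to store the correspondence between the number of the first cache and the first address;
    the data buffer module is further configured to allocate a third data buffer number based on the second non-cache data write request received by the request processing module; and
    the management module is further configured to send the third data buffer number and a second completion response to the first cache, wherein the second completion response comprises the exclusive mark.
  18. The node according to claim 16 or 17, wherein
    the request processing module is further configured to receive a third non-cache data write request from a second cache, wherein the third non-cache data write request comprises the first address, and the second cache is different from the first cache;
    the management module is further configured to: when determining that the correspondence between the number of the first cache and the first address is stored, send a deletion instruction to the first cache, wherein the deletion instruction instructs the first cache to delete the exclusive mark of the first address and to send a deletion confirmation instruction to the node;
    the management module is further configured to receive the deletion confirmation instruction from the first cache; and
    the management module is further configured to delete, based on the deletion confirmation instruction, the stored correspondence between the number of the first cache and the first address.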
PCT/CN2019/120252 2019-11-22 2019-11-22 Method for processing non-cache data write request, cache, and node WO2021097802A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/CN2019/120252 WO2021097802A1 (zh) 2019-11-22 2019-11-22 Method for processing non-cache data write request, cache, and node
CN201980102202.8A CN114731282B (zh) 2019-11-22 2019-11-22 Method for processing non-cache data write request, cache, and node
EP19953302.7A EP4054140A4 (en) 2019-11-22 2019-11-22 METHOD OF PROCESSING A NON-BUFFER DATA WRITE REQUEST, BUFFER AND NODE
US17/749,612 US11789866B2 (en) 2019-11-22 2022-05-20 Method for processing non-cache data write request, cache, and node

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/120252 WO2021097802A1 (zh) 2019-11-22 2019-11-22 Method for processing non-cache data write request, cache, and node

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/749,612 Continuation US11789866B2 (en) 2019-11-22 2022-05-20 Method for processing non-cache data write request, cache, and node

Publications (1)

Publication Number Publication Date
WO2021097802A1 true WO2021097802A1 (zh) 2021-05-27

Family

ID=75979899

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120252 WO2021097802A1 (zh) 2019-11-22 2019-11-22 Method for processing non-cache data write request, cache, and node

Country Status (4)

Country Link
US (1) US11789866B2 (zh)
EP (1) EP4054140A4 (zh)
CN (1) CN114731282B (zh)
WO (1) WO2021097802A1 (zh)

Also Published As

Publication number Publication date
CN114731282A (zh) 2022-07-08
US20220276960A1 (en) 2022-09-01
US11789866B2 (en) 2023-10-17
EP4054140A1 (en) 2022-09-07
EP4054140A4 (en) 2022-11-16
CN114731282B (zh) 2023-06-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19953302

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019953302

Country of ref document: EP

Effective date: 20220530

NENP Non-entry into the national phase

Ref country code: DE