WO2023133830A1 - Shared storage system and apparatus, and method for invalidating cache data - Google Patents

Publication number
WO2023133830A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
signal
manager
address
processor
Prior art date
Application number
PCT/CN2022/072102
Other languages
French (fr)
Chinese (zh)
Inventor
何涛
于东浩
李瑛�
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN202280041800.0A priority Critical patent/CN117529899A/en
Priority to PCT/CN2022/072102 priority patent/WO2023133830A1/en
Publication of WO2023133830A1 publication Critical patent/WO2023133830A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/54 Store-and-forward switching systems

Definitions

  • the embodiments of the present application relate to the field of computer security, and in particular to a shared storage system, device and method for invalidating cached data.
  • memory space is shared between multiple processors. That is to say, any processor connected to the shared memory system can issue a request to access any address. Since the multiple processors share the memory space, multiple processors that issue a request to access the same address will get a copy of the same data. However, after one of the processors rewrites the copy, the copy stored by other processors in the address is different from the rewritten copy, that is, there is a cache coherence problem.
  • the memory manager responsible for maintaining the shared address space usually sends, based on the snoopy protocol, information to the other processors that have saved the data in the address, to notify them that the data in the address is invalid.
  • the processor processes more and more data in a clock cycle, and copies of the same large amount of data (that is, data in multiple addresses) are cached in multiple processors.
  • the memory manager needs to send multiple addresses corresponding to the large amount of data to other processors respectively.
  • the memory manager needs to send the ten addresses to other processors one by one, seriously occupying the bandwidth of the communication network in the shared memory system.
  • moreover, because the multiple addresses are sent to the other processors one by one, it may take multiple clock cycles for the other processors to obtain the information indicating that the stored copy is invalid; that is, there is a latency problem.
  • in a shared storage system, when multiple processors store copies of the same large amount of data and one of the processors rewrites its copy, how to efficiently invalidate the copies saved by the other processors, so as to improve the performance of the shared storage system, becomes a problem that needs to be solved.
  • the shared storage system, device and method for invalidating cached data provided by the present application can efficiently invalidate copies cached by other processors when a certain processor rewrites data in a certain address.
  • the present application adopts the following technical solutions.
  • the embodiment of the present application provides a shared storage system;
  • the shared storage system includes: a memory manager, a plurality of processors, and a shared memory;
  • the first processor in the plurality of processors is configured to send a request to the memory manager, the request indicating rewriting of the data in the first address, the first address being an address in the shared memory;
  • the memory manager is configured to send, based on the request, a first signal to the second processor in the plurality of processors, the first signal indicating invalidation of the first data stored by the second processor, the first data being a copy of the data in the first address space in the shared memory, and the first address being located in the first address space;
  • the second processor is configured to invalidate the first data based on the first signal.
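  • As a minimal sketch of the request and first-signal exchange described above (the class names, the `sharers` bookkeeping and the example addresses are hypothetical and are not taken from the application), the memory manager looks up the address space containing the first address and sends a single signal per sharing processor, rather than one signal per address:

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    name: str
    cache: dict = field(default_factory=dict)  # address -> cached copy

    def on_first_signal(self, start, end):
        """Drop every cached copy whose address lies in [start, end]."""
        self.cache = {a: v for a, v in self.cache.items() if not start <= a <= end}


@dataclass
class MemoryManager:
    spaces: list                                 # inclusive (start, end) address spaces
    sharers: dict = field(default_factory=dict)  # (start, end) -> processors holding a copy

    def handle_rewrite_request(self, requester, address):
        """Send one first signal per sharer and return how many were sent."""
        space = next(s for s in self.spaces if s[0] <= address <= s[1])
        others = [p for p in self.sharers.get(space, []) if p is not requester]
        for proc in others:
            proc.on_first_signal(*space)         # one signal covers the whole space
        self.sharers[space] = [requester]
        return len(others)


first, second = Processor("first"), Processor("second")
second.cache = {0x1001: "a", 0x1002: "b", 0x3001: "c"}
manager = MemoryManager(
    spaces=[(0x1001, 0x2000), (0x3001, 0x4000)],
    sharers={(0x1001, 0x2000): [first, second]},
)
signals_sent = manager.handle_rewrite_request(first, 0x1001)
```

  • In this sketch the second processor loses both copies inside the first address space after one signal, while its copy in the other address space is untouched.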
  • the memory manager can be a dedicated processor, or it can be integrated with the core in the central processing unit.
  • the memory manager may be, for example, the memory manager 21 , the memory manager 22 , the memory manager 23 or the memory manager 24 shown in FIG. 1 .
  • the multiple processors are, for example, processor 01 , processor 02 , processor 03 , or processor 04 shown in FIG. 1 .
  • the shared memory is, for example, the memory 11 , the memory 12 , the memory 13 or the memory 14 shown in FIG. 1 .
  • the memory manager is used to manage the memory coupled to it. For example, in FIG. 1, the memory manager 21 manages the memory 11, the memory manager 22 manages the memory 12, the memory manager 23 manages the memory 13, and the memory manager 24 manages the memory 14.
  • the management here can refer to recording which processors save the data in each address space, and, before one of the processors rewrites the data in a certain address space, invalidating the data in that address space saved in the other processors.
  • the first address space may be pre-divided by the memory manager, and may also be called a declared space area.
  • the size of the first address space may be dynamically adjusted. For example, the memory manager may first recover all the address spaces allocated to each processor, readjust the size of the address space of the shared memory, and broadcast the adjusted size of the address space and the identifier corresponding to the address space to each processor.
  • the first signal may include various implementations.
  • in the first possible implementation manner, the first signal may include the address range of the first address space; in the second possible implementation manner, the first address space may be mapped to a preset identifier, and the first signal may include the preset identifier.
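  • The two implementations of the first signal can be sketched as follows. The dictionary layout of the signal, the `SPACE_BY_ID` table and its contents are assumptions made for illustration; the application only states that the signal carries either the address range or a preset identifier mapped to the address space:

```python
# Hypothetical identifier table, as it might be broadcast by the memory manager:
# preset identifier -> inclusive (start, end) address range.
SPACE_BY_ID = {1: (0x0000, 0x1000), 2: (0x1001, 0x2000)}


def resolve_first_signal(signal):
    """Return the inclusive address range that a first signal refers to."""
    if "range" in signal:
        # First implementation: the signal carries the address range itself.
        return tuple(signal["range"])
    # Second implementation: the signal carries only the preset identifier.
    return SPACE_BY_ID[signal["space_id"]]


by_range = resolve_first_signal({"range": (0x1001, 0x2000)})
by_identifier = resolve_first_signal({"space_id": 2})
```

  • Both forms resolve to the same address space; the identifier form is shorter on the wire but requires every processor to hold the same table.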
  • the first address space includes a plurality of addresses.
  • in the conventional technology, when the memory manager used to manage the shared memory needs to invalidate the data copy in the first address space stored in the first processor, it needs to send multiple signals to the first processor, with one signal indicating invalidation of the data at one address; this severely consumes the bandwidth of the interconnection network.
  • the memory manager provided by the embodiment of the present application can invalidate the copy of data in the first address space stored in the first processor by sending only the first signal to the first processor. This reduces the network bandwidth occupied and improves bandwidth utilization, thereby improving the performance of the shared storage system.
  • the second processor is specifically configured to: decompose the first address space into multiple address ranges based on the first signal and the size of a preset data segment, to obtain multiple data segments corresponding to the first data, wherein one address range corresponds to one data segment; and invalidate the multiple data segments.
  • the second processor may invalidate multiple data segments at the same time, or invalidate the multiple data segments in a time-sharing manner.
  • data is stored in the cache in the form of a data segment, such as a cache line.
  • the processor performs data operations in units of data segments, such as modifying the state of data, storing data, or reading data.
  • a data segment usually corresponds to an address range.
  • the first address space includes multiple address ranges, that is, multiple data segments are stored.
  • after the second processor receives the information indicating invalidation of the copy of data in the first address space, it decomposes the first address space into a plurality of address ranges to obtain a plurality of data segments, and then invalidates the plurality of data segments. In this way, a traditional storage protocol (such as the snoopy protocol) can be used to invalidate the multiple data segments and thereby invalidate the first address space, without additional hardware changes to the part of the second processor that performs the invalidation, which helps save design cost.
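  • The decomposition step can be illustrated with a short sketch. The function name and the 64-byte segment size are assumptions (the application only refers to "the size of a preset data segment", with a cache line given as an example of a data segment):

```python
def decompose(start, end, segment_size=64):
    """Split the inclusive range [start, end] into segment-sized address ranges."""
    ranges = []
    base = start - start % segment_size  # align down to a segment boundary
    while base <= end:
        ranges.append((base, base + segment_size - 1))
        base += segment_size
    return ranges


# A 256-byte address space yields four 64-byte address ranges.
address_ranges = decompose(0x1000, 0x10FF)
```

  • Each resulting address range corresponds to one data segment that the cache can invalidate with its ordinary per-segment mechanism.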
  • the second processor is specifically configured to: when the state of at least some data segments among the plurality of data segments is one of a shared state and an exclusive state, modify the state of the at least some data segments to an invalid state.
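  • A minimal sketch of this state change follows. The `State` enumeration and the function name are hypothetical; the four states mirror the shared, exclusive, invalid and rewritten states named in this description:

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"
    REWRITTEN = "rewritten"


def invalidate_clean_segments(states):
    """Mark shared or exclusive segments invalid; leave other states untouched."""
    return [
        State.INVALID if s in (State.SHARED, State.EXCLUSIVE) else s
        for s in states
    ]


after = invalidate_clean_segments([State.SHARED, State.EXCLUSIVE, State.REWRITTEN])
```

  • Segments in the rewritten state are deliberately left alone here, because they must be written back to the shared memory before they can be invalidated.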
  • the second processor includes a cache manager and an invalidation manager; the invalidation manager generates multiple invalidation signals based on the multiple address ranges, one address range corresponding to one invalidation signal, and transmits the plurality of invalidation signals to the cache manager in a time-division manner; the cache manager invalidates the plurality of data segments based on the plurality of invalidation signals.
  • Both the cache manager and the invalidation manager can be implemented by hardware circuits. Operations on the cached data in the second processor (such as obtaining data from the memory for caching, writing data into the memory, and modifying the state of the cached data) are all implemented by the cache manager.
  • the cache manager operates on data in units of data segments. In the traditional technology, no invalidation manager is set, and the memory manager provides multiple invalidation signals corresponding to multiple address ranges to the cache manager in a time-sharing manner, which seriously occupies bandwidth; in the embodiment of the present application, by setting the invalidation manager, the invalidation manager, instead of the memory manager, generates the multiple invalidation signals and provides them to the cache manager.
  • because the invalidation manager and the cache manager are set in the same processor, compared with the prior art, the bandwidth of the interconnection network outside the processor chip does not need to be occupied, and the transmission rate is high.
  • in addition, there is no need to make any changes to the structure of the cache manager; it can still receive signals according to the traditional storage protocol and operate on data in units of data segments, saving design costs.
  • the invalidation manager is further configured to decompose the first address space into multiple address ranges based on the first signal and a size of a preset data segment.
  • the cache manager is further configured to send multiple responses to the multiple signals to the invalidation manager, where one response corresponds to one data segment, and the first response in the plurality of responses indicates that the first data segment in the plurality of data segments has not been rewritten;
  • the invalidation manager is further configured to transmit to the memory manager based on the plurality of responses a second signal indicating completion of invalidation of the first data.
  • in the traditional technology, no invalidation manager is provided, and the cache manager needs to transmit multiple response signals to the memory manager, occupying the bandwidth of the communication network.
  • when the invalidation manager receives multiple responses, and the multiple responses indicate that the corresponding data segments have not been rewritten, it transmits a signal to the memory manager indicating that the invalidation of the first data is completed; that is, the invalidation manager only needs to transmit one signal to the memory manager.
  • in this way, the bandwidth occupied on the communication network can be reduced and the utilization rate of the bandwidth can be improved.
  • the second response in the multiple responses includes the second data segment in the multiple data segments; the invalidation manager is further configured to write the second data segment back to the corresponding address range in the shared memory.
  • the data in the second data segment is rewritten, and the cache manager needs to write the rewritten data back to the memory.
  • the cache manager transmits the second response carrying the second data segment to the invalidation manager, so that the invalidation manager writes the second data segment back to the memory.
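  • The response handling described above can be sketched as one function that plays both roles. The function name, the tuple and dictionary layouts and the `"invalidation_complete"` label are hypothetical; the sketch only shows that each data segment produces one response, that a rewritten segment is carried in its response and written back, and that a single second signal is returned to the memory manager:

```python
def run_invalidation(segments, shared_memory):
    """segments maps (start, end) -> (state, data) for cached data segments.

    Returns the single second signal reported to the memory manager.
    """
    responses = []
    for addr_range, (state, data) in list(segments.items()):
        # Cache manager: one response per invalidation signal.
        if state == "rewritten":
            responses.append((addr_range, data))  # response carries the segment
        else:
            responses.append((addr_range, None))  # segment was not rewritten
        segments[addr_range] = ("invalid", None)

    # Invalidation manager: write back rewritten segments, then report once.
    for addr_range, data in responses:
        if data is not None:
            shared_memory[addr_range] = data
    return {"type": "invalidation_complete", "segments": len(responses)}


cached = {(0, 63): ("shared", "a"), (64, 127): ("rewritten", "b")}
memory = {}
second_signal = run_invalidation(cached, memory)
```

  • Only one message leaves the processor regardless of how many data segments were invalidated, which is the bandwidth saving the description attributes to the invalidation manager.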
  • the first invalidation signal among the plurality of invalidation signals indicates invalidation of the first data segment; and the cache manager is specifically configured to: before receiving the first invalidation signal, write the first data segment back to the shared memory; and, in response to the third signal sent by the memory manager, send the first response to the invalidation manager, the third signal indicating that the storage of the first data segment is complete.
  • the memory manager is further configured to: send a fourth signal to the first processor, where the fourth signal indicates that the first processor is allowed to rewrite the data in the first address.
  • the third processor among the multiple processors is configured to send a request to the memory manager, where the request indicates to rewrite the data in the second address, and the second address is an address in the shared memory;
  • the memory manager, based on the request, sends a fifth signal to the cache manager, the fifth signal indicating invalidation of the second data stored by the cache manager, the second data being a copy of the data in the second address;
  • the cache manager, based on the fifth signal, sends a second response to the memory manager, the second response indicating that the second data is modified or not modified.
  • when only the data in a single address needs to be invalidated, the memory manager can directly transmit the signal indicating that the data copy in the second address is invalid to the cache manager; there is no need to decompose the address range through the invalidation manager at this time, which can improve the signal transmission speed and the working efficiency of the shared memory system.
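  • The routing choice between the two paths can be sketched as below. The `kind` field and the component labels are hypothetical names introduced only for this illustration:

```python
def route_invalidation(signal):
    """Return which component of the receiving processor handles the signal."""
    if signal["kind"] == "address_space":
        # First signal: covers many data segments, so it is decomposed first.
        return "invalidation_manager"
    if signal["kind"] == "single_address":
        # Fifth signal: covers one data segment, so no decomposition is needed.
        return "cache_manager"
    raise ValueError("unknown signal kind: " + str(signal["kind"]))


first_signal_target = route_invalidation({"kind": "address_space", "space_id": 2})
fifth_signal_target = route_invalidation({"kind": "single_address", "address": 0x3800})
```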
  • an embodiment of the present application provides a device, the device includes a memory manager, and the memory manager is configured to: receive a first request from a first processor, where the first request indicates to rewrite the data in the first address, the first address is an address in a shared memory, and the shared memory is managed by the memory manager; and, based on the first request, send a first signal to the second processor, where the first signal indicates invalidation of the first data saved by the second processor, the first data is a copy of the data in the first address space in the shared memory, the first address is located in the first address space, and an address space in the shared memory is allowed to be accessed by the first processor and the second processor.
  • the device provided in the embodiment of the present application may only be provided with a memory manager dedicated to managing shared memory.
  • the device provided in this embodiment of the present application may be a processor, and the processor may include, for example, a processor core in addition to a memory manager.
  • the first address space includes a plurality of addresses.
  • in the conventional technology, when the memory manager used to manage the shared memory needs to invalidate the data copy in the first address space stored in the first processor, it needs to send multiple signals to the first processor, with one signal indicating invalidation of the data at one address; this severely consumes the bandwidth of the interconnection network.
  • the memory manager provided by the embodiment of the present application can invalidate the copy of data in the first address space stored in the first processor by sending only the first signal to the first processor. This reduces the network bandwidth occupied and improves bandwidth utilization, thereby improving the performance of the shared storage system.
  • the memory manager is further configured to: receive a second signal from the second processor, the second signal indicating completion of invalidation of the first data; and, based on the second signal, send a third signal to the first processor, the third signal indicating that the first processor is allowed to rewrite the data in the first address.
  • the memory manager is further configured to: monitor that the second processor stores a first data segment in the shared memory, where the first data segment is a piece of data in the first data; and, in response to the completion of storage of the first data segment, send a fourth signal to the second processor, where the fourth signal indicates that the storage of the first data segment is complete.
  • the memory manager is further configured to: receive a second request from a third processor, where the second request indicates to rewrite the data in the second address, and the second address is an address in the shared memory; and, based on the second request, send a fifth signal to the second processor, where the fifth signal indicates invalidation of the second data, and the second data is a copy of the data in the second address.
  • an embodiment of the present application provides a device, the device includes a processor, and the processor is configured to: receive a first signal from a memory manager, where the first signal is used to indicate invalidation of the saved first data, the first data is a copy of data in a first address space in a shared memory, the first address is located in the first address space, and the shared memory is managed by the memory manager; and, based on the first signal, invalidate the first data.
  • the processor is specifically configured to: decompose the first address space into multiple address ranges based on the first signal and the size of a preset data segment, to obtain multiple data segments corresponding to the first data, wherein one address range corresponds to one data segment; and invalidate the multiple data segments.
  • the processor is specifically configured to: when the state of at least some of the data segments in the multiple data segments is one of the shared state or the exclusive state, modify the state of the at least some data segments to an invalid state.
  • the processor includes a cache manager and an invalidation manager; the invalidation manager generates multiple invalidation signals based on the multiple address ranges, one address range corresponding to one invalidation signal, and transmits the plurality of invalidation signals to the cache manager in a time-division manner; the cache manager invalidates the plurality of data segments based on the plurality of invalidation signals.
  • the cache manager is further configured to send multiple responses to the multiple invalidation signals to the invalidation manager, where one response corresponds to one data segment, and the first response in the plurality of responses indicates that the first data segment in the plurality of data segments has not been rewritten; the invalidation manager is further configured to transmit, based on the plurality of responses, a second signal to the memory manager, the second signal indicating completion of invalidation of the first data.
  • the second response among the multiple responses carries the second data segment among the multiple data segments, and the state of the second data segment is a rewritten state; the invalidation manager is further configured to write the second data segment back into the corresponding address range in the shared memory when the state of the second data segment is the rewritten state.
  • the first invalidation signal among the plurality of invalidation signals indicates invalidation of the first data segment; and the cache manager is specifically configured to: before receiving the first invalidation signal, write the first data segment back to the shared memory; and, in response to the fourth signal sent by the memory manager, send the first response to the invalidation manager, where the fourth signal indicates that the storage of the first data segment is complete.
  • the cache manager is further configured to: receive a fifth signal from the memory manager, the fifth signal indicating invalidation of the second data, the second data being a copy of the data in the second address; and, in response to the fifth signal, send a third response to the memory manager, where the third response indicates that the second data is modified or unmodified.
  • an embodiment of the present application provides a method for invalidating cached data, and the method for invalidating cached data includes: receiving a first request from a first processor, the first request indicating rewriting of the data in the first address, where the first address is an address in the shared memory; and, based on the first request, sending a first signal to the second processor, where the first signal indicates invalidation of the first data stored by the second processor, the first data is a copy of data in a first address space in the shared memory, and the first address is located in the first address space.
  • the method further includes: receiving a second signal from the second processor, the second signal indicating completion of invalidation of the first data; and, based on the second signal, sending a third signal to the first processor, where the third signal indicates that the first processor is allowed to rewrite the data in the first address.
  • the embodiment of the present application provides a method for invalidating cached data, and the method for invalidating cached data includes: receiving a first signal from the memory manager, where the first signal is used to indicate invalidation of the stored first data, the first data is a copy of data in a first address space in the shared memory, and the first address is located in the first address space; and, based on the first signal, invalidating the first data.
  • the invalidating the first data based on the first signal includes: based on the first signal and the size of a preset data segment, decomposing the first address space into multiple address ranges to obtain multiple data segments corresponding to the first data, wherein one address range corresponds to one data segment; and invalidating the multiple data segments.
  • the invalidating the plurality of data segments includes: when the state of at least some of the data segments in the plurality of data segments is one of a shared state or an exclusive state, modifying the state of the at least some data segments to an invalid state.
  • the method further includes: when the state of each data segment in the plurality of data segments is an invalid state, transmitting a second signal to the memory manager, where the second signal indicates completion of invalidation of the first data.
  • before transmitting the second signal to the memory manager, the method further includes: when the state of the second data segment among the plurality of data segments is a rewritten state, writing the second data segment back into the corresponding address range in the shared memory.
  • FIG. 1 is a schematic diagram of a hardware structure of a shared storage system provided by an embodiment of the present application;
  • FIG. 2 is a schematic diagram of the mapping relationship between the memory and the global address space provided by the embodiment of the present application;
  • FIG. 3 is a schematic diagram of multiple address ranges divided by the address space of the memory 11 provided by the embodiment of the present application;
  • FIG. 4 is a flowchart of the interaction between components in the shared storage system provided by the embodiment of the present application;
  • FIG. 5 is another schematic diagram of the hardware structure of the shared storage system provided by the embodiment of the present application;
  • FIG. 6 is a flowchart of the interaction between components in the shared storage system shown in FIG. 5 provided by the embodiment of the present application;
  • FIG. 7 is another flowchart of the interaction between components in the shared storage system shown in FIG. 5 provided by the embodiment of the present application;
  • FIG. 8 is another flowchart of the interaction between components in the shared storage system shown in FIG. 5 provided by the embodiment of the present application;
  • FIG. 9 is a flowchart of a method for invalidating cached data provided by an embodiment of the present application;
  • FIG. 10 is another flowchart of the method for invalidating cached data provided by the embodiment of the present application.
  • words such as “exemplary” or “for example” are used to present examples, illustrations or explanations. Any embodiment or design scheme described as “exemplary” or “for example” in the embodiments of the present application shall not be interpreted as being more preferred or more advantageous than other embodiments or design schemes. Rather, the use of words such as “exemplary” or “for example” is intended to present related concepts in a concrete manner.
  • “plurality” means two or more. For example, multiple processors refers to two or more processors.
  • the shared storage system 100 described in the embodiment of the present application may be a symmetric multiprocessing (SMP) system.
  • the shared storage system 100 includes multiple processors, for example, 2, 3, etc., which is not specifically limited in this embodiment of the present application.
  • the plurality of processors may be central processing units (CPU) or dedicated processors (such as an image signal processor, an artificial intelligence processor or a neural network processor), and the embodiment of the present application does not specifically limit the form of the processor.
  • Components (such as processors and memory) connected to the shared storage system 100 may communicate through an interconnection network, and the interconnection network may include, but is not limited to, one of the following: a single bus, multiple buses, or a crossbar.
  • the shared storage system 100 also includes a memory, such as but not limited to SDRAM, DDR, and the like.
  • the memory can be set independently and managed by a dedicated memory manager, or it can be coupled to one of the processors and managed by that processor, or it can be divided into multiple parts that are distributed among and coupled to multiple processors, with each part managed by the processor to which it is coupled.
  • the memory forms a globally unified address space, which is shared by all processors in the shared storage system 100 , and each processor can initiate a request to access any address space.
  • the access to any address space mentioned here refers to reading instructions or data from the accessed address space, or writing data to the accessed address space. Processors that issue requests to read the same address space will get the same data.
  • the above-mentioned memory management refers to recording which processors store data in each address space, and before one of the processors rewrites data in a certain address space, invalidates the data in the address space stored in other processors.
  • the following takes as an example the case in which the shared storage system 100 includes four processors and each processor has a memory attached. With reference to FIG. 1, the shared storage system 100 provided by the embodiment of the present application is described in detail.
  • FIG. 1 is a schematic diagram of a hardware structure of a shared storage system 100 provided in an embodiment of the present application.
  • a shared storage system 100 includes four processors, namely a processor 01 , a processor 02 , a processor 03 and a processor 04 .
  • the four processors are connected through an interconnection network.
  • the processor 01 to the processor 04 may be respectively integrated into one or more chips.
  • for example, processor 01 to processor 04 may be integrated into different chips; for another example, processor 01 to processor 04 may also be integrated into one chip.
  • in the following description, the processor 01 to the processor 04 are independent processors that are respectively integrated into different chips, but this is not intended to limit the solution.
  • Each processor has a memory attached to it. That is to say, the processor 01 is coupled to the memory 11 , the processor 02 is coupled to the memory 12 , the processor 03 is coupled to the memory 13 , and the processor 04 is coupled to the memory 14 .
  • Each memory can also optionally be integrated into the same chip as the processor coupled to it.
  • the memory 11 to the memory 14 form the global address space of the shared memory system 100, which is shared by the processor 01 to the processor 04. Memory 11 to memory 14 are each mapped to a certain block in the global address space. FIG. 2 schematically shows a mapping relationship between each memory and the global address space.
  • the storage space of memory 11 is mapped to 0x0000-0x4000 in the global address space
  • the storage space of memory 12 is mapped to 0x4001-0x8000 in the global address space
  • the storage space of memory 13 is mapped to 0x8000-0x1201 in the global address space
  • the storage space of memory 14 is mapped to 0x1200-0x1600 in the global address space.
  • Each processor is provided with one or more processor cores (cores), and the processor cores are used to obtain instructions and data from the global address space to complete various tasks that need to be processed. As shown schematically in FIG. 1, one core is set in processor 01, one core c2 is set in processor 02, one core c3 is set in processor 03, and one core c4 is set in processor 04.
  • Each processor is also integrated with a memory manager, which can also be called a local agent (home agent), wherein the processor 01 is integrated with a memory manager 21, the processor 02 is integrated with a memory manager 22, the processor 03 is integrated with a memory manager 23, and the processor 04 is integrated with a memory manager 24.
  • Each memory manager can be implemented by a hardware circuit. Each memory manager is used to manage the memory to which it is coupled. Taking the memory manager 21 as an example, the memory manager 21 is used to record which processors have saved the data in the address space 0x0000-0x4000 shown in FIG. 2, and, before any of the processor 02 to the processor 04 rewrites the data in a certain address space, to invalidate the data in that address space held by the other processors.
  • processor 02 and processor 03 both store data at address 0x3800.
  • the processor 02 needs to rewrite the data in the address 0x3800.
  • the processor 01 transmits to the processor 03 a signal indicating invalidation of the data at the address 0x3800. Cache consistency in the shared storage system 100 can thereby be guaranteed.
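  • The bookkeeping described above can be sketched as follows. This is a minimal illustration only, assuming a hypothetical `Directory` class with per-address tracking; the class, its method names and the string processor labels are not taken from the application, which describes a hardware circuit.

```python
from collections import defaultdict


class Directory:
    """Hypothetical sharer directory kept by one memory manager (home agent)."""

    def __init__(self):
        # address -> set of processors currently holding a copy
        self.sharers = defaultdict(set)

    def record_read(self, processor, address):
        """Remember that `processor` now holds a copy of `address`."""
        self.sharers[address].add(processor)

    def request_write(self, writer, address):
        """Return the processors whose copies must be invalidated first."""
        targets = sorted(self.sharers[address] - {writer})
        # Once those copies are invalidated, only the writer holds the data.
        self.sharers[address] = {writer}
        return targets


directory = Directory()
directory.record_read("processor 02", 0x3800)
directory.record_read("processor 03", 0x3800)
to_invalidate = directory.request_write("processor 02", 0x3800)
print(to_invalidate)  # ['processor 03']
```

  • Here processor 02 is the writer, so only processor 03 receives an invalidation signal, matching the example at address 0x3800.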
  • each memory manager can communicate directly with other processors (or with components set in those processors, such as the processor core, data invalidation manager and cache manager described below) through an interconnection network, and can communicate directly with components in its own processor (such as the processor core, cache manager and data invalidation manager) through the internal bus.
  • each memory manager may further divide the space it manages into multiple address spaces in advance, for example at page granularity.
  • At least some of the plurality of address spaces are declared address spaces. That is to say, when a processor needs to read the data at a certain address (or multiple addresses) in a declared address space, the memory manager provides all the data in that declared address space to the processor.
  • when a processor needs to rewrite the data at a certain address (or multiple addresses) in a declared address space, if other processors hold a copy of the data in the declared address space, the other processors need to invalidate all the data they hold from that declared address space.
  • the address space 0x0000 ⁇ 0x4000 mapped by the memory 11 shown in FIG. 2 can be further divided into four address spaces.
  • the four address spaces may be, for example, address spaces 0x0000-0x1000, address spaces 0x1001-0x2000, address spaces 0x2001-0x3000, and address spaces 0x3001-0x4000.
  • the address space 0x0000-0x1000, the address space 0x1001-0x2000 and the address space 0x2001-0x3000 are declared address spaces, respectively recorded as address space p1, address space p2, and address space p3.
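  • As an illustration of the division above, the following sketch looks up which declared address space, if any, contains a given address. The bounds are taken from the text and treated as inclusive; the table and function names are assumptions made for this example.

```python
# Declared address spaces of memory 11, with inclusive bounds as written above.
# The fourth space, 0x3001-0x4000, is not declared and so is not listed.
DECLARED_SPACES = {
    "p1": (0x0000, 0x1000),
    "p2": (0x1001, 0x2000),
    "p3": (0x2001, 0x3000),
}


def declared_space_of(address):
    """Return the name of the declared space holding `address`, or None."""
    for name, (low, high) in DECLARED_SPACES.items():
        if low <= address <= high:
            return name
    return None


print(declared_space_of(0x0800))  # p1
print(declared_space_of(0x3800))  # None
```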
  • the memory manager 21 provides all data in address space p1 to processor 02 and processor 03 respectively. Based on the four address spaces into which the memory 11 is divided as shown in Figure 3, and taking as an example the processor 02 further sending the memory manager 21 a request to rewrite the data in addresses 0x0000-0x0800, the manner in which the memory manager 21 invalidates the copy stored in the processor 03 is described below in combination with the interaction process shown in Figure 4.
  • Step 401 the processor 02 sends a request q1 to the memory manager 21 .
  • the request q1 is used to request to rewrite the data in addresses 0x0000 ⁇ 0x0800.
  • Step 402 the memory manager 21 sends a signal s1 to the processor 03 based on the request q1. Specifically, after receiving the request q1, the memory manager 21 first finds out that addresses 0x0000-0x0800 are located in the address space p1. Then, the memory manager 21 can query and find out that the processor 03 has saved a copy of the data in the address space p1. Finally, the memory manager 21 sends a signal s1 to the processor 03 indicating that the processor 03 invalidates the copy of the data in the address space p1. In an optional implementation manner, the memory manager 21 may directly send the address range 0x0000-0x1000 corresponding to the address space to the processor 03 .
  • each declared address space can be mapped to a preset identifier, and each processor can store the mapping relationship between the declared address spaces and the preset identifiers. For example, the preset identifiers corresponding to the address space p1, the address space p2 and the address space p3 are page1, page2 and page3 respectively; the memory manager 21 may then send the preset identifier page1 corresponding to the address space p1 to the processor 03.
  • the processor 03 invalidates the saved copy of the data in the address space p1 based on the signal s1.
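  • The two forms of the signal s1 described above (an explicit address range, or a preset identifier that the receiver maps back to a range) can be sketched as below. The `InvalidateSignal` type and `resolve` helper are hypothetical names introduced for illustration; only the identifiers page1 to page3 and their ranges come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Mapping stored by each processor: preset identifier -> declared address space.
PAGE_IDS = {
    "page1": (0x0000, 0x1000),
    "page2": (0x1001, 0x2000),
    "page3": (0x2001, 0x3000),
}


@dataclass(frozen=True)
class InvalidateSignal:
    """Hypothetical payload of s1: exactly one of the two fields is set."""
    address_range: Optional[Tuple[int, int]] = None
    page_id: Optional[str] = None


def resolve(signal):
    """Receiver side: recover the address range whose copy is invalidated."""
    if signal.address_range is not None:
        return signal.address_range
    return PAGE_IDS[signal.page_id]


by_range = InvalidateSignal(address_range=(0x0000, 0x1000))
by_identifier = InvalidateSignal(page_id="page1")
```

  • Both signals resolve to the same range; the identifier form simply carries less information across the interconnection network.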
  • the copy of the data in the address space p1 saved by the processor 03 may include multiple data segments, and one data segment may be, for example, a cache line (cache line).
  • Each data segment is provided with status information indicating the status of the data segment; such status information includes but is not limited to the following: modified, unique (exclusive), shared and invalid.
  • invalidating the copy of the data in the address space p1, as mentioned above, may mean that when the status information of a data segment is one of modified, unique and shared, the processor 03 changes the status information to invalid.
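  • A minimal sketch of this state change is shown below, assuming a hypothetical dictionary that maps each data segment's base address to its state. It deliberately ignores the write-back of modified segments, which the text treats separately.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "modified"
    UNIQUE = "unique"  # called "exclusive" in MESI terminology
    SHARED = "shared"
    INVALID = "invalid"


def invalidate_copy(segments):
    """Mark every still-valid segment INVALID; return how many changed."""
    changed = 0
    for base, state in segments.items():
        if state is not State.INVALID:
            segments[base] = State.INVALID
            changed += 1
    return changed


copy_of_p1 = {0x0000: State.SHARED, 0x0100: State.UNIQUE, 0x0200: State.INVALID}
changed = invalidate_copy(copy_of_p1)
```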
  • the processors communicate with each other based on the snoopy protocol.
  • the processor usually operates on data at the granularity of data segments (for example, reading data, writing data, or modifying the identification bits corresponding to data). That is to say, assume that the processor 03 stores the data in addresses 0x0000-0x0800, and the granularity of the data that can be manipulated each time is 0x0100.
  • based on the snoopy protocol, the memory manager 21 decomposes the addresses 0x0000-0x0800 at a granularity of 0x0100 to obtain eight address ranges. Then, the memory manager 21 transmits to the processor 03 one signal at a time, each indicating that the data in one address range is invalid; that is, the memory manager 21 needs to transmit signals to the processor 03 eight times, seriously occupying the bandwidth of the interconnection network. In addition, if the eight address ranges are sent to the processor 03 one by one, the processor 03 may obtain the signal indicating that the data in a certain address range is invalid only after several clock cycles, so there is a delay problem.
  • the memory manager that manages the declared address space can provide a signal indicating invalidation of the copy to the other processors that store the copy. Compared with a memory manager that needs to send multiple signals, the embodiment of the present application can invalidate a large amount of data by sending a signal once, releasing the bandwidth of the interconnection network and helping to improve bandwidth utilization.
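  • The difference in signal count can be made concrete with a short sketch. It assumes the addresses 0x0000-0x0800 are treated as the half-open range [0x0000, 0x0800) so that a granularity of 0x0100 yields exactly the eight ranges mentioned above; the function name is an assumption.

```python
def split_into_segments(start, end, segment_size):
    """Split the half-open range [start, end) into segment-sized sub-ranges."""
    return [
        (low, min(low + segment_size, end))
        for low in range(start, end, segment_size)
    ]


per_segment = split_into_segments(0x0000, 0x0800, 0x0100)

# Per-segment scheme: one signal over the interconnection network per range.
signals_per_segment = len(per_segment)
# Declared-address-space scheme: one signal naming the whole space.
signals_per_space = 1

print(signals_per_segment, signals_per_space)  # 8 1
```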
  • each processor in the shared memory system 100 shown in FIG. 5 further includes a cache manager.
  • a cache manager can also be called a cache agent.
  • a cache and a cache controller are provided in the cache manager.
  • under the control of the cache controller, the cache can obtain from the global address space of the shared storage system 100 the instructions and data required for the operation of the processor core and store them, and can write data to any address space in the global address space, that data being dirty data that was held in the cache and generated by the operation of the processor. That is, if a processor has a copy that needs to be invalidated, that copy is kept in the cache manager.
  • Fig. 1 schematically shows that a cache manager 31 is set in the processor 01, a cache manager 32 is set in the processor 02, a cache manager 33 is set in the processor 03, and a cache manager 34 is set in the processor 04.
  • Each cache manager can be implemented by a hardware circuit.
  • the cache manager operates on data at the granularity of data segments based on the snoopy protocol.
  • the cache manager performs an invalidation operation, at data segment granularity, on the multiple data segments included in the copy to be invalidated.
  • the invalidation operation mentioned here refers to changing the status of a data segment to invalid when its status is one of modified, unique and shared.
  • each processor may also be provided with an invalidation manager, and the invalidation manager may also be called a flush engine (flush engine).
  • An invalidation manager 41 is set in the processor 01
  • an invalidation manager 42 is set in the processor 02
  • an invalidation manager 43 is set in the processor 03
  • an invalidation manager 44 is set in the processor 04 .
  • Each invalidation manager may be a hardware circuit.
  • the invalidation manager is used to obtain a signal from the memory manager, the signal indicating invalidation of a copy of the data in a declared address space; based on the signal, the invalidation manager decomposes the declared address space to be invalidated into multiple address ranges, one address range corresponding to one data segment; the invalidation manager then provides the multiple address ranges to the cache manager one by one.
  • the invalidation manager 43 receives a signal from the memory manager 21 indicating that the data in the address space p1 is to be invalidated; assuming that the granularity for storing data in the cache manager is 0x0100, the invalidation manager 43, based on the received signal, decomposes the address space p1 into ten address ranges, namely 0x0000-0x0100, 0x0101-0x0200, ... 0x0901-0x1000, and then provides the ten address ranges to the cache manager 33 in a time-sharing manner, so that the cache manager 33 invalidates the data in the ten address ranges.
  • the memory manager that manages the address space can provide a signal indicating invalidation of the data in the address space to the invalidation manager; the invalidation manager further decomposes the address region corresponding to the address space and provides the resulting address ranges to the on-chip cache manager. Therefore, compared with the traditional technology in which the memory manager needs to send multiple signals to the cache manager, the embodiment of the present application can invalidate a large amount of data by sending a signal once, which releases the bandwidth of the interconnection network and helps to improve bandwidth utilization.
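  • The division of labour between the invalidation manager and the on-chip cache manager might look like the sketch below. Both classes are hypothetical software stand-ins for hardware circuits, and the small address space 0x0000-0x0400 (half-open) is chosen only to keep the example short.

```python
class CacheManager:
    """Hypothetical cache agent tracking valid segments by base address."""

    def __init__(self, valid_bases):
        self.valid = set(valid_bases)

    def invalidate(self, low, high):
        """Invalidate the single segment covering [low, high)."""
        self.valid.discard(low)
        return ("done", low, high)


class InvalidationManager:
    """Hypothetical flush engine sitting between home agent and cache agent."""

    def __init__(self, cache_manager, segment_size):
        self.cache_manager = cache_manager
        self.segment_size = segment_size

    def on_signal(self, start, end):
        """Expand one signal for [start, end) into per-segment invalidations."""
        responses = []
        for low in range(start, end, self.segment_size):
            high = min(low + self.segment_size, end)
            responses.append(self.cache_manager.invalidate(low, high))
        return len(responses)


cache = CacheManager({0x0000, 0x0100, 0x2000})
engine = InvalidationManager(cache, segment_size=0x0100)
issued = engine.on_signal(0x0000, 0x0400)  # one signal crossed the network
```

  • Only the single call to `on_signal` stands for traffic on the interconnection network; the four per-segment invalidations stay inside the processor.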
  • the buffer manager can reduce the signal transmission delay compared with transmitting multiple signals.
  • the shared storage system 100 provided by the embodiment of the present application can improve the bandwidth utilization rate of the interconnection network and reduce signal delay, thereby improving the performance of the shared storage system 100 .
  • when a data copy stored in a certain processor is invalidated, if the data is dirty data, the cache manager of the processor needs to write the dirty data back into the memory. Based on this, in an optional implementation manner of the embodiment of the present application, when the data to be invalidated is dirty data, the cache manager may also be used to transmit the dirty data to the invalidation manager in the chip, and the invalidation manager writes the dirty data back into the memory. It should be noted that, after the memory manager initiates invalidation of the data copy of a certain address space stored in a processor and before the processor returns to the memory manager information indicating that the invalidation is completed, the memory manager locks the address space, that is, no other processor can access the address space.
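  • The locking behaviour noted above can be sketched as a small piece of bookkeeping in the memory manager. The class and method names are assumptions for illustration; the application does not specify how the lock is implemented.

```python
class AddressSpaceLocks:
    """Hypothetical record of address spaces with an invalidation in flight."""

    def __init__(self):
        self.locked = set()

    def begin_invalidation(self, space):
        """Called when the memory manager sends the invalidation signal."""
        self.locked.add(space)

    def complete_invalidation(self, space):
        """Called when the processor reports that invalidation is completed."""
        self.locked.discard(space)

    def may_access(self, space):
        """Other processors may access a space only while it is unlocked."""
        return space not in self.locked


locks = AddressSpaceLocks()
locks.begin_invalidation("p1")
blocked = not locks.may_access("p1")
locks.complete_invalidation("p1")
released = locks.may_access("p1")
```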
  • a signal indicating invalidation of the data copy of the declared address space is transmitted to the cache manager.
  • when the memory manager needs to invalidate data outside the declared address spaces (for example, the memory manager 21 needs to invalidate the data in addresses 0x3100-0x3200 shown in FIG. 3), or when the amount of data to be invalidated is the amount stipulated in the snoopy protocol (such as the data of one data segment), the memory manager can directly transmit a signal indicating invalidation of the data in a certain address range to the cache manager.
  • when the cache manager detects, based on the signal, that dirty data is stored in the address range, it can directly write the dirty data back into the corresponding address range in the memory.
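  • One way to read the two cases above is as a routing decision made by the memory manager. The sketch below is an interpretation under stated assumptions: the segment size of 0x0100 and the rule that a request no larger than one segment goes straight to the cache manager are illustrative choices, and `route_invalidation` is a hypothetical name.

```python
# Declared address spaces of memory 11 (inclusive bounds, from the text).
DECLARED = [(0x0000, 0x1000), (0x1001, 0x2000), (0x2001, 0x3000)]
SEGMENT_SIZE = 0x0100  # assumed data segment size


def route_invalidation(low, high):
    """Pick the on-chip target for an invalidation of addresses low..high."""
    in_declared = any(start <= low and high <= end for start, end in DECLARED)
    single_segment = (high - low) <= SEGMENT_SIZE
    if in_declared and not single_segment:
        # Large declared region: one signal, expanded by the flush engine.
        return "invalidation manager"
    # Outside the declared spaces, or only one data segment: signal directly.
    return "cache manager"


print(route_invalidation(0x0000, 0x1000))  # invalidation manager
print(route_invalidation(0x3100, 0x3200))  # cache manager
```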
  • FIG. 6 is an interaction process 600 between components in the shared storage system 100 shown in FIG. 5.
  • the interaction process includes the following steps:
  • Step 601 the cache manager 32 sends a request q2 to the memory manager 21 , the request q2 instructs to rewrite the data in the address range A.
  • Step 602, the memory manager 21 sends a signal s2 to the invalidation manager 43 based on the request q2. Specifically, after receiving the request q2, the memory manager 21 first finds out that the address range A is located in the address space p1. Then, the memory manager 21 can query and find out that the cache manager 33 stores a copy of the data in the address space p1. Finally, the memory manager 21 sends the invalidation manager 43 a signal s2 indicating invalidation of the copy of the data in the address space p1.
  • Step 603, the invalidation manager 43 generates an invalidation signal i1, an invalidation signal i2 and an invalidation signal i3 based on the signal s2, the preset size of the data segment and the address space p1.
  • the invalidation manager 43 provides the invalidation signal i1, the invalidation signal i2 and the invalidation signal i3 to the cache manager in a time-division manner. Specifically, the invalidation manager 43 divides the address space p1 into three address ranges, namely address range a1, address range a2 and address range a3, based on the preset size of the data segment, wherein the amount of data in each address range is one data segment.
  • the data segment is the cache manager's granularity for data operations.
  • the invalidation signal i1 indicates invalidation of the copy of the data in the address range a1, the invalidation signal i2 indicates invalidation of the copy of the data in the address range a2, and the invalidation signal i3 indicates invalidation of the copy of the data in the address range a3.
  • FIG. 6 shows that the invalidation manager 43 transmits three signals to the cache manager. It can be understood that this embodiment of the present application is not limited to this. In a specific scenario, the number of divided address ranges and the number of generated signals can be adjusted based on the size of the address space and the size of the data segment.
  • Step 605, the cache manager 33 detects, based on the signal i1, that the data segment in the address range a1 has not been rewritten (that is, the state information of the data segment is one of unique, shared and invalid), and transmits a response r1 to the invalidation manager 43, the response r1 indicating that the data segment in the address range a1 has not been rewritten; the cache manager 33 detects, based on the signal i2, that the data segment in the address range a2 has not been rewritten, and transmits a response r2 to the invalidation manager 43, the response r2 indicating that the data segment in the address range a2 has not been rewritten; the cache manager 33 detects, based on the signal i3, that the data segment in the address range a3 has not been rewritten, and transmits a response r3 to the invalidation manager 43, the response r3 indicating that the data segment in the address range a3 has not been rewritten.
  • the cache manager 33 needs to change the status information "unique" or "shared" to "invalid" before transmitting the response.
  • step 604 and step 605 are not used to limit the sequence of time.
  • in the first clock cycle, the invalidation manager 43 sends an invalidation signal i1 to the cache manager 33; in the second clock cycle, the invalidation manager 43 sends an invalidation signal i2 to the cache manager 33, and the cache manager 33 generates a response r1 based on the invalidation signal i1 and transmits it to the invalidation manager 43; in the third clock cycle, the invalidation manager 43 sends an invalidation signal i3 to the cache manager 33, and the cache manager 33 generates a response r2 based on the invalidation signal i2 and transmits it to the invalidation manager 43; in the fourth clock cycle, the cache manager 33 generates a response r3 based on the invalidation signal i3 and transmits it to the invalidation manager 43.
  • step 606 the invalidation manager 43 transmits a signal s3 to the memory manager 21 based on the response r1, the response r2 and the response r3, and the signal s3 indicates that the invalidation of the data copy in the address space p1 is completed.
  • the invalidation manager 43 transmits a signal s3 to the memory manager 21 when all responses are received and each response indicates that the data segment in the address range has not been rewritten.
  • Step 607 the memory manager 21 sends a signal s4 to the cache manager 32, the signal s4 is used to indicate that the cache manager 32 is allowed to rewrite the data in the address range A.
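  • The overlap between steps 604 and 605 described above can be sketched as a clock-cycle timeline. This is an illustrative model only, assuming one invalidation signal is issued per cycle and each response follows one cycle later, as in the three-signal example; the function name is hypothetical.

```python
def pipeline_schedule(num_ranges):
    """Return {cycle: [events]} for signals i1..in and responses r1..rn."""
    schedule = {}
    for k in range(1, num_ranges + 1):
        # Signal ik is issued in cycle k ...
        schedule.setdefault(k, []).append(f"invalidation manager sends i{k}")
        # ... and its response rk comes back in the following cycle.
        schedule.setdefault(k + 1, []).append(f"cache manager returns r{k}")
    return schedule


timeline = pipeline_schedule(3)
for cycle in sorted(timeline):
    print(cycle, timeline[cycle])
```

  • Under these assumptions n address ranges complete in n + 1 cycles, after which the invalidation manager can send the signal s3 of step 606.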
  • step 605 of the interaction process 600 shown in FIG. 6 the cache manager 33 detects that the status information of the data in the address range a1 - address range a3 is one of unique, shared and invalid.
  • the state information of the data segments in at least part of the address ranges a1 to a3 is modified, that is, dirty data exists in at least part of the address ranges a1 to a3.
  • Steps 701 to 704 are the same as steps 601 to 604 shown in FIG. 6. For details, refer to the related descriptions of steps 601 to 604 in FIG. 6; details are not repeated here.
  • Step 705, the cache manager 33 detects, based on the signal i1, that the data segment in the address range a1 has not been rewritten, and transmits a response r4 to the invalidation manager 43, the response r4 indicating that the data segment in the address range a1 has not been rewritten; the cache manager 33 detects, based on the signal i2, that the data segment in the address range a2 has not been rewritten, and transmits a response r2 to the invalidation manager 43, the response r2 indicating that the data segment in the address range a2 has not been rewritten; the cache manager 33 detects, based on the signal i3, that the state information of the data segment in the address range a3 is modified, and transmits a response signal r3 to the invalidation manager 43, the response signal r3 including the data segment D1 in the address range a3.
  • Step 706, the invalidation manager 43 writes the data segment D1 back into the address range a3 in the memory 11.
  • Step 707 the memory manager 21 detects that the data segment D1 has been stored in the memory 11, and sends a signal S5 to the invalidation manager 43, and the signal S5 indicates that the storage of the data segment D1 is completed.
  • Steps 708 to 709 are the same as steps 606 to 607 shown in FIG. 6. For details, refer to the related descriptions of steps 606 to 607 in FIG. 6; details are not repeated here.
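  • The handling of dirty data in this interaction process can be sketched as follows. The sketch collapses the cache manager, the invalidation manager and the completion signal into one function, and the base addresses chosen for a1 to a3 and the dictionary layouts are assumptions made for illustration.

```python
def invalidate_with_writeback(cache, memory, bases):
    """cache: {base: (state, data)}; memory: {base: data}. Returns responses."""
    responses = []
    for base in bases:
        state, data = cache.get(base, ("invalid", None))
        if state == "modified":
            # Dirty segment: write it back to memory before invalidating.
            memory[base] = data
            responses.append((base, "written back"))
        else:
            responses.append((base, "not rewritten"))
        cache[base] = ("invalid", None)
    return responses


A1, A2, A3 = 0x0000, 0x0100, 0x0200  # hypothetical bases of a1, a2, a3
cache_33 = {A1: ("shared", "x"), A2: ("unique", "y"), A3: ("modified", "D1")}
memory_11 = {A1: "x", A2: "y", A3: "stale"}
responses = invalidate_with_writeback(cache_33, memory_11, [A1, A2, A3])
```

  • After the call, memory 11 holds D1 at a3 and every segment of the copy is invalid, which is the condition for reporting completion to the memory manager.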
  • FIG. 8 is another interaction process 800 of the shared storage system 100 provided by the embodiment of the present application.
  • the interaction process 800 includes the following steps:
  • Step 801 the cache manager 33 writes the data segment D2 in the address range a1 to the memory 11 .
  • Steps 802 to 805 are the same as steps 601 to 604 shown in FIG. 6. For details, refer to the related descriptions of steps 601 to 604 in FIG. 6; details are not repeated here.
  • Step 806 the cache manager 33 detects that the data segment in the address range a2 has not been rewritten based on the signal i2, and transmits a response r2 to the invalidation manager 43, and the response r2 indicates that the data segment in the address range a2 has not been rewritten; the cache manager 33 detects that the data segment in the address range a3 has not been rewritten based on the signal i3, and transmits a response r3 to the invalidation manager 43, which indicates that the data segment in the address range a3 has not been rewritten.
  • Step 807 the memory manager 21 transmits a signal S6 to the cache manager 33, and the signal S6 indicates that the storage of the data segment D2 is completed.
  • Step 808 the cache manager 33 transmits a response r1 to the invalidation manager 43 based on the signal i1 and the signal S6 , and the response r1 indicates that the data segment in the address range a1 has not been rewritten.
  • Steps 809 to 810 are the same as steps 606 to 607 shown in FIG. 6. For details, refer to the related descriptions of steps 606 to 607 in FIG. 6; details are not repeated here.
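  • In this interaction process the response for a1 is held back until the earlier write-back of D2 is acknowledged. A minimal sketch of that ordering, assuming a hypothetical `PendingWriteback` helper inside the cache manager, is given below.

```python
class PendingWriteback:
    """Hypothetical tracker for write-backs that are not yet acknowledged."""

    def __init__(self):
        self.in_flight = set()
        self.deferred = []

    def start_writeback(self, base):
        """Step 801: a data segment is being written to memory."""
        self.in_flight.add(base)

    def on_invalidate(self, base):
        """Return a response now, or None if it must wait for the ack."""
        if base in self.in_flight:
            self.deferred.append(base)
            return None
        return ("not rewritten", base)

    def on_ack(self, base):
        """Completion signal received: release any deferred response."""
        self.in_flight.discard(base)
        if base in self.deferred:
            self.deferred.remove(base)
            return ("not rewritten", base)
        return None


A1, A2 = 0x0000, 0x0100  # hypothetical bases of a1 and a2
tracker = PendingWriteback()
tracker.start_writeback(A1)
r1_early = tracker.on_invalidate(A1)  # deferred: write-back still in flight
r2 = tracker.on_invalidate(A2)        # answered immediately
r1 = tracker.on_ack(A1)               # answered once S6 arrives
```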
  • the embodiment of the present application also provides a method for invalidating cached data, and the method for invalidating cached data can be applied to any memory manager as shown in FIG. 1 .
  • FIG. 9 shows a process 900 of the method for invalidating cached data provided by the embodiment of the present application.
  • the method comprises the following steps: step 901, receiving a first request from a first processor, the first request indicating rewriting of the data in a first address, the first address being an address in a shared memory; step 902, sending a first signal to a second processor based on the first request, the first signal indicating invalidation of the first data stored by the second processor, the first data being a copy of the data in a first address space in the shared memory, and the first address being located in the first address space.
  • the memory manager executing the process 900 may be set in any processor shown in FIG. 1, and the processor in which that memory manager is set is denoted as a third processor.
  • the above-mentioned first processor, the second processor and the third processor are all different processors.
  • for example, the first processor is processor 01, the second processor is processor 02, and the third processor is processor 03; the third processor 03 is provided with a memory manager 23, and the memory manager 23 is used to execute the process 900 shown in FIG. 9.
  • the method further includes: receiving a second signal from the second processor, the second signal indicating completion of invalidation of the first data; based on the second signal, sending a third signal to the first processor, the third signal indicating that the first processor is allowed to rewrite the data in the first address.
  • an embodiment of the present application also provides a method for invalidating cached data, and the method for invalidating cached data can be applied to any processor shown in FIG. 1 .
  • FIG. 10 shows a process 1000 of the method for invalidating cached data provided by the embodiment of the present application. The method includes the following steps: step 1001, receiving a first signal from the memory manager, the first signal indicating invalidation of the saved first data, the first data being the data of the first address space in the shared memory, and the first address being located in the first address space; step 1002, invalidating the first data based on the first signal.
  • the invalidating the first data based on the first signal includes: decomposing the first address space into a plurality of address ranges to obtain a plurality of data segments corresponding to the first data, wherein one address range corresponds to one data segment; and invalidating the plurality of data segments.
  • the invalidating the plurality of data segments includes: when the state of at least some of the plurality of data segments is one of the shared state or the exclusive state, changing the state of the at least some data segments to an invalid state.
  • the method further includes: when the state of each of the plurality of data segments is an invalid state, transmitting a second signal to the memory manager, the second signal indicating completion of invalidation of the first data.
  • before the transmitting the second signal to the memory manager, the method further includes: when the state of a first data segment among the plurality of data segments is a rewritten state, writing the first data segment back into the corresponding address range in the shared memory.

Abstract

Embodiments of the present application provide a shared storage system and a method for invalidating cache data. The shared storage system comprises a memory manager, a plurality of processors, and a shared memory; a first processor in the plurality of processors is used for sending a request to the memory manager, the request indicating rewriting data in a first address, and the first address being an address in the shared memory; the memory manager sends a first signal to a second processor in the plurality of processors on the basis of the request, the first signal indicating invalidating first data stored in the second processor, the first data being a copy of data in a first address space in the shared memory, and the first address being located in the first address space; the second processor invalidates the first data on the basis of the first signal. According to the shared storage system and the method for invalidating cache data provided by the embodiments of the present application, the performance of the shared storage system can be improved.

Description

Shared storage system, device and method for invalidating cached data
Technical field
The embodiments of the present application relate to the field of computer security, and in particular to a shared storage system, a device, and a method for invalidating cached data.
Background
In shared memory technology, memory space is shared among multiple processors. That is to say, any processor connected to the shared storage system can issue a request to access any address. Because the processors share the memory space, multiple processors that request access to the same address each obtain a copy of the same data. However, after one of the processors rewrites its copy, the copies of that address held by the other processors differ from the rewritten copy, that is, there is a cache coherence problem.
To ensure cache consistency between the multiple copies and the original data at the same address, in traditional technology, after one of the processors writes to an address, the memory manager responsible for maintaining the shared address space usually sends, based on the snoopy protocol, information to the other processors that hold the data at that address, to notify them that the data at that address is invalid. However, as processor speeds increase, processors handle more and more data per clock cycle, and copies of the same large amount of data (that is, data at multiple addresses) are cached in multiple processors. Under the snoopy protocol, when one of the processors rewrites that large amount of data, the memory manager needs to send the multiple addresses corresponding to the data to the other processors separately. Assuming the data is located at ten addresses, the memory manager needs to send the ten addresses to the other processors one by one, seriously occupying the bandwidth of the communication network in the shared storage system. In addition, when multiple addresses are sent one by one, the other processors may obtain the information indicating that their stored copies are invalid only after multiple clock cycles, that is, there is a delay problem.
To sum up, in a shared storage system where multiple processors hold copies of the same large amount of data, how to efficiently invalidate the copies held by the other processors when one processor rewrites its copy, so as to improve the performance of the shared storage system, becomes a problem that needs to be solved.
Summary of the invention
The shared storage system, device and method for invalidating cached data provided by the present application can efficiently invalidate the copies cached by other processors when a processor rewrites the data at an address. To achieve the above purpose, the present application adopts the following technical solutions.
In a first aspect, an embodiment of the present application provides a shared storage system, which includes a memory manager, a plurality of processors, and a shared memory. A first processor among the plurality of processors is configured to send a request to the memory manager, the request indicating rewriting of the data in a first address, the first address being an address in the shared memory. The memory manager, based on the request, sends a first signal to a second processor among the plurality of processors, the first signal indicating invalidation of first data held by the second processor, the first data being a copy of the data in a first address space in the shared memory, and the first address being located in the first address space. The second processor invalidates the first data based on the first signal.
The memory manager may be a dedicated processor, or it may be integrated in a central processing unit together with the cores. The memory manager may be, for example, the memory manager 21, the memory manager 22, the memory manager 23 or the memory manager 24 shown in FIG. 1. The multiple processors are, for example, processor 01, processor 02, processor 03 or processor 04 shown in FIG. 1. The shared memory is, for example, the memory 11, the memory 12, the memory 13 or the memory 14 shown in FIG. 1. The memory manager is used to manage the memory to which it is coupled. For example, in FIG. 1, the memory manager 21 manages the memory 11, the memory manager 22 manages the memory 12, the memory manager 23 manages the memory 13, and the memory manager 24 manages the memory 14. Management here may refer to recording which processors hold the data in each address space and, before one of the processors rewrites the data in a certain address space, invalidating the data of that address space held in the other processors.
第一地址可以包括多个,该多个第一地址均位于第一地址空间中。第一地址空间可以是内存管理器预先划分出的,也可以称为申明空间区域。一种可选的实现方式中,第一地址空间的大小可以被动态调整。例如,内存管理器可以首先将分配给各处理器的地址空间全部收回,对共享内存的地址空间大小重新调整,将调整后的地址空间的大小以及地址空间对应的标识广播给各处理器。There may be multiple first addresses, and the multiple first addresses are all located in the first address space. The first address space may be pre-divided by the memory manager, and may also be called a declared space area. In an optional implementation manner, the size of the first address space may be dynamically adjusted. For example, the memory manager may first recover all the address spaces allocated to each processor, readjust the size of the address space of the shared memory, and broadcast the adjusted size of the address space and the identifier corresponding to the address space to each processor.
The first signal may be implemented in several ways. In a first possible implementation, the first signal may include the address range of the first address space. In a second possible implementation, the first address space may be mapped to a preset identifier, and the first signal may include that preset identifier.
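The two forms of the first signal can be illustrated with a minimal sketch. The `FirstSignal` type, the `resolve_range` helper, the identifier value 7, and the contents of `REGION_TABLE` are all assumptions made for illustration; the application does not define a signal layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Range = Tuple[int, int]  # (start, end), end exclusive


@dataclass(frozen=True)
class FirstSignal:
    """Hypothetical first signal; exactly one of the two fields is set."""
    address_range: Optional[Range] = None  # first implementation
    region_id: Optional[int] = None        # second implementation


def resolve_range(signal: FirstSignal, region_table: Dict[int, Range]) -> Range:
    """Return the address range that the receiver is asked to invalidate."""
    if signal.address_range is not None:
        return signal.address_range
    if signal.region_id is not None:
        return region_table[signal.region_id]
    raise ValueError("signal carries neither an address range nor an identifier")


# Assumed identifier-to-range table known to the receiving processor.
REGION_TABLE = {7: (0x1000, 0x2000)}

by_range = resolve_range(FirstSignal(address_range=(0x1000, 0x2000)), REGION_TABLE)
by_id = resolve_range(FirstSignal(region_id=7), REGION_TABLE)
```

Either form resolves to the same range on the receiving side; the identifier form simply carries less information on the wire, provided that both sides agree on the table.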
The first address space includes multiple addresses. In the conventional technology, when a memory manager that manages the shared memory needs to invalidate the copy of the data of the first address space held in the first processor, it must send multiple signals to the first processor, each signal indicating invalidation of the data at one address; this heavily occupies the bandwidth of the interconnection network. The memory manager provided in the embodiments of the present application can invalidate the copy of the data of the first address space held in the first processor by sending the first signal to the first processor. Compared with the conventional technology, this frees interconnection network bandwidth and improves bandwidth utilization, and can thereby improve the performance of the shared storage system.
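The bandwidth argument can be made concrete with a small calculation. The figures below (a 4096-byte first address space and 64-byte data segments) are assumed values chosen for illustration and do not come from the application.

```python
def invalidation_messages(region_bytes: int, segment_bytes: int) -> tuple:
    """Return (messages with one signal per segment,
               messages with one signal for the whole region)."""
    segments = -(-region_bytes // segment_bytes)  # ceiling division
    return segments, 1


# Assumed sizes: 4 KiB region, 64-byte segments.
per_segment, per_region = invalidation_messages(4096, 64)
print(per_segment, per_region)
```

Under these assumptions, 64 messages over the interconnection network are replaced by a single one.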
Based on the first aspect, in a possible implementation, the second processor is specifically configured to: decompose the first address space into multiple address ranges based on the first signal and a preset data segment size, to obtain multiple data segments corresponding to the first data, where one address range corresponds to one data segment; and invalidate the multiple data segments.
Optionally, the second processor may invalidate the multiple data segments simultaneously, or may invalidate them one after another in a time-division manner.
In a conventional storage protocol (for example, the snoopy protocol), data is stored in the cache in the form of data segments, for example cache lines. A processor operates on data in units of data segments, for example when modifying the state of data, storing data, or reading data. One data segment usually corresponds to one address range. The first address space includes multiple address ranges, that is, it stores multiple data segments. After receiving the information indicating invalidation of the copy of the data of the first address space, the second processor decomposes the first address space into multiple address ranges to obtain multiple data segments and then invalidates those data segments. In this way, a conventional storage protocol (for example, the snoopy protocol) can be used to invalidate the multiple data segments and thereby the first address space, with no additional cost for modifying the hardware in the second processor that invalidates the data of the first address space, which helps reduce design cost.
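The decomposition step can be sketched as follows. This is a minimal illustration, assuming a 64-byte data segment and half-open ranges `[start, end)`; the function name `split_into_segments` and the alignment behaviour for unaligned inputs are assumptions, not details given in the application.

```python
from typing import List, Tuple


def split_into_segments(start: int, end: int, segment_size: int = 64) -> List[Tuple[int, int]]:
    """Split the address space [start, end) into segment-aligned ranges,
    one range per data segment (for example, per cache line)."""
    if segment_size <= 0 or end < start:
        raise ValueError("invalid range or segment size")
    ranges = []
    base = start - (start % segment_size)  # align down to a segment boundary
    while base < end:
        ranges.append((base, base + segment_size))
        base += segment_size
    return ranges


segments = split_into_segments(0x1000, 0x1100)
print([(hex(lo), hex(hi)) for lo, hi in segments])
```

Each resulting range can then be handed to the existing per-segment invalidation path unchanged.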
Based on the first aspect, in a possible implementation, the second processor is specifically configured to: when the state of at least some of the multiple data segments is one of a shared state and an exclusive state, change the state of the at least some data segments to an invalid state.
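The state change described above can be sketched with a MESI-style state set. The use of MESI naming and the `invalidate_clean` helper are assumptions for illustration; the application names only the shared, exclusive, rewritten, and invalid states.

```python
from enum import Enum
from typing import List


class State(Enum):
    MODIFIED = "M"   # rewritten locally, differs from memory
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def invalidate_clean(states: List[State]) -> List[State]:
    """Move shared or exclusive segments to the invalid state.
    Modified segments are left untouched here because they need
    a write-back before they can be dropped."""
    clean = (State.SHARED, State.EXCLUSIVE)
    return [State.INVALID if s in clean else s for s in states]


result = invalidate_clean(
    [State.SHARED, State.EXCLUSIVE, State.MODIFIED, State.INVALID]
)
print([s.value for s in result])
```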
Based on the first aspect, in a possible implementation, the second processor includes a cache manager and an invalidation manager; the invalidation manager generates multiple invalidation signals based on the multiple address ranges, where one address range corresponds to one invalidation signal, and transmits the multiple invalidation signals to the cache manager in a time-division manner; and the cache manager invalidates the multiple data segments based on the multiple invalidation signals.
Both the cache manager and the invalidation manager may be implemented by hardware circuits. All operations on cached data in the second processor (for example, fetching data from memory into the cache, writing data to memory, and modifying the state of cached data) are performed by the cache manager, which operates on data in units of data segments. In the conventional technology, no invalidation manager is provided, and the memory manager supplies the multiple invalidation signals corresponding to the multiple address ranges to the cache manager in a time-division manner, which heavily occupies bandwidth. In the embodiments of the present application, an invalidation manager is provided and, in place of the memory manager, generates the multiple invalidation signals and supplies them to the cache manager. Because the invalidation manager and the cache manager are located in the same processor, this does not occupy the bandwidth of the off-chip interconnection network, unlike the prior art, and the signals are transferred at a high rate. In addition, the structure of the cache manager need not be changed at all: it can still receive signals according to the conventional storage protocol and operate on data in units of data segments, which saves design cost.
Based on the first aspect, in a possible implementation, the invalidation manager is further configured to decompose the first address space into multiple address ranges based on the first signal and the preset data segment size.
Based on the first aspect, in a possible implementation, the cache manager is further configured to send, to the invalidation manager, multiple responses to the multiple invalidation signals, where one response corresponds to one data segment, and a first response among the multiple responses indicates that a first data segment among the multiple data segments has not been rewritten; and the invalidation manager is further configured to transmit a second signal to the memory manager based on the multiple responses, the second signal indicating that invalidation of the first data is complete.
In the conventional technology, no invalidation manager is provided, and the cache manager has to transmit multiple response signals to the memory manager, occupying the bandwidth of the communication network. In the embodiments of the present application, when the invalidation manager has received the multiple responses and all of them indicate that the corresponding data segments have not been rewritten, it transmits to the memory manager a signal indicating that invalidation of the first data is complete. In other words, the invalidation manager needs to transmit only one signal to the memory manager. Compared with transmitting multiple response signals as in the prior art, this reduces the bandwidth consumed on the communication network and improves bandwidth utilization.
Based on the first aspect, in a possible implementation, a second response among the multiple responses includes a second data segment among the multiple data segments; and the invalidation manager is further configured to write the second data segment back to the corresponding address range in the shared memory.
In the embodiments of the present application, when the data in the second data segment has been rewritten, the rewritten data must be written back to memory. The cache manager transmits the second response carrying the second data segment to the invalidation manager, and the invalidation manager then writes the second data segment back to memory.
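The cooperation between the invalidation manager and the cache manager can be sketched as below. All names (`CacheManager`, `run_invalidation`, the single-letter states, and the `"INVALIDATION_COMPLETE"` token) are hypothetical, and the hardware behaviour is modelled with ordinary function calls purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]


class CacheManager:
    """Hypothetical per-segment cache: range -> (state, data)."""

    def __init__(self, lines: Dict[Range, Tuple[str, bytes]]):
        self.lines = lines

    def invalidate(self, rng: Range) -> Optional[bytes]:
        """Invalidate one segment. The return value models the response:
        None means 'not rewritten'; bytes carry a rewritten segment."""
        state, data = self.lines.get(rng, ("I", b""))
        self.lines[rng] = ("I", b"")
        return data if state == "M" else None


def run_invalidation(ranges: List[Range], cache: CacheManager,
                     memory: Dict[Range, bytes]) -> str:
    """Model of the invalidation manager: issue one on-chip invalidation
    per range, write back rewritten segments, then emit ONE completion
    signal for the memory manager."""
    for rng in ranges:
        rewritten = cache.invalidate(rng)
        if rewritten is not None:
            memory[rng] = rewritten  # write-back of the rewritten segment
    return "INVALIDATION_COMPLETE"


cache = CacheManager({(0, 64): ("S", b"clean"), (64, 128): ("M", b"new")})
memory = {(64, 128): b"old"}
signal = run_invalidation([(0, 64), (64, 128)], cache, memory)
```

Only `signal` would cross the off-chip interconnection network in this model; the per-segment signals and responses stay inside the processor.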
Based on the first aspect, in a possible implementation, a first invalidation signal among the multiple invalidation signals indicates invalidation of the first data segment; and the cache manager is specifically configured to: before receiving the first invalidation signal, write the first data segment back to the shared memory; and, in response to a third signal sent by the memory manager, the third signal indicating that storage of the first data segment is complete, send the first response to the invalidation manager.
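One way to read this ordering is that a response to the invalidation signal is deferred while a write-back of the same segment is still in flight. The sketch below models that reading; the `SegmentTracker` class and its method names are hypothetical, and the deferral logic is an interpretation rather than a mechanism spelled out in the application.

```python
class SegmentTracker:
    """Hypothetical per-segment bookkeeping inside the cache manager."""

    def __init__(self):
        self.writeback_pending = False
        self.invalidate_pending = False
        self.responses = []

    def start_writeback(self):
        """The segment is being written back to shared memory."""
        self.writeback_pending = True

    def on_invalidate(self):
        """The invalidation signal for this segment arrives."""
        if self.writeback_pending:
            self.invalidate_pending = True  # wait for the store-complete signal
        else:
            self.responses.append("first_response")

    def on_store_complete(self):
        """The memory manager reports that the segment has been stored."""
        self.writeback_pending = False
        if self.invalidate_pending:
            self.invalidate_pending = False
            self.responses.append("first_response")


tracker = SegmentTracker()
tracker.start_writeback()
tracker.on_invalidate()
before = list(tracker.responses)
tracker.on_store_complete()
after = list(tracker.responses)
```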
Based on the first aspect, in a possible implementation, the memory manager is further configured to send a fourth signal to the first processor, the fourth signal indicating that the first processor is allowed to rewrite the data at the first address.
Based on the first aspect, in a possible implementation, a third processor among the multiple processors is configured to send a request to the memory manager, the request indicating rewriting of the data at a second address, where the second address is an address in the shared memory; the memory manager sends a fifth signal to the cache manager based on the request, the fifth signal indicating invalidation of second data held by the cache manager, where the second data is a copy of the data at the second address; and the cache controller sends a second response to the memory manager based on the fifth signal, the second response indicating whether the second data has been modified or not.
In this implementation, if the second address lies outside the first address range, or the amount of data corresponding to the second address is exactly one data segment, the memory manager can transmit the signal indicating invalidation of the copy of the data at the second address directly to the cache controller. In this case the address range does not need to be decomposed by the invalidation manager, which can increase the signal transmission speed and the working efficiency of the shared storage system.
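The routing condition in this paragraph can be written as a short predicate. The function name `route_invalidation`, the string return values, the half-open range convention, and the 64-byte segment size are assumptions made for illustration.

```python
from typing import Tuple


def route_invalidation(start: int, end: int,
                       first_space: Tuple[int, int],
                       segment_size: int = 64) -> str:
    """Decide where an invalidation for [start, end) is delivered.

    It goes straight to the cache controller when the target lies
    outside the first address space or covers exactly one data segment;
    otherwise it goes through the invalidation manager for decomposition."""
    outside = end <= first_space[0] or start >= first_space[1]
    single_segment = (end - start) == segment_size
    if outside or single_segment:
        return "cache_controller"
    return "invalidation_manager"


FIRST_SPACE = (0x1000, 0x2000)  # assumed bounds of the first address space
outside_case = route_invalidation(0x3000, 0x3100, FIRST_SPACE)
single_case = route_invalidation(0x1000, 0x1040, FIRST_SPACE)
region_case = route_invalidation(0x1000, 0x1100, FIRST_SPACE)
```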
In a second aspect, an embodiment of the present application provides an apparatus including a memory manager, the memory manager being configured to: receive a first request from a first processor, the first request indicating rewriting of the data at a first address, where the first address is an address in a shared memory and the shared memory is managed by the memory manager; and, based on the first request, send a first signal to a second processor, the first signal indicating invalidation of first data held by the second processor, where the first data is a copy of the data of a first address space in the shared memory and the first address is located in the first address space, and where the shared memory is managed by the memory manager and the address spaces in the shared memory are accessible to the first processor and the second processor.
The apparatus provided in the embodiments of the present application may contain only a memory manager dedicated to managing the shared memory. Optionally, the apparatus may be a processor that, in addition to the memory manager, further includes, for example, processor cores.
The first address space includes multiple addresses. In the conventional technology, when a memory manager that manages the shared memory needs to invalidate the copy of the data of the first address space held in the first processor, it must send multiple signals to the first processor, each signal indicating invalidation of the data at one address; this heavily occupies the bandwidth of the interconnection network. The memory manager provided in the embodiments of the present application can invalidate the copy of the data of the first address space held in the first processor by sending the first signal to the first processor. Compared with the conventional technology, this frees interconnection network bandwidth and improves bandwidth utilization, and can thereby improve the performance of the shared storage system.
Based on the second aspect, in a possible implementation, the memory manager is further configured to: receive a second signal from the second processor, the second signal indicating that invalidation of the first data is complete; and, based on the second signal, send a third signal to the first processor, the third signal indicating that the first processor is allowed to rewrite the data at the first address.
Based on the second aspect, in a possible implementation, the memory manager is further configured to: monitor the second processor storing a first data segment into the shared memory, where the first data segment is a segment of the first data; and, in response to storage of the first data segment being complete, send a fourth signal to the second processor, the fourth signal indicating that storage of the first data segment is complete.
Based on the second aspect, in a possible implementation, the memory manager is further configured to: receive a second request from a third processor, the second request indicating rewriting of the data at a second address, where the second address is an address in the shared memory; and, based on the second request, send a fifth signal to the second processor, the fifth signal indicating invalidation of second data, where the second data is a copy of the data at the second address.
In a third aspect, an embodiment of the present application provides an apparatus including a processor, the processor being configured to: receive a first signal from a memory manager, the first signal indicating invalidation of first data held by the processor, where the first data is a copy of the data of a first address space in a shared memory, the first address is located in the first address space, and the shared memory is managed by the memory manager; and invalidate the first data based on the first signal.
Based on the third aspect, in a possible implementation, the processor is specifically configured to: decompose the first address space into multiple address ranges based on the first signal and a preset data segment size, to obtain multiple data segments corresponding to the first data, where one address range corresponds to one data segment; and invalidate the multiple data segments.
Based on the third aspect, in a possible implementation, the processor is specifically configured to: when the state of at least some of the multiple data segments is one of a shared state and an exclusive state, change the state of the at least some data segments to an invalid state.
Based on the third aspect, in a possible implementation, the processor includes a cache manager and an invalidation manager; the invalidation manager generates multiple invalidation signals based on the multiple address ranges, where one address range corresponds to one invalidation signal, and transmits the multiple invalidation signals to the cache manager in a time-division manner; and the cache manager invalidates the multiple data segments based on the multiple invalidation signals.
Based on the third aspect, in a possible implementation, the cache manager is further configured to send, to the invalidation manager, multiple responses to the multiple invalidation signals, where one response corresponds to one data segment, and a first response among the multiple responses indicates that a first data segment among the multiple data segments has not been rewritten; and the invalidation manager is further configured to transmit a second signal to the memory manager based on the multiple responses, the second signal indicating that invalidation of the first data is complete.
Based on the third aspect, in a possible implementation, a second response among the multiple responses carries a second data segment among the multiple data segments, the state of the second data segment being a rewritten state; and the invalidation manager is further configured to, when the state of the second data segment is the rewritten state, write the second data segment back to the corresponding address range in the shared memory.
Based on the third aspect, in a possible implementation, a first invalidation signal among the multiple invalidation signals indicates invalidation of the first data segment; and the cache manager is specifically configured to: before receiving the first invalidation signal, write the first data segment back to the shared memory; and, in response to a fourth signal sent by the memory manager, send the first response to the invalidation manager, the fourth signal indicating that storage of the first data segment is complete.
Based on the third aspect, in a possible implementation, the cache controller is further configured to: receive a fifth signal from the memory manager, the fifth signal indicating invalidation of second data held by the cache manager, where the second data is a copy of the data at a second address; and, in response to the fifth signal, send a third response to the memory manager, the third response indicating whether the second data has been modified or not.
In a fourth aspect, an embodiment of the present application provides a method for invalidating cached data, the method including: receiving a first request from a first processor, the first request indicating rewriting of the data at a first address, where the first address is an address in a shared memory; and, based on the first request, sending a first signal to a second processor, the first signal indicating invalidation of first data held by the second processor, where the first data is a copy of the data of a first address space in the shared memory and the first address is located in the first address space.
Based on the fourth aspect, in a possible implementation, the method further includes: receiving a second signal from the second processor, the second signal indicating that invalidation of the first data is complete; and, based on the second signal, sending a third signal to the first processor, the third signal indicating that the first processor is allowed to rewrite the data at the first address.
In a fifth aspect, an embodiment of the present application provides a method for invalidating cached data, the method including: receiving a first signal from a memory manager, the first signal indicating invalidation of held first data, where the first data is a copy of the data of a first address space in a shared memory and the first address is located in the first address space; and invalidating the first data based on the first signal.
Based on the fifth aspect, in a possible implementation, the invalidating the first data based on the first signal includes: decomposing the first address space into multiple address ranges based on the first signal and a preset data segment size, to obtain multiple data segments corresponding to the first data, where one address range corresponds to one data segment; and invalidating the multiple data segments.
Based on the fifth aspect, in a possible implementation, the invalidating the multiple data segments includes: when the state of at least some of the multiple data segments is one of a shared state and an exclusive state, changing the state of the at least some data segments to an invalid state.
Based on the fifth aspect, in a possible implementation, the method further includes: when the state of each of the multiple data segments is the invalid state, transmitting a second signal to the memory manager, the second signal indicating that invalidation of the first data is complete.
Based on the fifth aspect, in a possible implementation, before the transmitting the second signal to the memory manager, the method further includes: when the state of a first data segment among the multiple data segments is a rewritten state, writing the second data segment back to the corresponding address range in the shared memory.
It should be understood that the technical solutions of the second to fifth aspects of the present application are consistent with that of the first aspect, and that the beneficial effects achieved by each aspect and its corresponding feasible implementations are similar; details are not repeated here.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a hardware structure of a shared storage system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the mapping between the memories and the global address space according to an embodiment of the present application;
FIG. 3 is a schematic diagram of multiple address ranges into which the address space of memory 11 is divided according to an embodiment of the present application;
FIG. 4 is a flowchart of interaction between the components of the shared storage system according to an embodiment of the present application;
FIG. 5 is a schematic diagram of another hardware structure of the shared storage system according to an embodiment of the present application;
FIG. 6 is a flowchart of interaction between the components of the shared storage system shown in FIG. 5 according to an embodiment of the present application;
FIG. 7 is another flowchart of interaction between the components of the shared storage system shown in FIG. 5 according to an embodiment of the present application;
FIG. 8 is yet another flowchart of interaction between the components of the shared storage system shown in FIG. 5 according to an embodiment of the present application;
FIG. 9 is a flowchart of a method for invalidating cached data according to an embodiment of the present application;
FIG. 10 is another flowchart of the method for invalidating cached data according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings in the embodiments. Evidently, the described embodiments are some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", and similar words used herein do not denote any order, quantity, or importance; they are used only to distinguish different components. Likewise, "a", "an", "one", and similar words do not denote a limitation of quantity but indicate the presence of at least one. "Connection" and similar words are not limited to physical or mechanical connections; they may include electrical connections, whether direct or indirect, and are equivalent to communication in a broad sense.
In the embodiments of the present application, words such as “exemplary” or “for example” are used to present an example, illustration, or explanation. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as being preferred over, or more advantageous than, other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner. In the description of the embodiments of the present application, unless otherwise specified, “multiple” means two or more. For example, multiple processors means two or more processors.
The shared storage system 100 described in the embodiments of the present application may be a symmetric multiprocessing (SMP) system. The shared storage system 100 includes multiple processors, for example two or three, which is not specifically limited in the embodiments of the present application. The multiple processors may be central processing units (CPUs) or dedicated processors (for example, image signal processors, artificial intelligence processors, or neural network processors); the embodiments of the present application do not limit the form of the processors. The components attached to the shared storage system 100 (for example, processors and memories) may communicate through an interconnection network, which may include, but is not limited to, one of the following: a single bus, multiple buses, or a crossbar switch. The shared storage system 100 further includes memory, for example but not limited to SDRAM or DDR memory. The memory may be set up independently and managed by a dedicated memory manager; it may be coupled to one of the processors and managed by that processor; or it may be divided into multiple parts that are coupled to multiple processors in a distributed manner, each part being managed by the processor to which it is coupled. The memory forms one globally unified address space that is shared by all processors in the shared storage system 100, and each processor can issue a request to access any address space. Accessing an address space here means reading instructions or data from the accessed address space, or writing data to it. Processors that issue requests to read the same address space obtain the same data. Managing the memory, as mentioned above, means recording which processors hold the data of each address space and, before one of the processors rewrites the data in an address space, invalidating the copies of that data held by the other processors. The shared storage system 100 provided in the embodiments of the present application is described in detail below with reference to FIG. 1, taking as an example a system that includes four processors, each with a memory attached.
Please refer to FIG. 1, which is a schematic diagram of a hardware structure of the shared storage system 100 provided in an embodiment of the present application. In FIG. 1, the shared storage system 100 includes four processors: processor 01, processor 02, processor 03, and processor 04. The four processors are connected through an interconnection network. It should be noted that processors 01 to 04 may be integrated into one or more chips. For example, processors 01 to 04 may each be integrated into a different chip; as another example, processors 01 to 04 may all be integrated into a single chip. The embodiments of the present application are described using the example in which processors 01 to 04 are independent processors, each integrated into a different chip, but this does not limit the solution.
Each processor has a memory attached. That is, processor 01 is coupled to memory 11, processor 02 is coupled to memory 12, processor 03 is coupled to memory 13, and processor 04 is coupled to memory 14. Each memory may optionally be integrated into the same chip as the processor to which it is coupled. Memories 11 to 14 together form the global address space of the shared storage system 100, which is shared by processors 01 to 04. Each of memories 11 to 14 is mapped to a block of the global address space. FIG. 2 schematically shows the mapping between each memory and the global address space. In FIG. 2, the storage space of memory 11 is mapped to 0x0000~0x4000 in the global address space, the storage space of memory 12 is mapped to 0x4001~0x8000, the storage space of memory 13 is mapped to 0x8000~0x1201, and the storage space of memory 14 is mapped to 0x1200~0x1600.
Each processor is provided with one or more processor cores, which obtain instructions and data from the global address space in order to complete the tasks to be processed. FIG. 1 schematically shows processor 01 provided with one core c1, processor 02 with one core c2, processor 03 with one core c3, and processor 04 with one core c4.
Each processor also integrates a memory manager, which may also be called a home agent: memory manager 21 is integrated in processor 01, memory manager 22 in processor 02, memory manager 23 in processor 03, and memory manager 24 in processor 04. Each memory manager may be implemented by a hardware circuit and manages the memory to which it is coupled. Taking memory manager 21 as an example, it records which processors hold the data in the address space 0x0000~0x4000 shown in FIG. 2 and, before one of processors 02 to 04 rewrites data in an address space, invalidates the data of that address space held by the other processors. Assume that processor 02 and processor 03 both hold the data at address 0x3800 and that processor 02 needs to rewrite it. Before processor 02 rewrites the data at address 0x3800, processor 01 transmits to processor 03 a signal instructing it to invalidate the data at address 0x3800. Cache coherence in the shared storage system 100 is thereby guaranteed. It should be noted that each memory manager can communicate directly with the other processors (or with components in those processors, such as the processor core and the invalidation manager and cache manager described below) through the interconnection network, and can communicate directly with the components of its own processor (such as the processor core, cache manager, and invalidation manager) through an internal bus.
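The bookkeeping performed by a memory manager can be sketched in software as follows. This is only an illustration under stated assumptions, not the hardware circuit of the application: the class name `HomeAgent`, its method names, and the use of a dictionary to track holders are all hypothetical.

```python
class HomeAgent:
    """Hypothetical software model of a memory manager (home agent)."""

    def __init__(self, base, limit):
        self.base, self.limit = base, limit   # managed range, inclusive
        self.holders = {}                     # address -> set of processor names

    def record_read(self, proc, addr):
        # Remember that `proc` now holds a copy of the data at `addr`.
        if not self.base <= addr <= self.limit:
            raise ValueError("address is not managed by this home agent")
        self.holders.setdefault(addr, set()).add(proc)

    def before_write(self, writer, addr):
        # Return the processors whose copies must be invalidated before
        # `writer` rewrites `addr`, and leave `writer` as the only holder.
        targets = sorted(self.holders.get(addr, set()) - {writer})
        self.holders[addr] = {writer}
        return targets


manager_21 = HomeAgent(0x0000, 0x4000)
manager_21.record_read("processor 02", 0x3800)
manager_21.record_read("processor 03", 0x3800)
print(manager_21.before_write("processor 02", 0x3800))  # ['processor 03']
```

The sketch tracks holders per address for brevity; the granularity at which a real memory manager records holders is a design choice that the example does not prescribe.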
Further, in the embodiments of the present application, each memory manager may divide the region it manages into multiple address spaces in advance, for example at page granularity. At least some of these address spaces are declared address spaces. That is, when a processor needs to read data at one or more addresses in a declared address space, the memory manager provides all the data in that declared address space to the processor; and when a processor needs to rewrite data at one or more addresses in a declared address space, any other processor that holds a copy of the data of that declared address space must invalidate all the data it holds for that declared address space. Taking the address space to which memory 11 is mapped in FIG. 2 as an example, the multiple address spaces into which it is divided are described with reference to FIG. 3. The address space 0x0000~0x4000 to which memory 11 is mapped can be further divided into four address spaces, for example 0x0000~0x1000, 0x1001~0x2000, 0x2001~0x3000, and 0x3001~0x4000. Of these, 0x0000~0x1000, 0x1001~0x2000, and 0x2001~0x3000 are declared address spaces, denoted address space p1, address space p2, and address space p3 respectively. Assume that processor 02 sends memory manager 21 a signal to read the data at addresses 0x0000~0x0800 and that processor 03 sends memory manager 21 a signal to read the data at addresses 0x3000~0x0900; memory manager 21 then provides all the data in address space p1 to both processor 02 and processor 03. Based on the four address spaces of memory 11 shown in FIG. 3, and taking as an example a further request from processor 02 to memory manager 21 to rewrite the data at addresses 0x0000~0x0800, the way in which memory manager 21 invalidates the copy held by processor 03 is described below with reference to the interaction flow shown in FIG. 4.
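As a rough sketch of how a request could be matched against the declared address spaces, the following uses the three ranges p1 to p3 listed above with inclusive bounds. The function names `declared_space` and `serve_read` are hypothetical, and returning a range tuple instead of data is a simplification made for the example.

```python
# Declared address spaces of memory 11 as listed in the text (inclusive bounds).
DECLARED = {
    "p1": (0x0000, 0x1000),
    "p2": (0x1001, 0x2000),
    "p3": (0x2001, 0x3000),
}

def declared_space(first, last):
    """Return the name of the declared space containing [first, last], else None."""
    for name, (lo, hi) in DECLARED.items():
        if lo <= first and last <= hi:
            return name
    return None

def serve_read(first, last):
    # For a declared space the whole space is served, not just the requested
    # addresses; outside the declared spaces only the requested range is served.
    name = declared_space(first, last)
    return DECLARED[name] if name else (first, last)

print(declared_space(0x0000, 0x0800))                   # p1
print(serve_read(0x0000, 0x0800) == (0x0000, 0x1000))   # True
print(declared_space(0x3100, 0x3200))                   # None
```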
Step 401, processor 02 sends a request q1 to memory manager 21. The request q1 requests to rewrite the data at addresses 0x0000~0x0800.
Step 402, memory manager 21 sends a signal s1 to processor 03 based on the request q1. Specifically, after receiving the request q1, memory manager 21 first determines that addresses 0x0000~0x0800 lie in address space p1. Memory manager 21 then determines that processor 03 holds a copy of the data in address space p1. Finally, memory manager 21 sends the signal s1 to processor 03; the signal s1 instructs processor 03 to invalidate its copy of the data in address space p1. In an optional implementation, memory manager 21 may send the address range 0x0000~0x1000 corresponding to the address space directly to processor 03. In another possible implementation, each declared address space may be mapped to a preset identifier, and each processor may store the mapping between declared address spaces and preset identifiers; for example, the preset identifiers corresponding to address space p1, address space p2, and address space p3 are page1, page2, and page3 respectively. Memory manager 21 may then send the preset identifier page1 corresponding to address space p1 to processor 03.
Step 403, processor 03 invalidates its stored copy of the data in address space p1 based on the signal s1. Typically, the copy of the data in address space p1 held by processor 03 may include multiple data segments; a data segment may be, for example, a cache line. Each data segment carries status information indicating its state, which includes but is not limited to one of the following: modified, exclusive, shared, and invalid. Invalidating the stored copy of the data in address space p1 may mean that, when the status information of a data segment is one of modified, exclusive, and shared, processor 03 changes that status information to invalid.
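The state change in step 403 can be illustrated with a short sketch. It assumes the four states named above and a hypothetical dictionary that maps each data segment's start address to its state; the function name `invalidate_copy` is likewise an assumption, and the handling of modified segments (writing dirty data back) is deliberately left out here.

```python
from enum import Enum

class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"

def invalidate_copy(lines, first, last):
    """Mark every cached data segment whose start address lies in [first, last] invalid.

    `lines` maps a segment's start address to its State. Returns the start
    addresses whose state actually changed.
    """
    changed = []
    for addr in sorted(lines):
        if first <= addr <= last and lines[addr] is not State.INVALID:
            lines[addr] = State.INVALID
            changed.append(addr)
    return changed

# Hypothetical cache contents of processor 03; 0x2000 lies outside p1.
cache_03 = {0x0000: State.SHARED, 0x0100: State.INVALID,
            0x0200: State.EXCLUSIVE, 0x2000: State.SHARED}
print([hex(a) for a in invalidate_copy(cache_03, 0x0000, 0x1000)])
# ['0x0', '0x200']
```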
Based on the structure of the shared storage system 100 shown in FIG. 1, in a conventional shared storage system 100 the processors communicate with each other based on the snoopy protocol. In the snoopy protocol, a processor usually operates on data (for example, reading data, writing data, or modifying the flag bits corresponding to data) at the granularity of a data segment. Suppose that processor 03 holds the data at addresses 0x3000~0x0800 and that the granularity of each operation is 0x0100. When processor 02 writes to the data at addresses 0x0000~0x0800, memory manager 21, following the snoopy protocol and the granularity at which the processor operates on data, splits addresses 0x0000~0x0800 at a granularity of 0x0100 into eight address ranges. Memory manager 21 then transmits to processor 03 one signal per address range, each indicating that the data in that range is to be invalidated; that is, memory manager 21 must transmit eight signals to processor 03, which heavily occupies the bandwidth of the interconnection network. In addition, because the eight address ranges are sent to processor 03 one by one, processor 03 may obtain the signal for a given address range only after many clock cycles, which introduces latency. In the embodiments of the present application, declared address spaces are set in advance; when a large amount of data in a declared address space needs to be invalidated, the memory manager that manages that declared address space can provide a signal indicating invalidation of the copy of the data in that declared address space to the other processors that hold the copy. Compared with a memory manager that must send multiple signals, the embodiments of the present application can invalidate a large amount of data with a single signal, which frees bandwidth of the interconnection network and helps improve bandwidth utilization.
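The difference in signal count can be checked with a few lines of arithmetic. The sketch below treats the written span as 0x0800 addresses starting at 0x0000, which is an assumption about how the range in the text is counted; `snoop_messages` is a hypothetical helper name.

```python
def snoop_messages(first, last, granularity):
    """Number of per-segment invalidation signals for a span of last - first addresses."""
    span = last - first
    return -(-span // granularity)   # ceiling division

per_segment = snoop_messages(0x0000, 0x0800, 0x0100)
per_space = 1  # one signal naming the whole declared address space
print(per_segment, per_space)  # 8 1
```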
Based on the structure of the shared storage system 100 shown in FIG. 1 and the interaction flow 400 shown in FIG. 4, please continue to refer to FIG. 5, which is a more detailed structural diagram of the shared storage system 100 provided by an embodiment of the present application. On the basis of the shared storage system 100 shown in FIG. 1, each processor in the shared storage system 100 shown in FIG. 5 further includes a cache manager, which may also be called a cache agent. The cache manager contains a cache and a cache controller. Under the control of the cache controller, the cache obtains from the global address space of the shared storage system 100 the instructions and data that the processor core needs in order to run, and stores them; it also writes data to any address space in the global address space, that data being dirty data produced by the running processor and previously held in the cache. In other words, if a processor holds a copy that needs to be invalidated, that copy is held in the cache manager. FIG. 5 schematically shows cache manager 31 in processor 01, cache manager 32 in processor 02, cache manager 33 in processor 03, and cache manager 34 in processor 04. Each cache manager may be implemented by a hardware circuit. Typically, the cache manager operates on data at the granularity of a data segment based on the snoopy protocol. In an optional implementation, the cache manager performs the invalidation operation, at data segment granularity, on each of the data segments included in the copy to be invalidated. The invalidation operation here means changing the state of a data segment to invalid when its state is one of modified, exclusive, and shared.
As shown in FIG. 5, each processor may also be provided with an invalidation manager, which may also be called a flush engine. Invalidation manager 41 is provided in processor 01, invalidation manager 42 in processor 02, invalidation manager 43 in processor 03, and invalidation manager 44 in processor 04. Each invalidation manager may be a hardware circuit. The invalidation manager obtains from a memory manager a signal that indicates invalidation of the copy of the data in a declared address space. Based on that signal, and according to the granularity at which the cache manager processes data (for example, a data segment), the invalidation manager splits the declared address space to be invalidated into multiple address ranges, one address range per data segment, and provides those address ranges to the cache manager. For example, assume that invalidation manager 43 receives from memory manager 21 a signal indicating invalidation of the data in address space p1, and that the granularity at which the cache manager stores data is 0x0100. Based on the received signal, invalidation manager 43 splits the addresses in address space p1 into ten address ranges, namely 0x0000~0x0100, 0x0101~0x0200, …, 0x0901~0x1000, and then provides the ten address ranges to cache manager 33 in a time-shared manner, so that cache manager 33 invalidates the data in those ten address ranges.
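The splitting performed by the invalidation manager can be sketched as a generator. The example below is a simplified illustration with made-up values: it uses half-open `(start, end)` ranges rather than the inclusive ranges written in the text, and the name `split_ranges` and the 0x0400 space size are assumptions chosen only to keep the output short.

```python
def split_ranges(base, size, segment):
    """Yield half-open (start, end) ranges of `segment` addresses covering the space."""
    for start in range(base, base + size, segment):
        yield start, min(start + segment, base + size)

# Illustrative values only: a 0x0400-address space split at 0x0100 granularity.
for start, end in split_ranges(0x0000, 0x0400, 0x0100):
    print(f"invalidate {start:#06x}..{end:#06x}")
```

Because the ranges are produced one at a time, the same loop also models the time-shared delivery to the cache manager: each iteration corresponds to one on-chip invalidation signal.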
By providing an invalidation manager, the embodiments of the present application allow the memory manager that manages an address space, when a large amount of data in that address space is to be invalidated, to provide a signal indicating invalidation of the data in that address space to the invalidation manager. The invalidation manager then splits the address region corresponding to the address space and provides the resulting address ranges to the on-chip cache manager. Compared with the conventional technique, in which the memory manager must send multiple signals to the cache manager, the embodiments of the present application can invalidate a large amount of data with a single signal, which frees bandwidth of the interconnection network and helps improve bandwidth utilization. In addition, because the invalidation manager and the cache manager are located in the same processor, having the invalidation manager split the address range and deliver the multiple address ranges to the cache manager reduces signal transmission latency compared with a memory manager that must transmit multiple signals to an off-chip cache manager. In summary, the shared storage system 100 provided by the embodiments of the present application can improve the bandwidth utilization of the interconnection network and reduce signal latency, thereby improving the performance of the shared storage system 100.
In the embodiments of the present application, when a copy of data held by a processor is invalidated and that data is dirty, the cache manager of the processor needs to write the dirty data back to memory. Accordingly, in an optional implementation of the embodiments of the present application, when the data to be invalidated is dirty data, the cache manager may also transmit the dirty data to the on-chip invalidation manager, which writes the dirty data back to memory. It should be noted that, from the time the memory manager initiates invalidation of the copy of the data of an address space held by a processor until that processor returns information indicating that the invalidation is complete, the memory manager locks the address space; that is, no other processor can access the address space.
In the shared storage system 100 shown in FIG. 5, when a memory manager needs to invalidate a large amount of data of a declared address space held in a cache manager, it transmits a signal indicating invalidation of the copy of the data in that declared address space to the invalidation manager that serves that cache manager. In an optional implementation of the embodiments of the present application, if the data that the memory manager needs to invalidate is stored outside the declared address spaces (for example, memory manager 21 needs to invalidate the data at addresses 0x3100~0x3200 shown in FIG. 3), or if the amount of data to be invalidated is the amount specified by the snoopy protocol (for example, the data of one data segment), the memory manager may transmit a signal indicating invalidation of the data in an address range directly to the cache manager. In addition, if the cache manager detects, based on that signal, that dirty data is stored in the address range, it can write the dirty data directly back to the corresponding address range in memory.
Based on the shared storage system 100 shown in FIG. 5, the mapping between memory and the global address space shown in FIG. 2, and the address space of memory 11 shown in FIG. 3, the interaction flow between the components of the shared storage system 100 shown in FIG. 5 is described below, taking as an example the case in which cache manager 32 and cache manager 33 both cache address space p1 of memory 11 and cache manager 32 requests to rewrite data in address space p1. Please refer to FIG. 6, which shows an interaction flow 600 between the components of the shared storage system 100 shown in FIG. 5. The interaction flow includes the following steps:
Step 601, cache manager 32 sends a request q2 to memory manager 21. The request q2 indicates that the data in address range A is to be rewritten.
Step 602, memory manager 21 sends a signal s2 to invalidation manager 43 based on the request q2. Specifically, after receiving the request q2, memory manager 21 first determines that address range A lies in address space p1. Memory manager 21 then determines that cache manager 33 holds a copy of the data in address space p1. Finally, memory manager 21 sends the signal s2 to invalidation manager 43; the signal s2 indicates invalidation of the copy of the data in address space p1.
Step 603, invalidation manager 43 generates invalidation signal i1, invalidation signal i2, and invalidation signal i3 based on the signal s2, the preset size of a data segment, and address space p1. Step 604, invalidation manager 43 provides invalidation signal i1, invalidation signal i2, and invalidation signal i3 to cache manager 33 in a time-shared manner. Specifically, based on the preset size of a data segment, invalidation manager 43 divides address space p1 into three address ranges, namely address range a1, address range a2, and address range a3, where the amount of data in each address range is one data segment. The data segment is the granularity at which the cache manager operates on data. Invalidation signal i1 indicates invalidation of the copy of the data in address range a1, invalidation signal i2 indicates invalidation of the copy of the data in address range a2, and invalidation signal i3 indicates invalidation of the copy of the data in address range a3. It should be noted that FIG. 6 shows invalidation manager 43 transmitting three signals to the cache manager; it can be understood that the embodiments of the present application are not limited to this, and in a specific scenario the number of address ranges and the number of generated signals can be adjusted based on the size of the address space and the size of the data segment.
Step 605, based on signal i1, cache manager 33 detects that the data segment in address range a1 has not been rewritten (that is, the status information of the data segment is one of exclusive, shared, and invalid) and transmits a response r1 to invalidation manager 43; the response r1 indicates that the data segment in address range a1 has not been rewritten. Based on signal i2, cache manager 33 detects that the data segment in address range a2 has not been rewritten and transmits a response r2 to invalidation manager 43; the response r2 indicates that the data segment in address range a2 has not been rewritten. Based on signal i3, cache manager 33 detects that the data segment in address range a3 has not been rewritten and transmits a response r3 to invalidation manager 43; the response r3 indicates that the data segment in address range a3 has not been rewritten. It should be noted that if the status information of a data segment in address ranges a1 to a3 is exclusive or shared, cache manager 33 must change the status information from "exclusive" or "shared" to "invalid" before transmitting the response.
It should be noted that step 604 and step 605 above do not impose a chronological order. For example, in the first clock cycle, invalidation manager 43 sends invalidation signal i1 to cache manager 33. In the second clock cycle, invalidation manager 43 sends invalidation signal i2 to cache manager 33, and cache manager 33 generates response r1 based on invalidation signal i1 and transmits it to invalidation manager 43. In the third clock cycle, invalidation manager 43 sends invalidation signal i3 to cache manager 33, and cache manager 33 generates response r2 based on invalidation signal i2 and transmits it to invalidation manager 43. In the fourth clock cycle, cache manager 33 generates response r3 based on invalidation signal i3 and transmits it to invalidation manager 43.
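The overlap between steps 604 and 605 can be tabulated with a small sketch. It assumes, as in the example above, that one signal is sent per clock cycle and that each response follows its signal by one cycle; the function name `pipeline_schedule` and the `latency` parameter are hypothetical.

```python
def pipeline_schedule(n_signals, latency=1):
    """Map each clock cycle to the events that occur in it.

    Signal k is sent in cycle k; its response is returned `latency` cycles later.
    """
    schedule = {}
    for k in range(1, n_signals + 1):
        schedule.setdefault(k, []).append(f"send i{k}")
        schedule.setdefault(k + latency, []).append(f"respond r{k}")
    return schedule

for cycle, events in sorted(pipeline_schedule(3).items()):
    print(cycle, events)
```

With three signals the whole exchange completes in four cycles rather than the six that a strictly sequential send-then-respond ordering would take under the same assumptions.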
Step 606, based on response r1, response r2, and response r3, invalidation manager 43 transmits a signal s3 to memory manager 21; the signal s3 indicates that the invalidation of the copy of the data in address space p1 is complete. In this step, invalidation manager 43 transmits the signal s3 to memory manager 21 once it has received all the responses and each response indicates that the data segment in its address range has not been rewritten.
Step 607, memory manager 21 sends a signal s4 to cache manager 32; the signal s4 indicates that cache manager 32 is allowed to rewrite the data in address range A.
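The message sequence of steps 601 to 607 can be traced with a short simulation. This is a sketch only: the function name `run_flow_600` and the string-based trace are assumptions, and the example covers just the case in which no data segment is in the modified state.

```python
def run_flow_600(segment_states):
    """Trace the messages of flow 600 for unmodified data segments.

    `segment_states` lists the state of each data segment that cache
    manager 33 holds for address space p1: "E", "S" or "I". The list is
    updated in place so that every entry ends up as "I".
    """
    trace = ["q2: cache manager 32 -> memory manager 21",
             "s2: memory manager 21 -> invalidation manager 43"]
    responses = 0
    for k, state in enumerate(segment_states, start=1):
        if state not in ("E", "S", "I"):
            raise ValueError("this sketch does not cover modified segments")
        trace.append(f"i{k}: invalidation manager 43 -> cache manager 33")
        segment_states[k - 1] = "I"          # E or S becomes I; I stays I
        trace.append(f"r{k}: cache manager 33 -> invalidation manager 43")
        responses += 1
    if responses == len(segment_states):     # all responses collected
        trace.append("s3: invalidation manager 43 -> memory manager 21")
        trace.append("s4: memory manager 21 -> cache manager 32")
    return trace

states = ["E", "S", "I"]
for message in run_flow_600(states):
    print(message)
```

In the trace, only q2, s2, s3, and s4 travel between processors; the i and r messages stay inside processor 03.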
In step 605 of the interaction flow 600 shown in FIG. 6, cache manager 33 detects that the status information of the data in address ranges a1 to a3 is one of exclusive, shared, and invalid. In a possible implementation of the embodiments of the present application, the status information of the data segments in at least some of address ranges a1 to a3 is modified; that is, dirty data exists in at least some of address ranges a1 to a3. When dirty data exists, the dirty data must also be written back to memory 11. This is described in detail below with reference to the interaction flow 700 shown in FIG. 7.
In FIG. 7, steps 701 to 704 are the same as steps 601 to 604 shown in FIG. 6. For details, refer to the description of steps 601 to 604 in FIG. 6; they are not repeated here.
Step 705, based on signal i1, cache manager 33 detects that the data segment in address range a1 has not been rewritten and transmits a response r4 to invalidation manager 43; the response r4 indicates that the data segment in address range a1 has not been rewritten. Based on signal i2, cache manager 33 detects that the data segment in address range a2 has not been rewritten and transmits a response r2 to invalidation manager 43; the response r2 indicates that the data segment in address range a2 has not been rewritten. Based on signal i3, cache manager 33 detects that the status information of the data segment in address range a3 is modified and transmits a response signal r3 to invalidation manager 43; the response signal r3 includes the data segment D1 in address range a3.
Step 706, invalidation manager 43 writes the data segment D1 back to address range a3 in memory 11.
Step 707, memory manager 21 detects that the data segment D1 has been stored in memory 11 and sends a signal S5 to invalidation manager 43; the signal S5 indicates that storage of the data segment D1 is complete.
步骤708~步骤709与图6所示的步骤606~步骤607相同,具体参考图6中步骤606~步骤607的相关描述,不再赘述。Steps 708 to 709 are the same as steps 606 to 607 shown in FIG. 6 . For details, refer to related descriptions of steps 606 to 607 in FIG. 6 , and details are not repeated here.
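The write-back path of flow 700 can be illustrated with a minimal Python sketch. It is not taken from the application: the `Response`, `Memory`, and `finish_invalidation` names, the tuple representation of an address range, and the boolean return values standing in for signals are all assumptions made for illustration. The sketch only shows the ordering constraint described above, namely that a response carrying a modified data segment causes that segment to be stored in memory before invalidation is reported as complete.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical representation of an address range: (start, end), end exclusive.
AddressRange = Tuple[int, int]


@dataclass
class Response:
    """Reply from the cache manager to one per-range invalidation signal."""
    addr_range: AddressRange
    dirty_data: Optional[bytes] = None  # present only for a segment in the modified state


class Memory:
    """Stand-in for the shared memory together with its memory manager."""

    def __init__(self) -> None:
        self.cells: Dict[AddressRange, bytes] = {}

    def write(self, addr_range: AddressRange, data: bytes) -> bool:
        self.cells[addr_range] = data
        return True  # plays the role of the "storage complete" signal


def finish_invalidation(responses: List[Response], memory: Memory) -> bool:
    """Write back every dirty segment, then report that invalidation is complete."""
    for resp in responses:
        if resp.dirty_data is not None:
            stored = memory.write(resp.addr_range, resp.dirty_data)
            if not stored:
                return False  # do not acknowledge until the write-back has landed
    return True  # all responses handled; completion may now be signalled


memory = Memory()
responses = [
    Response((0, 64)),             # segment not rewritten
    Response((64, 128)),           # segment not rewritten
    Response((128, 192), b"D1"),   # modified segment, data carried in the response
]
done = finish_invalidation(responses, memory)
```

In this sketch only the third range ends up in `memory.cells`, since the other two responses carry no data.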
Based on the interaction flows shown in FIG. 6 and FIG. 7, in one scenario the cache manager 33 begins transmitting the data segment in address range a1 to the memory 11 before step 601 shown in FIG. 6 (or step 701 shown in FIG. 7), and the data in address range a1 has still not finished transferring when the cache manager 33 receives the invalidation signal i1 in step 605. For this scenario, refer to FIG. 8, which shows yet another interaction flow 800 of the shared storage system 100 provided by an embodiment of this application. The interaction flow 800 includes the following steps:
Step 801: the cache manager 33 writes the data segment D2 in address range a1 to the memory 11.
Steps 802 to 805 are the same as steps 601 to 604 shown in FIG. 6. For details, refer to the description of steps 601 to 604 in FIG. 6, which is not repeated here.
Step 806: based on signal i2, the cache manager 33 detects that the data segment in address range a2 has not been rewritten and transmits a response r2 to the invalidation manager 43, where response r2 indicates that the data segment in address range a2 has not been rewritten. Based on signal i3, the cache manager 33 detects that the data segment in address range a3 has not been rewritten and transmits a response r3 to the invalidation manager 43, where response r3 indicates that the data segment in address range a3 has not been rewritten.
Step 807: the memory manager 21 transmits a signal S6 to the cache manager 33, where signal S6 indicates that storage of the data segment D2 is complete.
Step 808: based on signal i1 and signal S6, the cache manager 33 transmits a response r1 to the invalidation manager 43, where response r1 indicates that the data segment in address range a1 has not been rewritten.
Steps 809 and 810 are the same as steps 606 and 607 shown in FIG. 6. For details, refer to the description of steps 606 and 607 in FIG. 6, which is not repeated here.
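The deferral behaviour in flow 800 can be sketched as follows. This is an illustrative model only, written under stated assumptions: the class name `CacheManagerSketch`, its method names, and the use of plain strings for address ranges are hypothetical and do not come from the application. The sketch shows one way a cache manager could hold back the response for a range whose write-back is still in flight and release it once the storage-complete signal arrives.

```python
from typing import List, Set


class CacheManagerSketch:
    """Hypothetical cache manager that delays a response while a write-back is pending."""

    def __init__(self) -> None:
        self.pending_writebacks: Set[str] = set()  # ranges still being stored to memory
        self.deferred: Set[str] = set()            # invalidation signals waiting on a write-back
        self.sent: List[str] = []                  # ranges for which a response has been sent

    def start_writeback(self, addr_range: str) -> None:
        # Corresponds to writing a data segment to memory before invalidation begins.
        self.pending_writebacks.add(addr_range)

    def on_invalidate(self, addr_range: str) -> None:
        # Respond at once unless the range still has an unacknowledged write-back.
        if addr_range in self.pending_writebacks:
            self.deferred.add(addr_range)
        else:
            self.sent.append(addr_range)

    def on_store_complete(self, addr_range: str) -> None:
        # The memory manager reports that the segment is stored; release any held response.
        self.pending_writebacks.discard(addr_range)
        if addr_range in self.deferred:
            self.deferred.remove(addr_range)
            self.sent.append(addr_range)


cm = CacheManagerSketch()
cm.start_writeback("a1")
for r in ("a1", "a2", "a3"):
    cm.on_invalidate(r)
early = list(cm.sent)       # responses sent before the write-back finished
cm.on_store_complete("a1")
final = list(cm.sent)       # the held response for a1 is sent last
```

The response for `a1` is sent last in this model, mirroring the order of steps 806 to 808.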
Based on the same inventive concept, an embodiment of this application further provides a method for invalidating cached data, which may be applied to any memory manager shown in FIG. 1. Refer to FIG. 9, which shows a flow 900 of the method for invalidating cached data provided by an embodiment of this application. The flow includes the following steps. Step 901: receive a first request from a first processor, where the first request indicates that data at a first address is to be rewritten, and the first address is an address in a shared memory. Step 902: based on the first request, send a first signal to a second processor, where the first signal indicates invalidation of first data held by the second processor, the first data is a copy of the data in a first address space of the shared memory, and the first address is located in the first address space.
It should be noted that the memory manager that executes flow 900 may be provided in any processor shown in FIG. 1, and the processor in which that memory manager is provided is referred to as the third processor. The first processor, the second processor, and the third processor are all different processors. For example, in FIG. 1, the first processor is processor 01, the second processor is processor 02, and the third processor is processor 03, where a memory manager 33 is provided in the third processor 03 and is configured to execute the flow 900 shown in FIG. 9.
In a possible implementation, the method further includes: receiving a second signal from the second processor, where the second signal indicates that invalidation of the first data is complete; and, based on the second signal, sending a third signal to the first processor, where the third signal indicates that the first processor is allowed to rewrite the data at the first address.
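As a rough illustration of flow 900 from the memory manager's side, the following Python sketch models the three signals as messages passed to a `send` callback. Everything named here is an assumption for illustration: the `MemoryManagerSketch` class, the `directory` mapping from address space to the processor holding a copy, and the message tuples are hypothetical, and the application does not specify how the memory manager identifies the second processor.

```python
from typing import Callable, Dict, List, Tuple

AddressSpace = Tuple[int, int]   # hypothetical: (start, end), end exclusive
Message = Tuple[str, object]


class MemoryManagerSketch:
    def __init__(self, send: Callable[[str, Message], None],
                 directory: Dict[AddressSpace, str]) -> None:
        self.send = send                 # delivers a message to a named processor
        self.directory = directory       # assumed record of which processor holds a copy
        self.waiting: Dict[AddressSpace, Tuple[str, int]] = {}

    def on_write_request(self, requester: str, address: int) -> None:
        """First request: a processor asks to rewrite the data at `address`."""
        for space, holder in self.directory.items():
            start, end = space
            if start <= address < end and holder != requester:
                # First signal: ask the holder to invalidate its copy of the space.
                self.waiting[space] = (requester, address)
                self.send(holder, ("invalidate", space))
                return
        # No other copy is recorded, so the write can be allowed immediately.
        self.send(requester, ("write_allowed", address))

    def on_invalidation_done(self, space: AddressSpace) -> None:
        """Second signal received: forward permission (third signal) to the requester."""
        requester, address = self.waiting.pop(space)
        self.send(requester, ("write_allowed", address))


log: List[Tuple[str, Message]] = []
mm = MemoryManagerSketch(lambda dst, msg: log.append((dst, msg)), {(0, 256): "P2"})
mm.on_write_request("P1", 100)      # P1 wants to rewrite address 100
mm.on_invalidation_done((0, 256))   # P2 reports that invalidation is complete
```

Note that the invalidation targets the whole address space containing the requested address, not just the single address, which matches the relationship between the first address and the first address space described above.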
Based on the same inventive concept, an embodiment of this application further provides a method for invalidating cached data, which may be applied to any processor shown in FIG. 1. Refer to FIG. 10, which shows a flow 1000 of the method for invalidating cached data provided by an embodiment of this application. The flow includes the following steps. Step 1001: receive a first signal from a memory manager, where the first signal indicates invalidation of the first data that is held, the first data is a copy of the data in a first address space of a shared memory, and the first address is located in the first address space. Step 1002: based on the first signal, invalidate the first data.
In a possible implementation, invalidating the first data based on the first signal includes: based on the first signal and a preset data segment size, splitting the first address space into a plurality of address ranges to obtain a plurality of data segments corresponding to the first data, where one address range corresponds to one data segment; and invalidating the plurality of data segments.
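The splitting step can be sketched in a few lines of Python. The function name `split_address_space`, the half-open `(start, end)` representation, and the handling of a final short range are assumptions for illustration; the application only states that the address space is divided according to a preset data segment size.

```python
from typing import List, Tuple


def split_address_space(start: int, end: int, segment_size: int) -> List[Tuple[int, int]]:
    """Split [start, end) into consecutive ranges of at most `segment_size` addresses."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    ranges = []
    cursor = start
    while cursor < end:
        upper = min(cursor + segment_size, end)  # the last range may be shorter
        ranges.append((cursor, upper))
        cursor = upper
    return ranges


# A 192-address space with a 64-address segment size yields three ranges,
# analogous to address ranges a1, a2 and a3 in the flows above.
ranges = split_address_space(0, 192, 64)
```

Each resulting range would then receive its own invalidation signal.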
In a possible implementation, invalidating the plurality of data segments includes: when the state of at least some of the plurality of data segments is one of a shared state or an exclusive state, changing the state of the at least some data segments to an invalid state.
In a possible implementation, the method further includes: when the state of every data segment among the plurality of data segments is the invalid state, transmitting a second signal to the memory manager, where the second signal indicates that invalidation of the first data is complete.
In a possible implementation, before the second signal is transmitted to the memory manager, the method further includes: when the state of a first data segment among the plurality of data segments is a rewritten state, writing the first data segment back to the corresponding address range in the shared memory.
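The per-segment state handling in these implementations can be summarized with a small Python sketch. The `State` enum, the dictionaries used for the cache and the shared memory, and the function name `invalidate_segments` are hypothetical stand-ins rather than structures defined in the application. The sketch assumes that a segment in the rewritten (modified) state is written back first and then marked invalid, and that the second signal may be sent once the function returns `True`.

```python
from enum import Enum
from typing import Dict, Tuple

AddressRange = Tuple[int, int]  # hypothetical: (start, end), end exclusive


class State(Enum):
    MODIFIED = "modified"
    EXCLUSIVE = "exclusive"
    SHARED = "shared"
    INVALID = "invalid"


def invalidate_segments(states: Dict[AddressRange, State],
                        cached: Dict[AddressRange, bytes],
                        memory: Dict[AddressRange, bytes]) -> bool:
    """Invalidate every segment, writing modified ones back to memory first."""
    for addr_range in list(states):
        if states[addr_range] is State.MODIFIED:
            # Dirty data must reach shared memory before the copy is dropped.
            memory[addr_range] = cached[addr_range]
        states[addr_range] = State.INVALID
    # True means every segment is invalid, the condition for sending the second signal.
    return all(s is State.INVALID for s in states.values())


states = {
    (0, 64): State.SHARED,
    (64, 128): State.EXCLUSIVE,
    (128, 192): State.MODIFIED,
}
cached = {(128, 192): b"D1"}
memory: Dict[AddressRange, bytes] = {}
ready_to_signal = invalidate_segments(states, cached, memory)
```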
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.

Claims (26)

  1. A shared storage system, comprising a memory manager, a plurality of processors, and a shared memory, wherein:
    a first processor among the plurality of processors is configured to send a first request to the memory manager, the first request indicating that data at a first address is to be rewritten, the first address being an address in the shared memory;
    the memory manager is configured to send, based on the first request, a first signal to a second processor among the plurality of processors, the first signal indicating invalidation of first data held by the second processor, the first data being a copy of the data in a first address space of the shared memory, and the first address being located in the first address space; and
    the second processor is configured to invalidate the first data based on the first signal.
  2. The shared storage system according to claim 1, wherein the second processor is specifically configured to:
    split, based on the first signal and a preset data segment size, the first address space into a plurality of address ranges to obtain a plurality of data segments corresponding to the first data, one address range corresponding to one data segment; and
    invalidate the plurality of data segments.
  3. The shared storage system according to claim 2, wherein the second processor is specifically configured to:
    when the state of at least some of the plurality of data segments is one of a shared state or an exclusive state, change the state of the at least some data segments to an invalid state.
  4. The shared storage system according to claim 2 or 3, wherein the second processor comprises a cache manager and an invalidation manager;
    the invalidation manager is configured to generate a plurality of invalidation signals based on the plurality of address ranges, one address range corresponding to one invalidation signal, and to transmit the plurality of invalidation signals to the cache manager in a time-division manner; and
    the cache manager is configured to invalidate the plurality of data segments based on the plurality of invalidation signals.
  5. The shared storage system according to claim 4, wherein:
    the cache manager is further configured to send, to the invalidation manager, a plurality of responses to the plurality of invalidation signals, one response corresponding to one data segment, and a first response among the plurality of responses indicating that a first data segment among the plurality of data segments has not been rewritten; and
    the invalidation manager is further configured to transmit a second signal to the memory manager based on the plurality of responses, the second signal indicating that invalidation of the first data is complete.
  6. The shared storage system according to claim 5, wherein a second response among the plurality of responses carries a second data segment among the plurality of data segments, the state of the second data segment being a rewritten state; and
    the invalidation manager is further configured to write the second data segment back to the corresponding address range in the shared memory.
  7. The shared storage system according to claim 5, wherein a first invalidation signal among the plurality of invalidation signals indicates invalidation of the first data segment, and the cache manager is specifically configured to:
    write the first data segment back to the shared memory before receiving the first invalidation signal; and
    send the first response to the invalidation manager in response to a third signal sent by the memory manager, the third signal indicating that storage of the first data segment is complete.
  8. The shared storage system according to claim 5, wherein the memory manager is further configured to:
    send a fourth signal to the first processor based on the second signal, the fourth signal indicating that the first processor is allowed to rewrite the data at the first address.
  9. The shared storage system according to any one of claims 4 to 8, wherein:
    a third processor among the plurality of processors is configured to send a second request to the memory manager, the second request indicating that data at a second address is to be rewritten, the second address being an address in the shared memory;
    the memory manager is configured to send, based on the second request, a fifth signal to the cache manager, the fifth signal indicating invalidation of second data held by the cache manager, the second data being a copy of the data at the second address; and
    the cache controller is configured to send, in response to the fifth signal, a third response to the memory manager, the third response indicating that the second data is modified or unmodified.
  10. An apparatus, comprising a memory manager, wherein the memory manager is configured to:
    receive a first request from a first processor, the first request indicating that data at a first address is to be rewritten, the first address being an address in a shared memory; and
    send, based on the first request, a first signal to a second processor, the first signal indicating invalidation of first data held by the second processor, the first data being a copy of the data in a first address space of the shared memory, and the first address being located in the first address space;
    wherein the shared memory is managed by the memory manager, and the address space in the shared memory is accessible to the first processor and the second processor.
  11. The apparatus according to claim 10, wherein the memory manager is further configured to:
    receive a second signal from the second processor, the second signal indicating that invalidation of the first data is complete; and
    send, based on the second signal, a third signal to the first processor, the third signal indicating that the first processor is allowed to rewrite the data at the first address.
  12. The apparatus according to claim 10 or 11, wherein the memory manager is further configured to:
    monitor the second processor storing a first data segment into the shared memory, the first data segment being a segment of the first data; and
    in response to storage of the first data segment being complete, send a fourth signal to the second processor, the fourth signal indicating that storage of the first data segment is complete.
  13. An apparatus, comprising a processor, wherein the processor is configured to:
    receive a first signal from a memory manager, the first signal indicating invalidation of first data that is held, the first data being a copy of the data in a first address space of a shared memory, the first address being located in the first address space, and the shared memory being managed by the memory manager; and
    invalidate the first data based on the first signal.
  14. The apparatus according to claim 13, wherein the processor is specifically configured to:
    split, based on the first signal and a preset data segment size, the first address space into a plurality of address ranges to obtain a plurality of data segments corresponding to the first data, one address range corresponding to one data segment; and
    invalidate the plurality of data segments.
  15. The apparatus according to claim 14, wherein the processor is specifically configured to:
    when the state of at least some of the plurality of data segments is one of a shared state or an exclusive state, change the state of the at least some data segments to an invalid state.
  16. The apparatus according to claim 14 or 15, wherein the processor comprises a cache manager and an invalidation manager;
    the invalidation manager is configured to generate a plurality of invalidation signals based on the plurality of address ranges, one address range corresponding to one invalidation signal, and to transmit the plurality of invalidation signals to the cache manager in a time-division manner; and
    the cache manager is configured to invalidate the plurality of data segments based on the plurality of invalidation signals.
  17. The apparatus according to claim 16, wherein:
    the cache manager is further configured to send, to the invalidation manager, a plurality of responses to the plurality of invalidation signals, one response corresponding to one data segment, and a first response among the plurality of responses indicating that a first data segment among the plurality of data segments has not been rewritten; and
    the invalidation manager is further configured to transmit a second signal to the memory manager based on the plurality of responses, the second signal indicating that invalidation of the first data is complete.
  18. The apparatus according to claim 17, wherein a second response among the plurality of responses carries a second data segment among the plurality of data segments, the state of the second data segment being a rewritten state; and
    the invalidation manager is further configured to, when the state of the second data segment is the rewritten state, write the second data segment back to the corresponding address range in the shared memory.
  19. The apparatus according to claim 17 or 18, wherein a first invalidation signal among the plurality of invalidation signals indicates invalidation of the first data segment, and the cache manager is specifically configured to:
    write the first data segment back to the shared memory before receiving the first invalidation signal; and
    send the first response to the invalidation manager in response to a fourth signal sent by the memory manager, the fourth signal indicating that storage of the first data segment is complete.
  20. A method for invalidating cached data, comprising:
    receiving a first request from a first processor, the first request indicating that data at a first address is to be rewritten, the first address being an address in a shared memory; and
    sending, based on the first request, a first signal to a second processor, the first signal indicating invalidation of first data held by the second processor, the first data being a copy of the data in a first address space of the shared memory, and the first address being located in the first address space.
  21. The method according to claim 20, further comprising:
    receiving a second signal from the second processor, the second signal indicating that invalidation of the first data is complete; and
    sending, based on the second signal, a third signal to the first processor, the third signal indicating that the first processor is allowed to rewrite the data at the first address.
  22. A method for invalidating cached data, comprising:
    receiving a first signal from a memory manager, the first signal indicating invalidation of first data that is held, the first data being a copy of the data in a first address space of a shared memory, and the first address being located in the first address space; and
    invalidating the first data based on the first signal.
  23. The method according to claim 22, wherein invalidating the first data based on the first signal comprises:
    splitting, based on the first signal and a preset data segment size, the first address space into a plurality of address ranges to obtain a plurality of data segments corresponding to the first data, one address range corresponding to one data segment; and
    invalidating the plurality of data segments.
  24. The method according to claim 22, wherein invalidating the plurality of data segments comprises:
    when the state of at least some of the plurality of data segments is one of a shared state or an exclusive state, changing the state of the at least some data segments to an invalid state.
  25. The method according to claim 23 or 24, further comprising:
    when the state of every data segment among the plurality of data segments is the invalid state, transmitting a second signal to the memory manager, the second signal indicating that invalidation of the first data is complete.
  26. The method according to claim 25, wherein before the second signal is transmitted to the memory manager, the method further comprises:
    when the state of a first data segment among the plurality of data segments is a rewritten state, writing the first data segment back to the corresponding address range in the shared memory.
PCT/CN2022/072102 2022-01-14 2022-01-14 Shared storage system and apparatus, and method for invalidating cache data WO2023133830A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202280041800.0A CN117529899A (en) 2022-01-14 2022-01-14 Shared memory system, apparatus and method for invalidating cached data
PCT/CN2022/072102 WO2023133830A1 (en) 2022-01-14 2022-01-14 Shared storage system and apparatus, and method for invalidating cache data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/072102 WO2023133830A1 (en) 2022-01-14 2022-01-14 Shared storage system and apparatus, and method for invalidating cache data

Publications (1)

Publication Number Publication Date
WO2023133830A1 true WO2023133830A1 (en) 2023-07-20

Family

ID=87279874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/072102 WO2023133830A1 (en) 2022-01-14 2022-01-14 Shared storage system and apparatus, and method for invalidating cache data

Country Status (2)

Country Link
CN (1) CN117529899A (en)
WO (1) WO2023133830A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030110360A1 (en) * 2001-12-10 2003-06-12 Mitsubishi Denki Kabushiki Kaisha Cache device controlling a state of a corresponding cache memory according to a predetermined protocol
CN101859281A (en) * 2009-04-13 2010-10-13 廖鑫 Method for embedded multi-core buffer consistency based on centralized directory
CN105659216A (en) * 2014-09-29 2016-06-08 华为技术有限公司 Cache directory processing method and directory controller of multi-core processor system
US20170185519A1 (en) * 2015-12-28 2017-06-29 Freescale Semiconductor, Inc. Computing system with a cache invalidation unit, a cache invalidation unit and a method of operating a cache invalidation unit in a computing system
US20180165196A1 (en) * 2016-12-12 2018-06-14 Intel Corporation Instruction and logic for flushing memory ranges in a distributed shared memory system

Also Published As

Publication number Publication date
CN117529899A (en) 2024-02-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22919481

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202280041800.0

Country of ref document: CN