WO2022216597A1 - I/O Agent - Google Patents

I/O Agent

Info

Publication number
WO2022216597A1
WO2022216597A1 (PCT/US2022/023296)
Authority
WO
WIPO (PCT)
Prior art keywords
cache line
data
agent
read
ownership
Application number
PCT/US2022/023296
Other languages
English (en)
Inventor
Gaurav Garg
Sagi Lahav
Lital LEVY-RUBIN
Gerard WILLIAMS III
Samer Nassar
Per H. Hammarlund
Harshavardhan Kaushikkar
Srinivasa Rangan Sridharan
Jeff Gonion
James Vash
Original Assignee
Apple Inc.
Priority claimed from US17/648,071 (publication US11550716B2)
Application filed by Apple Inc.
Priority to CN202280025986.0A (publication CN117099088A)
Priority to KR1020237034002A (publication KR20230151031A)
Priority to DE112022001978.6T (publication DE112022001978T5)
Publication of WO2022216597A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806: Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815: Cache consistency protocols
    • G06F12/0831: Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
    • G06F2212/00: Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10: Providing a specific technical effect
    • G06F2212/1016: Performance improvement
    • G06F2212/62: Details of cache specific to multiprocessor cache arrangements
    • G06F2212/621: Coherency control relating to peripheral accessing, e.g. from DMA or I/O device

Definitions

  • This disclosure relates generally to an integrated circuit and, more specifically, to cache coherency in relation to peripheral components.
  • Modern computer systems often include various hardware components that are coupled to memory devices (e.g., random access memory) of those systems.
  • The components typically retrieve data from those memory devices, manipulate the data, and then store that data back at one of those memory devices.
  • In many cases, multiple components (e.g., cores of a processor) share access to the same data.
  • Consider an example in which a first processor core accesses a block of data that it temporarily stores locally. While the data is being held by the first processor core, a second processor core may attempt to access the block of data from the same data source so that it can be used by the second processor core. If data coherency is not maintained for that data, then issues can arise in which the data becomes incoherent or is incorrectly processed.
  • Accordingly, data that is accessed by peripheral devices, processor cores, or other components that expect coherent access to memory requires data coherency to be maintained.
  • Various embodiments relating to an I/O agent circuit that is configured to implement coherency mechanisms for processing transactions associated with peripheral components (or, simply “peripherals”) are disclosed.
  • In various embodiments, a system on a chip (SOC) includes memory that stores data, memory controllers that manage access to that memory, and peripherals that operate on data of that memory (e.g., read and write data).
  • An I/O agent circuit is disclosed that is configured to bridge the peripherals to a coherent fabric that is coupled to the set of memory controllers, including implementing coherency mechanisms for processing transactions associated with those peripherals.
  • In various embodiments, the I/O agent circuit may receive, from a peripheral, requests to perform a set of read transactions that are directed to one or more cache lines of the SOC; the set includes at least one read transaction.
  • the I/O agent circuit may issue, to a memory controller circuit that manages access to one of those cache lines, a request for exclusive read ownership of that cache line such that the data of the cache line is not cached outside of the memory and the I/O agent circuit in a valid state.
  • the I/O agent circuit may receive the data of the cache line and perform at least one of the read transactions against the cache line.
  • the I/O agent circuit may also receive requests to perform write transactions and thus request exclusive write ownership of the appropriate cache lines.
  • the I/O agent circuit might lose exclusive ownership of a cache line before the I/O agent circuit has performed the corresponding transaction(s). If there exists a threshold number of remaining unprocessed transactions directed to the lost cache line, then the I/O agent circuit may reacquire exclusive ownership of the cache line.
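  • The flow in the preceding paragraphs can be sketched in a few lines of Python. The sketch below is a minimal, illustrative model only, not code from the disclosure; every name in it (MemoryController, IOAgent, request_exclusive_read_ownership) is hypothetical.

```python
# Minimal runnable model of the exclusive-read-ownership flow described above.
# All names are hypothetical; the disclosure specifies behavior, not code.

class MemoryController:
    """Memory-controller side: grants ownership and snoops any prior owner."""
    def __init__(self, memory):
        self.memory = memory   # cache line address -> data
        self.owner = {}        # cache line address -> agent holding exclusive ownership

    def request_exclusive_read_ownership(self, requester, line):
        prev = self.owner.get(line)
        if prev is not None and prev is not requester:
            prev.invalidate(line)        # stands in for a snoop to the prior owner
        self.owner[line] = requester
        return self.memory[line]         # ownership response carrying the line's data

class IOAgent:
    """Bridges cache-less peripherals to the coherent fabric."""
    def __init__(self, mc):
        self.mc = mc
        self.cache = {}                  # lines currently held with exclusive read ownership

    def invalidate(self, line):
        self.cache.pop(line, None)       # relinquish ownership in response to a snoop

    def read(self, line):
        if line not in self.cache:       # acquire ownership only when the line is not held
            self.cache[line] = self.mc.request_exclusive_read_ownership(self, line)
        return self.cache[line]

mc = MemoryController({0x40: b"payload"})
agent = IOAgent(mc)
assert agent.read(0x40) == b"payload"    # first read fetches the line and caches it
assert agent.read(0x40) == b"payload"    # repeat read is served from the local copy
```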
  • Fig. 1 is a block diagram illustrating example elements of a system on a chip, according to some embodiments.
  • Fig. 2 is a block diagram illustrating example elements of interactions between an I/O agent and a memory controller, according to some embodiments.
  • Fig. 3A is a block diagram illustrating example elements of an I/O agent configured to process write transactions, according to some embodiments.
  • Fig. 3B is a block diagram illustrating example elements of an I/O agent configured to process read transactions, according to some embodiments.
  • Fig. 4 is a flow diagram illustrating an example of processing read transaction requests from a peripheral component, according to some embodiments.
  • Fig. 5 is a flow diagram illustrating an example method relating to the processing of read transaction requests by an I/O agent, according to some embodiments.
  • Fig. 6 is a block diagram illustrating an example process of fabricating at least a portion of an SOC, according to some embodiments.
  • Fig. 7 is a block diagram illustrating an example SOC that is usable in various types of systems, according to some embodiments.
  • In various embodiments, a computer system implements a data/cache coherency protocol in which a coherent view of data is ensured within the computer system. Consequently, changes to shared data are normally propagated throughout the computer system in a timely manner to ensure the coherent view.
  • A computer system may also implement a memory consistency model that defines what multiple software/hardware entities can expect in terms of memory behavior to enable shared-memory communication (e.g., strong ordering or relaxed ordering).
  • A computer system also typically includes or interfaces with peripherals, such as input/output (I/O) devices. These peripherals, however, are not configured to understand or make efficient use of the relaxed-memory consistency model that is implemented by the computer system.
  • For example, peripherals often use specific ordering rules for their transactions (which are discussed further below) that are stricter than the consistency model.
  • Many peripherals also do not have caches; that is, they are not cacheable devices. As a result, it can take considerably longer for peripherals to receive completion acknowledgements for their transactions, as the transactions are not completed in a local cache.
  • This disclosure addresses, among other things, these technical problems relating to peripherals not being able to make proper use of the relaxed-memory consistency model and not having caches.
  • a system on a chip includes memory, memory controllers, and an I/O agent coupled to peripherals.
  • the I/O agent is configured to receive read and write transaction requests from the peripherals that target specified memory addresses whose data may be stored in cache lines of the SOC.
  • a cache line can also be referred to as a cache block.
  • the specific ordering rules of the peripherals impose that the read/write transactions be completed serially (e.g., not out of order relative to the order in which they are received).
  • the I/O agent is configured to complete a read/write transaction before initiating the next occurring read/write transaction according to their execution order. But in order to perform those transactions in a more performant way, in various embodiments, the I/O agent is configured to obtain exclusive ownership of the cache lines being targeted such that the data of those cache lines is not cached in a valid state in other caching agents (e.g., a processor core) of the SOC.
  • For example, while completing a first transaction, the I/O agent may preemptively obtain exclusive ownership of the cache line(s) targeted by a second transaction.
  • the I/O agent receives data for those cache lines and stores the data within a local cache of the I/O agent.
  • the I/O agent may thereafter complete the second transaction in its local cache without having to send out a request for the data of those cache lines and wait for the data to be returned.
  • the I/O agent may obtain exclusive read ownership or exclusive write ownership depending on the type of the associated transaction.
  • the I/O agent might lose exclusive ownership of a cache line before the I/O agent has performed the corresponding transaction.
  • For example, the I/O agent may receive a snoop that causes it to relinquish exclusive ownership of the cache line, including invalidating the data stored at the I/O agent for the cache line.
  • a “snoop” or “snoop request,” as used herein, refers to a message that is transmitted to a component to request a state change for a cache line (e.g., to invalidate data of the cache line stored within a cache of the component) and, if that component has an exclusive copy of the cache line or is otherwise responsible for the cache line, the message may also request that the cache line be provided by the component.
  • the I/O agent may reacquire exclusive ownership of the cache line. For example, if there are three unprocessed write transactions that target the cache line, then the I/O agent may reacquire exclusive ownership of that cache line. This can prevent the unreasonably slow serialization of the remaining transactions that target a particular cache line. Larger or smaller numbers of unprocessed transactions may be used as the threshold in various embodiments.
  • By preemptively acquiring exclusive ownership, the high number of clock cycles otherwise incurred for each transaction may be avoided.
  • That is, when the I/O agent is processing a set of transactions, it can preemptively begin caching the data before the first transaction is complete.
  • the data for a second transaction may be cached and available when the first transaction is completed such that the I/O agent is then able to complete the second transaction shortly thereafter.
  • As a result, a portion of the transactions need not each take, e.g., over 500 clock cycles to complete.
  • SOC 100 includes a caching agent 110, memory controllers 120 A and 120B coupled to memory 130A and 130B, respectively, and an input/output (I/O) cluster 140.
  • I/O cluster 140 includes an I/O agent 142 and a peripheral 144.
  • In some embodiments, SOC 100 is implemented differently than shown.
  • SOC 100 may include a display controller, a power management circuit, etc. and memory 130A and 130B may be included on SOC 100.
  • I/O cluster 140 may have multiple peripherals 144, one or more of which may be external to SOC 100. Accordingly, it is noted that the number of components of SOC 100 (and also the number of subcomponents) may vary between embodiments. There may be more or fewer of each component/subcomponent than the number shown in Fig. 1.
  • A caching agent 110 is any circuitry that includes a cache for caching memory data or that may otherwise take control of cache lines and potentially update the data of those cache lines locally.
  • Caching agents 110 may participate in a cache coherency protocol to ensure that updates to data made by one caching agent 110 are visible to the other caching agents 110 that subsequently read that data, and that updates made in a particular order by two or more caching agents 110 (as determined at an ordering point within SOC 100, such as memory controllers 120A-B) are observed in that order by caching agents 110.
  • Caching agents 110 can include, for example, processing units (e.g., CPUs, GPUs, etc.), fixed function circuitry, and fixed function circuitry having processor assist via an embedded processor (or processors).
  • I/O agent 142 can be considered a type of caching agent 110. But I/O agent 142 is different from other caching agents 110 for at least the reason that I/O agent 142 serves as a cache-capable entity configured to cache data for other, separate entities (e.g., peripherals, such as a display, a USB-connected device, etc.) that do not have their own caches. Additionally, the I/O agent 142 may cache a relatively small number of cache lines temporarily to improve peripheral memory access latency, but may proactively retire cache lines once transactions are complete.
  • caching agent 110 is a processing unit having a processor 112 that may serve as the CPU of SOC 100.
  • Processor 112 includes any circuitry and/or microcode configured to execute instructions defined in an instruction set architecture implemented by that processor 112.
  • Processor 112 may encompass one or more processor cores that are implemented on an integrated circuit with other components of SOC 100. Those individual processor cores of processor 112 may share a common last level cache (e.g., an L2 cache) while including their own respective caches (e.g., an L0 cache and/or an L1 cache) for storing data and program instructions.
  • Processor 112 may execute the main control software of the system, such as an operating system.
  • Caching agent 110 may further include hardware that is configured to interface caching agent 110 to the other components of SOC 100 (e.g. an interface to interconnect 105).
  • Cache 114, in various embodiments, is a storage array that includes entries configured to store data or program instructions.
  • Accordingly, cache 114 may be a data cache, an instruction cache, or a shared instruction/data cache.
  • Cache 114 may be an associative storage array (e.g., fully associative or set-associative, such as a 4-way set associative cache) or a direct-mapped storage array, and may have any storage capacity.
  • cache lines (or alternatively, “cache blocks”) are the unit of allocation and deallocation within cache 114 and may be of any desired size (e.g. 32 bytes, 64 bytes, 128 bytes, etc.).
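  • To make the address-to-line mapping concrete, the sketch below decomposes an address into tag, set index, and line offset for a set-associative cache. The 64-byte line and 256-set figures are assumed example parameters, not values taken from the disclosure.

```python
# Illustrative tag/index/offset decomposition; parameters are examples only.
LINE_BYTES = 64   # cache line size (one of the example sizes mentioned above)
NUM_SETS = 256    # assumed number of sets in a set-associative array

def decompose(addr):
    offset = addr % LINE_BYTES                 # byte position within the cache line
    index = (addr // LINE_BYTES) % NUM_SETS    # which set the line maps to
    tag = addr // (LINE_BYTES * NUM_SETS)      # distinguishes lines sharing a set
    return tag, index, offset

assert decompose(0x12345) == (4, 141, 5)
```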
  • During operation of caching agent 110, information may be pulled from the other components of the system into cache 114 and used by processor cores of processor 112. For example, as a processor core proceeds through an execution path, the processor core may cause program instructions to be fetched from memory 130A-B into cache 114, and then the processor core may fetch them from cache 114 and execute them. Also during the operation of caching agent 110, information can be written from cache 114 to memory (e.g., memory 130A-B) through memory controllers 120A-B.
  • a memory controller 120 includes circuitry that is configured to receive, from the other components of SOC 100, memory requests (e.g., load/store requests, instruction fetch requests, etc.) to perform memory operations, such as accessing data from memory 130.
  • Memory controllers 120 may be configured to access any type of memory 130.
  • Memory 130 may be implemented using various, different physical memory media, such as hard disk storage, floppy disk storage, removable disk storage, flash memory, random access memory (RAM, such as SRAM, EDO RAM, SDRAM, DDR SDRAM, RAMBUS RAM, etc.), read-only memory (PROM, EEPROM, etc.), etc.
  • Memory available to SOC 100 is not limited to primary storage such as memory 130.
  • SOC 100 may further include other forms of storage, such as cache memory (e.g., L1 cache, L2 cache, etc.) in caching agent 110.
  • memory controllers 120 include queues for storing and ordering memory operations that are to be presented to memory 130.
  • Memory controllers 120 may also include data buffers to store write data awaiting transfer to memory 130 and read data awaiting return to the source of a memory operation, such as caching agent 110.
  • memory controllers 120 may include various components for maintaining cache coherency within SOC 100, including components that track the location of data of cache lines within SOC 100.
  • requests for cache line data are routed through memory controllers 120, which may access the data from other caching agents 110 and/or memory 130A-B.
  • memory controllers 120 may cause snoop requests to be issued to caching agents 110 and I/O agents 142 that store the data within their local cache.
  • memory controllers 120 can cause those caching agents 110 and I/O agents 142 to invalidate and/or evict the data from their caches to ensure coherency within the system.
  • In various embodiments, memory controllers 120 process exclusive cache line ownership requests, in which memory controllers 120 grant a component exclusive ownership of a cache line while using snoop requests to ensure that the data is not cached in other caching agents 110 and I/O agents 142.
  • I/O cluster 140 includes one or more peripheral devices 144 (or simply, peripherals 144) that may provide additional hardware functionality, and an I/O agent 142.
  • Peripherals 144 may include, for example, video peripherals (e.g., GPUs, blenders, video encoder/decoders, scalers, display controllers, etc.) and audio peripherals (e.g., microphones, speakers, interfaces to microphones and speakers, digital signal processors, audio processors, mixers, etc.).
  • Peripherals 144 may include interface controllers for various interfaces external to SOC 100 (e.g., Universal Serial Bus (USB), peripheral component interconnect (PCI) and PCI Express (PCIe), serial and parallel ports, etc.). The interconnection to external components is illustrated by the dashed arrow in Fig. 1 that extends external to SOC 100. Peripherals 144 may also include networking peripherals such as media access controllers (MACs). While not shown, in various embodiments, SOC 100 includes multiple I/O clusters 140 having respective sets of peripherals 144.
  • SOC 100 might include a first I/O cluster 140 having external display peripherals 144, a second I/O cluster 140 having USB peripherals 144, and a third I/O cluster 140 having video encoder peripherals 144.
  • Each of those I/O clusters 140 may include its own I/O agent 142.
  • I/O agent 142, in various embodiments, includes circuitry that is configured to bridge its peripherals 144 to interconnect 105 and to implement coherency mechanisms for processing transactions associated with those peripherals 144. As discussed in more detail with respect to Fig. 2, I/O agent 142 may receive transaction requests from a peripheral 144 to read and/or write data to cache lines associated with memory 130A-B. In response to those requests, in various embodiments, I/O agent 142 communicates with memory controllers 120 to obtain exclusive ownership over the targeted cache lines. Accordingly, memory controllers 120 may grant exclusive ownership to I/O agent 142, which may involve providing I/O agent 142 with cache line data and sending snoop requests to other caching agents 110 and I/O agents 142.
  • Once exclusive ownership of a cache line has been obtained, I/O agent 142 may start completing the transactions that target that cache line. In response to completing a transaction, I/O agent 142 may send an acknowledgement to the requesting peripheral 144 that the transaction has been completed. In some embodiments, I/O agent 142 does not obtain exclusive ownership for relaxed ordered requests, which do not have to be completed in a specified order.
  • Interconnect 105, in various embodiments, is any communication-based interconnect and/or protocol for communicating among components of SOC 100.
  • interconnect 105 may enable processor 112 within caching agent 110 to interact with peripheral 144 within I/O cluster 140.
  • interconnect 105 is bus-based, including shared bus configurations, cross bar configurations, and hierarchical buses with bridges.
  • Interconnect 105 may be packet-based, and may be hierarchical with bridges, cross bar, point-to-point, or other interconnects.
  • Turning now to Fig. 2, a block diagram of example elements of interactions involving a caching agent 110, a memory controller 120, an I/O agent 142, and peripherals 144 is shown.
  • memory controller 120 includes a coherency controller 210 and directory 220.
  • the illustrated embodiment may be implemented differently than shown. For example, there may be multiple caching agents 110, multiple memory controllers 120, and/or multiple I/O agents 142.
  • coherency controller 210 is configured to implement the memory controller portion of the cache coherency protocol.
  • the cache coherency protocol may specify messages, or commands, that may be transmitted between caching agents 110, I/O agents 142, and memory controllers 120 (or coherency controllers 210) in order to complete coherent transactions.
  • Those messages may include transaction requests 205, snoops 225, and snoop responses 227 (or alternatively, “completions”).
  • A transaction request 205, in various embodiments, is a message that initiates a transaction and specifies the requested cache line/block (e.g., via the address of that cache line/block).
  • a transaction request 205 may be a write transaction in which the requestor seeks to write data to a cache line or a read transaction in which the requestor seeks to read the data of a cache line.
  • a transaction request 205 may specify a non-relaxed ordered dynamic random-access memory (DRAM) request.
  • Coherency controller 210, in some embodiments, is also configured to issue memory requests 222 to memory 130 to access data from memory 130 on behalf of components of SOC 100 and to receive memory responses 224 that may include requested data.
  • I/O agent 142 receives transaction requests 205 from peripherals 144.
  • I/O agent 142 might receive a series of write transaction requests 205, a series of read transaction requests 205, or a combination of read and write transaction requests 205 from a given peripheral 144.
  • For example, I/O agent 142 may receive four read transaction requests 205 from peripheral 144A and three write transaction requests 205 from peripheral 144B.
  • transaction requests 205 received from a peripheral 144 have to be completed in a certain order (e.g., completed in the order in which they are received from a peripheral 144).
  • I/O agent 142 performs work on later requests 205 by preemptively obtaining exclusive ownership of the targeted cache lines. Accordingly, I/O agent 142 may issue exclusive ownership requests 215 to memory controllers 120 (particularly, coherency controllers 210). In some instances, a set of transaction requests 205 may target cache lines managed by different memory controllers 120 and as such, I/O agent 142 may issue exclusive ownership requests 215 to the appropriate memory controllers 120 based on those transaction requests 205. For a read transaction request 205, I/O agent 142 may obtain exclusive read ownership; for a write transaction request 205, I/O agent 142 may obtain exclusive write ownership.
  • Coherency controller 210 is circuitry configured to receive requests (e.g., exclusive ownership requests 215) from interconnect 105 (e.g. via one or more queues included in memory controller 120) that are targeted at cache lines mapped to memory 130 to which memory controller 120 is coupled. Coherency controller 210 may process those requests and generate responses (e.g., exclusive ownership response 217) having the data of the requested cache lines while also maintaining cache coherency in SOC 100. To maintain cache coherency, coherency controller 210 may use directory 220.
  • Directory 220, in various embodiments, is a storage array having a set of entries, each of which may track the coherency state of a respective cache line within the system.
  • an entry also tracks the location of the data of a cache line.
  • an entry of directory 220 may indicate that a particular cache line’s data is cached in cache 114 of caching agent 110 in a valid state.
  • (In some cases, a cache line may be shared between multiple cache-capable entities (e.g., caching agents 110) for read purposes, and thus shared ownership can be provided.)
  • When granting exclusive ownership, coherency controller 210 may ensure that the cache line is not stored outside of memory 130 and memory controller 120 in a valid state.
  • coherency controller 210 determines which components (e.g., caching agents 110, I/O agents 142, etc.) are to receive snoops 225 and the type of snoop 225 (e.g. invalidate, change to owned, etc.). For example, memory controller 120 may determine that caching agent 110 stores the data of a cache line requested by I/O agent 142 and thus may issue a snoop 225 to caching agent 110 as shown in Fig. 2. In some embodiments, coherency controller 210 does not target specific components, but instead, broadcasts snoops 225 that are observed by many of the components of SOC 100.
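  • A directory of this kind can be modeled as a small table keyed by cache line, as sketched below. This is a hypothetical illustration of how directory 220 might track a line's state and holders and select snoop targets; the field names and state encoding are assumptions, not the patent's design.

```python
# Hypothetical model of a coherency directory in the spirit of directory 220.
from dataclasses import dataclass, field

@dataclass
class DirectoryEntry:
    state: str = "Invalid"                        # e.g., "Shared", "Exclusive", "Invalid"
    holders: set = field(default_factory=set)     # agents caching the line in a valid state

class Directory:
    def __init__(self):
        self.entries = {}                         # cache line address -> DirectoryEntry

    def lookup(self, line):
        return self.entries.setdefault(line, DirectoryEntry())

    def snoop_targets(self, line, requester):
        """Agents that must be snooped before ownership can be granted."""
        return {h for h in self.lookup(line).holders if h != requester}

    def grant_exclusive(self, line, requester):
        entry = self.lookup(line)
        entry.state = "Exclusive"
        entry.holders = {requester}               # valid only at requester (and memory)

d = Directory()
d.grant_exclusive(0x80, "caching agent 110")
assert d.snoop_targets(0x80, "I/O agent 142") == {"caching agent 110"}
```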
  • In some embodiments, two types of snoops are supported: snoop forward and snoop back.
  • the snoop forward messages may be used to cause a component (e.g., cache agent 110) to forward the data of a cache line to the requesting component, whereas the snoop back messages may be used to cause the component to return the data of the cache line to memory controller 120.
  • Supporting snoop forward and snoop back flows may allow for both three-hop (snoop forward) and four-hop (snoop back) behaviors. For example, snoop forward may be used to minimize the number of messages when a cache line is provided to a component, since the component may store the cache line and potentially use the data therein.
  • caching agent 110 receives a snoop 225 from memory controller 120, processes that snoop 225 to update the cache line state (e.g., invalidate the cache line), and provides back a copy of the data of the cache line (if specified by the snoop 225) to the initial ownership requestor or memory controller 120.
  • A snoop response 227 (or a “completion”), in various embodiments, is a message that indicates that the state change has been made and provides the copy of the cache line data, if applicable.
  • In the case of the snoop forward mechanism, the data is provided to the requesting component in three hops over interconnect 105: the request from the requesting component to memory controller 120, the snoop from memory controller 120 to the caching component, and the snoop response from the caching component to the requesting component.
  • In the case of the snoop back mechanism, four hops may occur: the request and snoop, as in the three-hop protocol, the snoop response from the caching component to memory controller 120, and the data from memory controller 120 to the requesting component.
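  • The two flows differ only in the path the data takes. The illustrative lists below spell out the three-hop snoop-forward sequence and the four-hop snoop-back sequence described above; the message labels follow the figure numbering, but the functions themselves are not from the disclosure.

```python
# Hop-by-hop message sequences for the two snoop flows described above.
def snoop_forward_flow():
    return [
        ("requester", "memory controller 120", "exclusive ownership request 215"),
        ("memory controller 120", "caching component", "snoop 225 (forward)"),
        ("caching component", "requester", "snoop response 227 with data"),
    ]

def snoop_back_flow():
    return [
        ("requester", "memory controller 120", "exclusive ownership request 215"),
        ("memory controller 120", "caching component", "snoop 225 (back)"),
        ("caching component", "memory controller 120", "snoop response 227 with data"),
        ("memory controller 120", "requester", "exclusive ownership response 217"),
    ]

assert len(snoop_forward_flow()) == 3   # three-hop behavior
assert len(snoop_back_flow()) == 4      # four-hop behavior
```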
  • coherency controller 210 may update directory 220 when a snoop 225 is generated and transmitted instead of when a snoop response 227 is received.
  • coherency controller 210 grants exclusive read (or write) ownership to the ownership requestor (e.g., I/O agent 142) via an exclusive ownership response 217.
  • the exclusive ownership response 217 may include the data of the requested cache line.
  • coherency controller 210 updates directory 220 to indicate that the cache line has been granted to the ownership requestor.
  • I/O agent 142 may receive a series of read transaction requests 205 from peripheral 144A. For a given one of those requests, I/O agent 142 may send an exclusive read ownership request 215 to memory controller 120 for data associated with a specific cache line (or if the cache line is managed by another memory controller 120, then the exclusive read ownership request 215 is sent to that other memory controller 120).
  • Coherency controller 210 may determine, based on an entry of directory 220, that cache agent 110 currently stores data associated with the specific cache line in a valid state.
  • coherency controller 210 sends a snoop 225 to caching agent 110 that causes caching agent 110 to relinquish ownership of that cache line and send back a snoop response 227, which may include the cache line data.
  • coherency controller 210 may generate and then send an exclusive ownership response 217 to I/O agent 142, providing I/O agent 142 with the cache line data and exclusive ownership of the cache line.
  • After obtaining exclusive ownership of a cache line, I/O agent 142 waits until the corresponding transaction can be completed (according to the ordering rules); that is, it waits until the corresponding transaction becomes the most senior transaction and its ordering dependencies have been resolved.
  • As an example, I/O agent 142 may receive transaction requests 205 from a peripheral 144 to perform write transactions A-D. I/O agent 142 may obtain exclusive ownership of the cache line associated with transaction C; however, transactions A and B may not have been completed. Consequently, I/O agent 142 waits until transactions A and B have been completed before writing the relevant data for the cache line associated with transaction C. After completing a given transaction, in various embodiments, I/O agent 142 provides a transaction response 207 to the transaction requestor (e.g., peripheral 144A) indicating that the requested transaction has been performed. In various cases, I/O agent 142 may obtain exclusive read ownership of a cache line, perform a set of read transactions on the cache line, and thereafter release exclusive read ownership of the cache line without having performed a write to the cache line while the exclusive read ownership was held.
  • In some cases, I/O agent 142 might receive multiple transaction requests 205 (within a reasonably short period of time) that target the same cache line and, as a result, I/O agent 142 may perform bulk reads and writes.
  • two write transaction requests 205 received from peripheral 144A might target the lower and upper portions of a cache line, respectively.
  • I/O agent 142 may acquire exclusive write ownership of the cache line and retain the data associated with the cache line until at least both of the write transactions have been completed.
  • In some cases, I/O agent 142 may forward exclusive ownership between transactions that target the same cache line. That is, I/O agent 142 does not have to send an ownership request 215 for each individual transaction request 205.
  • In some cases, I/O agent 142 may forward exclusive ownership from a read transaction to a write transaction (or vice versa), but in other cases, I/O agent 142 forwards exclusive ownership only between transactions of the same type (e.g., from a read transaction to another read transaction).
  • In some embodiments, I/O agent 142 may issue an exclusive write ownership request 215 that requests exclusive ownership of a cache line without receiving the line's data when it is performing a full cache line write and the cache line is not in a modified state.
  • I/O agent 142 might lose exclusive ownership of a cache line before I/O agent 142 has performed the relevant transactions against the cache line.
  • I/O agent 142 may receive a snoop 225 from memory controller 120 as a result of another I/O agent 142 seeking to obtain exclusive ownership of the cache line. After relinquishing exclusive ownership of a cache line, in various embodiments, I/O agent 142 determines whether to reacquire ownership of the lost cache line.
  • If the lost cache line is associated with one pending transaction, then I/O agent 142, in many cases, does not reacquire exclusive ownership of the cache line; however, in some cases, if the pending transaction is behind a set number of transactions (and thus is not about to become the senior transaction), then I/O agent 142 may issue an exclusive ownership request 215 for the cache line. But if there is a threshold number of pending transactions (e.g., two pending transactions) directed to the cache line, then I/O agent 142 reacquires exclusive ownership of the cache line, in various embodiments.
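  • The reacquisition policy just described reduces to a small decision function, sketched below. PENDING_THRESHOLD mirrors the "e.g., two pending transactions" example; SENIORITY_DEPTH is a made-up tuning knob standing in for the unspecified "set number of transactions."

```python
# Hypothetical encoding of the ownership-reacquisition policy described above.
PENDING_THRESHOLD = 2   # reacquire if at least this many transactions still target the line
SENIORITY_DEPTH = 4     # a lone transaction this far from senior may also justify reacquiring

def should_reacquire(pending_for_line, older_transactions_ahead):
    """pending_for_line: unprocessed transactions targeting the lost cache line.
    older_transactions_ahead: transactions preceding the oldest of them."""
    if pending_for_line >= PENDING_THRESHOLD:
        return True                      # avoid slow serialization of the remainder
    if pending_for_line == 1 and older_transactions_ahead >= SENIORITY_DEPTH:
        return True                      # not about to become senior, so refetching can pay off
    return False

assert should_reacquire(3, 0) is True
assert should_reacquire(1, 1) is False   # nearly senior: just complete it and move on
assert should_reacquire(1, 6) is True
```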
  • I/O agent 142 includes an I/O agent controller 310 and coherency caches 320.
  • coherency caches 320 include a fetched data cache 322, a merged data cache 324, and a new data cache 326.
  • I/O agent 142 is implemented differently than shown. As an example, I/O agent 142 may not include separate caches for data pulled from memory and data that is to be written as a part of a write transaction.
  • I/O agent controller 310, in various embodiments, is circuitry configured to receive and process transactions associated with peripherals 144 that are coupled to I/O agent 142.
  • I/O agent controller 310 receives a write transaction request 205 from a peripheral 144.
  • the write transaction request 205 specifies a destination memory address and may include the data to be written or a reference to the location of that data.
  • To process those transactions, I/O agent 142 uses coherency caches 320.
  • Coherency caches 320 are storage arrays that include entries configured to store data or program instructions.
  • coherency caches 320 may be associative storage arrays (e.g., fully associative or set-associative, such as a 4-way associative cache) or direct-mapped storage arrays, and may have any storage capacity and/or any cache line size (e.g., 32 bytes, 64 bytes, etc.).
  • Fetched data cache 322, in various embodiments, is used to store data that is obtained in response to issuing an exclusive ownership request 215.
  • I/O agent 142 may then issue an exclusive write ownership request 215 to the particular memory controller 120 that manages the data stored at the destination/targeted memory address.
  • I/O agent controller 310 stores that data separately from the data included in the write transaction request 205 in order to allow the fetched data to be snooped prior to ordering resolution. Accordingly, as shown, I/O agent 142 may receive a snoop 225 that causes I/O agent 142 to provide a snoop response 227, releasing the data received from the particular memory controller 120.
  • New data cache 326, in various embodiments, is used to store the data that is included in a write transaction request 205 until the ordering dependency is resolved.
  • Once the ordering dependency is resolved, I/O agent 142 may merge the relevant data from fetched data cache 322 with the corresponding write data from new data cache 326.
  • Merged data cache 324, in various embodiments, is used to store the merged data.
  • a write transaction may target a portion, but not all of a cache line. Accordingly, the merged data may include a portion that has been changed by the write transaction and a portion that has not been changed.
  • In some cases, I/O agent 142 may receive a set of write transaction requests 205 that together target multiple or all portions of a cache line. As such, in processing the set of write transactions, most of the cache line (or the entire cache line) may be changed. As an example, I/O agent 142 may process four write transaction requests 205 that each target a different 32-bit portion of the same 128-bit cache line, such that the entire line content is replaced with the new data. In some cases, a write transaction request 205 is a full cache line write, and thus the data accessed from fetched data cache 322 for the write transaction is entirely replaced by that one write transaction request 205.
  • I/O agent 142 releases exclusive write ownership of the cache line and may then evict the data from coherency caches 320.
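  • The interplay of fetched data cache 322, new data cache 326, and merged data cache 324 amounts to a masked byte merge. The sketch below is an assumed model (not the patent's implementation): write data is merged into the fetched line, and per-byte coverage is tracked so that ownership can be released once the whole line has been written, as in the four-32-bit-write example above.

```python
# Illustrative merge of write data into a fetched cache line, with coverage
# tracking to decide when the line is fully written. Structure is hypothetical.
LINE_BYTES = 16   # a 128-bit line, matching the four-32-bit-write example

class WriteMerger:
    def __init__(self, fetched_line: bytes):
        self.merged = bytearray(fetched_line)    # starts as the fetched data (cache 322)
        self.written = [False] * LINE_BYTES      # which bytes have been written so far

    def apply_write(self, offset: int, data: bytes):
        self.merged[offset:offset + len(data)] = data   # merged result (cache 324)
        for i in range(offset, offset + len(data)):
            self.written[i] = True

    def fully_written(self) -> bool:
        return all(self.written)                 # safe to release ownership and evict

m = WriteMerger(b"\x00" * LINE_BYTES)
for k in range(4):                               # four writes, each a 32-bit portion
    m.apply_write(4 * k, bytes([k + 1]) * 4)
assert m.fully_written()                         # entire line replaced by new data
```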
  • I/O agent 142 includes I/O agent controller 310 and fetched data cache 322. In some embodiments, I/O agent 142 is implemented differently than shown.
  • Since I/O agent 142 does not write data for read transactions, in various embodiments, I/O agent 142 does not use merged data cache 324 and new data cache 326 for processing read transactions; as such, they are not shown in the illustrated embodiment. Consequently, after receiving a read transaction request 205, I/O agent 142 may issue an exclusive read ownership request 215 to the appropriate memory controller 120 and receive back an exclusive ownership response 217 that includes the data of the targeted cache line. Once I/O agent 142 has received the relevant data and once the read transaction has become the senior pending transaction, I/O agent 142 may complete the read transaction.
  • I/O agent 142 releases exclusive read ownership of the cache line and may then evict the data from fetched data cache 322.
  • I/O agent 142 receives, from peripheral 144, a read transaction request 205A followed by a read transaction request 205B.
  • I/O agent 142 issues, for transaction request 205A, an exclusive read ownership request 215A to memory controller 120B and, for transaction request 205B, an exclusive read ownership request 215B to memory controller 120A.
  • In some cases, read transaction requests 205A-B may target cache lines managed by the same memory controller 120, and thus I/O agent 142 may communicate with only that memory controller 120 to fulfill them.
  • a directory miss occurs at memory controller 120A for the targeted cache line of transaction request 205B, indicating that the data of the targeted cache line is not stored in a valid state outside of memory 130.
  • Memory controller 120A returns an exclusive read ownership response 217B to I/O agent 142 that grants exclusive read ownership of the cache line and may further include the data associated with that cache line.
  • a directory hit occurs at memory controller 120B for the targeted cache line of transaction request 205A.
  • Memory controller 120B may determine, based on its directory 220, that the illustrated caching agent 110 caches the data of the targeted cache line.
  • memory controller 120B issues a snoop 225 to that caching agent 110 and receives a snoop response 227, which may include data associated with the targeted cache line.
  • Memory controller 120B returns an exclusive read ownership response 217A to I/O agent 142 that grants exclusive read ownership of the targeted cache line and may further include the data associated with that cache line.
  • I/O agent 142 receives exclusive read ownership response 217B before receiving exclusive read ownership response 217A.
  • The transactional ordering rules of peripheral 144, however, impose that transaction requests 205A-B be completed in a certain order (e.g., the order in which they were received).
  • As a result, I/O agent 142 holds the exclusive read ownership obtained for read transaction request 205B speculatively but does not yet complete that transaction.
  • I/O agent 142 may then complete transaction request 205A and issue a corresponding completion to peripheral 144.
  • Thereafter, I/O agent 142 may complete transaction request 205B and also issue a completion to peripheral 144. Because I/O agent 142 preemptively obtained exclusive read ownership of the cache line associated with read transaction request 205B, I/O agent 142 does not have to send out a request for that cache line after completing read transaction request 205A (assuming that I/O agent 142 has not lost ownership of the cache line). Instead, I/O agent 142 may complete read transaction request 205B relatively soon after completing read transaction request 205A and thus not incur most or all of the delay (e.g., 500 clock cycles) associated with fetching that cache line into I/O agent 142's coherency caches 320.
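  • The ordering behavior in this example (ownership responses arriving out of order while completions are still issued in order) can be modeled with a queue plus a ready set. The sketch below is illustrative only; the names are hypothetical.

```python
# Toy model of in-order completion with out-of-order data arrival, as in the
# 205A/205B example above.
from collections import deque

class OrderedPipeline:
    def __init__(self):
        self.queue = deque()     # transactions in the order the peripheral issued them
        self.ready = set()       # transactions whose cache line data has arrived

    def enqueue(self, txn):
        self.queue.append(txn)

    def data_arrived(self, txn):
        self.ready.add(txn)      # may happen out of order (e.g., 217B before 217A)

    def complete_eligible(self):
        done = []
        # Only the senior transaction may complete; completing it may unblock
        # the next one whose data was fetched preemptively.
        while self.queue and self.queue[0] in self.ready:
            done.append(self.queue.popleft())
        return done

p = OrderedPipeline()
p.enqueue("205A"); p.enqueue("205B")
p.data_arrived("205B")                        # younger data arrives first
assert p.complete_eligible() == []            # 205B must wait behind 205A
p.data_arrived("205A")
assert p.complete_eligible() == ["205A", "205B"]
```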
  • Method 500 is one embodiment of a method performed by an I/O agent circuit (e.g., an I/O agent 142) in order to process a set of transaction requests (e.g., transaction requests 205) received from a peripheral component (e.g., a peripheral 144).
  • In some embodiments, method 500 includes more or fewer steps than shown; for example, the I/O agent circuit may evict data from its cache (e.g., a coherency cache 320) after processing the set of transaction requests.
  • Method 500 begins in step 510 with the I/O agent circuit receiving a set of transaction requests from the peripheral component to perform a set of read transactions (which includes at least one read transaction) that are directed to one or more of the plurality of cache lines.
  • the I/O agent receives requests to perform write transactions or a mixture of read and write transactions.
  • the I/O agent may receive those transaction requests from multiple peripheral components.
  • the I/O agent circuit issues, to a first memory controller circuit (e.g., a memory controller 120) that is configured to manage access to a first one of the plurality of cache lines, a request (e.g., an exclusive ownership request 215) for exclusive read ownership of the first cache line such that data of the first cache line is not cached outside of the memory and the I/O agent circuit in a valid state.
  • the request for exclusive read ownership of the first cache line may cause a snoop request (e.g., a snoop 225) to be sent to another I/O agent circuit (or a caching agent 110) to release exclusive read ownership of the first cache line.
  • the request for exclusive read ownership of the first cache line may be issued only in response to the I/O agent making a determination that the set of requests includes at least one write transaction that is directed to the first cache line.
  • the I/O agent circuit receives exclusive read ownership of the first cache line, including receiving the data of the first cache line.
  • the I/O agent circuit may receive a snoop request directed to the first cache line and may then release exclusive read ownership of the first cache line before completing performance of the set of read transactions, including invalidating the data stored at the I/O agent circuit for the first cache line.
  • the I/O agent circuit may thereafter make a determination that at least a threshold number of remaining unprocessed read transactions of the set of read transactions are directed to the first cache line and, in response to the determination, send a request to the first memory controller circuit to re-establish exclusive read ownership of the first cache line.
  • the I/O agent circuit may process the remaining read transactions without re-establishing exclusive read ownership of the first cache line.
  • the I/O agent circuit performs the set of read transactions with respect to the data.
  • the I/O agent circuit may release exclusive read ownership of the first cache line without having performed a write to the first cache line while the exclusive read ownership was held.
  • the I/O agent circuit may make a determination that at least two of the set of read transactions target at least two different portions of the first cache line. In response to the determination, the I/O agent circuit may process multiple of the read transactions before releasing exclusive read ownership of the first cache line.
  • the I/O agent circuit may receive, from another peripheral component, a set of requests to perform a set of write transactions that are directed to one or more of the plurality of cache lines.
  • the I/O agent circuit may issue, to a second memory controller circuit that is configured to manage access to a second one of the plurality of cache lines, a request for exclusive write ownership of the second cache line such that: data of the second cache line is not cached outside of the memory and the I/O agent circuit in a valid state; and the data for the second cache line is provided to the I/O agent circuit only if the data is in a modified state.
  • the I/O agent circuit may receive the data of the second cache line and perform the set of write transactions with respect to the data of the second cache line.
  • one of the set of write transactions may involve writing data to a first portion of the second cache line.
  • the I/O agent circuit may merge the data of the second cache line with data of the write transaction such that the first portion (e.g., lower 64 bits) is updated, but a second portion (e.g., upper 64 bits) of the second cache line is unchanged.
  • the I/O agent circuit may release exclusive write ownership of the second cache line in response to writing to all portions of the second cache line.
  • Turning now to Fig. 6, a block diagram illustrating an example process of fabricating an integrated circuit 630 that can include at least a portion of SOC 100 is shown.
  • the illustrated embodiment includes a non-transitory computer-readable medium 610 (which includes design information 615), a semiconductor fabrication system 620, and a resulting fabricated integrated circuit 630.
  • integrated circuit 630 includes at least a caching agent 110, a memory controller 120, a memory 130, and an I/O cluster 140 — in some cases, that memory 130 and one or more peripherals of that I/O cluster 140 may be separate from integrated circuit 630.
  • Integrated circuit 630 may additionally or alternatively include other circuits, such as a wireless network circuit.
  • semiconductor fabrication system 620 is configured to process design information 615 to fabricate integrated circuit 630.
  • Non-transitory computer-readable medium 610 may include any of various appropriate types of memory devices or storage devices.
  • non-transitory computer-readable medium 610 may include at least one of an installation medium (e.g., a CD-ROM, floppy disks, or a tape device), a computer system memory or random access memory (e.g., DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.), a non-volatile memory such as Flash memory, magnetic media (e.g., a hard drive or optical storage), registers, or other types of non-transitory memory.
  • Non-transitory computer-readable medium 610 may include two or more memory mediums, which may reside in different locations (e.g., in different computer systems that are connected over a network).
  • Design information 615 may be specified using any of various appropriate computer languages, including hardware description languages such as, without limitation: VHDL, Verilog, SystemC, SystemVerilog, RHDL, M, MyHDL, etc. Design information 615 may be usable by semiconductor fabrication system 620 to fabricate at least a portion of integrated circuit 630. The format of design information 615 may be recognized by at least one semiconductor fabrication system 620. In some embodiments, design information 615 may also include one or more cell libraries, which specify the synthesis and/or layout of integrated circuit 630. In some embodiments, the design information is specified in whole or in part in the form of a netlist that specifies cell library elements and their connectivity.
  • Design information 615, taken alone, may or may not include sufficient information for fabrication of a corresponding integrated circuit (e.g., integrated circuit 630).
  • design information 615 may specify circuit elements to be fabricated but not their physical layout. In this case, design information 615 may be combined with layout information to fabricate the specified integrated circuit.
  • Semiconductor fabrication system 620 may include any of various appropriate elements configured to fabricate integrated circuits. This may include, for example, elements for depositing semiconductor materials (e.g., on a wafer, which may include masking), removing materials, altering the shape of deposited materials, modifying materials (e.g., by doping materials or modifying dielectric constants using ultraviolet processing), etc. Semiconductor fabrication system 620 may also be configured to perform various testing of fabricated circuits for correct operation.
  • integrated circuit 630 is configured to operate according to a circuit design specified by design information 615, which may include performing any of the functionality described herein.
  • integrated circuit 630 may include any of various elements described with reference to Figs. 1-5.
  • integrated circuit 630 may be configured to perform various functions described herein in conjunction with other components. The functionality described herein may be performed by multiple connected integrated circuits.
  • The phrase “design information that specifies a design of a circuit configured to ...” does not imply that the circuit in question must be fabricated in order for the element to be met. Rather, this phrase indicates that the design information describes a circuit that, upon being fabricated, will be configured to perform the indicated actions or will include the specified components.
  • Design information 615 may be generated using one or more computer systems and stored in non-transitory computer-readable medium 610. The method may conclude when design information 615 is sent to semiconductor fabrication system 620 or prior to design information 615 being sent to semiconductor fabrication system 620. Accordingly, in some embodiments, the method may not include actions performed by semiconductor fabrication system 620.
  • Design information 615 may be sent to semiconductor fabrication system 620 in a variety of ways. For example, design information 615 may be transmitted (e.g., via a transmission medium such as the Internet) from non-transitory computer-readable medium 610 to semiconductor fabrication system 620 (e.g., directly or indirectly).
  • Alternatively, non-transitory computer-readable medium 610 may be sent to semiconductor fabrication system 620.
  • semiconductor fabrication system 620 may fabricate integrated circuit 630 as discussed above.
  • Turning next to Fig. 7, a block diagram of one embodiment of a system 700 that may incorporate and/or otherwise utilize the methods and mechanisms described herein is shown.
  • the system 700 includes at least one instance of a system on chip (SOC) 100 that is coupled to external memory 130, peripherals 144, and a power supply 705.
  • A power supply 705 is also provided, which supplies the supply voltages to SOC 100 as well as one or more supply voltages to memory 130 and/or peripherals 144.
  • power supply 705 represents a battery (e.g., a rechargeable battery in a smart phone, laptop or tablet computer, or other device). In some embodiments, more than one instance of SOC 100 is included (and more than one external memory 130 is included as well).
  • system 700 is shown to have application in a wide range of areas.
  • system 700 may be utilized as part of the chips, circuitry, components, etc., of a desktop computer 710, laptop computer 720, tablet computer 730, cellular or mobile phone 740, or television 750 (or set-top box coupled to a television).
  • a smartwatch and health monitoring device 760 are illustrated.
  • A smartwatch may include a variety of general-purpose computing related functions.
  • For example, a smartwatch may provide access to email, cellphone service, a user calendar, and so on.
  • a health monitoring device may be a dedicated medical device or otherwise include dedicated health related functionality.
  • For example, a health monitoring device may monitor a user's vital signs, track proximity of a user to other users for the purpose of epidemiological social distancing and contact tracing, provide communication to an emergency service in the event of a health crisis, and so on.
  • the above-mentioned smartwatch may or may not include some or any health monitoring related functions.
  • Other wearable devices are contemplated as well, such as devices worn around the neck, devices that are implantable in the human body, glasses designed to provide an augmented and/or virtual reality experience, and so on.
  • System 700 may further be used as part of cloud-based service(s) 770.
  • the previously mentioned devices, and/or other devices may access computing resources in the cloud (e.g., remotely located hardware and/or software resources).
  • system 700 may be utilized in one or more devices of a home 780 other than those previously mentioned.
  • For example, appliances within home 780 (e.g., a refrigerator, a cooling system, etc.) may monitor and detect conditions that warrant attention.
  • a thermostat may monitor the temperature in home 780 and may automate adjustments to a heating/cooling system based on a history of responses to various conditions by the homeowner.
  • system 700 may be used in the control and/or entertainment systems of aircraft, trains, buses, cars for hire, private automobiles, waterborne vessels from private boats to cruise liners, scooters (for rent or owned), and so on.
  • system 700 may be used to provide automated guidance (e.g., self-driving vehicles), general systems control, and otherwise.
  • This disclosure may discuss potential advantages that may arise from the disclosed embodiments. Not all implementations of these embodiments will necessarily manifest any or all of the potential advantages. Whether an advantage is realized for a particular implementation depends on many factors, some of which are outside the scope of this disclosure. In fact, there are a number of reasons why an implementation that falls within the scope of the claims might not exhibit some or all of any disclosed advantages. For example, a particular implementation might include other circuitry outside the scope of the disclosure that, in conjunction with one of the disclosed embodiments, negates or diminishes one or more of the disclosed advantages. Furthermore, suboptimal design execution of a particular implementation (e.g., implementation techniques or tools) could also negate or diminish disclosed advantages.
  • embodiments are non-limiting. That is, the disclosed embodiments are not intended to limit the scope of claims that are drafted based on this disclosure, even where only a single example is described with respect to a particular feature.
  • the disclosed embodiments are intended to be illustrative rather than restrictive, absent any statements in the disclosure to the contrary. The application is thus intended to permit claims covering disclosed embodiments, as well as such alternatives, modifications, and equivalents that would be apparent to a person skilled in the art having the benefit of this disclosure.
  • References to a singular form of an item (i.e., a noun or noun phrase preceded by “a,” “an,” or “the”) are, unless context clearly dictates otherwise, intended to mean “one or more.” Reference to “an item” in a claim thus does not, without accompanying context, preclude additional instances of the item.
  • a “plurality” of items refers to a set of two or more of the items.
  • a recitation of “w, x, y, or z, or any combination thereof” or “at least one of ... w, x, y, and z” is intended to cover all possibilities involving a single element up to the total number of elements in the set.
  • these phrasings cover any single element of the set (e.g., w but not x, y, or z), any two elements (e.g., w and x, but not y or z), any three elements (e.g., w, x, and y, but not z), and all four elements.
  • the phrase “at least one of ... w, x, y, and z” thus refers to at least one element of the set [w, x, y, z], thereby covering all possible combinations in this list of elements. This phrase is not to be interpreted to require that there is at least one instance of w, at least one instance of x, at least one instance of y, and at least one instance of z.
  • labels may precede nouns or noun phrases in this disclosure.
  • different labels used for a feature (e.g., “first circuit,” “second circuit,” “particular circuit,” “given circuit,” etc.) refer to different instances of that feature.
  • labels “first,” “second,” and “third” when applied to a feature do not imply any type of ordering (e.g., spatial, temporal, logical, etc.), unless stated otherwise.
  • a determination may be solely based on specified factors or based on the specified factors as well as other, unspecified factors.
  • an entity described or recited as being “configured to” perform some task refers to something physical, such as a device, circuit, a system having a processor unit and a memory storing program instructions executable to implement the task, etc. This phrase is not used herein to refer to something intangible.
  • circuits may be described in this disclosure. These circuits or “circuitry” constitute hardware that includes various types of circuit elements, such as combinatorial logic, clocked storage devices (e.g., flip-flops, registers, latches, etc.), finite state machines, memory (e.g., random-access memory, embedded dynamic random-access memory), programmable logic arrays, and so on. Circuitry may be custom designed, or taken from standard libraries. In various implementations, circuitry can, as appropriate, include digital components, analog components, or a combination of both.
  • circuits may be commonly referred to as “units” (e.g., a decode unit, an arithmetic logic unit (ALU), functional unit, memory management unit (MMU), etc.). Such units also refer to circuits or circuitry.
  • circuits/units/components and other elements illustrated in the drawings and described herein thus include hardware elements such as those described in the preceding paragraph.
  • the internal arrangement of hardware elements within a particular circuit may be specified by describing the function of that circuit.
  • a particular “decode unit” may be described as performing the function of “processing an opcode of an instruction and routing that instruction to one or more of a plurality of functional units,” which means that the decode unit is “configured to” perform this function; an illustrative sketch of this routing behavior appears after this list.
  • This specification of function is sufficient, to those skilled in the computer arts, to connote a set of possible structures for the circuit.
  • circuits, units, and other elements may be defined by the functions or operations that they are configured to implement. The arrangement of such circuits/units/components with respect to each other and the manner in which they interact form a microarchitectural definition of the hardware that is ultimately manufactured in an integrated circuit or programmed into an FPGA to form a physical implementation of the microarchitectural definition.
  • the microarchitectural definition is recognized by those of skill in the art as structure from which many physical implementations may be derived, all of which fall into the broader structure described by the microarchitectural definition.
  • the hardware circuits may be defined in a hardware description language (HDL) such as Verilog or VHDL.
  • Such an HDL description may take the form of behavioral code (which is typically not synthesizable), register transfer language (RTL) code (which, in contrast to behavioral code, is typically synthesizable), or structural code (e.g., a netlist specifying logic gates and their connectivity).
  • the HDL description may subsequently be synthesized against a library of cells designed for a given integrated circuit fabrication technology, and may be modified for timing, power, and other reasons to result in a final design database that is transmitted to a foundry to generate masks and ultimately produce the integrated circuit.
  • Some hardware circuits or portions thereof may also be custom-designed in a schematic editor and captured into the integrated circuit design along with synthesized circuitry.
  • the integrated circuits may include transistors and other circuit elements (e.g., passive elements such as capacitors, resistors, inductors, etc.) and interconnect between the transistors and circuit elements.
  • the HDL design may be synthesized to a programmable logic array such as a field programmable gate array (FPGA) and may be implemented in the FPGA.
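As a concrete illustration of the decode-unit example referenced above, the following is a minimal software sketch, not circuitry from this disclosure: it models processing an opcode and routing the operands to one of a plurality of functional units. All names here (decode_and_route, FUNCTIONAL_UNITS) are hypothetical, and a real decode unit is a hardware circuit rather than a Python function.

```python
# Hypothetical software model of the decode-unit function described above:
# "processing an opcode of an instruction and routing that instruction to
# one or more of a plurality of functional units." Names are illustrative.
from typing import Callable, Dict

# Stand-ins for functional units (e.g., an ALU and a multiply unit).
FUNCTIONAL_UNITS: Dict[str, Callable[[int, int], int]] = {
    "ADD": lambda a, b: a + b,  # stand-in for an ALU
    "MUL": lambda a, b: a * b,  # stand-in for a multiply unit
}

def decode_and_route(opcode: str, a: int, b: int) -> int:
    """Process the opcode and route the operands to the matching functional unit."""
    unit = FUNCTIONAL_UNITS.get(opcode)
    if unit is None:
        raise ValueError(f"unknown opcode: {opcode!r}")
    return unit(a, b)

assert decode_and_route("ADD", 2, 3) == 5
assert decode_and_route("MUL", 2, 3) == 6
```

The point of the sketch is only that specifying the routing function is enough to connote a set of possible structures; the hardware realization is left to the microarchitectural definition.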

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Techniques are disclosed relating to an I/O agent circuit of a computer system. The I/O agent circuit may receive, from a peripheral component, a set of transaction requests to perform a set of read transactions that are directed to one or more cache lines of a plurality of cache lines. The I/O agent circuit may issue, to a first memory controller circuit configured to manage access to a first cache line of the plurality of cache lines, a request for exclusive read ownership of the first cache line such that data of the first cache line is not cached outside of the memory and the I/O agent circuit in a valid state. The I/O agent circuit may receive the exclusive read ownership of the first cache line, including receiving the data of the first cache line. The I/O agent circuit may then perform the set of read transactions with respect to the data.
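To make the sequence in the abstract easier to follow, here is a minimal software sketch of the exclusive-read-ownership flow. Every identifier (IOAgent, MemoryController, grant_exclusive_read_ownership, etc.) is invented for this illustration and does not come from the patent, which describes hardware circuits rather than software objects.

```python
# Minimal, hypothetical sketch of the exclusive-read-ownership flow summarized
# in the abstract. All identifiers are invented for illustration.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class CacheLine:
    address: int
    data: bytes = b""

class MemoryController:
    """Manages access to a set of cache lines (one controller per group of lines)."""

    def __init__(self, lines: Dict[int, CacheLine]):
        self.lines = lines
        self.exclusive_owner: Dict[int, "IOAgent"] = {}

    def grant_exclusive_read_ownership(self, address: int, agent: "IOAgent") -> CacheLine:
        # Granting exclusive read ownership means the line's data is not cached
        # outside of memory and the requesting I/O agent in a valid state; in
        # hardware this would involve snooping/invalidating other caching agents.
        self.exclusive_owner[address] = agent
        return self.lines[address]

class IOAgent:
    """Bridges a peripheral component to the coherent memory fabric."""

    def __init__(self, controller: MemoryController):
        self.controller = controller
        self.owned_lines: Dict[int, CacheLine] = {}

    def perform_read_transactions(self, addresses: List[int]) -> List[bytes]:
        results = []
        for addr in addresses:
            if addr not in self.owned_lines:
                # Request exclusive read ownership of the targeted cache line;
                # the line's data arrives together with the ownership grant.
                self.owned_lines[addr] = self.controller.grant_exclusive_read_ownership(addr, self)
            # Perform the read transaction against the locally held data.
            results.append(self.owned_lines[addr].data)
        return results

# Example: a peripheral issues read transactions to two cache lines.
mc = MemoryController({0x40: CacheLine(0x40, b"spam"), 0x80: CacheLine(0x80, b"eggs")})
agent = IOAgent(mc)
assert agent.perform_read_transactions([0x40, 0x80, 0x40]) == [b"spam", b"eggs", b"spam"]
```

The design point the sketch mirrors is that ownership and data arrive together, so subsequent read transactions against the same line can be satisfied locally without further coherence traffic.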
PCT/US2022/023296 2021-04-05 2022-04-04 I/O agent WO2022216597A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202280025986.0A CN117099088A (zh) 2021-04-05 2022-04-04 I/O agent
KR1020237034002A KR20230151031A (ko) 2021-04-05 2022-04-04 I/O agent
DE112022001978.6T DE112022001978T5 (de) 2021-04-05 2022-04-04 I/O agent

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163170868P 2021-04-05 2021-04-05
US63/170,868 2021-04-05
US17/648,071 2022-01-14
US17/648,071 US11550716B2 (en) 2021-04-05 2022-01-14 I/O agent

Publications (1)

Publication Number Publication Date
WO2022216597A1 (fr)

Family

ID=81386952

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/023296 WO2022216597A1 (fr) I/O agent

Country Status (1)

Country Link
WO (1) WO2022216597A1 (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030177316A1 (en) * 2001-07-27 2003-09-18 Broadcom Corporation Read exclusive for fast, simple invalidate
JP2011076159A (ja) * 2009-09-29 2011-04-14 Nec Computertechno Ltd Cache memory control system and cache memory control method
US20140181394A1 (en) * 2012-12-21 2014-06-26 Herbert H. Hum Directory cache supporting non-atomic input/output operations

Similar Documents

Publication Publication Date Title
US11868258B2 (en) Scalable cache coherency protocol
US20230350828A1 (en) Multiple Independent On-chip Interconnect
US20230333851A1 (en) DSB Operation with Excluded Region
US11989131B2 (en) Storage array invalidation maintenance
US11550716B2 (en) I/O agent
WO2022216597A1 (fr) I/O agent
US11822480B2 (en) Criticality-informed caching policies
US11941428B2 (en) Ensuring transactional ordering in I/O agent
US20230063676A1 (en) Counters For Ensuring Transactional Ordering in I/O Agent
US11755489B2 (en) Configurable interface circuit
US11960400B2 (en) Managing multiple cache memory circuit operations
US11880308B2 (en) Prediction confirmation for cache subsystem
US11886340B1 (en) Real-time processing in computer systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22719133; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 202280025986.0; Country of ref document: CN)
ENP Entry into the national phase (Ref document number: 20237034002; Country of ref document: KR; Kind code of ref document: A)
WWE Wipo information: entry into national phase (Ref document number: 1020237034002; Country of ref document: KR)
WWE Wipo information: entry into national phase (Ref document number: 112022001978; Country of ref document: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22719133; Country of ref document: EP; Kind code of ref document: A1)