TW202319902A - Software indirection level for address translation sharing - Google Patents

Info

Publication number
TW202319902A
Authority
TW
Taiwan
Prior art keywords
address translation
array
context identifier
level
context
Prior art date
Application number
TW111138972A
Other languages
Chinese (zh)
Inventor
佩里恩 佩雷斯
舒文都 史可翰 穆克吉
克斯特 阿薩諾維奇
Original Assignee
美商賽發馥股份有限公司
Priority date
Filing date
Publication date
Application filed by 美商賽發馥股份有限公司
Publication of TW202319902A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1036 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Systems and methods are disclosed for a software indirection level for address translation sharing. For example, an integrated circuit (e.g., a processor) for executing instructions includes an address translation buffer, wherein an entry of the address translation buffer includes a tag that includes a field storing a context identifier; and context identifier look-up circuitry that stores context identifiers in an array indexed by a hardware identifier, wherein the context identifier look-up circuitry is configured to: receive an input address translation request including a first hardware identifier value; generate an output address translation request including a first context identifier value that is stored in the entry of the array indexed by the first hardware identifier value; and apply the output address translation request to the address translation buffer.

Description

Software indirection level for address translation sharing

Cross-references to related applications

This application claims priority to and the benefit of U.S. Provisional Patent Application No. 63/256,628, filed October 17, 2021, which is hereby incorporated by reference in its entirety. This application claims priority to and the benefit of U.S. Provisional Patent Application No. 63/256,630, filed October 17, 2021, which is hereby incorporated by reference in its entirety. This application claims priority to and the benefit of U.S. Provisional Patent Application No. 63/291,314, filed December 17, 2021, which is hereby incorporated by reference in its entirety.

This disclosure relates to a software indirection level for address translation sharing.

An input/output memory management unit (IOMMU) is a memory management unit (MMU) that connects a direct-memory-access-capable (DMA-capable) I/O bus to main memory. Just as a traditional MMU translates CPU-visible virtual addresses to physical addresses, the IOMMU maps device-visible virtual addresses (also called device addresses or I/O addresses) to physical addresses. Some units also provide memory protection from faulty or malicious devices.

Overview

Described herein are systems and methods that may be used to implement a software indirection level for address translation sharing. An address translation engine may be integrated between input/output (I/O) devices and a system interconnect. For example, the IO devices may include a graphics processing unit (GPU), a storage controller, a network interface controller (NIC), or an IO accelerator (e.g., an encryption accelerator or a digital signal processor (DSP)), any of which may have a direct memory access (DMA) interface to the system. The role of the address translation engine may include translating device virtual addresses to physical addresses for device DMA requests and performing memory protection for such requests.

To perform address translation, the address translation engine may distinguish among the various IO devices in order to select the appropriate translation rules. To facilitate this, an inbound request may carry a hardware identifier (e.g., a hardware context identifier) along with the request itself (e.g., an address, a read/write indication, and attributes) to identify the translation entry. A hardware identifier should be unique to a device or to a process within a device. Each device may present several hardware identifiers to the address translation engine on a per-transaction basis, but a given hardware identifier should be associated with only one device.

The address translation engine includes a mechanism for associating a software process with a hardware identifier. This is achieved by linking a context identifier (e.g., a SoftContextID) to the hardware identifier using an array indexed by hardware identifiers. A context identifier may be a representation of a software process. In some implementations, the array is stored as a single-level table in data storage (e.g., a register file or static random access memory (SRAM)) included in the address translation engine. In some implementations, the array is stored as a multi-level table with some of its entries in system memory. Write access to the array that maps hardware identifiers to context identifiers may be restricted to hosts with a particular privilege level (e.g., machine privilege level or supervisor privilege level) and/or to hosts acting as a hypervisor. By sharing a context identifier, multiple devices can share an address translation context.
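The indirection described above can be pictured with a minimal Python sketch. It is only an illustration: the 8-bit identifier width and the names `context_id_array`, `bind`, and `lookup` are assumptions made for this example and do not come from the disclosure.

```python
# Hypothetical sketch: names and sizes are illustrative, not from the patent.
HW_ID_BITS = 8  # assumed width of the hardware identifier, in bits

# Array indexed by hardware identifier; each slot holds a context identifier.
# None marks a slot that has not been configured yet.
context_id_array = [None] * (1 << HW_ID_BITS)

def bind(hw_id, soft_context_id):
    """Associate a hardware identifier with a software context identifier."""
    context_id_array[hw_id] = soft_context_id

def lookup(hw_id):
    """Return the context identifier stored for a hardware identifier."""
    ctx = context_id_array[hw_id]
    if ctx is None:
        raise LookupError(f"hardware identifier {hw_id} is not configured")
    return ctx

# Two devices (hardware identifiers 3 and 7) are given the same context
# identifier, so they share one address translation context.
bind(3, 0x42)
bind(7, 0x42)
```

Because both hardware identifiers resolve to the same context identifier, any translation cached for one device can also serve the other.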

Some implementations may provide advantages over conventional systems for address translation, such as reducing the size of the address translation buffer (e.g., an address translation cache or translation lookaside buffer) needed to efficiently support a collection of IO devices, supporting a large number of IO devices that share a memory system, and/or, in some cases, increasing the speed or performance of the address translation engine.

As used herein, the term "circuit" refers to an arrangement of electronic components (e.g., transistors, resistors, capacitors, and/or inductors) that is structured to implement one or more functions. For example, a circuit may include one or more transistors interconnected to form logic gates that collectively implement a logic function.

Details

FIG. 1 is a block diagram of an example of a system 100 for sharing memory with input/output devices 140, including an address translation engine 150 for translating virtual addresses from various devices to physical addresses. The system 100 includes a processor core complex 110, a memory interconnect 120, and a memory subsystem 130. The system 100 also includes one or more input/output devices 140 that access the memory subsystem 130 using the address translation engine 150, which is part of an input/output bridge 152.

The address translation engine 150 is integrated between the IO devices 140 and the system interconnect 120. For example, the input/output devices 140 may include an IO accelerator (e.g., an encryption accelerator or a DSP), a GPU for graphics, a storage controller, and/or a NIC, any of which may have a DMA interface to the memory of the system 100. The purpose of the address translation engine 150 is both to translate device virtual addresses to physical addresses for device DMA requests and to perform memory protection for such requests.

For example, the address translation engine 150 may be used to facilitate DMA traffic from the input/output devices 140. The address translation engine 150 may be integrated in the IO bridge 152, or shell, to fit the needs of a system on a chip (SOC). The role of this IO bridge 152 may vary from one SOC to another, but it may have to handle request reordering, error handling, and/or management of specific attributes.

In an example, device DMA requests to the memory subsystem 130 (which may be called inbound transactions) are handled by the address translation engine 150. In some implementations, outbound transactions from the processor core complex 110 to the input/output devices 140 are not managed by the address translation engine 150, because the addresses of these transactions are already physical (e.g., translated by a memory management unit (MMU) of a core). For the address translation engine 150 to perform address translation, it may distinguish among the various IO devices 140 in order to select the appropriate translation rules. An inbound request must therefore carry a hardware identifier (e.g., a hardware context identifier) along with the request (address, read/write, attributes) to identify the translation entry. A hardware identifier may be unique to a device (or to a process within a device). In some implementations, each IO device may present several hardware identifiers to the address translation engine 150 on a per-transaction basis, but a given hardware identifier should be associated with only one device.

The address translation engine 150 may include a mechanism for associating a software process with a hardware identifier. This may be achieved by linking a software context identifier to the hardware identifier using a context identifier table. A software context identifier may be a representation of a software process (e.g., using an address space identifier (ASID)). A single address translation engine may be used to perform address translation for one device or for several devices. An SOC may also contain several address translation engine instances, each translating addresses for a set of IO devices. For example, the address translation engine 150 may be the address translation engine 300 of FIG. 3.

FIG. 2 is a block diagram of an example of a system 200 for sharing memory with devices connected by a bus to an integrated circuit 210. The integrated circuit 210 includes an address translation engine 230, which includes a context identifier lookup circuit 240 that enables address translation contexts to be shared between devices. In this example, the system 200 is a Peripheral Component Interconnect Express (PCIe) system that connects components via a PCIe bus. The system 200 includes the integrated circuit 210 for executing instructions, a system memory 218, and endpoint devices (222, 224, and 226), some of which are connected to the integrated circuit 210 via a switch 228. The integrated circuit 210 includes one or more processor cores 212; a system fabric 214; a memory controller 216 for interfacing with the system memory 218; a PCIe controller 220 for interfacing with devices via the PCIe bus; and the address translation engine 230 (e.g., the address translation engine 300 of FIG. 3) for translating virtual addresses used by the endpoint devices (222, 224, and 226) into physical addresses of the system memory 218. The address translation engine 230 includes an address translation buffer 232 (e.g., a translation lookaside buffer (TLB)) and the context identifier lookup circuit 240, which maps a hardware identifier associated with an endpoint device (222, 224, or 226) to a context identifier that can be used to facilitate sharing of an address translation context between devices. For example, the address translation buffer 232 may be the address translation buffer 310 of FIG. 3. For example, the system 200 may be used to implement the process 600 of FIG. 6 and/or the process 700 of FIG. 7.

The integrated circuit 210 includes the address translation buffer 232. An entry of the address translation buffer 232 includes a tag, and the tag includes a field that stores a context identifier. For example, the address translation buffer 232 may be a translation lookaside buffer. In some implementations, the address translation buffer 232 is part of a two-level address translation cache. Entries of the address translation buffer 232 may be tagged with a context identifier (e.g., a software context identifier) and a virtual address. The full context identifier may be used to tag VS-stage or nested translations, and the upper portion of the context identifier may be used to tag G-stage translations. The address translation buffer 232 may hold several entries with the same virtual address but different context identifiers, or even with the same virtual address and context identifier but different privilege levels. In some implementations, the address translation buffer 232 is not tagged with the hardware identifier, because multiple IO devices can use the same translation rules and be assigned to the same software context. A software context can nonetheless be uniquely identified by a privilege level and a context identifier.
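The tagging scheme in the preceding paragraph can be sketched as a dictionary keyed by a tag that contains the context identifier, the virtual page number, and the privilege level, but no hardware identifier. This is a hypothetical model: the `TlbTag` type, the `"M"`/`"S"` privilege labels, and the example page numbers are assumptions for illustration only.

```python
from typing import NamedTuple, Optional

class TlbTag(NamedTuple):
    """Hypothetical tag layout; note that it has no hardware identifier field."""
    context_id: int   # software context identifier
    vpn: int          # virtual page number
    privilege: str    # "M" or "S"; these labels are an assumption

tlb = {}  # maps a TlbTag to a physical page number (PPN)

def tlb_fill(tag: TlbTag, ppn: int) -> None:
    """Install a translation under the given tag."""
    tlb[tag] = ppn

def tlb_lookup(tag: TlbTag) -> Optional[int]:
    """Return the cached PPN, or None on a miss."""
    return tlb.get(tag)

# Same virtual page, different context identifiers: two distinct entries.
tlb_fill(TlbTag(0x42, 0x1000, "S"), 0x80000)
tlb_fill(TlbTag(0x43, 0x1000, "S"), 0x90000)
# Same virtual page and context identifier, different privilege level.
tlb_fill(TlbTag(0x42, 0x1000, "M"), 0xA0000)
```

Any device whose hardware identifier resolves to context identifier `0x42` would hit the same entries, which is what allows one cached translation to serve several devices.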

The integrated circuit 210 includes the context identifier lookup circuit 240, which stores context identifiers in an array (e.g., the context identifier entry array 410) that is indexed by hardware identifier. The context identifier lookup circuit 240 may be configured to receive an input address translation request that includes a first hardware identifier value; generate an output address translation request that includes a first context identifier value stored in the entry of the array indexed by the first hardware identifier value; and apply the output address translation request to the address translation buffer 232. The address translation buffer 232 may return a physical address in response to the output address translation request, and that physical address may be transmitted in response to the input address translation request. In some implementations, the input address translation request is part of a direct memory access request. In some implementations, the input address translation request is received from an external device via a PCIe bus. For example, the input address translation request may be received from an endpoint device (222, 224, or 226) via the PCIe controller 220. For example, the input address translation request may be received by the context identifier lookup circuit 240 from a processor core via a bus, and the first hardware identifier value may be associated with the processor core.
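The three steps above (receive an input request, generate an output request, apply it to the address translation buffer) can be sketched as follows. This is a simplified software model under stated assumptions: 4 KiB pages, a dictionary standing in for the buffer, and the class and field names (`InputRequest`, `OutputRequest`, `ContextIdLookup`) are all hypothetical.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumes 4 KiB pages

@dataclass(frozen=True)
class InputRequest:
    hw_id: int   # hardware identifier presented by the device
    vaddr: int   # device virtual address

@dataclass(frozen=True)
class OutputRequest:
    context_id: int  # context identifier read from the array
    vaddr: int

class ContextIdLookup:
    """Hypothetical model of the lookup step in front of the translation buffer."""

    def __init__(self, hw_id_bits, tlb):
        self.array = [None] * (1 << hw_id_bits)  # indexed by hardware identifier
        self.tlb = tlb                           # dict: (context_id, vpn) -> ppn

    def translate(self, req):
        ctx = self.array[req.hw_id]              # step 1: index the array
        if ctx is None:
            raise LookupError("no context identifier for this hardware identifier")
        out = OutputRequest(ctx, req.vaddr)      # step 2: build the output request
        return self.apply(out)                   # step 3: apply it to the buffer

    def apply(self, out):
        ppn = self.tlb.get((out.context_id, out.vaddr >> PAGE_SHIFT))
        if ppn is None:
            return None  # miss; a real engine would start a page table walk
        offset = out.vaddr & ((1 << PAGE_SHIFT) - 1)
        return (ppn << PAGE_SHIFT) | offset

tlb = {(0x42, 0x10): 0x80}
lookup_circuit = ContextIdLookup(hw_id_bits=8, tlb=tlb)
lookup_circuit.array[3] = 0x42
lookup_circuit.array[7] = 0x42
```

In this sketch, requests from hardware identifiers 3 and 7 for the same virtual address resolve through the same buffer entry, since only the context identifier reaches the buffer.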

In some implementations, the array is implemented as a single-level table, and all entries of the array are stored in data storage of the context identifier lookup circuit 240. For example, the array of the context identifier lookup circuit 240 may have 2^n entries, where n is the width of the hardware identifier in bits. In one example, the array of the context identifier lookup circuit 240 has 256 locally stored entries. In some implementations, the array is implemented as a multi-level table, and at least some entries of the array are stored in memory that the context identifier lookup circuit 240 accesses via a bus. For example, the context identifier lookup circuit 240 may include data storage that holds the first level of the multi-level table, whose entries include pointers to the next level of the multi-level table. For example, the context identifier lookup circuit 240 may store some levels of the multi-level table in the system memory 218. For example, the array of the context identifier lookup circuit 240 may be implemented as the multi-level table 500 of FIG. 5.
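A short sketch may help contrast the two layouts. The single-level case is just 2^n slots; for the multi-level case, the code below assumes a two-level split in which the upper bits of the hardware identifier index a locally held first level and the lower 8 bits index a second-level table in memory. The split, the pointer value, and the function names are assumptions for illustration and are not taken from FIG. 5.

```python
def single_level_entries(hw_id_bits):
    """Number of entries a single-level table needs: 2^n."""
    return 1 << hw_id_bits

LEVEL2_BITS = 8  # assumed: low 8 bits index the second level

def two_level_lookup(level1, memory, hw_id):
    """Walk a hypothetical two-level table.

    level1: list held locally; each slot is None or a pointer (a dict key).
    memory: dict standing in for system memory, pointer -> second-level list.
    """
    hi = hw_id >> LEVEL2_BITS
    lo = hw_id & ((1 << LEVEL2_BITS) - 1)
    pointer = level1[hi]
    if pointer is None:
        return None  # no second-level table allocated for this range
    return memory[pointer][lo]

# A 16-bit hardware identifier space, populated sparsely.
level1 = [None] * 256
memory = {0x9000: [None] * 256}
level1[0x12] = 0x9000
memory[0x9000][0x34] = 0x42
```

With an 8-bit hardware identifier the single-level table has 256 entries, matching the example in the text; the two-level form only allocates second-level tables for identifier ranges that are actually in use.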

The entries of the array of the context identifier lookup circuit 240 may include various information that specifies how the address translation engine 230 will perform address translation. For example, an entry of the array may include a context identifier (e.g., a software context identifier); a valid flag; and/or a physical page number (PPN) (e.g., a 4 KiB PPN for single-stage or G-stage-only translation). In some implementations, the entry of the array indexed by the first hardware identifier value includes a translation tag that indicates a privilege level, a virtualization mode, and a translation mode, and the translation tag is included in the output address translation request. For example, the privilege level may come from a set of privilege levels that includes a machine privilege level and a supervisor privilege level. For example, the translation mode may come from a set of translation modes that includes a single-stage translation mode, a G-stage-only mode, a VS-stage-only mode, and a nested translation mode. The translation tag may be part of the tag of an entry in the address translation buffer 232, and may be used by the address translation buffer 232 and/or other components of the address translation engine 230 to perform the requested address translation.
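The entry contents listed above can be modeled with a few small types. The sketch below is an assumption-laden illustration: the field names, the enum labels, and the dictionary form of the output request are invented for this example, and the exact role of the PPN field is not spelled out in this passage.

```python
from dataclasses import dataclass
from enum import Enum

class Privilege(Enum):
    MACHINE = "machine"
    SUPERVISOR = "supervisor"

class TranslationMode(Enum):
    SINGLE_STAGE = "single-stage"
    G_STAGE_ONLY = "g-stage-only"
    VS_STAGE_ONLY = "vs-stage-only"
    NESTED = "nested"

@dataclass(frozen=True)
class TranslationTag:
    privilege: Privilege
    virtualized: bool          # virtualization mode
    mode: TranslationMode

@dataclass(frozen=True)
class ContextEntry:
    valid: bool                # valid flag
    context_id: int            # software context identifier
    tag: TranslationTag
    ppn: int                   # physical page number carried by the entry

def make_output_request(entry, vaddr):
    """Build an output request that carries the context identifier and tag."""
    if not entry.valid:
        raise ValueError("entry is not valid")
    return {"context_id": entry.context_id, "tag": entry.tag, "vaddr": vaddr}

entry = ContextEntry(
    valid=True,
    context_id=0x42,
    tag=TranslationTag(Privilege.SUPERVISOR, False, TranslationMode.SINGLE_STAGE),
    ppn=0x80000,
)
```

Keeping the tag as its own immutable value makes it straightforward to reuse it both in the output request and as part of a buffer entry's tag.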

The context identifier lookup circuit 240 may be configured and maintained by executing commands received through a host interface (e.g., via a machine command queue or a supervisor command queue). The context identifier lookup circuit 240 may also detect some errors and, as a result, send error records to the host (e.g., via a machine error record queue or a supervisor error record queue). In some implementations, the context identifier lookup circuit 240 is configured to allow only processes with the machine privilege level to write to the entries of the array. In some implementations, the context identifier lookup circuit 240 is configured to allow only processes with the supervisor privilege level to write to the entries of the array. In some implementations, the context identifier lookup circuit 240 is configured to allow only a hypervisor to write to the entries of the array. For example, the address translation engine 230 may implement the process 700 of FIG. 7 to configure and maintain the array of the context identifier lookup circuit 240. In some implementations, the context identifier lookup circuit 240 can be read indirectly by software via the host interface for debugging purposes.
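The write restriction described above could be modeled as a guard on the array's write path. The sketch below is hypothetical: the privilege labels, the class name, and in particular the choice to log a rejected write as an error record are assumptions for illustration; the text only says that some errors are detected and reported.

```python
class ProtectedContextArray:
    """Hypothetical array whose entries only one privilege level may write."""

    def __init__(self, size, writer_privilege="machine"):
        self._entries = [None] * size
        self._writer_privilege = writer_privilege
        self.error_log = []  # stands in for an error record queue

    def write(self, requester_privilege, hw_id, context_id):
        """Write an entry; return False and log a record if not permitted."""
        if requester_privilege != self._writer_privilege:
            # Assumption: a rejected write is reported as an error record.
            self.error_log.append((requester_privilege, hw_id))
            return False
        self._entries[hw_id] = context_id
        return True

    def read(self, hw_id):
        """Read an entry, e.g. for debugging."""
        return self._entries[hw_id]

table = ProtectedContextArray(size=256, writer_privilege="machine")
accepted = table.write("machine", 3, 0x42)
rejected = table.write("user", 4, 0x99)
```

Restricting writes to one privileged agent keeps a device (or unprivileged software) from remapping its own hardware identifier to another process's translation context.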

Note that, although not shown in FIG. 2, the integrated circuit 210 may have several PCIe controllers served by a single address translation engine 230. The hardware identifier presented to the address translation engine 230 may be constructed so that each endpoint device has a unique identifier (e.g., the most significant bits of the hardware identifier may be a root segment ID).
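One way to picture such a construction is to place a segment number above a per-controller device number. The sketch below assumes a 16-bit device field (for example, a bus/device/function number); that width and the function name `make_hw_id` are assumptions, not details given in the text.

```python
DEVICE_ID_BITS = 16  # assumed width of the per-controller device identifier

def make_hw_id(segment_id, device_id):
    """Place the root segment ID in the most significant bits."""
    if not 0 <= device_id < (1 << DEVICE_ID_BITS):
        raise ValueError("device_id does not fit in the device field")
    if segment_id < 0:
        raise ValueError("segment_id must be non-negative")
    return (segment_id << DEVICE_ID_BITS) | device_id

# The same device number behind two different controllers stays distinct.
hw_id_a = make_hw_id(0, 0x0100)
hw_id_b = make_hw_id(1, 0x0100)
```

Because the segment bits differ, two endpoints that happen to share a device number behind different PCIe controllers still index different entries of the array.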

FIG. 3 is a block diagram of an example of an address translation engine 300 for mapping virtual addresses from various devices to physical addresses of a memory system. The address translation engine 300 may be located in an integrated circuit (e.g., an SOC) close to the devices for which it provides address translation. The role of the address translation engine 300 is to intercept device transactions addressed to system memory and to perform the requested address translation using its address translation buffer 310 (e.g., including one or more TLBs). If the translation buffer does not hold the requested information, the address translation engine 300 performs a page table walk through an interface to system memory via a bus (e.g., a system interconnect). The address translation engine 300 may manage command queues and error record queues to interface with software. The address translation engine 300 may also implement hardware performance monitors. The address translation engine 300 includes an address translation buffer 310, a page table walker 320, a permission check circuit 330, a context identifier lookup circuit 340, and a host interface circuit 350. For example, the address translation engine 300 may be used to implement the process 600 of FIG. 6 and/or the process 700 of FIG. 7.

The address translation engine 300 includes the address translation buffer 310. For example, the address translation buffer 310 may be a translation lookaside buffer. In some implementations, the address translation buffer 310 is part of a two-level address translation cache. For example, the address translation buffer 310 may include a small, fast L1 TLB and a larger L2 TLB. On a TLB hit, the translated address and permissions may be sent through a hit queue to the permission check circuit 330, while an L2 TLB miss is sent to the page table walker 320. The address translation buffer 310 may be fully associative or set associative, with a configurable number of entries. For example, the replacement policy of the address translation buffer 310 may be pseudo-LRU (pseudo least recently used). In some implementations, the address translation buffer 310 stores page translations in registers using a vector of Reg elements, which creates an array of registers whose outputs are copies of the input signals delayed by one clock cycle (depending on the enable signal). The address translation buffer 310 may respond with a hit/miss indication on the next clock cycle, and may store virtual-to-physical page translations (e.g., for 4 KB pages or 2 MB/1 GB/512 GB superpages). For example, the address translation buffer 310 may be implemented using content-addressable memory (CAM) or static random access memory (SRAM). An entry of the address translation buffer 310 includes a tag, and the tag includes a field that stores a context identifier (e.g., a software context identifier), which may be shared by multiple hardware devices. The tag of an entry may also include a virtual address and/or a translation tag that indicates a privilege level, a virtualization mode, and a translation mode. For example, the privilege level may come from a set of privilege levels that includes a machine privilege level and a supervisor privilege level. For example, the translation mode may come from a set of translation modes that includes a single-stage translation mode, a G-stage-only mode, a VS-stage-only mode, and a nested translation mode. An entry of the address translation buffer 310 also includes data, which includes a physical address and permission data (e.g., a read permission flag, a write permission flag, and an execute permission flag).
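The L1/L2 flow described above can be sketched as a small lookup function. This is a simplified model under assumptions: both levels are unbounded dictionaries (so no replacement policy such as pseudo-LRU is modeled), the refill of L1 from L2 is an assumed policy, and `walk` is a hypothetical stand-in for the page table walker.

```python
def tlb_access(l1, l2, tag, walk):
    """Look up a tag in L1, then L2, then fall back to a page table walk.

    l1, l2: dicts mapping tag -> physical page number.
    walk:   callable standing in for the page table walker.
    Returns (ppn, source) where source names the level that answered.
    """
    if tag in l1:
        return l1[tag], "l1-hit"
    if tag in l2:
        l1[tag] = l2[tag]          # assumed policy: refill L1 on an L2 hit
        return l1[tag], "l2-hit"
    ppn = walk(tag)                # L2 miss goes to the page table walker
    l2[tag] = ppn
    l1[tag] = ppn
    return ppn, "walk"

walk_calls = []

def fake_walk(tag):
    """Toy walker: records the call and derives a PPN from the tag."""
    walk_calls.append(tag)
    context_id, vpn = tag
    return 0x80000 + vpn

l1_tlb, l2_tlb = {}, {(0x42, 0x20): 0x90000}
first = tlb_access(l1_tlb, l2_tlb, (0x42, 0x10), fake_walk)
second = tlb_access(l1_tlb, l2_tlb, (0x42, 0x10), fake_walk)
third = tlb_access(l1_tlb, l2_tlb, (0x42, 0x20), fake_walk)
```

Only the first access reaches the walker; the repeat is served by L1, and the third is served by L2 and then promoted to L1.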

The address translation engine 300 includes the page table walker 320, which is configured to perform page table walks to determine address translations in response to misses in the address translation buffer 310. For example, the page table walker 320 may include parallel page table walker instances that share a cache for non-leaf page table entries. The page table walker 320 may be configured to use a memory interface to the system interconnect to access page tables stored in system memory (e.g., the system memory 218).

The address translation engine 300 includes the permission check circuit 330, which receives translated physical addresses from the address translation buffer 310. The permission check circuit 330 may include an arbiter so that it can also take translated requests from a page table walker response queue. It performs the permission check and sends a response to the requesting hardware device via a device translation completion interface (e.g., via a PCIe bus master interface). In case of a violation, the permission check circuit 330 may also write an error record to the appropriate error record queue.
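A minimal sketch of the permission check follows. It assumes the three permission flags mentioned for buffer entries (read, write, execute) and models the error record queue as a plain list; the record format and the names `Permissions` and `check_permission` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Permissions:
    read: bool
    write: bool
    execute: bool

def check_permission(perms, access, error_queue):
    """Return True if the access is allowed; otherwise log an error record.

    access: one of "read", "write", "execute".
    error_queue: list standing in for an error record queue.
    """
    allowed = {
        "read": perms.read,
        "write": perms.write,
        "execute": perms.execute,
    }[access]
    if not allowed:
        # Hypothetical record format, for illustration only.
        error_queue.append({"access": access, "reason": "permission violation"})
    return allowed

errors = []
read_only = Permissions(read=True, write=False, execute=False)
read_ok = check_permission(read_only, "read", errors)
write_ok = check_permission(read_only, "write", errors)
```

A denied access leaves one record in the queue, which in a real engine would be surfaced to the host rather than silently dropped.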

The address translation engine 300 includes the context identifier lookup circuit 340. The context identifier lookup circuit 340 maps hardware identifiers to context identifiers (e.g., software context identifiers) that may be associated with multiple hardware devices, to facilitate sharing of address translation contexts between devices. The context identifier lookup circuit 340 stores the context identifiers in an array (e.g., the context identifier entry array 410) that is indexed by hardware identifier. The context identifier lookup circuit 340 may be configured to receive an input address translation request that includes a first hardware identifier value; generate an output address translation request that includes a first context identifier value stored in the entry of the array indexed by the first hardware identifier value; and apply the output address translation request to the address translation buffer 310. The address translation buffer 310 may return a physical address in response to the output address translation request, and that physical address may be transmitted (by the permission check circuit 330) in response to the input address translation request. In some implementations, the input address translation request is part of a direct memory access request. For example, the input address translation request may be received via a device translation request interface, which may be a slave interface of the valid/ready type. For example, the context identifier lookup circuit 340 may be the context identifier lookup circuit 400 of FIG. 4.

In some implementations, the array is implemented as a single-level table, and all entries of the array are stored in data storage of the context identifier lookup circuit 340. For example, the array of the context identifier lookup circuit 340 may have 2^n entries, where n is the bit width of the hardware identifier. In one example, the array of the context identifier lookup circuit 340 has 256 locally stored entries. In some implementations, the array is implemented as a multi-level table, and at least some entries of the array are stored in memory that the context identifier lookup circuit 340 accesses via a bus. For example, the context identifier lookup circuit 340 may include data storage that holds the first level of the multi-level table, whose entries include pointers to the next level of the multi-level table. For example, the context identifier lookup circuit 340 may store some levels of the multi-level table in system memory. For example, the array of the context identifier lookup circuit 340 may be implemented as the multi-level table 500 of FIG. 5.

Entries of the array of the context identifier lookup circuit 340 may include various information specifying how the address translation engine 230 will perform an address translation. For example, an entry of the array may include a context identifier (e.g., a software context identifier), a valid flag, and/or a PPN (e.g., a 4 KiB PPN for single-stage or G-stage-only translation). In some implementations, the entry of the array indexed by the first hardware identifier value includes a translation tag indicating a privilege level, a virtualization mode, and a translation mode, and the translation tag is included in the output address translation request. For example, the privilege level may be from a set of privilege levels that includes a machine privilege level and a supervisor privilege level. For example, the translation mode may be from a set of translation modes that includes a single-stage translation mode, a G-stage-only mode, a VS-stage-only mode, and a nested translation mode. The translation tag may be part of the tag of an entry in the address translation buffer 310, and may be used by the address translation buffer 310 and/or the page table walker 320 to perform the requested address translation. The context identifier lookup circuit 340 may be configured to provide the entry of the array indexed by the first hardware identifier value, together with the translation request, to the address translation buffer 310.
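As a rough illustration of the entry contents listed above, an entry and its translation tag might be modeled as follows. All type and field names here (`ContextEntry`, `TranslationTag`, `build_output_request`) are hypothetical, and the output request is represented as a plain dictionary purely for convenience.

```python
from dataclasses import dataclass
from enum import Enum


class Privilege(Enum):
    MACHINE = "M"
    SUPERVISOR = "S"


class TranslationMode(Enum):
    SINGLE_STAGE = 1
    G_STAGE_ONLY = 2
    VS_STAGE_ONLY = 3
    NESTED = 4


@dataclass(frozen=True)
class TranslationTag:
    privilege: Privilege
    virtualized: bool
    mode: TranslationMode


@dataclass(frozen=True)
class ContextEntry:
    context_id: int
    valid: bool
    ppn: int              # base physical page number of the page table
    tag: TranslationTag


def build_output_request(entry, virtual_address):
    """Copy the context ID and translation tag into the output request."""
    if not entry.valid:
        raise LookupError("entry is not valid")
    return {
        "context_id": entry.context_id,
        "tag": entry.tag,
        "virtual_address": virtual_address,
    }
```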

The address translation engine 300 includes a host interface circuit 350. The host interface circuit 350 enables a host to configure and maintain the address translation engine 300, including writing to the array of the context identifier lookup circuit 340. The host interface circuit 350 may output error records, system performance data, and/or debug data. For example, the host interface circuit 350 may implement command queues and error record queues. The host interface circuit 350 may include a slave interface to a system interconnect.

FIG. 4 is a block diagram of an example of a context identifier lookup circuit 400 for sharing address translation contexts between devices. The context identifier lookup circuit 400 includes a context identifier entry array 410 (e.g., a table), a valid vector 420 that stores valid flags for individual entries of the context identifier entry array 410, and an arbiter 430. For example, the context identifier entry array 410 may be used to implement the process 600 of FIG. 6.

An entry of the context identifier entry array 410 may describe the translation rules of each stage for a particular IO device. The hardware identifier of an incoming transaction is used as an index into the context identifier entry array 410 to find the associated base physical page number (PPN) of a page table. For example, on a small non-virtualized system, the number of devices connected to an address translation engine (e.g., the address translation engine 300) is limited, whereas on a large virtualized system the number of devices and virtual functions can be large. To support both types of systems, the context identifier lookup circuit 400 may have two interfaces to the context identifier entry array 410. When the hardware identifier is small (e.g., 8 bits or fewer), the context identifier entry array 410 may be a single-level table stored in the context identifier lookup circuit 400. Software may populate the table through the host interface using commands. When the hardware identifier is large (e.g., more than 8 bits), the context identifier entry array 410 may be a multi-level table stored in system memory.

The goal of the context identifier entry array 410 is to provide information similar to the satp register in the memory management unit (MMU) of a hart for single-stage translation, and to link an IO device to a context identifier (e.g., a software context identifier). Just as a hardware identifier links a hardware device function to a unique identifier, software may use a context identifier stored by the context identifier entry array 410 to link a process (e.g., a combination of VMID and ASID) to a unique software identifier. This context identifier is used to tag the base PPN of the page table used for address translation. Software keeps track of the mapping from hardware identifiers to context identifiers. In some implementations, the context identifier entry array 410 provides up to 30 bits, since that can map well to a 14-bit VMID and a 16-bit ASID, but how context identifiers are handled is a software implementation choice. Software is not required to map {VMID, ASID} tuples one-to-one to context identifiers. For example, a context identifier may have a parameterized size between 16 and 30 bits (parameter: SCIDWidth). In the case of a virtualized system requesting two-stage translation, the entire context identifier (SCID) may be used to identify the first-stage or VS-stage translation, while a portion of the context identifier, called the GSCID, is used to identify the second-stage or G-stage translation. The upper GSCIDWidth bits of the context identifier may be used as the GSCID. GSCIDWidth is a parameter that must be less than or equal to SCIDWidth, and GSCID = SCID[SCIDWidth-1 : SCIDWidth-GSCIDWidth]. For example, if the SCID is assumed to be a 30-bit context identifier representing the tuple {VMID, ASID}, then the GSCID may be a 14-bit identifier that can represent the VMID. The context identifier may be used to tag single-stage translations as well as VS-stage or nested translations. The GSCID may be used to tag G-stage translations.
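The bit slice GSCID = SCID[SCIDWidth-1 : SCIDWidth-GSCIDWidth] amounts to a right shift. The sketch below works through the 30-bit example; the function name `gscid_from_scid` is hypothetical, and packing the VMID into the upper bits of the SCID is an assumption made here for illustration, consistent with but not mandated by the description.

```python
def gscid_from_scid(scid, scid_width=30, gscid_width=14):
    """Return the upper gscid_width bits of an scid_width-bit SCID."""
    if not 0 < gscid_width <= scid_width:
        raise ValueError("GSCIDWidth must be between 1 and SCIDWidth")
    if scid < 0 or scid >> scid_width:
        raise ValueError("SCID does not fit in SCIDWidth bits")
    return scid >> (scid_width - gscid_width)


# Assumed packing for this example: SCID = {VMID (14 bits), ASID (16 bits)}.
vmid, asid = 0x1ABC, 0x1234
scid = (vmid << 16) | asid
```

With this packing, `gscid_from_scid(scid)` recovers the VMID, so every process of the same virtual machine shares one G-stage tag while keeping a distinct VS-stage tag.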

The context identifier entry array 410 may be written locally in the context identifier lookup circuit 400 through commands sent by software. Whether M-mode software or S-mode software sends a command to configure the context identifier entry array 410, hardware of the context identifier lookup circuit 400 reads the command from a queue and, after checking the validity of the command, populates the context identifier entry array 410 accordingly. For debug purposes, the host may be given read-only access to the context identifier entry array 410, which (e.g., as a table) is built by software. Read access may be restricted to M-mode software. In some implementations, hardware may prevent the host from writing directly to the context identifier entry array 410.

When the hardware identifier is large (e.g., more than 8 bits), a multi-level table stored in system memory may be defined. For example, the multi-level table 500 of FIG. 5 may be used.

The context identifier lookup circuit 400 interfaces with several modules. It interfaces with external devices via an IO device interface, which may be a device slave interface to a bus (e.g., a PCIe bus). Upon receipt of a translation request, when translation is enabled, the context for the translation must be fetched from the context identifier entry array 410 using the hardware identifier of the request. The context identifier lookup circuit 400 interfaces with an address translation buffer (e.g., a TLB) via a TLB interface. Once the entry of the context identifier entry array 410 has been read for the requested hardware identifier, the result may be forwarded along with the request to the address translation buffer (e.g., the address translation buffer 310). The context identifier lookup circuit 400 interfaces with the host via a host interface. For example, the context identifier entry array 410 may be configured and maintained by executing commands received on an M command queue or an S command queue. The context identifier lookup circuit 400 may also detect some errors and accordingly send error records to an M error record queue or an S error record queue. For debug purposes, the context identifier lookup circuit 400 may be read indirectly by software through the host interface. To simplify the invalidation procedure, a valid flag or bit associated with each entry of the context identifier entry array 410 may be maintained in the valid vector 420. For example, accesses to the context identifier entry array 410 may include:

A read at the hardware identifier value of the input address translation request, upon a device translation request indicated by a request-valid input signal. The entry of the context identifier entry array 410 indexed by the hardware identifier value is available on the next clock cycle, along with an entry-valid signal. The request attributes may be latched and provided to the address translation buffer via the TLB interface (e.g., ReqHCID_Out, ReqVA_Out, ReqID_out, Req_RWN_out).

A write at WrHCID to create an entry (WrCTE, WrPrivL) when requested by WrCTEVal. The appropriate PrivL field of the entry of the context identifier entry array 410 may be set based on the source queue of the command: when a CMD_SET_ONE_CTE command is executed from the M command queue, the M-mode privilege level is set, and when a CMD_SET_ONE_CTE command is executed from the S command queue, the S-mode privilege level is set. The Valid[WrHCID] bit is set to indicate that the corresponding entry in the context identifier entry array 410 is valid.

A read at DbgHCID when requested by a debug read signal. For debug purposes, software has indirect read access to the context identifier entry array 410 through the host interface.

Upon an invalidation command, the valid vector 420 needs to be updated. When the InvalidateAll signal is asserted (command: CMD_INV_ALL_CTE), the entire valid vector 420 is reset. When the Invalidate signal is asserted (commands: CMD_INV_ONE_CTE, CMD_INV_MANY_CTE), the valid flag (or flags) associated with the hardware identifier (or set of hardware identifiers) extracted from the InvHCIDBlock and InvVector inputs is cleared.
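The write, read, and invalidation accesses described above can be summarized in a small behavioral model. This is only a sketch under stated assumptions: the class `ContextTable` and its method names are hypothetical, timing and signal-level behavior are ignored, and the decoding of InvHCIDBlock and InvVector into a set of hardware identifiers is not modeled (the caller passes the identifiers directly).

```python
class ContextTable:
    """Toy model of the entry array 410 together with the valid vector 420."""

    def __init__(self, hw_id_bits=8):
        size = 1 << hw_id_bits
        self.entries = [None] * size    # (cte, priv_level) pairs
        self.valid = [False] * size     # one valid flag per entry

    def set_one(self, hw_id, cte, source_queue):
        # The PrivL field follows the queue the command came from ("M" or "S").
        if source_queue not in ("M", "S"):
            raise ValueError("unknown command queue")
        self.entries[hw_id] = (cte, source_queue)
        self.valid[hw_id] = True

    def read(self, hw_id):
        # Returns the stored entry and its valid flag.
        return self.entries[hw_id], self.valid[hw_id]

    def invalidate(self, hw_ids):
        # Only the flags are cleared; the stored entries are left in place.
        for hw_id in hw_ids:
            self.valid[hw_id] = False

    def invalidate_all(self):
        self.valid = [False] * len(self.valid)
```

Keeping the flags in a separate vector means an invalidation never has to touch the entry storage itself, which is one way to read the statement that the valid vector simplifies the invalidation procedure.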

The arbiter 430 includes circuitry that controls access to the context identifier entry array 410. For example, the arbiter 430 may implement a simple round robin between the IO device interface and the host interface, where the host interface may be divided into a debug interface and a queue management interface.
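A simple round-robin policy of the kind mentioned above can be sketched as follows. The class name `RoundRobinArbiter` and the requester labels are hypothetical; the sketch only shows the grant ordering, not the hardware implementation.

```python
class RoundRobinArbiter:
    """Grant one pending requester per call, rotating the starting point."""

    def __init__(self, requesters):
        self.requesters = list(requesters)
        self.next_index = 0

    def grant(self, pending):
        count = len(self.requesters)
        for offset in range(count):
            index = (self.next_index + offset) % count
            name = self.requesters[index]
            if name in pending:
                # The requester after the winner gets priority next time.
                self.next_index = (index + 1) % count
                return name
        return None   # nothing is requesting access
```

When both interfaces request access continuously, grants alternate between them, so neither the IO device interface nor the host interface can starve the other.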

FIG. 5 is a block diagram of an example of a multi-level table 500 implementing an array for mapping hardware identifiers to software context identifiers in address translation requests. When the hardware identifier width is large (e.g., more than 8 bits), it is advantageous to define the array as a multi-level table stored in system memory. In FIG. 5, the array is implemented as the multi-level table 500, and at least some entries of the array are stored in a memory accessed by the context identifier lookup circuit via a bus (e.g., the system architecture 214).

Each level of the multi-level table 500 may have a size (PLATE_CIDT_L_WIDTH) defined in a configuration register of the address translation engine, along with the base address of the first level (PLATE_CIDT_BASE). In the multi-level table 500, the upper 8 bits of the hardware identifier (defined by CID_L1_width) are used to address the CID_L1 table 510. For example, the upper 8 bits of the hardware identifier may correspond to a bus. A descriptor 512 indicates the remaining levels and a pointer to the next CID table level 520. The next 8 bits of the hardware identifier (defined by CID_L2_width) are used to index the CID_L2 table 520. For example, the next 8 bits of the hardware identifier may correspond to a device and function. For example, a descriptor of the multi-level table 500 may include a valid bit indicating whether the descriptor is valid, a remaining-levels field indicating the number of levels to read in order to reach an entry of the array (e.g., 2 bits for a 4-level table), and a next-level pointer that is the physical address of the next level of the multi-level table 500. A descriptor 522 indicates the remaining levels and a pointer to the next CID table level 530. The next 10 bits of the hardware identifier (defined by CID_L3_width) are used to index the CID_L3 table 530. For example, these 10 bits may correspond to the most significant bits of the PASID. A descriptor 532 indicates the remaining levels and a pointer to the next CID table level 540. The last 10 bits of the hardware identifier (defined by CID_L4_width) are used to index the CID_L4 table 540. For example, the last 10 bits may correspond to the least significant bits of the PASID. The requested entry of the array 542 is indexed by the last 10 bits of the hardware identifier in the CID_L4 table 540.
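The 8/8/10/10 partitioning of the hardware identifier described above can be illustrated with a short helper. The function name `split_hardware_id` is hypothetical, and the 36-bit layout {bus, device/function, PASID} used in the example is an assumption drawn from the correspondences suggested in the text.

```python
def split_hardware_id(hw_id, widths=(8, 8, 10, 10)):
    """Split a hardware ID into per-level indices, most significant first."""
    total = sum(widths)
    if hw_id < 0 or hw_id >> total:
        raise ValueError("hardware ID does not fit in the configured widths")
    indices = []
    shift = total
    for width in widths:
        shift -= width
        indices.append((hw_id >> shift) & ((1 << width) - 1))
    return indices


# Assumed layout for this example: bus (8) | device/function (8) | PASID (20).
bus, devfn, pasid = 0x12, 0x34, 0xABCDE
hw_id = (bus << 28) | (devfn << 20) | pasid
level_indices = split_hardware_id(hw_id)
```

Here `level_indices` holds the bus number, the device/function number, and the upper and lower halves of the PASID, in the order in which the CID_L1 through CID_L4 tables would be consulted.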

The multi-level table 500 may be configured to use 2, 3, or 4 levels. For example, some endpoints may support PASID and some may not. The remaining-levels information in a descriptor helps optimize both the number of tables the context identifier lookup circuit must read to reach the relevant entry of its array and the size of the multi-level table 500. For example, a descriptor 514 indicates one remaining level and a pointer to the next CID table level 550. The next 8 bits of the hardware identifier (defined by CID_L2_width) are used to index the CID_L2 table 520 to retrieve the requested entry of the array 552. For example, the next 8 bits of the hardware identifier may correspond to a device and function. In some implementations, the multi-level table 500 must be configured by the most privileged software (M-mode), because doing so requires setting the PrivL field in the entries of the array.
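To make the role of the remaining-levels field concrete, the following sketch walks a table whose branches have different depths. It is an assumption-laden model rather than the disclosed hardware: `Descriptor` and `walk` are hypothetical names, system memory is modeled as a dictionary from table address to a sparse table, and the caller supplies the per-level indices already split from the hardware identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    valid: bool
    remaining_levels: int   # levels still to read before reaching an entry
    next_table: int         # address of the next-level table


def walk(memory, base_address, indices):
    """Follow descriptors from the first-level table to an array entry."""
    descriptor = memory[base_address].get(indices[0])
    depth = 1
    while True:
        if descriptor is None or not descriptor.valid:
            raise LookupError("invalid or missing descriptor")
        item = memory[descriptor.next_table].get(indices[depth])
        if descriptor.remaining_levels == 1:
            if item is None:
                raise LookupError("no entry for this hardware ID")
            return item          # reached the array entry
        descriptor = item        # another descriptor; go one level deeper
        depth += 1


# One branch resolves in 2 levels, another in 4 levels.
memory = {
    0x1000: {0x12: Descriptor(True, 1, 0x2000),
             0x13: Descriptor(True, 3, 0x3000)},
    0x2000: {0x34: 77},
    0x3000: {0x01: Descriptor(True, 2, 0x4000)},
    0x4000: {0x002: Descriptor(True, 1, 0x5000)},
    0x5000: {0x003: 99},
}
```

A device without PASID support can thus be resolved after two reads, while a PASID-capable device in the same table uses all four levels.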

In some implementations, the context identifier lookup circuit includes a data store that stores the first level 510 of the multi-level table 500, whose entries include pointers to a next level of the multi-level table 500.

FIG. 6 is a flowchart of an example of a process 600 for mapping hardware identifiers to software context identifiers in address translation requests, which enables sharing of address translation contexts between devices. The process 600 includes receiving 610 an input address translation request that includes a first hardware identifier value; generating 620 an output address translation request that includes a first context identifier value stored in an entry of an array indexed by the first hardware identifier value; and applying 630 the output address translation request to an address translation buffer. For example, the process 600 may be implemented using the system 100 of FIG. 1. For example, the process 600 may be implemented using the integrated circuit 210 of FIG. 2. For example, the process 600 may be implemented using the address translation engine 300 of FIG. 3. For example, the process 600 may be implemented using the context identifier lookup circuit 400 of FIG. 4.

The process 600 includes receiving 610 an input address translation request that includes a first hardware identifier value. For example, the input address translation request may be received 610 from a processor core via a bus, and the first hardware identifier value may be associated with the processor core. For example, the input address translation request may be part of a direct memory access request. In some implementations, the input address translation request is received from an external device (e.g., the endpoint device 222) via a PCIe bus.

The process 600 includes generating 620 an output address translation request that includes a first context identifier value stored in an entry of an array (e.g., the context identifier entry array 410) indexed by the first hardware identifier value. In some implementations, the array is implemented as a single-level table. In some implementations, the array is implemented as a multi-level table (e.g., the multi-level table 500), and at least some entries of the array are stored in a memory accessed via a bus. For example, a first level of the multi-level table may have entries that include pointers to a next level of the multi-level table. The array may be configured and maintained by host software (e.g., a hypervisor). For example, the process 700 of FIG. 7 may be implemented via a host interface to update the array. In some implementations, the entry of the array indexed by the first hardware identifier value includes a translation tag indicating a privilege level, a virtualization mode, and a translation mode, and the translation tag is included in the output address translation request. For example, the privilege level may be from a set of privilege levels that includes a machine privilege level and a supervisor privilege level. For example, the translation mode may be from a set of translation modes that includes a single-stage translation mode, a G-stage-only mode, a VS-stage-only mode, and a nested translation mode.

The process 600 includes applying 630 the output address translation request to an address translation buffer (e.g., an address translation cache). An entry of the address translation buffer includes a tag, and the tag includes a field that stores a context identifier. For example, the address translation buffer may be a translation lookaside buffer. In some implementations, the address translation buffer is part of a two-level address translation cache.
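The benefit of tagging buffer entries with a context identifier can be shown with a toy cache. This sketch is illustrative only: `TaggedTlb` is a hypothetical name, the buffer is an unbounded dictionary rather than a set-associative structure, a 4 KiB page size is assumed, and the page table walk is replaced by a caller-supplied function.

```python
PAGE_SHIFT = 12   # assumes 4 KiB pages


class TaggedTlb:
    """Toy translation buffer whose entries are tagged by context ID."""

    def __init__(self, walk_page_table):
        self.walk_page_table = walk_page_table   # called on a miss
        self.entries = {}                        # (context_id, vpn) -> ppn
        self.misses = 0

    def translate(self, context_id, virtual_address):
        vpn = virtual_address >> PAGE_SHIFT
        key = (context_id, vpn)
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = self.walk_page_table(context_id, vpn)
        offset = virtual_address & ((1 << PAGE_SHIFT) - 1)
        return (self.entries[key] << PAGE_SHIFT) | offset
```

Because the tag holds the context identifier and not a hardware identifier, a request from a second device that maps to the same context hits the entry installed by the first device, while a different context misses and is translated separately.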

FIG. 7 is a flowchart of an example of a process 700 for updating the array of a context identifier lookup circuit based on a write request from a host. The process 700 includes receiving 710 a write request from a host. For example, the host may be a process running on a processor core of an integrated circuit that includes the context identifier lookup circuit. The process 700 includes checking 720 the privilege level of the host. If (at 725) the privilege level requirement is met, the process 700 includes updating 730 an entry of the array based on the write request. If (at 725) the privilege level requirement is not met, the process 700 includes rejecting 740 the write request. In some implementations, checking 720 the privilege level of the host includes checking 720, before updating 730 the entry of the array, that the host writing to the array has a machine privilege level. In some implementations, checking 720 the privilege level of the host includes checking 720, before updating 730 the entry of the array, that the host writing to the array has a supervisor privilege level. In some implementations, checking 720 the privilege level of the host includes checking 720, before updating 730 the entry of the array, that the host writing to the array is a hypervisor. For example, the process 700 may be implemented using the system 100 of FIG. 1. For example, the process 700 may be implemented using the integrated circuit 210 of FIG. 2. For example, the process 700 may be implemented using the address translation engine 300 of FIG. 3. For example, the process 700 may be implemented using the context identifier lookup circuit 400 of FIG. 4.
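The check-then-update flow of the process 700 can be sketched in a few lines. The function name `handle_write_request`, the dictionary-based request format, and the use of the strings "M" and "S" for privilege levels are all assumptions made for this illustration.

```python
def handle_write_request(array, request, required_privilege="M"):
    """Update one array entry only if the host has the required privilege.

    Returns True when the entry was updated and False when the request
    was rejected.
    """
    if request["privilege"] != required_privilege:
        return False                      # reject the write request
    array[request["hw_id"]] = request["context_id"]
    return True                           # entry updated
```

The required privilege is a parameter in this sketch because the description covers implementations that require a machine privilege level, a supervisor privilege level, or a hypervisor.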

In a first aspect, the subject matter described in this specification can be embodied in an integrated circuit for executing instructions that includes an address translation buffer, wherein an entry of the address translation buffer includes a tag that includes a field storing a context identifier; and a context identifier lookup circuit that stores context identifiers in an array indexed by hardware identifier, wherein the context identifier lookup circuit is configured to: receive an input address translation request that includes a first hardware identifier value; generate an output address translation request that includes a first context identifier value stored in an entry of the array indexed by the first hardware identifier value; and apply the output address translation request to the address translation buffer.

In a second aspect, the subject matter described in this specification can be embodied in a method that includes receiving an input address translation request that includes a first hardware identifier value; generating an output address translation request that includes a first context identifier value stored in an entry of an array indexed by the first hardware identifier value; and applying the output address translation request to an address translation buffer, wherein an entry of the address translation buffer includes a tag that includes a field storing a context identifier.

While the disclosure has been described in connection with certain embodiments, it is to be understood that the disclosure is not limited to the disclosed embodiments but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures.

100, 200: system
110: processor core complex
120: memory interconnect
130: memory subsystem
140: IO device
150: address translation engine
152: IO bridge
210: integrated circuit
212: processor core
214: system architecture
216: memory controller
218: system memory
220: PCIe controller
222, 224, 226: endpoint device
228: switch
230: address translation engine
232: address translation buffer
240: context identifier (CID) lookup circuit
300: address translation engine
310: address translation buffer
320: page table walker
330: permission check circuit
340, 400: CID lookup circuit
350: host interface circuit
410: context ID entry array
420: valid vector
430: arbiter
500: multi-level table
510: addressed CID_L1 table
512, 514, 522, 532: descriptor
520: indexed CID_L2 table
530: indexed CID_L3 table
540: indexed CID_L4 table
542, 552: array
550: CID table level
600, 700: process
TLB: translation lookaside buffer

The disclosure is best understood from the following detailed description when read in conjunction with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to scale. On the contrary, the dimensions of the various features may be arbitrarily expanded or reduced for clarity.
FIG. 1 is a block diagram of an example of a system for sharing memory with input/output devices, the system including an address translation engine for translating virtual addresses from various devices into physical addresses.
FIG. 2 is a block diagram of an example of a system for sharing memory with devices connected to an integrated circuit by a bus, the system including an address translation engine that includes a context identifier lookup circuit enabling sharing of address translation contexts between devices.
FIG. 3 is a block diagram of an example of an address translation engine for mapping virtual addresses from various devices to physical addresses of a memory system.
FIG. 4 is a block diagram of an example of a context identifier lookup circuit for sharing address translation contexts between devices.
FIG. 5 is a block diagram of an example of a multi-level table implementing an array for mapping hardware identifiers to software context identifiers in address translation requests.
FIG. 6 is a flowchart of an example of a process for mapping hardware identifiers to software context identifiers in address translation requests, to enable sharing of address translation contexts between devices.
FIG. 7 is a flowchart of an example of a process for updating the array of a context identifier lookup circuit based on a write request from a host.


Claims (20)

1. An integrated circuit comprising:
an address translation buffer, wherein an entry of the address translation buffer includes a tag that includes a field storing a context identifier; and
a context identifier lookup circuit that stores context identifiers in an array indexed by a hardware identifier, wherein the context identifier lookup circuit is configured to:
receive an input address translation request including a first hardware identifier value;
generate an output address translation request including a first context identifier value stored in an entry of the array indexed by the first hardware identifier value; and
apply the output address translation request to the address translation buffer.

2. The integrated circuit of claim 1, wherein the array is implemented as a single-level table and all entries of the array are stored in a data store of the context identifier lookup circuit.

3. The integrated circuit of claim 1, wherein the array is implemented as a multi-level table, and at least some entries of the array are stored in a memory accessed by the context identifier lookup circuit via a bus.

4. The integrated circuit of claim 3, wherein the context identifier lookup circuit includes a data store storing a first level of the multi-level table having entries that include a pointer to a next level of the multi-level table.

5. The integrated circuit of claim 1, wherein the context identifier lookup circuit receives the input address translation request from a processor core via a bus, and the first hardware identifier value is associated with the processor core.

6. The integrated circuit of claim 1, wherein the context identifier lookup circuit is configured to only allow processes with a machine privilege level to write to entries of the array.

7. The integrated circuit of claim 1, wherein the context identifier lookup circuit is configured to only allow processes with a supervisor privilege level to write to entries of the array.

8. The integrated circuit of claim 1, wherein the context identifier lookup circuit is configured to only allow a hypervisor to write to entries of the array.

9. The integrated circuit of claim 1, wherein the address translation buffer is part of a second-level address translation cache.

10. The integrated circuit of claim 1, wherein the address translation buffer is a translation lookaside buffer.

11. The integrated circuit of claim 1, wherein the input address translation request is part of a direct memory access request.

12. The integrated circuit of claim 1, wherein the input address translation request is received from an external device via a PCIe bus.

13. The integrated circuit of claim 1, wherein the entry of the array indexed by the first hardware identifier value includes a translation tag indicating a privilege level, a virtualization mode, and a translation mode, and the translation tag is included in the output address translation request.

14. The integrated circuit of claim 13, wherein the privilege level is from a set of privilege levels including a machine privilege level and a supervisor privilege level.

15. The integrated circuit of claim 13, wherein the translation mode is from a set of translation modes including a single-stage translation mode, a G-stage-only mode, a VS-stage-only mode, and a nested translation mode.

16. A method comprising:
receiving an input address translation request including a first hardware identifier value;
generating an output address translation request including a first context identifier value stored in an entry of an array indexed by the first hardware identifier value; and
applying the output address translation request to an address translation buffer, wherein an entry of the address translation buffer includes a tag that includes a field storing a context identifier.

17. The method of claim 16, wherein the array is implemented as a single-level table.

18. The method of claim 16, wherein the array is implemented as a multi-level table, and at least some entries of the array are stored in a memory accessed via a bus.

19. The method of claim 18, wherein a first level of the multi-level table has entries that include a pointer to a next level of the multi-level table.

20. The method of claim 16, wherein the input address translation request is received from a processor core via a bus, and the first hardware identifier value is associated with the processor core.
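The lookup flow recited in the independent claims can be illustrated with a small software model. The following Python sketch is not taken from the patent; it only assumes a single-level array, 4 KiB pages, and a machine-privilege write check. All class, method, and parameter names (`CidLookup`, `TranslationBuffer`, `write_entry`, `PAGE_SHIFT`, and so on) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB pages; the claims do not fix a page size


@dataclass(frozen=True)
class InputRequest:
    hw_id: int         # hardware identifier value (e.g. tied to a core or device)
    virtual_addr: int


@dataclass(frozen=True)
class OutputRequest:
    cid: int           # context identifier read from the array
    virtual_addr: int


class CidLookup:
    """Hypothetical model of a single-level array indexed by hardware identifier."""

    def __init__(self, num_hw_ids: int) -> None:
        self._array: List[Optional[int]] = [None] * num_hw_ids

    def write_entry(self, hw_id: int, cid: int, privilege: str) -> None:
        # Assumed policy: only machine-privilege software may write entries.
        if privilege != "machine":
            raise PermissionError("only machine privilege may write the array")
        self._array[hw_id] = cid

    def translate_request(self, req: InputRequest) -> OutputRequest:
        # Replace the hardware identifier with the software-assigned CID.
        cid = self._array[req.hw_id]
        if cid is None:
            raise LookupError(f"no context identifier for hw_id {req.hw_id}")
        return OutputRequest(cid=cid, virtual_addr=req.virtual_addr)


class TranslationBuffer:
    """Hypothetical buffer whose entry tag includes a context identifier field."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, int], int] = {}  # (cid, vpn) -> ppn

    def fill(self, cid: int, vpn: int, ppn: int) -> None:
        self._entries[(cid, vpn)] = ppn

    def lookup(self, req: OutputRequest) -> Optional[int]:
        vpn = req.virtual_addr >> PAGE_SHIFT
        ppn = self._entries.get((req.cid, vpn))
        if ppn is None:
            return None  # miss: a real design would start a page-table walk
        offset = req.virtual_addr & ((1 << PAGE_SHIFT) - 1)
        return (ppn << PAGE_SHIFT) | offset


def demo() -> Tuple[Optional[int], Optional[int]]:
    lookup = CidLookup(num_hw_ids=4)
    # Software maps two different hardware identifiers to the same CID.
    lookup.write_entry(0, cid=7, privilege="machine")
    lookup.write_entry(1, cid=7, privilege="machine")
    tlb = TranslationBuffer()
    tlb.fill(cid=7, vpn=0x10, ppn=0x80)
    pa0 = tlb.lookup(lookup.translate_request(InputRequest(0, 0x10ABC)))
    pa1 = tlb.lookup(lookup.translate_request(InputRequest(1, 0x10ABC)))
    return pa0, pa1
```

In this model, two requesters with different hardware identifiers hit the same buffer entry because software assigned them the same context identifier, which is one plausible reading of the "address translation sharing" in the title. A multi-level variant would replace the flat list with first-level entries that point to next-level tables held in memory.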
TW111138972A 2021-10-17 2022-10-14 Software indirection level for address translation sharing TW202319902A (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US202163256628P 2021-10-17 2021-10-17
US202163256630P 2021-10-17 2021-10-17
US63/256,630 2021-10-17
US63/256,628 2021-10-17
US202163291314P 2021-12-17 2021-12-17
US63/291,314 2021-12-17

Publications (1)

Publication Number Publication Date
TW202319902A (en) 2023-05-16

Family

ID=84357929

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111138972A TW202319902A (en) 2021-10-17 2022-10-14 Software indirection level for address translation sharing

Country Status (2)

Country Link
TW (1) TW202319902A (en)
WO (1) WO2023064590A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9323715B2 (en) * 2013-11-14 2016-04-26 Cavium, Inc. Method and apparatus to represent a processor context with fewer bits
EP3899719A4 (en) * 2018-12-21 2022-07-06 INTEL Corporation Process address space identifier virtualization using hardware paging hint
US11210102B2 (en) * 2019-11-26 2021-12-28 Arm Limited Speculative buffer for speculative memory accesses with entries tagged with execution context identifiers

Also Published As

Publication number Publication date
WO2023064590A1 (en) 2023-04-20
