TWI805866B - Processor to detect redundancy of page table walk - Google Patents

Info

Publication number
TWI805866B
Authority
TW
Taiwan
Prior art keywords
address
page table
walkthrough
index
level
Prior art date
Application number
TW108139160A
Other languages
Chinese (zh)
Other versions
TW202034174A (en)
Inventor
朴城範
希德 莫紐
崔周熙
Original Assignee
南韓商三星電子股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南韓商三星電子股份有限公司
Publication of TW202034174A
Application granted
Publication of TWI805866B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65 Details of virtual memory and virtual address translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/68 Details of translation look-aside buffer [TLB]
    • G06F2212/684 TLB miss handling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30047 Prefetch instructions; cache control instructions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Software Systems (AREA)
  • Hardware Redundancy (AREA)

Abstract

A processor includes a page table walk cache that stores address translation information, and a page table walker. The page table walker fetches first output addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of page tables, and compares a matching level between second indexes of a second input address and the first indexes of the first input address with a walk cache hit level obtained by looking up the page table walk cache using the second indexes.

Description

Processor to detect redundancy of page table walk

The present invention relates to a processor and, more particularly, to a processor configured to detect redundancy of a page table walk.

[Cross-Reference to Related Applications]

This patent application claims priority to U.S. Provisional Patent Application No. 62/803,227, filed with the U.S. Patent and Trademark Office on February 8, 2019, and to Korean Patent Application No. 10-2019-0022184, filed with the Korean Intellectual Property Office on February 26, 2019, the disclosures of which are incorporated herein by reference in their entirety.

A system on chip (hereinafter "SoC") is an integrated circuit in which multiple components of an electronic system, or multiple intellectual properties (IPs), are integrated. The term "intellectual property" and the abbreviation "IP" both refer to unique circuits and circuit components that may each be separately protected as intellectual property. As used in this description, the term and the abbreviation may be synonymous with similar terms such as "IP block" or "IP circuit". A processor of the SoC may execute multiple application programs that a user wants, and to this end the processor may exchange data with a memory device. However, because the user wants multiple application programs to run quickly and simultaneously, the processor has to use the limited resources of the memory device efficiently. The processor may use a virtual memory space and may manage page tables that include mapping information between the virtual memory space and the physical memory space of the memory device. The processor may look up the page tables and may translate between a virtual address of the virtual memory space and a physical address of the physical memory space.

Embodiments of the present invention provide a processor that detects redundancy of a page table walk.

According to an exemplary embodiment, a processor includes a page table walk cache and a page table walker. The page table walk cache stores address translation information. The page table walker fetches first output addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of page tables. The page table walker also compares a matching level with a walk cache hit level. The matching level is the level of matching between second indexes of a second input address and the first indexes of the first input address. The walk cache hit level is obtained by looking up the page table walk cache using the second indexes.
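The comparison described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the modeling of a "level" as a count of leading indexes, the modeling of the walk cache as a set of index prefixes, and in particular the rule in `is_redundant`. This passage only states that the two levels are compared; it does not state what follows from the comparison.

```python
# Hypothetical sketch; names, data model, and the decision rule are assumptions.

def matching_level(first_indexes, second_indexes):
    """Count how many leading levels (L0, L1, ...) carry identical indexes."""
    level = 0
    for a, b in zip(first_indexes, second_indexes):
        if a != b:
            break
        level += 1
    return level


def walk_cache_hit_level(walk_cache, indexes):
    """Count how many leading levels already have cached partial translations.

    walk_cache is modeled as a set of index prefixes stored as tuples.
    """
    level = 0
    for depth in range(1, len(indexes) + 1):
        if tuple(indexes[:depth]) not in walk_cache:
            break
        level = depth
    return level


def is_redundant(first_indexes, second_indexes, walk_cache):
    """Assumed rule: the second walk overlaps the first walk beyond what the
    walk cache already covers when the matching level exceeds the hit level."""
    return matching_level(first_indexes, second_indexes) > walk_cache_hit_level(
        walk_cache, second_indexes
    )
```

With this model, two input addresses that share their first three indexes while the walk cache covers only the first index would be flagged, because the second walk would otherwise fetch table entries that the first walk is already fetching.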

According to another exemplary embodiment, a processor includes a page table walk cache and a page table walker. The page table walk cache stores address translation information. The page table walker fetches first intermediate addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of first page tables of a first stage. The page table walker also fetches first output addresses indicated by second indexes of each of the first intermediate addresses by looking up the address translation information and at least a part of second page tables of a second stage. In addition, the page table walker compares a matching level with a walk cache hit level. The matching level is the level of matching between fourth indexes of each of second intermediate addresses, which are indicated by third indexes of a second input address, and the second indexes of each of the first intermediate addresses. The walk cache hit level is obtained by looking up the page table walk cache using the fourth indexes.

According to yet another exemplary embodiment, a processor includes a page table walk cache and a page table walker. The page table walk cache stores address translation information. The page table walker fetches first intermediate addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of first page tables of a first stage. The page table walker also fetches first output addresses indicated by second indexes of each of the first intermediate addresses by looking up the address translation information and at least a part of second page tables of a second stage. In addition, the page table walker compares a first matching level with a first walk cache hit level. The first matching level is the level of matching between third indexes of a second input address and the first indexes of the first input address. The first walk cache hit level is obtained by looking up the page table walk cache using the third indexes. The page table walker further compares a second matching level with a second walk cache hit level. The second matching level is the level of matching between fourth indexes of each of second intermediate addresses, which are indicated by the third indexes of the second input address, and the second indexes of each of the first intermediate addresses. The second walk cache hit level is obtained by looking up the page table walk cache using the fourth indexes.

100: electronic device

1000: system on chip (SoC)

1100: core

1100_1: first core/core

1100_2: second core/core

1100_3: third core/core

1100_4: fourth core/core

1110: fetch unit

1120: decode unit

1130: register rename unit

1140: issue/retire unit

1150: arithmetic logic unit (ALU)

1160: floating-point unit (FPU)

1170: branch check unit

1180: load/store unit

1190: L2 cache

1200: memory management unit (MMU)

1200_1: first memory management unit

1200_2: second memory management unit

1200_3: third memory management unit

1200_4: fourth memory management unit

1210: translation lookaside buffer (TLB)

1220: page table walker

1221: page table walk scheduler

1222: hazard/replay controller

1223, 1224: walkers

1225: redundant walk detector

1226: second redundant walk detector

1230: page table walk cache

1241: translation table base register (TTBR)

1242: virtual translation table base register (VTTBR)

1300: cache memory

1400: bus

2000: main memory

AP1: application program/first application program

AP2: application program/second application program

AP3: application program/third application program

AP4: application program/fourth application program

IA0, IA1: input addresses

L0, L1, L2, L3: levels

S103, S106, S109, S113, S116, S119, S123, S126, S129, S133, S136, S139, S143, S146, S203, S206, S209, S213, S216, S219, S223, S226, S229, S233, S236, S239, S243, S246, S249, S253, S256, S259, S263, S266, S269, S273, S276, S279: operations

S1WC: page table walk cache of the first stage

S2WC: page table walk cache of the second stage

OS1: first operating system

OS2: second operating system

FIG. 1 is a block diagram of an electronic device according to an embodiment of the present invention.

FIG. 2 is a block diagram of any one of the first to fourth cores in the SoC of FIG. 1.

FIG. 3 illustrates a main memory and application programs and an operating system executable by the SoC of FIG. 1.

FIG. 4 illustrates a mapping between virtual address spaces and a physical address space of the application programs of FIG. 3.

FIG. 5 illustrates an operation in which the page table walker of FIG. 2 performs a page table walk.

FIG. 6 illustrates a main memory and application programs and operating systems executable by the SoC of FIG. 1.

FIG. 7 illustrates a mapping between virtual address spaces and a physical address space of the application programs of FIG. 6.

FIGS. 8A and 8B are flowcharts of an operation in which the page table walker of FIG. 2 performs a page table walk based on a first stage and a second stage.

FIG. 9 illustrates a detailed block diagram and an operation of the page table walker of FIG. 2.

FIG. 10 illustrates another detailed block diagram and operation of the page table walker of FIG. 2.

FIG. 11 illustrates another detailed block diagram and operation of the page table walker of FIG. 2.

FIG. 12 illustrates another detailed block diagram and operation of the page table walker of FIG. 2.

FIG. 13 is a flowchart in which the page table walker of FIG. 2 performs a page table walk to translate a virtual address into a physical address.

FIGS. 14A and 14B are flowcharts of an operation in which the page table walker of FIG. 2 performs a page table walk of a first stage to translate a virtual address into an intermediate physical address and performs a page table walk of a second stage to translate the intermediate physical address into a physical address.

FIG. 1 is a block diagram of an electronic device according to an embodiment of the present invention. The electronic device 100 may include a system on chip (SoC) 1000 and a main memory 2000. The electronic device 100 may also be referred to as an "electronic system". For example, the electronic device 100 may be a desktop computer, a laptop computer, a workstation, a server, a mobile device, or the like. The SoC 1000 may be a single chip in which various (multiple different) systems are integrated, for example on a single integrated substrate and/or within an integrated housing.

The SoC 1000 may serve as an application processor (AP) that controls the overall operation of the electronic device 100. The SoC 1000 may include first to fourth cores 1100_1 to 1100_4 (each of which may also be referred to as a "processor" or a "central processing unit (CPU)"), a cache memory 1300, and a bus 1400. Although not shown in the drawings, the SoC 1000 may further include any other intellectual property (IP), such as a memory controller. Each of the first to fourth cores 1100_1 to 1100_4 may execute various software such as application programs, operating systems, and/or device drivers. The number of cores 1100_1 to 1100_4 shown in FIG. 1 is only an example, and the SoC 1000 may include one or more homogeneous or heterogeneous cores.

The first to fourth cores 1100_1 to 1100_4 may include first to fourth memory management units (MMUs) 1200_1 to 1200_4, respectively. The first to fourth MMUs 1200_1 to 1200_4 may translate virtual addresses in use into physical addresses that are used in hardware memory devices such as the cache memory 1300 in the SoC 1000, the main memory 2000 outside the SoC 1000, and/or an auxiliary memory (not shown) outside the SoC 1000. The first to fourth MMUs 1200_1 to 1200_4 may translate virtual addresses into physical addresses while the first to fourth cores 1100_1 to 1100_4 execute first to fourth software, respectively. The first to fourth MMUs 1200_1 to 1200_4 may manage address translation information (e.g., a translation table) between virtual addresses and physical addresses. The first to fourth MMUs 1200_1 to 1200_4 may allow application programs to have private (dedicated) virtual memory spaces and may allow the first to fourth cores 1100_1 to 1100_4 to perform multiple tasks.

The cache memory 1300 may be connected to each of the first to fourth cores 1100_1 to 1100_4 and may be shared by them. For example, the cache memory 1300 may be implemented using registers, flip-flops, static random access memory (SRAM), or a combination thereof. For the first to fourth cores 1100_1 to 1100_4, the cache memory 1300 may have a faster access speed than the main memory 2000. The cache memory 1300 may store instructions, data, addresses, address translation information, and the like that are for and/or associated with the first to fourth cores 1100_1 to 1100_4.

The bus 1400 may connect the internal IPs of the SoC 1000 (e.g., the cores 1100_1 to 1100_4 and the cache memory 1300) or may provide the internal IPs of the SoC 1000 with an access path to the main memory 2000. The bus 1400 may be of an Advanced Microcontroller Bus Architecture (AMBA) standard bus protocol type. The AMBA bus type may be an Advanced High-Performance Bus (AHB), an Advanced Peripheral Bus (APB), or an Advanced eXtensible Interface (AXI).

The main memory 2000 may communicate with the SoC 1000. The main memory 2000 may provide the first to fourth cores 1100_1 to 1100_4 with a larger capacity than the cache memory 1300. The main memory 2000 may store instructions, data, addresses, address translation information, and the like provided from the SoC 1000. For example, the main memory 2000 may be a dynamic random access memory (DRAM). In an embodiment, in addition to the main memory 2000, the electronic device 100 may further include any other hardware memory device (not shown) that communicates with the SoC 1000, such as a solid state drive (SSD), a hard disk drive (HDD), or a memory card.

FIG. 2 is a block diagram of any one of the first to fourth cores in the SoC of FIG. 1. The core 1100 may be any one of the first to fourth cores 1100_1 to 1100_4 of FIG. 1. The core 1100 may include a fetch unit 1110, a decode unit 1120, a register rename unit 1130, an issue/retire unit 1140, an arithmetic logic unit (ALU) 1150, a floating-point unit (FPU) 1160, a branch check unit 1170, a load/store unit 1180, an L2 cache 1190, and an MMU 1200. All components of the core 1100 (including the detailed components of the MMU 1200) may be implemented in hardware using analog circuits, digital circuits, logic circuits, clock circuits, flip-flops, registers, and the like.

The fetch unit 1110 may fetch an instruction with reference to a memory address stored in a program counter (not shown), which tracks the memory address of the instruction, and may store the fetched instruction in an instruction register (not shown). For example, the instruction may be stored in a memory such as a cache memory (not shown) in the core 1100, the cache memory 1300, or the main memory 2000. The decode unit 1120 may decode the instruction stored in the instruction register and may determine which instruction is to be executed so that the instruction is executed. The register rename unit 1130 may map logical registers designated by instructions onto physical registers in the core 1100. The register rename unit 1130 may map logical registers designated by consecutive instructions onto different physical registers and may thereby remove dependences between instructions. The issue/retire unit 1140 may control when a decoded instruction is issued (or dispatched) to pipelines and when a returned result is retired.

The ALU 1150 may perform an arithmetic operation, a logical operation, or a shift operation based on the dispatched instruction. The ALU 1150 may be provided from a memory with the operation code, operands, and the like that the operation requires. The FPU 1160 may perform floating-point operations. The branch check unit 1170 may check the prediction of the branch direction of a branch instruction in order to improve pipeline flow. The load/store unit 1180 may execute load and store instructions, may generate the virtual addresses used in load and store operations, and may load data from, or store data in, the L2 cache 1190, the cache memory 1300, or the main memory 2000.

An MMU is a component of a core such as the core 1100. The MMU 1200 may be any one of the first to fourth MMUs 1200_1 to 1200_4 of FIG. 1. The MMU 1200 may include a translation lookaside buffer (TLB) 1210, a page table walker 1220, a page table walk cache 1230, a translation table base register (TTBR) 1241, and a virtual translation table base register (VTTBR) 1242. The page table walker 1220 is described below and may be implemented as a unit similar to the other units of the core 1100. The page table walker 1220 may be implemented as, or may include, a unit that performs logical operations, including operations of fetching or initiating a fetch by the core 1100 and operations of comparing or initiating a comparison by the core 1100. Recently accessed page translations may be cached in the TLB 1210. For every memory access performed by the core 1100, the MMU 1200 may check whether the translation for the given virtual address is cached in the TLB 1210. The TLB 1210 may store multiple entries, each divided into a tag and data. For example, information about a virtual address may be located in the tag, and information about a physical address may be located in the data. When the translation (mapping information) for the virtual address is cached in the TLB 1210 (a TLB hit), the translation is immediately available. When no valid translation for the virtual address exists in the TLB 1210 (a TLB miss), the translation for the virtual address should be updated in the TLB 1210 through a page table walk, which involves searching page tables stored in the cache memory 1300 and/or the main memory 2000. A page table may be a data structure that stores a mapping between virtual addresses and physical addresses.
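The TLB hit and miss handling described above can be sketched in Python as follows. This is a simplified software model, not the hardware of the MMU 1200: the class and function names are hypothetical, the 4 KiB page size (`PAGE_SHIFT = 12`) is an assumption, and the page table walk is passed in as a plain callable.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages; the page size is not given in the text
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class TLB:
    """Toy TLB: the tag is the virtual page number, the data is the physical page number."""

    def __init__(self):
        self.entries = {}

    def lookup(self, va):
        ppn = self.entries.get(va >> PAGE_SHIFT)
        if ppn is None:
            return None  # TLB miss: no valid translation is cached
        return (ppn << PAGE_SHIFT) | (va & OFFSET_MASK)  # TLB hit

    def update(self, va, pa):
        self.entries[va >> PAGE_SHIFT] = pa >> PAGE_SHIFT


def translate(tlb, va, page_table_walk):
    """Use the cached translation on a hit; on a miss, walk and update the TLB."""
    pa = tlb.lookup(va)
    if pa is None:
        pa = page_table_walk(va)  # hypothetical callable standing in for the walker
        tlb.update(va, pa)
    return pa
```

In this model a second access to the same page is served from the TLB and does not invoke the walk callable again.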

The page table walker 1220 may perform a page table walk for a virtual address that is not found in the TLB 1210. The page table walker 1220 may "walk" (look up) the page tables to translate the virtual address into a physical address. The page table walker 1220 may fetch address translation information about the virtual address from the page tables stored in the cache memory 1300 or the main memory 2000.

The page table walk cache 1230 may cache or store partial or full address translation information of virtual addresses. For example, the page tables may be constructed hierarchically. The page table walker 1220 may access or look up the page tables in order (sequentially), may fetch partial address translation information from the page tables, and may store the fetched information in the page table walk cache 1230. In addition, the page table walker 1220 may skip accessing or looking up some of the page tables stored in the cache memory 1300 or the main memory 2000, and may accelerate the page table walk by looking up partial address translation information that was previously (already) cached in the page table walk cache 1230.
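The way cached partial translations let a walk skip levels can be sketched as follows. The sketch is only an illustration under assumptions: the walk cache is modeled as a dictionary keyed by index prefixes, memory is modeled as a dictionary of tables, and all names are hypothetical.

```python
def cached_start(walk_cache, indexes):
    """Return (levels_skipped, base) for the longest cached index prefix."""
    for depth in range(len(indexes), 0, -1):
        base = walk_cache.get(tuple(indexes[:depth]))
        if base is not None:
            return depth, base
    return 0, None


def walk_with_cache(memory, root_base, indexes, walk_cache):
    """Walk the hierarchy, skipping levels already covered by the walk cache.

    memory maps a table base address to {index: output address}.
    Returns the final output address and the number of table fetches made.
    """
    skipped, base = cached_start(walk_cache, indexes)
    if base is None:
        base = root_base
    fetches = 0
    for level in range(skipped, len(indexes)):
        base = memory[base][indexes[level]]  # one page table access
        fetches += 1
        walk_cache[tuple(indexes[: level + 1])] = base  # store partial translation
    return base, fetches
```

In this model, a second walk that shares its leading indexes with an earlier walk fetches only the levels that differ.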

The TTBR 1241 may store a base address that indicates a page table. The VTTBR 1242 may also store a base address that indicates a page table. The values of the base addresses stored in the TTBR 1241 and the VTTBR 1242 may vary depending on the software (e.g., an application program or an operating system) executable by the core 1100.

FIG. 3 illustrates a main memory and application programs and an operating system executable by the SoC of FIG. 1. FIG. 4 illustrates a mapping between virtual address spaces and a physical address space of the application programs of FIG. 3. FIGS. 3 and 4 will be described together.

Referring to FIG. 3, the operating system may manage hardware including the SoC 1000 and the main memory 2000, and software including the application program AP1 and/or the application program AP2. The operating system may run so as to allow the application program AP1 and/or the application program AP2 to be executed on the SoC 1000 and the main memory 2000. The number of application programs AP1 and AP2 shown in FIG. 3 is only an example. Referring to FIG. 4, when the first application program AP1 is executed, the operating system may map the virtual address (VA) space of its process to a physical address (PA) space. When the second application program AP2 is executed, the operating system may likewise map the virtual address space of its process to the physical address space. By managing these mappings, the operating system may efficiently use the limited memory capacity installed on the hardware.

FIG. 5 illustrates an operation in which the page table walker of FIG. 2 performs a page table walk. The page table walker 1220 may receive a virtual address from the load/store unit 1180. The virtual address that the page table walker 1220 receives may be an address that was looked up in the TLB 1210 without a hit (i.e., a TLB miss address). A multi-bit portion (e.g., K bits, where "K" is a natural number) of the virtual address may be divided into an L0 index, an L1 index, an L2 index, an L3 index, and an offset area. The indexes of the virtual address may be divided according to the levels L0 to L3. In addition, the page tables may be divided or hierarchically constructed according to the levels L0 to L3. Accordingly, the indexes may reflect segments of the multi-bit portion that each have a different weight, and the page tables may be arranged in a hierarchy constructed to correspond to the weights of the segments of the multi-bit portion of the virtual address. In FIG. 5, the number of levels, the number of indexes, and the number of page tables are only examples. The page table walker 1220 may sequentially look up the page tables that are hierarchically constructed according to the levels L0 to L3. With regard to the search order, "L0" may be the first level, and "L3" may be the last level.
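The division of a virtual address into level indexes and an offset can be sketched as follows. The bit widths are assumptions chosen for illustration (a 12-bit offset and four 9-bit indexes, giving K = 48); the text only states that a K-bit portion is divided into the L0 to L3 indexes and an offset area. The function name is hypothetical.

```python
OFFSET_BITS = 12  # assumed width of the offset area
INDEX_BITS = 9    # assumed width of each level index
LEVELS = 4        # L0 to L3


def split_virtual_address(va):
    """Return ([L0, L1, L2, L3] indexes, offset) for a virtual address."""
    offset = va & ((1 << OFFSET_BITS) - 1)
    indexes = []
    for level in range(LEVELS):
        # L0 occupies the most significant segment, L3 the least significant one.
        shift = OFFSET_BITS + INDEX_BITS * (LEVELS - 1 - level)
        indexes.append((va >> shift) & ((1 << INDEX_BITS) - 1))
    return indexes, offset
```

Because the L0 index sits in the most significant segment, two addresses that are close together share their L0, L1, and often L2 indexes, which is what makes partial translations reusable across walks.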

First, the page table walker 1220 may look up, among the entries of the L0 page table indicated by the base address stored in the TTBR 1241, the entry indicated by the L0 index of the virtual address. The L0 page table (PT) may be indexed by the L0 index. The descriptor stored in each entry may include attributes and an output address (marked with dark shading). For example, the attributes may include permission bits, access bits, dirty bits, secure bits, and the like associated with the output address. The page table walker 1220 may fetch the descriptor contained in the entry indicated by the L0 index of the virtual address, and may store or update partial information of the descriptor (i.e., partial address translation information for the L0 index of the virtual address) in the page table walk cache 1230.

The page table walker 1220 may look up, among the entries of the L1 page table indicated by the L0 output address of the descriptor fetched from the L0 page table, the entry indicated by the L1 index of the virtual address. The page table walker 1220 may fetch the descriptor contained in that entry, and may store or update partial information of the descriptor (i.e., partial address translation information for the L1 index of the virtual address) in the page table walk cache 1230.

The page table walker 1220 may look up, among the entries of the L2 page table indicated by the L1 output address of the descriptor fetched from the L1 page table, the entry indicated by the L2 index of the virtual address. The page table walker 1220 may fetch the descriptor contained in that entry, and may store or update partial information of the descriptor (i.e., partial address translation information for the L2 index of the virtual address) in the page table walk cache 1230.

The page table walker 1220 may look up, among the entries of the L3 page table indicated by the L2 output address of the descriptor fetched from the L2 page table, the entry indicated by the L3 index of the virtual address. The page table walker 1220 may fetch the descriptor contained in that entry, and may store or update partial information of the descriptor (i.e., partial address translation information for the L3 index of the virtual address) in the page table walk cache 1230. In addition, because the level corresponding to the L3 index and the L3 page table is the last level, the page table walker 1220 may also store the descriptor in the TLB 1210.

The MMU 1200 may look up, in the page indicated by the L3 output address of the descriptor fetched from the L3 page table, the page indicated by the offset of the virtual address, and may calculate the final physical address (e.g., final physical address = L3 output address + offset). When the mapping between the virtual address and the L3 output address of the L3 page table (i.e., the final translation) is cached in the TLB 1210, the MMU 1200 may immediately calculate the final physical address using the offset and the output address cached in the TLB 1210, and may return the final physical address to the load/store unit 1180.

In an embodiment, the page table walker 1220 may perform a page table walk for one virtual address and then perform a page table walk for another virtual address. While the page table walk for the one virtual address is performed, partial address translation information may already have been stored in the page table walk cache 1230. When partial address translation information for some of the indexes of the other virtual address is stored in the page table walk cache 1230, the page table walker 1220 may skip the operation of fetching a descriptor at a particular level. For example, when the partial address translation information for the L0 index is already stored in the page table walk cache 1230 (i.e., when a hit occurs in the page table walk cache), the page table walker 1220 may skip the lookup of the L0 page table. The page table walker 1220 may perform the operations of the remaining L1, L2, and L3 levels in the same manner as the operation of the L0 level described above.
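The level-by-level walk and the skipping of levels on a walk cache hit can be sketched as follows. This is a simplified software model under stated assumptions, not the hardware described here: page tables are modeled as a dictionary keyed by (table base, index), the walk cache is keyed by the tuple of indexes consumed so far, and every address value other than the L0 output address 0x100 is hypothetical.

```python
def walk(memory, walk_cache, ttbr, indexes, offset):
    """Translate one address; return (physical_address, descriptor_fetches).

    memory:     dict mapping (table_base, index) -> output address (model only)
    walk_cache: dict mapping a tuple of leading indexes -> output address
    """
    base, start = ttbr, 0
    # Find the deepest level whose partial translation is already cached.
    for level in range(len(indexes), 0, -1):
        cached = walk_cache.get(tuple(indexes[:level]))
        if cached is not None:
            base, start = cached, level
            break
    fetches = 0
    # Fetch descriptors only for the levels that missed in the walk cache.
    for level in range(start, len(indexes)):
        base = memory[(base, indexes[level])]
        fetches += 1
        walk_cache[tuple(indexes[:level + 1])] = base
    # Final physical address = last-level output address + offset.
    return base + offset, fetches
```

In this model a second walk that shares its L0 and L1 indexes with an earlier walk starts at the L2 page table, so it fetches two descriptors instead of four.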

FIG. 6 illustrates the main memory together with applications and operating systems executable by the SoC of FIG. 1. FIG. 7 illustrates the mapping between the virtual address spaces and the physical address space of the applications shown in FIG. 6. FIGS. 6 and 7 are described together, and the description focuses on the differences between the embodiment of FIGS. 6 and 7 and the embodiment of FIGS. 3 and 4.

Referring to FIG. 6, a first operating system may manage hardware including the SoC 1000 and the main memory 2000, and software including application AP1 and/or application AP2. A second operating system may manage the same hardware including the SoC 1000 and the main memory 2000, and software including application AP3 and/or application AP4. An additional software layer, namely a hypervisor, may exist between the first operating system, the second operating system, and the hardware. The hypervisor may be used to operate two or more operating systems using limited hardware resources.

Referring to FIG. 7, when the first application AP1 is executed, the first operating system OS1 may map the virtual address space of the process to an intermediate physical address (IPA) space. When the second application AP2 is executed, the first operating system OS1 may likewise map the virtual address space of the process to the intermediate physical address space. Similarly, when the third application AP3 is executed, the second operating system OS2 may map the virtual address space of the process to the intermediate physical address space, and it may do the same when the fourth application AP4 is executed. Each of the first operating system OS1 and the second operating system OS2 may manage the first stage of address translation, between virtual addresses and intermediate physical addresses. The hypervisor may manage the second stage of address translation, between intermediate physical addresses and physical addresses. Compared with the case shown in FIG. 4, using a hypervisor in a computer system provides the capability of a second stage of address translation, among the other features described above.

FIGS. 8A and 8B are flowcharts illustrating an operation in which the page table walker of FIG. 2 performs a page table walk based on a first stage and a second stage. FIGS. 8A and 8B are described together. In FIGS. 8A and 8B, "S", "L", and "PT" denote stage, level, and page table, respectively. The page table walker 1220 may receive, from the load/store unit 1180, a virtual address that was looked up in the TLB 1210. The indexes of the virtual address may be divided according to levels L0 to L3. The page tables may be divided into a first stage S1 and a second stage S2, and within each stage may be divided into levels L0 to L3 or hierarchically structured according to levels L0 to L3. As described with reference to FIGS. 6 and 7, a hypervisor may be used for virtualization. The page table walker 1220 may calculate an S1L0 intermediate physical address (IPA) (also called an "intermediate address") by adding the base address stored in the TTBR 1241 to the L0 index of the virtual address.

The page table walker 1220 may look up, among the entries of the S2L0 page table indicated by the base address stored in the VTTBR 1242, the entry indicated by the L0 index of the S1L0 intermediate physical address, may fetch the descriptor contained in that entry, and may store partial information of the descriptor (i.e., partial address translation information for the L0 index of the S1L0 intermediate physical address) in the page table walk cache 1230. The page table walker 1220 may look up, among the entries of the S2L1 page table indicated by the S2L0 output address, the entry indicated by the L1 index of the S1L0 intermediate physical address, may fetch the descriptor contained in that entry, and may store partial information of the descriptor (i.e., partial address translation information for the L1 index of the S1L0 intermediate physical address) in the page table walk cache 1230. In the same manner as for the S2L1 page table, the page table walker 1220 may perform the operations associated with the S2L2 and S2L3 page tables indicated by the S2L1 and S2L2 output addresses, respectively. The page table walker 1220 may look up, among the entries of the S1L0 page table indicated by the S2L3 output address of the descriptor fetched from the S2L3 page table, the entry indicated by the offset of the S1L0 intermediate physical address, may fetch the descriptor contained in that entry, and may store partial information of the descriptor (i.e., partial address translation information for the offset of the S1L0 intermediate physical address) in the page table walk cache 1230.

The page table walker 1220 may calculate an S1L1 intermediate physical address by adding the S1L0 output address fetched from the S1L0 page table to the L1 index of the virtual address. In the same manner as the second-stage page table walk performed on the S1L0 intermediate physical address, the page table walker 1220 may perform a second-stage page table walk on the S1L1 intermediate physical address. Likewise, the page table walker 1220 may perform a second-stage page table walk on each of the S1L2 intermediate physical address, the S1L3 intermediate physical address, and the final intermediate physical address. A second-stage page table walk refers to the operation of looking up the S2L0 to S2L3 page tables and fetching descriptors, and a first-stage page table walk refers to the operation of looking up the S1L0 to S1L3 page tables and fetching descriptors.

The page table walker 1220 may calculate the S1L0 intermediate physical address by adding the base address stored in the TTBR 1241 to the L0 index of the virtual address, and may perform a second-stage page table walk on the S1L0 intermediate physical address. The page table walker 1220 may calculate the S1L1 intermediate physical address by adding the S1L0 output address to the L1 index of the virtual address, and may perform a second-stage page table walk on the S1L1 intermediate physical address. The page table walker 1220 may calculate the S1L2 intermediate physical address by adding the S1L1 output address to the L2 index of the virtual address, and may perform a second-stage page table walk on the S1L2 intermediate physical address. The page table walker 1220 may calculate the S1L3 intermediate physical address by adding the S1L2 output address to the L3 index of the virtual address, and may perform a second-stage page table walk on the S1L3 intermediate physical address. Finally, the page table walker 1220 may calculate the final intermediate physical address by adding the S1L3 output address to the offset of the virtual address, and may perform a second-stage page table walk on the final intermediate physical address. After performing the second-stage page table walk on the final intermediate physical address, the page table walker 1220 may store the last fetched descriptor in the page table walk cache 1230. The page table walker 1220 may also store the last fetched descriptor in the TLB 1210 as the final result. The above operation of the page table walker 1220 may be called a "nested walk".

The MMU 1200 may look up, in the page indicated by the S2L3 output address of the descriptor fetched from the S2L3 page table, the page indicated by the offset of the virtual address, and may obtain the physical address from the page thus found (e.g., final physical address = S2L3 output address + offset). That is, when the mapping between the virtual address and the S2L3 output address (i.e., the final translation) is cached in the TLB 1210, the MMU 1200 may immediately calculate the physical address using the offset and the output address cached in the TLB 1210, and may return the physical address.

FIGS. 8A and 8B show an example in which the number of levels per stage is 4 and the number of stages is 2, but the teachings of the present disclosure are not limited thereto. For example, the number of levels of the first stage may be "m" (where "m" is a natural number of 1 or greater), and the number of levels of the second stage may be "n" (where "n" is a natural number of 1 or greater). When the page table walker 1220 performs a page table walk for a virtual address under a TLB miss and a page table walk cache miss, the number of times a descriptor is fetched from a page table may be "(m+1)×(n+1)-1". Of course, while performing the first-stage and second-stage page table walks, the page table walker 1220 may skip descriptor fetches by referring to the partial address translation information stored in the page table walk cache 1230.
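The fetch count "(m+1)×(n+1)-1" can be checked by counting the steps of the nested walk directly. The sketch below assumes no walk cache hits; the function name `nested_walk_fetches` is introduced here only for illustration.

```python
def nested_walk_fetches(m, n):
    """Count descriptor fetches of a nested walk with no walk cache hits.

    m: number of levels in the first stage (S1)
    n: number of levels in the second stage (S2)
    """
    fetches = 0
    for _ in range(m):
        fetches += n  # second-stage walk on this level's intermediate address
        fetches += 1  # fetch of the first-stage descriptor itself
    fetches += n      # second-stage walk on the final intermediate address
    return fetches
```

Each of the m first-stage levels costs n + 1 fetches and the final intermediate physical address costs n more, giving m(n+1) + n = (m+1)(n+1) - 1. For the example of FIGS. 8A and 8B (m = n = 4), that is 24 descriptor fetches.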

FIGS. 9 to 11 illustrate detailed block diagrams and operations of the page table walker of FIG. 2. FIGS. 9 to 11 are described together. In FIGS. 9 to 11, it is assumed that the page table walker performs the page table walk described with reference to FIGS. 3 to 5.

The page table walker 1220 may include a page table walk scheduler 1221, walkers 1223 and 1224, and a redundant walk detector 1225. All components of the page table walker 1220 may be implemented in hardware using analog circuits, digital circuits, logic circuits, clock circuits, flip-flops, registers, and the like. In other words, whether implemented as a processor/memory combination that stores and executes software instructions (e.g., a microprocessor/memory) or as a logic circuit such as an application-specific integrated circuit, the page table walker 1220 may accurately be referred to as a page table walker circuit. The page table walk scheduler 1221 may receive one or more input addresses (virtual addresses) that were not found in the TLB 1210. The page table walk scheduler 1221 may manage entries, each of which stores or contains the L0 to L3 indexes of an input address, a hazard bit, hazard level bits, and hazard identification (ID) bits. Information associated with a walk request having an input address may be entered into each entry of the page table walk scheduler 1221.

A hazard/replay controller 1222 may check or identify the hazard bit, hazard level bits, and hazard ID bits of each entry, and may provide the input address stored in each entry, or the information associated with the walk request having that input address, to either of the walkers 1223 and 1224. Each of the walkers 1223 and 1224 may perform a page table walk for an input address provided from the page table walker 1220 and may fetch an output address. The input address may be a virtual address, and each output address fetched by each of the walkers 1223 and 1224 may be a physical address. Unlike the illustration in FIG. 9, the number of walkers may be greater than two, and the page table walker 1220 may perform two or more page table walks in parallel or simultaneously.

The redundant walk detector 1225 may calculate a match level between the input address of a page table walk that has already been determined to be performed by the walkers 1223 and 1224 and the input address of a page table walk for which it has not yet been determined whether execution is to continue. The match level indicates how closely the indexes of one input address match the indexes of another input address. Because the similarity between input addresses increases as the match level increases, the results of the respective page table walks for those input addresses may be similar to each other and may be duplicated (or redundant). The match level may also be called a "redundancy hit level". The match level may be calculated by the redundant walk detector 1225 or by the page table walk scheduler 1221.

The redundant walk detector 1225 may manage entries that store or contain the input addresses provided to the walkers 1223 and 1224. For example, the input address of an entry of the page table walk scheduler 1221 may be provided, without modification, to an entry of the redundant walk detector 1225. The redundant walk detector 1225 may obtain and store a walk cache hit level by looking up the page table walk cache 1230 using the indexes of the input address stored in each entry. The redundant walk detector 1225 may compare the walk cache hit level against the match level between the indexes of input addresses in order to detect and predict, in advance, redundancy in the page table walk for an input address. Avoiding redundancy in this way improves efficiency and avoids unnecessary power consumption and unnecessary processing, which illustrates the practical significance of using the walk cache hit level. In addition, the redundant walk detector 1225 may obtain and store a walk cache level, which is updated when the walkers 1223 and 1224 store, in the page table walk cache 1230, each output address indicated by the corresponding index of the input address.

When the descriptor indicated by an index has already been scheduled to be fetched from the memory storing the page tables, or is already stored in the page table walk cache 1230, that descriptor does not need to be fetched from memory again. The redundant walk detector 1225 may compare the match level with the walk cache hit level and may mark the hazard bit based on the comparison result. The redundant walk detector 1225 may thereby detect and predict, in advance, redundancy in the page table walk for an input address. Redundancy in the page table walk for an input address means that at least part of the page table lookups is redundant: namely, the lookups that use those indexes of the input address that match the indexes of the input address of another page table walk already determined to be performed. The page table walker 1220 may perform page table walks in which no redundancy exists in preference to page table walks in which redundancy exists, thereby improving the performance of the SoC 1000 and reducing its power consumption. The redundant walk detector 1225 may compare the match level with the walk cache level and may clear a marked hazard bit based on the comparison result. A method of detecting redundancy in page table walks is described more fully below.

Referring to FIG. 9, assume that input addresses are entered into entry 0 and entry 1 of the page table walk scheduler 1221, respectively, that the hazard bit, hazard level bits, and hazard ID bits are cleared, and that the result of a previously performed page table walk (i.e., address translation information) is stored in entry 0 of the page table walk cache 1230. The number of entries is not limited to the examples shown in FIGS. 9 to 11. In FIGS. 9 to 11, the valid bit of each valid entry among the entries is marked with "Y".

The page table walk scheduler 1221 may assign the input address IA0 of entry 0 to the walker 1223 (which is in a wait state), and the walker 1223 may perform a page table walk for the input address IA0. The walker 1223 may check (or determine) whether the output addresses indicated by the L0 index 0x12, the L1 index 0x23, the L2 index 0x34, and the L3 index 0x78 are found in the page table walk cache 1230. Referring to FIG. 9, the output address 0x100 indicated by the L0 index 0x12 is already stored in the page table walk cache 1230 (an L0-level hit occurs in the page table walk cache 1230). By looking up the page table walk cache 1230 using the L0 index 0x12, the L1 index 0x23, the L2 index 0x34, and the L3 index 0x78 of the input address IA0, the redundant walk detector 1225 may determine that the walk cache hit level of the input address IA0 is "L0". In addition, because the output address 0x100 indicated by the L0 index 0x12 is stored in the page table walk cache 1230, the redundant walk detector 1225 may mark the walk cache level of the input address IA0 as "L0" (Y). Because the output address 0x100 indicated by the L0 index 0x12 is already stored in the page table walk cache 1230, the operation of fetching the output address 0x100 from memory may be skipped. However, because the output address indicated by the L1 index 0x23 is not stored in the page table walk cache 1230 (i.e., a miss occurs in the page table walk cache 1230), the walker 1223 may begin fetching from memory the output address indicated by the L1 index 0x23.

Referring to FIG. 10, the page table walk scheduler 1221 may assign the input address IA1 of entry 1 to the walker 1224. The walker 1224 may perform a page table walk for the input address IA1. The walker 1224 may check whether the output addresses indicated by the L0 index 0x12, the L1 index 0x23, the L2 index 0x9A, and the L3 index 0xBC are found in the page table walk cache 1230. Referring to FIG. 10, the output address 0x100 indicated by the L0 index 0x12 is already stored in the page table walk cache 1230. By looking up the page table walk cache 1230 using the L0 index 0x12, the L1 index 0x23, the L2 index 0x9A, and the L3 index 0xBC, the redundant walk detector 1225 may determine that the walk cache hit level of the input address IA1 is "L0". In addition, because the output address 0x100 indicated by the L0 index 0x12 is stored in the page table walk cache 1230, the redundant walk detector 1225 may mark the walk cache level of the input address IA1 as "L0" (Y).

Because the output address 0x100 indicated by the L0 index 0x12 is already stored in the page table walk cache 1230, the operation of fetching the output address 0x100 from memory may be skipped. Because the output address indicated by the L1 index 0x23 is not stored in the page table walk cache 1230, the walker 1224 may begin fetching from memory the output address indicated by the L1 index 0x23.

If both walkers 1223 and 1224 fetch from memory the output address indicated by the L1 index 0x23, they fetch the same output address, so their operations are redundant and duplicated. Because the walker 1223 is the first to begin fetching from memory the output address indicated by the L1 index 0x23, the operation in which the walker 1224 fetches that same output address from memory is the redundant, duplicated one. The redundancy in the page table walk for the input address IA1 is the operation of fetching, from the L1 page table stored in memory, the output address indicated by the L1 index 0x23. Accordingly, redundancy in the page table walk to be performed by the walker 1224 may be predicted and/or detected so that it can be prevented.

To detect the redundancy of the page table walk to be performed by the walker 1224, the redundant walk detector 1225 may compare the input address IA0 with the input address IA1 in units of an index or a level, the indexes or levels being based on segments that are each a different portion of an input virtual address. An increase in the index level may reflect the granularity of the input virtual address, and the higher the degree of matching between the current input virtual address and an existing and/or previous input virtual address, the more redundancy in processing may be avoided, as described herein. The L0 index 0x12 and the L1 index 0x23 of the input address IA1 may match (be equal to) the L0 index 0x12 and the L1 index 0x23 of the input address IA0, respectively. The redundant walk detector 1225 may calculate that the match level between the input addresses IA0 and IA1 is "L1". In addition, the redundant walk detector 1225 may calculate that the walk cache hit level of the distinct input addresses IA0 and IA1, which are compared when the match level is calculated, is "L0". The redundant walk detector 1225 may compare the match level L1 between the input address IA0 and the input address IA1 with the walk cache hit level L0 of the input address IA1.

Because the match level L1 is greater (or higher) than the walk cache hit level L0, the redundant walk detector 1225 may mark the hazard bit (Y) of entry 1 of the page table walk scheduler 1221. The marked hazard bit indicates that the match level L1 of the input address IA1 is greater than the walk cache hit level L0, and that redundancy exists in the page table walk for the input address IA1. In the case where the hazard bit is marked, the page table walk being performed in the walker 1224 for the input address IA1 may be canceled. Instead, the walker 1224 may perform a page table walk for an input address stored in another entry (e.g., 2, 3, or 4) of the page table walk scheduler 1221. The redundant walk detector 1225 may prevent the walker 1224 from being used redundantly. Using the walk cache hit level in this way has a practical effect: redundancy is avoided, which in turn may improve efficiency and avoid unnecessary power consumption and unnecessary processing.
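
The comparison of the match level with the walk cache hit level can be sketched in Python. This is only an illustration under stated assumptions, not the hardware logic of the embodiment: the function names, the use of -1 to mean "no level", the dictionary model of the page table walk cache, and the L2 and L3 indexes of IA0 are all hypothetical.

```python
def match_level(a, b):
    """Highest level up to which the indexes of a and b agree, starting at L0; -1 if L0 differs."""
    level = -1
    for x, y in zip(a, b):
        if x != y:
            break
        level += 1
    return level


def hit_level(walk_cache, indexes):
    """Highest level whose index prefix (L0..Ln) is present in the walk cache; -1 if none."""
    level = -1
    for depth in range(1, len(indexes) + 1):
        if tuple(indexes[:depth]) not in walk_cache:
            break
        level = depth - 1
    return level


def is_redundant(in_flight, current, walk_cache):
    """Hazard condition: the match level exceeds the walk cache hit level."""
    return match_level(in_flight, current) > hit_level(walk_cache, current)


IA0 = [0x12, 0x23, 0x34, 0x45]  # L2/L3 indexes of IA0 are hypothetical
IA1 = [0x12, 0x23, 0x9A, 0xBC]  # indexes of IA1 as given for FIG. 10
walk_cache = {(0x12,): 0x100}   # only the L0 output address is cached

print(match_level(IA0, IA1))               # match level L1
print(hit_level(walk_cache, IA1))          # walk cache hit level L0
print(is_redundant(IA0, IA1, walk_cache))  # hazard bit would be marked
```

With these values the match level (L1) is higher than the hit level (L0), so the sketch reports redundancy, which mirrors the marking of the hazard bit for entry 1.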

In the above example, the page table walk is described as being canceled in the case where the hazard bit is marked while the walker 1224 performs the page table walk for the input address IA1. In another embodiment, the page table walk scheduler 1221 may first check whether the hazard bit of the input address IA1 is marked, and may then provide the input address IA1 to the walker 1224. In this case, the page table walk may be performed after the redundancy of the page table walk for the input address IA1 is removed (i.e., after the hazard bit is cleared).

The redundant walk detector 1225 may mark the hazard level bit of entry 1 of the page table walk scheduler 1221 as "1". Here, "1" indicates "L1" among the levels used to construct the page tables hierarchically, and is only an exemplary value. The hazard level bit may indicate the highest level of the matching indexes of the input addresses IA0 and IA1, or may indicate another level of the matching indexes of the input addresses IA0 and IA1. The redundant walk detector 1225 may mark the hazard ID of entry 1 of the page table walk scheduler 1221 as "0". The hazard ID may indicate which of the walkers 1223 and 1224 (the walker 1223 in the above example) performs the page table walk for the input address IA0, some of whose indexes are the same as indexes of the input address IA1.
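
The hazard information kept per scheduler entry can be pictured as a small record. The following Python sketch is hypothetical: the class name, field names, and the convention that hazard ID 0 refers to the walker 1223 are assumptions for illustration and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchedulerEntry:
    """One entry of a page table walk scheduler (illustrative model only)."""
    indexes: List[int]                  # L0..L3 indexes of the input address
    hazard: bool = False                # hazard bit
    hazard_level: Optional[int] = None  # level of the matching indexes
    hazard_id: Optional[int] = None     # which walker performs the overlapping walk

    def mark(self, level: int, walker_id: int) -> None:
        """Record that a walk for this entry would duplicate another walker's fetch."""
        self.hazard = True
        self.hazard_level = level
        self.hazard_id = walker_id

    def clear(self) -> None:
        """Reset all hazard information so the entry can be replayed."""
        self.hazard = False
        self.hazard_level = None
        self.hazard_id = None


entry1 = SchedulerEntry([0x12, 0x23, 0x9A, 0xBC])  # input address IA1
entry1.mark(level=1, walker_id=0)                  # hazard level "1", hazard ID "0"
```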

Referring to FIG. 11, the walker 1223 may complete fetching, from the memory, the output address 0x200 indicated by the L1 index 0x23 of the input address IA0, and may store the output address 0x200 in entry 1 of the page table walk cache 1230, thereby filling entry 1. Partial address translation information for the L1 index 0x23 of the input address IA0 may be stored and updated in the page table walk cache 1230. When the output address 0x200 is stored in entry 1 of the page table walk cache 1230, the redundant walk detector 1225 may update the walk cache levels of the input address IA0 and the input address IA1 to "L1". For example, the walk cache hit level may be calculated as the level to which the index of the input address corresponding to the most recently fetched output address belongs. Accordingly, the walk cache hit level used to reduce redundancy may be dynamically updated based on the operation of the redundant walk detector 1225.

Because the walk cache level L1 of the input address IA0 is updated, the redundant walk detector 1225 may compare the match level L1 of the input address IA1 with the walk cache level L1 of the input address IA0. Because the match level L1 is not greater than (i.e., is the same as) the walk cache hit level L1, the redundant walk detector 1225 may clear the hazard bit, the hazard level bit, and the hazard ID of entry 1 of the page table walk scheduler 1221, which contains the input address IA1.

When the hazard bit of entry 1 is cleared, the hazard/replay controller 1222 of the page table walk scheduler 1221 may provide the input address IA1 to the walker 1224 again. The walker 1224 may look up, in the page table walk cache 1230, the output addresses 0x100 and 0x200 indicated by the L0 index and the L1 index, and may then start fetching the output address indicated by the L2 index from the memory. The walker 1224 may replay or re-execute the page table walk for the input address IA1.
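
The clearing of the hazard and the replay point can be illustrated with a short Python sketch. It assumes a dictionary model of the page table walk cache keyed by index prefixes; the helper names and the -1 "no hit" convention are hypothetical.

```python
def hit_level(walk_cache, indexes):
    """Highest level whose index prefix (L0..Ln) is present in the walk cache; -1 if none."""
    level = -1
    for depth in range(1, len(indexes) + 1):
        if tuple(indexes[:depth]) not in walk_cache:
            break
        level = depth - 1
    return level


def should_clear_hazard(hazard_level, walk_cache, indexes):
    """The hazard is resolved once the hit level reaches the recorded match level."""
    return hazard_level <= hit_level(walk_cache, indexes)


def replay_start_level(walk_cache, indexes):
    """First level that still has to be fetched from memory when the walk is replayed."""
    return hit_level(walk_cache, indexes) + 1


IA1 = [0x12, 0x23, 0x9A, 0xBC]
walk_cache = {(0x12,): 0x100}                      # state of FIG. 10
before = should_clear_hazard(1, walk_cache, IA1)   # hazard still pending

walk_cache[(0x12, 0x23)] = 0x200                   # walker 1223 fills the L1 entry (FIG. 11)
after = should_clear_hazard(1, walk_cache, IA1)    # hazard can now be cleared
start = replay_start_level(walk_cache, IA1)        # replay resumes at L2
```

In this model the replayed walk for IA1 starts at L2, which matches the description that the walker 1224 fetches the output address indicated by the L2 index after finding 0x100 and 0x200 in the cache.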

When a hit for the L0 index occurs in the page table walk cache 1230, the lookup of the L0 page table is skipped, and the lookups of the remaining L1 to L3 page tables are performed. The walker 1223 fetches the output addresses indicated by the indexes of the input address IA0 by looking up the address translation information (the output address 0x100) stored in the page table walk cache 1230 and at least a portion of the page tables. While the walker 1223 fetches the output addresses, the redundant walk detector 1225 may compare the match level between the input address IA0 and the input address IA1 with the walk cache hit level of the input address IA1, and may detect the redundancy of the page table walk for the input address IA1. The page table walk scheduler 1221 may not provide the input address IA1 to the walkers 1223 and 1224 until the redundant walk detector 1225 clears the hazard bit of the input address IA1.

FIG. 12 shows a detailed block diagram of the page table walker shown in FIG. 2 and the entries managed by the page table walker. In FIG. 12, it is assumed that the page table walker performs the page table walk of the first stage and the page table walk of the second stage described with reference to FIGS. 6 to 8B. The description will focus on the differences between the embodiment of FIG. 12 and the embodiment of FIGS. 9 to 11.

The page table walker 1220 may include the redundant walk detector 1225, serving as a first redundant walk detector, and a second redundant walk detector 1226. The redundant walk detector 1225, as the first redundant walk detector, may be associated with the page table walk of the first stage, which translates a virtual address into an intermediate physical address. The second redundant walk detector 1226 may be associated with the page table walk of the second stage, which translates the intermediate physical address into a physical address.

Like the redundant walk detector 1225 described with reference to FIGS. 9 to 11, the first redundant walk detector may detect the redundancy of the page table walk of the first stage. The redundant walk detector 1225, as the first redundant walk detector, may compare a first match level, between an input address (a virtual address) such as a current input address and another input address such as a previous input address, with a first walk cache hit level obtained by looking up the page table walk cache 1230 by using the indexes of the input address. Based on the comparison result, the redundant walk detector 1225, as the first redundant walk detector, may mark a hazard bit, a first-stage hazard level bit, and a hazard ID bit.

Like the redundant walk detector 1225 described with reference to FIGS. 9 to 11, the second redundant walk detector 1226 may detect the redundancy of the page table walk of the second stage. The second redundant walk detector 1226 may compare a second match level, between an input address and another input address, with a second walk cache hit level obtained by looking up the page table walk cache 1230 by using the indexes of the input address. For example, the input address may be an intermediate physical address such as a current intermediate physical address, and the other input address may be another intermediate physical address such as a previous intermediate physical address. Based on the comparison result, the second redundant walk detector 1226 may mark a hazard bit, a second-stage hazard level bit, and a hazard ID bit. The hazard bit marked or cleared by the redundant walk detector 1225, as the first redundant walk detector, may be the same as or different from the hazard bit marked or cleared by the second redundant walk detector 1226.

The walker 1223 fetches an intermediate physical address, which is an output address fetched from the S1L0 to S1L3 page tables shown in FIGS. 8A and 8B and is indicated by the indexes of an input address such as a current input address. The walker fetches the intermediate physical address by looking up the address translation information of the first stage stored in the page table walk cache 1230 and at least a portion of the page tables of the first stage. While the walker 1223 fetches the intermediate physical address, the redundant walk detector 1225, as the first redundant walk detector, may compare the first match level with the first walk cache hit level and may detect the redundancy of the page table walk for the input address. In addition, the walker 1223 fetches a physical address, which is an output address fetched from the S2L0 to S2L3 page tables shown in FIGS. 8A and 8B and is indicated by the indexes of the intermediate physical address. The walker 1223 fetches the physical address by looking up the address translation information of the second stage stored in the page table walk cache 1230 and at least a portion of the page tables of the second stage. While the walker 1223 fetches the physical address, the second redundant walk detector 1226 may compare the second match level between intermediate physical addresses with the second walk cache hit level and may detect the redundancy of the page table walk for the intermediate physical address. Each of the walkers 1223 and 1224 may perform a page table walk for an input address provided from the page table walker 1220 and may fetch output addresses. For example, the input address may be a virtual address, and each of the output addresses may be an intermediate physical address. As another example, the input address may be an intermediate physical address, and each of the output addresses may be a physical address.

FIG. 13 is a flowchart in which the page table walker shown in FIG. 2 performs a page table walk to translate a virtual address into a physical address, as described with reference to FIG. 5. In operation S103, the page table walker 1220 may receive a virtual address (i.e., an input address) after a TLB miss. The input address received by the page table walker 1220 is an address that was not found in the TLB 1210. The MMU 1200 may look up the TLB 1210 by using the input address and a context. For ease of description, the input address is shown in FIGS. 5, 8A, and 8B as including indexes and an offset, but the input address may further include a context. For example, the context may be information such as an address space ID (ASID), a privilege level, non-secure, and a virtual machine ID (VMID).
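
A TLB lookup that uses both the input address and the context can be sketched as follows. The sketch is a software analogy under stated assumptions: the 4 KiB page size, the `Context` field layout, the dictionary-based TLB, and the example values are hypothetical and are not specified by the embodiment.

```python
from typing import Dict, NamedTuple, Optional, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class Context(NamedTuple):
    """Context fields named in the description; the layout is illustrative."""
    asid: int
    vmid: int
    privilege_level: int
    non_secure: bool


def tlb_lookup(tlb: Dict[Tuple[Context, int], int],
               ctx: Context,
               virtual_address: int) -> Optional[int]:
    """Return the physical address on a TLB hit, or None on a miss.

    On a miss, the input address and context would be handed to the
    page table walker (operation S103).
    """
    vpn = virtual_address >> PAGE_SHIFT
    frame = tlb.get((ctx, vpn))
    if frame is None:
        return None
    offset = virtual_address & ((1 << PAGE_SHIFT) - 1)
    return (frame << PAGE_SHIFT) | offset


ctx_a = Context(asid=1, vmid=0, privilege_level=0, non_secure=True)
ctx_b = Context(asid=2, vmid=0, privilege_level=0, non_secure=True)
tlb = {(ctx_a, 0x12345): 0xABCDE}  # hypothetical cached translation

hit = tlb_lookup(tlb, ctx_a, 0x12345678)   # same page, same context: hit
miss = tlb_lookup(tlb, ctx_b, 0x12345678)  # same page, other context: miss
```

Keying the lookup on the context as well as the page number is what keeps translations of different address spaces or virtual machines from aliasing.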

In operation S106, the page table walker 1220 may allocate or provide the input address and the context to the page table walk (PTW) scheduler 1221. For example, as described with reference to FIG. 9, input addresses may be stored in respective entries of the page table walker 1220.

In operation S109, the page table walk scheduler 1221 may check whether the hazard bit of the input address is marked. When the hazard bit is marked (S109 = Yes), the page table walk scheduler 1221 may not allocate the input address to the walkers 1223 and 1224 until the hazard bit is cleared. The page table walk for the input address may not be performed until the hazard bit is cleared.

In operation S113, when the hazard bit is not marked (S109 = No) or is cleared, the page table walk scheduler 1221 may allocate the input address to any one of the walkers 1223 and 1224 (e.g., an idle walker that is not performing a page table walk). In addition, the page table walk scheduler 1221 may allocate the input address to the redundant walk detector (RWD) 1225.
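
Operations S109 and S113 amount to a dispatch rule: entries with a marked hazard bit are held, and the remaining entries go to idle walkers. The following Python sketch shows one possible reading; the entry layout, the function name, the extra address IA2, and the use of the reference numerals 1223 and 1224 as walker identifiers are assumptions made for illustration.

```python
def dispatch(entries, idle_walkers):
    """Assign entries whose hazard bit is clear to idle walkers, in entry order.

    Entries with a marked hazard bit are skipped (S109 = Yes) and stay
    in the scheduler until the bit is cleared.
    """
    assignments = {}
    idle = list(idle_walkers)
    for entry_id, entry in entries.items():
        if entry["hazard"]:
            continue          # hold the entry; no walker is consumed
        if not idle:
            break             # no idle walker left
        assignments[entry_id] = idle.pop(0)   # S113: allocate to an idle walker
    return assignments


entries = {
    0: {"address": "IA0", "hazard": False},
    1: {"address": "IA1", "hazard": True},   # waits for the hazard to clear
    2: {"address": "IA2", "hazard": False},  # IA2 is a hypothetical third address
}
result = dispatch(entries, [1223, 1224])
```

Because entry 1 is held, the second walker is free to serve entry 2 instead of repeating a fetch that the first walker already has in flight.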

In operation S116, the walker may check whether partial address translation information and full address translation information are stored in the page table walk cache 1230. For example, the walker may be any one of the walkers 1223 and 1224, the partial and full address translation information may be descriptors indicated by the indexes of an input address (e.g., a current input address), and the page table walk cache 1230 may be a first-stage page table walk cache S1WC. That is, in operation S116, the walker to which the input address is allocated may check whether partial and full address translation information associated with the input address and the context is stored in the page table walk cache 1230. The walker may identify the highest level, among the first-stage levels of the output addresses indicated by the indexes of the input address, that is stored in the page table walk cache 1230. The walker may check the first-stage levels of the output addresses that are indicated by the indexes of the input address and are stored in the page table walk cache 1230. When looking up the page table walk cache 1230, the walker may further refer to the context together with the index of each of the first-stage levels L0 to L3. For example, the walker may use the partial address translation information of an entry of the page table walk cache 1230 whose context and index match the requested context and index, respectively.

In operation S119, when a hit occurs in the page table walk cache 1230 (S116 = Yes), the walker may skip the operations of fetching the output addresses that are stored in the page table walk cache 1230 or indicated by the hit indexes. The walker may skip the operations of fetching output addresses up to the hit level of the first stage. For example, in the case where the current input address is the input address IA1 shown in FIG. 11, the walker may skip the operations of fetching the output addresses indicated by the L0 and L1 indexes, respectively. The walker may skip the operations of fetching the corresponding output addresses from the first level (e.g., L0) of the first stage to the hit level (e.g., L1) that is hit in operation S116. As the hit level of the first stage becomes higher, the number of page tables looked up by the walker may decrease.
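
The relation between the hit level and the number of page tables that remain to be looked up can be written down directly. This Python sketch assumes four levels (L0 to L3) and uses -1 for "no hit"; both the constant and the function name are illustrative.

```python
NUM_LEVELS = 4  # L0..L3, as in the examples of this description


def levels_to_fetch(hit_level):
    """Levels whose output addresses must still be fetched from memory.

    A hit level of -1 means nothing was found in the walk cache.
    """
    return list(range(hit_level + 1, NUM_LEVELS))


print(levels_to_fetch(-1))  # no hit: all of L0..L3 are fetched
print(levels_to_fetch(1))   # IA1 in FIG. 11: L0 and L1 are skipped
print(levels_to_fetch(3))   # full hit: nothing left to fetch
```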

In operation S123, the redundant walk detector 1225 may compare the match level with the walk cache hit level to detect whether redundancy exists in the operation (e.g., the page table walk) in which the walker fetches the output addresses indicated by the indexes that do not hit in the page table walk cache 1230. The redundant walk detector 1225 may calculate the first-stage match level between the current input address and an input address whose page table walk is not completed (or any other input address). The match level may indicate the level of the matching indexes of the current input address and any other input address (e.g., the match level corresponding to the hazard level bit). As the match level becomes higher, the degree to which the indexes of the current input address and the indexes of the other input address match each other may become higher. The redundant walk detector 1225 may calculate the highest (maximum) of the match levels as the match level of the current input address. In addition, the redundant walk detector 1225 may look up the page table walk cache 1230 by using the indexes of the current input address and may obtain the first-stage walk cache hit level.
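
Selecting the highest match level among several in-flight walks can be sketched as below. The mapping from walker identifiers to index lists, the function names, and every index value other than those of IA1 are hypothetical.

```python
def match_level(a, b):
    """Highest level up to which the indexes of a and b agree, starting at L0; -1 if L0 differs."""
    level = -1
    for x, y in zip(a, b):
        if x != y:
            break
        level += 1
    return level


def best_match(current, in_flight):
    """Return (match level, walker id) of the in-flight address that matches most deeply.

    Returns (-1, None) when no in-flight address shares even the L0 index.
    """
    best = (-1, None)
    for walker_id, indexes in in_flight.items():
        level = match_level(current, indexes)
        if level > best[0]:
            best = (level, walker_id)
    return best


in_flight = {
    0: [0x12, 0x23, 0x34, 0x45],  # shares L0 and L1 with the current address
    1: [0x12, 0x77, 0x01, 0x02],  # shares only L0
}
current = [0x12, 0x23, 0x9A, 0xBC]
level, walker_id = best_match(current, in_flight)
```

The returned pair corresponds loosely to the hazard level bit and the hazard ID that would be recorded if the level exceeds the walk cache hit level.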

In operation S126, when the match level is higher than the walk cache hit level (i.e., when redundancy is detected) (S123 = Yes), the redundant walk detector 1225 may update the hazard information (e.g., the hazard bit, the hazard level bit, and the hazard ID bit) in the entry that stores or contains the input address and the context, so that a page table walk that includes the redundancy is not performed. The redundant walk detector 1225 may mark the first-stage hazard bit for the input address. In addition, the input address whose hazard bit is marked may be deallocated from the redundant walk detector 1225. As described for operation S109, the input address may not be allocated to the walkers 1223 and 1224 and the redundant walk detector 1225 until the marked hazard bit is cleared. The page table walker 1220 may not provide the current input address to the walkers 1223 and 1224, may not perform the page table walk for the current input address, and may cancel or stop the page table walk while it is being performed.

In operation S129, when the match level is not higher than the walk cache hit level (S123 = No), the walker may check whether the page table walk for the input address is completed. In operation S133, when the page table walk for the input address is not completed (S129 = No), the walker may fetch an output address that is indicated by an index of the input address and is not found in the page table walk cache 1230. In operation S136, the walker may store the fetched output address in the page table walk cache 1230 (i.e., the page table walk cache 1230 is updated). The output address fetched by the walker may also be stored in the redundant walk detector 1225 (i.e., the redundant walk detector 1225 is updated).

In operation S139, the redundant walk detector 1225 may obtain or calculate the walk cache level that is updated when the output address indicated by the index of the input address is stored in the page table walk cache 1230. The redundant walk detector 1225 may clear the first-stage hazard bit of any other page table walk based on a result of comparing the walk cache level of the current input address with the match level between the current input address and the other input address. For example, when the walk cache level reaches or is the same as the match level, the redundant walk detector 1225 may clear the hazard bit of another input address that was previously input. Operations S133 and S136 may be performed repeatedly until the page table walk is determined to be completed in operation S129; as operations S133 and S136 are repeated, the walk cache level may gradually become higher.
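
The loop formed by operations S129, S133, S136, and S139 can be simulated in a few lines of Python. The sketch assumes that both the memory-resident page tables and the page table walk cache are dictionaries keyed by index prefixes; the helper names and the output addresses 0x300 and 0x400 are hypothetical.

```python
def hit_level(walk_cache, indexes):
    """Highest level whose index prefix (L0..Ln) is present in the walk cache; -1 if none."""
    level = -1
    for depth in range(1, len(indexes) + 1):
        if tuple(indexes[:depth]) not in walk_cache:
            break
        level = depth - 1
    return level


def walk(indexes, walk_cache, memory):
    """Fetch the uncached levels one by one and return the history of hit levels."""
    history = [hit_level(walk_cache, indexes)]
    while history[-1] < len(indexes) - 1:                # S129: walk not completed
        prefix = tuple(indexes[: history[-1] + 2])       # next level to fetch
        walk_cache[prefix] = memory[prefix]              # S133 fetch, S136 store
        history.append(hit_level(walk_cache, indexes))   # S139: updated level
    return history


def fetches_until_clear(history, waiting_match_level):
    """Number of fetches after which a waiting address with this match level is released."""
    return next(i for i, lvl in enumerate(history) if lvl >= waiting_match_level)


memory = {
    (0x12,): 0x100,
    (0x12, 0x23): 0x200,
    (0x12, 0x23, 0x34): 0x300,        # hypothetical
    (0x12, 0x23, 0x34, 0x45): 0x400,  # hypothetical
}
walk_cache = {(0x12,): 0x100}
history = walk([0x12, 0x23, 0x34, 0x45], walk_cache, memory)
released_after = fetches_until_clear(history, 1)
```

In this model the hit level rises by one per iteration, and an address waiting with match level L1 is released after the first fetch, when the L1 output address reaches the cache.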

When the page table walk is completed (S129 = Yes), in operation S143, the input address may be deallocated from the page table walk scheduler 1221 and the redundant walk detector 1225. In operation S146, the MMU 1200 may obtain the physical address corresponding to the virtual address (i.e., the input address) by referring to the address translation information stored in the TLB 1210.

FIGS. 14A and 14B are flowcharts in which the page table walker shown in FIG. 2 performs the page table walk of the first stage to translate a virtual address into an intermediate physical address and performs the page table walk of the second stage to translate the intermediate physical address into a physical address; these translations are described with reference to FIGS. 8A and 8B. FIGS. 14A and 14B will be described together.

As in operation S103, in operation S203, the page table walker 1220 may receive a virtual address (i.e., an input address) after a TLB miss. As in operation S106, in operation S206, the page table walker 1220 may allocate or provide the virtual address and the context to the page table walk scheduler 1221. As in operation S109, in operation S209, the page table walk scheduler 1221 may check whether a hazard bit is marked for the first stage or the second stage of the virtual address. As described above, with regard to the first stage and the second stage, a hazard bit may be managed jointly by the redundant walk detector 1225, as the first redundant walk detector, and the second redundant walk detector 1226. Alternatively, the first-stage hazard bit may be managed by the redundant walk detector 1225, as the first redundant walk detector, and the second-stage hazard bit may be managed by the second redundant walk detector 1226.

As in operation S113, in operation S213, when the hazard bit is not marked or is cleared (S209 = No), the page table walk scheduler 1221 may allocate the virtual address to any one of the walkers 1223 and 1224 and to the redundant walk detector 1225, as the first redundant walk detector. As in operation S116, in operation S216, the walker to which the virtual address is allocated may check whether partial and full address translation information associated with the virtual address and the context is stored in the page table walk cache 1230. For example, the partial and full translation information may be descriptors indicated by the indexes of the virtual address, and the page table walk cache 1230 may be the first-stage page table walk cache S1WC. As in operation S119, in operation S219, the walker may skip the operations of fetching output addresses up to the first-stage hit level of operation S216. As in operation S123, in operation S223, the redundant walk detector 1225, as the first redundant walk detector, may compare the walk cache hit level with the first-stage match level to detect whether redundancy exists in the operation (e.g., the page table walk) in which the walker fetches the output addresses indicated by the indexes that do not hit in the page table walk cache 1230. As in operation S126, in operation S226, when the first-stage match level is higher than the walk cache hit level (S223 = Yes), the redundant walk detector 1225, as the first redundant walk detector, may mark the first-stage hazard bit for the virtual address. The input address whose hazard bit is marked may be deallocated from the redundant walk detector 1225, as the first redundant walk detector.

In operation S229, when the match level is not greater than the walk cache hit level (S223 = No), the page table walk scheduler 1221 may allocate the intermediate physical address of the virtual address to the second redundant walk detector 1226. In operation S233, the walker (e.g., the same walker as in operation S216) may determine whether partial and full address translation information of the intermediate physical address (e.g., descriptors indicated by the indexes of the intermediate physical address) is stored in the page table walk cache 1230 (e.g., a second-stage page table walk cache S2WC). Here, the first-stage page table walk cache S1WC and the second-stage page table walk cache S2WC may both be included in the page table walk cache 1230, or may be implemented separately within the page table walk cache 1230. In operation S236, the walker may skip the operations of fetching output addresses up to the second-stage hit level of operation S233.

In operation S239, the second redundant walk detector 1226 may compare the walk cache hit level with the second-stage match level to detect whether there is redundancy in the operation (e.g., the page table walk) in which the walker fetches the output addresses indicated by the indexes that did not hit in the page table walk cache 1230. The second redundant walk detector 1226 may calculate second-stage match levels between the current intermediate physical address and the intermediate physical addresses whose page table walks are not yet complete. A match level may indicate the level of the indexes that match between the current intermediate physical address and any other input address. The second redundant walk detector 1226 may take the highest (maximum) of these match levels as the match level of the current intermediate physical address. In addition, the second redundant walk detector 1226 may look up the page table walk cache 1230 using the indexes of the current intermediate physical address to obtain the second-stage walk cache hit level. In operation S243, when the second-stage match level is higher than the walk cache hit level (S239 = Yes), the second redundant walk detector 1226 may mark the second-stage hazard bit for the intermediate physical address. The intermediate physical address whose hazard bit is marked may be deallocated from the second redundant walk detector 1226.
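The match level calculation described above can be illustrated with a short sketch. It assumes that each address has already been split into per-level indexes (L0 first) and that the match level is the deepest level up to which two addresses agree on every index, with -1 meaning that even the L0 indexes differ. The function names are hypothetical.

```python
from typing import Iterable, Tuple

Indexes = Tuple[int, ...]


def match_level(current: Indexes, other: Indexes) -> int:
    """Deepest level up to which both index tuples agree, or -1."""
    level = -1
    for depth, (a, b) in enumerate(zip(current, other)):
        if a != b:
            break
        level = depth
    return level


def highest_match_level(current: Indexes, in_flight: Iterable[Indexes]) -> int:
    """Maximum match level against all walks that are not yet complete."""
    return max((match_level(current, other) for other in in_flight), default=-1)


in_flight = [(1, 9, 9, 9), (1, 2, 3, 0)]
print(highest_match_level((1, 2, 3, 4), in_flight))
```

Taking the maximum means that the current address is compared against the in-flight walk with which it shares the longest path through the page table.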

In operation S246, when the match level is not higher than the walk cache hit level (S239 = No), the walker may check whether the second-stage page table walk for the intermediate physical address is complete. When the page table walk is not complete (S246 = No), in operation S249 the walker may fetch an output address that is indicated by an index of the intermediate physical address and was not found in the page table walk cache 1230. In operation S253, the walker may store the fetched output address in the page table walk cache 1230 (i.e., the page table walk cache 1230 is updated). The output address fetched by the walker may also be stored in the second redundant walk detector 1226 (i.e., the second redundant walk detector 1226 is updated).

In operation S256, the second redundant walk detector 1226 may obtain or calculate the walk cache level, which is updated as each output address indicated by an index of the intermediate physical address is stored in the page table walk cache 1230. Based on a comparison of the walk cache level of the current intermediate physical address with the match levels between the current intermediate physical address and other intermediate physical addresses, the second redundant walk detector 1226 may clear the second-stage hazard bit of any other page table walk. Operations S249 and S253 may be repeated until it is determined in operation S246 that the second-stage page table walk is complete. As operations S249 and S253 are repeated, the walk cache hit level may gradually rise.
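The loop formed by operations S246 to S256 can be sketched as follows. This is an illustrative model under stated assumptions: the cache is a dictionary keyed by index prefixes, `fetch` stands in for a memory access, `WaitingWalk` is a hypothetical record of a deferred walk, and a hazard bit is cleared once the walk cache level has reached that walk's match level, which is how the claims describe the condition for resuming a deferred walk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[int, ...]


@dataclass
class WaitingWalk:
    """Hypothetical deferred walk whose hazard bit is currently marked."""
    match_level: int
    hazard: bool = True


def run_walk(indexes: Prefix, cache: Dict[Prefix, int],
             waiting: List[WaitingWalk],
             fetch: Callable[[Prefix], int]) -> int:
    """Fetch the missing levels, update the cache, and clear hazard bits."""
    level = -1
    # Skip the levels that already hit in the walk cache.
    while level + 1 < len(indexes) and indexes[: level + 2] in cache:
        level += 1
    while level + 1 < len(indexes):          # S246: walk not yet complete
        prefix = indexes[: level + 2]
        cache[prefix] = fetch(prefix)        # S249 and S253
        level += 1                           # walk cache level rises
        for walk in waiting:                 # S256
            if walk.hazard and walk.match_level <= level:
                walk.hazard = False
    return level


cache = {(1,): 0xA000}
waiting = [WaitingWalk(match_level=1), WaitingWalk(match_level=3)]
final = run_walk((1, 2, 3, 4), cache, waiting, lambda p: 0xB000 + len(p))
print(final, [w.hazard for w in waiting])
```

A deferred walk that shares only the upper levels is released early, while one that shares the whole path waits until the in-flight walk has stored its last descriptor.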

When the page table walk is complete (S246 = Yes), in operation S259 the intermediate physical address may be deallocated from the second redundant walk detector 1226. Thereafter, operations S263 to S273 may be substantially the same as operations S129 to S139 shown in FIG. 13. When the first page table walk is complete (S263 = Yes), in operation S276 the input address may be deallocated from the page table walk scheduler 1221 and from the redundant walk detector 1225 serving as the first redundant walk detector. In operation S279, the MMU 1200 may obtain the physical address corresponding to the virtual address (i.e., the input address) by referring to the address translation information stored in the TLB 1210.

According to embodiments of the present invention, redundancy of a page table walk can be predicted and detected by comparing the match level with the walk cache hit level. The processor can instead execute another page table walk that involves no redundancy, thereby improving processor performance and reducing power consumption.

Although the teachings of the inventive concept have been described with reference to exemplary embodiments thereof, it should be apparent to those of ordinary skill in the art that various changes and modifications may be made to the embodiments without departing from the spirit and scope of the invention as set forth in the following claims.

1220: page table walker

1221: page table walk scheduler

1222: hazard/replay controller

1223, 1224: walkers

1225: redundant walk detector

1230: page table walk cache

IA0: input address

L0, L1, L2, L3: levels

Claims (17)

1. A processor comprising: a page table walk cache configured to store address translation information; and a page table walker, wherein the page table walker is configured to: fetch first output addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of a page table; and compare a match level between second indexes of a second input address and the first indexes of the first input address with a walk cache hit level obtained by looking up the page table walk cache using the second indexes, wherein, based on a result of comparing the match level with the walk cache hit level, the page table walker detects in advance that redundancy exists in an operation of looking up at least a part of the page table using those of the second indexes of the second input address that match the first indexes of the first input address.

2. The processor of claim 1, wherein each of the first input address and the second input address is a virtual address, and each of the first output addresses and second output addresses indicated by the second indexes of the second input address is a physical address.

3. The processor of claim 1, wherein each of the first input address and the second input address is an intermediate address, and each of the first output addresses and second output addresses indicated by the second indexes of the second input address is a physical address.

4. The processor of claim 1, wherein each of the first input address and the second input address is a virtual address, and each of the first output addresses and second output addresses indicated by the second indexes of the second input address is an intermediate address.

5. The processor of claim 1, wherein, when it is detected that the match level is higher than the walk cache hit level, the page table walker does not fetch second output addresses indicated by the second indexes of the second input address until the match level is equal to or lower than a walk cache level that is updated as each of the first output addresses is stored in the page table walk cache.

6. The processor of claim 1, wherein, when it is detected, while second output addresses indicated by the second indexes of the second input address are being fetched, that the match level is higher than the walk cache hit level, the page table walker stops fetching the second output addresses indicated by the second indexes until the match level is equal to or lower than a walk cache level that is updated as each of the first output addresses is stored in the page table walk cache.

7. The processor of claim 1, wherein the match level is a first match level, and wherein the page table walker is further configured to: fetch third output addresses indicated by third indexes of a third input address by looking up the address translation information and at least a part of the page table; and when, while the first output addresses and the third output addresses are being fetched, a second match level between the second indexes of the second input address and the third indexes of the third input address is higher than the first match level, compare the second match level with the walk cache hit level.

8. The processor of claim 1, wherein the page table walker comprises: a page table walk scheduler configured to manage a first entry, into which information about a walk request including the first input address is entered, and a second entry, into which information about a walk request including the second input address is entered; and a plurality of walkers configured to fetch the first output addresses and to fetch second output addresses indicated by the second indexes of the second input address.

9. The processor of claim 8, wherein the second entry of the page table walk scheduler includes a hazard bit that is marked according to the result of comparing the match level with the walk cache hit level.

10. The processor of claim 9, wherein, when the hazard bit is marked, the page table walk scheduler does not provide the second indexes of the walk request having the second input address, included in the second entry, to the plurality of walkers until the hazard bit is cleared.

11. A processor comprising: a page table walk cache configured to store address translation information; and a page table walker, wherein the page table walker is configured to: fetch first intermediate addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of a first page table of a first stage, and fetch first output addresses indicated by second indexes of each of the first intermediate addresses by looking up the address translation information and at least a part of a second page table of a second stage; and compare a match level between fourth indexes of each of second intermediate addresses indicated by third indexes of a second input address and the second indexes of each of the first intermediate addresses with a walk cache hit level obtained by looking up the page table walk cache using the fourth indexes, and wherein the page table walker comprises: a page table walk scheduler configured to manage a first entry, into which information about a walk request including the first input address is entered, and a second entry, into which information about a walk request including the second input address is entered; a plurality of walkers configured to fetch the first intermediate addresses and the first output addresses associated with the first input address, and to fetch the second intermediate addresses associated with the second input address and second output addresses indicated by the fourth indexes of each of the second intermediate addresses; and a redundant walk detector configured to compare the match level with the walk cache hit level.

12. The processor of claim 11, wherein a walk cache level is updated as each of the first output addresses is stored in the page table walk cache, and wherein, when it is detected that the walk cache level has reached the match level, the page table walker fetches the second output addresses indicated by the fourth indexes of each of the second intermediate addresses by looking up the address translation information and at least a part of the second page table of the second stage.

13. The processor of claim 11, wherein the second entry of the page table walk scheduler includes a hazard bit that is marked according to the result of comparing the match level with the walk cache hit level.

14. The processor of claim 13, wherein a first walker of the plurality of walkers executes a first page table walk to fetch the first intermediate addresses and the first output addresses, wherein a second walker of the plurality of walkers executes a second page table walk to fetch the second intermediate addresses and the second output addresses, and wherein, when the redundant walk detector marks the hazard bit, the second page table walk executed by the second walker is canceled.

15. The processor of claim 13, wherein the second entry of the page table walk scheduler further includes hazard identification bits indicating the identifying number of the walker, among the plurality of walkers, that fetches the first intermediate addresses and the first output addresses.

16. The processor of claim 11, wherein the match level indicates the extent to which the fourth indexes of each of the second intermediate addresses match the second indexes of each of the first intermediate addresses.

17. A processor comprising: a page table walk cache configured to store address translation information; and a page table walker, wherein the page table walker is configured to: fetch first intermediate addresses indicated by first indexes of a first input address by looking up the address translation information and at least a part of a first page table of a first stage, and fetch first output addresses indicated by second indexes of each of the first intermediate addresses by looking up the address translation information and at least a part of a second page table of a second stage; compare a first match level between third indexes of a second input address and the first indexes of the first input address with a first walk cache hit level obtained by looking up the page table walk cache using the third indexes; and compare a second match level between fourth indexes of each of second intermediate addresses indicated by the third indexes of the second input address and the second indexes of each of the first intermediate addresses with a second walk cache hit level obtained by looking up the page table walk cache using the fourth indexes, and wherein the page table walker comprises: a page table walk scheduler configured to manage a first entry, into which information about a walk request including the first input address is entered, and a second entry, into which information about a walk request including the second input address is entered; a plurality of walkers configured to fetch the first intermediate addresses and the first output addresses associated with the first input address, and to fetch the second intermediate addresses associated with the second input address and second output addresses indicated by the fourth indexes of each of the second intermediate addresses; a first redundant walk detector configured to compare the first match level with the first walk cache hit level; and a second redundant walk detector configured to compare the second match level with the second walk cache hit level.
TW108139160A 2019-02-08 2019-10-30 Processor to detect redundancy of page table walk TWI805866B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201962803227P 2019-02-08 2019-02-08
US62/803,227 2019-02-08
KR10-2019-0022184 2019-02-26
KR1020190022184A KR20200098354A (en) 2019-02-08 2019-02-26 Processor to detect redundancy of page table walk

Publications (2)

Publication Number Publication Date
TW202034174A TW202034174A (en) 2020-09-16
TWI805866B true TWI805866B (en) 2023-06-21

Family

ID=72292999

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108139160A TWI805866B (en) 2019-02-08 2019-10-30 Processor to detect redundancy of page table walk

Country Status (2)

Country Link
KR (1) KR20200098354A (en)
TW (1) TWI805866B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114238176B (en) * 2021-12-14 2023-03-10 海光信息技术股份有限公司 Processor, address translation method for processor and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI388984B (en) * 2008-07-14 2013-03-11 Via Tech Inc Microprocessor, method and computer program product that perform speculative tablewalks
US20140156930A1 (en) * 2012-12-05 2014-06-05 Arm Limited Caching of virtual to physical address translations
US9405702B2 (en) * 2014-11-14 2016-08-02 Cavium, Inc. Caching TLB translations using a unified page table walker cache
US20170344492A1 (en) * 2016-05-26 2017-11-30 Arm Limited Address translation within a virtualised system background
US20180107604A1 (en) * 2016-10-14 2018-04-19 Arm Limited Apparatus and method for maintaining address translation data within an address translation cache

Also Published As

Publication number Publication date
KR20200098354A (en) 2020-08-20
TW202034174A (en) 2020-09-16
