CN111949562A - Application processor, system-on-chip and method for operating memory management unit - Google Patents

Info

Publication number
CN111949562A
CN111949562A (application CN201911013907.0A)
Authority
CN
China
Prior art keywords
context
cache
address
target
translation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911013907.0A
Other languages
Chinese (zh)
Inventor
朴城范
莫纽·希德
崔周熙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US16/413,034 external-priority patent/US11216385B2/en
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of CN111949562A publication Critical patent/CN111949562A/en
Legal status: Pending

Classifications

    All classifications fall under G (PHYSICS), G06 (COMPUTING; CALCULATING OR COUNTING), G06F (ELECTRIC DIGITAL DATA PROCESSING), within G06F12/00 (Accessing, addressing or allocating within memory systems or architectures) or the related indexing scheme G06F2212/00:
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/1009: Address translation using page tables, e.g. page table structures
    • G06F12/1036: Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB], for multiple virtual address spaces, e.g. segmentation
    • G06F12/1045: Translation look-aside buffer [TLB] associated with a data cache
    • G06F12/1063: Translation look-aside buffer [TLB] associated with a data cache, the data cache being concurrently virtually addressed
    • G06F2212/657: Virtual address space management
    • G06F2212/683: Details of translation look-aside buffer [TLB]: invalidation
    • G06F2212/684: Details of translation look-aside buffer [TLB]: TLB miss handling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides an application processor, a system-on-chip, and a method for operating a memory management unit. A memory management unit in the application processor responds to an access request corresponding to a check request. The memory management unit includes a context cache, a translation cache, an invalidation queue, and an address translation manager. The context cache stores contexts and identifiers of the stored contexts while avoiding duplication of the contexts. The translation cache stores a first address, a first context identifier, and a second address; the first address corresponds to a virtual address, the first context identifier corresponds to a first context, and the second address corresponds to the first address and the first context. The invalidation queue stores at least one of the context identifiers stored in the translation cache that is to be invalidated. The address translation manager controls the context cache, the translation cache, and the invalidation queue.

Description

Application processor, system-on-chip and method for operating memory management unit
Technical Field
Exemplary embodiments of the inventive concept relate to processors, and more particularly, to an application processor, a system on chip including the application processor, and a method of operating a memory management unit included in the application processor.
Background
A Memory Management Unit (MMU) is a hardware component that processes memory access requests issued by a direct memory access unit (e.g., a Central Processing Unit (CPU)). The MMU may also be referred to as a paged MMU (PMMU).
Generally, an MMU first attempts to translate a virtual page address to a physical page address of a memory (e.g., an instruction memory) using an associative cache called a Translation Lookaside Buffer (TLB). If no physical page address matching the virtual page address is found in the TLB, the MMU performs a slower process of referencing the page table to determine the necessary physical page address. This may delay the MMU's handling of requests on its channels.
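The fast path/slow path behavior described above can be sketched as follows. This is an illustrative Python model, not the patent's hardware; the class name `Tlb`, the dictionary-based page table, and the 4 KB page size are assumptions for this example.

```python
PAGE_SIZE = 4096  # 4 KB pages, matching the example page size used later

class Tlb:
    """Toy TLB: a small cache of page-number -> frame-number mappings."""
    def __init__(self, page_table):
        self.page_table = page_table   # full VA-page -> PA-frame mapping
        self.entries = {}              # fast subset of the page table

    def translate(self, va):
        page, offset = divmod(va, PAGE_SIZE)
        frame = self.entries.get(page)
        if frame is None:              # TLB miss: slower page-table reference
            frame = self.page_table[page]
            self.entries[page] = frame # fill the TLB for later accesses
        return frame * PAGE_SIZE + offset
```

For instance, with `page_table = {0: 5}`, translating address 100 misses the TLB once, reads frame 5 from the table, and returns `5 * 4096 + 100`; a second access to the same page hits the TLB.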
Disclosure of Invention
According to an exemplary embodiment of the inventive concept, an application processor includes a Memory Management Unit (MMU). The MMU responds to an access request received from a master Intellectual Property (IP), and the access request includes a target context and a target virtual address. The access request corresponds to a check request to translate the target virtual address to a first target physical address. The MMU includes a context cache, a translation cache, an invalidation queue, and an Address Translation Manager (ATM). The context cache stores contexts used in check requests and context identifiers of the stored contexts as first tags and first data, respectively, while avoiding duplication of contexts. The translation cache stores a first address and a first context identifier as a second tag, and stores a second address as second data; the first address corresponds to a virtual address used in the check request, the first context identifier corresponds to a first context used in the check request, and the second address corresponds to the first address and the first context. The invalidation queue stores at least one of the context identifiers stored in the translation cache that is to be invalidated. The ATM controls the context cache, the translation cache, and the invalidation queue.
According to an exemplary embodiment of the present inventive concept, a system-on-chip includes a master Intellectual Property (IP) that outputs an access request, an application processor, and a memory device. The application processor includes a Memory Management Unit (MMU), and the MMU translates a target virtual address to a first target physical address in response to the access request including a target context and the target virtual address. The memory device is coupled to the MMU and includes a page table that stores mapping information between virtual addresses and first physical addresses. The MMU includes a context cache, a translation cache, an invalidation queue, and an Address Translation Manager (ATM). The context cache stores contexts and context Identifiers (IDs) of the stored contexts as first tags and first data, respectively, while avoiding duplication of the contexts used in access requests corresponding to check requests. The translation cache stores a first address and a first context identifier as a second tag, and stores a second address as second data; the first address corresponds to a virtual address used in the check request, the first context identifier corresponds to a first context used in the check request, and the second address corresponds to the first address and the first context. The invalidation queue stores at least one of the context identifiers stored in the translation cache that is to be invalidated. The ATM controls the context cache, the translation cache, and the invalidation queue.
In accordance with an exemplary embodiment of the present inventive concept, in a method of operating a Memory Management Unit (MMU) of an application processor, an access request including a target context and a target virtual address is received by an Address Translation Manager (ATM). The ATM determines whether the target context matches at least one of the first entries in a context cache by checking the context cache, which stores contexts and context identifiers of the stored contexts as first tags and first data, respectively, while avoiding duplication of contexts. The ATM determines whether a target context identifier corresponding to the target context matches at least one of the second entries in a translation cache by selectively checking the translation cache based on the result of checking the context cache; the translation cache stores a context identifier and a virtual address corresponding to the context identifier as a second tag, and stores a physical address corresponding to the virtual address as second data. The target virtual address is translated to a corresponding target physical address based on the selective determination.
Thus, the MMU in the application processor according to an exemplary embodiment may translate virtual addresses to physical addresses by primarily checking the context cache, which stores contexts while avoiding duplication, and by selectively checking the translation cache based on the result of checking the context cache. Thus, the translation cache size may be reduced. In addition, the performance of the application processor may be enhanced because, when an invalidation request specifies context-based invalidation, the invalidation request can be processed while the translation cache is not in use.
Drawings
The above and other features of the present inventive concept will be more clearly understood by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 is a diagram of a system-on-chip (SoC) including a Memory Management Unit (MMU) in accordance with an exemplary embodiment.
FIG. 2 is a block diagram illustrating an example of the application processor of FIG. 1, according to an example embodiment.
Fig. 3 is a diagram showing a mapping between virtual addresses and physical addresses.
Fig. 4A is a diagram for explaining the operation of the MMU in fig. 1 according to an exemplary embodiment.
FIG. 4B illustrates an example of the translation cache of FIG. 4A, according to an example embodiment.
FIG. 4C illustrates an example of the translation cache of FIG. 4B, according to an example embodiment.
FIG. 4D illustrates another example of the translation cache of FIG. 4B, according to an example embodiment.
FIG. 5 is a block diagram illustrating an MMU in the SoC shown in FIG. 1, in accordance with an illustrative embodiment.
FIGS. 6A and 6B respectively illustrate a portion of the MMU of FIG. 5, according to an exemplary embodiment.
FIG. 7 is a flowchart illustrating exemplary operation of the MMU of FIG. 5 in accordance with an exemplary embodiment.
Fig. 8 is a diagram for explaining the operation in fig. 7.
FIG. 9A is a flowchart illustrating another exemplary operation of the MMU of FIG. 5 in accordance with an exemplary embodiment.
FIG. 9B is a flowchart illustrating another exemplary operation of the MMU of FIG. 5 in accordance with an exemplary embodiment.
Fig. 10 illustrates the allocation of a new context identifier in fig. 7.
FIG. 11 illustrates an exemplary operation of an MMU performing the operations in FIG. 10.
FIG. 12 illustrates assigning a new context identifier in FIG. 7 according to an example embodiment.
FIG. 13A is a flowchart illustrating an exemplary method of invalidating an entry in a context cache in an MMU in accordance with an exemplary embodiment.
FIG. 13B is a flowchart illustrating an exemplary method of invalidating an entry in a translation cache in an MMU in accordance with an exemplary embodiment.
FIG. 14 is a flowchart illustrating another exemplary method of invalidating an entry in a translation cache in an MMU in accordance with an exemplary embodiment.
Fig. 15 illustrates another example of an application processor in the SoC of fig. 1 according to an exemplary embodiment.
FIG. 16 is a block diagram illustrating an example of the MMU module of FIG. 15 in accordance with an illustrative embodiment.
Fig. 17 illustrates an example of the address allocator of fig. 16, according to an example embodiment.
FIG. 18 is a conceptual diagram useful in explaining the operation of the MMU module of FIG. 16.
FIG. 19 is a flowchart illustrating a method of operating an MMU in an application processor in accordance with an illustrative embodiment.
Fig. 20 is a block diagram of a mobile device including a SoC, according to an example embodiment.
[Description of symbols]
10, 910: a system on chip (SoC);
20: a display;
30: a memory device;
40: a page table;
45: a data/instruction storage block;
50: an input device;
100, 100a, 920: an application processor;
110: a Central Processing Unit (CPU);
120: a system peripheral circuit;
121: a Real Time Clock (RTC);
123: a Phase Locked Loop (PLL);
125: a watchdog timer;
130: a multimedia accelerator;
131: a camera interface;
133: a graphics engine;
135: a high definition multimedia interface;
140: a connection circuit;
141: an audio interface;
143: a storage interface;
145: a connection interface;
150: a display controller;
160: a memory controller;
170: a cache;
180: a system bus;
190: a master Intellectual Property (IP);
200: a Memory Management Unit (MMU);
200a: a Memory Management Unit (MMU) module;
210, CC1, CC2, CCk: a context cache;
211, 217, 219a, 221, 251: a tag field;
213, 219, 219b, 223, 253: a data field;
215: a Translation Lookaside Buffer (TLB)/traditional TLB;
218, 218a, 218b, TC1, TC2, TCk: a translation cache;
220: a Translation Lookaside Buffer (TLB);
230, IQ1, IQ2, IQk: an invalidation queue;
240: a page table walker;
250: a walk cache;
260: an Address Translation Manager (ATM);
261: a first interface;
263: a second interface;
265: a control register;
267: a storage interface;
270: an address allocator;
271: a register group;
273: an address comparator;
275: a first bus interface;
281: a Memory Management Unit (MMU)/MMU1;
282: a Memory Management Unit (MMU)/MMU2;
28k: a Memory Management Unit (MMU)/MMUk;
290: a second bus interface;
900: a mobile device;
930: a wide input/output memory;
940: a low-power double data rate (LPDDRx) memory device;
950: an image sensor;
960: a display;
ASID: an address space identifier;
AT: authority information;
CID, CID1, CID11, CID12, CID13, CID14, CID1_a, CID1_b: a context identifier;
CTL: a control signal;
CTX11, CTX12: a context;
DTA: data;
EL: an exception level field;
FN: a frame number;
FN0, FN1, FNn: a frame;
FN2: a frame/frame number;
INST: an instruction;
MMU1-MMUk: a Memory Management Unit (MMU);
MMU_ID, ID1, ID2, IDk: a Memory Management Unit (MMU) identifier;
NS: a non-secure field;
OFF2: an offset;
PA, PA1, PA11, PA12, PA13, PA1_a, PA1_b, PA1_c, PA1_d: a physical address;
PA2: a physical address/second physical address;
PN0, PN1, PN3, PN4, PNn: a page;
PN: a page number;
PN2: a page/page number;
pVA, pVA11, pVA12, pVA13: a partial virtual address;
REQ: an access request;
S100, S105, S106, S107, S115, S117, S120, S123, S125, S130, S135, S140, S200a, S200b, S210, S220, S230, S250, S280, S285, S290, S305, S310, S320, S330, S340, S350, S355, S360, S370, S380, S410, S415, S420, S430, S440, S450, S460, S470, S510, S520, S530, S540: an operation;
VA, VA0, VA1, VA2, VA3, VA4, VA5, VAn-2, VAn-1, VAn, VA1_a, VA1_b, VA1_c, VA1_d: a virtual address;
VALID: valid information;
VMID: a virtual machine identifier.
Detailed Description
Exemplary embodiments of the inventive concept will be described more fully hereinafter with reference to the accompanying drawings. Throughout this application, like reference numbers may refer to like elements.
FIG. 1 is a diagram of a system-on-chip (SoC) including a Memory Management Unit (MMU) in accordance with an exemplary embodiment.
Referring to fig. 1, the SoC 10 may be implemented as any of a large number of electronic devices, examples of which include a Personal Computer (PC), a tablet PC, a netbook, an e-reader, a Personal Digital Assistant (PDA), a Portable Multimedia Player (PMP), an MP3 player, and an MP4 player. The SoC 10 may include an application processor 100, the application processor 100 executing program instructions to control the overall operation of the SoC 10. The SoC 10 may also include a display 20, a memory device 30, and an input device 50.
For example, the application processor 100 may receive program instructions via the input device 50. In an exemplary embodiment, the application processor 100 executes program instructions by reading data from the memory device 30 and displaying the data on the display 20. The input device 50 may include a keypad, a keyboard, and a pointing device (e.g., a touch pad or a computer mouse).
The memory device 30 may include a page table 40, the page table 40 storing mapping information between virtual addresses and physical addresses.
FIG. 2 is a block diagram illustrating an example of the application processor of FIG. 1, according to an example embodiment.
Referring to fig. 2, the application processor 100 includes a Central Processing Unit (CPU) 110, a cache 170, an MMU 200, a system bus 180, a system peripheral circuit 120, a multimedia accelerator 130, a connection circuit 140, a display controller 150, and/or a memory controller 160.
CPU 110 executes the received program instructions. Cache 170 is a high-speed memory that stores selected data (e.g., frequently accessed data) to reduce the average latency of memory access operations by CPU 110. The MMU 200 is a hardware component that processes requests from the CPU 110 to access the memory device 30.
The functions of MMU 200 may include translating virtual addresses to physical addresses, performing memory protection, controlling cache 170, performing bus arbitration, and/or performing bank switching.
System peripheral circuitry 120, multimedia accelerator 130, connection circuitry 140, display controller 150, and/or memory controller 160 communicate data or instructions with each other via a system bus 180.
The system bus 180 may include multiple channels, such as a read data channel, a read address channel, a write address channel, and/or a write data channel.
The system peripheral circuit 120 includes a real-time clock (RTC) 121, a phase-locked loop (PLL) 123, and/or a watchdog timer 125.
The multimedia accelerator 130 includes a graphics engine 133. The multimedia accelerator 130 may also include a camera interface 131 and/or a high-definition multimedia interface (HDMI) 135, the HDMI 135 being an audio/video interface for transferring uncompressed digital data; the graphics engine 133 may be integrated with a frame buffer or other video display circuitry to perform graphics computations. It should be noted here that the MMU 200 may be used to translate virtual addresses output from the graphics engine 133 into physical addresses.
Accordingly, exemplary embodiments of the present inventive concept may be applied to various memory devices and various applications to ensure operational stability and maintain or enhance operational performance.
The connection circuit 140 may include an audio interface 141, a storage interface 143 (e.g., an Advanced Technology Attachment (ATA) interface), and/or a connection interface 145. The connection circuit 140 may communicate with the input device 50.
The display controller 150 controls data to be displayed in the display 20. The MMU 200 may be used to translate virtual addresses output from the display controller 150 into physical addresses.
The memory controller 160 enables the memory device 30 to be accessed according to the type of memory, such as flash memory or Dynamic Random Access Memory (DRAM).
Fig. 3 is a diagram showing a mapping between virtual addresses and physical addresses.
Referring to fig. 1 through 3, a virtual address space may be divided into a plurality of pages PN0 through PNn, where n is an integer greater than two.
Each of pages PN 0-PNn is a block formed by adjacent virtual addresses. Each of pages PN0 through PNn has a given data size of, for example, 4 Kilobytes (KB). However, the size of pages PN0 through PNn is not limited and may vary.
As with the virtual address space, the physical address space may also be divided into a plurality of frames FN 0-FNn. Each of the frames FN 0-FNn has a fixed size.
The virtual address (e.g., VA2) includes a page number (e.g., PN2) and an offset within the page (e.g., OFF2). In other words, the virtual address can be expressed by equation 1:
VAi = PNj + OFFx    (1)
where 'i', 'j', and 'x' are natural numbers, VAi is the virtual address, PNj is the page number, and OFFx is the offset.
Page number PN2 is used as an index into page table 40.
The offset OFF2 is combined with a frame number (e.g., FN2) that defines the physical address (e.g., PA2). The physical address can be expressed by equation 2:
PAr = FNs + OFFx    (2)
where 'r', 's', and 'x' are natural numbers, PAr is the physical address, FNs is the frame number, and OFFx is the offset.
The page number PN2 may be referred to as a virtual page number and the frame number FN2 may be referred to as a physical page number.
Page table 40 has a mapping between the virtual address of the page and the physical address of the frame.
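Equations 1 and 2 can be illustrated with a short sketch. This is hypothetical Python for explanation only; the function names `split_va` and `make_pa`, the example table values, and the 4 KB page size from the description above are assumptions.

```python
PAGE_SIZE = 4096  # 4 KB pages, as in the description above

def split_va(va):
    """Equation 1: a virtual address is a page number plus an offset."""
    return divmod(va, PAGE_SIZE)       # (page number PN, offset OFF)

def make_pa(frame_number, offset):
    """Equation 2: the physical address combines frame number and offset."""
    return frame_number * PAGE_SIZE + offset

page_table = {2: 9}                    # example mapping: PN2 -> FN2 = 9
pn, off = split_va(2 * PAGE_SIZE + 123)  # a VA inside page PN2
pa = make_pa(page_table[pn], off)      # same offset, mapped frame
```

The offset is carried over unchanged; only the page number is looked up in the page table and replaced by the frame number.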
For ease of explanation, it is assumed in the description that a processor (e.g., the CPU 110), the graphics engine 133, or the display controller 150 that processes data in each workgroup is referred to as a master Intellectual Property (IP). The master IP may operate on each workgroup and may process multiple workgroups at once. A workgroup is a set of data stored in the memory device 30. The workgroup represents a group of pages in the memory device 30 that are frequently referenced by the master IP (e.g., more than a reference number of times in a reference time period), or represents the amount of pages that can be loaded from the master IP to the memory device 30. In an exemplary embodiment, each workgroup is managed independently of the other workgroups in the master IP.
FIG. 4A is a diagram illustrating the operation of the MMU of FIG. 1 in accordance with an exemplary embodiment.
Referring to fig. 1, 2, and 4A, the MMU 200 includes a Translation Cache (TC) 218, a Context Cache (CC) 210, and/or an Invalidation Queue (IQ) 230, and is connected to the master IP 190 and the memory device 30 through multiple channels.
The master IP 190 processes workgroups. The master IP 190 outputs an access request corresponding to a workgroup to the MMU 200 or the cache 170. The access request may include a virtual address VA of the workgroup and a context on an attribute of the workgroup in the memory device 30. The access request may be either a check request or an invalidation request.
The MMU 200 calculates a physical address PA using the virtual address VA based on the access request of the master IP 190. Alternatively, the MMU 200 may invalidate at least one of the entries in the translation cache 218 based on an access request of the master IP 190.
The context cache 210 stores a context associated with an attribute of the access request as a first tag, while avoiding duplicate entries, and stores the context identifier of that context as first data. When the access request corresponds to a check request, the context cache 210 may store the context used in the check request as the first tag, while avoiding duplicate entries, and may store the context identifier of the context as the first data.
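The idea of storing each distinct context only once and mapping it to a compact identifier can be sketched as follows. This is an illustrative model, not the patent's design; the class name `ContextCache`, the dictionary storage, and the sequential CID allocation are assumptions for this example.

```python
class ContextCache:
    """Toy context cache: each context is stored once as a tag, with a
    compact context identifier (CID) as its data, so the larger context
    need not be duplicated in every translation-cache entry."""
    def __init__(self):
        self.cid_by_context = {}       # tag: context -> data: CID
        self.next_cid = 0

    def lookup_or_allocate(self, context):
        cid = self.cid_by_context.get(context)   # CC hit
        if cid is None:                          # CC miss: allocate a new CID
            cid = self.next_cid
            self.cid_by_context[context] = cid
            self.next_cid += 1
        return cid
```

Repeated check requests with the same context then return the same CID without storing the context again.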
The translation cache 218 may store a first address and a first context identifier as a second tag, and may store a second address as second data. The first address corresponds to a virtual address used in the check request, and the first context identifier corresponds to a first context used in the check request. The second address corresponds to the first address and the first context.
The translation cache 218 may include a Translation Lookaside Buffer (TLB) or a walk cache.
A TLB is memory management hardware used to increase the speed of virtual address translation. The TLB stores mappings between page numbers PN and frame numbers FN. The TLB stores mapping information among context identifiers associated with pages referenced by the master IP 190, virtual addresses VA, and physical addresses PA. The TLB stores a context identifier and a virtual address as a second tag, and stores the physical address corresponding to the virtual address as second data.
The walk cache stores a portion of a virtual address and stores a physical address indicating a location of a page table corresponding to the portion of the virtual address.
When translating a virtual address to a physical address, the MMU 200 first checks the context cache 210. If the context associated with the virtual address VA requested by the master IP 190 matches at least one of the entries in the context cache 210 (referred to herein as a CC hit), the context cache 210 provides the translation cache 218 with the context identifier corresponding to the matching context.
If mapping information corresponding to the virtual address VA is in the translation cache 218 (referred to as a TC hit), the MMU 200 processes the translation directly, without accessing the memory device 30 to read the mapping information.
If the context associated with the virtual address VA requested by the master IP 190 does not match any of the entries in the context cache 210 (referred to as a CC miss), or if no entry matching the matched context identifier and the virtual address VA is found in the translation cache 218 (referred to as a TC miss), a page table walk is performed.
A page table walk is the process of searching the page table 40 stored in the memory device 30 to find the frame number FN mapped to the page number PN of the virtual address VA. It is carried out when the context associated with the requested virtual address VA does not match any of the entries in the context cache 210, or when no mapping information between the virtual address VA and a physical address PA is found in the translation cache 218. The page table 40 stores mapping information between virtual and physical addresses for each datum in the memory device 30.
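A page table walk can be sketched as follows. This is an illustrative model only: the patent does not fix a table depth, so the two-level layout, the 10-bit index width per level, and the function name `page_table_walk` are assumptions for this example.

```python
PAGE_SIZE = 4096            # 4 KB pages -> 12 offset bits
INDEX_BITS = 10             # bits of VA consumed per table level (assumed)

def page_table_walk(root_table, va):
    """Walk a two-level page table in memory to find the frame number."""
    mask = (1 << INDEX_BITS) - 1
    offset = va & (PAGE_SIZE - 1)
    idx1 = (va >> 22) & mask        # index into the first-level table
    idx2 = (va >> 12) & mask        # index into the second-level table
    second_level = root_table[idx1] # memory access: read level-1 entry
    frame = second_level[idx2]      # memory access: read level-2 entry
    return frame * PAGE_SIZE + offset
```

Each level costs a memory access, which is why a TC hit (no walk needed) is significantly faster than a TC miss.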
When the master IP 190 attempts to read an instruction INST or data DTA using the physical address PA and the instruction or data corresponding to the physical address PA is in the cache 170, the cache 170 may output the instruction INST or the data DTA directly to the master IP 190 without accessing the memory device 30 (this is referred to as a 'cache hit').
However, when the instruction INST or data DTA is not present in the cache 170, the cache 170 may access the data/instruction storage block 45 in the memory device 30 to read the instruction or data (this is referred to as a 'cache miss'). Data/instruction storage block 45 stores information about each of the data/instructions in memory device 30.
The invalidation queue 230 may store at least one of the context identifiers stored in the translation cache 218 that is to be invalidated.
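Deferring context-based invalidation through a queue can be sketched as follows. This is an illustrative model, not the patent's circuit; the class name, the dictionary-based translation cache keyed by (CID, VA), and the idea of draining the queue while the cache is idle are assumptions for this example.

```python
from collections import deque

class TranslationCacheWithIQ:
    """Toy translation cache with a deferred invalidation queue: a CID to
    be invalidated is recorded cheaply, and matching entries are dropped
    later (e.g., while the cache is not being used for lookups)."""
    def __init__(self):
        self.entries = {}                  # (CID, VA) -> PA
        self.invalidation_queue = deque()

    def request_invalidate(self, cid):
        self.invalidation_queue.append(cid)  # cheap: just record the CID

    def drain(self):
        # performed while the translation cache is otherwise idle
        while self.invalidation_queue:
            cid = self.invalidation_queue.popleft()
            self.entries = {k: v for k, v in self.entries.items()
                            if k[0] != cid}
```

The invalidation request itself completes immediately; the cache scan happens later, which matches the performance argument in the summary above.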
FIG. 4B illustrates an example of the translation cache of FIG. 4A, according to an example embodiment.
Referring to FIG. 4B, the translation cache 218 includes a tag field 219a and a data field 219b.
The tag field 219a may store a context identifier CID1 and a virtual address VA1, and the data field 219b may store a physical address PA1 corresponding to the virtual address VA1.
For example, assume that context identifiers CID1_a and CID1_b are used in check requests. In an exemplary embodiment, the context identifier CID1_a and the virtual addresses VA1_a, VA1_b, and VA1_c corresponding to the context identifier CID1_a are stored in the tag field 219a, and the context identifier CID1_b and the virtual address VA1_d are stored in the tag field 219a. In addition, physical addresses PA1_a, PA1_b, PA1_c, and PA1_d corresponding to the virtual addresses VA1_a, VA1_b, VA1_c, and VA1_d, respectively, are stored in the data field 219b. The virtual addresses VA1_a, VA1_b, VA1_c, and VA1_d stored in the tag field 219a may be referred to as first addresses, and the physical addresses PA1_a, PA1_b, PA1_c, and PA1_d stored in the data field 219b may be referred to as second addresses.
FIG. 4C illustrates an example of the translation cache of FIG. 4B, according to an example embodiment.
Referring to FIG. 4C, the translation cache 218a may include a TLB 220. The configuration of the TLB 220 will be described with reference to FIG. 6A.
FIG. 4D illustrates another example of the translation cache of FIG. 4B, according to an example embodiment.
Referring to FIG. 4D, the translation cache 218b may include a walk cache (WC) 250. The configuration of the walk cache 250 will be described with reference to FIG. 6B.
FIG. 5 is a block diagram illustrating an MMU in the SoC shown in FIG. 1, in accordance with an illustrative embodiment.
Referring to FIG. 5, the MMU 200 includes a first interface 261, a second interface 263, a memory interface 267, an Address Translation Manager (ATM) 260, a context cache 210, a translation cache 218, an invalidation queue 230, a Page Table Walker (PTW) 240, and/or a control register 265.
The first interface 261 provides an interface with the primary IP 190. The first interface 261 may have an interface structure corresponding to, for example, an advanced extensible interface (AXI) protocol.
The host IP 190 may transmit an access request REQ to the MMU 200 via the first interface 261.
The second interface 263 is a separate slave interface used to set the control register 265. For example, the CPU 110 (see FIG. 2) may control certain operations of the MMU 200 via the second interface 263. The second interface 263 may communicate with the CPU 110 according to, for example, an Advanced Peripheral Bus (APB) protocol. The MMU 200 may receive the control signal CTL from the CPU 110.
The ATM 260 operates to translate the virtual address VA included in the access request REQ to the physical address PA.
The ATM 260 first checks (searches or looks up) the context cache 210 to translate the virtual address VA provided from the host IP 190 over the address channel to the physical address PA.
When there is a context associated with the virtual address VA in the context cache 210 (a CC hit), the context cache 210 provides the ATM 260 with a context identifier corresponding to the context associated with the virtual address VA. The ATM 260 then checks the translation cache 218 based on the context identifier. When the context identifier provided from the context cache 210 is present in the translation cache 218 (a TC hit, i.e., when the context identifier matches at least one of the entries in the translation cache 218), the ATM 260 may generate the physical address PA by referencing the context identifier and the virtual address.
If there is no context associated with the virtual address VA in the context cache 210 (a CC miss), or if the context identifier provided from the context cache 210 does not match any of the entries in the translation cache 218 (a TC miss), the ATM 260 controls the page table walker 240 to perform a page table walk on the page table 40.
Information for controlling the operation of the MMU 200 is stored in the control register 265. The ATM 260 may control the context cache 210, the translation cache 218, the invalidation queue 230, and/or the page table walker 240 based on the information stored in the control register 265.
The memory interface 267 provides an interface for communicating with the memory device 30. The MMU 200 can read the page table 40 in the memory device 30 through the memory interface 267 or can access the data/instruction storage block 45 in the memory device 30 through the memory interface 267.
FIGS. 6A and 6B respectively illustrate a portion of the MMU of FIG. 5, according to an exemplary embodiment.
FIG. 6A illustrates an implementation of the translation cache 218 of FIG. 5 as the TLB 220, and FIG. 6B illustrates an implementation of the translation cache 218 of FIG. 5 as the walk cache 250.
In FIG. 6A, the context cache 210, the TLB 220, and the invalidate queue 230 are shown, and in FIG. 6B, the walk cache 250 is shown. Additionally, a conventional TLB 215 is shown in FIG. 6A for comparison with TLB 220.
Referring to FIG. 6A, the context cache 210 includes a tag field 211 and a data field 213. The tag field 211 stores the contexts used in the check requests of the primary IP 190 as first tags while avoiding duplicate copies of any context. Each of the contexts may include VALID information (VALID), an Address Space Identifier (ASID) for identifying an address space, a Virtual Machine Identifier (VMID) for identifying a virtual machine, a non-secure field NS indicating whether the context is non-secure, and an exception level field EL associated with an exception level. The data field 213 includes a context identifier CID for each of the contexts.
For example, the context CTX11 may have a VALID of 'Y', an ASID of '0xA', a VMID of '0xB', an NS of '1', an EL of '1', and a context identifier CID11 of '0x4'. In addition, the context CTX12 may have a VALID of 'Y', an ASID of '0xC', a VMID of '0xB', an NS of '1', an EL of '1', and a context identifier CID12 of '0x6'.
TLB 220 includes a tag field 221 and a data field 223. The tag field 221 stores a context identifier CID and a virtual address VA corresponding to the context identifier CID, and the data field 223 stores a physical address PA corresponding to the virtual address VA and authority information AT associated with whether read (R)/write (W) access is allowed. Each of the entries in the tag field 221 of the TLB 220 may also include VALID information (VALID). The tag field 221 of the TLB 220 may store a virtual address (first address) and a context identifier of a context used in the check request as the second tag, and the data field 223 of the TLB 220 may store a physical address corresponding to the first address and the context identifier as the second address.
For example, the context identifier CID11 may have virtual addresses of '0x1000', '0x6000', and '0x3000'. The context identifier CID12 may have a virtual address of '0x8000'.
The virtual address '0x1000' may have a physical address of '0x9000' and an AT of 'R/W'. The virtual address '0x6000' may have a physical address of '0xA000' and an AT of 'R/W'. The virtual address '0x3000' may have a physical address of '0x0000' and an AT of 'R'. The virtual address '0x8000' may have a physical address of '0x2000' and an AT of 'W'.
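The split tag scheme of FIG. 6A can be illustrated with a small software model. The following is a hypothetical Python sketch, not the patented hardware: dictionaries stand in for the CAM/SRAM arrays, and the field values mirror the example entries above.

```python
# Hypothetical model of FIG. 6A: the context cache maps each distinct
# context tuple to a compact context identifier (CID), so the TLB tag
# only needs the narrow (CID, VA) pair instead of the wide
# (ASID, VMID, NS, EL, VA) tag of the conventional TLB 215.
context_cache = {
    # (ASID, VMID, NS, EL) -> CID
    (0xA, 0xB, 1, 1): 0x4,   # context CTX11
    (0xC, 0xB, 1, 1): 0x6,   # context CTX12
}

tlb = {
    # (CID, VA) -> (PA, access rights AT)
    (0x4, 0x1000): (0x9000, "R/W"),
    (0x4, 0x6000): (0xA000, "R/W"),
    (0x4, 0x3000): (0x0000, "R"),
    (0x6, 0x8000): (0x2000, "W"),
}

def lookup(asid, vmid, ns, el, va):
    """Two-step check: context cache first, then the TLB."""
    cid = context_cache.get((asid, vmid, ns, el))
    if cid is None:
        return None            # CC miss: fall back to a page table walk
    return tlb.get((cid, va))  # TC hit returns (PA, AT); miss returns None

assert lookup(0xA, 0xB, 1, 1, 0x6000) == (0xA000, "R/W")  # CC hit + TC hit
assert lookup(0x9, 0xB, 1, 1, 0x1000) is None             # CC miss
```

Because the full context tuple is compared only once, in the context cache, each TLB entry is narrower than in the conventional organization.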
The invalidation queue 230 stores context identifiers of '0x7' and '0x3' as invalidated entries.
The walk cache 250 includes a tag field 251 and a data field 253. The tag field 251 stores a context identifier CID and a local virtual address pVA corresponding to the context identifier CID, and the data field 253 stores a second physical address PA2 that specifies the location in the page table 40 corresponding to the local virtual address pVA. Each of the entries in the tag field 251 of the walk cache 250 may also include VALID information (VALID).
For example, the context identifier CID11 may have local virtual addresses of 'pVA11' and 'pVA12'. The context identifier CID12 may have a local virtual address of 'pVA13'. The local virtual addresses 'pVA11', 'pVA12', and 'pVA13' specify locations in the page table 40, which are designated by the physical addresses PA11, PA12, and PA13, respectively.
As shown in FIG. 6A, a single context identifier such as CID11 may be associated with multiple virtual addresses. Thus, the ATM 260 may translate the virtual address VA to the physical address PA by first checking the context cache 210 to determine whether the target context included in the access request of the main IP 190 matches at least one of the entries of the tag field 211 in the context cache 210, and then selectively checking the TLB 220 based on the result of checking the context cache 210.
Since the context cache 210 stores contexts while avoiding duplicate copies, the footprint of the context cache 210 may be reduced. In addition, because the TLB 220 does not store the relatively wide ASID and VMID fields, the configuration of the TLB 220 may be simplified.
Additionally, if the target context does not match any of the entries of the tag field 211 of the context cache 210, the page table walk is carried out without checking the TLB 220, and thus the performance of the MMU 200 may be enhanced.
In contrast, because the conventional TLB 215, including the tag field 217 and the data field 219, stores the relatively wide ASID and VMID fields in every entry, the size of the TLB 215 increases and more time is required to check the TLB 215 to determine whether the target context matches an entry of the TLB 215.
FIG. 7 is a flowchart illustrating an exemplary operation of the MMU of FIG. 5 according to an exemplary embodiment, and FIG. 8 is a diagram for explaining the operation of FIG. 7.
Referring to FIGS. 5 to 8, when the ATM 260 receives the access request REQ from the host IP 190, the ATM 260 checks (looks up) the context cache 210 based on the target context included in the access request REQ (operation S100) and determines whether the target context matches at least one of the first entries in the context cache 210 (operation S115: CC tag hit?).
If the target context does not match any of the first entries in the context cache 210 (NO in S115), the ATM 260 assigns a new context identifier (ID) to the target context (operation S200) and controls the page table walker 240 to perform a page table walk on the page table 40 (operation S290).
If the target context matches one (or at least one) of the first entries in the context cache 210 (YES in S115), the ATM 260 obtains a context identifier (context ID) corresponding to the target context (operation S120). The ATM 260 checks the translation cache (TC) 218 based on the obtained context identifier and the target virtual address (operation S125) and determines whether the obtained context identifier and the target virtual address match one of the second entries in the translation cache 218 (operation S130: TC tag hit?).
If the obtained context identifier and the target virtual address do not match any of the second entries in the translation cache 218 (NO in S130), the ATM 260 controls the page table walker 240 to perform a page table walk on the page table 40 (operation S290).
If the obtained context identifier and the target virtual address match one of the second entries in the translation cache 218 (YES in S130), the ATM 260 obtains the physical address PA corresponding to the target virtual address (operation S140) and completes the address translation by providing the physical address PA to the host IP 190 (operation S285).
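The flow of operations S100 through S290 can be sketched in Python. This is an illustrative model only: dictionaries stand in for the hardware structures, the page table walker is a stub, and the class and method names are assumptions rather than anything named in the patent.

```python
class MiniMMU:
    """Hypothetical software sketch of the FIG. 7 translation flow."""
    def __init__(self):
        self.context_cache = {}      # context tuple -> CID   (first entries)
        self.translation_cache = {}  # (CID, VA) -> PA        (second entries)
        self.page_table = {}         # VA -> PA, stands in for page table 40
        self.next_cid = 0

    def page_table_walk(self, cid, va):            # operation S290 (stub)
        pa = self.page_table[va]
        self.translation_cache[(cid, va)] = pa     # fill the TC after the walk
        return pa

    def translate(self, context, va):
        cid = self.context_cache.get(context)      # S100: check the CC
        if cid is None:                            # S115: CC tag miss
            cid = self.next_cid                    # S200: assign a new CID
            self.next_cid += 1
            self.context_cache[context] = cid
            return self.page_table_walk(cid, va)   # S290
        pa = self.translation_cache.get((cid, va)) # S120/S125: check the TC
        if pa is None:                             # S130: TC tag miss
            return self.page_table_walk(cid, va)   # S290
        return pa                                  # S140/S285: return the PA

mmu = MiniMMU()
mmu.page_table[0x1000] = 0x9000
print(hex(mmu.translate((0xA, 0xB, 1, 1), 0x1000)))  # 0x9000 (via a walk)
print(hex(mmu.translate((0xA, 0xB, 1, 1), 0x1000)))  # 0x9000 (now a TC hit)
```

The second call hits the translation cache because the first call both registered the context in the context cache and filled the translation cache after the walk.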
FIG. 9A is a flowchart illustrating another exemplary operation of the MMU of FIG. 5 in accordance with an exemplary embodiment.
Referring to FIGS. 5 and 9A, the ATM 260 determines whether the currently used context has changed (operation S105).
If the currently used context has not changed (NO in S105), the ATM 260 repeats operation S105.
If the currently used context has changed (YES in S105), the ATM 260 checks the context cache 210 based on the changed context (operation S107).
The ATM 260 determines whether the changed context matches at least one of the first entries in the context cache 210 (operation S117). If the changed context does not match any of the first entries in the context cache 210 (NO in S117), the ATM 260 assigns a new context identifier (ID) to the changed context (operation S200) and stores the changed context in the context cache 210.
If the changed context matches at least one of the first entries in the context cache 210 (YES in S117), the ATM 260 updates the context cache 210 by designating the context identifier of the matching entry as the first context identifier (operation S123).
Since the context rarely changes, operations S107, S120, and S123 may be skipped.
FIG. 9B is a flowchart illustrating another exemplary operation of the MMU of FIG. 5 in accordance with an exemplary embodiment.
Referring to FIGS. 5 and 9B, the ATM 260 determines whether a new request is received from the host IP 190 (operation S106). If a new request is not received from the host IP 190 (NO in S106), the ATM 260 repeats operation S106. If a new request is received from the host IP 190 (YES in S106), the ATM 260 checks the translation cache 218 based on the most recently used context identifier and the virtual address VA included in the new request (operation S125).
The ATM 260 determines whether the most recently used context identifier and the virtual address VA match at least one of the second entries in the translation cache 218 (operation S135). If the most recently used context identifier and the virtual address VA do not match any of the second entries in the translation cache 218 (NO in S135), the ATM 260 controls the page table walker 240 to perform a page table walk on the page table 40 (operation S290). The physical address PA may be obtained after the page table walk (operation S140). If the most recently used context identifier and the virtual address VA match one (at least one) of the second entries in the translation cache 218 (YES in S135), the ATM 260 obtains the physical address PA (operation S140).
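Because the context rarely changes, FIG. 9B reuses the most recently used context identifier without re-checking the context cache. A minimal sketch follows, with an instrumented page-table-walk stub to show which path was taken; all names and the placeholder physical address are illustrative assumptions.

```python
def translate_with_mru_cid(mru_cid, va, translation_cache, page_table_walk):
    """Check the TC with the most recently used CID (S125/S135);
    fall back to a page table walk on a miss (S290), then return the PA (S140)."""
    pa = translation_cache.get((mru_cid, va))
    if pa is None:
        return page_table_walk(mru_cid, va)
    return pa

tc = {(0x4, 0x1000): 0x9000}
walks = []
def ptw_stub(cid, va):
    walks.append((cid, va))   # record that a walk happened
    return 0x7000             # placeholder PA produced by the walk

assert translate_with_mru_cid(0x4, 0x1000, tc, ptw_stub) == 0x9000  # TC hit
assert translate_with_mru_cid(0x4, 0x2000, tc, ptw_stub) == 0x7000  # miss -> walk
assert walks == [(0x4, 0x2000)]                                     # one walk only
```

The hit path never consults the context cache at all, which is exactly the shortcut FIG. 9B describes.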
FIG. 10 illustrates an exemplary operation of the MMU of FIG. 7 for assigning a new context identifier, and FIG. 11 is a flowchart of the operation of FIG. 10.
Referring to FIGS. 5, 7, 10 and 11, to assign a new context identifier to the target context (operation S200a), the ATM 260 determines whether the context cache 210 has available space, e.g., whether the context cache 210 is full (operation S210).
If the context cache 210 has available space (NO in S210), the ATM 260 controls the page table walker 240 to perform a page table walk on the page table 40 (operation S290).
If the context cache 210 does not have available space (YES in S210), the ATM 260 invalidates (selects) at least one of the entries in the context cache 210 based on the usage history of the first entries and places (records) the context identifier of the selected, invalidated entry in the invalidation queue 230 (operation S250). That is, the ATM 260 changes the valid information of the least recently used context identifier CID13 of '0x8' from 'Y' to 'N' and records '0x8' in the invalidation queue 230.
The ATM 260 stores the target context in the location in the context cache 210 where the context identifier CID13 was stored and assigns a new context identifier CID14 to the target context (operation S280).
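Operations S210 through S280 amount to an eviction policy with deferred invalidation. The Python sketch below is a hedged illustration: the capacity, the strictly LRU choice of victim, and the class and method names are assumptions, since the patent only requires selecting a victim "based on the usage history".

```python
from collections import OrderedDict, deque

class ContextCacheModel:
    """Illustrative model of the eviction flow of FIGS. 10 and 11."""
    def __init__(self, capacity, invalidation_queue):
        self.capacity = capacity
        self.entries = OrderedDict()   # context tuple -> CID, oldest first
        self.iq = invalidation_queue
        self.next_cid = 0

    def assign_new_cid(self, target_context):
        if len(self.entries) >= self.capacity:               # S210: CC is full
            _, victim_cid = self.entries.popitem(last=False) # least recently used
            self.iq.append(victim_cid)                       # S250: record in the IQ
        cid = self.next_cid                                  # S280: assign a new CID
        self.next_cid += 1
        self.entries[target_context] = cid
        return cid

iq = deque()
cc = ContextCacheModel(capacity=2, invalidation_queue=iq)
cc.assign_new_cid((0xA, 0xB))   # CID 0
cc.assign_new_cid((0xC, 0xB))   # CID 1
cc.assign_new_cid((0xD, 0xB))   # CID 2; evicts CID 0 into the IQ
print(list(iq))  # [0]
```

Note that the evicted context identifier is only queued here; the translation-cache entries tagged with it are cleaned up later, in the background (FIG. 13B).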
FIG. 12 illustrates assigning a new context identifier in FIG. 7 according to an example embodiment.
Referring to FIGS. 5, 7 and 12, to assign a new context identifier to the target context (operation S200b), the ATM 260 determines whether the context cache 210 has available space, e.g., whether the context cache 210 is full (operation S210).
If the context cache 210 does not have available space (YES in S210), the ATM 260 determines whether the invalidation queue 230 has available space, e.g., whether the invalidation queue 230 is full (operation S220). If the invalidation queue 230 has available space (NO in S220), the ATM 260 invalidates (selects) at least one of the entries in the context cache 210 based on the usage history of the first entries and places (records) the context identifier of the selected entry in the invalidation queue 230 (operation S250).
If the invalidation queue 230 does not have available space, for example, if the invalidation queue 230 is full (YES in S220), the ATM 260 invalidates at least some of the entries in the translation cache 218 based on the invalidation queue 230 and dequeues (clears) at least some of the entries in the invalidation queue 230 such that the invalidation queue 230 has available space (operation S230).
The ATM 260 then invalidates (selects) at least one of the entries in the context cache 210 based on the usage history of the first entries and places (records) the context identifier of the selected entry in the invalidation queue 230 (operation S250). The ATM 260 stores the target context in the location in the context cache 210 where that context identifier was stored and assigns a new context identifier to the target context (operation S280).
FIG. 13A is a flowchart illustrating an exemplary method of invalidating an entry in a context cache in an MMU in accordance with an exemplary embodiment.
Referring to FIGS. 5, 6A and 13A, the ATM 260 determines whether the access request from the main IP 190 corresponds to a new invalidation request to invalidate at least one of the entries in the translation cache 218 (operation S305).
If the access request from the main IP 190 corresponds to a new context-based invalidation request (YES in S305), the ATM 260 checks the context cache 210 based on the target context included in the invalidation request (operation S310) and determines whether the target context matches one or more of the entries in the context cache 210 (operation S320). If the target context does not match any of the entries in the context cache 210 (NO in S320), the ATM 260 notifies the main IP 190 that the invalidation has been completed (operation S330).
If the target context matches one or more of the entries in the context cache 210 (YES in S320), the ATM 260 invalidates the matching entry in the context cache 210 corresponding to the target context, records (places) the context identifier of the invalidated entry in the invalidation queue 230 (operation S340), and notifies the main IP 190 that the invalidation is complete (operation S330).
FIG. 13B is a flowchart illustrating an exemplary method of invalidating an entry in a translation cache in an MMU in accordance with an exemplary embodiment.
Referring to FIG. 13B, the ATM 260 determines whether the invalidation queue 230 contains entries, e.g., whether the invalidation queue 230 is not empty (operation S350). If the invalidation queue 230 is not empty (YES in S350), the ATM 260 determines whether the translation cache 218 is unused, e.g., whether there is no activity in the translation cache 218 (operation S355). If the translation cache 218 is not in use (YES in S355), the ATM 260 dequeues (extracts) a context identifier from the invalidation queue 230 and checks the translation cache 218 based on the dequeued context identifier (operation S360).
The ATM 260 determines whether the dequeued context identifier matches at least one of the entries in the translation cache 218 (operation S370). If the dequeued context identifier does not match any of the entries in the translation cache 218 (NO in S370), the process ends. If the dequeued context identifier matches at least one of the entries in the translation cache 218 (YES in S370), the ATM 260 changes the valid information of the matching entry (e.g., the entry with the matching context identifier) from 'Y' to 'N' (operation S380).
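The background drain of FIG. 13B can be sketched as follows. In this hedged model, deleting a dictionary entry stands in for clearing the valid bit, the idle check is passed in as a callable, and the function name is illustrative.

```python
from collections import deque

def drain_one(invalidation_queue, translation_cache, tc_is_idle):
    """Pop one CID from the IQ when the TC is idle (S350/S355) and
    invalidate every TC entry tagged with it (S360-S380)."""
    if not invalidation_queue or not tc_is_idle():
        return
    cid = invalidation_queue.popleft()                       # S360: dequeue
    stale = [key for key in translation_cache if key[0] == cid]
    for key in stale:                                        # S370/S380
        del translation_cache[key]

tc = {(0x7, 0x1000): 0x9000, (0x4, 0x2000): 0xA000}
iq = deque([0x7, 0x3])
drain_one(iq, tc, tc_is_idle=lambda: True)
print(tc)        # only the CID 0x4 entry remains
print(list(iq))  # [3]
```

Running the drain only when `tc_is_idle()` returns true is what keeps this maintenance off the critical translation path.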
FIGS. 13A and 13B illustrate context-based invalidation of entries of the translation cache 218. In particular, FIG. 13B illustrates that the invalidation of entries of the translation cache 218 is performed in the background while the translation cache 218 is not being used.
FIG. 14 is a flowchart illustrating another exemplary method of invalidating an entry in a translation cache in an MMU in accordance with an exemplary embodiment.
Referring to FIGS. 5, 6A and 14, the ATM 260 determines whether the access request from the main IP 190 corresponds to a new invalidation request to invalidate at least one of the entries in the translation cache 218 (operation S410). In an exemplary embodiment, the access request may include a virtual address VA.
If the access request from the main IP 190 corresponds to a new virtual-address-based invalidation request to invalidate an entry having a particular context (YES in S410), the ATM 260 checks the context cache 210 based on the target context included in the invalidation request (operation S415) and determines whether the target context matches one or more of the entries in the context cache 210 (operation S420). If the target context does not match any of the entries in the context cache 210 (NO in S420), the ATM 260 notifies the main IP 190 that the invalidation has been completed (operation S470).
If the target context matches one or more of the entries in the context cache 210 (YES in S420), the ATM 260 obtains a context identifier corresponding to the target context (operation S430) and checks the translation cache 218 based on the obtained context identifier and the virtual address (operation S440). The ATM 260 determines whether the obtained context identifier and the virtual address match at least one of the entries in the translation cache 218 (operation S450). If the obtained context identifier and the virtual address do not match any of the entries in the translation cache 218 (NO in S450), the ATM 260 notifies the main IP 190 that the invalidation has been completed (operation S470).
If the obtained context identifier and the virtual address match at least one of the entries in the translation cache 218 (YES in S450), the ATM 260 changes the valid information of the matching entry (e.g., the entry with the matching context identifier) from 'Y' to 'N' (operation S460) and notifies the main IP 190 that the invalidation is complete (operation S470).
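The virtual-address-based invalidation of FIG. 14 touches the translation cache only when the context cache already holds the target context. A hedged sketch, modeling the caches as Python dictionaries (returning `True` mirrors notifying the main IP 190 that invalidation is complete; the function name is an assumption):

```python
def invalidate_by_va(context_cache, translation_cache, target_context, va):
    """FIG. 14 sketch: S415/S420 check the CC; S440-S460 drop the single
    (CID, VA) entry if present; S470 report completion in every case."""
    cid = context_cache.get(target_context)
    if cid is None:                         # no matching context: nothing to do
        return True
    translation_cache.pop((cid, va), None)  # invalidate the entry if it exists
    return True

cc = {(0xA, 0xB, 1, 1): 0x4}
tc = {(0x4, 0x1000): 0x9000, (0x4, 0x3000): 0x0000}
assert invalidate_by_va(cc, tc, (0xA, 0xB, 1, 1), 0x1000)
assert (0x4, 0x1000) not in tc    # the targeted entry is gone
assert (0x4, 0x3000) in tc        # other entries of the same context survive
```

Unlike the context-based invalidation of FIGS. 13A and 13B, only one (context identifier, virtual address) pair is affected, so the other translations of the context remain valid.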
FIG. 14 illustrates virtual-address-based invalidation of an entry of the translation cache 218.
The MMU 200 in the application processor 100 according to the exemplary embodiments may translate virtual addresses to physical addresses by first checking the context cache, which stores contexts while avoiding duplicate copies of the contexts, and by selectively checking the translation cache based on the result of checking the context cache. Thus, the size of the translation cache may be reduced. In addition, the performance of the application processor 100 may be enhanced by processing an invalidation request that specifies context-based invalidation in the background, during the time that the translation cache 218 is not being used.
FIG. 15 illustrates another example of an application processor in the SoC of FIG. 1 according to an exemplary embodiment.
Referring to FIG. 15, the application processor 100a may include an MMU module 200 a.
The MMU module 200a may include at least one MMU and may translate the virtual address included in the request from the host IP 190 to a physical address.
FIG. 16 is a block diagram illustrating an example of the MMU module 200a of FIG. 15 in accordance with an illustrative embodiment.
In FIG. 16, the main IP 190 and the memory device 30 are shown for convenience of explanation.
Referring to FIG. 16, MMU module 200a includes an address allocator 270, a first bus interface 275, a plurality of MMUs 281 through 28k, and/or a second bus interface 290. Although not shown in FIG. 16, MMU module 200a may also include a cache that stores data and/or instructions corresponding to physical addresses.
The MMU (MMU1) 281 includes a context cache CC1, a translation cache TC1, and/or an invalidation queue IQ1. The MMU (MMU2) 282 includes a context cache CC2, a translation cache TC2, and/or an invalidation queue IQ2. The MMU (MMUk) 28k includes a context cache CCk, a translation cache TCk, and/or an invalidation queue IQk. Each of the translation caches TC1 through TCk may comprise a TLB or a walk cache.
The main IP 190 may operate on each working group and may process multiple working groups at once. A working group is a set of data stored in the memory device 30. The working group represents a group of pages that the main IP 190 frequently accesses (e.g., more than a reference number of times in a reference time period) or represents an amount of pages that can be loaded from the main IP 190 to the memory device 30. According to an exemplary embodiment of the present inventive concept, each working group is managed independently of the other working groups in the main IP 190.
When the main IP 190 performs operations for multiple working groups, the address allocator 270 may dynamically allocate an MMU to each of the working groups. The address allocator 270 stores the MMU allocation information corresponding to each of the working groups.
When a request for a working group is received from the main IP 190, the address allocator 270 may output, based on the MMU allocation information, the MMU identifier MMU_ID of the MMU corresponding to the virtual address VA included in the request to the first bus interface 275. The first bus interface 275 may transfer the request and data to the MMU corresponding to the MMU identifier MMU_ID.
FIG. 17 illustrates an example of the address allocator 270 of FIG. 16, according to an example embodiment.
Referring to FIG. 17, the address allocator 270 includes a register set 271 and/or an address comparator 273.
Register set 271 stores MMU allocation information corresponding to each of the workgroups. In other words, the register set 271 stores MMU allocation information in which the virtual address VA corresponding to each work group is mapped to the MMU ID. According to an exemplary embodiment, the MMU allocation information may include indicator information for distinguishing the virtual address VA of each work group. The indicator information may be, for example, the start and/or end of consecutive virtual addresses VA of the workgroup.
The address comparator 273 may compare the virtual address VA of the request received from the main IP 190 with the MMU allocation information. The address comparator 273 may output the MMU identifier MMU_ID corresponding to the request as a result of the comparison.
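The register set and comparator of FIG. 17 amount to a range lookup from a virtual address to an MMU ID. The sketch below is hypothetical: the concrete ranges and the inclusive start/end representation are assumptions, since the patent only requires start and/or end indicator information per working group.

```python
# One row per working group, mirroring register set 271.
# (va_start, va_end_inclusive, mmu_id) -- values are illustrative only.
REGISTER_SET = [
    (0x0000, 0x2FFF, 1),   # first working group  -> MMU1
    (0x3000, 0x5FFF, 2),   # second working group -> MMU2
    (0x6000, 0x8FFF, 3),   # k-th working group   -> MMUk
]

def select_mmu(va):
    """Address comparator 273: return the MMU_ID whose range holds va."""
    for start, end, mmu_id in REGISTER_SET:
        if start <= va <= end:
            return mmu_id
    return None  # va belongs to no configured working group

print(select_mmu(0x3100))  # 2
```

Because no virtual address belongs to two working groups, the ranges never overlap and the first match is the only match.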
FIG. 18 is a conceptual diagram useful in explaining the operation of the MMU module of FIG. 16.
As shown in FIG. 18, the first to k-th working groups may include, for example, a plurality of pages, i.e., a plurality of adjacent virtual addresses VA, in the memory device 30 that are frequently referred to (e.g., more than the reference number of times in the reference period) by the main IP 190. For example, the first working group includes virtual addresses VA0 to VA2, the second working group includes virtual addresses VA3 to VA5, and the k-th working group includes virtual addresses VAn-2 to VAn. However, the working groups are managed independently of each other in the operation of the main IP 190. In other words, a single virtual address VA does not belong to two or more working groups. For example, the virtual addresses VA0 to VAn may be arranged consecutively for the working groups, as shown in FIG. 18.
Each of the MMUs MMU1 to MMUk translates the virtual addresses VA of the working group mapped to it to physical addresses PA, including PA0 to PA2. The address translation may be performed based on the TLB within the MMU. The physical address PA translated by one MMU may be different from or the same as the physical address translated by another MMU.
When it is assumed that the working group of the data to be processed by the main IP 190 is mapped to the MMU1 281, the first bus interface 275 receives the MMU ID ID1 of the MMU1 281 from the address allocator 270 and transmits the request and data of the main IP 190 to the MMU1 281. When it is assumed that the working group of the data to be processed by the main IP 190 is mapped to the MMU2 282, the first bus interface 275 receives the MMU ID ID2 of the MMU2 282 from the address allocator 270 and transmits the request and data of the main IP 190 to the MMU2 282. When it is assumed that the working group of the data to be processed by the main IP 190 is mapped to the MMUk 28k, the first bus interface 275 receives the MMU ID IDk of the MMUk 28k from the address allocator 270 and transmits the request and data of the main IP 190 to the MMUk 28k.
The MMU1 281 translates the virtual address VA of the request to the physical address PA. When doing so, the MMU1 281 first checks the context cache CC1 and, based on the result of checking the context cache CC1, selectively checks the translation cache TC1, and returns the request whose virtual address VA has been translated to the physical address PA to the main IP 190 through the first bus interface 275. In addition, the MMU1 281 communicates the request and data to the memory device 30 through the second bus interface 290. The second bus interface 290 accesses the physical address PA in the memory device 30 and carries out the operation corresponding to the request on the data.
When the main IP 190 starts operating on another working group while carrying out operations for the current working group, one of the MMUs of the MMU module 200a that is not allocated to the current working group is allocated to the new working group and operates independently. Thus, TC misses are reduced compared with the case where only one MMU is shared by all of the working groups used by the main IP 190. Accordingly, during the data processing operations of the main IP 190, the hit rate increases and the operating speed of the SoC 10 also increases, while the interaction between the working groups is minimized or reduced. In addition, since an MMU is allocated to each working group, the MMU operation is flexible.
FIG. 19 is a flowchart illustrating a method of operating an MMU in an application processor in accordance with an illustrative embodiment.
Referring to FIGS. 1 to 14 and 19, in a method of operating the MMU 200 in the application processor 100, the ATM 260 in the MMU 200 receives an access request including a target context and a target virtual address from the host IP 190 (operation S510).
The ATM 260 determines whether the target context matches at least one of the first entries in the context cache 210 by checking the context cache 210 (operation S520). The context cache 210 stores contexts and the context identifiers of the stored contexts as first tags and first data, respectively, while avoiding duplicate copies of the contexts.
The ATM 260 selectively determines whether the target context identifier corresponding to the target context matches at least one of the second entries in the translation cache 218 by selectively checking the translation cache 218 based on the result of checking the context cache 210 (operation S530). The translation cache 218 stores a context identifier and a virtual address corresponding to the context identifier as a second tag and a physical address corresponding to the virtual address as second data.
The ATM 260 translates the target virtual address into the target physical address based on the selective determination (operation S540) and outputs the target physical address to the host IP 190.
FIG. 20 is a block diagram of a mobile device including a SoC, according to an example embodiment.
Referring to FIG. 20, the mobile device 900 includes a SoC 910, a low-power double data rate (LPDDRx) memory device 940, an image sensor 950, and/or a display 960. The SoC 910 includes an application processor 920 and/or a wide input/output (WideIO) memory 930.
Data stored in the WideIO memory 930 or the LPDDRx memory device 940 may be displayed on the display 960 under the control of the SoC 910. The SoC 910, and in particular the application processor 920, may include the MMU 200 of FIG. 5 or the MMU module 200a of FIG. 16.
Thus, the MMU of the application processor 920 may include a context cache, a translation cache, an invalidation queue, and/or an ATM. The ATM may translate the virtual address included in an access request from the host IP to a physical address by first checking the context cache, which stores contexts while avoiding duplicate copies of the contexts, and by selectively checking the translation cache based on the result of checking the context cache. Thus, the size of the translation cache may be reduced.
The SoC and the semiconductor device according to the inventive concept may be packaged in any of various package types. For example, a SoC according to the inventive concept may be packaged as one of the following: Package on Package (PoP), Ball Grid Array (BGA), Chip Scale Package (CSP), Plastic Leaded Chip Carrier (PLCC), Plastic Dual In-line Package (PDIP), Die in Waffle Pack, Die in Wafer Form, Chip On Board (COB), Ceramic Dual In-line Package (CERDIP), Plastic Metric Quad Flat Package (MQFP), Thin Quad Flat Package (TQFP), Small Outline Integrated Circuit (SOIC), Shrink Small Outline Package (SSOP), Thin Small Outline Package (TSOP), System In Package (SIP), Multi-Chip Package (MCP), Wafer-level Fabricated Package (WFP), and Wafer-level Processed Stack Package (WSP).
The elements of FIGS. 1 to 20 set forth above may be implemented in processing circuitry such as hardware including logic circuits; in a hardware/software combination such as a processor executing software; or in a combination thereof together with a memory. For example, the processing circuitry may more specifically include, but is not limited to, a Central Processing Unit (CPU), an Arithmetic Logic Unit (ALU), a digital signal processor, a microcomputer, a Field Programmable Gate Array (FPGA), a programmable logic unit, a microprocessor, an application-specific integrated circuit (ASIC), and so forth.
The foregoing is illustrative of exemplary embodiments and is not to be construed as limiting thereof. Although exemplary embodiments have been described, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the claims.

Claims (20)

1. An application processor comprising:
a memory management unit configured to respond to an access request received from a host intellectual property, the access request comprising a target context and a target virtual address,
wherein the access request corresponds to a check request to translate the target virtual address to a first target physical address, and
wherein the memory management unit comprises:
a context cache configured to store contexts and context identifiers of the stored contexts as first tags and first data, respectively, while avoiding copying of contexts used in the check request;
a translation cache configured to store a first address and a first context identifier as a second tag and configured to store a second address as second data, the first address corresponding to a virtual address used in the check request, the first context identifier corresponding to a first context used in the check request, and the second address corresponding to the first address and the first context;
an invalidation queue configured to store at least one of the context identifiers stored in the translation cache to be invalidated; and
an address translation manager configured to control the context cache, the translation cache, and the invalidation queue.
2. The application processor of claim 1, wherein the address translation manager is configured to translate the first address to the second address by checking the context cache in response to the check request and selectively checking the translation cache based on a result of checking the context cache.
3. The application processor of claim 2, wherein if the target context matches at least one of the first entries in the context cache, the address translation manager is configured to obtain a context identifier corresponding to the target context as a target context identifier.
4. The application processor of claim 3, wherein the translation cache comprises a translation lookaside buffer,
wherein the translation lookaside buffer is configured to store the virtual address as the first address and is configured to store a physical address corresponding to the virtual address as the second address, and
wherein if the target context identifier and the target virtual address match one of the second entries in the translation lookaside buffer, the address translation manager is configured to control the translation lookaside buffer to provide a first physical address corresponding to the target virtual address as a first target physical address.
5. The application processor of claim 4, wherein the memory management unit further comprises a page table walker,
wherein if the target context identifier does not match any of the second entries in the translation lookaside buffer, the address translation manager is configured to control the page table walker to perform a page table walk on a page table that maps virtual addresses to corresponding physical addresses, and
wherein if the target context does not match any of the first entries in the context cache, the address translation manager is configured to control the page table walker to perform the page table walk on the page table.
6. The application processor of claim 3, wherein the memory management unit further comprises a page table walker,
wherein the translation cache comprises a walk cache,
wherein the walk cache is configured to store a partial virtual address of the virtual address as the first address and is configured to store a second physical address indicating a location of the page table corresponding to the first address,
wherein the address translation manager is configured to control the page table walker to perform page table walk on a page table that maps the target virtual address to the first target physical address, and
wherein if the target context identifier and the target virtual address match one of the second entries in the walk cache, the address translation manager is configured to control the walk cache to provide a second physical address corresponding to the first address to the page table walker.
7. The application processor of claim 3, wherein if the target context does not match any of the first entries in the context cache, the address translation manager is configured to assign a new context identifier to the target context and to store the target context in the context cache.
8. The application processor of claim 7, wherein the address translation manager is configured to determine whether the context cache has available storage space, and
wherein, if the context cache does not have the available storage space, the address translation manager is configured to record a context identifier of at least one of the first entries in the invalidation queue based on a usage history of the first entries stored in the context cache, and to store the target context and the new context identifier in the context cache.
9. The application processor of claim 7, wherein the address translation manager is configured to determine whether the context cache has a first available storage space,
wherein, if the context cache does not have the first available storage space, the address translation manager is configured to determine whether the invalidation queue has a second available storage space, and
wherein, if the invalidation queue does not have the second available storage space, the address translation manager is configured to dequeue at least one of the to-be-invalidated context identifiers stored in the invalidation queue, to invalidate at least one of the second entries in the translation cache based on the dequeued context identifier, and to store the target context and the new context identifier in the context cache.
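For illustration only, a minimal sketch of the eviction path recited in claims 8 and 9 above, with all names hypothetical: when the context cache is full, an entry chosen by usage history (here simply the oldest, standing in for the claimed usage-history policy) is evicted and its identifier is recorded in the invalidation queue; when the queue itself is full, an identifier is dequeued and every translation-cache entry tagged with it is invalidated first.

```python
from collections import deque  # the invalidation queue is modeled as a deque

def insert_context(context, new_cid, context_cache, invalidation_queue,
                   translation_cache, cc_capacity=4, iq_capacity=4):
    """Hypothetical sketch of claims 8/9: make room in the context cache,
    deferring translation-cache invalidation through the invalidation queue."""
    if len(context_cache) >= cc_capacity:
        if len(invalidation_queue) >= iq_capacity:
            # Queue full: dequeue one identifier and invalidate every
            # translation-cache entry tagged with it.
            stale = invalidation_queue.popleft()
            for key in [k for k in translation_cache if k[0] == stale]:
                del translation_cache[key]
        # Evict a victim context (oldest here) and record its identifier
        # in the invalidation queue for later invalidation.
        victim = next(iter(context_cache))
        invalidation_queue.append(context_cache.pop(victim))
    context_cache[context] = new_cid
```

The design point this illustrates is that evicting a context does not stall on scrubbing the translation cache; that work is deferred until the queue overflows (or, per claim 14, until the translation cache is idle).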
10. The application processor of claim 2, wherein the address translation manager is configured to:
determining whether the context has changed and, if the context has changed, first checking the context cache;
if, as a result of the check, the changed context does not match any of the first entries in the context cache, assigning a new context identifier to the changed context and storing the new context identifier as the first context identifier in the context cache; and
if, as a result of the check, the changed context matches at least one of the first entries in the context cache, using the context identifier of the matching entry as the first context identifier.
11. The application processor of claim 10, wherein the address translation manager, in response to the check request, is configured to:
checking the translation cache based on the first context identifier and the virtual address; and
obtaining the second address if at least one of the second entries in the translation cache matches the first address corresponding to the first context identifier and the virtual address.
12. The application processor of claim 1, wherein the access request corresponds to a context-based invalidation request to invalidate a second entry in the translation cache, and the translation cache stores a context identifier corresponding to the target context as the second tag, and
wherein the address translation manager is configured to:
checking the context cache in response to the context-based invalidation request;
selectively checking the translation cache based on a result of checking the context cache; and
notifying that the invalidation of the second entry in the translation cache is complete.
13. The application processor of claim 12, wherein if the target context matches at least one of the first entries in the context cache, the address translation manager is configured to:
invalidating a target context identifier corresponding to the target context; and
record the invalidated target context identifier in the invalidation queue, and
wherein if the target context does not match any of the first entries in the context cache, the address translation manager is configured to notify that the invalidation is complete.
14. The application processor of claim 12, wherein the address translation manager is configured to determine whether the invalidation queue has available storage space,
wherein, if the invalidation queue does not have the available storage space, the address translation manager is configured to determine whether the translation cache is in use, and
wherein, if the translation cache is not in use, the address translation manager is configured to extract a context identifier recorded in the invalidation queue, to check the translation cache based on the extracted context identifier, and to invalidate the matching second entry if the extracted context identifier matches at least one of the second entries in the translation cache.
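A non-normative sketch of the idle-time drain described in claim 14 above, with hypothetical names: one queued context identifier is extracted and its translation-cache entries invalidated, but only while the translation cache is not in use.

```python
from collections import deque  # callers hold the invalidation queue as a deque

def drain_invalidation_queue(invalidation_queue, translation_cache, cache_busy):
    """Hypothetical claim-14 sketch: invalidate translation entries for one
    queued context identifier, deferring the work while the cache is busy.
    Returns the number of entries invalidated."""
    if cache_busy or not invalidation_queue:
        return 0
    cid = invalidation_queue.popleft()
    stale = [k for k in translation_cache if k[0] == cid]
    for key in stale:
        del translation_cache[key]
    return len(stale)
```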
15. The application processor of claim 1, wherein the access request corresponds to a virtual address-based invalidation request to invalidate a second entry in the translation cache, and the translation cache stores a context identifier corresponding to the target context as the second tag, and
wherein the address translation manager is configured to:
performing a first check on the context cache based on the target context;
obtaining a target context identifier corresponding to the target context if the target context matches at least one of the first entries in the context cache according to a result of the first check;
performing a second check on the translation cache based on a virtual address and the target context identifier; and
invalidating a matching one of the second entries in the translation cache if the virtual address and the target context identifier match at least one of the second entries based on a result of the second check.
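The virtual-address-based invalidation of claim 15 above can be sketched as follows (illustrative only; all names hypothetical): the first check resolves the target context to an identifier through the context cache, and the second check drops only the translation-cache entry tagged with that identifier and the virtual address.

```python
def invalidate_by_vaddr(target_context, vaddr, context_cache, translation_cache):
    """Hypothetical claim-15 sketch: two-stage, address-targeted invalidation.
    Returns True if a matching translation-cache entry was invalidated."""
    cid = context_cache.get(target_context)  # first check: context cache
    if cid is None:
        # Unknown context: no translation entry can be tagged with it.
        return False
    # Second check: translation cache keyed by (context id, virtual address).
    return translation_cache.pop((cid, vaddr), None) is not None
```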
16. The application processor of claim 1, wherein, when a page table walk is performed, the address translation manager is configured to translate the first address to the second address by first checking the context cache in response to the check request and performing the page table walk based on a result of checking the context cache.
17. The application processor of claim 16, wherein if the target context matches at least one of the first entries in the context cache, the address translation manager is configured to obtain a context identifier corresponding to the target context as a target context identifier, and
wherein if the target context identifier and the target virtual address match at least one of the second entries in the walk cache, the address translation manager is configured to control the walk cache to provide the physical address corresponding to the first address as the target physical address.
18. A system-on-chip comprising:
a host intellectual property configured to output an access request;
an application processor comprising a memory management unit configured to translate a target virtual address to a first target physical address in response to the access request comprising a target context and the target virtual address; and
a memory device coupled to the memory management unit and including a page table storing mapping information between a virtual address and a first physical address,
wherein the memory management unit comprises:
a context cache configured to store contexts and context identifiers of the stored contexts as first tags and first data, respectively, while avoiding copying of contexts used in check requests corresponding to the access requests;
a translation cache configured to store a first address and a first context identifier as a second tag and configured to store a second address as second data, the first address corresponding to a virtual address used in the check request, the first context identifier corresponding to a first context used in the check request, and the second address corresponding to the first address and the first context;
an invalidation queue configured to store at least one of the context identifiers stored in the translation cache to be invalidated; and
an address translation manager configured to control the context cache, the translation cache, and the invalidation queue.
19. A method of operating a memory management unit of an application processor, the method comprising:
receiving, by an address translation manager, an access request, the access request including a target context and a target virtual address;
determining, by the address translation manager, whether the target context matches at least one of first entries in a context cache by checking the context cache, wherein the context cache is configured to store contexts and context identifiers of the stored contexts as first tags and first data, respectively, while avoiding copying of contexts;
determining, by the address translation manager, whether a target context identifier corresponding to the target context matches at least one of second entries in a translation cache by selectively checking the translation cache based on checking the context cache, wherein the translation cache is configured to store the context identifier and a virtual address corresponding to the context identifier as a second tag and is configured to store a physical address corresponding to the virtual address as second data; and
translating the target virtual address to a corresponding target physical address based on the determination by selectively checking the translation cache.
20. The method of claim 19, wherein translating the target virtual address to the corresponding target physical address comprises:
obtaining, by the address translation manager, a context identifier corresponding to the target context as a target context identifier if the target context matches at least one of the first entries in the context cache; and
outputting, by the address translation manager, a physical address corresponding to the target virtual address as the target physical address if the target context identifier and the target virtual address match at least one of the second entries in the translation cache.
CN201911013907.0A 2019-05-15 2019-10-23 Application processor, system-on-chip and method for operating memory management unit Pending CN111949562A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US16/413,034 US11216385B2 (en) 2019-05-15 2019-05-15 Application processor, system-on chip and method of operating memory management unit
US16/413,034 2019-05-15
KR1020190062943A KR20200133165A (en) 2019-05-15 2019-05-29 Application processor, system-on chip and method of operating memory management unit
KR10-2019-0062943 2019-05-29

Publications (1)

Publication Number Publication Date
CN111949562A true CN111949562A (en) 2020-11-17

Family

ID=73019033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911013907.0A Pending CN111949562A (en) 2019-05-15 2019-10-23 Application processor, system-on-chip and method for operating memory management unit

Country Status (2)

Country Link
CN (1) CN111949562A (en)
DE (1) DE102019117783A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114676071A (en) * 2022-05-18 2022-06-28 飞腾信息技术有限公司 Data processing method and device, electronic equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115460172B (en) * 2022-08-22 2023-12-05 曙光信息产业股份有限公司 Device address allocation method, device, computer device, medium and program product

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9191441B2 (en) * 2013-03-15 2015-11-17 International Business Machines Corporation Cell fabric hardware acceleration
US10095620B2 (en) * 2016-06-29 2018-10-09 International Business Machines Corporation Computer system including synchronous input/output and hardware assisted purge of address translation cache entries of synchronous input/output transactions

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114676071A (en) * 2022-05-18 2022-06-28 飞腾信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN114676071B (en) * 2022-05-18 2022-08-19 飞腾信息技术有限公司 Data processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
DE102019117783A1 (en) 2020-11-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination