GB2460464A - Memory paging control method using two cache parts, each maintained using a FIFO algorithm - Google Patents

Memory paging control method using two cache parts, each maintained using a FIFO algorithm

Info

Publication number
GB2460464A
GB2460464A GB0809926A
Authority
GB
United Kingdom
Prior art keywords
cache
page
ram
paging
pages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0809926A
Other versions
GB0809926D0 (en)
Inventor
Jonathan Medhurst
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Oyj
Original Assignee
Nokia Oyj
Symbian Software Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Oyj, Symbian Software Ltd filed Critical Nokia Oyj
Priority to GB0809926A priority Critical patent/GB2460464A/en
Publication of GB0809926D0 publication Critical patent/GB0809926D0/en
Priority to PCT/FI2009/050461 priority patent/WO2009144384A1/en
Publication of GB2460464A publication Critical patent/GB2460464A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/12 Replacement control
    • G06F 12/121 Replacement control using replacement algorithms
    • G06F 12/122 Replacement control using replacement algorithms of the least frequently used [LFU] type, e.g. with individual count value
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/12 Replacement control
    • G06F 12/121 Replacement control using replacement algorithms

Abstract

A computing device pages software components into RAM. The device maintains a cache of memory pages which have been paged into RAM. The cache is arranged in two parts, each part operating on a first-in, first-out basis. When a new page is loaded into RAM, it is loaded into the first part of the cache, and a page is then moved from the first part of the cache to the second. When a page in the second part is accessed, it is moved to the first part. The second part of the cache may be inaccessible to the processor. The cache may be maintained to ensure that the ratio of the number of pages in the two parts is a predetermined ratio.

Description

Intellectual Property Office - For Creativity and Innovation. Application No. GB0809926.9 RTM Date: 15 October 2008. The following terms are registered trademarks and should be read as such wherever they occur in this document: "Symbian". The UK Intellectual Property Office is an operating name of The Patent Office.
Memory Paging Control Method and System
Technical Field
The present invention relates to a memory paging control method and system, and in particular to such a method and system which controls which pages are maintained in a paging cache in the executable memory of a computing device.
Background to the Invention
Many modern electronic devices make use of operating systems. Modern operating systems can be found on almost anything built from integrated circuits, such as personal computers, Internet servers, cell phones, music players, routers, switches, wireless access points, network storage, game consoles, digital cameras, DVD players, sewing machines, and telescopes. An operating system is the software that manages the sharing of the resources of the device, and provides programmers with an interface to access those resources. An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs on the system. At its most basic, the operating system performs tasks such as controlling and allocating memory, prioritising system requests, controlling input and output devices, facilitating networking, and managing files. An operating system is in essence an interface by which higher level applications can access the hardware of the device.
Many modern electronic devices which make use of operating systems have as their basis a similar physical hardware architecture, making use of an application processor provided with suitable memory which stores the device operating system, as well as the higher level application programs which determine the functionality of the device. The operating system and other programs are typically stored in non-volatile Read-Only Memory, and the operating system is loaded first, to allow the application processor to then run the higher level application programs. One very common modern electronic device which makes use of an operating system is a smartphone, the generic hardware architecture for which is shown in Figure 1.
With reference to Figure 1, a typical smartphone 10 comprises hardware to perform the telephony functions, together with an application processor and corresponding support hardware to enable the phone to have the other functions desired of a smartphone, such as messaging, calendar, word processing functions and the like. In Figure 1 the telephony hardware is represented by the RF processor 102, which provides an RF signal to antenna 126 for the transmission of telephony signals, and receives signals therefrom. Additionally provided is baseband processor 104, which provides signals to and receives signals from the RF processor 102. The baseband processor 104 also interacts with a subscriber identity module 106, as is well known in the art. The telephony subsystem of the smartphone 10 is beyond the scope of the present invention.
Also typically provided is a display 116, and a keypad 118. These are controlled by an application processor 108, which is often a separate integrated circuit from the baseband processor 104 and RF processor 102, although in the future it is anticipated that single chip solutions will become available. A power and audio controller 120 is provided to supply power from a battery (not shown) to the telephony subsystem, the application processor, and the other hardware. Additionally, the power and audio controller 120 also controls input from a microphone 122, and audio output via a speaker 124.
In order for the application processor 108 to operate, various different types of memory are often provided. Firstly, the application processor 108 may be provided with some Random Access Memory (RAM) 112 into which data and program code can be written, and from which they can be read, at will. Code placed anywhere in RAM can be executed by the application processor 108 from the RAM.
Separate user memory 110 is also often provided, which is used to store user data, such as user application programs (typically higher layer application programs which determine the functionality of the device), as well as user data files, and the like.
As mentioned previously, in order for the application processor 108 to operate, an operating system is necessary, which must be started as soon as the smartphone system is first switched on. The operating system code is commonly stored in a Read-Only Memory, and in modern devices, the Read-Only Memory is often NAND Flash ROM 114. The ROM will store the necessary operating system components in order for the device 10 to operate, but other software programs may also be stored, such as application programs, and the like, and in particular those application programs which are mandatory to the device, such as, in the case of a smartphone, communications applications and the like. These would typically be the applications which are bundled with the smartphone by the device manufacturer when the phone is first sold. Further applications which are added to the smartphone by the user would usually be stored in the user memory 110.
ROM (Read-Only Memory) traditionally refers to memory devices that physically store data in a way which cannot be modified. These devices also allow direct random access to their contents, and so code can be executed from them directly: code is eXecute-In-Place (XIP). This has the advantage that programs and data in ROM are always available and do not require any action to load them into memory.
In the case of smartphones, a well known operating system is that produced by the present applicant, known as Symbian OS. In Symbian OS, the term ROM has developed the looser meaning of 'data stored in such a way that it behaves as if it is stored in read-only memory'. The underlying media may actually be physically writeable, like RAM or flash memory, but the file system presents a ROM-like interface to the rest of the OS, usually as drive Z:.
The ROM situation is further complicated when the underlying media is not XIP. This is the case for NAND flash, used in many modern devices. Here it is necessary to copy (or shadow) any code in NAND to RAM, where it can be executed in place. The simplest way of achieving this is to copy the entire ROM contents into RAM during system boot and use the Memory Management Unit (MMU) to mark this area of RAM with read-only permissions. The data stored by this method is called the Core ROM image (or just Core image) to distinguish it from other data stored in NAND. The Core image is an XIP ROM and is usually the only one; it is permanently resident in RAM.
Figure 2, layout A shows how the NAND flash 20 is structured in this simple case. All the ROM contents 22 are permanently resident in RAM and any executables in the user data area 24 (usually the C: or D: drive) are copied into RAM as they are needed.
The above method is costly in terms of RAM usage so a more efficient scheme was developed that (broadly speaking) splits the ROM contents into those parts required to boot the OS, and everything else. The former is placed in the Core image as before and the latter is placed into another area called the Read-Only File System (ROFS). Code in ROFS is copied into RAM as it is needed at runtime, at the granularity of an executable (or other whole file), in the same way as executables in the user data area. In Symbian OS the component responsible for doing this is the 'Loader', which is part of the File Server process. Herein, 'executables' means any executable code, including DLL (dynamic link library) functions.
Potentially, there are several ROFS images, for example localisation and/or operator-specific images. Usually, the first one (called the primary ROFS) is combined with the Core image into a single ROM-like interface by what is known as the Composite File System.
Layout B in Figure 2 shows an ordinary Composite File System structure. Here, ROM is divided into the Core Image 32 comprising those components of the OS which will always be loaded into RAM, and the ROFS 34 containing those components which do not need to be continuously present in RAM, but which can be loaded in and out of RAM as required. As mentioned, components in the ROFS 34 are loaded in and out of RAM as whole components when they are required (in the case of loading in) or not required. Comparing this to layout A, it can be seen that layout B is more RAM-efficient because some of the contents of the ROFS 34 are not copied into RAM at any given time. The more unused files there are in the ROFS 34, the greater the RAM saving.
It would, however, be beneficial if even further RAM savings could be made. Virtual memory techniques are known in the art for cases where the combined size of any programs, data and stack exceeds the physical memory available: programs and data are split up into units called pages. The pages which are required to be executed can be loaded into RAM, with the rest of the pages of the program and data stored in non-XIP memory (such as on disk). Demand paging refers to a form of paging where pages are loaded into memory on demand as they are needed, rather than in advance. Demand paging therefore generally relies on page faults occurring to trigger the loading of a page into RAM for execution.
When a page fault occurs, and a new page is to be loaded, a decision needs to be made as to which page to remove from RAM to make room for the new page to be loaded. This is particularly the case where the RAM is full, and the number of pages in RAM cannot simply be increased.
Tanenbaum, A.S., Modern Operating Systems, 2nd ed., Prentice Hall, 2001, at pages 214 to 228, describes several different page replacement algorithms which are already known in the art, such as the Not Recently Used (NRU) algorithm, Least Recently Used (LRU) algorithm, First-In First-Out (FIFO) algorithm, and the second chance algorithm.
The FIFO algorithm simply maintains a list of all pages currently in memory, with the page at the head of the list being the oldest one and the page at the tail the most recent arrival. On a page fault, the page at the head of the list is removed, and the new page added to the tail of the list. Whilst a FIFO page replacement algorithm is therefore easy to implement, it is rarely used because it takes no account of whether the page that is being removed might actually be required again in the near future. In this respect, if the removed page is required again, then a further page fault will be triggered, and the removed page will have to be re-loaded from storage into RAM.
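By way of illustration, a FIFO replacement policy can be sketched in C++ as follows. This is a minimal sketch in which the three-page capacity and the page reference string are invented for the example, and the final access deliberately demonstrates the weakness just described by evicting page 1 immediately after it has been used:
    #include <cstddef>
    #include <deque>
    #include <initializer_list>
    #include <iostream>

    // Minimal sketch of FIFO page replacement: the page at the head of the
    // list is always the oldest and is always the one evicted on a fault.
    int main() {
        const std::size_t kCapacity = 3;    // hypothetical RAM size, in pages
        std::deque<int> fifo;               // front = oldest, back = newest

        for (int page : {1, 2, 3, 1, 4}) {  // hypothetical reference string
            bool resident = false;
            for (int p : fifo)
                if (p == page) resident = true;
            if (!resident) {                // page fault
                if (fifo.size() == kCapacity) {
                    // Note: page 1 is evicted here even though it was
                    // accessed on the previous reference.
                    std::cout << "evict page " << fifo.front() << '\n';
                    fifo.pop_front();       // oldest page removed...
                }
                fifo.push_back(page);       // ...new page added at the tail
                std::cout << "load page " << page << '\n';
            }
        }
        return 0;
    }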
A variation on the FIFO algorithm is the second chance algorithm. Here each page has a flag R which indicates whether the page has been referenced recently (i.e. in the last n clock cycles). The pages are maintained in a FIFO cache, but if the oldest page has its flag set, indicating that it has been accessed recently, then when it comes to the head of the list and hence is subject to removal, instead of being removed it is placed back at the tail of the queue, and hence may work its way back through the FIFO cache. This provides a large performance improvement over the FIFO algorithm, but it is dependent on a page being read as it approaches the end of the queue. Conversely, depending on the length of the queue, a page may be maintained which is not in fact required very often. The problem therefore lies in the fact that the algorithm only looks to give a page a "second chance" as it is about to be deleted from the cache. If that chance is not taken, then the page is lost, and may need to be re-loaded.
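The second chance eviction decision can be sketched as follows; the page identifiers and flag values are invented, and the setting of the flag on each access, which the hardware would normally perform, is omitted:
    #include <deque>
    #include <iostream>
    #include <utility>

    // Sketch of the second chance algorithm: a FIFO in which a page whose
    // referenced flag is set is given "a second chance" by being moved back
    // to the tail with its flag cleared, instead of being evicted.
    int evictSecondChance(std::deque<std::pair<int, bool>>& fifo) {
        for (;;) {
            auto page = fifo.front();
            fifo.pop_front();
            if (page.second) {          // referenced recently:
                page.second = false;    // clear the flag and
                fifo.push_back(page);   // send the page back to the tail
            } else {
                return page.first;      // not referenced: evict this page
            }
        }
    }

    int main() {
        // Hypothetical contents: {page id, referenced flag}.
        std::deque<std::pair<int, bool>> fifo = {{1, true}, {2, false}, {3, true}};
        std::cout << "evicted page " << evictSecondChance(fifo) << '\n'; // page 2
    }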
Another algorithm is the LRU algorithm. Tanenbaum describes the LRU algorithm as "excellent, but difficult to implement". The LRU algorithm approximates well an optimal page replacement algorithm. In this respect, an optimal page replacement algorithm would remove from RAM the page which will not be required again until farthest into the future (if at all), thereby putting off another page fault for as long as possible. The LRU algorithm is based on the premise that pages which have been heavily used in the last few instructions will probably be heavily used again in the next few (and hence should be retained in RAM). However, pages that have not been used for a long time will probably remain unused for a long time, and hence can be replaced. An LRU algorithm therefore tries to replace the page that has been unused for the longest time.
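For comparison, a minimal sketch of a list-based software LRU is given below; the page numbers are invented. The point to note is that the list must be reordered on every single memory access, which is precisely the overhead that makes a faithful software LRU expensive:
    #include <algorithm>
    #include <iostream>
    #include <list>

    // Sketch of list-based LRU: every access moves the page to the front,
    // so the least recently used page is always at the back. Performing
    // this reordering on each memory access is the cost referred to above.
    int main() {
        std::list<int> lru = {3, 2, 1};    // front = most recent, back = least

        int accessed = 1;                  // hypothetical access to page 1
        auto it = std::find(lru.begin(), lru.end(), accessed);
        if (it != lru.end())
            lru.splice(lru.begin(), lru, it);  // move to the front, no copy

        std::cout << "evict candidate: page " << lru.back() << '\n'; // page 2
    }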
As mentioned, whilst LRU algorithms provide a good approximation to the optimal page replacement algorithm, as acknowledged by Tanenbaum they are difficult to implement, requiring in some cases special hardware, or for software implementations, large processing overheads. It would therefore be beneficial if a further paging algorithm could be found, which has the benefits of LRU, but which is easier to implement with smaller processing overhead.
Summary of the Invention
Embodiments of the invention provide a new page replacement algorithm which provides substantially the benefits of an LRU algorithm but with the implementation simplicity of a FIFO arrangement. In particular, according to embodiments of the invention pages which have been paged into memory are maintained in a paging cache, which is split into at least two parts. Each part is maintained as a separate FIFO, and new pages are loaded into the first part, with pages output from the first part being placed into the second part. Hence, essentially the arrangement is no more complex than simply maintaining two FIFO lists. However, to ensure that pages are not immediately deleted from the cache when they reach the output of the first part of the cache, they are retained in the second part of the cache, and whilst they are working their way through the second part of the cache they can be re-instated into the input of the first part of the cache whenever required. Such operation approximates a least recently used type algorithm, because whilst in the first part of the cache the page can be used by the processor. However, when the page is relegated to the second part of the cache, if it is needed again whilst in the second part of the cache it can be immediately promoted to the first part of the cache, and will not need to be re-loaded. On the other hand, if the page is not required whilst it is resident in the second part of the cache i.e. it is not recently used, then it will eventually be paged out. The second part of the cache therefore provides a buffer which acts to sort out which pages are recently used and which are not.
In view of the above, from a first aspect there is provided a memory paging control method for use in a computing device having RAM and software components which are paged into said RAM for use by a processor of the computing device, the method comprising: maintaining a paging cache of memory pages which have been paged into RAM, the paging cache being arranged in at least a first part and a second part, each part being configured to operate on a respective First-In, First-Out (FIFO) basis; loading a new page into RAM, the new page being loaded into the first part of the cache; outputting a page from the first part of the cache into the second part of the cache; and when a page is required which is located in the second part of the cache, the required page is removed from the second part of the cache and placed into the first part of the cache.
Therefore, the present invention provides a paging control method which is relatively straightforward to implement, but which allows, by virtue of the splitting of the cache into two parts, for a determination to be made as to whether in fact a page has been recently used and should therefore be maintained in the cache, or whether the page can be paged out.
Preferably, the second part of the cache is inaccessible to the processor, wherein movement of the required page from the second part of the cache into the first part of the cache renders the page accessible. By making the second part of the cache inaccessible, the memory controller in charge of the process has to move the required page into the first part of the cache, and hence the page is rejuvenated, and will not be subject to relegation into the second part of the cache until it has worked through the first part of the cache. Hence, once rejuvenated, a page is available in the cache for some time.
Preferably the relative sizes of the first and second parts of the cache are maintained in dependence on a predetermined ratio coefficient R. This provides a mechanism to ensure that the first part of the cache does not become too large, and hence that pages move through the first and second parts of the cache lists in accordance with the control algorithm. In particular, a page is removed from the first part of the cache and placed in the second part of the cache if the number of pages in the first part of the cache is greater than the product of the ratio coefficient R and the number of pages in the second part of the cache.
In the embodiment, a page is removed from the second part of the cache when it is necessary to free RAM for another purpose. Hence, whilst there is free RAM in the system, a page which has been relegated into the second part of the cache remains in the cache, and the cache simply grows large. This means that if a relegated page is required, it can be simply transferred back into the first part of the cache, and does not need to be re-loaded. By only reducing the size of the second part of the cache when RAM is required for another purpose, as many pages as possible can be maintained as available for transferring back into the first part of the cache, and the need to re-load pages can be kept to a minimum.
Preferably, when a page is accessed whilst in the first part of the cache the position of the accessed page in the first part of the cache is not altered. Hence, even though accessed, the page stays in its same position in the FIFO queue. The reason for this is to prevent too much processing overhead in maintaining the FIFO queue. Whilst in the first part of the cache the page can be accessed; it does not matter where in the FIFO queue the page is located. If the page is required again after it has passed through the FIFO queue in the first part of the cache and been relegated, it can simply be re-instated at the front of the first part of the cache. In this way, the only FIFO operations are placing pages at the start of the FIFO queue in the first part of the cache, and no queue re-ordering is required.
Preferably, there are plural pages in the first and second parts of the paging cache, and more preferably the paging cache is maintained at a minimum size. Maintaining the paging cache at a minimum size prevents a phenomenon known as "thrashing" where pages are being continually paged out from and re-loaded back into RAM. This is very detrimental to processing efficiency, as re-loading a page takes a substantial amount of time (e.g. a few milliseconds, to reload from disk).
From another aspect the invention provides a computer program or suite of computer programs so arranged such that when executed by a computer it/they cause the computer to operate in accordance with the method of the first aspect. Additionally, the invention also provides a computer readable storage medium storing such a computer program or at least one of the suite of computer programs.
From a yet further aspect there is also provided a memory paging control system for use in a computing device having RAM and software components which are paged into said RAM for use by a processor of the computing device, the system comprising: a paging cache of memory pages which have been paged into RAM, the paging cache being arranged in at least a first part and a second part, each part being configured to operate on a respective First-In, First-Out (FIFO) basis; and a memory control unit arranged in use to perform the following: i) load a new page into RAM, the new page being loaded into the first part of the cache; ii) output a page from the first part of the cache into the second part of the cache; and iii) when a page is required which is located in the second part of the cache, remove the required page from the second part of the cache and place it into the first part of the cache.
Within this further aspect the same advantages, and same further features and associated advantages can be obtained as in respect of the first aspect.
Further features and aspects will be apparent from the appended claims.
Brief Description of the Drawings
Further features and advantages of the present invention will become apparent from the following description of embodiments thereof, presented by way of example only, and with reference to the accompanying drawings, wherein like reference numerals refer to like parts, and wherein:
Figure 1 is a block diagram of a typical smartphone architecture of the prior art;
Figure 2A is a diagram illustrating a memory layout forming background to the invention;
Figure 2B is a diagram illustrating a memory layout forming background to the invention;
Figure 2C is a diagram illustrating a memory layout according to an embodiment of the invention;
Figure 3 is a diagram illustrating how paged data can be paged into RAM;
Figure 4 is a diagram illustrating a paging cache;
Figure 5 is a diagram illustrating how a new page can be added to the paging cache;
Figure 6 is a diagram illustrating how pages can be aged within a paging cache;
Figure 7 is a diagram illustrating how aged pages can be rejuvenated in a paging cache;
Figure 8 is a diagram illustrating how a page can be paged out of the paging cache; and
Figure 9 is a diagram illustrating the RAM savings obtained using demand paging.
Description of the Embodiments
A preferred embodiment of the invention to be described is based upon the smartphone architecture discussed in the introduction, and in particular a smartphone running Symbian OS. Within Symbian OS, as mentioned, the part of the operating system which is responsible overall for loading programs and data from non-XIP memory into RAM is the 'loader'. Many further details of the prior art operation of the loader can be found in Sales, J., Symbian OS Internals, John Wiley & Sons, 2005, and in particular chapter 10 thereof, the entire contents of which are incorporated herein by reference. Within the embodiment to be described the operation of the loader is modified to allow demand paging techniques to be used within the framework of Symbian OS.
In particular, according to the preferred embodiment, a smartphone is provided having a composite file system as previously described, wherein the Composite File System (CFS) provides a Core Image comprising those components of the OS which will always be loaded into RAM, and the ROFS containing those components which do not need to be continuously present in RAM, but which can be loaded in and out of RAM as required. In order to reduce the RAM requirement of the smartphone, within the embodiment the principles of virtual memory are applied to the core image, to allow data and programs to be paged in and out of memory as they are and are not required. By using virtual memory techniques such as this, RAM savings can be made, and the overall hardware cost of a smartphone reduced.
Since an XIP ROM image on NAND is actually stored in RAM, an opportunity arises to demand page the contents of the XIP ROM. That is, read its data contents from NAND flash into RAM (where it can be executed), on demand. This is called XIP ROM Paging (or demand paging). Here, "paging" refers to reading required segments ("pages") of executable code into RAM as they are required, at a finer granularity than that of the entire executable. Typically, page size may be around 4kB; that is, code can be read in and out of RAM as required in 4kB chunks. A single executable may comprise a large number of pages (illustratively, a 100kB executable would span twenty-five 4kB pages). Paging is therefore very different from the operation of the ROFS, for example, wherein whole executables are read in and out of RAM as they are required to be run.
In the embodiment of the invention an XIP ROM image is split into two parts, one containing unpaged data and one containing data paged on demand. Unpaged data comprises those executables and other data which cannot be split up into pages; it consists of kernel-side code plus those parts that should not be paged for other reasons (e.g. performance, robustness, power management, etc.). The terms 'locked down' or 'wired' are also used to mean unpaged. Paged data comprises those executables and other data which can be split up into pages.
At boot time, the unpaged area at the start of the XIP ROM image is loaded into RAM as normal, but the linear address region normally occupied by the paged area is left unmapped, i.e. no RAM is allocated for it.
When a thread accesses memory in the paged area, it takes a page fault. The page fault handler code in the kernel then allocates a page of RAM and reads the contents for this page from the XIP ROM image held on storage media (e.g. NAND flash). As mentioned, a page is a convenient unit of memory allocation, usually 4kB. The thread then continues execution from the point where it took the page fault. This process is called 'paging in' and is described in more detail later.
When the free RAM on the system reaches zero, memory allocation requests can be satisfied by taking RAM from the paged-in XIP ROM region. As RAM pages in the XIP ROM region are unloaded, they are said to be 'paged out'. Figure 3 shows the operations just described.
Note that all content in the paged data area of an XIP ROM is subject to paging, not just executable code; accessing any file in this area may induce a page fault. A page may contain data from one or more files and page boundaries do not necessarily coincide with file boundaries.
Figure 2, layout C shows a typical XIP ROM paging structure according to the present embodiment. Here, ROM 40 comprises an unpaged core area 42 containing those components which should not be paged, and a paged core area 44 containing those components which should reside in the core image rather than the ROFS, but which can be paged. ROFS 46 then contains those components which do not need to be in the Core image. Although the unpaged area of the Core image may be larger than the total Core image in layout B, only a fraction of the contents of the paged area needs to be copied into RAM compared to the amount of loaded ROFS code in layout B. Further details of the algorithm which controls demand paging will now be described.
All memory content that can be demand paged is said to be 'paged memory' and the process is controlled by the 'paging subsystem'. A page is typically a 4kB block of RAM, as mentioned, although in different systems other page sizes can be used. Here are some other terms that are used: 1. Live Page - a page of paged memory whose contents are currently available.
2. Dead Page - a page of paged memory whose contents are not currently available.
3. Page In - the act of making a dead page into a live page.
4. Page Out - the act of making a live page into a dead page. The RAM used to store the contents of the page may then be reused for other purposes.
Efficient performance of the paging subsystem is dependent on the algorithm that selects which pages are live at any given time, or conversely, which live pages should be made dead. The paging subsystem approximates a Least Recently Used (LRU) algorithm for determining which pages to page out. The memory management unit 28 (MMU) provided in the device, being the hardware (and sometimes software) component which has overall responsibility for the proper operation of the device memory, and in particular for allowing the application processor to write to or read from the memory, is part of the paging subsystem.
The paging algorithm according to the present embodiment provides a "live page list". All live pages are stored on the 'live page list', which is an integral part of the paging cache. Figure 4 shows the live page list. The live page list is split into two sub-lists, one containing young pages (the "young page list" 72) and the other, old pages (the "old page list" 74). The memory management unit (MMU) 58 in the device is used to make all young pages accessible to programs but the old pages inaccessible. However, the contents of old pages are preserved and they still count as being live. The net effect is of a FIFO (first-in, first-out) list in front of an LRU list, which results in less page churn than a plain LRU.
Figure 5 shows what happens when a page is "paged in". When a page is paged in, it is added to the start of the young list 72 in the live page list, making it the youngest.
The paging subsystem attempts to keep the ratio of the sizes of the two lists equal to a value called the young/old ratio. If this ratio is R, the number of young pages is Ny and the number of old pages is No, then if Ny > R × No, a page is taken from the end of the young list 72 and placed at the start of the old list 74. This process is called ageing, and is shown in Figure 6.
If an old page is accessed by a program, this causes a page fault because the MMU has marked old pages as inaccessible. The paging subsystem then turns that page into a young page (i.e. rejuvenates it), and at the same time turns the last young page into an old page. This is shown in Figure 7, wherein the old page to be accessed is taken from the old list 74 and added to the young list 72, and the last (oldest) young page is aged from the young list 72 to the old list 74.
When the operating system requires more RAM for another purpose then it may need to obtain the memory used by a live page. In this case the 'oldest' live page is selected for paging out, turning it into a dead page, as shown in Figure 8. If paging out leaves too many young pages, according to the young/old ratio, then the last young page (e.g. Page D in Figure 8) would be aged. In this way, the young/old ratio helps to maintain the stability of the paging algorithm, and ensures that there are always some pages in the old list.
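The young/old list operations shown in Figures 5 to 8 can be drawn together in the following illustrative C++ sketch; the integer page identifiers, the ratio value of 3 and the method names are invented, and the MMU permission changes that would accompany each move are indicated only by comments:
    #include <cstddef>
    #include <deque>
    #include <initializer_list>
    #include <iostream>

    // Illustrative sketch of the young/old live page list. Both sub-lists
    // are FIFOs: pages enter at the front and leave from the back.
    struct LivePageList {
        std::deque<int> young;          // accessible to programs
        std::deque<int> old_;           // marked inaccessible by the MMU
        std::size_t R = 3;              // hypothetical young/old ratio

        // Ageing (Figure 6): while Ny > R * No, demote the oldest young page.
        void Balance() {
            while (!young.empty() && young.size() > R * old_.size()) {
                old_.push_front(young.back()); // MMU: mark page inaccessible
                young.pop_back();
            }
        }
        // Paging in (Figure 5): a newly loaded page becomes the youngest.
        void PageIn(int page) { young.push_front(page); Balance(); }

        // Rejuvenation (Figure 7): an old page that is accessed again is
        // promoted to youngest, and the lists are then re-balanced.
        void Rejuvenate(int page) {
            for (auto it = old_.begin(); it != old_.end(); ++it)
                if (*it == page) { old_.erase(it); break; }
            young.push_front(page);     // MMU: mark page accessible
            Balance();
        }
        // Paging out (Figure 8): the oldest live page is freed; assumes the
        // old list is not empty, as the ratio rule keeps pages in it.
        int PageOut() {
            int victim = old_.back();
            old_.pop_back();
            Balance();                  // keep the ratio stable
            return victim;
        }
    };

    int main() {
        LivePageList cache;
        for (int p : {1, 2, 3, 4, 5}) cache.PageIn(p);
        cache.Rejuvenate(1);            // an old page is accessed again
        std::cout << "paged out: page " << cache.PageOut() << '\n'; // page 2
    }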
When a program attempts to access paged memory that is 'dead', a page fault is generated by the MMU and the executing thread is diverted to the Symbian OS exception handler. This performs the following tasks: 1. Obtain a page of RAM from the system's pool of unused RAM (i.e. the 'free pool'), or if this is empty, page out the oldest live page and use that instead.
2. Read the contents for this page from some media (e.g. NAND flash).
3. Update the paging cache's live list as described previously.
4. Use the MMU to make this RAM page accessible at the correct linear address.
5. Resume execution of the program's instructions, starting with the one that caused the initial page fault.
Note the above actions are executed in the context of the thread that tries to access the paged memory.
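By way of illustration, the five steps above can be gathered into the following sketch; every helper here (PageFromFreePool, PageOutOldestLivePage, ReadFromMedia, AddToYoungList, MapPage) is a hypothetical stub invented for this example and is not a Symbian OS API:
    #include <cstdio>

    // Hypothetical stubs standing in for kernel facilities.
    static char gPage[4096];                      // stand-in for a RAM page
    static void* PageFromFreePool()          { return nullptr; /* pool empty */ }
    static void* PageOutOldestLivePage()     { std::puts("page out oldest live page"); return gPage; }
    static void  ReadFromMedia(void*, void*) { std::puts("read 4kB from NAND flash"); }
    static void  AddToYoungList(void*)       { std::puts("add page to young list"); }
    static void  MapPage(void*, void*)       { std::puts("MMU: map page at faulting address"); }

    // The five steps of servicing a page fault, in the order given above.
    static void* ServicePageFault(void* faultAddress) {
        void* ram = PageFromFreePool();      // 1. take a page from the free pool,
        if (ram == nullptr)
            ram = PageOutOldestLivePage();   //    or page out the oldest live page
        ReadFromMedia(faultAddress, ram);    // 2. read the contents from media
        AddToYoungList(ram);                 // 3. update the live page list
        MapPage(faultAddress, ram);          // 4. make the page accessible
        return ram;                          // 5. the faulting thread resumes
    }

    int main() { ServicePageFault(nullptr); }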
When the system requires more RAM and the free pool is empty, then RAM that is being used to store paged memory is freed up for use. This is called 'paging out' and happens by the following steps: 1. Remove the 'oldest' RAM page from the paging cache.
2. Use the MMU to mark the page as inaccessible.
3. Return the RAM page to the free pool.
Having described the demand paging algorithm in detail, discussion will now be undertaken of the benefits provided thereby, and of further performance improvements that can be obtained by the appropriate setting of parameters relating to the demand paging algorithm.
Although the primary purpose of demand paging is to save RAM, there are at least two other potential benefits that may be observed. These benefits are highly dependent on the paging configuration, discussed later.
A first performance benefit is due to so-called "lazy loading". In general, the cost of servicing a page fault means that paging has a negative impact on performance. However, in some cases demand paging (DP) actually improves performance compared with the non-DP composite file system case (Figure 2, layout B), especially when the use-case normally involves loading a large amount of code into RAM (e.g. when booting or starting up large applications). In these cases, the performance overhead of paging can be outweighed by the performance gain of loading less code into RAM. This is sometimes known as 'lazy loading' of code.
Note that when the non-DP case consists of a large core image (i.e. something closer to Figure 2, layout A), most or all of the code involved in a use-case will already be permanently loaded, and so the performance improvement of lazy loading will be reduced. The exception to this is during boot, where the cost of loading the whole core image into RAM contributes to the overall boot time.
A second performance improvement lies in improved stability of the device. The stability of a device is often at its weakest in Out Of Memory (OOM) situations. Poorly written code may not cope well with exceptions caused by failed memory allocations.
As a minimum, an OOM situation will degrade the user experience.
If DP is enabled on a device and the same physical RAM is available compared with the non-DP case, the increased RAM saving makes it more difficult for the device to go OOM, avoiding many potential stability issues. Furthermore, the RAM saving achieved by DP is proportional to the amount of code loaded in the non-DP case at a particular time. For instance, the RAM saving when 5 applications are running is greater than the saving immediately after boot. This makes it even harder to induce an OOM situation.
Note this increased stability only applies when the entire device is OOM. Individual threads may have OOM problems due to reaching their own heap limits. DP will not help in these cases.
In addition to the above described benefits per se of demand paging, as mentioned, further performance improvements may be obtained in dependence on the demand paging configuration. In particular, demand paging introduces three new configurable parameters to the system. These are: 1. The amount of code and data that is marked as unpaged.
2. The minimum size of the paging cache.
3. The ratio of young pages to old pages in the paging cache.
The first two are the most important and they are discussed below. The third has a less dramatic effect on the system and should be determined empirically.
With respect to the amount of unpaged files, it is important that all areas of the OS involved in servicing a paging fault are protected from blocking on the thread that took the paging fault (directly or indirectly). Otherwise, a deadlock situation may occur. This is partly achieved in Symbian OS by ensuring that all kernel-side components are always unpaged.
In addition to kernel-side components, there are likely to be a number of components that are explicitly made unpaged to meet the functional and performance requirements of the device. The performance overhead of servicing a page fault is unbounded and variable so some critical code paths may need to be protected by making files unpaged.
It may be necessary to make chains of files and their dependencies unpaged to achieve this. It may be possible to reduce the set of unpaged components by breaking unnecessary dependencies and separating critical code paths from non-critical ones.
Whilst making a component unpaged is a straightforward performance/RAM trade-off, this can be made configurable, allowing the device manufacturer to make the decision based on their system requirements.
With respect to the paging cache size, as described previously, if the system requires more free RAM and the free RAM pool is empty, then pages are removed from the paging cache in order to service the memory allocation. This cannot continue indefinitely or a situation will arise where the same pages are continually paged in and out of the paging cache; this is known as page thrashing. Performance is dramatically reduced in this situation.
To avoid catastrophic performance loss due to thrashing, within the embodiment a minimum paging cache size can be defined. If a system memory allocation would cause the paging cache to drop below the minimum size, then the allocation fails.
As paged data is paged in, the paging cache grows but any RAM used by the cache above the minimum size does not contribute to the amount of used RAM reported by the system. Although this RAM is really being used, it will be recycled whenever anything else in the system requires the RAM. So the effective RAM usage of the paging cache is determined by its minimum size.
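The minimum-size rule described above amounts to a simple admission check on allocations, sketched below; the structure, field names and page counts are invented for illustration:
    #include <cstddef>
    #include <iostream>

    // Sketch of the minimum paging cache size rule: a memory allocation that
    // would shrink the cache below its configured minimum is refused, rather
    // than being allowed to induce thrashing.
    struct PagingCache {
        std::size_t livePages;   // current number of live pages in the cache
        std::size_t minPages;    // configured minimum cache size, in pages

        // Returns true if `wanted` pages may be reclaimed from the cache.
        bool CanReclaim(std::size_t wanted) const {
            return livePages >= minPages + wanted;
        }
    };

    int main() {
        PagingCache cache{64, 60};                 // hypothetical sizes
        std::cout << cache.CanReclaim(4) << '\n';  // 1: drops exactly to minimum
        std::cout << cache.CanReclaim(5) << '\n';  // 0: the allocation must fail
    }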
In theory, it is also possible to limit the maximum paging cache size. However, this is not useful in production devices because it prevents the paging cache from using all the otherwise unused RAM in the system. This may negatively impact performance for no effective RAM saving.
By setting such a minimum paging cache size, thrashing can be prevented. In this respect, the minimum paging cache size relates to a minimum number of pages which should be in the paging cache at any one moment. In the present embodiment the pages in the paging cache are divided between the young list and the old list. This is not essential, however, and in other embodiments the paging cache may not be divided, or may be further subdivided into more than two lists. The important point in preventing thrashing, however, is the overall minimum size of the cache, and in particular that the pages therein are available without having to be re-loaded into memory.
Overall the main advantage of using DP is therefore the RAM saving which is obtained.
The easiest way to visualise the RAM saving achieved by DP is to compare the most simplistic configurations. Consider a non-DP ROM consisting of a Core with no ROFS (as in Figure 2, layout A). Compare that with a DP ROM consisting of an XIP ROM paged Core image, again with no ROFS (similar to Figure 2, layout C but without the ROFS). The total ROM contents are the same in both cases. Here the effective RAM saving is depicted by Figure 9. The effective RAM saving is the size of all paged components minus the minimum size of the paging cache (illustratively, 20MB of paged components with a 2MB minimum paging cache would yield an 18MB saving). Note that when a ROFS section is introduced, this calculation is much more complicated because the contents of the ROFS are likely to be different between the non-DP and DP cases.
The RAM saving can be increased by reducing the set of unpaged components and/or reducing the minimum paging cache size (i.e. making the configuration more 'stressed'). Performance can be improved (up to a point) by increasing the set of unpaged components and/or increasing the minimum paging cache size (i.e. making the configuration more 'relaxed'). However, if the configuration is made too relaxed then it is possible to end up with a net RAM increase compared with a non-DP ROM.
Demand paging is therefore able to present significant advantages in terms of RAM savings, and hence provides an attendant reduction in the manufacturing cost of a device. Additionally, as mentioned above, depending on the configuration, performance improvements can also be obtained.
Whilst within the above described embodiment we have focussed on the device being a smartphone, it should be understood that in other embodiments different types of device may be provided, for various different functions. For example, the techniques of the present invention may be used in embodiments with different applications, such as, for example, a general purpose computer, a portable media player, or another AV device such as a camera. Any device or machine which incorporates a computing device provided with RAM into which data and programs need to be loaded for execution may benefit from the invention and constitute an embodiment thereof. The invention may therefore be applied in many fields, to provide improved devices or machines that require less RAM to operate than had heretofore been the case.
In addition, whilst the preferred embodiment has been described in respect of a smartphone running Symbian OS, which makes use of a composite file system, it should be further understood that this is presented for explanation only, and in other embodiments the concepts of the demand paging algorithm described herein may be used in other devices, and in particular devices which do not require a split file system such as the composite file system described. Instead, the demand paging algorithm herein described may be used in any device in which virtual memory techniques involving paging programs and data into memory for use by a processor may be used.
Various further modifications, including additions and deletions, will be apparent to the skilled person to provide further embodiments, any and all of which are intended to fall within the appended claims.

Claims (18)

  1. A memory paging control method for use in a computing device having RAM and software components which are paged into said RAM for use by a processor of the computing device, the method comprising: maintaining a paging cache of memory pages which have been paged into RAM, the paging cache being arranged in at least a first part and a second part, each part being configured to operate on a respective First-In, First-Out (FIFO) basis; loading a new page into RAM, the new page being loaded into the first part of the cache; outputting a page from the first part of the cache into the second part of the cache; and when a page is required which is located in the second part of the cache, the required page is removed from the second part of the cache and placed into the first part of the cache.
  2. A method as claimed in claim 1, wherein the second part of the cache is inaccessible to the processor, and wherein movement of said required page from the second part of the cache into the first part of the cache renders the page accessible.
  3. A method according to any of the preceding claims, wherein the relative sizes of the first and second parts of the cache are maintained in dependence on a predetermined ratio coefficient R.
  4. A method according to claim 3, wherein a page is removed from the first part of the cache and placed in the second part of the cache if the number of pages in the first part of the cache is greater than the product of the ratio coefficient R and the number of pages in the second part of the cache.
  5. A method according to any of the preceding claims wherein a page is removed from the second part of the cache when it is necessary to free RAM for another purpose.
  6. A method according to any of the preceding claims, wherein when a page is accessed whilst in the first part of the cache the position of the accessed page in the first part of the cache is not altered.
  7. A method according to any of the preceding claims, wherein there are plural pages in the first and second parts of the paging cache.
  8. A method according to any of the preceding claims, wherein the paging cache is maintained at a minimum size.
  9. A computer program or suite of computer programs so arranged such that when executed by a computer it/they cause the computer to operate in accordance with the method of any of the preceding claims.
  10. A computer readable storage medium storing a computer program or at least one of the suite of computer programs according to claim 9.
  11. A memory paging control system for use in a computing device having RAM and software components which are paged into said RAM for use by a processor of the computing device, the system comprising: a paging cache of memory pages which have been paged into RAM, the paging cache being arranged in at least a first part and a second part, each part being configured to operate on a respective First-In, First-Out (FIFO) basis; and a memory control unit arranged in use to perform the following: i) load a new page into RAM, the new page being loaded into the first part of the cache; ii) output a page from the first part of the cache into the second part of the cache; and iii) when a page is required which is located in the second part of the cache, remove the required page from the second part of the cache and place it into the first part of the cache.
  12. A system as claimed in claim 11, wherein the second part of the cache is inaccessible to the processor, and wherein movement of said required page from the second part of the cache into the first part of the cache renders the page accessible.
  13. A system according to any of claims 11 or 12, wherein the relative sizes of the first and second parts of the cache are maintained in dependence on a predetermined ratio coefficient R.
  14. A system according to claim 13, wherein a page is removed from the first part of the cache and placed in the second part of the cache if the number of pages in the first part of the cache is greater than the product of the ratio coefficient R and the number of pages in the second part of the cache.
  15. A system according to any of claims 11 to 14 wherein a page is removed from the second part of the cache when it is necessary to free RAM for another purpose.
  16. A system according to any of claims 11 to 15, wherein when a page is accessed whilst in the first part of the cache the position of the accessed page in the first part of the cache is not altered.
  17. A system according to any of claims 11 to 16, wherein there are plural pages in the first and second parts of the paging cache.
  18. A system according to any of claims 11 to 17, wherein the paging cache is maintained at a minimum size.
GB0809926A 2008-05-30 2008-05-30 Memory paging control method using two cache parts, each maintained using a FIFO algorithm Withdrawn GB2460464A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB0809926A GB2460464A (en) 2008-05-30 2008-05-30 Memory paging control method using two cache parts, each maintained using a FIFO algorithm
PCT/FI2009/050461 WO2009144384A1 (en) 2008-05-30 2009-06-01 Memory paging control method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0809926A GB2460464A (en) 2008-05-30 2008-05-30 Memory paging control method using two cache parts, each maintained using a FIFO algorithm

Publications (2)

Publication Number Publication Date
GB0809926D0 GB0809926D0 (en) 2008-07-09
GB2460464A true GB2460464A (en) 2009-12-02

Family

ID=39637928

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0809926A Withdrawn GB2460464A (en) 2008-05-30 2008-05-30 Memory paging control method using two cache parts, each maintained using a FIFO algorithm

Country Status (2)

Country Link
GB (1) GB2460464A (en)
WO (1) WO2009144384A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11366764B2 (en) 2020-09-29 2022-06-21 International Business Machines Corporation Managing a least-recently-used data cache with a persistent body

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5261066A (en) * 1990-03-27 1993-11-09 Digital Equipment Corporation Data processing system and method with small fully-associative cache and prefetch buffers
US20030084251A1 (en) * 2001-10-31 2003-05-01 Gaither Blaine D. Computer performance improvement by adjusting a time used for preemptive eviction of cache entries

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02273843A (en) * 1989-04-14 1990-11-08 Nec Corp Swapping device
US20030110357A1 (en) * 2001-11-14 2003-06-12 Nguyen Phillip V. Weight based disk cache replacement method
US6910106B2 (en) * 2002-10-04 2005-06-21 Microsoft Corporation Methods and mechanisms for proactive memory management
US7543123B2 (en) * 2005-11-07 2009-06-02 International Business Machines Corporation Multistage virtual memory paging system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5261066A (en) * 1990-03-27 1993-11-09 Digital Equipment Corporation Data processing system and method with small fully-associative cache and prefetch buffers
US20030084251A1 (en) * 2001-10-31 2003-05-01 Gaither Blaine D. Computer performance improvement by adjusting a time used for preemptive eviction of cache entries

Also Published As

Publication number Publication date
WO2009144384A1 (en) 2009-12-03
GB0809926D0 (en) 2008-07-09

Similar Documents

Publication Publication Date Title
US7962684B2 (en) Overlay management in a flash memory storage device
JP5422652B2 (en) Avoiding self-eviction due to dynamic memory allocation in flash memory storage
US11360884B2 (en) Reserved memory in memory management system
CN1877548A (en) Method and system for management of page replacement
WO2010144832A1 (en) Partitioned replacement for cache memory
CN109313604B (en) Computing system, apparatus, and method for dynamic configuration of compressed virtual memory
US8930732B2 (en) Fast speed computer system power-on and power-off method
US20110208916A1 (en) Shared cache controller, shared cache control method and integrated circuit
CN114546634B (en) Management of synchronous restart of system
US9318167B2 (en) Information processing apparatus
US9063868B2 (en) Virtual computer system, area management method, and program
TWI399637B (en) Fast switch machine method
US9317438B2 (en) Cache memory apparatus, cache control method, and microprocessor system
GB2461499A (en) Loading software stored in two areas into RAM, the software in a first area is loaded whole and from a second it is demand paged loaded.
GB2460464A (en) Memory paging control method using two cache parts, each maintained using a FIFO algorithm
CN112654965A (en) External paging and swapping of dynamic modules
KR100994723B1 (en) selective suspend resume method of reducing initial driving time in system, and computer readable medium thereof
US20090031100A1 (en) Memory reallocation in a computing environment
JP5870043B2 (en) Start control device, information device, and start control method
WO2009144386A1 (en) Method and apparatus for storing software components in memory
GB2460462A (en) Method for loading software components into RAM by modifying the software part to be loaded based on the memory location to be used.
JP4088763B2 (en) Computer system, hardware / software logic suitable for the computer system, and cache method
KR102563648B1 (en) Multi-processor system and method of operating the same
US11907761B2 (en) Electronic apparatus to manage memory for loading data and method of controlling the same
US20230161616A1 (en) Communications across privilege domains within a central processing unit core

Legal Events

Date Code Title Description
COOA Change in applicant's name or ownership of the application

Owner name: NOKIA CORPORATION

Free format text: FORMER OWNER: SYMBIAN SOFTWARE LTD

WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)