WO2009144385A1 - Memory management method and apparatus - Google Patents

Memory management method and apparatus

Info

Publication number
WO2009144385A1
Authority
WO
WIPO (PCT)
Prior art keywords
ram
page
memory
paging
pages
Application number
PCT/FI2009/050463
Other languages
French (fr)
Inventor
Jonathan Medhurst
Jonathan Coppeard
Original Assignee
Nokia Corporation
Application filed by Nokia Corporation filed Critical Nokia Corporation
Publication of WO2009144385A1 publication Critical patent/WO2009144385A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 - Addressing or allocation; Relocation
    • G06F 12/0223 - User address space allocation, e.g. contiguous or non-contiguous base addressing
    • G06F 12/023 - Free address space management
    • G06F 12/0238 - Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F 12/0246 - Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory

Definitions

  • Embodiments of the present invention relate to a method and apparatus to provide virtual memory in a device in which programs and data are required to be loaded into memory for use by a processor unit, and in particular to such a method and apparatus for use in a device where some of the data and programs must be modified as they are paged into memory.
  • Pages are predefined quantities of memory space, and they can act as a unit of memory size in the context of storing or loading code or data into memory locations.
  • Demand paging is a technique which involves loading pages of code or data into memory on demand, i.e. based on when they are required for a processing operation.
  • a method comprising: storing first software components in a first storage medium, the first software components being divided into memory pages; and then when at least part of a first software component is required by a processor, performing the following:- a) determining a location in random access memory (RAM) into which at least the required part can be loaded for use by the processor; b) modifying at least the required part of the first software component in dependence on the determined RAM location; and c) loading at least the modified required part of the first software component into RAM at the determined location.
  • the present invention provides apparatus comprising: a processor; a first storage medium storing first software components, the first software components being divided into memory pages; and a loader for loading software components into random access memory (RAM) for use by the processor; wherein when at least part of a first software component is required by the processor, the loader performs the following:- a) determining a location in RAM into which at least the required part can be loaded; b) modifying at least the required part of the first software component in dependence on the determined RAM location; and c) loading at least the modified required part of the first software component into RAM at the determined location.
  • the present invention provides apparatus comprising: processor means; first storage means storing first software components, the first software components being divided into memory pages; and loading means for loading software components into random access memory (RAM) for use by the processor; wherein when at least part of a first software component is required by the processor means, the loading means performs the following:- a) determining a location in RAM into which at least the required part can be loaded; b) modifying at least the required part of the first software component in dependence on the determined RAM location; and c) loading at least the modified required part of the first software component into RAM at the determined location.
  • the processor means may include one or more separate processor cores.
  • the loading means may be provided in software. In some examples it may form a part of an operating system.
  • the invention may include a computer program, a suite of computer programs, a computer readable storage medium, or any software arrangement for implementing the method of the first example. Aspects of the invention may also be carried out in hardware, or in a combination of software and hardware.
  • Figure 1 is a block diagram of a smartphone architecture
  • Figure 2A is a diagram illustrating a memory layout forming background to the invention
  • Figure 2B is a diagram illustrating a memory layout forming background to the invention
  • Figure 2C is a diagram illustrating a memory layout according to an embodiment of the invention
  • Figure 3 is a diagram illustrating how paged data can be paged into RAM
  • Figure 4 is a diagram illustrating a paging cache
  • Figure 5 is a diagram illustrating how a new page can be added to the paging cache
  • Figure 6 is a diagram illustrating how pages can be aged within a paging cache
  • Figure 7 is a diagram illustrating how aged pages can be rejuvenated in a paging cache
  • Figure 8 is a diagram illustrating how a page can be paged out of the paging cache
  • Figure 9 is a diagram illustrating the RAM savings obtained using demand paging
  • Figure 10 is a flow diagram illustrating the operation of an example embodiment of the invention.
  • FIG. 1 shows an example of a device that may benefit from embodiments of the present invention.
  • the smartphone 10 comprises hardware to perform the telephony functions, together with an application processor and corresponding support hardware enabling the phone to provide the other functions desired of a smartphone, such as messaging, calendar and word processing functions and the like.
  • the telephony hardware is represented by the RF processor 102, which provides an RF signal to antenna 126 for the transmission of telephony signals, and receives signals therefrom.
  • baseband processor 104 which provides signals to and receives signals from the RF Processor 102.
  • the baseband processor 104 also interacts with a subscriber identity module 106.
  • a display 116 and a keypad 118. These are controlled by an application processor 108, which is often a separate integrated circuit from the baseband processor 104 and RF processor 102.
  • a power and audio controller 120 is provided to supply power from a battery to the telephony subsystem, the application processor, and the other hardware. Additionally, the power and audio controller 120 also controls input from a microphone 122, and audio output via a speaker 124.
  • various different types of memory are often provided. Firstly, the application processor 108 is provided with some Random Access Memory (RAM) 112, to which data and program code can be written and from which they can be read at will. Code placed anywhere in RAM can be executed by the application processor 108 from the RAM.
  • separate user memory 110 which is used to store user data, such as user application programs (typically higher layer application programs which determine the functionality of the device), as well as user data files, and the like.
  • An operating system is the software that manages the sharing of the resources of the device, and provides programmers with an interface to access those resources.
  • An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs on the system. At its most basic, the operating system performs tasks such as controlling and allocating memory, prioritising system requests, controlling input and output devices, facilitating networking, and managing files.
  • An operating system is in essence an interface by which higher level applications can access the hardware of the device.
  • an operating system is provided, which is started when the smartphone system 10 is first switched on.
  • the operating system code is commonly stored in a Read-Only Memory, and in modern devices, the Read-Only Memory is often NAND Flash ROM 114.
  • the ROM will store the necessary operating system components in order for the device 10 to operate, but other software programs may also be stored, such as application programs, and in particular those application programs which are mandatory to the device, such as, in the case of a smartphone, communications applications and the like. These would typically be the applications which are bundled with the smartphone by the device manufacturer when the phone is first sold. Further applications added to the smartphone by the user would usually be stored in the user memory 110.
  • the term ROM can be used to mean 'data stored in such a way that it behaves like it is stored in read-only memory'.
  • the underlying media may actually be physically writeable, like RAM or flash memory but the file system presents a ROM-like interface to the rest of the OS, for example as a particular drive.
  • The ROM situation is further complicated when the underlying media is not XIP. This is the case for NAND flash, used in many modern devices. Here code in NAND is copied (or shadowed) to RAM, where it can be executed in place. One way of achieving this is to copy the entire ROM contents into RAM during system boot and use the Memory Management Unit (MMU) to map that RAM so that it behaves as XIP ROM.
  • Layout A in Figure 2 shows how the NAND flash 20 is structured in a simple example. All the ROM contents 22 are permanently resident in RAM, and any executables in the user data area 24 (for example the C: or D: drive) are copied into RAM as they are needed.
  • ROFS Read-Only File System
  • Code in ROFS is copied into RAM as it is needed at runtime, at the granularity of an executable (or other whole file), in the same way as executables in the user data area.
  • the component responsible for doing this is the 'Loader', which is part of the File Server process.
  • the primary ROFS is combined with the Core image into a single ROM- like interface by what is known as the Composite File System.
  • Layout B in Figure 2 shows a Composite File System structure in another example.
  • ROM 30 is divided into the Core Image 32 comprising those components of the OS which will always be loaded into RAM, and the ROFS 34 containing those components which do not need to be continuously present in RAM, but which can be loaded in and out of RAM as required.
  • components in the ROFS 34 are loaded in and out of RAM as whole components when they are required (in the case of loading in) or not required. Comparing this to layout A, it can be seen that layout B is more RAM-efficient because some of the contents of the ROFS 34 are not copied into RAM at any given time. The more unused files there are in the ROFS 34, the greater the RAM saving.
  • Virtual memory techniques are known in the art for situations where the combined size of any programs, data and stack exceeds the physical memory available: programs and data are split up into units called pages.
  • the pages which are required to be executed can be loaded into RAM, with the rest of the pages of the program and data stored in non XIP memory (such as on disk).
  • Demand paging refers to a form of paging where pages are loaded into memory on demand as they are needed, rather than in advance. Demand paging therefore generally relies on page faults occurring to trigger the loading of a page into RAM for execution.
  • An example embodiment of the invention to be described is based upon the smartphone architecture shown in Figure 1, and in particular a smartphone running Symbian OS.
  • In Symbian OS, the part of the operating system which is responsible overall for loading programs and data from non-XIP memory into RAM is the "loader".
  • Many further details of the operation of the loader can be found in Sales, J., Symbian OS Internals, John Wiley & Sons, 2005, and in particular chapter 10 thereof, the entire contents of which are incorporated herein by reference.
  • the operation of the loader is modified to allow demand paging techniques to be used within the framework of Symbian OS.
  • The example embodiment is a smartphone having a composite file system as previously described, wherein the CFS provides a Core Image comprising those components of the OS which will always be loaded into RAM, and the ROFS containing those components which do not need to be continuously present in RAM, but which can be loaded in and out of RAM as required.
  • the principles of virtual memory are used on the core image, to allow data and programs to be paged in and out of memory as required. By using virtual memory techniques such as this, RAM savings can be made, and the overall hardware cost of a smartphone reduced.
  • XIP ROM Paging can refer to reading in required segments ("pages") of executable code into RAM as they are required, at a finer granularity than that of the entire executable. Typically, page size may be around 4kB; that is, code can be read in and out of RAM as required in 4kB chunks. A single executable may comprise a large number of pages. Paging is therefore very different from the operation of the ROFS, for example, wherein whole executables are read in and out of RAM as they are required to be run.
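  • As a purely illustrative sketch of this page granularity (assuming the 4kB page size used in this example; the function names are hypothetical), the number of pages an executable occupies, and the page into which a given byte offset falls, can be computed as:

```python
PAGE_SIZE = 4 * 1024  # 4kB, the page size used in this example

def page_count(image_size: int) -> int:
    """Number of pages needed to hold an executable of the given size."""
    return (image_size + PAGE_SIZE - 1) // PAGE_SIZE

def page_index(byte_offset: int) -> int:
    """Which page a given byte offset within the executable falls into."""
    return byte_offset // PAGE_SIZE

# A 10kB executable occupies three 4kB pages; the last holds only 2kB.
assert page_count(10 * 1024) == 3
assert page_index(9000) == 2
```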
  • an XIP ROM image is split into two parts, one containing unpaged data and one containing data paged on demand.
  • the unpaged data is those executables and other data which cannot be split up into pages.
  • the unpaged data consists of kernel-side code plus those parts that should not be paged for other reasons (e.g. performance, robustness, power management, etc).
  • the terms 'locked down' or 'wired' can also be used to mean unpaged.
  • Paged data in this example is those executables and other data which can be split up into pages.
  • the unpaged area at the start of the XIP ROM image is loaded into RAM as normal but the linear address region normally occupied by the paged area is left unmapped - i.e. no RAM is allocated for it in this example.
  • When a thread accesses memory in the paged area, it takes a page fault.
  • the page fault handler code in the kernel then allocates a page of RAM and reads the contents for this from the XIP ROM image contained on storage media (e.g. NAND flash).
  • a page is a convenient unit of memory allocation: in this example it is 4kB.
  • the thread then continues execution from the point where it took the page fault. This process is referred to in this example embodiment as 'paging in' and is described in more detail later.
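  • The paging-in sequence described above can be modelled as follows. This is an illustrative sketch rather than Symbian kernel code: the page table is modelled as a dictionary, and `read_from_media` is a hypothetical stand-in for reading one page of the XIP ROM image from NAND flash.

```python
PAGE_SIZE = 4 * 1024

def read_from_media(page_number: int) -> bytes:
    # Hypothetical stand-in for reading one page from NAND flash.
    return bytes([page_number % 256]) * PAGE_SIZE

class PagedArea:
    """Model of a demand-paged linear region: no RAM is allocated up front."""

    def __init__(self):
        self.page_table = {}  # page number -> page contents ("mapped" RAM)

    def access(self, address: int) -> int:
        page_number = address // PAGE_SIZE
        if page_number not in self.page_table:
            # Page fault: allocate a page of RAM and fill it from storage.
            self.page_table[page_number] = read_from_media(page_number)
        # The faulting thread then resumes and completes its access.
        return self.page_table[page_number][address % PAGE_SIZE]

area = PagedArea()
area.access(5 * PAGE_SIZE + 100)   # first touch of page 5 pages it in
assert 5 in area.page_table
assert 0 not in area.page_table    # untouched pages stay unmapped
```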
  • a page may contain data from one or more files and page boundaries do not necessarily coincide with file boundaries in the example embodiment.
  • Figure 2, layout C shows an XIP ROM paging structure which forms background to the present embodiment.
  • This drawing is background only, as in Figure 2C code paging is not used.
  • Figure 2C does illustrate XIP ROM paging of the core, which forms part of the present embodiment.
  • Figure 2D illustrates an example embodiment in its entirety, and comparison with Figure 2C readily shows the differences therebetween.
  • ROM 40 comprises an unpaged core area 42 containing those components which should not be paged, and a paged core area 44 containing those components which should reside in the core image rather than the ROFS, but which can be paged.
  • ROFS 46 then contains those components which do not need to be in the Core image. Components in the ROFS may be code paged, as will be described later.
  • Although the unpaged area of the Core image may be larger than the total Core image in layout B, only a fraction of the contents of the paged area needs to be copied into RAM compared to the amount of loaded ROFS code in layout B.
  • Dead page: a page of paged memory whose contents are not currently available.
  • Page out: the act of making a live page into a dead page. The RAM used to store the contents of the page may then be reused for other purposes.
  • efficient performance of the paging subsystem is dependent on the algorithm that selects which pages are live at any given time, or conversely, which live pages should be made dead.
  • the paging subsystem of this embodiment approximates a Least Recently Used (LRU) algorithm for determining which pages to page out.
  • the memory management unit 28 (MMU) provided in the example device is a component comprising hardware and software which has overall responsibility for the proper operation of the device memory, and in particular for allowing the application processor to write to or read from the memory.
  • the MMU is part of the paging subsystem of this example embodiment.
  • the paging algorithm according to the present embodiment provides a "live page list". All live pages are stored on the 'live page list', which is a part of the paging cache.
  • Figure 4 shows the live page list.
  • the live page list is split into two sub-lists, one containing young pages (the "young page list" 72) and the other, old pages (the "old page list" 74).
  • the memory management unit (MMU) 58 in the device of this example is used to make all young pages accessible to programs but the old pages inaccessible. However, the contents of old pages are preserved and they still count as being live.
  • the net effect is of a FIFO (first-in, first-out) list in front of an LRU list, which results in less page churn than a plain LRU.
  • Figure 5 shows what happens when a page is "paged in" in this example embodiment. When a page is paged in, it is added to the start of the young list 72 in the live page list, making it the youngest.
  • the paging subsystem of some embodiments attempts to keep the relative sizes of the two lists equal to a value called the young/old ratio. If this ratio is R, the number of young pages is Ny and the number of old pages is No, then whenever Ny > R × No, a page is taken from the end of the young list 72 and placed at the start of the old list 74. This process is called ageing, and is shown in Figure 6.
  • When the operating system requires more RAM for another purpose, it may obtain the memory used by a live page.
  • the 'oldest' live page is selected for paging out, turning it into a dead page, as shown in Figure 8. If paging out leaves too many young pages, according to the young/old ratio, then the last young page (e.g. Page D in Figure 8) would be aged. In this way, the young/old ratio helps to maintain the stability of the paging algorithm, and ensure that there are always some pages in the old list.
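  • The live page list behaviour described above, namely paging in to the front of the young list, ageing whenever Ny > R × No, rejuvenating old pages on access, and paging out the oldest live page, can be modelled with the following illustrative sketch (a simplified model, not the actual kernel implementation):

```python
from collections import deque

class PagingCache:
    """Model of the young/old live page lists with young/old ratio R."""

    def __init__(self, ratio: int):
        self.ratio = ratio      # R: target number of young pages per old page
        self.young = deque()    # accessible pages, youngest at the left
        self.old = deque()      # preserved but inaccessible pages

    def _rebalance(self):
        # Ageing: move pages young -> old while Ny > R * No.
        while len(self.young) > self.ratio * len(self.old):
            self.old.appendleft(self.young.pop())

    def page_in(self, page):
        self.young.appendleft(page)   # the new page becomes the youngest
        self._rebalance()

    def access(self, page):
        if page in self.old:
            # Rejuvenation: an old page touched again moves back to young.
            self.old.remove(page)
            self.young.appendleft(page)
            self._rebalance()

    def page_out(self):
        dead = self.old.pop()         # the oldest live page becomes dead
        self._rebalance()             # ageing may refill the old list
        return dead

cache = PagingCache(ratio=2)
for page in "ABCDEF":
    cache.page_in(page)
cache.access("A")                 # rejuvenate A from the old list
assert "A" in cache.young
assert cache.page_out() == "B"    # the oldest remaining live page dies
```

The old list acts as the buffer described above: page A, although aged, was recovered without re-loading it from storage.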
  • the above actions are executed in the context of the thread that tries to access the paged memory.
  • a purpose of demand paging is to save RAM, but there may also be at least two other potential benefits. These benefits can be dependent on a paging configuration, discussed later.
  • DP actually improves performance compared with the non-DP composite file system case (Figure 2, layout B), especially when the use-case normally involves loading a large amount of code into RAM (e.g. when booting or starting up large applications).
  • the performance overhead of paging can be outweighed by the performance gain of loading less code into RAM. This is sometimes known as 'lazy loading' of code.
  • Where the non-DP case consists of a large core image (i.e. something closer to Figure 2, layout A), most or all of the code involved in a use-case may already be permanently loaded, and so the performance improvement of lazy loading may be reduced.
  • An exception to this is during boot, where the cost of loading the whole core image into RAM contributes to the overall boot time.
  • a second possible performance improvement lies in improved stability of the device.
  • the stability of a device is often at its weakest in Out Of Memory (OOM) situations. Poorly written code may not cope well with exceptions caused by failed memory allocations. As a minimum, an OOM situation will degrade the user experience.
  • the RAM saving achieved by DP is proportional to the amount of code loaded in the non-DP case at a particular time. For instance, the RAM saving when 5 applications are running is greater than the saving immediately after boot. This can make it even harder to induce an OOM situation.
  • demand paging can introduce three new configurable parameters to the system. These are: the set of unpaged components, the minimum paging cache size, and the young/old ratio.
  • the first two are discussed below.
  • the third should be determined empirically.
  • a number of components are explicitly made unpaged in example embodiments of the invention, to meet the functional and performance requirements of a device.
  • the performance overhead of servicing a page fault is unbounded and variable so it may be desirable to protect some critical code paths by making files unpaged. Chains of files and their dependencies may need to be unpaged to achieve this. It may be possible to reduce the set of unpaged components by breaking unnecessary dependencies and separating critical code paths from non-critical ones.
  • Paging cache size: as described previously, if the system requires more free RAM and the free RAM pool is empty, then pages are removed from the paging cache in order to service the memory allocation. In some embodiments this cannot continue indefinitely, or a situation may arise where the same pages are continually paged in and out of the paging cache; this is known as page thrashing. Performance is dramatically reduced in this situation. To avoid catastrophic performance loss due to thrashing, within some embodiments a minimum paging cache size can be defined. If a system memory allocation would cause the paging cache to drop below the minimum size, then the allocation fails.
  • When free RAM is available, the paging cache grows, but any RAM used by the cache above the minimum size does not contribute to the amount of used RAM reported by the system. Although this RAM is really being used, it will be recycled whenever anything else in the system requires it. So the effective RAM usage of the paging cache is determined by its minimum size.
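  • The minimum-size rule described above can be modelled as follows; this is an illustrative sketch in which `free_pages`, `cache_pages` and `cache_min` are hypothetical page counters rather than real OS state:

```python
class RamManager:
    """Model of servicing allocations by recycling paging-cache pages."""

    def __init__(self, free_pages: int, cache_pages: int, cache_min: int):
        self.free_pages = free_pages    # pages in the free RAM pool
        self.cache_pages = cache_pages  # pages currently in the paging cache
        self.cache_min = cache_min      # minimum paging cache size, in pages

    def allocate(self, pages: int) -> bool:
        while self.free_pages < pages and self.cache_pages > self.cache_min:
            # Page out one cache page and recycle its RAM.
            self.cache_pages -= 1
            self.free_pages += 1
        if self.free_pages < pages:
            return False                # would shrink cache below minimum
        self.free_pages -= pages
        return True

ram = RamManager(free_pages=2, cache_pages=10, cache_min=8)
assert ram.allocate(4)        # 2 free pages plus 2 recycled from the cache
assert ram.cache_pages == 8   # the cache has shrunk to its minimum
assert not ram.allocate(1)    # a further allocation now fails
```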
  • the minimum paging cache size relates to a minimum number of pages which should be in the paging cache at any one moment.
  • the pages in the paging cache are divided between the young list and the old list. This is not essential, however, and in other embodiments the paging cache may not be divided, or may be further subdivided into more than two lists. To help prevent thrashing, it is useful to maintain an overall minimum size of the list, and to make the pages therein accessible without having to be re-loaded into memory.
  • the effective RAM saving is the size of all paged components minus the minimum size of the paging cache. Note that when a ROFS section is introduced, this calculation is much more complicated because the contents of the ROFS are likely to be different between the non-DP and DP cases.
  • the RAM saving can be increased by reducing the set of unpaged components and/or reducing the minimum paging cache size (i.e. making the configuration more 'stressed'). Performance can be improved (up to a point) by increasing the set of unpaged components and/or increasing the minimum paging cache size (i.e. making the configuration more 'relaxed'). However, if the configuration is made too relaxed then it is possible to end up with a net RAM increase compared with a non-DP ROM.
  • a program or other software component that is not stored in XIP ROM exists as a file stored in another file system, such as the ROFS or user data area 24.
  • Before such code can be executed it needs to be copied into a particular RAM location, so that it can be properly accessed by the application processor.
  • This RAM location usually cannot be determined ahead of time, and hence in order for the component to run correctly the executable contents must be modified so that the memory pointers contained within them are correct.
  • the memory pointers should be modified so that they are referenced with respect to the RAM location into which the program is to be loaded. This modification is called 'relocation' and 'fix-up'.
  • the task of reading, copying and modifying executables is performed by the Loader, and is shown in more detail in the form of an example in Figure 10.
  • The data to be loaded may be an executable or other data from non-core storage, such as the ROFS or user data area.
  • Loading into RAM allows the executable to be accessed by the application processor, so that the executable can be run.
  • the Loader reads in the data to be loaded, and at 10.6 the memory management unit returns to the Loader the RAM location into which the data is to be written.
  • At this point the RAM is available, and the data has been read.
  • the memory pointers in the data (executable) will be incorrect, in that they will by default reference parts of the RAM other than the part into which the data has been loaded. Therefore, the memory pointers should be updated to take into account the memory location into which the data is to be written. In some embodiments, this may be as straightforward as adding an offset to the existing memory pointer values, the offset being determined from the RAM location address into which the data is to be written.
  • the memory pointers are updated by the Loader at 10.8 in the example of Figure 10. Thereafter, the updated data (executable) can be written to the determined RAM location, at 10.10. The data (executable) is then available to the application processor.
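  • The pointer update at 10.8 can be illustrated with the following simplified sketch. Real executables carry a relocation table identifying which words hold absolute pointers; here the hypothetical `pointer_offsets` list plays that role, and the link-time base address `LINK_BASE` is an assumed input:

```python
import struct

LINK_BASE = 0x00000000   # address the executable was linked at (assumed)

def relocate(code: bytes, pointer_offsets, load_base: int) -> bytes:
    """Fix up each 32-bit pointer in `code` for the RAM address `load_base`.

    `pointer_offsets` stands in for a relocation table: the byte offsets
    within `code` that hold absolute memory pointers.
    """
    delta = load_base - LINK_BASE
    fixed = bytearray(code)
    for off in pointer_offsets:
        (value,) = struct.unpack_from("<I", fixed, off)
        struct.pack_into("<I", fixed, off, (value + delta) & 0xFFFFFFFF)
    return bytes(fixed)

# A toy image: 4 bytes of "code" followed by one pointer to offset 0x10.
image = b"\x00\x01\x02\x03" + struct.pack("<I", 0x10)
loaded = relocate(image, pointer_offsets=[4], load_base=0x40000000)
(ptr,) = struct.unpack_from("<I", loaded, 4)
assert ptr == 0x40000010   # the pointer now references the RAM location
```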
  • Code paging as described in the context of these embodiments is therefore a form of demand paging, but of code which is located outside of XIP ROM storage. Paging of such code may be considerably more complex than XIP ROM paging because not only is it necessary to copy the contents into memory on demand, but, as described, the relocation and fix-up modifications should be applied at the same time. Due to this additional overhead of code paging, it may be preferable to use XIP ROM paging where possible.
  • Layout D in Figure 2 shows a NAND flash structure in an embodiment of the invention in which both code paging and XIP ROM paging are used. Only those parts of an executable currently in use are copied into RAM.
  • XIP ROM paging may be used for most data in ROM and code paging may be used for any remaining paged executables in ROFS. Note that in some examples pages in code-paged executables do not cross into other executables, as they can in XIP ROM paging. So the last page of an executable may contain less than 4kB of data.
  • the techniques of the present invention may be used to provide embodiments with different applications, such as for example, as a general purpose computer, or as a portable media player, or other audio visual device, such as a camera.
  • Any device or machine which incorporates a computing device provided with RAM into which data and programs need to be loaded for execution may benefit from the invention and constitute an embodiment thereof.
  • the invention may therefore be applied in many fields, to provide improved devices or machines that require less RAM to operate than had heretofore been the case.
  • Embodiments of the present invention apply virtual memory techniques to a system provided with storage in which software components may be read into RAM as pages from one storage area without the content of the pages requiring any modification, whereas software components in another storage area may be read into RAM as pages, but require modification to elements thereof before being written into RAM.
  • software components in the other storage area require memory pointers contained therein to be updated to account for the RAM location to which they are to be written.
  • Embodiments of the invention may allow code which is stored in any storage medium on a device to be paged into RAM, and hence the benefits of paging in terms of reduced RAM requirements can be obtained.
  • modifying the first software component comprises updating memory pointers in at least the required part of the first software component to account for the determined RAM location.
  • any RAM location can be used, and the code is dynamically adapted as it is loaded into the RAM location.
  • Such techniques may even allow the code to be split between two or more RAM locations, with the memory pointers of each part being adapted accordingly.
  • the method further comprises storing second software components in a second storage medium, the second software components being divided into memory pages; and when at least part of a second software component is required by the processor, loading the memory page containing the part of the software components presently required into RAM.
  • This can allow data, including executables, which are stored in XIP ROM to be paged into RAM as required, in addition to data stored in the first storage medium.
  • data in XIP ROM which is to be paged in does not, however, require modifying, and hence can be loaded straight into RAM.
  • the first storage medium and the second storage medium can constitute different parts of the same storage medium.
  • paging can be used with devices which have a composite file system, such as, for example, smartphones of the prior art.
  • the storage medium may be NAND flash memory, which is commonly used for portable computing devices such as a smartphone, MP3 player, or the like, because of its relatively low cost.
  • However, NAND flash has the drawback that it is not XIP.
  • Embodiments of the present invention may allow the use of paging from NAND flash into RAM nevertheless, thereby allowing RAM savings to be achieved together with the use of NAND flash memory.
  • In some embodiments, the first software component and/or the second software component is demand paged into RAM.
  • It may be determined that a first or second software component is required by a program thread by that thread attempting to access the relevant page, thereby generating a page fault.
  • a paging cache of pages of the second software components which have been recently loaded is maintained, the paging cache being arranged on a first-in, first-out (FIFO) basis.
  • Maintaining a paging cache as a FIFO can allow a large degree of control to be maintained over the paging process, and can help prevent memory pages from completely filling up the available RAM.
  • the paging cache may be divided into at least two parts, being a young page part having the pages most recently loaded and an old page part with pages less recently loaded. This feature in combination with the FIFO arrangement provides an effective Least Recently Used (LRU) type paging algorithm, which is relatively straightforward to implement, but which results in less page churn than other known LRU implementations.
  • the relative sizes of the young page part and the old page part are controlled to maintain substantially a predetermined young/old size ratio.
  • another page previously loaded into RAM may be transferred into the old page part in dependence on the young/old size ratio and the present relative sizes of the young page part and old page part.
  • the relative sizes of the young page part and the old page part are controlled to maintain the young/old size ratio by transferring pages between the two parts, and deleting pages from the old part.
  • a page in the old page part is preferably inaccessible to the processor, but when access is required to a page in the old page part the page is transferred into the young page part for access by the processor.
  • This allows pages to be aged out of the paging cache, but if they are required again whilst still in the old page part, they can simply be transferred back into the young page part so as to be accessed by the processor. This presents significantly less overhead than having to load the page again from the storage media.
  • the old page part of the cache acts as a sort of buffer to provide extra time for a page to be re-used, before it is completely paged out and made dead. As a consequence, less page churn results.
  • the paging cache is maintained at a minimum size. If the paging cache is too small then a known problem referred to as "thrashing" can occur, where pages are being loaded into and out of RAM very quickly. As each page load incurs a significant overhead, processing performance can be drastically reduced. However, by maintaining the cache at a minimum size, the problem of thrashing can be reduced.
  • when the paging cache is larger than the minimum size, a memory allocation event occurs, and there is no free memory, then memory can be allocated from the paging cache, unless such allocation would cause the paging cache to fall below the minimum size. Such operation can help ensure that the minimum paging cache size is maintained, but does not prevent the paging cache from being larger than the minimum size. In this respect, if there is free RAM at any given time, then the paging cache can be allowed to grow to use as much RAM as it needs, subject to the RAM constraints.

Abstract

Embodiments of the present invention apply virtual memory techniques to a system provided with storage in which software components may be read into RAM as pages from one storage area without the data in the pages requiring any modification, whereas software components in another storage area may be read into RAM as pages, but require modification to elements thereof before being written into RAM. In particular, software components in the other storage area require any memory pointers contained therein to be updated to account for the RAM location to which they are to be written. However, by performing such memory pointer modifications, more data than has heretofore been the case can be paged into RAM, and hence significant RAM hardware savings can be made.

Description

Memory Management Method and Apparatus
Technical Field
Embodiments of the present invention relate to a method and apparatus to provide virtual memory in a device in which programs and data are required to be loaded into memory for use by a processor unit, and in particular to such a method and apparatus for use in a device where some of the data and programs must be modified as they are paged into memory.
Background to the Invention
The concept of memory pages is often employed in memory management systems. Pages are predefined quantities of memory space, and they can act as a unit of memory size in the context of storing or loading code or data into memory locations. Demand paging is a technique which involves loading pages of code or data into memory on demand, i.e. based on when they are required for a processing operation.
Summary of the Invention
In a first example of the invention there is provided a method comprising: storing first software components in a first storage medium, the first software components being divided into memory pages; and then when at least part of a first software component is required by a processor, performing the following:- a) determining a location in random access memory (RAM) into which at least the required part can be loaded for use by the processor; b) modifying at least the required part of the first software component in dependence on the determined RAM location; and c) loading at least the modified required part of the first software component into RAM at the determined location.
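By way of illustration only, the modification in step b) typically amounts to adjusting any memory pointers in the page by the difference between the address at which the component was linked and the RAM location determined in step a). The following Python sketch is a hypothetical illustration of steps a) to c); the page size, the relocation-offset table and the 32-bit little-endian pointer format are all assumptions made for this sketch, not taken from any actual implementation.

```python
PAGE_SIZE = 4096  # page granularity assumed for this sketch


def load_page(storage_page: bytes, pointer_offsets: list,
              link_base: int, ram_base: int) -> bytes:
    """Steps a)-c): given the page image read from the first storage
    medium, adjust every memory pointer recorded for this page by the
    difference between the address the component was linked at and the
    RAM address actually determined, returning the page ready to be
    written into RAM at that location."""
    page = bytearray(storage_page)
    delta = ram_base - link_base
    for off in pointer_offsets:  # byte offsets of pointers within this page
        ptr = int.from_bytes(page[off:off + 4], "little")
        fixed = (ptr + delta) & 0xFFFFFFFF  # 32-bit wrap-around
        page[off:off + 4] = fixed.to_bytes(4, "little")
    return bytes(page)
```

For example, a pointer stored as 0x1000 in a component linked at address 0x0 becomes 0x9001000 when the page is loaded at a RAM base of 0x9000000.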
In a second example, the present invention provides apparatus comprising: a processor; a first storage medium storing first software components, the first software components being divided into memory pages; and a loader for loading software components into random access memory (RAM) for use by the processor; wherein when at least part of a first software component is required by the processor, the loader performs the following:- a) determining a location in RAM into which at least the required part can be loaded; b) modifying at least the required part of the first software component in dependence on the determined RAM location; and c) loading at least the modified required part of the first software component into RAM at the determined location.
In another example, the present invention provides apparatus comprising: processor means; first storage means storing first software components, the first software components being divided into memory pages; and loading means for loading software components into random access memory (RAM) for use by the processor; wherein when at least part of a first software component is required by the processor means, the loading means performs the following:- a) determining a location in RAM into which at least the required part can be loaded; b) modifying at least the required part of the first software component in dependence on the determined RAM location; and c) loading at least the modified required part of the first software component into RAM at the determined location.
The processor means may include one or more separate processor cores. The loading means may be provided in software. In some examples it may form a part of an operating system.
In other examples, the invention may include a computer program, a suite of computer programs, a computer readable storage medium, or any software arrangement for implementing the method of the first example. Aspects of the invention may also be carried out in hardware, or in a combination of software and hardware.
Brief Description of the Drawings
Features and advantages of example embodiments of the present invention will become apparent from the following description with reference to the accompanying drawings, wherein: -
Figure 1 is a block diagram of a smartphone architecture;
Figure 2A is a diagram illustrating a memory layout forming background to the invention;
Figure 2B is a diagram illustrating a memory layout forming background to the invention;
Figure 2C is a diagram illustrating a memory layout according to an embodiment of the invention;
Figure 3 is a diagram illustrating how paged data can be paged into RAM;
Figure 4 is a diagram illustrating a paging cache;
Figure 5 is a diagram illustrating how a new page can be added to the paging cache;
Figure 6 is a diagram illustrating how pages can be aged within a paging cache;
Figure 7 is a diagram illustrating how aged pages can be rejuvenated in a paging cache;
Figure 8 is a diagram illustrating how a page can be paged out of the paging cache;
Figure 9 is a diagram illustrating the RAM savings obtained using demand paging; and
Figure 10 is a flow diagram illustrating the operation of an example embodiment of the invention.
Description of the Embodiments
Figure 1 shows an example of a device that may benefit from embodiments of the present invention. The smartphone 10 comprises hardware to perform the telephony functions, together with an application processor and corresponding support hardware to enable the phone to have other functions which are desired by a smartphone, such as messaging, calendar, word processing functions and the like. In Figure 1 the telephony hardware is represented by the RF processor 102 which provides an RF signal to antenna 126 for the transmission of telephony signals, and the receipt therefrom. Additionally provided is baseband processor 104, which provides signals to and receives signals from the RF Processor 102. The baseband processor 104 also interacts with a subscriber identity module 106.
Also provided are a display 116, and a keypad 118. These are controlled by an application processor 108, which is often a separate integrated circuit from the baseband processor 104 and RF processor 102. A power and audio controller 120 is provided to supply power from a battery to the telephony subsystem, the application processor, and the other hardware. Additionally, the power and audio controller 120 also controls input from a microphone 122, and audio output via a speaker 124. In order for the application processor 108 to operate, various different types of memory are often provided. Firstly, the application processor 108 is provided with some Random Access Memory (RAM) 112 into which data and program code can be written and read from at will. Code placed anywhere in RAM can be executed by the application processor 108 from the RAM.
Additionally provided is separate user memory 110, which is used to store user data, such as user application programs (typically higher layer application programs which determine the functionality of the device), as well as user data files, and the like.
Many modern electronic devices make use of operating systems. Modern operating systems can be found on anything composed of integrated circuits, like personal computers, Internet servers, cell phones, music players, routers, switches, wireless access points, network storage, game consoles, digital cameras, DVD players, sewing machines, and telescopes. An operating system is the software that manages the sharing of the resources of the device, and provides programmers with an interface to access those resources. An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs on the system. At its most basic, the operating system performs tasks such as controlling and allocating memory, prioritising system requests, controlling input and output devices, facilitating networking, and managing files. An operating system is in essence an interface by which higher level applications can access the hardware of the device.
In order for the application processor 108 to operate in the embodiment of Figure 1, an operating system is provided, which is started when the smartphone system 10 is first switched on. The operating system code is commonly stored in a Read-Only Memory, and in modern devices, the Read-Only Memory is often NAND Flash ROM 114. The ROM will store the necessary operating system component in order for the device 10 to operate, but other software programs may also be stored, such as application programs, and the like, and in particular those application programs which are mandatory to the device, such as, in the case of a smartphone, communications applications and the like. These would typically be the applications which are bundled with the smartphone by the device manufacturer when the phone is first sold. Further applications which are added to the smartphone by the user would usually be stored in the user memory 110.
ROM (Read-Only Memory) traditionally refers to memory devices that physically store data in a way which cannot be modified. These devices also allow direct random access to their contents and so code can be executed from them directly - code is eXecute-In-Place (XIP). This has the advantage that programs and data in ROM are always available and don't require any action to load them into memory. The term ROM can be used to mean 'data stored in such a way that it behaves like it is stored in read-only memory'. The underlying media may actually be physically writeable, like RAM or flash memory, but the file system presents a ROM-like interface to the rest of the OS, for example as a particular drive.
The ROM situation is further complicated when the underlying media is not XIP. This is the case for NAND flash, used in many modern devices. Here code in NAND is copied (or shadowed) to RAM, where it can be executed in place. One way of achieving this is to copy the entire ROM contents into RAM during system boot and use the Memory Management Unit (MMU) to mark this area of RAM with read-only permissions. The data stored by this method is called the Core ROM image (or just Core image) to distinguish it from other data stored in NAND. The Core image is an XIP ROM and is usually the only one; it is permanently resident in RAM.
Figure 2, layout A shows how the NAND flash 20 is structured in a simple example. All the ROM contents 22 are permanently resident in RAM and any executables in the user data area 24 (for example the C: or D: drive) are copied into RAM as they are needed.
The above method can be costly in terms of RAM usage, and a more efficient scheme can be used to split the ROM contents into those parts required to boot the OS, and everything else. The former is placed in the Core image as before and the latter is placed into another area called the Read-Only File System (ROFS). Code in ROFS is copied into RAM as it is needed at runtime, at the granularity of an executable (or other whole file), in the same way as executables in the user data area. In a specific example of an embodiment using Symbian OS, the component responsible for doing this is the 'Loader', which is part of the File Server process. In an example embodiment, there are several ROFS images, for example localisation and/or operator-specific images. Usually, the first one (called the primary ROFS) is combined with the Core image into a single ROM-like interface by what is known as the Composite File System.
Layout B in Figure 2 shows a Composite File System structure of another example. Here, ROM 30 is divided into the Core Image 32 comprising those components of the OS which will always be loaded into RAM, and the ROFS 34 containing those components which do not need to be continuously present in RAM, but which can be loaded in and out of RAM as required. As mentioned, components in the ROFS 34 are loaded in and out of RAM as whole components when they are required (in the case of loading in) or not required. Comparing this to layout A, it can be seen that layout B is more RAM-efficient because some of the contents of the ROFS 34 are not copied into RAM at any given time. The more unused files there are in the ROFS 34, the greater the RAM saving.
It would, however, be beneficial if even further RAM savings could be made. Virtual memory techniques are known in the art, where the combined size of any programs, data and stack exceeds the physical memory available, but programs and data are split up into units called pages. The pages which are required to be executed can be loaded into RAM, with the rest of the pages of the program and data stored in non XIP memory (such as on disk). Demand paging refers to a form of paging where pages are loaded into memory on demand as they are needed, rather than in advance. Demand paging therefore generally relies on page faults occurring to trigger the loading of a page into RAM for execution.
An example embodiment of the invention to be described is based upon the smartphone architecture shown in Figure 1, and in particular a smartphone running Symbian OS. Within Symbian OS, as mentioned, the part of the operating system which is responsible overall for loading programs and data from non XIP memory into RAM is the "loader". Many further details of the operation of the loader can be found in Sales J. Symbian OS Internals John Wiley & Sons, 2005, and in particular chapter 10 thereof, the entire contents of which are incorporated herein by reference. Within the example embodiment to be described the operation of the loader is modified to allow demand paging techniques to be used within the framework of Symbian OS.
In particular, according to the example embodiment, a smartphone is provided having a composite file system as previously described, wherein the CFS provides a Core Image comprising those components of the OS which will always be loaded into RAM, and the ROFS containing those components which do not need to be continuously present in RAM, but which can be loaded in and out of RAM as required. In order to reduce the RAM requirement of the smartphone, within the example embodiment the principles of virtual memory are used on the core image, to allow data and programs to be paged in and out of memory when required or not required. By using virtual memory techniques such as this, then RAM savings can be made, and overall hardware cost of a smartphone reduced.
Since an XIP ROM image on NAND is actually stored in RAM, an opportunity arises to demand page the contents of the XIP ROM, that is, read its data contents from NAND flash into RAM (where it can be executed), on demand. This is called XIP ROM Paging (or demand paging). "Paging" can refer to reading in required segments ("pages") of executable code into RAM as they are required, at a finer granularity than that of the entire executable. Typically, page size may be around 4kB; that is, code can be read in and out of RAM as required in 4kB chunks. A single executable may comprise a large number of pages. Paging is therefore very different from the operation of the ROFS, for example, wherein whole executables are read in and out of RAM as they are required to be run.
In the example embodiment of the invention an XIP ROM image is split into two parts, one containing unpaged data and one containing data paged on demand. In this example the unpaged data is those executables and other data which cannot be split up into pages. The unpaged data consists of kernel-side code plus those parts that should not be paged for other reasons (e.g. performance, robustness, power management, etc). The terms 'locked down' or 'wired' can also be used to mean unpaged. Paged data in this example is those executables and other data which can be split up into pages. At boot time, the unpaged area at the start of the XIP ROM image is loaded into RAM as normal but the linear address region normally occupied by the paged area is left unmapped - i.e. no RAM is allocated for it in this example.
When a thread accesses memory in the paged area, it takes a page fault. The page fault handler code in the kernel then allocates a page of RAM and reads the contents for this from the XIP ROM image contained on storage media (e.g. NAND flash). As mentioned, a page is a convenient unit of memory allocation: in this example it is 4kB. The thread then continues execution from the point where it took the page fault. This process is referred to in this example embodiment as 'paging in' and is described in more detail later.
When the free RAM on the system reaches zero, memory allocation requests can be satisfied by taking RAM from the paged-in XIP ROM region. As RAM pages in the XIP ROM region are unloaded, they are 'paged out'. Figure 3 shows the operations just described.
Note that the content in the paged data area of an XIP ROM is subject to paging in this example, not just executable code; accessing any file in this area may induce a page fault. A page may contain data from one or more files and page boundaries do not necessarily coincide with file boundaries in the example embodiment.
Figure 2, layout C shows an XIP ROM paging structure which forms background to the present embodiment. This drawing is background only, as in Figure 2C code paging is not used. However, Figure 2C does illustrate XIP ROM paging of the core, which forms part of the present embodiment. Figure 2D illustrates an example embodiment in its entirety, and comparison with Figure 2C readily shows the differences therebetween.
For the sake of simplicity, the elements of the example embodiment relating to paging from the core image will be described with respect to Figure 2C, and thereafter the additional parts of the embodiment relating to code paging will be described with respect to Figure 2D. With reference to Figure 2C, therefore, ROM 40 comprises an unpaged core area 42 containing those components which should not be paged, and a paged core area 44 containing those components which should reside in the core image rather than the ROFS, but which can be paged. ROFS 46 then contains those components which do not need to be in the Core image. Components in the ROFS may be code paged, as will be described later. Although the unpaged area of the Core image may be larger than the total Core image in layout B, only a fraction of the contents of the paged area needs to be copied into RAM compared to the amount of loaded ROFS code in layout B.
Further details of the algorithm which controls demand paging from the core image in this example embodiment will now be described. All memory content that can be demand paged is said in this example to be 'paged memory' and the process is controlled by the 'paging subsystem'. Other terms that are used in describing example embodiments of the invention are: 1. Live Page - A page of paged memory whose contents are currently available.
2. Dead Page - A page of paged memory whose contents are not currently available.
3. Page In - The act of making a dead page into a live page.
4. Page Out - The act of making a live page into a dead page. The RAM used to store the content of this may then be reused for other purposes.
In one embodiment, efficient performance of the paging subsystem is dependent on the algorithm that selects which pages are live at any given time, or conversely, which live pages should be made dead. The paging subsystem of this embodiment approximates a Least Recently Used (LRU) algorithm for determining which pages to page out.
The memory management unit 28 (MMU) provided in the example device is a component comprising hardware and software which has overall responsibility for the proper operation of the device memory, and in particular for allowing the application processor to write to or read from the memory. The MMU is part of the paging subsystem of this example embodiment.
The paging algorithm according to the present embodiment provides a "live page list". All live pages are stored on the 'live page list', which is a part of the paging cache. Figure 4 shows the live page list. The live page list is split into two sub-lists, one containing young pages (the "young page list" 72) and the other, old pages (the "old page list" 74). The memory management unit (MMU) 58 in the device of this example is used to make all young pages accessible to programs but the old pages inaccessible. However, the contents of old pages are preserved and they still count as being live. The net effect is of a FIFO (first-in, first-out) list in front of an LRU list, which results in less page churn than a plain LRU.
Figure 5 shows what happens when a page is "paged in" in this example embodiment. When a page is paged in, it is added to the start of the young list 72 in the live page list, making it the youngest.
The paging subsystem of some embodiments attempts to keep the relative sizes of the two lists equal to a value called the young/old ratio. If this ratio is R, the number of young pages is Ny and the number of old pages is No, then if Ny > R × No, a page is taken from the end of the young list 72 and placed at the start of the old list 74. This process is called ageing, and is shown in Figure 6.
If an old page is accessed by a program in an example embodiment, this causes a page fault because the MMU has marked old pages as inaccessible. The paging subsystem then turns that page into a young page (i.e. rejuvenates it), and at the same time turns the last young page into an old page. This is shown in Figure 7, wherein the old page to be accessed is taken from the old list 74 and added to the young list 72, and the last (oldest) young page is aged from the young list 72 to the old list 74.
When the operating system requires more RAM for another purpose then it may obtain the memory used by a live page. In one example the 'oldest' live page is selected for paging out, turning it into a dead page, as shown in Figure 8. If paging out leaves too many young pages, according to the young/old ratio, then the last young page (e.g. Page D in Figure 8) would be aged. In this way, the young/old ratio helps to maintain the stability of the paging algorithm, and ensure that there are always some pages in the old list.
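The page-in, ageing, rejuvenation and page-out operations of Figures 5 to 8 can be modelled together as follows. This is an illustrative Python sketch only: the class and method names are invented here, and the ageing rule is guarded so that the young list is never drained completely, which is a deliberate simplification of the behaviour described above.

```python
from collections import deque


class PagingCache:
    """Sketch of the live page list: a FIFO young list in front of an
    old list, together approximating an LRU replacement policy."""

    def __init__(self, ratio: int):
        self.ratio = ratio    # the young/old ratio R
        self.young = deque()  # accessible pages, youngest first
        self.old = deque()    # live but MMU-inaccessible pages

    def page_in(self, page):
        # A newly paged-in page is added to the start of the young list.
        self.young.appendleft(page)
        self._age()

    def _age(self):
        # Keep Ny <= R * No by ageing the oldest young page into the
        # old list (the max() guard avoids draining the young list).
        while len(self.young) > self.ratio * max(len(self.old), 1):
            self.old.appendleft(self.young.pop())

    def rejuvenate(self, page):
        # Accessing an old page faults; make it young again and age
        # the oldest young page in exchange, keeping both sizes stable.
        self.old.remove(page)
        self.young.appendleft(page)
        self.old.appendleft(self.young.pop())

    def page_out(self):
        # Free RAM by turning the oldest live page into a dead page.
        victim = self.old.pop() if self.old else self.young.pop()
        self._age()
        return victim
```

With a ratio of 2, paging in pages A, B, C and D in turn leaves D and C young and B and A old; rejuvenating B then ages the oldest young page in its place, exactly as in Figure 7.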
When a program attempts to access paged memory that is 'dead', a page fault is generated by the MMU and the executing thread is diverted to the Symbian OS exception handler. This performs the following tasks:
1. Obtain a page of RAM from the system's pool of unused RAM (i.e. the 'free pool'), or if this is empty, page out the oldest live page and use that instead.
2. Read the contents for this page from some media (e.g. NAND flash).
3. Update the paging cache's live list as described previously.
4. Use the MMU to make this RAM page accessible at the correct linear address.
5. Resume execution of the program's instructions, starting with the one that caused the initial page fault.
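The five steps above may be outlined as follows. The containers used here (a free-pool list, a live-page list and a page-table dictionary) are simple stand-ins invented for this sketch; no real MMU or kernel interface is implied.

```python
PAGE_SIZE = 4096


def handle_page_fault(addr, free_pool, live_pages, read_page, page_table):
    """Sketch of the page-fault handler's five steps; `read_page` stands
    in for reading a page's contents from the backing media."""
    page_base = addr & ~(PAGE_SIZE - 1)
    # 1. Obtain RAM from the free pool, or page out the oldest live page.
    if free_pool:
        frame = free_pool.pop()
    else:
        victim_base, frame = live_pages.pop(0)  # oldest live page
        del page_table[victim_base]             # 'unmap' the victim
    # 2. Read the page contents from media (e.g. NAND flash).
    contents = read_page(page_base)
    # 3. Update the live list: the faulting page becomes the newest.
    live_pages.append((page_base, frame))
    # 4. Make the RAM page accessible at the faulting linear address.
    page_table[page_base] = (frame, contents)
    # 5. The faulting thread then retries the faulting instruction.
    return contents
```

When the free pool is exhausted, the oldest live page is sacrificed to satisfy the fault, mirroring the paging-out behaviour described below.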
In some embodiments the above actions are executed in the context of the thread that tries to access the paged memory.
When the system requires more RAM and the free pool is empty then RAM that is being used to store paged memory is freed up for use. This is referred to as 'paging out' and happens by the following process:
1. Remove the 'oldest' RAM page from the paging cache.
2. Use the MMU to mark the page as inaccessible.
3. Return the RAM page to the free pool.
Possible benefits of the demand paging algorithm of some embodiments of the invention will now be discussed. In general, a purpose of demand paging is to save RAM, but there may also be at least two other potential benefits. These benefits can be dependent on a paging configuration, discussed later.
One possible performance benefit resulting from some embodiments of the invention is due to so-called "lazy loading". In general, the cost of servicing a page fault means that paging has a negative impact on performance. However, in some cases demand paging
(DP) actually improves performance compared with the non-DP composite file system case (Figure 2, layout B), especially when the use-case normally involves loading a large amount of code into RAM (e.g. when booting or starting up large applications). In these cases, the performance overhead of paging can be outweighed by the performance gain of loading less code into RAM. This is sometimes known as 'lazy loading' of code.
Note that when the non-DP case consists of a large core image (i.e. something closer to Figure 2, layout A), most or all of the code involved in a use-case may already be permanently loaded, and so the performance improvement of lazy loading may be reduced. An exception to this is during boot, where the cost of loading the whole core image into RAM contributes to the overall boot time.
A second possible performance improvement lies in improved stability of the device. The stability of a device is often at its weakest in Out Of Memory (OOM) situations. Poorly written code may not cope well with exceptions caused by failed memory allocations. As a minimum, an OOM situation will degrade the user experience.
If DP is enabled on a device and the same physical RAM is available compared with the non-DP case, the increased RAM saving makes it more difficult for the device to go OOM, avoiding many potential stability issues. Furthermore, the RAM saving achieved by DP is proportional to the amount of code loaded in the non-DP case at a particular time. For instance, the RAM saving when 5 applications are running is greater than the saving immediately after boot. This can make it even harder to induce an OOM situation.
Note that this increased stability may only apply when the entire device is OOM. Individual threads may have OOM problems due to reaching their own heap limits. DP may not help in these cases.
In addition to the above described benefits of demand paging, further performance improvements may be obtained in dependence on the demand paging configuration. In particular, demand paging can introduce three new configurable parameters to the system. These are:
1. The amount of code and data that is marked as unpaged.
2. The minimum size of the paging cache.
3. The ratio of young pages to old pages in the paging cache.
The first two are discussed below. The third should be determined empirically.
With respect to the amount of unpaged files, it is preferred in some embodiments that areas of the OS involved in servicing a paging fault are protected from blocking on the thread that took the paging fault (directly or indirectly). Otherwise, a deadlock situation may occur. This is partly achieved in Symbian OS by ensuring that all kernel-side components are always unpaged.
In addition to kernel-side components, a number of components are explicitly made unpaged in example embodiments of the invention, to meet the functional and performance requirements of a device. The performance overhead of servicing a page fault is unbounded and variable so it may be desirable to protect some critical code paths by making files unpaged. Chains of files and their dependencies may need to be unpaged to achieve this. It may be possible to reduce the set of unpaged components by breaking unnecessary dependencies and separating critical code paths from non-critical ones.
Whilst making a component unpaged is a straightforward performance/RAM trade-off, this can be made configurable, allowing the device manufacturer in embodiments of the invention to make the decision based on their system requirements.
With respect to the paging cache size, as described previously if the system requires more free RAM and the free RAM pool is empty, then pages are removed from the paging cache in order to service the memory allocation. In some embodiments this cannot continue indefinitely or a situation may arise where the same pages are continually paged in and out of the paging cache; this is known as page thrashing. Performance is dramatically reduced in this situation. To avoid catastrophic performance loss due to thrashing, within some embodiments a minimum paging cache size can be defined. If a system memory allocation would cause the paging cache to drop below the minimum size, then the allocation fails.
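The minimum-size rule can be expressed as a simple check, with all quantities being counts of pages. The function name and signature below are invented for illustration only:

```python
def try_allocate(n_pages: int, free_pool: int,
                 cache_pages: int, min_cache_pages: int):
    """Satisfy an allocation from the free pool first, then from the
    paging cache, failing if that would shrink the cache below its
    configured minimum (which would risk thrashing)."""
    from_free = min(n_pages, free_pool)
    shortfall = n_pages - from_free
    if shortfall > cache_pages - min_cache_pages:
        return None  # allocation fails rather than shrink the cache too far
    # return the new free-pool and cache sizes after the allocation
    return free_pool - from_free, cache_pages - shortfall
```

For example, with 1 free page, a 10-page cache and an 8-page minimum, a 3-page request succeeds (leaving the cache at exactly its minimum) but a 4-page request fails.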
As paged data is paged in, the paging cache grows but any RAM used by the cache above the minimum size does not contribute to the amount of used RAM reported by the system. Although this RAM is really being used, it will be recycled whenever anything else in the system requires the RAM. So the effective RAM usage of the paging cache is determined by its minimum size.
In theory, it is also possible to limit the maximum paging cache size. However, this may not be useful in production devices because it prevents the paging cache from using all the otherwise unused RAM in the system. This may negatively impact performance for no effective RAM saving.
By setting a minimum paging cache size, thrashing can be prevented in some embodiments of the invention. In this respect, the minimum paging cache size relates to a minimum number of pages which should be in the paging cache at any one moment. In one embodiment the pages in the paging cache are divided between the young list and the old list. This is not essential, however, and in other embodiments the paging cache may not be divided, or may be further subdivided into more than two lists. To help prevent thrashing, it is useful to maintain an overall minimum size of the list, and to make the pages therein accessible without their having to be re-loaded into memory.
Overall the main advantage of using DP of the core image is the RAM saving which is obtained. An easy way to visualise the RAM saving achieved by DP is to compare simple configurations. Consider a non-DP ROM consisting of a Core with no ROFS (as in Figure 2, layout A). Compare that with a DP ROM consisting of an XIP ROM paged Core image, again with no ROFS (similar to Figure 2, layout C but without the ROFS). The total ROM contents are the same in both cases. Here the effective RAM saving is depicted by Figure 9.
The effective RAM saving is the size of all paged components minus the minimum size of the paging cache. Note that when a ROFS section is introduced, this calculation is much more complicated because the contents of the ROFS are likely to be different between the non-DP and DP cases.
The RAM saving can be increased by reducing the set of unpaged components and/or reducing the minimum paging cache size (i.e. making the configuration more 'stressed'). Performance can be improved (up to a point) by increasing the set of unpaged components and/or increasing the minimum paging cache size (i.e. making the configuration more 'relaxed'). However, if the configuration is made too relaxed then it is possible to end up with a net RAM increase compared with a non-DP ROM.
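The trade-off between 'stressed' and 'relaxed' configurations can be illustrated numerically; the component and cache sizes below are purely hypothetical figures chosen for the example.

```python
# Effective RAM saving of DP (as depicted in Figure 9):
# saving = total size of paged components - minimum paging cache size.
paged_kb = 12_000        # hypothetical: 12 MB of the core image is paged
min_cache_kb = 2_048     # 'stressed' configuration: 2 MB minimum cache
print(paged_kb - min_cache_kb)       # 9952 kB net RAM saving vs a non-DP ROM

relaxed_cache_kb = 14_336            # too 'relaxed': 14 MB minimum cache
print(paged_kb - relaxed_cache_kb)   # -2336 kB, i.e. a net RAM increase
```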
Demand paging is therefore able to present significant advantages in terms of RAM savings, thereby providing an attendant reduction in the manufacturing cost of a device. Additionally, as mentioned above, depending on configuration, performance improvements can also be obtained.
Thus far we have discussed paging, and particularly demand paging, of the core image. However, in the example embodiment represented by Figure 2D, software components stored in the ROFS, or in the user data area 24 may also be paged. As mentioned previously, this is referred to as code paging, and is described next.
In the embodiments illustrated in Figure 2 a program or other software component that is not stored in XIP ROM exists as a file stored in another file system, such as the ROFS or user data area 24. Before such code can be executed it needs to be copied into a particular RAM location, so that it can be properly accessed by the application processor. This RAM location usually cannot be determined ahead of time, and hence in order for the component to run correctly the executable contents must be modified so that the memory pointers contained within them are correct. In particular, the memory pointers should be modified so that they are referenced with respect to the RAM location into which the program is to be loaded. This modification is called 'relocation' and 'fix-up'. The task of reading, copying and modifying executables is performed by the Loader, and is shown in more detail in the form of an example in Figure 10.
More particularly, at 10.2 it is determined that an executable or other data from non-core storage, such as the ROFS or user data area, is required to be loaded into RAM. Loading into RAM, as mentioned, allows the executable to be accessed by the application processor, so that the executable can be run. In response to this determination, at 10.4 the Loader reads in the data to be loaded, and at 10.6 the memory management unit returns to the Loader the RAM location into which the data is to be written.
At this point in the example, therefore, the RAM is available and the data has been read. However, it is not possible to copy the data directly to the RAM location (in fact a range of RAM locations, referenced by its start and end points), as the memory pointers in the data (executable) will be incorrect: by default they will likely reference parts of the RAM other than the part into which the data is to be loaded. Therefore, the memory pointers should be updated to take into account the memory location into which the data is to be written. In some embodiments, this may be as straightforward as adding an offset to the existing memory pointer values, the offset being determined from the RAM location address into which the data is to be written. Howsoever the new memory pointer values are determined, the memory pointers are updated by the Loader at 10.8 in the example of Figure 10. Thereafter, the updated data (executable) can be written to the determined RAM location, at 10.10. The data (executable) is then available to the application processor.
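The offset-based fix-up described above can be sketched as follows. The word-array representation of the executable and the explicit relocation table are simplifications for illustration, not the actual Loader implementation.

```python
def relocate(code_words, reloc_indices, load_address, link_address=0):
    """Apply the 'relocation'/'fix-up' step of Figure 10 (a sketch).

    code_words: the executable image as a list of 32-bit words;
    reloc_indices: word positions that hold absolute memory pointers.
    Each such pointer has the difference between the link address and the
    actual load address added to it, so it references the RAM range into
    which the image is really being loaded.
    """
    delta = load_address - link_address
    fixed = list(code_words)
    for i in reloc_indices:
        fixed[i] = (fixed[i] + delta) & 0xFFFFFFFF  # keep within 32 bits
    return fixed

# A pointer linked as 0x00000040 becomes 0x80000040 when the image is
# loaded at (hypothetical) RAM address 0x80000000.
image = [0xE3A00001, 0x00000040]   # one instruction word, one pointer word
print(hex(relocate(image, [1], 0x80000000)[1]))  # 0x80000040
```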
Code paging as described in the context of these embodiments is therefore a form of demand paging, but of code which is located outside of XIP ROM storage. Paging of such code may be considerably more complex than XIP ROM paging because not only is it necessary to copy the contents into memory on demand, but, as described, the relocation and fix-up modifications should be applied at the same time. Due to this additional overhead of code paging, it may be preferable to use XIP ROM paging where possible. Layout D in Figure 2 shows a NAND flash structure in an embodiment of the invention in which both code paging and XIP ROM paging are used. Only those parts of an executable currently in use are copied into RAM. XIP ROM paging may be used for most data in ROM and code paging may be used for any remaining paged executables in ROFS. Note that, unlike in XIP ROM paging, in some examples pages in code paged executables do not cross executable boundaries, so the last page of an executable may contain less than 4 kB of data.
It is also worth noting that many prior art operating systems do not implement code paging as described above. Instead, when they need to page out the contents of executables, they write the modified contents to some storage media (backing store) from where it can be recovered. This may have negative implications for power consumption and 'wearing' of the storage media used for such a backing store, but other advantages may potentially be achieved.
Whilst some of the above described embodiments are discussed in the context of the example of a smartphone, it should be understood that in other embodiments different types of device may be provided, for various different functions. For example, the techniques of the present invention may be used to provide embodiments with different applications, such as for example, as a general purpose computer, or as a portable media player, or other audio visual device, such as a camera. Any device or machine which incorporates a computing device provided with RAM into which data and programs need to be loaded for execution may benefit from the invention and constitute an embodiment thereof. The invention may therefore be applied in many fields, to provide improved devices or machines that require less RAM to operate than had heretofore been the case.
In addition, whilst embodiments have been described in respect of a smartphone running Symbian OS, which makes use of a combined file system, it should be further understood that this is presented for illustration only, and in other embodiments the concepts of the demand paging algorithms and/or code paging techniques described herein may be used in other devices, and in particular devices which do not require a split file system such as the composite file system described. Instead, the demand paging algorithm or code paging techniques herein described may be used in any device in which virtual memory techniques involving paging programs and data into memory for use by a processor may be used. It should also be noted that whilst an embodiment has been described in which both XIP ROM paging (demand paging) and code paging are used together, and this is advantageous for the reasons noted in this application, in other embodiments of the invention one or the other may be used separately. In particular, the code paging techniques herein described may be used separately from the XIP ROM paging techniques.
Embodiments of the present invention apply virtual memory techniques to a system provided with storage in which software components may be read into RAM as pages from one storage area without the content of the pages requiring any modification, whereas software components in another storage area may be read into RAM as pages, but require modification to elements thereof before being written into RAM. In particular, software components in the other storage area require memory pointers contained therein to be updated to account for the RAM location to which they are to be written. However, by performing such memory pointer modifications, more data than has heretofore been the case can be paged into RAM, and hence significant RAM hardware savings can be made with embodiments of the invention.
Embodiments of the invention may allow code which is stored in any storage medium on a device to be paged into RAM, and hence the benefits of paging in terms of reduced RAM requirements can be obtained.
In some embodiments modifying the first software component comprises updating memory pointers in at least the required part of the first software component to account for the determined RAM location. Thus, in such embodiments it may not be necessary to know in advance the memory location into which the code is to be loaded. Instead, any RAM location can be used, and the code is dynamically adapted as it is loaded into the RAM location. Such techniques may even allow the code to be split between two or more RAM locations, with the memory pointers of each part being adapted accordingly.
Additionally, in one embodiment the method further comprises storing second software components in a second storage medium, the second software components being divided into memory pages; and when at least part of a second software component is required by the processor, loading the memory page containing the part of the software components presently required into RAM. This can allow data, including executables, which are stored in XIP ROM to be paged into RAM as required, in addition to data stored in the first storage medium. In an example, data in XIP ROM which is to be paged in does not, however, require modifying, and hence can be loaded straight into RAM.
The first storage medium and the second storage medium can constitute different parts of the same storage medium. With such an arrangement, paging can be used with devices which have a composite file system, such as, for example, smartphones of the prior art. The storage medium may be NAND flash memory, which is commonly used for portable computing devices such as a smartphone, MP3 player, or the like, because of its relatively low cost. However, it has a drawback that it is not XIP. Embodiments of the present invention may allow the use of paging from NAND flash into RAM nevertheless, thereby allowing RAM savings to be achieved together with the use of NAND flash memory.
In one embodiment the first component and/or second software component is demand paged into RAM. Thus, there may be no need for scheduling of which memory pages are required to be loaded. Instead, it may be determined that a first or second software component is required by a program thread by the thread attempting to access the page, thereby generating a page fault.
In a particular embodiment a paging cache of pages of the second software components which have been recently loaded is maintained, the paging cache being arranged on a first-in, first-out (FIFO) basis. Maintaining a paging cache as a FIFO can allow a large degree of control to be maintained over the paging process, and can help prevent memory pages from completely filling up the available RAM. The paging cache may be divided into at least two parts, being a young page part having the pages most recently loaded and an old page part with pages less recently loaded. This feature in combination with the FIFO arrangement provides an effective Least Recently Used (LRU) type paging algorithm, which is relatively straightforward to implement, but which results in less page churn than other known LRU implementations. In the above embodiment the relative sizes of the young page part and the old page part are controlled to maintain substantially a predetermined young/old size ratio. Here, when a new page is loaded into RAM and entered into the young page list, another page previously loaded into RAM may be transferred into the old page part in dependence on the young/old size ratio and the present relative sizes of the young page part and old page part. In particular, the relative sizes of the young page part and the old page part are controlled to maintain the young/old size ratio by transferring pages between the two parts, and deleting pages from the old part.
Additionally in this example embodiment, a page in the old page part is preferably inaccessible to the processor, but when access is required to a page in the old page part the page is transferred into the young page part for access by the processor. This allows pages to be aged out of the paging cache, but if they are required again whilst still in the old page part, they can simply be transferred back into the young page part so as to be accessed by the processor. This presents significantly less overhead than having to load the page again from the storage media.
In this way, the old page part of the cache acts as a sort of buffer to provide extra time for a page to be re-used, before it is completely paged out and made dead. As a consequence, less page churn results.
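The young/old list mechanism described above can be sketched as a small model. The class, its ratio parameter and the rebalancing rule are illustrative simplifications, not the actual kernel implementation: new pages enter the young list, pages age into the (inaccessible) old list when the young/old ratio is exceeded, accessing an old page rejuvenates it, and eviction takes the back of the old list first.

```python
from collections import deque

class PagingCache:
    """Two-list FIFO paging cache approximating LRU (a sketch)."""

    def __init__(self, ratio=3):
        self.young = deque()   # most recently loaded pages at the front
        self.old = deque()     # aged pages, inaccessible until rejuvenated
        self.ratio = ratio     # target young:old size ratio

    def _rebalance(self):
        # age pages out of the young list until young <= ratio * old
        while self.young and len(self.young) > self.ratio * len(self.old):
            self.old.appendleft(self.young.pop())

    def access(self, page):
        if page in self.old:          # old pages are inaccessible; accessing
            self.old.remove(page)     # one transfers it back to the young list
            self.young.appendleft(page)
        elif page not in self.young:  # page fault: load and enter young list
            self.young.appendleft(page)
        self._rebalance()             # young-list hits are not reordered (FIFO)

    def evict(self):
        # shrink the cache: the least recently used page sits at the back
        # of the old list (or, if the old list is empty, of the young list)
        return self.old.pop() if self.old else self.young.pop()

cache = PagingCache(ratio=3)
for p in ["A", "B", "C", "D", "E"]:
    cache.access(p)
cache.access("A")        # rejuvenates A from the old list
print(cache.evict())     # B: the least recently used page is evicted first
```

Because accessing a page already in the young list does not reorder it, the scheme keeps the FIFO simplicity while the old list acts as the second-chance buffer described above.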
Preferably, in some embodiments the paging cache is maintained at a minimum size. If the paging cache is too small then a known problem referred to as "thrashing" can occur, where pages are being loaded into and out of RAM very quickly. As each page load incurs a significant overhead, processing performance can be drastically reduced. However, by maintaining the cache at a minimum size, the problem of thrashing can be reduced.
When the paging cache is larger than the minimum size and a memory allocation event occurs and there is no free memory then memory can be allocated from the paging cache, unless such allocation would cause the paging cache to be lower than the minimum size. Such operation can help ensure that the minimum paging cache size is maintained, but does not prevent the paging cache from being larger than the minimum size. In this respect, if there is free RAM at any given time, then the paging cache can be allowed to grow to use as much RAM as it needs, subject to the RAM constraints.
Various modifications, including additions and deletions, will be apparent to the skilled person to provide further embodiments, any and all of which are intended to fall within the appended claims. It will be understood that any combinations of the features and examples of the described embodiments of the invention may be made within the scope of the invention.

Claims

1. A method comprising: storing first software components in a first storage medium, the first software components being divided into memory pages; and then when at least part of a first software component is required by a processor, performing the following:
a) determining a location in random access memory (RAM) into which at least the required part can be loaded for use by the processor;
b) modifying at least the required part of the first software component in dependence on the determined RAM location; and
c) loading at least the modified required part of the first software component into RAM at the determined location.
2. A method according to claim 1, wherein the said modifying comprises updating memory pointers in at least the required part of the first software component to account for the determined RAM location.
3. A method according to claims 1 or 2, and further comprising: storing second software components in a second storage medium, the second software components being divided into memory pages; and when at least part of a second software component is required by the processor, loading the memory page containing the part of the software components presently required into RAM.
4. A memory management method according to claim 3, wherein the first storage medium and the second storage medium are different parts of the same storage medium.
5. A method according to claim 4, wherein the storage medium is NAND flash memory.
6. A method according to any of the preceding claims, wherein the first software component and/or the second software component is/are demand paged into RAM.
7. A method according to claim 6, wherein it is determined that a first or second software component is required by a program thread by the thread attempting to access a page of the component, thereby generating a page fault.
8. A method according to any of claims 3 to 7, and further comprising maintaining a paging cache of pages of the second software components which have been recently loaded, the paging cache being arranged on a first-in, first-out (FIFO) basis.
9. A method according to claim 8, wherein the paging cache is divided into at least two parts, being a young page part having the pages most recently loaded and an old page part with pages less recently loaded.
10. A method according to claim 9, wherein the relative sizes of the young page part and the old page part are controlled to maintain substantially a predetermined young/old size ratio, wherein when a new page is loaded into RAM and entered into the young page list, another page previously loaded into RAM may be transferred into the old page part in dependence on the young/old size ratio and the present relative sizes of the young page part and old page part.
11. A method according to claims 9 or 10, wherein a page in the old page part is inaccessible to the processor, and wherein when access is required to a page in the old page part the page is transferred into the young page part for access by the processor.
12. A method according to any of claims 8 to 11, wherein the paging cache is maintained at a minimum size.
13. A method according to claim 12, wherein when the paging cache is larger than the minimum size and a memory allocation event occurs and there is no free memory then memory is allocated from the paging cache, unless such allocation would cause the paging cache to be lower than the minimum size.
14. Apparatus comprising: a processor; a first storage medium storing first software components, the first software components being divided into memory pages; and a loader for loading software components into random access memory (RAM) for use by the processor; wherein when at least part of a first software component is required by the processor, the loader performs the following:
a) determining a location in RAM into which at least the required part can be loaded;
b) modifying at least the required part of the first software component in dependence on the determined RAM location; and
c) loading at least the modified required part of the first software component into RAM at the determined location.
15. A computer program or suite of computer programs so arranged such that when executed by a computer it/they cause the computer to operate in accordance with the method of any of claims 1 to 13.
16. A computer readable storage medium storing a computer program or at least one of the suite of computer programs according to claim 15.
PCT/FI2009/050463 2008-05-30 2009-06-01 Memory management method and apparatus WO2009144385A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0809922.8 2008-05-30
GB0809922A GB2460462A (en) 2008-05-30 2008-05-30 Method for loading software components into RAM by modifying the software part to be loaded based on the memory location to be used.

Publications (1)

Publication Number Publication Date
WO2009144385A1 true WO2009144385A1 (en) 2009-12-03

Family

ID=39637924


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5754817A (en) * 1994-09-29 1998-05-19 Intel Corporation Execution in place of a file stored non-contiguously in a non-volatile memory
US20060004984A1 (en) * 2004-06-30 2006-01-05 Morris Tonia G Virtual memory management system
US20070043938A1 (en) * 2003-08-01 2007-02-22 Symbian Software Limited Method of accessing data in a computing device
EP1811384A2 (en) * 2005-12-27 2007-07-25 Samsung Electronics Co., Ltd. Demand paging in an embedded system
EP1909171A1 (en) * 2006-09-29 2008-04-09 Intel Corporation Method and apparatus for run-time in-memory patching of code from a service processor

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58102380A (en) * 1981-12-11 1983-06-17 Hitachi Ltd Virtual storage control system
EP0532643B1 (en) * 1990-06-04 1998-12-23 3Com Corporation Method for optimizing software for any one of a plurality of variant architectures
CA2102883A1 (en) * 1993-02-26 1994-08-27 James W. Arendt System and method for lazy loading of shared libraries


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BRASH D.: "The ARM Architecture Version (ARMv6)", 13 April 2007 (2007-04-13), Retrieved from the Internet <URL:http://web.archive.org/web/20070413003329/http://www.arm.com/pdfs/ARMv6_Architecture.pdf> [retrieved on 20090911] *
HANDLEY D.: "Demand Paging on Symbian OS", I.Q. MAGAZINE ONLINE, vol. 7, no. 2, April 2008 (2008-04-01), pages 71 - 76, Retrieved from the Internet <URL:http://www.iqmagazineonline.com/IQ/IQ23/pdfs/IQ23_pgs71-76.pdf> [retrieved on 20090911] *
HANDLEY, D.: "Demand Paging on Symbian OS", TECHONLINE, TECHNICAL PAPERS [ONLINE], Retrieved from the Internet <URL:http://www.techonline.com/learning/techpaper/208403594> [retrieved on 20090911] *
HANDLEY, D.: "Mobile Handset DesignLine, Technical Papers Archive [online]", Retrieved from the Internet <URL:http://www.mobilehandsetdesignline.com/learning/techpaper/archive/;jsessionid=GOFNCXOUDSHHGQSNDLRSKHSCJUNN2JVN?howManyDisplay=25&sortingSelect=&start-at=26> [retrieved on 20090911] *
SHACKMAN M.: "What's New for Developers in Symbian OS v9.4", 20 October 2007 (2007-10-20), Retrieved from the Internet <URL:http://web.archive.org/web/20071020084032/http://developer.symbian.com/main/downloads/papers/whatsnew9.4/What's_new_for_developers_v9.4.pdf> [retrieved on 20090911] *
TANENBAUM A.S., MODERN OPERATING SYSTEMS (3RD INTERNATIONAL EDITION), PEARSON EDUCATION, INC., 19 February 2008 (2008-02-19) *

Also Published As

Publication number Publication date
GB0809922D0 (en) 2008-07-09
GB2460462A (en) 2009-12-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 09754038; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 09754038; Country of ref document: EP; Kind code of ref document: A1)