CN112988388A - Memory page management method and computing device - Google Patents


Info

Publication number
CN112988388A
Authority
CN
China
Prior art keywords
memory
page
memory page
target
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110298621.2A
Other languages
Chinese (zh)
Inventor
Song Chang (宋昌)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202110298621.2A
Publication of CN112988388A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5011: Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5016: Allocation of resources to service a request, the resource being the memory
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0877: Cache access modes
    • G06F 12/0882: Page mode
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601: Interfaces specially adapted for storage systems
    • G06F 3/0602: Interfaces specially adapted to achieve a particular effect
    • G06F 3/0614: Improving the reliability of storage systems
    • G06F 3/0616: Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601: Interfaces specially adapted for storage systems
    • G06F 3/0628: Interfaces making use of a particular technique
    • G06F 3/0655: Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F 3/0656: Data buffering arrangements

Abstract

The embodiment of the invention discloses a memory page management method and a computing device. The method includes: when a target memory page in a second memory space meets a first preset condition, determining the memory region to which the target memory page currently belongs, where the first preset condition is that the cache conflict value of the target memory page reaches a cache conflict threshold, or that the write count of the target memory page reaches a write threshold; and, if the target memory page is a dirty page, migrating the target memory page from its current memory region to a target memory region, the target memory region being the memory region with the smallest cache conflict value in the first memory space. In this way, heterogeneous memory can be utilized effectively while efficient access to memory pages is guaranteed.

Description

Memory page management method and computing device
Technical Field
The present invention relates to the technical field of computing devices, and in particular, to a memory page management method and a computing device.
Background
With the continuous development of computing device technology, the requirements of people for computing device performance are continuously increased. In order to improve the performance of the computing device, one or more memory spaces may be provided in the computing device, each memory space may have at least one memory region, and each memory region may include at least one memory page, where memory data may be managed in units of memory pages.
In a single memory architecture having one memory space, a memory page coloring technique (page coloring) can be used to manage memory pages. Each memory page is colored, that is, allocated to a corresponding memory region, according to the number of times data can be obtained through the Last Level Cache (LLC) in each memory region. This reduces LLC access conflicts among the cores of the computing device's processor, increases the number of accesses served directly by the LLC, and thereby improves the cache hit rate.
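As an illustration of the idea, the sketch below assigns each physical page a color from the address bits it shares with the LLC set index, so pages of the same color compete for the same cache sets. This is a minimal Python sketch under assumed parameters; the page size, color count, and helper names (`page_color`, `allocate_by_color`) are illustrative, not taken from the patent:

```python
PAGE_SHIFT = 12   # assumed 4 KiB pages
NUM_COLORS = 64   # assumed number of page colors (LLC sets / sets per page)

def page_color(phys_addr: int) -> int:
    """Return the cache color of the page containing phys_addr: the low
    bits of the physical page number that also index the LLC set."""
    return (phys_addr >> PAGE_SHIFT) % NUM_COLORS

def allocate_by_color(free_pages, wanted_color):
    """Pick a free page of the requested color; cores (or regions) given
    disjoint colors then contend for disjoint LLC sets."""
    for page in free_pages:
        if page_color(page) == wanted_color:
            free_pages.remove(page)
            return page
    return None  # no page of that color is free
```

Giving each core a disjoint set of colors keeps their working sets in disjoint LLC sets, which is what reduces inter-core access conflicts.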
However, the reduction of access conflicts among processor cores achieved by memory page coloring targets memory architectures based on a single memory space, such as an architecture in which only the Dynamic Random Access Memory (DRAM) memory space shown in fig. 1 exists, or one in which only the Non-Volatile Memory (NVM) memory space shown in fig. 1 exists. For a hybrid memory architecture with at least two memory spaces, such as the architecture shown in fig. 1 in which both DRAM and NVM exist, this method does not take the characteristics of the memory spaces into account (for example, the limited write lifetime of some memory spaces). Managing memory pages directly in this way therefore fails to exploit the storage characteristics of the heterogeneous storage media and may wear out one of them prematurely.
Disclosure of Invention
The technical problem to be solved by the present application is how to utilize heterogeneous memory effectively while guaranteeing efficient access to memory pages when multiple memory spaces provided by heterogeneous memories coexist.
In a first aspect, the present application provides a memory page management method in which two heterogeneous memories respectively provide a first memory space and a second memory space. The first memory space has at least one memory region, and each memory region in the first memory space includes at least one memory page; the second memory space likewise has at least one memory region, and each memory region in the second memory space includes at least one memory page. The method includes: when a target memory page in the second memory space meets a first preset condition, determining the memory region to which the target memory page currently belongs, where the first preset condition is that the cache conflict value of the target memory page reaches a cache conflict threshold, or that the write count of the target memory page reaches a write threshold; and, if the target memory page is a dirty page, migrating the target memory page from its current memory region to a target memory region, the target memory region being the memory region with the smallest cache conflict value in the first memory space.
It can be seen that, by implementing the method provided in the first aspect, when a dirty target memory page in the second memory space becomes hot, its memory region can be changed to the memory region with the smallest cache conflict value in the first memory space. This accounts for the fact that a high write count or cache conflict value would overload the memory backing the second memory space and shorten its service life, so the method extends the life of that memory while still guaranteeing efficient access to the memory page.
As an optional implementation, before the target memory page is migrated from its current memory region to the target memory region, the method further includes: obtaining a first overhead value of the target memory page, the first overhead value being the overhead incurred by the target memory page due to cache conflicts; obtaining a second overhead value of the target memory page, the second overhead value being the overhead caused by changing the memory region of the target memory page; and determining that the first overhead value is greater than the second overhead value.
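The comparison of the two overhead values can be sketched as a simple predicate. The Python below is a hypothetical sketch; the per-miss penalty model and the names are assumptions, not the patent's method:

```python
def should_migrate(conflict_misses: int, miss_penalty: float,
                   migration_cost: float) -> bool:
    """Migrate only when the overhead caused by cache conflicts (the first
    overhead value, modeled here as misses times a per-miss penalty)
    exceeds the one-off cost of changing the page's memory region (the
    second overhead value)."""
    first_overhead = conflict_misses * miss_penalty
    second_overhead = migration_cost
    return first_overhead > second_overhead
```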
It can be seen that, by implementing the above optional implementation, the first and second overhead values determine the right moment to adjust the memory region, balancing the performance overhead introduced by the adjustment against the reduction in last-level-cache conflicts obtained after it.
As an optional implementation, the method further includes: if the target memory page is not a dirty page, migrating the target memory page from its current memory region to the memory region with the smallest cache conflict value in the second memory space.
As can be seen, by implementing the above optional implementation, a target memory page that is not dirty is moved to the memory region with the smallest cache conflict value within the same memory space (the second memory space). This avoids frequent writes to the target memory page in the second memory space, avoids wearing the memory of the first memory space through additional writes, and avoids the overhead of migrating the memory page across memory spaces.
As an alternative embodiment, the cache conflict threshold is determined according to the following formula:

Thr_miss = T_cost / (β · L_r)

where Thr_miss denotes the cache conflict threshold, L_r denotes the read memory latency of the second memory space, β denotes the ratio by which the cache conflict value of the target memory page decreases after its memory region is changed, and T_cost denotes the second overhead value of the target memory page.
It can be seen that, by implementing the above optional implementation, the cache conflict threshold can be determined from these parameters and used to choose the moment to adjust the memory region. This balances the performance overhead of adjusting the memory region against the reduction in last-level-cache conflicts achieved afterwards, and extends the service life of the memory backing the second memory space.
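A hedged sketch of the cache conflict threshold: reading the threshold as the migration overhead divided by the read latency saved per conflict miss gives the closed form Thr_miss = T_cost / (β · L_r). This closed form is an assumption, since the source renders the formula only as an image:

```python
def conflict_threshold(t_cost: float, beta: float, read_latency: float) -> float:
    """Thr_miss = T_cost / (beta * L_r): the number of conflict misses at
    which the read latency saved by cutting conflicts by the ratio beta
    outweighs the migration overhead T_cost."""
    return t_cost / (beta * read_latency)
```

For example, with T_cost = 1000, β = 0.5, and L_r = 100 latency units, migration pays off after 20 conflict misses.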
As an alternative embodiment, the write threshold is determined according to the following formula:

Thr_w-nvm = T_cost / (α · β · (L_w-nvm - L_w-dram))

where Thr_w-nvm denotes the write threshold, L_w-nvm denotes the write memory latency of the second memory space, L_w-dram denotes the write memory latency of the first memory space, α denotes a factor controlling how much of the service life of the memory backing the second memory space may be sacrificed to improve performance, β denotes the ratio by which the cache conflict value of the target memory page decreases after its memory region is changed, and T_cost denotes the second overhead value of the target memory page.
As can be seen, by implementing the above optional implementation, the write threshold can be determined from these parameters, so that the computing device can use it to choose the moment to adjust the memory region of the target memory page and thereby extend the write lifetime of the second memory space.
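A hedged sketch of the write threshold: reading it as the migration overhead divided by the write latency saved per write-back, scaled by α and β, gives Thr_w-nvm = T_cost / (α · β · (L_w-nvm - L_w-dram)). This closed form is an assumption, since the source renders the formula only as an image:

```python
def write_threshold(t_cost: float, alpha: float, beta: float,
                    l_w_nvm: float, l_w_dram: float) -> float:
    """Thr_w-nvm = T_cost / (alpha * beta * (L_w-nvm - L_w-dram)): the
    write-back count at which moving a hot dirty page out of the second
    (NVM-like) space saves more write latency than the migration costs."""
    return t_cost / (alpha * beta * (l_w_nvm - l_w_dram))
```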
As an optional implementation, the method further comprises: receiving a write request; and allocating a first memory page from the memory area with the smallest cache conflict value in the first memory space, where the first memory page is used to store the data to be written specified by the write request.
Therefore, by implementing the above optional implementation, when the memory access request is a write request, the memory space suited to writes, namely the first memory space, is selected. This avoids placing the data in the second memory space, which would shorten the write lifetime of the memory backing it, balances the read/write performance and energy consumption of the memory spaces, and improves the performance of the computing device as a whole.
As an optional implementation, the method further comprises: receiving a read request; and allocating a second memory page from the memory area with the smallest cache conflict value in the second memory space, where the second memory page is used to store the data specified by the read request.
Therefore, by implementing the above optional implementation, when the memory access request is a read request, the memory space suited to reads, namely the second memory space, is selected. This makes full use of the read performance of the memory backing the second memory space, balances the read/write performance and energy consumption of the memories backing the memory spaces, and improves the performance of the computing device as a whole.
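The two allocation rules amount to routing requests by type and then picking the least-conflicting region. A minimal sketch with assumed names and data shapes, not the patent's interface:

```python
def choose_space(request_type: str) -> str:
    """Writes go to the first memory space (DRAM-like, no write-endurance
    limit); reads go to the second memory space (NVM-like, cheap reads)."""
    return "first" if request_type == "write" else "second"

def least_conflict_region(regions: dict) -> str:
    """Pick the memory region with the smallest cache conflict value,
    given a mapping {region_name: cache_conflict_value}."""
    return min(regions, key=regions.get)
```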
As an optional implementation, when a third memory page in the first memory space meets a second preset condition, the memory region to which the third memory page currently belongs is determined, where the second preset condition is that the cache conflict value of the third memory page reaches the cache conflict threshold. If the third memory page is a dirty page, it is migrated from its current memory region to the memory region with the smallest cache conflict value in the first memory space; if it is not a dirty page, it is migrated from its current memory region to the memory region with the smallest cache conflict value in the second memory space.
As can be seen, by implementing the above optional implementation, the third memory page is preferentially given a memory region with the smallest cache conflict value, and the memory space is chosen according to the type of the memory page. This avoids the performance loss caused by the asymmetric read/write performance and energy consumption of some memory spaces, and thus yields the best processing performance.
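The re-placement rule for a page that has met a preset condition can be sketched as below; the page and region representations are assumptions chosen for illustration:

```python
def replace_page(page: dict, first_regions: dict, second_regions: dict):
    """A dirty page goes to the least-conflicting region of the first
    memory space (sparing the second space's write endurance); a clean
    page goes to the least-conflicting region of the second memory space.
    Regions are given as {name: cache_conflict_value}."""
    regions = first_regions if page["dirty"] else second_regions
    target = min(regions, key=regions.get)
    return ("first" if page["dirty"] else "second", target)
```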
In a second aspect, a computing device is provided that has the functionality to implement the computing device behavior in the first aspect or its possible implementations. The functionality may be realized by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functionality, and a module may be software and/or hardware. Based on the same inventive concept, the principle by which the computing device solves the problem, its implementation, and its advantageous effects follow those of the first aspect and its possible method embodiments, and repeated details are not described here.
In a third aspect, a computing device is provided, the computing device comprising a memory for storing one or more programs and a processor for running the one or more programs. For the implementation and advantageous effects with which the computing device solves the problem, reference may be made to the possible methods of the first aspect and their advantageous effects; repeated details are not described here.
In a fourth aspect, a storage medium readable by a computing device is provided. The storage medium stores a program that, when executed by a processor, causes the processor to perform the method of the first aspect and each of its possible implementations; repeated details are not described here.
In a fifth aspect, a computer program product is provided which, when run on a computing device, causes the computing device to perform the method of the first aspect and each of its possible implementations.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is an overall architecture diagram for memory page management according to an embodiment of the present invention;
FIG. 2 is a diagram of another overall architecture for memory page management according to an embodiment of the present invention;
fig. 3 is a schematic flow chart illustrating a memory page management method according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a memory space according to an embodiment of the present invention;
fig. 5 is a schematic flow chart illustrating another memory page management method according to an embodiment of the present invention;
fig. 6 is a schematic flow chart illustrating a further memory page management method according to an embodiment of the present invention;
fig. 7 is a schematic view illustrating a scenario of a memory page management method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computing device according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of another computing device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described below with reference to the drawings.
Access latency refers to the delay incurred by a processor's access to data stored in memory. In order to reduce the impact of access latency on the processor, a caching mechanism of the last level cache LLC is often utilized in the design of multi-core processors.
When accessing data, each core of the processor first accesses its own private cache (for example the L1 cache shown in fig. 2; L2 and L3 caches, not shown in fig. 2, may also be present). If the data is not in the private cache, the LLC is accessed; if the data is not in the LLC either, the memory space is accessed and the accessed data is loaded into the LLC, so that when the processor accesses the data again it can be served directly from the LLC. Using the LLC thus reduces the number of accesses to the memory space and increases the number of times data can be obtained directly from the cache, i.e., it improves the cache hit rate.
The memory data can be managed in units of memory pages. In a single memory architecture with a single memory space, a memory page coloring technology can be used to manage memory pages, and each memory page is colored according to the cache hit rate of each memory area, so that access conflicts between cores to the LLC are reduced, and the cache hit rate is improved.
However, taking a single memory architecture with only a DRAM memory space as an example, a purely DRAM-based memory architecture not only causes substantial system power consumption through frequent refresh operations, but DRAM manufacturing process constraints also severely limit how far DRAM can be scaled. Likewise, taking a single memory architecture with only an NVM memory space as an example, the NVM memory space offers higher storage density, lower static power consumption, and non-volatility compared with a DRAM memory space; however, because of the inherent characteristics of NVM, it also suffers from high write memory latency, a limited write lifetime, and high power consumption for write operations.
Therefore, in order to make full use of the advantages of each memory space while avoiding its disadvantages, hybrid memory architectures have become a popular research direction. The page coloring technique used in a single memory architecture maximizes the cache hit rate to guarantee efficient access to memory pages. Applied unchanged in a hybrid memory architecture, it still raises the cache hit rate, but it disregards the storage characteristics of the heterogeneous storage media, namely the asymmetric read/write performance and energy consumption and the limited write lifetime of some memory spaces (for example, the NVM memory space as opposed to the DRAM memory space).
In order to solve the foregoing technical problems, embodiments of the present invention provide a memory page management method and a computing device, which can manage memory pages by fully considering characteristics of a memory space, effectively utilize a heterogeneous memory, and ensure efficient access to the memory pages, thereby obtaining optimal computing device performance. The following are detailed below.
In order to better understand the memory page management method provided in the embodiment of the present invention, the architecture of the embodiment of the present invention is described first.
Fig. 1 is an overall architecture diagram for memory page management according to an embodiment of the present invention. It can be seen that the overall architecture shown in fig. 1 includes a processor 101, a last level cache LLC 102, a first memory space 103, and a second memory space 104.
It should be noted that the overall architecture may be disposed in any computing device, and the computing device may be a notebook computer, a tablet computer, a desktop computer, a server, a terminal device (e.g., a smart phone, a wearable device, etc.), and the like, which is not limited in this respect.
For better illustration, in fig. 1 the first memory space 103 and the second memory space 104 are the storage spaces corresponding to two heterogeneous memories, where "heterogeneous" means that the storage media of the two memories differ. For example, if the two heterogeneous memories are DRAM and NVM, the first memory space 103 may be the memory space of the DRAM and the second memory space 104 the memory space of the NVM. It should be understood that, in other embodiments, the memory providing the first memory space or the second memory space may also be a Static Random Access Memory (SRAM), a Resistive Random-Access Memory (ReRAM), a Spin-Transfer Torque magnetic memory (STT-RAM), a Phase-Change Memory (PCM), another non-volatile memory, and the like, which is not limited in this disclosure.
Specifically, the processor 101 may be disposed in the computing device, and may obtain cache conflict values of memory regions in the last level cache LLC102 with respect to respective memory spaces, where each memory space has at least one memory region, and each memory region may include at least one memory page. In other words, the first memory space has at least one memory region, and each memory region in the first memory space includes at least one memory page; the second memory space has at least one memory region, and each memory region in the second memory space includes at least one memory page.
The cache conflict value of a memory region may be determined from the cache conflict values of the memory pages in that region. Optionally, the cache conflict value of a memory region is the sum of the cache conflict values of its memory pages. For example, memory region A includes three memory pages, memory page 1, memory page 2, and memory page 3, with cache conflict values of 40, 50, and 60 respectively, so the cache conflict value of memory region A may be 40 + 50 + 60 = 150.
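The summation rule can be stated in one line; the function name is illustrative:

```python
def region_conflict_value(page_conflict_values):
    """A region's cache conflict value as the sum of the cache conflict
    values of the memory pages it contains (pages valued 40, 50 and 60
    give a region value of 150)."""
    return sum(page_conflict_values)
```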
Specifically, when the target memory page in the second memory space meets the first preset condition, the memory region to which the target memory page currently belongs is determined. The processor 101 may then determine the type of the target memory page and, if it is a dirty page, migrate it from its current memory region to the memory region with the smallest cache conflict value in the first memory space.
Therefore, when the cache conflict value or write count of the target memory page is high, its access efficiency drops sharply; migrating the target memory page from its current memory region to the target memory region guarantees efficient access to it. At the same time, the first preset condition fixes the moment at which the memory region is changed, which avoids the lifetime reduction that overloading the memory backing the second memory space would cause, and thus extends the service life of that memory.
For a more detailed description, please refer to fig. 2, which is a block diagram of another overall architecture for memory page management according to an embodiment of the present invention. It can be seen that the overall architecture shown in fig. 2 includes a processor 201, a plurality of L1 caches 202, a last level cache LLC203, a first memory space 204, a second memory space 205, a cache access monitoring module 206, and a memory region management module 207.
The processor 201 is a multi-core processor, and the embodiment of the present invention is described by taking the core 1, the core 2, and the core 3 as an example, but in other embodiments, the number of cores of the processor may be any number, which is not limited in the present invention.
Core 1 has its own private L1 cache, core 2 has its own private L1 cache, and core 3 has its own private L1 cache. Core 1, core 2, and core 3 may share the last level cache LLC 203.
It should be noted that below the L1 cache there may further be, in order, an L2 cache, an L3 cache, an L4 cache, and so on. Although fig. 2 does not show the L2, L3, or L4 caches, if there are four cache levels then the L4 cache is the last level cache LLC; the present invention does not limit the number of cache levels.
The cache access monitoring module 206 may record the access behavior of any program toward the last level cache LLC. Specifically, the cache access monitoring module 206 may record the cache conflict values of the memory regions of the first memory space 204, the cache conflict values of the memory regions of the second memory space 205, the cache conflict values of the memory pages in the memory regions of the first memory space 204 and of the second memory space 205, the write counts of the memory pages of the second memory space 205, and the like.
The memory region management module 207 may invoke a page recoloring algorithm to change the memory region to which a memory page currently belongs, and may determine the cache conflict threshold of a memory page as well as the write threshold of the memory pages of the second memory space 205.
In one possible embodiment, the last level cache LLC203 may record cache conflict values for memory pages in respective memory spaces. When the data or instructions accessed by the processor 201 are not present in the last level cache LLC203, the last level cache LLC203 may generate a cache conflict signal and send the cache conflict signal to the cache access monitoring module 206.
In a possible implementation, after receiving the cache conflict signal, the cache access monitoring module 206 may increment the cache conflict value of the memory page corresponding to the signal. When the cache conflict value of a certain memory page (assumed here to be memory page 1 in the second memory space 205) reaches the cache conflict threshold, the cache access monitoring module 206 may generate a cache interrupt signal and send it to the processor 201.
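The counting-and-interrupt behavior of the cache access monitoring module can be sketched as follows. This is a hypothetical model, not the module's actual interface; the class and method names are assumptions:

```python
class CacheAccessMonitor:
    """Counts cache conflict signals per memory page and raises a cache
    interrupt (via a callback standing in for the signal to the processor)
    when a page reaches the cache conflict threshold."""
    def __init__(self, conflict_threshold: int, on_interrupt):
        self.conflict_threshold = conflict_threshold
        self.on_interrupt = on_interrupt
        self.conflicts = {}  # page id -> cache conflict value

    def cache_conflict_signal(self, page_id):
        self.conflicts[page_id] = self.conflicts.get(page_id, 0) + 1
        if self.conflicts[page_id] == self.conflict_threshold:
            self.on_interrupt(page_id)  # processor then re-places the page
```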
It should be noted that, although not shown, in some possible embodiments, the cache access monitoring module 206 may be disposed in the last level cache LLC 203.
In a possible implementation, after receiving the cache interrupt signal, the processor 201 may invoke the memory region management module 207 to adjust memory page 1, specifically to move it to the memory region with the smallest cache conflict value.
In a possible implementation, the processor 201 may select the destination memory region according to the page status of memory page 1: if the page status is a dirty page (indicating a modified page), the memory page is placed in the memory region of the first memory space 204 with the smallest cache conflict value.
If the page status is not a dirty page, for example a clean page (indicating an unmodified page), the memory page is placed in the memory region of the second memory space 205 with the smallest cache conflict value.
In a possible embodiment, the last-level cache LLC203 may also record the number of writes of each memory page in the second memory space 205. When the processor 201 accesses data or instructions and the cache is full, a cache block is replaced to free cache space for the data or instructions to be accessed. If the replaced cache block is dirty, that is, the data in the cache block has been modified, the data is written back to memory to maintain data consistency; this process is recorded as a memory write-back operation. Each time a memory write-back operation occurs, the last-level cache LLC203 may generate a write-back signal and send the write-back signal to the cache access monitoring module 206.
In a possible implementation manner, after receiving the write-back signal, the cache access monitoring module 206 may increment the write-back count of the memory page corresponding to the write-back signal. When the write-back count of a certain memory page of the second memory space 205 (i.e., the target memory page, assumed here to be memory page 2) reaches the write threshold, the cache access monitoring module 206 may generate a write-back interrupt signal and send the write-back interrupt signal to the processor 201.
In a possible implementation manner, after receiving the write-back interrupt signal, the processor 201 may invoke the memory region management module 207 to change the memory region to which memory page 2 currently belongs. Specifically, since memory page 2 may be a dirty page, the processor 201 may move memory page 2 to the memory region of the first memory space 204 with the smallest cache conflict value.
It should be noted that the write threshold and the cache conflict threshold may be determined by a series of parameters obtained according to the program access behavior of the last-level cache LLC recorded by the cache access monitoring module 206, so as to adaptively adjust the write threshold and the cache conflict threshold according to the parameters, thereby improving the system performance.
The following are examples of methods provided by embodiments of the present invention. It should be noted that the method embodiments provided in the embodiments of the present invention may be executed by any computing device, for example, a laptop, a tablet, a desktop, a server, or a terminal device (e.g., a smartphone, a wearable device, etc.); specifically, the method may be executed by a processor of the computing device controlling other components, and the embodiment of the present invention is not limited in this respect.
Referring to fig. 3, fig. 3 is a schematic flow chart illustrating a memory page management method according to an embodiment of the present invention. The method as shown in fig. 3 may include:
301. When a target memory page in the second memory space meets a first preset condition, determine the memory region to which the target memory page currently belongs.
It should be noted that the execution subject of the method provided by the embodiment of the present invention may be various computing devices, and specifically, the execution subject may be executed by a processor in the computing device in cooperation with other structures (e.g., a memory, etc.).
Two heterogeneous memories correspondingly provide a first memory space and a second memory space. The first memory space has at least one memory region, and each memory region in the first memory space includes at least one memory page; the second memory space has at least one memory region, and each memory region in the second memory space includes at least one memory page.
It should be noted that, the first memory space is, for example, a memory space of a DRAM, and the second memory space is, for example, a memory space of an NVM, and the embodiment of the invention does not limit this.
In some possible embodiments, the last-level cache may record cache conflict values of memory pages in each memory space (including the second memory space); for example, if an access to a memory page of the second memory space conflicts, the last-level cache may increment the cache conflict value of the corresponding memory region of the second memory space.
The last-level cache can also record and update the cache conflict value of the memory area of each memory space in real time.
In some possible embodiments, the last-level cache may add a counter to each memory region for recording the cache conflict value of that memory region. For example, the last-level cache may generate a cache conflict signal when data or instructions are mapped to a memory region but not successfully accessed in the last-level cache, i.e., a cache conflict occurs. If the cache conflict signal matches the match signal of the memory region to which the data or instructions are mapped, the counter of that memory region may be incremented accordingly.
It should be further noted that the first memory space and the second memory space may include free memory spaces of corresponding storage media, and page frames (one page frame may correspond to one memory page) with the same memory color may be connected into a linked list.
For example, as shown in fig. 4, the memory space includes a first memory space and a second memory space. The first memory space may include page frames of a plurality of colors 0, page frames of a plurality of colors 1, and page frames of a plurality of colors n (n may be an arbitrary integer). The page frames with the same color can be connected into a linked list to form the memory area of the color. For example, page frames of color 0 may be linked into a linked list to form a memory region of color 0, and page frames of color 1 may be linked into a linked list to form a memory region of color 1. Similarly, the second memory space may also be formed into a memory area in a linked list manner, which is not described herein again.
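The linked-list organization of same-colored page frames can be sketched as follows (an illustrative Python model; the function name `build_color_regions` and the frame-id/color representation are assumptions):

```python
from collections import defaultdict, deque

# Page frames of the same memory color are linked into one list per color;
# each per-color list stands for that color's memory region.
def build_color_regions(page_frames):
    """page_frames: iterable of (frame_id, color) pairs.
    Returns a mapping color -> deque of frame ids (the 'linked list')."""
    regions = defaultdict(deque)
    for frame_id, color in page_frames:
        regions[color].append(frame_id)
    return regions

regions = build_color_regions([(0, 0), (1, 1), (2, 0), (3, 1)])
assert list(regions[0]) == [0, 2]  # color-0 region
assert list(regions[1]) == [1, 3]  # color-1 region
```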
In some possible implementations, the sizes of the first memory space and the second memory space may be allocated by a processor of the computing device. The processor may allocate the sizes according to the number of times each memory space interacts with the memory management system. For example, the processor may set the size ratio of the first memory space to the second memory space to 1:3 according to the numbers of times the first memory space and the second memory space interact with the memory management system. Of course, the above modes are only examples and are not exhaustive, and the allocation of memory space sizes includes, but is not limited to, the above alternative modes.
In some possible embodiments, the cache conflict value of the memory page may be recorded by a Translation Lookaside Buffer (TLB). For example, a counter is added to each entry of the TLB to record cache conflict information for each memory page. When the data or instruction accessed by the processor misses in the last-level cache, the last-level cache can generate a cache conflict signal and send the cache conflict signal to the TLB, and the TLB can add one to the corresponding memory page record through the counter. When the cache conflict value of a certain memory page reaches the cache conflict threshold, a cache interrupt signal may be generated and sent to the processor of the computing device.
The first preset condition is that the cache conflict value of the target memory page reaches a cache conflict threshold, or the write frequency of the target memory page reaches a write threshold.
It should be further noted that the target memory page of the second memory space may refer to any memory page of the second memory space whose cache conflict value reaches the cache conflict threshold, or whose write frequency reaches the write threshold.
Optionally, the cache conflict threshold may be determined by the following formula:
Thr_miss = T_cost / (β × L_r)
wherein Thr_miss denotes the cache conflict threshold, L_r denotes the read memory latency of the second memory space, β denotes the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost denotes a second overhead value of the target memory page, where the second overhead value is the overhead caused by changing the memory region of the target memory page.
In addition, the above parameters L_r, β, T_cost and the like may be obtained from program access behavior.
The second overhead value may specifically represent the processor performance overhead, estimated in advance, that adjusting the memory region of the memory page will incur.
The memory read latency of the second memory space may represent a latency of reading the second memory space.
The ratio by which the cache conflict value of the target memory page decreases after its memory region is changed may represent the estimated percentage decrease, relative to the current cache conflict value, of the cache conflict value of the target memory page after the adjustment, estimated before the memory region corresponding to the memory page is adjusted.
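As a hedged worked example, assume the cache conflict threshold takes the form Thr_miss = T_cost / (β × L_r); this form is a reconstruction from the variable definitions (the original formula image is not reproduced here), chosen so that migration pays off once the conflict latency saved (conflicts × β × L_r) exceeds the migration overhead T_cost:

```python
def cache_conflict_threshold(t_cost, beta, l_r):
    """Assumed threshold form: the number of cache conflicts beyond which
    the latency saved by recoloring (conflicts * beta * l_r) exceeds the
    recoloring overhead t_cost. All inputs come from program access
    behavior, as the text notes."""
    return t_cost / (beta * l_r)

# e.g. migration overhead 600 ns, 50% expected conflict reduction,
# 100 ns NVM read latency:
assert cache_conflict_threshold(600.0, 0.5, 100.0) == 12.0
```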
In a possible implementation manner, after the processor of the computing device receives the cache interrupt signal, the memory region to which the target memory page corresponding to the cache interrupt signal currently belongs may be determined. For example, if the cache interrupt signal corresponds to a target memory page of color 1 in the second memory space, the processor may determine that the memory region corresponding to the cache interrupt signal is a memory region of color 1 in the second memory space. If the cache interrupt signal corresponds to the target memory page of color 2 in the second memory space, the processor may determine that the memory region corresponding to the cache interrupt signal is the target memory region of color 2 in the second memory space.
Alternatively, the write threshold may be determined according to the following formula:
Thr_w-nvm = T_cost / (α × β × (L_w-nvm − L_w-dram))
wherein Thr_w-nvm denotes the write threshold, L_w-nvm denotes the write memory latency of the second memory space, L_w-dram denotes the write memory latency of the first memory space, α denotes a factor that trades part of the write lifetime of the memory corresponding to the second memory space for performance, β denotes the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost denotes the second overhead value of the target memory page.
In addition, the above parameters L_w-nvm, L_w-dram, β, T_cost and the like may be obtained from program access behavior.
The memory writing latency of the first memory space may represent a delay generated when data is written into the first memory space. Similarly, the write memory latency of the second memory space may represent a latency incurred in writing data to the second memory space.
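As a hedged worked example for the write threshold, assume the form Thr_w-nvm = T_cost / (α × β × (L_w-nvm − L_w-dram)); like the conflict threshold above, this is a reconstruction from the variable definitions, with the per-write benefit taken as the NVM/DRAM write-latency gap:

```python
def write_threshold(t_cost, alpha, beta, l_w_nvm, l_w_dram):
    """Assumed threshold form: the number of write-backs beyond which
    moving a page from NVM to DRAM saves more write latency
    (writes * alpha * beta * (l_w_nvm - l_w_dram)) than the migration
    overhead t_cost costs."""
    return t_cost / (alpha * beta * (l_w_nvm - l_w_dram))

# e.g. migration overhead 900 ns, alpha = 1.0, beta = 0.5,
# 300 ns NVM write latency vs 100 ns DRAM write latency:
assert write_threshold(900.0, 1.0, 0.5, 300.0, 100.0) == 9.0
```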
In some possible embodiments, before the target memory page is modified from the memory region to which the target memory page currently belongs and stored in the target memory region, the method may further include: acquiring a first overhead value of the target memory page, wherein the first overhead value is an overhead value of the target memory page caused by cache conflict; acquiring a second overhead value of the target memory page, wherein the second overhead value is an overhead value caused by changing a memory area of the target memory page; determining that the first overhead value is greater than the second overhead value.
It should be noted that the processor may change the memory region to which the target memory page currently belongs by using a memory page recoloring algorithm. Changing the current memory region of the target memory page generates a performance overhead, namely the second overhead value; leaving the target memory page in its current memory region generates an overhead due to cache conflicts, namely the first overhead value. Therefore, changing the memory region only when the first overhead value is greater than the second overhead value balances the cost of recoloring against the benefit of reduced cache conflicts, and can improve the performance of the whole computing device.
It should be further noted that, when the first overhead value is smaller than the second overhead value, changing the memory region to which the target memory page belongs would increase system energy consumption and would not help improve the performance of the whole computing device.
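The overhead comparison just described reduces to a one-line predicate (an illustrative sketch; the function name is an assumption):

```python
def should_migrate(first_overhead, second_overhead):
    """Recolor the page only when the cost of staying put (cache conflict
    overhead, the first overhead value) exceeds the cost of moving
    (recoloring overhead, the second overhead value)."""
    return first_overhead > second_overhead

assert should_migrate(500, 300) is True   # conflicts dominate: move the page
assert should_migrate(200, 300) is False  # migration would cost more: stay
```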
302. If the target memory page is a dirty page, change the target memory page from the memory region to which it currently belongs to the target memory region and store it there.
It should be noted that, when determining the memory region to which the target memory page currently belongs, the processor may also determine a page state corresponding to the memory page.
The page status may be a dirty page or not a dirty page. A dirty page indicates that the target memory page has been modified; according to the principle of locality of program access, it may be assumed that the target memory page will be modified again in the near future. If the target memory page is not a dirty page, for example a clean page, the target memory page has not been modified, and it may be assumed that it will not be modified in the near future.
Steps 302 and 303 are parallel steps. If the target memory page is a dirty page, the processor may execute step 302; if the target memory page is not a dirty page, the processor may execute step 303.
The target memory area is a memory area with the smallest cache conflict value in the first memory space.
In some feasible embodiments, when the page status of the target memory page is a dirty page, the manner of placing the target memory page in the first memory space may be: and changing the memory area to which the target memory page currently belongs to the memory area with the smallest cache conflict value in the first memory space by a memory page recoloring algorithm.
For example, as shown in fig. 4, if the region to which the target memory page currently belongs is the memory region of color 1 in the second memory space, and the memory region with the smallest cache conflict value in the first memory space is the memory region of color 2 in the first memory space, the processor may change the target memory page from the memory region of color 1 in the second memory space to the memory region of color 2 in the first memory space.
303. If the target memory page is not a dirty page, change the target memory page from the memory region to which it currently belongs to the memory region with the smallest cache conflict value in the second memory space and store it there.
It should be noted that, when the page status of the target memory page is not a dirty page, for example when the target memory page is a clean page, the memory page may be placed in the second memory space as follows: the memory region to which the target memory page currently belongs is changed, by a memory page recoloring algorithm, to the memory region with the smallest cache conflict value in the second memory space.
In some possible embodiments, the processor may further determine, when a third memory page in the first memory space meets a second preset condition, the memory region to which the third memory page currently belongs, where the second preset condition is that the cache conflict value of the third memory page reaches the cache conflict threshold; if the third memory page belongs to a dirty page, the third memory page is changed from the memory region to which it currently belongs to the memory region with the smallest cache conflict value in the first memory space and stored there; or, if the third memory page does not belong to a dirty page, the third memory page is changed from the memory region to which it currently belongs to the memory region with the smallest cache conflict value in the second memory space and stored there.
It should be noted that the third memory page may be any memory page in the first memory space whose cache conflict value reaches the cache conflict threshold.
In a specific implementation, when the cache conflict value of the third memory page in the first memory space reaches the cache conflict threshold, the processor may determine to change the specific location of the memory area of the third memory page according to the page state of the third memory page.
If the third memory page is a dirty page, the third memory page may be changed from the memory region to which the third memory page currently belongs and stored in the memory region with the smallest cache conflict value in the first memory space.
For example, if the memory region recorded by the last-level cache as having the smallest cache conflict value in the first memory space is the memory region of color 2, and the memory region corresponding to the third memory page is the memory region of color 1 in the first memory space, the processor may adjust the memory region of the third memory page from the memory region of color 1 in the first memory space to the memory region of color 2 in the first memory space.
If the third memory page does not belong to a dirty page, for example, a clean page, the third memory page may be changed from the memory region to which the third memory page currently belongs to and stored in the memory region with the smallest cache conflict value in the second memory space.
For better illustration, the following describes the modification of the memory area of the memory page by taking table 1 as an example, where table 1 is a schematic diagram of a memory area adjustment provided in the embodiment of the present invention. In table 1, the first memory space may be a memory space of a DRAM, and the second memory space may be a memory space of an NVM, but is not limited thereto in other embodiments.
In table 1, when the current memory page position is the memory space of the DRAM, if the page state of the memory page is a dirty page, the color of the memory region after the memory region of the memory page is adjusted may be the color with the smallest cache conflict value, and the position of the memory page after the adjustment may still be the memory space of the DRAM.
When the current memory page position is the memory space of the DRAM, if the page state of the memory page is a clean page, the color of the memory region after the memory region adjustment of the memory page may be the color with the minimum cache conflict value, and the position of the memory page after the adjustment may be the memory space of the NVM.
When the current memory page position is the memory space of the NVM, if the page status of the memory page is a dirty page, the color of the memory region after the memory region adjustment on the memory page may be the color with the smallest cache conflict value, and the position of the memory page after the adjustment may be the memory space of the DRAM.
When the current memory page position is the memory space of the NVM, if the page status of the memory page is a clean page, the color of the memory region after the memory region of the memory page is adjusted may be the color with the minimum cache conflict value, and the adjusted memory page position may still be the memory space of the NVM.
| Current memory page location | Page state | Region color after adjustment | Memory page location after adjustment |
| --- | --- | --- | --- |
| Memory space of the DRAM | Dirty page | Color with the smallest cache conflict value | Memory space of the DRAM |
| Memory space of the DRAM | Clean page | Color with the smallest cache conflict value | Memory space of the NVM |
| Memory space of the NVM | Dirty page | Color with the smallest cache conflict value | Memory space of the DRAM |
| Memory space of the NVM | Clean page | Color with the smallest cache conflict value | Memory space of the NVM |

Table 1: Memory region adjustment schematic diagram
It can be seen that, in the embodiment of the present invention, when the cache conflict value of the target memory page in the second memory space reaches the cache conflict threshold, or the write count of the target memory page reaches the write threshold, and the target memory page is a dirty page, the processor may adjust the memory region to which the target memory page currently belongs to the memory region with the smallest cache conflict value in the first memory space. This fully considers the risk that overloading reduces the service life of the memory corresponding to the second memory space, and can extend that service life while making full use of each memory space.
Please refer to fig. 5, which is a flowchart illustrating another memory page management method according to an embodiment of the present invention. The memory page management method shown in fig. 5 may include:
501. A write request is received.
It should be noted that the write request may be a memory access request, and the memory access request may be a request issued when any process or system needs to apply for a memory to load corresponding data.
In particular, the write request may be for requesting that data be written in a memory page. The data may be data that needs to be loaded in the process, and the like.
502. Allocate a first memory page from the memory region with the smallest cache conflict value in the first memory space.
In a specific implementation, after receiving the write request, the processor may select, according to the cache conflict value of the memory region of the first memory space recorded in the last-level cache, a memory region with a smallest cache conflict value from the first memory space, for example, a memory region of color 1 in the first memory space shown in fig. 4.
It can be seen that, in the embodiment of the present invention, the processor first receives the write request and then allocates the first memory page from the memory region with the smallest cache conflict value in the first memory space. The memory space selected for the write request is thus the first memory space, which avoids the reduction in write lifetime of the memory corresponding to the second memory space that placement in the second memory space would cause, thereby balancing the read/write performance and energy consumption of each memory space and improving the performance of the whole computing device.
Please refer to fig. 6, which is a flowchart illustrating another memory page management method according to an embodiment of the present invention. The memory management method shown in fig. 6 may include:
601. A read request is received.
It should be noted that the read request may be a memory access request.
In particular, the read request may be for requesting reading of data in a memory page. Where the data may be the data that is first read into memory from an underlying storage medium (e.g., hard disk or magnetic disk).
The memory access request may also include a read request.
602. Allocate a second memory page from the memory region with the smallest cache conflict value in the second memory space.
The second memory page is used for storing the data specified by the read request.
It should be noted that the data may be data read from an underlying storage medium (e.g., a magnetic disk or a hard disk).
For example, the first memory space may be a memory space of a DRAM, and the second memory space may be a memory space of an NVM. If the type of the request is a write request, the memory area with the smallest cache conflict value of the memory space of the DRAM can be allocated because the write lifetime of the memory space of the NVM is limited. If the type of the application request is a read request, a memory region with the smallest cache conflict value of the memory space of the NVM may be allocated in order to achieve the purpose of reasonably allocating the memory space.
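The request-type routing plus min-conflict region selection described above can be sketched as follows (an illustrative Python model; the data layout — a dict mapping color to a (conflict value, free frame list) pair — and all names are assumptions):

```python
def allocate_page(req_type, first_regions, second_regions):
    """Pick the memory space by request type (write -> first/DRAM, since
    NVM write lifetime is limited; read -> second/NVM), then pick the
    region (color) with the smallest cache conflict value in that space,
    and pop a free frame from it. Returns (color, frame_id or None)."""
    regions = first_regions if req_type == "write" else second_regions
    color = min(regions, key=lambda c: regions[c][0])
    conflict_value, frames = regions[color]
    return color, (frames.pop(0) if frames else None)

first = {0: (5, [10, 11]), 1: (2, [12])}
second = {0: (7, [20]), 1: (1, [21, 22])}
assert allocate_page("write", first, second) == (1, 12)  # min conflict in first
assert allocate_page("read", first, second) == (1, 21)   # min conflict in second
```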
In a possible implementation manner, if there is no free area in the memory region with the smallest cache conflict value in the first memory space or the second memory space, memory space may be applied for from the memory management system; the applied memory space is added to the memory region with the smallest cache conflict value of the target memory space; and the applied memory space is then allocated from that memory region to execute the read request or the write request.
Note that the free area may be an unoccupied area. If no free area exists in the memory area with the minimum cache conflict value in the first memory space or the second memory space, it indicates that the memory area with the minimum cache conflict value is occupied.
It should be further noted that the memory management system may be a memory management module of an operating system such as Linux, windows, and the like, which is not limited in this embodiment of the present invention.
For example, as shown in fig. 7, a schematic view of a memory page management method according to an embodiment of the present invention is provided. In 701 of fig. 7, if a memory access request is received, the memory region with the smallest cache conflict value recorded by the last-level cache is obtained in 702. If the type of the memory access request is a read request at 703, the following steps are performed at 704: applying for memory from the second memory space, and allocating from the memory region with the smallest cache conflict value in the second memory space. If the type of the memory access request is a write request at 703, the steps at 705 are performed: applying for memory from the first memory space, and allocating from the memory region with the smallest cache conflict value in the first memory space.
At 706, after executing 701 to 705, the processor may determine whether the memory application succeeded; if so, the process ends. If the application did not succeed, the steps at 707 are performed: memory is applied for from the memory management system and placed into the memory space, and the type of the memory application may be judged again to determine into which memory space's memory region with the smallest cache conflict value the applied memory should be placed. Finally, the memory access request may be executed again using the allocated memory space.
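The retry-with-fallback flow can be sketched as follows (an illustrative model; `request_system_memory` stands in for applying to the memory management system, and all names are assumptions):

```python
def allocate_with_fallback(regions, request_system_memory):
    """Try the region with the smallest cache conflict value; if it has no
    free frame, apply for a frame from the memory management system
    (request_system_memory), add it to that region, and then allocate.
    regions: dict color -> (conflict value, free frame list)."""
    color = min(regions, key=lambda c: regions[c][0])
    conflict_value, frames = regions[color]
    if not frames:
        frames.append(request_system_memory())  # fallback to the MMS
    return color, frames.pop(0)

regions = {0: (1, []), 1: (4, [30])}
color, frame = allocate_with_fallback(regions, lambda: 99)
assert (color, frame) == (0, 99)  # region 0 was empty; the MMS supplied frame 99
```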
In some possible embodiments, the processor may further receive a memory release request and determine whether a free area exists in the memory region of the memory space corresponding to the memory release request. If so, the released memory may be placed in the free area; if not, the released memory may be placed in the memory management system.
For example, as shown in fig. 7, in 701, if a memory release request is received, in 708, a memory space corresponding to the memory release request is determined. At 709, the processor may determine whether there is a free region in the corresponding memory space. If there is a free area, then step 710 is performed: and placing the released memory into a free area of the corresponding memory space to execute the memory release request. If there are no free areas, step 711 is performed: and placing the released memory into a memory management system to execute the memory release request.
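The release path (steps 708 to 711) can be sketched as follows (an illustrative model; representing "no free area in the region" as `region_frames is None` is an assumption made for brevity):

```python
def release_page(frame_id, region_frames, system_pool):
    """Return a released frame to its region's free list when the region
    has a free area to take it back (steps 709-710); otherwise hand it to
    the memory management system (step 711)."""
    if region_frames is not None:
        region_frames.append(frame_id)
        return "region"
    system_pool.append(frame_id)
    return "system"

pool, frames = [], []
assert release_page(5, frames, pool) == "region" and frames == [5]
assert release_page(6, None, pool) == "system" and pool == [6]
```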
It can be seen that, in the embodiment of the present invention, the processor first receives the read request and then allocates the second memory page from the memory region with the smallest cache conflict value in the second memory space. The memory space selected for the read request is thus the second memory space, making full use of the read performance of the memory corresponding to the second memory space, thereby balancing the read/write performance and energy consumption of the memories corresponding to the respective memory spaces and improving the performance of the whole computing device.
In order to better implement the above scheme of the embodiment of the present invention, a corresponding apparatus embodiment is described below, and specifically as shown in fig. 8, a schematic structural diagram of a computing device according to the embodiment of the present invention is provided, where two heterogeneous memories correspondingly provide a first memory space and a second memory space, the first memory space has at least one memory area, and each memory area in the first memory space includes at least one memory page; the second memory space has at least one memory region, and each memory region in the second memory space includes at least one memory page, including:
a first determining module 801, configured to determine, when a target memory page in the second memory space meets a first preset condition, a memory region to which the target memory page currently belongs.
The first preset condition is that the cache conflict value of the target memory page reaches a cache conflict threshold, or the number of times of writing the target memory page reaches a write threshold.
A changing module 802, configured to change the target memory page from the memory region to which the target memory page currently belongs to a target memory region and store the target memory page in the target memory region if the target memory page belongs to a dirty page.
The target memory area is a memory area with the smallest cache conflict value in the first memory space.
In one embodiment, the computing device further comprises: an obtaining module 803, configured to obtain a first overhead value of the target memory page, and obtain a second overhead value of the target memory page.
The first overhead value is an overhead value of the target memory page due to cache conflict.
The second overhead value is an overhead value caused by changing a memory area of the target memory page.
A second determining module 804 configured to determine that the first overhead value is greater than the second overhead value.
In an embodiment, the changing module 802 is further configured to, if the target memory page does not belong to a dirty page, change the target memory page from the currently-belonging memory region to a memory region in the second memory space, where the cache conflict value is the smallest.
In one embodiment, the cache conflict threshold is determined according to the following equation:
Thr_miss = T_cost / (β × L_r)
wherein Thr_miss denotes the cache conflict threshold, L_r denotes the read memory latency of the second memory space, β denotes the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost denotes the second overhead value of the target memory page.
In one embodiment, the write threshold is determined according to the following equation:
Thr_w-nvm = T_cost / (α × β × (L_w-nvm − L_w-dram))
wherein Thr_w-nvm denotes the write threshold, L_w-nvm denotes the write memory latency of the second memory space, L_w-dram denotes the write memory latency of the first memory space, α denotes a factor that trades part of the write lifetime of the memory corresponding to the second memory space for performance, β denotes the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost denotes the second overhead value of the target memory page.
In one embodiment, the computing device may further include: a receiving module 806, configured to receive a write request; and an allocating module 807, configured to allocate a first memory page from the memory region with the smallest cache conflict value in the first memory space.
The first memory page is used to store the data to be written specified by the write request.
In one embodiment, the receiving module 806 is further configured to receive a read request, and the allocating module 807 is further configured to allocate a second memory page from the memory region with the smallest cache conflict value in the second memory space, where the second memory page is used to store the data specified by the read request.
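The placement rule used by modules 806/807 — always allocate from the region with the smallest cache conflict value — can be sketched as follows; the region names and conflict counts are hypothetical:

```python
def pick_region(regions: dict) -> str:
    """Pick the memory region with the smallest cache conflict value,
    mirroring how a new page is placed on a read or write request.
    `regions` maps region name -> current cache conflict value."""
    return min(regions, key=regions.get)

dram_regions = {"dram-0": 7, "dram-1": 2, "dram-2": 5}
assert pick_region(dram_regions) == "dram-1"  # smallest conflict value wins
```

Allocating new pages into the least-conflicted region spreads cache pressure instead of piling new pages onto already-hot cache sets.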
In one embodiment, the computing device may further include: a third determining module 805, configured to determine, when a third memory page in the first memory space meets a second preset condition, the memory region to which the third memory page currently belongs.
The second preset condition is that the cache conflict value of the third memory page reaches the cache conflict threshold.
The changing module 802 is further configured to, if the third memory page belongs to a dirty page, migrate the third memory page from the memory region to which it currently belongs to the memory region with the smallest cache conflict value in the first memory space; or,
if the third memory page does not belong to a dirty page, migrate the third memory page from the memory region to which it currently belongs to the memory region with the smallest cache conflict value in the second memory space.
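The dirty/clean branch above can be sketched as a routing function. This is an illustrative model only — the region pools and page representation are assumptions, not structures defined by the patent:

```python
def destination(is_dirty: bool, dram_regions: dict, nvm_regions: dict) -> str:
    """Route a page that met the second preset condition: a dirty page stays
    in the first memory space (e.g. DRAM), a clean page may move to the
    second memory space (e.g. NVM); either way, the target is the region
    with the smallest cache conflict value in the chosen space."""
    pool = dram_regions if is_dirty else nvm_regions
    return min(pool, key=pool.get)

dram = {"dram-0": 4, "dram-1": 1}
nvm = {"nvm-0": 3, "nvm-1": 6}
assert destination(True, dram, nvm) == "dram-1"   # dirty page -> first memory
assert destination(False, dram, nvm) == "nvm-0"   # clean page -> second memory
```

Keeping dirty pages in the first memory space avoids writing them back to the slower, wear-limited second memory, while clean pages can be relocated there without a write-back penalty.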
Fig. 9 is a schematic structural diagram of another computing device according to an embodiment of the present invention. The computing device described in this embodiment includes: a memory 120, other input devices 130, a display screen 140, an input/output subsystem 170, a processor 180, and a power supply 190. Those skilled in the art will appreciate that the computing device structure illustrated in FIG. 9 does not constitute a limitation on computing devices: a computing device may include more or fewer components than illustrated, combine certain components, split certain components, or arrange the components differently. Those skilled in the art will also appreciate that the display screen 140 belongs to a user interface (UI), and that the computing device may include fewer user interfaces than illustrated.
The following describes each constituent element of the terminal in detail with reference to fig. 9:
the memory 120 may be used to store software programs and modules. There may be at least two memories 120, and they are heterogeneous, meaning that their storage media are not the same. The processor 180 executes the various functional applications and data processing of the computing device by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area: the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created according to the use of the terminal, and the like.
Other input devices 130 may be used to receive entered numeric or character information and generate key signal inputs relating to user settings and function control. In particular, other input devices 130 may include, but are not limited to, one or more of a physical keyboard, a mouse, a joystick, a light mouse (a light mouse is a touch-sensitive surface that does not display visual output, or is an extension of a touch-sensitive surface formed by a touch screen), and the like. The other input devices 130 are connected to other input device controllers 171 of the input/output subsystem 170 and are in signal communication with the processor 180 under the control of the other input device controllers 171.
The display screen 140 may include a display panel 141 and a touch panel 142. The display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. The touch panel 142 is also referred to as a touch screen, touch-sensitive screen, etc. Optionally, the touch panel 142 may include two parts: a touch detection device and a touch controller. The touch controller receives touch information from the touch detection device, converts it into information that the processor can handle, sends that information to the processor 180, and receives and executes commands sent by the processor 180. In addition, the touch panel 142 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave, or by any technology developed in the future. Although in fig. 9 the touch panel 142 and the display panel 141 are two separate components implementing the input and output functions of the terminal, in some embodiments the touch panel 142 and the display panel 141 may be integrated to implement the input and output functions.
The input/output subsystem 170 controls external input and output devices and may include an other-input-device controller 171, a sensor controller 172, and a display controller 173. Optionally, one or more other-input-device controllers 171 receive signals from and/or send signals to the other input devices 130, which may include physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks, click wheels, and a light mouse (a light mouse is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by a touch screen). It is noted that the other-input-device controllers 171 may be connected to any one or more of the above devices. The display controller 173 in the input/output subsystem 170 receives signals from and/or sends signals to the display screen 140.
The processor 180 is the control center of the terminal. It connects the various parts of the entire terminal using various interfaces and lines, and performs the various functions of the computing device and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling data stored in the memory 120, thereby performing overall monitoring of the computing device. Optionally, the processor 180 may include one or more processing units. Preferably, the processor 180 may integrate an application processor, which mainly handles the operating system, application programs, and the like, and a modem processor, which mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 180.
The terminal also includes a power supply 190 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 180 via a power management system to manage charging, discharging, and power consumption via the power management system.
Specifically, the processor 180 may call the program instructions stored in the memory 120 to implement the method according to the embodiment of the present invention.
Specifically, the two heterogeneous memories respectively provide a first memory space and a second memory space. The first memory space has at least one memory region, and each memory region in the first memory space includes at least one memory page; the second memory space has at least one memory region, and each memory region in the second memory space includes at least one memory page. The processor 180 invokes the program instructions stored in the memory 120 to execute the following steps:
determining a memory region to which a target memory page currently belongs when the target memory page in the second memory space meets a first preset condition, wherein the first preset condition is that a cache conflict value of the target memory page reaches a cache conflict threshold, or a write frequency of the target memory page reaches a write threshold;
if the target memory page belongs to a dirty page, migrating the target memory page from the memory region to which it currently belongs to a target memory region, where the target memory region is the memory region with the smallest cache conflict value in the first memory space.
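The two processor steps above can be sketched end to end as a single routine. The page representation (a dict with illustrative keys) and the region pool are assumptions made for the sketch, not data structures specified by the patent:

```python
def manage_page(page: dict, conflict_threshold: float, write_threshold: float,
                dram_regions: dict) -> dict:
    """Sketch of the processor's steps: (1) check the first preset condition
    on a page in the second memory space (conflict value reaches its
    threshold, OR write count reaches its threshold); (2) if the page is
    dirty, migrate it to the first-memory region with the smallest cache
    conflict value."""
    triggered = (page["conflict_value"] >= conflict_threshold
                 or page["write_count"] >= write_threshold)
    if triggered and page["is_dirty"]:
        page["region"] = min(dram_regions, key=dram_regions.get)
    return page

page = {"conflict_value": 30, "write_count": 2, "is_dirty": True,
        "region": "nvm-0"}
page = manage_page(page, 20, 10, {"dram-0": 5, "dram-1": 1})
assert page["region"] == "dram-1"  # dirty page migrated to least-conflict region
```

A page that triggers neither condition, or that is clean, is left untouched by this routine; the clean-page path is handled by the separate branch described earlier.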
The method executed by the processor in the embodiment of the present invention is described from the perspective of the processor, and it is understood that the processor in the embodiment of the present invention needs to cooperate with other hardware structures to execute the method. For example, the interaction between the terminal and other devices or servers, the determination of the memory region to which the target memory page currently belongs, and the change of that memory region may be implemented by the processor 180 executing the program stored in the memory 120. The specific implementation process is not described or limited in detail in the embodiments of the present invention.
Optionally, the terminal may implement, by the processor 180 and other devices, the corresponding steps executed by a computing device in the memory page management method in the foregoing method embodiment. It should be understood that the embodiments of the present invention are entity device embodiments corresponding to the method embodiments, and the description of the method embodiments is also applicable to the embodiments of the present invention.
In another embodiment of the present invention, there is provided a computer-readable storage medium storing a program which, when executed by a processor, implements: determining the memory region to which a target memory page in the second memory space currently belongs when the target memory page meets a first preset condition, where the first preset condition is that a cache conflict value of the target memory page reaches a cache conflict threshold, or a write count of the target memory page reaches a write threshold; and, if the target memory page belongs to a dirty page, migrating the target memory page from the memory region to which it currently belongs to a target memory region, where the target memory region is the memory region with the smallest cache conflict value in the first memory space.
It should be noted that, for specific processes executed by the processor of the storage medium readable by the computing device, reference may be made to the methods described in the above method embodiments, and details are not described herein again.
There is also provided in yet another embodiment of the present invention a computing device program product containing instructions which, when run on a computing device, cause the computing device to perform the method described in the method embodiment above.
The storage medium readable by the computing device may be an internal storage unit of the terminal according to any of the foregoing embodiments, for example, a hard disk or a memory of the terminal. The storage medium readable by the computing device may also be an external storage device of the computing device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computing device. Further, the computing device readable storage medium may also include both an internal storage unit and an external storage device of the terminal. The computing device readable storage medium is used for storing the program and other programs and data required by the terminal. The computing device readable storage medium may also be used to temporarily store data that has been output or is to be output.
Based on the same inventive concept, the principle of the computing device provided in the embodiment of the present invention for solving the problem is similar to that of the embodiment of the method of the present invention, so the implementation of the computing device may refer to the implementation of the method, and is not described herein again for brevity.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a program, which can be stored in a storage medium readable by a computing device, and when the program is executed, the processes of the embodiments of the methods described above can be included. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims (22)

1. A method for managing memory pages, comprising:
determining a memory region to which a target memory page in a second memory belongs when the target memory page meets a first preset condition, wherein the first preset condition is that a cache conflict value of the target memory page reaches a cache conflict threshold, or the number of times of writing the target memory page reaches a write threshold;
and if the target memory page belongs to a dirty page, changing and storing the target memory page from a memory area to which the target memory page currently belongs to a target memory area, wherein the target memory area is a memory area in the first memory.
2. The method of claim 1,
the target memory area is a memory area with a small cache conflict value in the first memory.
3. The method of claim 1,
the first memory and the second memory are different in storage medium.
4. The method according to any of claims 1 to 3, wherein before the target memory page is modified from the memory region to which the target memory page currently belongs to be stored in the target memory region, the method further comprises:
acquiring a first overhead value of the target memory page, where the first overhead value is an overhead value of the target memory page due to cache conflict;
acquiring a second overhead value of the target memory page, where the second overhead value is an overhead value caused by changing a memory area of the target memory page;
determining that the first overhead value is greater than the second overhead value.
5. The method according to any of claims 1-4, characterized in that the method further comprises:
and if the target memory page does not belong to a dirty page, changing and storing the target memory page from the current memory area to a memory area with a small cache conflict value in the second memory space.
6. The method of any of claims 1-5, wherein the cache conflict threshold is determined according to the following equation:
Thr_miss = T_cost / (β · L_r)

wherein Thr_miss represents the cache conflict threshold, L_r represents the read memory latency of the second memory space, β represents the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost represents the second overhead value of the target memory page.
7. The method of any of claims 1-5, wherein the write threshold is determined according to the following equation:
Thr_w-nvm = T_cost / (α · β · (L_w-nvm − L_w-dram))

wherein Thr_w-nvm represents the write threshold, L_w-nvm represents the write memory latency of the second memory space, L_w-dram represents the write memory latency of the first memory space, α represents a factor by which performance may be improved at the expense of the lifetime of the memory corresponding to part of the second memory space, β represents the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost represents the second overhead value of the target memory page.
8. The method according to any one of claims 1 to 7, further comprising:
receiving a write request;
and allocating a first memory page from a memory area with a small cache conflict value in the first memory space, where the first memory page is used to store the data to be written specified by the write request.
9. The method according to any one of claims 1-8, further comprising:
receiving a read request;
and allocating a second memory page from the memory area with the smallest cache conflict value in the second memory space, where the second memory page is used to store the data specified by the read request.
10. The method according to any one of claims 1-9, further comprising:
determining a memory region to which a third memory page currently belongs when the third memory page in the first memory space meets a second preset condition, wherein the second preset condition is that a cache conflict value of the third memory page reaches a cache conflict threshold;
if the third memory page belongs to a dirty page, changing and storing the third memory page from a memory region to which the third memory page currently belongs to a memory region with a smallest cache conflict value in the first memory space; or,
if the third memory page does not belong to a dirty page, the third memory page is changed from the memory region to which the third memory page currently belongs to and stored in the memory region with the smallest cache conflict value in the second memory space.
11. A computing device, comprising:
a first determining module, configured to determine, when a target memory page in a second memory meets a first preset condition, a memory region to which the target memory page belongs, where the first preset condition is that a cache conflict value of the target memory page reaches a cache conflict threshold, or a write frequency of the target memory page reaches a write threshold;
a changing module, configured to change the target memory page from a memory area to which the target memory page currently belongs to a target memory area if the target memory page belongs to a dirty page, where the target memory area is a memory area in the first memory.
12. The computing device of claim 11, wherein the target memory region is a memory region of the first storage having a small cache conflict value.
13. The computing device of claim 11 or 12, wherein the storage medium of the first memory and the second memory are different.
14. The computing device of any of claims 11-13, wherein the computing device further comprises:
an obtaining module, configured to obtain a first overhead value of the target memory page, where the first overhead value is an overhead value of the target memory page due to cache conflict; acquiring a second overhead value of the target memory page, wherein the second overhead value is an overhead value caused by changing a memory area of the target memory page;
a second determination module to determine that the first overhead value is greater than the second overhead value.
15. The computing device according to any one of claims 11 to 14, wherein the changing module is further configured to, if the target memory page does not belong to a dirty page, change the target memory page from the currently-belonging memory region to a memory region in the second memory space, where the cache conflict value is small.
16. The computing device of any of claims 11-15, wherein the cache conflict threshold is determined according to the following equation:
Thr_miss = T_cost / (β · L_r)

wherein Thr_miss represents the cache conflict threshold, L_r represents the read memory latency of the second memory space, β represents the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost represents the second overhead value of the target memory page.
17. The computing device of any of claims 11-15, wherein the write threshold is determined according to the following equation:
Thr_w-nvm = T_cost / (α · β · (L_w-nvm − L_w-dram))

wherein Thr_w-nvm represents the write threshold, L_w-nvm represents the write memory latency of the second memory space, L_w-dram represents the write memory latency of the first memory space, α represents a factor by which performance may be improved at the expense of the lifetime of the memory corresponding to part of the second memory space, β represents the ratio by which the cache conflict value of the target memory page decreases after the memory region of the target memory page is changed, and T_cost represents the second overhead value of the target memory page.
18. The computing device of any of claims 11-17, wherein the computing device further comprises:
a receiving module, configured to receive a write request;
an allocating module, configured to allocate a first memory page from a memory area with a smallest cache conflict value in the first memory space, where the first memory page is used to store the data to be written specified by the write request.
19. The computing device of any of claims 11-17, wherein the computing device further comprises:
a receiving module, configured to receive a read request;
and an allocating module, configured to allocate a second memory page from the memory area with the smallest cache conflict value in the second memory space, where the second memory page is used to store the data specified by the read request.
20. The computing device of any of claims 11-19, wherein the computing device further comprises:
a third determining module, configured to determine, when a third memory page in the first memory space meets a second preset condition, a memory region to which the third memory page currently belongs, where the second preset condition is that a cache conflict value of the third memory page reaches a cache conflict threshold;
the modifying module is further configured to modify and store the third memory page from a memory area to which the third memory page currently belongs to a memory area with a small cache conflict value in the first memory space if the third memory page belongs to a dirty page; or,
the changing module is further configured to, if the third memory page does not belong to a dirty page, change and store the third memory page from a memory area to which the third memory page currently belongs to a memory area with a small cache conflict value in the second memory space.
21. A computing device, wherein the computing device comprises:
a memory for storing a program;
a processor for executing a program in the memory to cause the computing device to perform the method of any of claims 1-10.
22. A computing device readable storage medium, characterized in that the computing device readable storage medium stores a program that, when executed by a processor, causes the computing device to perform the method of any of claims 1 to 10.
CN202110298621.2A 2017-08-24 2017-08-24 Memory page management method and computing device Pending CN112988388A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110298621.2A CN112988388A (en) 2017-08-24 2017-08-24 Memory page management method and computing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710738591.6A CN107562645B (en) 2017-08-24 2017-08-24 Memory page management method and computing device
CN202110298621.2A CN112988388A (en) 2017-08-24 2017-08-24 Memory page management method and computing device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201710738591.6A Division CN107562645B (en) 2017-08-24 2017-08-24 Memory page management method and computing device

Publications (1)

Publication Number Publication Date
CN112988388A true CN112988388A (en) 2021-06-18

Family

ID=60976943

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202110298621.2A Pending CN112988388A (en) 2017-08-24 2017-08-24 Memory page management method and computing device
CN202110298594.9A Pending CN112988387A (en) 2017-08-24 2017-08-24 Memory page management method and computing device
CN201710738591.6A Active CN107562645B (en) 2017-08-24 2017-08-24 Memory page management method and computing device

Family Applications After (2)

Application Number Title Priority Date Filing Date
CN202110298594.9A Pending CN112988387A (en) 2017-08-24 2017-08-24 Memory page management method and computing device
CN201710738591.6A Active CN107562645B (en) 2017-08-24 2017-08-24 Memory page management method and computing device

Country Status (1)

Country Link
CN (3) CN112988388A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664419A (en) * 2018-04-03 2018-10-16 郑州云海信息技术有限公司 A kind of method and its device of determining memory big page number
CN110457235B (en) * 2019-08-20 2021-10-08 Oppo广东移动通信有限公司 Memory compression method, device, terminal and storage medium
CN111078410B (en) * 2019-12-11 2022-11-04 Oppo(重庆)智能科技有限公司 Memory allocation method and device, storage medium and electronic equipment
CN113326214B (en) * 2021-06-16 2023-06-16 统信软件技术有限公司 Page cache management method, computing device and readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100577384B1 (en) * 2004-07-28 2006-05-10 삼성전자주식회사 Method for page replacement using information on page
CN102662853A (en) * 2012-03-22 2012-09-12 北京北大众志微系统科技有限责任公司 Memory management method and device capable of realizing memory level parallelism
CN103914403B (en) * 2014-04-28 2016-11-02 中国科学院微电子研究所 A kind of recording method mixing internal storage access situation and system thereof

Also Published As

Publication number Publication date
CN107562645A (en) 2018-01-09
CN112988387A (en) 2021-06-18
CN107562645B (en) 2021-03-23

Similar Documents

Publication Publication Date Title
US11531617B2 (en) Allocating and accessing memory pages with near and far memory blocks from heterogenous memories
US20210374056A1 (en) Systems and methods for scalable and coherent memory devices
EP3132355B1 (en) Fine-grained bandwidth provisioning in a memory controller
KR101999132B1 (en) Method and apparatus for managing memory in virtual machine environment
CN107562645B (en) Memory page management method and computing device
EP3382557B1 (en) Method and apparatus for persistently caching storage data in a page cache
CN105637470B (en) Method and computing device for dirty data management
US20190004841A1 (en) Memory Sharing For Virtual Machines
US10642493B2 (en) Mobile device and data management method of the same
US20140047176A1 (en) Dram energy use optimization using application information
US11836087B2 (en) Per-process re-configurable caches
CN110597742A (en) Improved storage model for computer system with persistent system memory
WO2016058560A1 (en) External acceleration method based on serving end and external buffer system for computing device, and device implementing said method
CN115794682A (en) Cache replacement method and device, electronic equipment and storage medium
CN112654965A (en) External paging and swapping of dynamic modules
KR102457179B1 (en) Cache memory and operation method thereof
EP4022446B1 (en) Memory sharing
US20210200584A1 (en) Multi-processor system, multi-core processing device, and method of operating the same
JP2021536643A (en) Hybrid memory system interface

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination