CN107015628B - Low-overhead DRAM refreshing method and system for approximate application

Info

Publication number: CN107015628B
Application number: CN201710203437.9A
Authority: CN (China)
Prior art keywords: memory, page, mapping, DRAM, matching
Legal status: Active
Inventors: 王颖, 李华伟, 刘波, 刘国培, 刘超伟, 孙强, 李晓维
Assignee: Institute of Computing Technology of CAS
Priority/filing date: 2017-03-30
Publication of CN107015628A (application): 2017-08-04
Publication of CN107015628B (grant): 2020-08-28

Classifications

    • G06F1/3221 Monitoring of peripheral devices of disk drive devices (power management)
    • G06F1/3275 Power saving in memory, e.g. RAM, cache (power management)


Abstract

The invention provides a low-overhead DRAM refresh method and system for approximate applications, in the technical field of memory design. The method comprises a static matching mapping step, in which the application's global memory-access information is acquired offline, the maximum reuse distance of each memory row in the global access information is analyzed, and the content of each memory row is migrated to a memory row whose retention time exceeds that maximum reuse distance; and a dynamic threshold adjustment step, in which the maximum reuse distance of each mapping period is periodically predicted from the historical mapping results and matched to corresponding memory rows in the DRAM retention-time distribution. After the program data stored in memory are mapped and migrated, the error rate of the static matching mapping method is close to zero and the error rate of the dynamic matching mapping method can be kept within 0.7%, while both methods save more than 99% of the original refresh energy.

Description

Low-overhead DRAM refreshing method and system for approximate application
Technical Field
The invention relates to the technical field of memory design, in particular to a low-overhead DRAM refreshing method and system for approximate application.
Background
A significant portion of the power consumption of current processor systems is generated by the DRAM main memory, and this trend is intensifying. Recent studies show that in modern server systems the main memory system accounts for up to 30-40% of total power consumption. Main memory power can be divided into memory-controller power, background power, and dynamic power. Background power is independent of memory-access activity and comes mainly from the main memory peripheral circuitry, transistor leakage, and refresh power, where refresh power is caused by capacitor leakage in the DRAM (dynamic random access memory) storage cells: the DRAM controller must compensate for the leaked capacitor charge with periodic refresh operations to guarantee the correctness of stored data. The study by Bhati et al. shows that more than 20% of main memory power is consumed by DRAM refresh operations, so reducing the refresh power of the DRAM main memory is very important for system energy-efficiency optimization.
Although JEDEC specifies a 64ms refresh interval standard, practical studies have shown that 99% of DRAM cell retention times can reach nearly 10s, as shown in fig. 1, and thus, the refresh mechanism of conventional memory systems has a large design space.
Approximate computing is increasingly regarded as a way to reduce energy consumption, especially with the rise of mobile and embedded devices. Many computing tasks, such as media processing (video, audio, image), recognition, and data mining, do not require completely correct results and tolerate a certain degree of error. The "long-tail effect" of the traditional memory system, in which 99% of the effort is spent eliminating the last 1% of the error rate, is therefore a great waste of energy.
In a DRAM, read and write operations on a memory cell can take the place of refresh operations: if the interval between two consecutive accesses to the same address unit is shorter than the cell's retention time, the cell does not need to be refreshed, which saves refresh power. In typical approximate-computing applications such as multimedia, games, and audio/video, only a small portion of the data, the data related to program control flow, is critical to the correctness of program execution. This portion is called critical data, and retention-time faults in it severely affect the correctness of the output. The other, large data sets are insensitive to faults, including retention-time faults, and are generally called non-critical data. Therefore, to reduce memory refresh operations as much as possible, the memory-access pattern of the non-critical data in an approximate computation can be used to remap its storage in the DRAM. By combining this with the distribution of retention-time deviations in the DRAM, reasonable data remapping can reduce refresh operations and refresh power without affecting the user experience of the application.
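As a rough numerical illustration of this condition, the following Python sketch checks whether a row's accesses alone keep its data alive (the function name and the example numbers are invented for clarity and are not taken from the patent):

def refresh_needed(access_times_ms, retention_time_ms):
    # Return True if a row with the given retention time would lose data
    # between consecutive accesses, i.e. some reuse distance reaches the
    # retention time; otherwise the row's own reads/writes keep it alive.
    reuse_distances = [b - a for a, b in zip(access_times_ms, access_times_ms[1:])]
    return any(d >= retention_time_ms for d in reuse_distances)

# A row accessed every 500 ms whose cells retain data for 2 s needs no refresh:
print(refresh_needed([0, 500, 1000, 1500], 2000))  # False
# The same access pattern on a weak row retaining data for only 300 ms does:
print(refresh_needed([0, 500, 1000, 1500], 300))   # True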
The prior art is described below:
Smart Refresh: JEDEC specifies that DRAM must be refreshed every 64 ms, and current memory chips follow this standard. A counter is set in the DRAM memory controller; when the counter decrements to zero, indicating that 64 ms have elapsed, the memory controller resets the counter and sends a refresh command, issued as the Row Refresh command shown in fig. 2.
To refresh data correctly, a DRAM cell is essentially read and written back to restore the charge level of its storage capacitor. Memory read and write operations themselves therefore refresh the corresponding memory row, so a refresh operation following a memory access can be avoided. As shown by Row Access in fig. 2, the best case for Smart Refresh is that each row Rk receives a memory access request Ak before its refresh command is issued, so that all refresh operations following accesses can be cancelled, saving power.
The basic idea of Smart Refresh is to associate a 2-bit or 3-bit counter with each row in a bank. The counter value is stored and updated in the memory controller and decrements from its maximum value to zero within one refresh interval. When a memory row is read or written, its counter is reset to the maximum value and begins decrementing again. The memory controller refreshes only the rows whose counters have reached zero, since a zero counter means the row must be refreshed. Memory accesses thus postpone refresh operations by resetting the counters; in the best case, shown in fig. 2, no refresh command is generated at all.
Fig. 3 illustrates the operating mechanism of the Smart Refresh counters. In the example of fig. 3(a), a 2-bit counter is used and the refresh interval is 64 ms. Assuming no program issues memory access requests to the DRAM during the whole interval, the memory controller automatically decrements the counters every 16 ms; when a counter reaches zero, the corresponding memory row must be refreshed. Here the counters of all memory rows reach zero at the same time, so the memory controller must refresh all rows at once, and since refresh commands cannot be executed in parallel, memory system performance is severely affected. In fig. 3(b) the counter values are initialized from 0 to 3, which avoids having a large number of memory rows due for refresh simultaneously. This scheme still has problems: first, randomly initializing the counters means that 1/4 of the memory rows must be refreshed at each step, and even at initialization 1/4 of the counters are already 0; second, during operation, memory read/write requests reset the counters to the maximum value, so the situation shown in fig. 3(a) can still arise.
To solve these problems, Smart Refresh staggers both the initial counter values and the decrement operations. For example, in fig. 3(b) the memory rows are divided into N groups (N depends on the size of the memory refresh queue; N = 4 in the figure, and each group contains the counters of 16 memory rows). The original scheme updates the counters at 0 ms, 16 ms, 32 ms, and 48 ms; the improved scheme further spreads out each of these updates. The counter update originally performed at 0 ms now corresponds to:
the counters of the 1st memory row in all 4 groups are updated at 0 ms,
the counters of the 2nd memory row in all 4 groups are updated at 1 ms,
……
the counters of the 16th memory row in all 4 groups are updated at 15 ms.
The essence of this scheme is to reduce the time granularity of the counter updates so that the update operations are spread out; at the same time, the number of rows to be refreshed at each point is correspondingly reduced. As shown in fig. 3, N memory rows need to be refreshed every 1 ms, and N is chosen according to the size of the refresh command request queue of the DRAM memory controller, which avoids the performance degradation caused by a burst of refresh operations blocking normal requests.
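To make the counter mechanism concrete, the following Python sketch models the per-row countdown logic described above (the class name SmartRefreshBank and its methods are invented for illustration; the real Smart Refresh is implemented in memory controller hardware):

class SmartRefreshBank:
    # Minimal model of Smart Refresh: one small countdown counter per row.
    # A row is refreshed only when its counter reaches zero; any read/write
    # resets the counter, postponing (and often eliminating) the refresh.

    def __init__(self, num_rows, counter_max=3):
        self.counter_max = counter_max
        # Staggered initial values so that rows do not all expire at once.
        self.counters = [row % (counter_max + 1) for row in range(num_rows)]

    def on_access(self, row):
        # A read or write refreshes the row as a side effect.
        self.counters[row] = self.counter_max

    def on_tick(self):
        # Called once per sub-interval (e.g. every 16 ms for a 2-bit counter
        # and a 64 ms refresh window). Returns the rows that must be refreshed.
        due = []
        for row, value in enumerate(self.counters):
            if value == 0:
                due.append(row)
                self.counters[row] = self.counter_max  # refresh restores the charge
            else:
                self.counters[row] = value - 1
        return due

With counter_max = 3 (a 2-bit counter) and a 64 ms refresh window, on_tick corresponds to one 16 ms step; a row that is read or written at least once per window never appears in the returned list, so its refresh operations are eliminated.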
Flikker: the Flikker technology separates non-critical data with tolerance capability to errors in application data and carries out low-frequency refreshing based on tolerance of the application to the errors, so that power consumption is reduced, and the Flikker completes the low-frequency refreshing of the non-critical data through combination of software and hardware.
In terms of hardware, Flikker divides each DRAM Bank into a normal refresh region for ensuring that critical data is correct and a low frequency refresh region for non-critical data for saving power consumption, as shown in fig. 4.
Firstly, a programmer needs to label key data when writing an application; secondly, in the program running process, the system needs to store the key data and the non-key data into a normal refreshing area and a low-frequency refreshing area in the memory respectively; then the operating system configures an autonomous refresh counter and switches the DRAM into an autonomous refresh mode (in the mobile operating system, the operating system gives refresh control to the DRAM when the processor enters a sleep mode for saving power consumption); and finally, the self-refresh controller respectively refreshes different regions of the DRAM at different frequencies according to the configuration parameters of the operating system. The software aspect requires modifications to the application source code, runtime system, and operating system to work in conjunction, as shown in particular in fig. 5.
There are few domestic or foreign research results on refresh-control optimization of dynamic memory (DRAM) for approximate-computing applications, and the existing results mainly have the following problems: first, how to guarantee the correct operation of computer applications and their quality of service while reducing the refresh frequency and the number of refresh operations; second, how to design different refresh frequencies for different DRAM address blocks so that the refresh cost is minimized; third, the number of refreshes cannot be minimized through a one-to-one matching of data addresses to the memory address space combined with memory address allocation. These three points lead directly to three deficiencies of the existing work: first, the refresh frequency is too high to fully exploit the fault tolerance of approximate applications; second, the refresh schemes are unreliable, causing application errors or degrading output quality; third, high area and power overhead.
The technical problems of the prior art are as follows:
SmartRefresh: 1) Smart Refresh introduces a timer for each eDRAM (embedded DRAM) row and avoids redundant refreshes by recording the last refresh/access timestamp of each row; however, the method mainly targets on-chip eDRAM, and the overhead of the introduced counters is too large for a high-capacity DRAM main memory. 2) SmartRefresh saves refresh power by 52.6% on average, whereas the present method can save more than 99% of the refresh power.
Flikker: flikker needs a programmer to mark non-critical data and combines with a compiler operating system and the like to work cooperatively, so that the technology is too complex and the application difficulty is higher. Flikker reduces the refresh rate to non-key data, sets up ground refresh rate region in DRAM, it is limited to save the refresh consumption, and this technique can save most refresh consumption almost.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a low-overhead DRAM refreshing method and system for approximate application.
The invention provides an approximate application-oriented low-overhead DRAM refreshing method, which comprises the following steps:
a static matching mapping step, namely acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance;
and a dynamic threshold adjustment step, namely periodically predicting, from the historical mapping results, the maximum reuse distance of each mapping period and matching it to corresponding memory rows in the DRAM retention-time distribution.
The step of static matching mapping comprises
(1) The global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i. A set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i; this difference is the reuse distance.
(2) The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
(3) For each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV); then, among all memory rows (R_k, rt_k) in the retention-time distribution whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k. Repeat step (3) until every page in the set V has found its corresponding memory-row mapping.
The dynamic threshold adjusting step comprises:
1. At the end of each mapping period, predict the threshold of the page: obtain the maximum reuse distance of the page in the current mapping period and evaluate the error rate of the threshold predicted in the previous mapping period;
2. Using this threshold, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time;
3. Match the page with that memory row of smallest retention time.
The method further comprises performing error control on the dynamic threshold adjustment step.
The error control specifically comprises: first, an error control level E is set, and the error rates of all pages are counted in each control period. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, its relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Whether page i is migrated is determined according to the control weight value cw_ik.
The invention also provides an approximate application-oriented low-overhead DRAM refreshing system, which comprises the following components:
the static matching mapping module is used for acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance;
and the dynamic threshold adjustment module is used for periodically predicting, from the historical mapping results, the maximum reuse distance of each mapping period and matching it to corresponding memory rows in the DRAM retention-time distribution.
The static matching mapping module comprises
(1) The global access information comprises: visit toMemory address, memory type, time stamp, and form a set D, each element in the set D being a binary group (P)i,Tij),
Figure BDA0001259235330000061
Wherein P isiRepresenting the address of page i, T, of the memoryijA timestamp representing the jth access to page i, and a set V is obtained according to a set D, each element in the set V being a binary group (P)i,Vij),
Figure BDA0001259235330000062
Wherein P isiThe address, V, of a page i representing the memoryijAnd the difference of the time stamps of j and j-1 accesses to the page i is represented, and the difference is the reuse distance.
(2) For the retention time distribution information of DRAM, represented by a set R, where each element is a doublet (R)k,rtk),
Figure BDA0001259235330000063
RkAddress, rt, representing memory line kkIt represents the hold time for memory line k.
(3) For each P in the set ViFind the corresponding memory line R in the set RkFor a certain page PiHas a set (P)i,Vij),
Figure BDA0001259235330000064
First find page PiThe maximum value of all access reuse distances, denoted as (P)imaxV), then distribute the set (R) in the hold timek,rtk),
Figure BDA0001259235330000065
Finding rt in all memory lines with retention time exceeding maxVkSmallest memory row RkTo page PiAnd RkAnd (4) matching, and repeating the step (3) until all the pages in the set V find the corresponding memory row mapping.
The dynamic threshold adjustment module comprises:
1. At the end of each mapping period, predict the threshold of the page: obtain the maximum reuse distance of the page in the current mapping period and evaluate the error rate of the threshold predicted in the previous mapping period;
2. Using this threshold, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time;
3. Match the page with that memory row of smallest retention time.
The system further comprises performing error control on the dynamic threshold adjustment.
The error control specifically comprises: first, an error control level E is set, and the error rates of all pages are counted in each control period. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, its relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Whether page i is migrated is determined according to the control weight value cw_ik.
According to the scheme, the invention has the advantages that:
after the mapping and the migration of the program data stored in the memory are carried out, the error rate of the static matching mapping method is almost zero, the error rate of the dynamic matching mapping method can be controlled within 0.7 percent, and the two methods can save the original refreshing energy consumption by more than 99 percent.
Drawings
FIG. 1 is a DRAM retention time profile;
FIG. 2 is a schematic diagram of Smart Refresh;
FIG. 3 is a diagram of a Smart Refresh counter operating mechanism;
FIG. 4 is a Flikker DRAM Bank structure diagram;
FIG. 5 is a Flikker system block diagram;
FIG. 6 is a general schematic diagram of a DRAM refresh control method that takes into account error tolerance characteristics;
FIG. 7 is a general block diagram of a DRAM refresh control method based on static/dynamic matching mapping;
fig. 8 is a block diagram of a PID error controller.
Detailed Description
Based on the above observations, the inventors attempt to replace refresh operations with read/write requests and propose static and dynamic matching mapping methods that match the intervals at which the application accesses memory (the reuse distances) with the retention-time distribution of the DRAM, finally reducing or even eliminating refresh; at that point the DRAM behaves as a "non-volatile" device, i.e. NV-DRAM. In recent years, with the emergence of approximate computing and big data, applications no longer require completely correct output and only need the error to be controlled within a certain range, which makes the idea of replacing refresh operations with read/write requests feasible. In addition, to keep the error rate of the matching mapping methods within a controllable range, the inventors propose a PID error controller based on industrial control theory.
For a running application, historical access information can be collected and, after simple processing, turned into reuse-distance information (the time interval between two consecutive accesses to the same memory row); the retention-time distribution of the different DRAM rows can also be obtained offline. The static and dynamic matching mapping methods match the reuse distances to the retention-time distribution: the content stored in a memory row with access interval t is migrated to a memory row whose retention time exceeds t. Because a DRAM read or write can be regarded as a refresh of the accessed row, after this matching the normal read/write operations replace the original refresh mechanism; data correctness is guaranteed while the read/write requests are served, and the energy consumption of the original refresh mechanism disappears accordingly. As shown in fig. 6, the inputs of the whole system are the reuse distances and the retention-time distribution, and the static and dynamic matching mapping methods finally produce a matching mapping result that guides how the application is stored in memory.
Fig. 7 shows the DRAM refresh control structures based on static matching mapping and dynamic matching mapping. The main component of the static structure is the SMP (stateful mapping) unit, whose function is to produce, before the program runs, a mapping scheme for storing its data in the DRAM during execution, using the application access trace and the DRAM retention-time distribution obtained offline, as shown in fig. 7(a). The dynamic structure in fig. 7(b) has three main components: DTP (dynamic threshold mapping), PID-EC, and the migration/page-table-modification unit. DTP, i.e. the dynamic threshold adjustment method, dynamically adjusts the matching mapping result using the access trace collected while the application runs and the effect of the previous matching mapping (the error-rate statistics unit in fig. 7); whenever DTP produces a matching mapping result, the migration unit migrates the corresponding content to the matched row and modifies the page table, so that the operating system can still find the physical address of the migrated memory through the original virtual page address. PID-EC (PID Error Controller) is the PID error controller widely used in industry; its purpose is to find the pages with a high error rate during matching mapping and migrate them, via the page migration unit, to the normally refreshed region, thereby reducing the error rate and guaranteeing the output quality of the application.
The following are embodiments of the invention:
1. DRAM refresh control method based on static matching mapping
In systems such as embedded systems that have no virtual memory and a fixed correspondence between the linear address space and the physical address space, the running applications are relatively fixed and their access patterns are largely deterministic, so the idea of matching mapping is relatively simple: first obtain the application's global access trace offline, analyze the maximum reuse distance of each memory row, and match each memory row to a DRAM row whose retention time is greater than that maximum reuse distance.
The static matching mapping method is divided into two stages: Profiling and Mapping. In the Profiling stage, the target application is first run offline and its memory access trace is collected and further processed to obtain the reuse-distance information; the DRAM retention-time distribution is also obtained. The Mapping stage matches the application's reuse-distance information to the DRAM retention-time distribution.
Profiling: the early program-analysis stage
(1) Application reuse-distance information. Assume the trace format of each access (i.e. the global access information) is: access address, access type (read/write), timestamp. The application access trace collected offline is then a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of the accessed page i and T_ij is the timestamp of the j-th access to page i. From the set D we can compute a set V in which each element is a doublet (P_i, V_ij), where P_i is the address of the accessed page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, i.e. the reuse distance of the accesses.
(2) DRAM retention-time distribution. The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
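A minimal Python sketch of this Profiling step, deriving the reuse-distance set V and each page's maximum reuse distance maxV from a raw access trace, is shown below (the trace format follows the description above; the function names are invented for illustration):

from collections import defaultdict

def build_reuse_distances(trace):
    # trace: list of (page_address, access_type, timestamp) tuples, i.e. set D.
    # Returns {page: [V_i1, V_i2, ...]}: the gaps between consecutive accesses
    # to each page, i.e. set V.
    timestamps = defaultdict(list)
    for page, _access_type, ts in sorted(trace, key=lambda record: record[2]):
        timestamps[page].append(ts)
    return {page: [later - earlier for earlier, later in zip(ts_list, ts_list[1:])]
            for page, ts_list in timestamps.items()}

def max_reuse_distances(reuse):
    # (P_i, maxV) for every page; pages accessed only once have no reuse distance.
    return {page: max(distances) for page, distances in reuse.items() if distances}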
Mapping: the storage-object mapping stage
The idea of the static matching mapping method is to find, for each page P_i in the set V, a suitable memory row R_k in the set R and match them, and finally to guide the storage of the corresponding data while the program runs, achieving the goal of "no refresh".
For each page P_i in the set V, find a suitable memory row R_k in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV); then, among all memory rows (R_k, rt_k) in the retention-time distribution whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match P_i with that R_k. The above procedure is repeated until every page in the set V has found a suitable memory-row mapping.
To avoid the case where the maximum reuse distance maxV of a page P_i is far larger than its other reuse distances, which would lead to a poor match, the reasonableness of maxV is first checked after it is found; if it is unreasonable, it is discarded and the search is repeated. Although this may cause some errors, it has no essential impact on the application's output quality, since the approximate-computing application has a certain error-tolerance capability.
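The Mapping stage described above can be sketched as a simple greedy assignment in which each page takes the not-yet-assigned row with the smallest retention time that still exceeds the page's maximum reuse distance. This is a minimal sketch, assuming that unreasonable maxV values have already been filtered out as discussed; the processing order (largest maxV first) is a choice made for this example, not taken from the patent:

def static_matching_mapping(max_v, retention):
    # max_v:     {page_address: maximum reuse distance maxV of the page}
    # retention: {row_address: retention time rt_k of the row}
    # Returns {page_address: row_address}: each page is matched to the
    # unassigned row with the smallest retention time exceeding its maxV.
    free_rows = sorted(retention.items(), key=lambda item: item[1])
    mapping = {}
    # Handle the pages with the largest maxV first so that the scarce
    # long-retention rows are not wasted on undemanding pages.
    for page, maxv in sorted(max_v.items(), key=lambda item: item[1], reverse=True):
        for index, (row, rt) in enumerate(free_rows):
            if rt > maxv:
                mapping[page] = row
                free_rows.pop(index)
                break
        else:
            raise ValueError("no DRAM row can hold page %s without refresh" % page)
    return mapping

In the structure of fig. 7(a), an assignment of this kind is produced offline by the SMP unit and then used to guide where the program's data is placed in the DRAM.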
2. DRAM refresh control method based on dynamic matching mapping
(1) Dynamic threshold adjustment
In the static matching mapping method, the maximum reuse distance of a page is found first, and the page is then matched to a DRAM memory row whose retention time is greater than that maximum reuse distance, so that the read/write operations on the page refresh the memory cells before a retention-time failure can occur and the original refresh operations can be eliminated entirely. Accordingly, the key problem of dynamic matching mapping is how, in each mapping period, to predict the maximum reuse distance of the following mapping period and match it to suitable memory rows in the DRAM retention-time distribution; this predicted maximum reuse distance is called the threshold.
The matching mapping based on dynamic threshold adjustment mainly comprises the following steps:
(The matching mapping process is illustrated for a single page; in actual operation, the following mapping process must be repeated for every accessed page.)
1. At the end of each mapping period, predict the threshold of the page. Specifically, obtain the maximum reuse distance of the page in the current mapping period, evaluate the error rate of the threshold predicted in the previous mapping period (the ratio of the number of matching failures to the total number of accesses), and predict the page's threshold from this information.
2. Using the threshold obtained in the previous step, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time.
3. Match the page with the memory row found above. Specifically, first migrate the data at the physical address corresponding to the virtual page to the matched physical address, then modify the page table so that the original virtual page address maps to the new physical address, ensuring that the program continues to run normally.
Assume the trace format of each access is: access address, access type (read/write), timestamp. At intervals of T_slot we count the program's memory-access information in the current period (called the mapping period), evaluate the matching effect of the mapping, and predict the trend of each page's reuse-distance threshold for matching. Assume that at time T (T = n*T_slot, n = 1, 2, ...) the collected application access trace is a set T in which each element is a doublet (P_i, T_ij), where P_i is the address of the accessed virtual page i and T_ij is the timestamp of the j-th access to page i. A set V is computed from the set T, in which each element is a doublet (P_i, V_ij), where P_i is the address of the accessed page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, i.e. the reuse distance of the accesses. The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
The idea of dynamic matching mapping is to match, for each page P_i, a corresponding memory row R_k in the set R on the basis of the access information of the current time period T_slot. During matching, the prediction is made from the trend of the threshold's effect in the current and previous mapping periods, and memory-mapping guidance information is finally produced.
The dynamic prediction method uses a simple hill-climbing algorithm to adjust the threshold. Specifically, at the end of each current mapping period T_slot, we measure the effect of the previous period's threshold lastThresh in the current period and compute the error rate ErrorRate = #(V_ij > lastThresh) / #T_ij, where #(V_ij > lastThresh) is the number of accesses whose reuse distance exceeds the threshold and #T_ij is the total number of accesses to the page. If this error rate is greater than the error rate PreviousErrorRate of the previous mapping period, lastThresh is shown to perform poorly and the threshold of the current time period is adopted. If the error rate is smaller than PreviousErrorRate, the decision is made according to the trend of the threshold; the detailed flow is shown in Algorithm 1.
(Algorithm 1: the dynamic threshold adjustment flow based on hill climbing; the pseudo-code figure is not reproduced here.)
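Since Algorithm 1 is only available as a figure, the following Python sketch reproduces the hill-climbing idea as described in the text; the error-rate comparison follows the description above, while the trend-based branch is one plausible reading rather than the patent's exact rule:

def adjust_threshold(last_thresh, prev_thresh, reuse_distances, previous_error_rate):
    # One hill-climbing step at the end of a mapping period T_slot.
    # reuse_distances: the page's V_ij values observed in the current period.
    # Returns (new_threshold, error_rate_of_last_thresh).
    total = len(reuse_distances)
    # ErrorRate = #(V_ij > lastThresh) / #T_ij
    error_rate = (sum(1 for v in reuse_distances if v > last_thresh) / total
                  if total else 0.0)
    current_max = max(reuse_distances) if reuse_distances else last_thresh

    if error_rate > previous_error_rate:
        # lastThresh performed worse than before: adopt the threshold observed
        # in the current period, i.e. its maximum reuse distance.
        new_thresh = current_max
    else:
        # The error rate improved: keep moving in the direction of the previous
        # adjustment (an assumed reading of Algorithm 1's trend rule).
        new_thresh = last_thresh + (last_thresh - prev_thresh)
    return new_thresh, error_rate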
(2) PID error controller
In actual operation, most errors are concentrated in a few pages. We therefore consider migrating the pages with higher error rates out and refreshing them normally to keep their data correct. Simply counting the error rates of all pages and migrating the page with the highest error rate at the end of each mapping period does not control the error rate accurately: the error rate may end up far below the given upper limit, while the migration cost of moving too many memory pages exceeds the power saved by avoiding refresh, so the scheme does not pay off.
Fig. 8 shows the structure of the PID error controller. In each PID control period, the error-rate statistics component counts the error rates of all current pages; from this error-rate information and the given error-rate control level it computes the relative error rate of each page and passes it to the PID error controller, which decides, according to the computed control quantity, which pages need to be migrated and then sends migration commands to the DRAM.
An error control level E is given first, meaning that the PID controller should keep the system error rate around E. In each control period the error rates of all pages are counted. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, for example, its relative error rate in the current time slice is first computed as re_ik = e_ik/E, and the weight value of the PID controller is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Whether page i is migrated is determined according to the control weight value cw_ik.
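A minimal Python sketch of this per-page PID computation and the migration decision follows; re_ik is taken as e_ik/E in line with the "relative error rate" wording, and the decision rule (migrate when the control weight exceeds a fixed limit) and the gain values are assumptions for illustration only:

def control_weights(curr_rel_err, prev_rel_err, P=1.0, I=0.5, D=0.25):
    # cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
    # curr_rel_err / prev_rel_err: {page: re_ik} for the current and previous
    # control periods; the gains P, I, D shown here are placeholders.
    weights = {}
    for page, re_now in curr_rel_err.items():
        re_before = prev_rel_err.get(page, 0.0)
        weights[page] = P * re_now + I * (re_now + re_before) + D * (re_now - re_before)
    return weights

def pages_to_migrate(error_rates, prev_error_rates, error_level, cw_limit=1.0):
    # Pages whose control weight exceeds cw_limit are migrated back to the
    # normally refreshed region (cw_limit is an assumed decision threshold).
    curr = {page: e / error_level for page, e in error_rates.items()}
    prev = {page: e / error_level for page, e in prev_error_rates.items()}
    return [page for page, w in control_weights(curr, prev).items() if w > cw_limit]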
The invention also provides an approximate application-oriented low-overhead DRAM refreshing system, which comprises the following components:
the static matching mapping module is used for acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance;
and the dynamic threshold adjustment module is used for periodically predicting, from the historical mapping results, the maximum reuse distance of each mapping period and matching it to corresponding memory rows in the DRAM retention-time distribution.
The static matching mapping module comprises
(1) The global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i. A set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i; this difference is the reuse distance.
(2) The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
(3) For each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV); then, among all memory rows (R_k, rt_k) in the retention-time distribution whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k. Repeat step (3) until every page in the set V has found its corresponding memory-row mapping.
The dynamic threshold adjustment module comprises:
1. At the end of each mapping period, predict the threshold of the page: obtain the maximum reuse distance of the page in the current mapping period and evaluate the error rate of the threshold predicted in the previous mapping period;
2. Using this threshold, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time;
3. Match the page with that memory row of smallest retention time.
The system further comprises performing error control on the dynamic threshold adjustment.
The error control specifically comprises: first, an error control level E is set, and the error rates of all pages are counted in each control period. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, its relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Here P, I, and D are the proportional, integral, and derivative parameters of the PID control algorithm; their empirical values can be set by manually tuning for the lowest error rate under actual conditions, and k is the time parameter, specifically the index of the current control period.
Whether page i is migrated is determined according to the control weight value cw_ik.

Claims (8)

1. An approximation application-oriented low-overhead DRAM refresh method, comprising:
a static matching mapping step, namely acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance; specifically comprising: the global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i; a set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, this difference being the reuse distance; the retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time; for each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV), then, among all memory rows (R_k, rt_k) whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k, until every page in the set V has found its corresponding memory-row mapping;
and a dynamic threshold adjustment step, namely periodically predicting, from the historical mapping results, the maximum reuse distance (threshold) of each mapping period, and matching the page, within the DRAM retention-time distribution, to the unallocated memory row with the smallest retention time among all memory rows whose retention time is greater than the threshold.
2. The approximate application-oriented low-overhead DRAM refresh method of claim 1, wherein the dynamic threshold adjustment step comprises:
predicting the threshold of the page at the end of each mapping period, wherein the maximum reuse distance of the page in the current mapping period is obtained and the error rate of the threshold predicted in the previous mapping period is evaluated;
and matching the page with the memory row with the minimum retention time.
3. The approximation-application-oriented low-overhead DRAM refresh method of claim 1, further comprising error controlling the dynamic threshold adjustment step.
4. The approximate-application-oriented low-overhead DRAM refresh method of claim 3, wherein the error control specifically comprises: first, setting an error control level E and counting the error rates of all pages in each control period; assuming the current control period is the k-th control period T_k, the error rate of page i is e_ik; at the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate, wherein for page i the relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
whether page i is migrated is determined according to the control weight value cw_ik; P, I, and D are respectively the proportional, integral, and derivative parameters of the PID control algorithm.
5. An approximation application-oriented low-overhead DRAM refresh system, comprising:
the static matching mapping module is used for acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance; specifically comprising: the global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i; a set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, this difference being the reuse distance; the retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time; for each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV), then, among all memory rows (R_k, rt_k) whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k, until every page in the set V has found its corresponding memory-row mapping;
and the dynamic threshold adjustment module is used for periodically predicting, from the historical mapping results, the maximum reuse distance (threshold) of each mapping period, and matching the page, within the DRAM retention-time distribution, to the unallocated memory row with the smallest retention time among all memory rows whose retention time is greater than the threshold.
6. The approximation-application-oriented low-overhead DRAM refresh system of claim 5, wherein the dynamic threshold adjustment module comprises:
predicting the threshold of the page at the end of each mapping period, wherein the maximum reuse distance of the page in the current mapping period is obtained and the error rate of the threshold predicted in the previous mapping period is evaluated;
and matching the page with the memory row with the minimum retention time.
7. The approximation-application-oriented low-overhead DRAM refresh system of claim 5, further comprising error-controlling the dynamic threshold adjustment step.
8. The approximate-application-oriented low-overhead DRAM refresh system of claim 7, wherein the error control comprises: first, setting an error control level E and counting the error rates of all pages in each control period; assuming the current control period is the k-th control period T_k, the error rate of page i is e_ik; at the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate, wherein for page i the relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
whether page i is migrated is determined according to the control weight value cw_ik; P, I, and D are respectively the proportional, integral, and derivative parameters of the PID control algorithm.