CN107015628B - Low-overhead DRAM refreshing method and system for approximate application

Info

Publication number: CN107015628B
Application number: CN201710203437.9A
Authority: CN (China)
Prior art keywords: memory, page, mapping, DRAM, matching
Legal status: Active
Inventors: 王颖, 李华伟, 刘波, 刘国培, 刘超伟, 孙强, 李晓维
Assignee: Institute of Computing Technology of CAS
Priority/filing date: 2017-03-30
Publication of CN107015628A (application): 2017-08-04
Publication of CN107015628B (grant): 2020-08-28

Classifications

    • G06F1/3221 Monitoring of peripheral devices of disk drive devices (power management)
    • G06F1/3275 Power saving in memory, e.g. RAM, cache (power management)


Abstract

The invention provides a low-overhead DRAM refresh method and system for approximate applications, in the technical field of memory design. The method comprises a static matching mapping step, in which the application's global memory-access information is acquired offline, the maximum reuse distance of each memory row in the global access information is analyzed, and the content of each memory row is migrated to a memory row whose retention time exceeds that maximum reuse distance; and a dynamic threshold adjustment step, in which the maximum reuse distance of each mapping period is periodically predicted from the historical mapping results and matched to corresponding memory rows in the DRAM retention-time distribution. After the program data stored in memory are mapped and migrated, the error rate of the static matching mapping method is close to zero and the error rate of the dynamic matching mapping method can be kept within 0.7%, while both methods save more than 99% of the original refresh energy.

Description

Low-overhead DRAM refreshing method and system for approximate application
Technical Field
The invention relates to the technical field of memory design, in particular to a low-overhead DRAM refreshing method and system for approximate application.
Background
A significant portion of the power consumption of current processor systems is generated by the DRAM main memory, and this trend is intensifying. Recent studies show that in modern server systems the main memory system accounts for up to 30-40% of total power consumption. Main memory power can be divided into memory-controller power, background power, and dynamic power. Background power is independent of memory-access activity and comes mainly from the main memory peripheral circuitry, transistor leakage, and refresh power, where refresh power is caused by capacitor leakage in the DRAM (dynamic random access memory) storage cells: the DRAM controller must compensate for the leaked capacitor charge with periodic refresh operations to guarantee the correctness of stored data. The study by Bhati et al. shows that more than 20% of main memory power is consumed by DRAM refresh operations, so reducing the refresh power of the DRAM main memory is very important for system energy-efficiency optimization.
Although JEDEC specifies a 64ms refresh interval standard, practical studies have shown that 99% of DRAM cell retention times can reach nearly 10s, as shown in fig. 1, and thus, the refresh mechanism of conventional memory systems has a large design space.
Approximate computing is increasingly regarded as a way to reduce energy consumption, especially with the rise of mobile and embedded devices. Many computing tasks, such as media processing (video, audio, image), recognition, and data mining, do not require completely correct results and tolerate a certain degree of error. The "long-tail effect" of the traditional memory system, in which 99% of the effort is spent eliminating the last 1% of the error rate, is therefore a great waste of energy.
In a DRAM, read and write operations on a memory cell can take the place of refresh operations: if the interval between two consecutive accesses to the same address unit is shorter than the cell's retention time, the cell does not need to be refreshed, which saves refresh power. In typical approximate-computing applications such as multimedia, games, and audio/video, only a small portion of the data, the data related to program control flow, is critical to the correctness of program execution. This portion is called critical data, and retention-time faults in it severely affect the correctness of the output. The other, large data sets are insensitive to faults, including retention-time faults, and are generally called non-critical data. Therefore, to reduce memory refresh operations as much as possible, the memory-access pattern of the non-critical data in an approximate computation can be used to remap its storage in the DRAM. By combining this with the distribution of retention-time deviations in the DRAM, reasonable data remapping can reduce refresh operations and refresh power without affecting the user experience of the application.
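As a rough numerical illustration of this condition, the following Python sketch checks whether a row's accesses alone keep its data alive (the function name and the example numbers are invented for clarity and are not taken from the patent):

def refresh_needed(access_times_ms, retention_time_ms):
    # Return True if a row with the given retention time would lose data
    # between consecutive accesses, i.e. some reuse distance reaches the
    # retention time; otherwise the row's own reads/writes keep it alive.
    reuse_distances = [b - a for a, b in zip(access_times_ms, access_times_ms[1:])]
    return any(d >= retention_time_ms for d in reuse_distances)

# A row accessed every 500 ms whose cells retain data for 2 s needs no refresh:
print(refresh_needed([0, 500, 1000, 1500], 2000))  # False
# The same access pattern on a weak row retaining data for only 300 ms does:
print(refresh_needed([0, 500, 1000, 1500], 300))   # True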
The prior art is described below:
Smart Refresh: JEDEC specifies that DRAM must be refreshed every 64 ms, and current memory chips follow this standard. A counter is set in the DRAM memory controller; when the counter decrements to zero, indicating that 64 ms have elapsed, the memory controller resets the counter and sends a refresh command, issued as the Row Refresh command shown in fig. 2.
To refresh data correctly, a DRAM cell is essentially read and written back to restore the charge level of its storage capacitor. Memory read and write operations themselves therefore refresh the corresponding memory row, so a refresh operation following a memory access can be avoided. As shown by Row Access in fig. 2, the best case for Smart Refresh is that each row Rk receives a memory access request Ak before its refresh command is issued, so that all refresh operations following accesses can be cancelled, saving power.
The basic idea of Smart Refresh is to associate a 2-bit or 3-bit counter with each row in a bank. The counter value is stored and updated in the memory controller and decrements from its maximum value to zero within one refresh interval. When a memory row is read or written, its counter is reset to the maximum value and begins decrementing again. The memory controller refreshes only the rows whose counters have reached zero, since a zero counter means the row must be refreshed. Memory accesses thus postpone refresh operations by resetting the counters; in the best case, shown in fig. 2, no refresh command is generated at all.
Fig. 3 illustrates the operating mechanism of the Smart Refresh counters. In the example of fig. 3(a), a 2-bit counter is used and the refresh interval is 64 ms. Assuming no program issues memory access requests to the DRAM during the whole interval, the memory controller automatically decrements the counters every 16 ms; when a counter reaches zero, the corresponding memory row must be refreshed. Here the counters of all memory rows reach zero at the same time, so the memory controller must refresh all rows at once, and since refresh commands cannot be executed in parallel, memory system performance is severely affected. In fig. 3(b) the counter values are initialized from 0 to 3, which avoids having a large number of memory rows due for refresh simultaneously. This scheme still has problems: first, randomly initializing the counters means that 1/4 of the memory rows must be refreshed at each step, and even at initialization 1/4 of the counters are already 0; second, during operation, memory read/write requests reset the counters to the maximum value, so the situation shown in fig. 3(a) can still arise.
To solve these problems, Smart Refresh staggers both the initial counter values and the decrement operations. For example, in fig. 3(b) the memory rows are divided into N groups (N depends on the size of the memory refresh queue; N = 4 in the figure, and each group contains the counters of 16 memory rows). The original scheme updates the counters at 0 ms, 16 ms, 32 ms, and 48 ms; the improved scheme further spreads out each of these updates. The counter update originally performed at 0 ms now corresponds to:
the counters of the 1st memory row in all 4 groups are updated at 0 ms,
the counters of the 2nd memory row in all 4 groups are updated at 1 ms,
……
the counters of the 16th memory row in all 4 groups are updated at 15 ms.
The essence of this scheme is to reduce the time granularity of the counter updates so that the update operations are spread out; at the same time, the number of rows to be refreshed at each point is correspondingly reduced. As shown in fig. 3, N memory rows need to be refreshed every 1 ms, and N is chosen according to the size of the refresh command request queue of the DRAM memory controller, which avoids the performance degradation caused by a burst of refresh operations blocking normal requests.
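To make the counter mechanism concrete, the following Python sketch models the per-row countdown logic described above (the class name SmartRefreshBank and its methods are invented for illustration; the real Smart Refresh is implemented in memory controller hardware):

class SmartRefreshBank:
    # Minimal model of Smart Refresh: one small countdown counter per row.
    # A row is refreshed only when its counter reaches zero; any read/write
    # resets the counter, postponing (and often eliminating) the refresh.

    def __init__(self, num_rows, counter_max=3):
        self.counter_max = counter_max
        # Staggered initial values so that rows do not all expire at once.
        self.counters = [row % (counter_max + 1) for row in range(num_rows)]

    def on_access(self, row):
        # A read or write refreshes the row as a side effect.
        self.counters[row] = self.counter_max

    def on_tick(self):
        # Called once per sub-interval (e.g. every 16 ms for a 2-bit counter
        # and a 64 ms refresh window). Returns the rows that must be refreshed.
        due = []
        for row, value in enumerate(self.counters):
            if value == 0:
                due.append(row)
                self.counters[row] = self.counter_max  # refresh restores the charge
            else:
                self.counters[row] = value - 1
        return due

With counter_max = 3 (a 2-bit counter) and a 64 ms refresh window, on_tick corresponds to one 16 ms step; a row that is read or written at least once per window never appears in the returned list, so its refresh operations are eliminated.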
Flikker: the Flikker technology separates non-critical data with tolerance capability to errors in application data and carries out low-frequency refreshing based on tolerance of the application to the errors, so that power consumption is reduced, and the Flikker completes the low-frequency refreshing of the non-critical data through combination of software and hardware.
In terms of hardware, Flikker divides each DRAM Bank into a normal refresh region for ensuring that critical data is correct and a low frequency refresh region for non-critical data for saving power consumption, as shown in fig. 4.
Firstly, a programmer needs to label key data when writing an application; secondly, in the program running process, the system needs to store the key data and the non-key data into a normal refreshing area and a low-frequency refreshing area in the memory respectively; then the operating system configures an autonomous refresh counter and switches the DRAM into an autonomous refresh mode (in the mobile operating system, the operating system gives refresh control to the DRAM when the processor enters a sleep mode for saving power consumption); and finally, the self-refresh controller respectively refreshes different regions of the DRAM at different frequencies according to the configuration parameters of the operating system. The software aspect requires modifications to the application source code, runtime system, and operating system to work in conjunction, as shown in particular in fig. 5.
There are few domestic or foreign research results on refresh-control optimization of dynamic memory (DRAM) for approximate-computing applications, and the existing results mainly have the following problems: first, how to guarantee the correct operation of computer applications and their quality of service while reducing the refresh frequency and the number of refresh operations; second, how to design different refresh frequencies for different DRAM address blocks so that the refresh cost is minimized; third, the number of refreshes cannot be minimized through a one-to-one matching of data addresses to the memory address space combined with memory address allocation. These three points lead directly to three deficiencies of the existing work: first, the refresh frequency is too high to fully exploit the fault tolerance of approximate applications; second, the refresh schemes are unreliable, causing application errors or degrading output quality; third, high area and power overhead.
The technical problems of the prior art are as follows:
SmartRefresh: 1) Smart Refresh introduces a timer for each eDRAM (embedded DRAM) row and avoids redundant refreshes by recording the last refresh/access timestamp of each row; however, the method mainly targets on-chip eDRAM, and the overhead of the introduced counters is too large for a high-capacity DRAM main memory. 2) SmartRefresh saves refresh power by 52.6% on average, whereas the present method can save more than 99% of the refresh power.
Flikker: flikker needs a programmer to mark non-critical data and combines with a compiler operating system and the like to work cooperatively, so that the technology is too complex and the application difficulty is higher. Flikker reduces the refresh rate to non-key data, sets up ground refresh rate region in DRAM, it is limited to save the refresh consumption, and this technique can save most refresh consumption almost.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a low-overhead DRAM refreshing method and system for approximate application.
The invention provides an approximate application-oriented low-overhead DRAM refreshing method, which comprises the following steps:
a static matching mapping step, namely acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance;
and a dynamic threshold adjustment step, namely periodically predicting, from the historical mapping results, the maximum reuse distance of each mapping period and matching it to corresponding memory rows in the DRAM retention-time distribution.
The step of static matching mapping comprises
(1) The global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i. A set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i; this difference is the reuse distance.
(2) The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
(3) For each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV); then, among all memory rows (R_k, rt_k) in the retention-time distribution whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k. Repeat step (3) until every page in the set V has found its corresponding memory-row mapping.
The dynamic threshold adjusting step comprises:
1. At the end of each mapping period, predict the threshold of the page: obtain the maximum reuse distance of the page in the current mapping period and evaluate the error rate of the threshold predicted in the previous mapping period;
2. Using this threshold, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time;
3. Match the page with that memory row of smallest retention time.
The method further comprises performing error control on the dynamic threshold adjustment step.
The error control specifically comprises: first, an error control level E is set, and the error rates of all pages are counted in each control period. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, its relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Whether page i is migrated is determined according to the control weight value cw_ik.
The invention also provides an approximate application-oriented low-overhead DRAM refreshing system, which comprises the following components:
the static matching mapping module is used for acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance;
and the dynamic threshold adjustment module is used for periodically predicting, from the historical mapping results, the maximum reuse distance of each mapping period and matching it to corresponding memory rows in the DRAM retention-time distribution.
The static matching mapping module comprises
(1) The global access information comprises: visit toMemory address, memory type, time stamp, and form a set D, each element in the set D being a binary group (P)i,Tij),
Figure BDA0001259235330000061
Wherein P isiRepresenting the address of page i, T, of the memoryijA timestamp representing the jth access to page i, and a set V is obtained according to a set D, each element in the set V being a binary group (P)i,Vij),
Figure BDA0001259235330000062
Wherein P isiThe address, V, of a page i representing the memoryijAnd the difference of the time stamps of j and j-1 accesses to the page i is represented, and the difference is the reuse distance.
(2) For the retention time distribution information of DRAM, represented by a set R, where each element is a doublet (R)k,rtk),
Figure BDA0001259235330000063
RkAddress, rt, representing memory line kkIt represents the hold time for memory line k.
(3) For each P in the set ViFind the corresponding memory line R in the set RkFor a certain page PiHas a set (P)i,Vij),
Figure BDA0001259235330000064
First find page PiThe maximum value of all access reuse distances, denoted as (P)imaxV), then distribute the set (R) in the hold timek,rtk),
Figure BDA0001259235330000065
Finding rt in all memory lines with retention time exceeding maxVkSmallest memory row RkTo page PiAnd RkAnd (4) matching, and repeating the step (3) until all the pages in the set V find the corresponding memory row mapping.
The dynamic threshold adjustment module comprises:
1. At the end of each mapping period, predict the threshold of the page: obtain the maximum reuse distance of the page in the current mapping period and evaluate the error rate of the threshold predicted in the previous mapping period;
2. Using this threshold, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time;
3. Match the page with that memory row of smallest retention time.
The system further comprises performing error control on the dynamic threshold adjustment.
The error control specifically comprises: first, an error control level E is set, and the error rates of all pages are counted in each control period. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, its relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Whether page i is migrated is determined according to the control weight value cw_ik.
According to the scheme, the invention has the advantages that:
after the mapping and the migration of the program data stored in the memory are carried out, the error rate of the static matching mapping method is almost zero, the error rate of the dynamic matching mapping method can be controlled within 0.7 percent, and the two methods can save the original refreshing energy consumption by more than 99 percent.
Drawings
FIG. 1 is a DRAM retention time profile;
FIG. 2 is a schematic diagram of Smart Refresh;
FIG. 3 is a diagram of a Smart Refresh counter operating mechanism;
FIG. 4 is a Flikker DRAM Bank structure diagram;
FIG. 5 is a Flikker system block diagram;
FIG. 6 is a general schematic diagram of a DRAM refresh control method that takes into account error tolerance characteristics;
FIG. 7 is a general block diagram of a DRAM refresh control method based on static/dynamic matching mapping;
fig. 8 is a block diagram of a PID error controller.
Detailed Description
Based on the above observations, the inventors attempt to replace refresh operations with read/write requests and propose static and dynamic matching mapping methods that match the intervals at which the application accesses memory (the reuse distances) with the retention-time distribution of the DRAM, finally reducing or even eliminating refresh; at that point the DRAM behaves as a "non-volatile" device, i.e. NV-DRAM. In recent years, with the emergence of approximate computing and big data, applications no longer require completely correct output and only need the error to be controlled within a certain range, which makes the idea of replacing refresh operations with read/write requests feasible. In addition, to keep the error rate of the matching mapping methods within a controllable range, the inventors propose a PID error controller based on industrial control theory.
For a running application, historical access information can be collected and, after simple processing, turned into reuse-distance information (the time interval between two consecutive accesses to the same memory row); the retention-time distribution of the different DRAM rows can also be obtained offline. The static and dynamic matching mapping methods match the reuse distances to the retention-time distribution: the content stored in a memory row with access interval t is migrated to a memory row whose retention time exceeds t. Because a DRAM read or write can be regarded as a refresh of the accessed row, after this matching the normal read/write operations replace the original refresh mechanism; data correctness is guaranteed while the read/write requests are served, and the energy consumption of the original refresh mechanism disappears accordingly. As shown in fig. 6, the inputs of the whole system are the reuse distances and the retention-time distribution, and the static and dynamic matching mapping methods finally produce a matching mapping result that guides how the application is stored in memory.
Fig. 7 shows the DRAM refresh control structures based on static matching mapping and dynamic matching mapping. The main component of the static structure is the SMP (stateful mapping) unit, whose function is to produce, before the program runs, a mapping scheme for storing its data in the DRAM during execution, using the application access trace and the DRAM retention-time distribution obtained offline, as shown in fig. 7(a). The dynamic structure in fig. 7(b) has three main components: DTP (dynamic threshold mapping), PID-EC, and the migration/page-table-modification unit. DTP, i.e. the dynamic threshold adjustment method, dynamically adjusts the matching mapping result using the access trace collected while the application runs and the effect of the previous matching mapping (the error-rate statistics unit in fig. 7); whenever DTP produces a matching mapping result, the migration unit migrates the corresponding content to the matched row and modifies the page table, so that the operating system can still find the physical address of the migrated memory through the original virtual page address. PID-EC (PID Error Controller) is the PID error controller widely used in industry; its purpose is to find the pages with a high error rate during matching mapping and migrate them, via the page migration unit, to the normally refreshed region, thereby reducing the error rate and guaranteeing the output quality of the application.
The following are embodiments of the invention:
1. DRAM refresh control method based on static matching mapping
In systems such as embedded systems that have no virtual memory and a fixed correspondence between the linear address space and the physical address space, the running applications are relatively fixed and their access patterns are largely deterministic, so the idea of matching mapping is relatively simple: first obtain the application's global access trace offline, analyze the maximum reuse distance of each memory row, and match each memory row to a DRAM row whose retention time is greater than that maximum reuse distance.
The static matching mapping method is divided into two stages: Profiling and Mapping. In the Profiling stage, the target application is first run offline and its memory access trace is collected and further processed to obtain the reuse-distance information; the DRAM retention-time distribution is also obtained. The Mapping stage matches the application's reuse-distance information to the DRAM retention-time distribution.
Profiling: the early program-analysis stage
(1) Application reuse-distance information. Assume the trace format of each access (i.e. the global access information) is: access address, access type (read/write), timestamp. The application access trace collected offline is then a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of the accessed page i and T_ij is the timestamp of the j-th access to page i. From the set D we can compute a set V in which each element is a doublet (P_i, V_ij), where P_i is the address of the accessed page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, i.e. the reuse distance of the accesses.
(2) DRAM retention-time distribution. The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
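A minimal Python sketch of this Profiling step, deriving the reuse-distance set V and each page's maximum reuse distance maxV from a raw access trace, is shown below (the trace format follows the description above; the function names are invented for illustration):

from collections import defaultdict

def build_reuse_distances(trace):
    # trace: list of (page_address, access_type, timestamp) tuples, i.e. set D.
    # Returns {page: [V_i1, V_i2, ...]}: the gaps between consecutive accesses
    # to each page, i.e. set V.
    timestamps = defaultdict(list)
    for page, _access_type, ts in sorted(trace, key=lambda record: record[2]):
        timestamps[page].append(ts)
    return {page: [later - earlier for earlier, later in zip(ts_list, ts_list[1:])]
            for page, ts_list in timestamps.items()}

def max_reuse_distances(reuse):
    # (P_i, maxV) for every page; pages accessed only once have no reuse distance.
    return {page: max(distances) for page, distances in reuse.items() if distances}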
Mapping: the storage-object mapping stage
The idea of the static matching mapping method is to find, for each page P_i in the set V, a suitable memory row R_k in the set R and match them, and finally to guide the storage of the corresponding data while the program runs, achieving the goal of "no refresh".
For each page P_i in the set V, find a suitable memory row R_k in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV); then, among all memory rows (R_k, rt_k) in the retention-time distribution whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match P_i with that R_k. The above procedure is repeated until every page in the set V has found a suitable memory-row mapping.
To avoid the case where the maximum reuse distance maxV of a page P_i is far larger than its other reuse distances, which would lead to a poor match, the reasonableness of maxV is first checked after it is found; if it is unreasonable, it is discarded and the search is repeated. Although this may cause some errors, it has no essential impact on the application's output quality, since the approximate-computing application has a certain error-tolerance capability.
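The Mapping stage described above can be sketched as a simple greedy assignment in which each page takes the not-yet-assigned row with the smallest retention time that still exceeds the page's maximum reuse distance. This is a minimal sketch, assuming that unreasonable maxV values have already been filtered out as discussed; the processing order (largest maxV first) is a choice made for this example, not taken from the patent:

def static_matching_mapping(max_v, retention):
    # max_v:     {page_address: maximum reuse distance maxV of the page}
    # retention: {row_address: retention time rt_k of the row}
    # Returns {page_address: row_address}: each page is matched to the
    # unassigned row with the smallest retention time exceeding its maxV.
    free_rows = sorted(retention.items(), key=lambda item: item[1])
    mapping = {}
    # Handle the pages with the largest maxV first so that the scarce
    # long-retention rows are not wasted on undemanding pages.
    for page, maxv in sorted(max_v.items(), key=lambda item: item[1], reverse=True):
        for index, (row, rt) in enumerate(free_rows):
            if rt > maxv:
                mapping[page] = row
                free_rows.pop(index)
                break
        else:
            raise ValueError("no DRAM row can hold page %s without refresh" % page)
    return mapping

In the structure of fig. 7(a), an assignment of this kind is produced offline by the SMP unit and then used to guide where the program's data is placed in the DRAM.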
2. DRAM refresh control method based on dynamic matching mapping
(1) Dynamic threshold adjustment
In the static matching mapping method, the maximum reuse distance of a page is found first, and the page is then matched to a DRAM memory row whose retention time is greater than that maximum reuse distance, so that the read/write operations on the page refresh the memory cells before a retention-time failure can occur and the original refresh operations can be eliminated entirely. Accordingly, the key problem of dynamic matching mapping is how, in each mapping period, to predict the maximum reuse distance of the following mapping period and match it to suitable memory rows in the DRAM retention-time distribution; this predicted maximum reuse distance is called the threshold.
The matching mapping based on dynamic threshold adjustment mainly comprises the following steps:
(The matching mapping process is illustrated for a single page; in actual operation, the following mapping process must be repeated for every accessed page.)
1. At the end of each mapping period, predict the threshold of the page. Specifically, obtain the maximum reuse distance of the page in the current mapping period, evaluate the error rate of the threshold predicted in the previous mapping period (the ratio of the number of matching failures to the total number of accesses), and predict the page's threshold from this information.
2. Using the threshold obtained in the previous step, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time.
3. Match the page with the memory row found above. Specifically, first migrate the data at the physical address corresponding to the virtual page to the matched physical address, then modify the page table so that the original virtual page address maps to the new physical address, ensuring that the program continues to run normally.
Assume the trace format of each access is: access address, access type (read/write), timestamp. At intervals of T_slot we count the program's memory-access information in the current period (called the mapping period), evaluate the matching effect of the mapping, and predict the trend of each page's reuse-distance threshold for matching. Assume that at time T (T = n*T_slot, n = 1, 2, ...) the collected application access trace is a set T in which each element is a doublet (P_i, T_ij), where P_i is the address of the accessed virtual page i and T_ij is the timestamp of the j-th access to page i. A set V is computed from the set T, in which each element is a doublet (P_i, V_ij), where P_i is the address of the accessed page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, i.e. the reuse distance of the accesses. The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
The idea of dynamic matching mapping is to match, for each page P_i, a corresponding memory row R_k in the set R on the basis of the access information of the current time period T_slot. During matching, the prediction is made from the trend of the threshold's effect in the current and previous mapping periods, and memory-mapping guidance information is finally produced.
The dynamic prediction method uses a simple hill-climbing algorithm to adjust the threshold. Specifically, at the end of each current mapping period T_slot, we measure the effect of the previous period's threshold lastThresh in the current period and compute the error rate ErrorRate = #(V_ij > lastThresh) / #T_ij, where #(V_ij > lastThresh) is the number of accesses whose reuse distance exceeds the threshold and #T_ij is the total number of accesses to the page. If this error rate is greater than the error rate PreviousErrorRate of the previous mapping period, lastThresh is shown to perform poorly and the threshold of the current time period is adopted. If the error rate is smaller than PreviousErrorRate, the decision is made according to the trend of the threshold; the detailed flow is shown in Algorithm 1.
(Algorithm 1: the dynamic threshold adjustment flow based on hill climbing; the pseudo-code figure is not reproduced here.)
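Since Algorithm 1 is only available as a figure, the following Python sketch reproduces the hill-climbing idea as described in the text; the error-rate comparison follows the description above, while the trend-based branch is one plausible reading rather than the patent's exact rule:

def adjust_threshold(last_thresh, prev_thresh, reuse_distances, previous_error_rate):
    # One hill-climbing step at the end of a mapping period T_slot.
    # reuse_distances: the page's V_ij values observed in the current period.
    # Returns (new_threshold, error_rate_of_last_thresh).
    total = len(reuse_distances)
    # ErrorRate = #(V_ij > lastThresh) / #T_ij
    error_rate = (sum(1 for v in reuse_distances if v > last_thresh) / total
                  if total else 0.0)
    current_max = max(reuse_distances) if reuse_distances else last_thresh

    if error_rate > previous_error_rate:
        # lastThresh performed worse than before: adopt the threshold observed
        # in the current period, i.e. its maximum reuse distance.
        new_thresh = current_max
    else:
        # The error rate improved: keep moving in the direction of the previous
        # adjustment (an assumed reading of Algorithm 1's trend rule).
        new_thresh = last_thresh + (last_thresh - prev_thresh)
    return new_thresh, error_rate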
(2) PID error controller
In actual operation, most errors are concentrated in a few pages. We therefore consider migrating the pages with higher error rates out and refreshing them normally to keep their data correct. Simply counting the error rates of all pages and migrating the page with the highest error rate at the end of each mapping period does not control the error rate accurately: the error rate may end up far below the given upper limit, while the migration cost of moving too many memory pages exceeds the power saved by avoiding refresh, so the scheme does not pay off.
Fig. 8 shows the structure of the PID error controller. In each PID control period, the error-rate statistics component counts the error rates of all current pages; from this error-rate information and the given error-rate control level it computes the relative error rate of each page and passes it to the PID error controller, which decides, according to the computed control quantity, which pages need to be migrated and then sends migration commands to the DRAM.
An error control level E is given first, meaning that the PID controller should keep the system error rate around E. In each control period the error rates of all pages are counted. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, for example, its relative error rate in the current time slice is first computed as re_ik = e_ik/E, and the weight value of the PID controller is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Whether page i is migrated is determined according to the control weight value cw_ik.
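A minimal Python sketch of this per-page PID computation and the migration decision follows; re_ik is taken as e_ik/E in line with the "relative error rate" wording, and the decision rule (migrate when the control weight exceeds a fixed limit) and the gain values are assumptions for illustration only:

def control_weights(curr_rel_err, prev_rel_err, P=1.0, I=0.5, D=0.25):
    # cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
    # curr_rel_err / prev_rel_err: {page: re_ik} for the current and previous
    # control periods; the gains P, I, D shown here are placeholders.
    weights = {}
    for page, re_now in curr_rel_err.items():
        re_before = prev_rel_err.get(page, 0.0)
        weights[page] = P * re_now + I * (re_now + re_before) + D * (re_now - re_before)
    return weights

def pages_to_migrate(error_rates, prev_error_rates, error_level, cw_limit=1.0):
    # Pages whose control weight exceeds cw_limit are migrated back to the
    # normally refreshed region (cw_limit is an assumed decision threshold).
    curr = {page: e / error_level for page, e in error_rates.items()}
    prev = {page: e / error_level for page, e in prev_error_rates.items()}
    return [page for page, w in control_weights(curr, prev).items() if w > cw_limit]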
The invention also provides an approximate application-oriented low-overhead DRAM refreshing system, which comprises the following components:
the static matching mapping module is used for acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance;
and the dynamic threshold adjustment module is used for periodically predicting, from the historical mapping results, the maximum reuse distance of each mapping period and matching it to corresponding memory rows in the DRAM retention-time distribution.
The static matching mapping module comprises
(1) The global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i. A set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i; this difference is the reuse distance.
(2) The retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time.
(3) For each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV); then, among all memory rows (R_k, rt_k) in the retention-time distribution whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k. Repeat step (3) until every page in the set V has found its corresponding memory-row mapping.
The dynamic threshold adjustment module comprises:
1. At the end of each mapping period, predict the threshold of the page: obtain the maximum reuse distance of the page in the current mapping period and evaluate the error rate of the threshold predicted in the previous mapping period;
2. Using this threshold, find, among all memory rows whose retention time is greater than the threshold, the unallocated row with the smallest retention time;
3. Match the page with that memory row of smallest retention time.
The system further comprises performing error control on the dynamic threshold adjustment.
The error control specifically comprises: first, an error control level E is set, and the error rates of all pages are counted in each control period. Assume the current control period is the k-th control period T_k; then the error rate of page i is e_ik. At the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate. For page i, its relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
Here P, I, and D are the proportional, integral, and derivative parameters of the PID control algorithm; their empirical values can be set by manually tuning for the lowest error rate under actual conditions, and k is the time parameter, specifically the index of the current control period.
Whether page i is migrated is determined according to the control weight value cw_ik.

Claims (8)

1. An approximation application-oriented low-overhead DRAM refresh method, comprising:
a static matching mapping step, namely acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance; specifically comprising: the global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i; a set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, this difference being the reuse distance; the retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time; for each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV), then, among all memory rows (R_k, rt_k) whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k, until every page in the set V has found its corresponding memory-row mapping;
and a dynamic threshold adjustment step, namely periodically predicting, from the historical mapping results, the maximum reuse distance (threshold) of each mapping period, and matching the page, within the DRAM retention-time distribution, to the unallocated memory row with the smallest retention time among all memory rows whose retention time is greater than the threshold.
2. The approximate application-oriented low-overhead DRAM refresh method of claim 1, wherein the dynamic threshold adjustment step comprises:
predicting the threshold of the page at the end of each mapping period, wherein the maximum reuse distance of the page in the current mapping period is obtained and the error rate of the threshold predicted in the previous mapping period is evaluated;
and matching the page with the memory row with the minimum retention time.
3. The approximation-application-oriented low-overhead DRAM refresh method of claim 1, further comprising error controlling the dynamic threshold adjustment step.
4. The approximate-application-oriented low-overhead DRAM refresh method of claim 3, wherein the error control specifically comprises: first, setting an error control level E and counting the error rates of all pages in each control period; assuming the current control period is the k-th control period T_k, the error rate of page i is e_ik; at the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate, wherein for page i the relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
whether page i is migrated is determined according to the control weight value cw_ik; P, I, and D are respectively the proportional, integral, and derivative parameters of the PID control algorithm.
5. An approximation application-oriented low-overhead DRAM refresh system, comprising:
the static matching mapping module is used for acquiring global memory-access information of an application offline, analyzing the maximum reuse distance of each memory row in the global access information, and migrating the content of each memory row to a memory row whose retention time is longer than that maximum reuse distance; specifically comprising: the global access information comprises the access address, the access type, and the timestamp, and forms a set D in which each element is a doublet (P_i, T_ij), where P_i is the address of memory page i and T_ij is the timestamp of the j-th access to page i; a set V is obtained from the set D, in which each element is a doublet (P_i, V_ij), where P_i is the address of memory page i and V_ij is the difference between the timestamps of the j-th and (j-1)-th accesses to page i, this difference being the reuse distance; the retention-time distribution information of the DRAM is represented by a set R in which each element is a doublet (R_k, rt_k), where R_k is the address of memory row k and rt_k is its retention time; for each P_i in the set V, the corresponding memory row R_k is found in the set R: for a page P_i with doublets (P_i, V_ij), first find the maximum of all its access reuse distances, denoted (P_i, maxV), then, among all memory rows (R_k, rt_k) whose retention time exceeds maxV, find the row R_k with the smallest rt_k and match page P_i with R_k, until every page in the set V has found its corresponding memory-row mapping;
and the dynamic threshold adjustment module is used for periodically predicting, from the historical mapping results, the maximum reuse distance (threshold) of each mapping period, and matching the page, within the DRAM retention-time distribution, to the unallocated memory row with the smallest retention time among all memory rows whose retention time is greater than the threshold.
6. The approximation-application-oriented low-overhead DRAM refresh system of claim 5, wherein the dynamic threshold adjustment module comprises:
predicting the threshold of the page at the end of each mapping period, wherein the maximum reuse distance of the page in the current mapping period is obtained and the error rate of the threshold predicted in the previous mapping period is evaluated;
and matching the page with the memory row with the minimum retention time.
7. The approximation-application-oriented low-overhead DRAM refresh system of claim 5, further comprising error-controlling the dynamic threshold adjustment step.
8. The approximate-application-oriented low-overhead DRAM refresh system of claim 7, wherein the error control comprises: first, setting an error control level E and counting the error rates of all pages in each control period; assuming the current control period is the k-th control period T_k, the error rate of page i is e_ik; at the end of T_k, the statistics of all pages are traversed and evaluated to decide whether to migrate, wherein for page i the relative error rate in the current time slice is computed as re_ik = e_ik/E, and the control weight value is then obtained as follows:
cw_ik = P*re_ik + I*(re_ik + re_i(k-1)) + D*(re_ik - re_i(k-1))
whether page i is migrated is determined according to the control weight value cw_ik; P, I, and D are respectively the proportional, integral, and derivative parameters of the PID control algorithm.