CN112947931B

CN112947931B - Wear-leveling compiling method for cyclic rotation group based on phase change memory

Info

Publication number: CN112947931B
Application number: CN202110198085.9A
Authority: CN
Inventors: 李清安; 王紫微
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2021-02-22
Filing date: 2021-02-22
Publication date: 2023-10-03
Anticipated expiration: 2041-02-22
Also published as: CN112947931A

Abstract

The invention discloses a wear-leveling compilation method based on loop-to-array (loop2array) conversion for phase change memory. The method first collects the hotspot variables in the global data area of memory; it then performs interprocedural analysis to exclude the cases in which the loop-to-array conversion cannot be applied; next, it inserts instructions that create an array into the predecessor basic block of the loop entry, replaces reads and writes of the global variable inside the loop body with reads and writes of the generated array, and inserts instructions at the loop exit that assign the value last written to the array back to the global variable; finally, the optimized intermediate representation is assembled and linked into an executable file. The invention can effectively reduce the write peak of the global data area and performs better than existing methods. The method also includes a hotspot variable detection method, an interprocedural analysis method and an escape analysis method for global variables, which provide the basic algorithmic support for the loop-to-array conversion.

Description

Wear-leveling compiling method for cyclic rotation group based on phase change memory

Technical Field

The invention belongs to the technical field of computer storage and relates to a wear-leveling compilation method based on loop-to-array conversion (the "cyclic rotation group" of the title) for phase change memory (Phase Change Memory, PCM), and in particular to such a wear-leveling compilation method for the global data area.

Background

A computer storage system comprises main memory and external storage. The most widely used main memory is Dynamic Random-Access Memory (DRAM). The main problem of DRAM at present is its limited integration density, so it requires relatively large hardware area, whereas PCM has an integration density 2-4 times that of DRAM and can be integrated onto smaller chips. Furthermore, DRAM is a semiconductor memory whose cells must be periodically recharged (refreshed) to retain their data. PCM is a non-volatile memory and requires no refresh operations, so its power consumption is lower.

To address the low integration density, high energy consumption and other problems of DRAM, research on non-volatile memory (Non-Volatile Memory, NVM) has advanced. Among NVMs, phase change memory (Phase Change Memory, PCM) is considered one of the most promising alternatives to DRAM owing to its high integration density, low leakage power and other characteristics.

However, PCM has some disadvantages compared to DRAM, as shown in table 1.

TABLE 1

Metric	DRAM	PCM
Read latency	20-50 ns	50-500 ns
Write latency	20-50 ns	100 ns-10 µs
Read energy	0.8 J/GB	1 J/GB
Write energy	1.2 J/GB	6 J/GB
Write endurance (writes)	>10¹⁶	10⁵-10⁹
Density	1×	2-4×

The write endurance of PCM is 10⁵-10⁹ writes, whereas the write endurance of DRAM exceeds 10¹⁶ writes. Poor write endurance is a major factor impeding the adoption of PCM, and unbalanced memory writes accelerate PCM wear-out. To address the write endurance problem of PCM, researchers have proposed various methods for different memory regions. Most previous studies targeted wear leveling of the heap or stack area of PCM; no wear-leveling compilation method has addressed the global data area.

Disclosure of Invention

The invention aims to solve the above problem and provides a wear-leveling compilation method for the global data area based on phase change memory. The method converts reads and writes of global variables inside loops of the source program into reads and writes of a generated array of a specified size, thereby reducing the frequent reads and writes of global hotspot variables.

The technical scheme adopted by the invention is as follows: a wear-leveling compiling method of a cyclic rotation group based on a phase change memory comprises the following steps:

step 1: collecting write times information of global variables of a source program;

step 2: calculating the size of an array to be generated according to the write times of the global variable in the step 1;

the specific implementation of the step 2 comprises the following sub-steps:

step 2.1: according to the global variable write count information from step 1, computing the average write count W_avg of the global variables;

step 2.2: identifying global hotspot variables: if the ratio W_i / W_avg of the write count W_i of global variable G_i to the average write count W_avg obtained in step 2.1 is greater than or equal to a threshold N, the global variable is considered a global hotspot variable G_hot;

step 2.3: computing the ratio of W_i to W_avg; the value of this ratio is the size size_i of the array required by the global hotspot variable G_hot;

step 2.4: writing the identifier name_i, the allocated memory address value_i and the required array size size_i of each global hotspot variable to a disk file;

step 3: generating an intermediate representation of the source program, namely LLVM intermediate language IR;

step 4: reading global hot spot variable information generated in the step 2 from a disk through LLVM Pass;

step 5: based on the intermediate representation generated in step 3, performing interprocedural analysis to obtain, for each function contained in the program, the set of functions it calls directly and indirectly;

step 6: for each function, collecting information about its loops, namely the entry and exit basic blocks of each loop, which uniquely determine the loop;

step 7: collecting the global hotspot variables read and written in each loop, generating the set of global hotspot variables read and written in the loop;

step 8: performing escape analysis on the set of global hotspot variables read and written in the loop obtained in step 7;

step 9: performing the loop-to-array (loop2array) conversion on the non-escaping global hotspot variables obtained in step 8;

step 10: assembling and linking the transformed intermediate representation into a new executable file.

Compared with the prior art, the invention has the following advantages and beneficial effects:

(1) The invention proposes a novel wear-leveling compilation method, based on loop-to-array conversion, for the global data area.

(2) The existing SWL method is a heuristic algorithm; a heuristic is not necessarily optimal and performs poorly, and SWL relies on collected data-access information. The present method is instead a compilation method based on LLVM IR, so it performs better and can significantly reduce the write peak of the global area.

Drawings

FIG. 1 is a flow chart of an embodiment of the present invention;

FIG. 2 is a schematic diagram of different loop structures according to an embodiment of the present invention.

Detailed Description

To help those of ordinary skill in the art understand and practice the invention, it is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described herein serve only to illustrate and explain the invention and are not intended to limit it.

Referring to FIG. 1, the wear-leveling compilation method based on loop-to-array conversion for phase change memory provided by the invention transforms the input intermediate representation through a transformation Pass written for LLVM, converting reads and writes of a global variable into reads and writes of the generated array. The method specifically comprises the following steps:

in this embodiment, the specific implementation of step 1 includes the following sub-steps:

step 1.1: acquiring the symbol table information of ELF (Executable and Linking Format) executable files, and further collecting global variable information of the source program, wherein the global variable information comprises a variable allocated memory address, type, size and identifier;

step 1.2: redirecting the global variable information collected in the step 1.1 to a disk file;

step 1.3: using the Intel Pin tool, instrumenting the binary file at run time: taking the memory addresses of the global variables collected in step 1.1 as input, the write instructions of the program are dynamically instrumented, and whenever the target address of a write instruction is the memory address of a global variable collected in step 1.1, the write count of that global variable is increased by 1.
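The counting logic of step 1.3 can be sketched independently of the instrumentation tool. The following Python sketch is illustrative only: the function name, the tuple layout of the symbol information and the sample trace are hypothetical, and matching a write against the address range [start, start + size) of a variable is an assumption (the text only says the write address "is the memory address of the global variable").

```python
from bisect import bisect_right


def count_global_writes(globals_info, write_addresses):
    """Count writes per global variable.

    globals_info: list of (name, start_address, size_in_bytes) tuples, as
                  would be read from the ELF symbol table (step 1.1).
    write_addresses: iterable of addresses targeted by write instructions,
                     as an instrumentation tool would report them (step 1.3).
    """
    # Sort by start address so each write can be located by binary search.
    ordered = sorted(globals_info, key=lambda g: g[1])
    starts = [g[1] for g in ordered]
    counts = {name: 0 for name, _, _ in ordered}
    for addr in write_addresses:
        idx = bisect_right(starts, addr) - 1
        if idx < 0:
            continue  # address lies below every global variable
        name, start, size = ordered[idx]
        # Assumption: a write anywhere inside the variable counts as a write.
        if start <= addr < start + size:
            counts[name] += 1
    return counts


# Hypothetical symbol-table entries and write trace.
symbols = [("counter", 0x1000, 4), ("buffer", 0x1010, 16)]
trace = [0x1000, 0x1000, 0x1014, 0x2000, 0x1000]
write_counts = count_global_writes(symbols, trace)
```

In this sample trace, the write to 0x2000 falls outside every global variable and is ignored.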

Step 2: calculating the size of an array to be generated according to the information about the writing times of the global variable in the step 1;

in this embodiment, the specific implementation of step 2 includes the following sub-steps:

Step 2.2: identifying global hotspot variables: if the ratio W_i / W_avg of the write count W_i of global variable G_i to the average write count W_avg obtained in step 2.1 is greater than or equal to the threshold N (in this embodiment, N = 100), the global variable is regarded as a global hotspot variable G_hot;

Step 2.4: writing the identifier name_i, the allocated memory address value_i and the required array size size_i of each global hotspot variable to a disk file;
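As a rough illustration of steps 2.1 to 2.3, the following Python sketch computes the average write count, selects the hotspot variables and derives their array sizes. The function name and sample data are hypothetical, and rounding the ratio up to an integer is an assumption, since the text does not say how a fractional ratio is turned into an array size.

```python
import math


def find_hot_globals(write_counts, threshold=100):
    """Return {name: array_size} for global hotspot variables.

    write_counts: {variable name: number of writes} collected in step 1.
    threshold: the ratio threshold N (100 in the embodiment).
    """
    if not write_counts:
        return {}
    w_avg = sum(write_counts.values()) / len(write_counts)  # step 2.1
    if w_avg == 0:
        return {}  # no writes at all, so no hotspot variables
    hot = {}
    for name, w_i in write_counts.items():
        ratio = w_i / w_avg
        if ratio >= threshold:  # step 2.2
            # Step 2.3; rounding up is an assumption of this sketch.
            hot[name] = math.ceil(ratio)
    return hot


# Hypothetical write counts; a small threshold keeps the example short.
sample = {"a": 90, "b": 5, "c": 3, "d": 2}
hot_sizes = find_hot_globals(sample, threshold=3)
```

Here W_avg is 25, so only `a` (ratio 3.6) reaches the demonstration threshold of 3 and is assigned an array of 4 elements. The threshold of 3 is used only to keep the sample small; the embodiment uses N = 100.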

in this embodiment, for step 3, the LLVM intermediate representation source.ll is generated for the input source program source.c using the command clang -emit-llvm -S <input files>.

in this embodiment, for step 4, the global hotspot variable information G_hot produced in step 2 is read from disk by an LLVM transformation Pass so that the global hotspot variables can be detected, in preparation for the loop-to-array conversion.

in this embodiment, the specific implementation of step 5 includes the following sub-steps:

step 5.1: traversing each function F_a and checking whether it contains function call instructions; if so, obtaining the operand of each call instruction, i.e. the called function F_b. After the traversal, the set of functions directly called by F_a is obtained, which gives a one-to-one correspondence between each function and the set of functions it calls directly: F → set_F;

Step 5.2: for the function F of step 5.1, traversing each function f in set_F and obtaining set_f, the set of functions directly called by f, then updating set_F = set_F ∪ set_f. This is repeated until the size of set_F no longer changes; the resulting set_F is the set of all functions called directly or indirectly by F, i.e. the set of all functions reachable from F.
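The fixed-point iteration of step 5.2 can be sketched as follows in Python. The call graph is modeled as a plain dictionary rather than LLVM data structures, and the function and variable names are hypothetical.

```python
def reachable_functions(direct_calls):
    """Compute, for each function, every function it can reach.

    direct_calls: {function: set of functions it calls directly} (step 5.1).
    Returns {function: set of functions called directly or indirectly}.
    """
    reachable = {}
    for func, callees in direct_calls.items():
        set_f = set(callees)
        while True:
            before = len(set_f)
            # Merge in the direct callees of every function found so far.
            for callee in list(set_f):
                set_f |= direct_calls.get(callee, set())
            if len(set_f) == before:
                break  # the set stopped growing: fixed point reached
        reachable[func] = set_f
    return reachable


# Hypothetical call graph with a cycle between update and helper.
calls = {
    "main": {"update"},
    "update": {"log", "helper"},
    "helper": {"update"},
    "log": set(),
}
closure = reachable_functions(calls)
```

Because the set only ever grows and is bounded by the number of functions in the program, the iteration terminates even when the call graph contains recursion.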

in this embodiment, for step 6, the basic blocks of each function are traversed and the entry/exit basic blocks loop_begin / loop_end of all loops contained in the function are collected. FIG. 2 shows three different loops and their corresponding IR: for loops, while loops and do-while loops. Each loop has a corresponding entry basic block and exit basic block, from which that particular loop can be uniquely determined.

in this embodiment, for step 7, for each loop identified by the entry and exit basic blocks collected in step 6, the instructions of all basic blocks contained in the loop are traversed to check whether they read or write a global hotspot variable G_hot read in step 4; every global hotspot variable that is read or written is marked, generating the set of global hotspot variables read and written in the loop.
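The scan over the loop body can be sketched as below. This is a simplified Python model, not LLVM code: each instruction is represented as a hypothetical (opcode, operand) pair, and only `load` and `store` are treated as reads and writes.

```python
def hot_globals_in_loop(loop_blocks, hot_globals):
    """Collect the hotspot globals that a loop reads or writes.

    loop_blocks: list of basic blocks; each block is a list of
                 (opcode, operand) pairs (a simplified stand-in for IR).
    hot_globals: set of global hotspot variable names read in step 4.
    """
    accessed = set()
    for block in loop_blocks:
        for opcode, operand in block:
            if opcode in ("load", "store") and operand in hot_globals:
                accessed.add(operand)
    return accessed


# Hypothetical loop body consisting of two basic blocks.
loop_body = [
    [("load", "g_count"), ("add", None), ("store", "g_count")],
    [("load", "g_cold"), ("store", "tmp")],
]
loop_hot = hot_globals_in_loop(loop_body, {"g_count", "g_total"})
```

Only `g_count` ends up in the result: `g_cold` is accessed but is not a hotspot variable, and `g_total` is a hotspot variable that this loop never touches.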

in this embodiment, for step 8, consider each function f ∈ set_F called directly or indirectly by the function F to which the loop entry/exit basic blocks belong, where set_F is the set of all functions reachable from F obtained in step 5.2. If a call to f occurs inside the loop uniquely determined by the loop entry/exit basic blocks, and f references or modifies a global hotspot variable G_hot from the set of loop-accessed hotspot variables collected in step 7, then G_hot escapes. If an escaping global hotspot variable were replaced by an array, the value of the array would have to be assigned back to the global hotspot variable before F calls f so that f uses the correct value, which would introduce an extra write operation in the loop body. Escaping global hotspot variables are therefore not converted to arrays: they are skipped and the next global hotspot variable is processed. Otherwise, the subsequent conversion is performed.
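A minimal Python sketch of this escape check follows. It assumes that the functions called inside the loop, the reachability sets of step 5.2 and the globals accessed by each function are already available as plain sets and dictionaries; all names are hypothetical, and treating every function reachable from an in-loop call as "called inside the loop" is an interpretation of the text rather than something it states explicitly.

```python
def non_escaping_hot_globals(loop_hot_vars, calls_in_loop, reachable,
                             func_accesses):
    """Filter out the hotspot globals that escape through calls in the loop.

    loop_hot_vars: hotspot globals read or written in the loop (step 7).
    calls_in_loop: functions called directly inside the loop body.
    reachable: {function: functions reachable from it} (step 5.2).
    func_accesses: {function: globals that it references or modifies}.
    """
    # Every function that may execute because of a call made in the loop.
    may_run = set()
    for callee in calls_in_loop:
        may_run.add(callee)
        may_run |= reachable.get(callee, set())
    # Globals touched by any of those functions escape the loop.
    touched = set()
    for func in may_run:
        touched |= func_accesses.get(func, set())
    return {var for var in loop_hot_vars if var not in touched}


# Hypothetical inputs: the loop calls update, which in turn calls log.
safe = non_escaping_hot_globals(
    loop_hot_vars={"g_count", "g_total"},
    calls_in_loop={"update"},
    reachable={"update": {"log"}},
    func_accesses={"update": set(), "log": {"g_total"}},
)
```

In this example `g_total` escapes because `log`, reached indirectly from the loop, accesses it, so only `g_count` remains eligible for conversion.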

in this embodiment, the specific implementation of step 9 includes the following sub-steps:

step 9.1: at the end of the predecessor basic block of the loop entry basic block loop_begin, inserting LLVM IR instructions that create the array array[size_i] which replaces the global variable; the size of the array is the array size computed in step 2 for the corresponding global hotspot variable, and element 0 of the array, array[0], is initialized to the value of the global hotspot variable;

step 9.2: traversing the instructions of all basic blocks contained in the loop and, if the operand of a read/write instruction is the global hotspot variable, replacing that operand with the array generated in step 9.1;

step 9.3: at the end of the loop exit basic block loop_end, inserting instructions that assign the value last written to the array to the global hotspot variable.
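The effect of steps 9.1 to 9.3 on the per-cell write count can be illustrated with a small Python simulation. This is only a sketch: the function name is hypothetical, and the round-robin choice of the array element for each write is an assumption, since the text does not state how the element index is selected.

```python
def simulate_loop(values, size=None):
    """Simulate a loop that writes each item of `values` to a hot global.

    Returns (final value of the global, {memory cell: write count}).
    size=None models the original program; otherwise the global is
    replaced inside the loop by an array with `size` elements.
    """
    writes = {"global": 0}
    g = 0
    if size is None:
        for v in values:
            g = v  # every iteration writes the same cell
            writes["global"] += 1
        return g, writes

    array = [0] * size
    for i in range(size):
        writes[f"array[{i}]"] = 0
    array[0] = g  # step 9.1: array[0] starts with the global's value
    writes["array[0]"] += 1
    idx = 0
    for v in values:  # step 9.2: loop writes go to the array
        idx = (idx + 1) % size  # assumption: round-robin element choice
        array[idx] = v
        writes[f"array[{idx}]"] += 1
    g = array[idx]  # step 9.3: copy the last written value back
    writes["global"] += 1
    return g, writes


data = list(range(1, 13))  # twelve writes inside the loop
baseline = simulate_loop(data)
converted = simulate_loop(data, size=4)
```

Under these assumptions both variants leave the same final value in the global variable, but the most heavily written cell drops from 12 writes to 4, which is the kind of write-peak reduction the method aims for.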

Step 10: linking the converted intermediate representation assembly into a new executable file;

in this embodiment, the transformed IR file can be converted into an assembly file by the LLVM llc command, and a new executable file with the loop-to-array conversion applied is then generated by assembling and linking.

The invention provides a wear-leveling compilation method for the global data area of phase change memory (Phase Change Memory, PCM) based on loop-to-array conversion (loop2array). First, the loop2array algorithm collects the hotspot variables of the global area. Second, it performs interprocedural analysis to exclude the cases in which the conversion to an array cannot be applied. It then inserts instructions that create an array into the predecessor basic block of the loop entry, replaces reads and writes of the global variable inside the loop body with reads and writes of the generated array, and inserts instructions at the loop exit that assign the value last written to the array to the global variable. Finally, the optimized intermediate representation is assembled and linked into an executable file. The invention proposes a novel loop-to-array wear-leveling compilation method for the global area, which can effectively reduce the write peak of the global data area. The loop2array algorithm does not convert all global variables but only global hotspot variables, and it does not need to gather a large amount of memory access information because it is implemented as an LLVM intermediate representation transformation Pass, so it performs better than existing methods. The algorithm also includes a hotspot variable detection method, an interprocedural analysis method and an escape analysis method for global variables, which provide the basic algorithmic support for the loop-to-array conversion.

It should be understood that parts of the specification not specifically set forth herein are all prior art.

It should be understood that the foregoing description of the preferred embodiments is relatively detailed and is not to be regarded as limiting the scope of protection of the invention, which is defined by the appended claims; those skilled in the art can make substitutions or modifications without departing from the scope of the invention as set forth in the appended claims.

Claims

1. A wear-leveling compiling method of a cyclic rotation group based on a phase change memory is characterized by comprising the following steps:

the specific implementation of the step 2 comprises the following sub-steps:

Step 2.2: identifying global hotspot variables: if the ratio W_i / W_avg of the write count W_i of global variable G_i to the average write count W_avg obtained in step 2.1 is greater than or equal to a threshold N, the global variable is considered a global hotspot variable G_hot;

the specific implementation comprises the following substeps:

step 5.1: traversing each function F_a and checking whether it contains function call instructions; if so, obtaining the operand of each call instruction, i.e. the called function F_b; after the traversal, the set of functions directly called by F_a is obtained, which gives a one-to-one correspondence between each function and the set of functions it calls directly: F → set_F;

Step 5.2: for the function F of step 5.1, traversing each function f in set_F and obtaining set_f, the set of functions directly called by f, then updating set_F = set_F ∪ set_f; when the size of set_F no longer changes, the resulting set_F is the set of all functions called directly or indirectly by F, i.e. the set of all functions reachable from F;

wherein, for each function f ∈ set_F called directly or indirectly by the function F to which the loop entry/exit basic blocks belong, where set_F is the set of all functions reachable from F obtained in step 5.2: if a call to f occurs inside the loop uniquely determined by the loop entry/exit basic blocks, and f references or modifies a global hotspot variable G_hot from the set of loop-accessed hotspot variables collected in step 7, then G_hot escapes; if an escaping global hotspot variable were replaced by an array, the value of the array would have to be assigned back to the global hotspot variable before F calls f so that f uses the correct value, which would introduce an extra write operation in the loop body; escaping global hotspot variables are therefore not converted to arrays, and they are skipped so that the next global hotspot variable is processed directly; otherwise, the subsequent conversion is performed;

2. The wear-leveling compilation method of a cyclic rotation group based on a phase change memory according to claim 1, wherein the specific implementation of step 1 comprises the following sub-steps:

step 1.1: acquiring symbol table information of an ELF executable file, and collecting global variable information of a source program, wherein the global variable information comprises a variable allocated memory address, type, size and identifier;

3. The wear-leveling compilation method of a cyclic rotation group based on a phase change memory according to claim 1, wherein: in step 3, the LLVM intermediate representation source.ll is generated for the input source program source.c using the command clang -emit-llvm -S <input files>.

4. The wear-leveling compilation method of a cyclic rotation group based on a phase change memory according to claim 1, wherein: in step 4, the global hotspot variable information G_hot produced in step 2 is read from disk by an LLVM transformation Pass so that the global hotspot variables can be detected, in preparation for the loop-to-array conversion.

5. The wear-leveling compilation method of a cyclic rotation group based on a phase change memory according to claim 1, wherein: in step 6, for each function, its basic blocks are traversed and the entry/exit basic blocks loop_begin / loop_end of all loops contained in the function are collected.

6. The wear-leveling compilation method of a cyclic rotation group based on a phase change memory according to claim 1, wherein: in step 7, for each loop identified by the entry and exit basic blocks collected in step 6, the instructions of all basic blocks contained in the loop are traversed to check whether they read or write a global hotspot variable G_hot read in step 4, and every global hotspot variable that is read or written is marked, generating the set of global hotspot variables read and written in the loop.

7. The wear-leveling compilation method of a cyclic rotation group based on a phase change memory according to claim 1, wherein the specific implementation of step 9 comprises the following sub-steps:

step 9.1: at the end of the predecessor basic block of the loop entry basic block loop_begin, inserting LLVM IR instructions that create the array array[size_i] which replaces the global variable; the size of the array is the array size computed in step 2 for the corresponding global hotspot variable, and element 0 of the array, array[0], is initialized to the value of the global hotspot variable;

8. The wear-leveling compilation method of the cyclic rotation group based on the phase change memory according to any one of claims 1 to 7, wherein: in step 10, the transformed IR file is converted into an assembly file by the LLVM llc command, and a new executable file with the loop-to-array conversion applied is then generated by assembling and linking.