CN111435309A

CN111435309A - Register allocation optimization implementation method

Info

Publication number: CN111435309A
Application number: CN201910025726.3A
Authority: CN
Inventors: 代向东; 刘贵山; 常涛; 栗志国; 岑辉林; 董军平
Original assignee: China Standard Software Co Ltd
Current assignee: China Standard Software Co Ltd
Priority date: 2019-01-11
Filing date: 2019-01-11
Publication date: 2020-07-21

Abstract

The invention relates to a register allocation optimization implementation method comprising the following steps: step S1: coloring the integer registers with the Chaitin-Briggs optimistic graph coloring algorithm; step S2: at each point where a variable needs to be overflowed (spilled), saving the corresponding integer register into a buffer register with a save instruction; step S3: coloring the conflict graph of the variables to be overflowed with the Chaitin-Briggs optimistic graph coloring algorithm; step S4: building a conflict graph for the overflowed variables and allocating buffer registers to them in priority order; step S5: overflowing into memory those variables that were not allocated a buffer register. By using the buffer registers of a modern processor, the invention provides an efficient overflow strategy in which the limited buffer registers are allocated optimally among the integer variables that would otherwise be overflowed into memory, thereby improving code execution efficiency and reducing memory access overhead.

Description

Register allocation optimization implementation method

Technical Field

The invention relates to the technical field of data allocation optimization, in particular to a register allocation optimization implementation method.

Background

Registers are a small set of high-speed storage units inside a CPU. Their limited number and their speed relative to ordinary memory make them a critical resource on most machine architectures, so register allocation is a stage of the compiler back end that must be handled efficiently. Register allocation is the process by which a compiler assigns the values present in a program to the limited physical registers of the processor, and register allocation algorithms play an important role in compiler optimization. Modern processors have a multi-level storage hierarchy in which speed is inversely related to capacity. Of all the levels, registers are the fastest and the smallest, so allocating them sensibly reduces register pressure, reduces the memory access overhead of key variables, and improves program performance.

Register allocation determines which values (variables, temporaries, constants) are held in registers during program execution. It is particularly important on RISC architectures, where almost all operations other than data transfers are performed on registers; on modern CISC architectures, register-to-register operations are likewise faster than the corresponding memory access operations.

The goal of register allocation is to reduce register overflow through a sensible allocation method. Even with a good optimization algorithm, the limited number of registers makes some overflow unavoidable, and the key to optimization is minimizing its cost. An overflow cost is established for each live variable from heuristic information; when overflow is needed, variables with low cost are overflowed first, while registers are given to variables with high cost as far as possible. This minimizes the overall cost of register allocation.
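The cost-driven choice described above can be sketched as follows. This is only an illustration: the weighting of each definition or use by 10 to the power of its loop depth is a common textbook heuristic assumed here, not something the text specifies, and the names overflow_cost and pick_overflow_candidate are hypothetical.

```python
def overflow_cost(access_depths):
    """Estimate the cost of overflowing one variable.

    access_depths holds one loop-nesting depth per definition or use.
    Assumption: each access is weighted by 10 ** depth, so accesses
    inside deeper loops count as more expensive to overflow.
    """
    return sum(10 ** depth for depth in access_depths)


def pick_overflow_candidate(costs):
    """Return the variable with the lowest estimated overflow cost."""
    return min(costs, key=costs.get)


costs = {
    "a": overflow_cost([0, 2, 2]),  # one access outside loops, two at depth 2
    "b": overflow_cost([0, 0]),     # two accesses outside any loop
    "c": overflow_cost([1, 1, 1]),  # three accesses at depth 1
}
victim = pick_overflow_candidate(costs)
```

Under these assumed weights, b is the cheapest variable to overflow, so it is chosen first and the registers stay with a and c.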

Register allocation is an ever-present problem in compilers because it is a phase the compiler must go through before emitting assembly code. Both the performance and the size of the generated code depend on the quality of the register allocation algorithm. In pursuit of maximum performance, many compilers have invested heavily in register allocation, and a good register allocator can increase program execution speed by over 250%, albeit at the price of very complex algorithms.

Existing register allocation algorithms include the Chaitin-Briggs graph coloring algorithm, the linear scan algorithm, integer linear programming formulations of the register allocation problem, the priority-based register allocation algorithm proposed by Chow et al., the composite register allocation algorithm proposed by Nickerson, and others. GCC currently supports CB (the Chaitin-Briggs graph coloring algorithm) and priority (the priority-based register allocation algorithm), with CB used by default. The present invention proposes a priority-based coloring strategy for the allocation of buffer registers.

After the coloring process is completed, variables that were not allocated a hardware register must be overflowed. Overflow, however, involves memory accesses, which increase program execution time. If register allocation can be optimized to reduce the amount of overflow, program performance improves.

The prior art approaches the problem only from the side of the optimization algorithm. Modern CPU architectures, however, may provide buffer registers (an on-chip cache). This is hardware dedicated to binary translation that behaves like memory but is accessed through only two instructions, save and restore. Its main difference from memory is a much shorter access latency, generally only 1 to 2 clock cycles, so overflowing variables into it can bring considerable benefit. However, buffer registers are few in number and cannot be used as freely as memory, so distributing the overflowed variables among them sensibly is particularly important.

The buffer sits between memory and the registers. Because the CPU has relatively few registers, some variables can be held temporarily in the buffer during program execution, which avoids a large number of memory I/O operations and can greatly improve program performance.

Disclosure of Invention

To provide an architecture-based optimization scheme for register allocation in an industrial control system, the invention provides a register allocation optimization implementation method comprising the following steps:

step S1: coloring the integer register by a Chaitin-Briggs optimistic graph coloring algorithm;

step S2: storing the corresponding integer register into a buffer register through a save instruction at a place where the variable needs to be overflowed;

step S3: coloring a conflict graph of variables needing overflowing through a Chaitin-Briggs optimistic graph coloring algorithm;

step S4: establishing a conflict graph for the overflowing variables, and distributing buffer registers for the overflowing variables according to the priority;

step S5: overflow variables not allocated to the buffer register are overflowed into memory.
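Steps S1 and S3 both rely on Chaitin-Briggs optimistic coloring. The following Python sketch illustrates the general idea under simplifying assumptions: the conflict graph is a symmetric dictionary of neighbour sets, there is no coalescing or cost weighting, and the function name optimistic_color is hypothetical. It is not the implementation used by any particular compiler.

```python
def optimistic_color(graph, k):
    """Color a conflict graph with k colors, optimistically.

    graph maps each node to the set of nodes it conflicts with
    (assumed symmetric). Returns (assignment, overflowed).
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        # Simplify: a node with fewer than k neighbours is always colorable.
        easy = [n for n in sorted(work) if len(work[n]) < k]
        # Optimistic step: when no such node exists, push a high-degree
        # node anyway and defer the overflow decision to the select phase.
        node = easy[0] if easy else max(sorted(work), key=lambda n: len(work[n]))
        stack.append(node)
        for m in work.pop(node):
            work[m].discard(node)

    assignment, overflowed = {}, []
    while stack:
        # Select: pop nodes in reverse order and give each a free color.
        node = stack.pop()
        used = {assignment[m] for m in graph[node] if m in assignment}
        free = [c for c in range(k) if c not in used]
        if free:
            assignment[node] = free[0]
        else:
            overflowed.append(node)
    return assignment, overflowed


# A 4-cycle: every node has degree 2, yet two colors are enough.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
cycle_colors, cycle_overflow = optimistic_color(cycle, 2)

# A triangle cannot be colored with two colors, so one node overflows.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
tri_colors, tri_overflow = optimistic_color(triangle, 2)
```

The 4-cycle shows why the optimistic variant matters: a purely pessimistic simplify phase would mark a node for overflow as soon as no node has fewer than k neighbours, although a valid coloring exists.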

In step S4, when the overflow variable to which the buffer register is allocated needs to be restored, the corresponding overflow variable is restored from the buffer register by a restore instruction.

In step S4, when the overflowed variable is an integer variable that is live across a procedure call, the buffer registers are divided into caller-reserved registers and callee-reserved registers, wherein the callee-reserved registers are dedicated to saving and restoring overflowed integer variables that are live across calls.

For every callee-reserved register, on entry to a function body the value of that register is saved at the head of the function, and the value is restored to the same register when the function finishes.

In step S4, for every overflowed variable that has been allocated a buffer register, a save instruction is added after each of its definitions (assignments) and a restore instruction is added before each of its uses.
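The rewriting rule in the preceding paragraph can be sketched in Python as below. The instruction model is an assumption made for illustration: each instruction is a (destination, sources) pair, the textual output format is invented, and rewrite_for_buffer is a hypothetical helper name.

```python
def rewrite_for_buffer(instrs, var, buf):
    """Insert SAVE after each definition of var and RESTORE before each use.

    instrs is a list of (dest, sources) pairs; buf is the number of the
    buffer register assigned to var. Returns the new instruction list as text.
    """
    out = []
    for dest, srcs in instrs:
        if var in srcs:
            # The value lives in the buffer register, so reload it before the use.
            out.append(f"RESTORE #{buf}, {var}")
        out.append(f"{dest} = op({', '.join(srcs)})")
        if dest == var:
            # A new value was just defined, so copy it to the buffer register.
            out.append(f"SAVE {var}, #{buf}")
    return out


program = [("a", ["x"]), ("y", ["a", "z"])]
rewritten = rewrite_for_buffer(program, "a", 0)
```

In this small program the variable a is defined once and used once, so exactly one SAVE and one RESTORE are inserted, taking the places that a store and a load would occupy if a were overflowed to memory.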

In step S4, for the temporary register among the integer registers, a save instruction is added after its definition and a restore instruction is added before its use; in addition, a tmp store instruction is added after the restore instruction and a tmp load instruction is added before the save instruction.

The register allocation optimization implementation method provided by the invention utilizes the buffer register of the modern processor to provide an efficient overflow strategy, so that the limited buffer register achieves an optimal allocation among integer variables needing to be overflowed into the memory, thereby improving the code execution efficiency and reducing the memory access overhead.

Drawings

FIG. 1 is an implementation flowchart of the register allocation optimization implementation method provided by the invention.

Detailed Description

In order to further understand the technical scheme and the advantages of the present invention, the following detailed description of the technical scheme and the advantages thereof is provided in conjunction with the accompanying drawings.

The invention approaches register allocation from the standpoint of the machine architecture and designs an efficient overflow strategy in which the limited buffer registers are allocated optimally among the integer variables that would otherwise be overflowed into memory, thereby improving code execution efficiency. Specifically, program execution is sped up by keeping program variables in registers as far as possible.

If the machine architecture implements a buffering mechanism, memory access overhead can be reduced. Using the buffer registers provided by the architecture, a variable is overflowed to a buffer register in preference, and to memory only once the buffer registers are exhausted. Buffer registers, however, cannot be accessed arbitrarily like general-purpose registers; access to them is limited to save and restore operations, performed by the save and restore instructions respectively, in the following format:

SAVE Ra.rq, #b.ib        Buffer[#b] <- Rav
RESTORE #b.ib, Rc.wq     Rc <- Buffer[#b]

Here, Ra in the save instruction format denotes the integer register to be saved and #b.ib denotes the corresponding buffer register; that is, the content of register Ra is saved into the corresponding unit of the CPU's internal buffer, which provides a fast way to preserve an integer register. Rc in the restore instruction format denotes the integer register to be restored and #b.ib denotes the corresponding buffer register; that is, the content of the corresponding unit of the CPU's internal buffer is restored into register Rc.
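The semantics of the two instructions can be modelled with a few lines of Python. This is a toy model for illustration only: the class name Machine, the register names, and the number of buffer units are assumptions, not details of any real processor.

```python
class Machine:
    """Toy model of integer registers plus a small on-chip buffer."""

    def __init__(self, num_buffers):
        self.regs = {}                    # integer registers, by name
        self.buffer = [0] * num_buffers   # buffer register units

    def save(self, ra, b):
        """SAVE Ra, #b: Buffer[#b] <- value of Ra."""
        self.buffer[b] = self.regs[ra]

    def restore(self, b, rc):
        """RESTORE #b, Rc: Rc <- Buffer[#b]."""
        self.regs[rc] = self.buffer[b]


m = Machine(num_buffers=4)
m.regs["r1"] = 42
m.save("r1", 2)       # park the value of r1 in buffer unit 2
m.regs["r1"] = 0      # r1 is now free to hold another value
m.restore(2, "r3")    # bring the parked value back, here into r3
```

As in the instruction format above, the value does not have to return to the register it came from: RESTORE names its own destination register.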

Fig. 1 is a flowchart illustrating an implementation of a register allocation optimization implementation method provided by the present invention, and as shown in fig. 1, the register allocation optimization implementation method provided by the present invention mainly includes the following steps based on the above-mentioned instructions and functions of the buffer register:

1. When integer register overflow is handled, the Chaitin-Briggs optimistic graph coloring algorithm is first run to color the conflict graph.

2. Wherever overflow is needed, the integer register is saved to a buffer register with the save instruction.

3. A conflict graph is built, costs are calculated, and the conflict graph is colored with the Chaitin-Briggs optimistic graph coloring algorithm. Buffer registers serve the variables that must be overflowed during register allocation, so allocating them is similar to register allocation itself: the integer variables overflowed in the first pass go through "register allocation" again, except that this time they are allocated buffer registers. The Chaitin-Briggs algorithm flow may also be used here: a conflict graph is built for the overflowed integer variables, a priority is calculated for each overflowed variable, and the variables are sorted by priority. Higher-priority variables get buffer registers first. The nodes of the conflict graph are colored with buffer registers.

4. Where a value needs to be restored, it is restored from the buffer register with the restore instruction. This eliminates a large number of store and load instructions and so reduces execution time. Compared with memory, buffer registers are subject to hardware limitations, and their limited number means they cannot be used as freely as memory. An efficient overflow strategy is therefore needed so that the limited buffer registers are allocated optimally among the integer variables that would otherwise be overflowed into memory, thereby improving code execution efficiency.

5. For an integer variable that is live across a procedure call, the buffer registers are divided into two parts, a caller-reserved part and a callee-reserved part; the callee-reserved buffer registers are used to save and restore integer variables that are live across calls. When all the buffer registers are used up, the remaining integer variables are overflowed directly into memory.
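The priority-ordered allocation of buffer registers, with memory as the fallback, can be sketched as follows. This is a simplified illustration: it uses a plain greedy pass in priority order rather than a full Chaitin-Briggs run, ignores the caller-reserved and callee-reserved split, and the names allocate_buffers, conflicts and priority are hypothetical.

```python
def allocate_buffers(conflicts, priority, num_buffers):
    """Give buffer registers to overflowed variables in priority order.

    conflicts maps each overflowed variable to the set of overflowed
    variables it conflicts with. Returns a dict mapping each variable to
    a buffer register number, or to None when it must go to memory.
    """
    placement = {}
    # Highest priority first; ties are broken by name for determinism.
    for var in sorted(conflicts, key=lambda v: (-priority[v], v)):
        taken = {placement[n] for n in conflicts[var]
                 if placement.get(n) is not None}
        free = [b for b in range(num_buffers) if b not in taken]
        # No free buffer register left among the neighbours: use memory.
        placement[var] = free[0] if free else None
    return placement


conflicts = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": set(),
}
priority = {"a": 30, "b": 20, "c": 10, "d": 5}
placement = allocate_buffers(conflicts, priority, num_buffers=2)
```

With two buffer registers, a and b receive them first; c conflicts with both and falls back to memory, while d conflicts with nothing and can share buffer register 0 despite having the lowest priority.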

The register allocation optimization implementation method of the invention comprises the following preferred implementation modes:

1. For a variable that has been allocated a buffer register, a save instruction is added after each of its definitions and a restore instruction is added before each of its uses. This is equivalent to replacing the store and load instructions with save and restore instructions, respectively. The original active interval (live range) can still be split into many local sub-intervals, each confined to a single basic block, which are finally handed to do_load (reloading, or local register allocation) for processing. During register allocation, a variable that is live across a procedure call is assigned a callee-reserved register. For every callee-reserved register, the value in the register is saved at the head of each function body and restored to the register when the function finishes. The callee-reserved register can therefore be used by other variables inside the function body while the correctness of its value is guaranteed. Similarly, if an integer variable to be overflowed is live across a procedure call, the buffer register allocated to it must also be saved and restored as necessary.

For example, the variable a is overflowed in the function func and is allocated the buffer register br. In the function call(), which func invokes, the variable b is also overflowed and is also allocated br. When the value of a is later recovered from br, what is read is no longer the value of a originally stored there but the value of b. The value of br must therefore be saved at the head of call() and restored at its tail so that the correctness of the program is guaranteed.
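The clobbering problem in this example can be reproduced with a small Python simulation. Everything here is illustrative: the buffer register is modelled as a dictionary entry, the temporary is an ordinary local variable, and the function run and its flag preserve_br are hypothetical names.

```python
def run(preserve_br):
    """Simulate func overflowing a into br and then invoking call()."""
    buffer = {"br": None}

    def call():
        if preserve_br:
            tmp = buffer["br"]           # head of call(): RESTORE br into tmp
        buffer["br"] = "value of b"      # call() overflows b into br
        if preserve_br:
            buffer["br"] = tmp           # tail of call(): SAVE tmp back into br

    buffer["br"] = "value of a"          # func overflows a into br
    call()
    return buffer["br"]                  # func recovers a from br


without_preserve = run(preserve_br=False)
with_preserve = run(preserve_br=True)
```

Without the save and restore around the body of call(), func reads back the value of b; with them, it reads back the value of a as intended.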

2. A restore instruction and a save instruction are added at the head and tail of call() respectively, so that the value of br is held temporarily in a temporary register tmp. For tmp, the restore instruction is a definition and the save instruction is a use, which amounts to creating an active interval that spans the whole function. When register allocation is performed on call(), the active interval of tmp intersects almost every other active interval. Because tmp has only one definition and one use, located at the head and tail of the function respectively, its execution frequency is certainly lower than that of the other active intervals, so its priority value is very small. In other words, during priority-based register allocation tmp has very little chance of being allocated a register and is almost certain to be overflowed. If call() is itself a leaf procedure (one that makes no further calls), tmp may be overflowed into a caller-reserved buffer register. Otherwise, if tmp is itself an active interval that crosses a procedure call, then, since it holds the value of a callee-reserved buffer register, it would be allocated a callee-reserved buffer register br, and that br would in turn need its own save and restore operations. This creates a new temporary register like tmp, and the process would eventually cause every callee-reserved buffer register to be saved and restored once, which is clearly impractical. In this case, therefore, tmp is overflowed directly into memory; that is, a tmp store instruction is added after the restore instruction and a tmp load instruction is added before the save instruction.
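The resulting instruction sequence around the body of call() can be sketched as below. The STORE and LOAD mnemonics, the stack slot notation, and the helper name wrap_callee are assumptions made for illustration; only the ordering of the four added instructions comes from the text.

```python
def wrap_callee(body, buf, slot):
    """Wrap a function body so a callee-reserved buffer register is preserved.

    body is a list of instruction strings, buf the buffer register number,
    and slot a (hypothetical) memory location used to overflow tmp.
    """
    prologue = [
        f"RESTORE #{buf}, tmp",   # read the incoming buffer value into tmp
        f"STORE tmp, [{slot}]",   # overflow tmp to memory right away
    ]
    epilogue = [
        f"LOAD tmp, [{slot}]",    # reload tmp just before it is needed
        f"SAVE tmp, #{buf}",      # put the original value back in the buffer
    ]
    return prologue + list(body) + epilogue


wrapped = wrap_callee(["SAVE b, #0", "RESTORE #0, b"], buf=0, slot="sp+8")
```

Because tmp is stored immediately after its definition and loaded immediately before its use, its active interval no longer spans the whole function, and no further buffer register is needed to hold it.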

In do_load (reloading, or local register allocation), buffer registers may be used as well: before variables are overflowed into memory, an attempt is made to allocate buffer registers for them.

In summary, using buffer registers during register allocation greatly improves the allocation result, reduces the amount of overflow to memory, and improves the performance of the code the compiler generates. When designing a system that is to be optimized in this way, adding buffer registers to the machine architecture can therefore be considered in order to improve the result of register allocation.

The register allocation optimization implementation method provided by the invention uses the buffer registers provided by the architecture: when variables overflow, they are overflowed to the buffer registers in preference, and to memory once the buffer registers are exhausted. The invention designs an efficient overflow strategy in which the limited buffer registers are allocated optimally among the integer variables that would otherwise be overflowed into memory, thereby improving code execution efficiency and reducing memory access overhead.

In the present invention, "RISC" refers to a reduced instruction set computer; the full name is Reduced Instruction Set Computer.

In the present invention, "CISC" refers to a complex instruction set computer; the full name is Complex Instruction Set Computer.

Although the present invention has been described with reference to the preferred embodiments, it should be understood that the scope of the present invention is not limited thereto, and those skilled in the art will appreciate that various changes and modifications can be made without departing from the spirit and scope of the present invention.

Claims

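1. A register allocation optimization implementation method, characterized by comprising the following steps: step S1: coloring the integer register by a Chaitin-Briggs optimistic graph coloring algorithm; step S2: storing the corresponding integer register into a buffer register through a save instruction at a place where the variable needs to be overflowed; step S3: coloring a conflict graph of variables needing overflowing through a Chaitin-Briggs optimistic graph coloring algorithm; step S4: establishing a conflict graph for the overflowing variables, and distributing buffer registers for the overflowing variables according to the priority; step S5: overflow variables not allocated to the buffer register are overflowed into memory.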

2. The register allocation optimization implementation of claim 1, wherein: in step S4, when the overflow variable to which the buffer register is allocated needs to be restored, the corresponding overflow variable is restored from the buffer register by the restore instruction.

3. The register allocation optimization implementation of claim 1, wherein: in step S4, when the overflowed variable is an integer variable that is live across a procedure call, the buffer registers are divided into caller-reserved registers and callee-reserved registers, wherein the callee-reserved registers are dedicated to saving and restoring overflowed integer variables that are live across calls.

4. The register allocation optimization implementation of claim 3, wherein: for every callee-reserved register, on entry to a function body the value of that register is saved at the head of the function, and the value is restored to the same register when the function finishes.

5. The register allocation optimization implementation of claim 1, wherein: in step S4, for every overflowed variable that has been allocated a buffer register, a save instruction is added after each of its definitions and a restore instruction is added before each of its uses.

6. The register allocation optimization implementation of claim 1, wherein: in step S4, for the temporary register among the integer registers, a save instruction is added after its definition and a restore instruction is added before its use, and a tmp store instruction is added after the restore instruction and a tmp load instruction is added before the save instruction.