CN112925566A - Method and device for establishing virtual register living interval and compiling method and device

Info

Publication number
CN112925566A
Authority
CN
China
Prior art keywords
type
register
overflow
interval
point number
Prior art date
Legal status
Granted
Application number
CN201911244068.3A
Other languages
Chinese (zh)
Other versions
CN112925566B (en)
Inventor
Inventor not disclosed
Current Assignee
Cambricon Technologies Corp Ltd
Original Assignee
Cambricon Technologies Corp Ltd
Priority date
Filing date
Publication date
Application filed by Cambricon Technologies Corp Ltd
Priority to CN201911244068.3A
Publication of CN112925566A
Application granted
Publication of CN112925566B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/441 Register allocation; Assignment of physical memory space to logical memory space

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The application provides a method and a device for establishing a virtual register living interval, a compiling method and a compiling device, and an electronic device. The electronic device may include: one or more processing units; and a storage unit for storing one or more programs which, when executed by the one or more processing units, cause the one or more processing units to implement a method according to an embodiment of the application.

Description

Method and device for establishing virtual register living interval and compiling method and device
Technical Field
The present application relates to the field of computing technologies, and in particular, to a method and an apparatus for establishing a virtual register living interval, a compiling method and an apparatus, an electronic device, and a computer-readable medium.
Background
In a chip architecture, there are typically several kinds of registers, such as scalar registers and vector registers. When compiling code for a chip with a given architecture, one of the tasks to be completed is register allocation. Register allocation increases the execution speed of a program by assigning as many of the program's virtual registers as possible to physical registers. Register allocation is one of the most important issues in compiler optimization, and good register allocation can greatly improve program execution speed (by over 250%). Register allocation involves building survival intervals or, for allocators based on graph coloring algorithms, a conflict graph.
The above information disclosed in this background section is only for enhancement of understanding of the background of the application and therefore it may contain information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
The application aims to provide a method and a device for allocating registers, a compiling method and a compiling device and electronic equipment, which can improve the code execution performance.
Other features and advantages of the present application will become apparent from the detailed description below, or may be learned in part by practice of the present application.
According to an aspect of the present application, a method for establishing a virtual register lifetime interval is provided, including:
obtaining an intermediate representation of a source file, the intermediate representation comprising a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register;
establishing ascending or descending program point numbers for the instructions in sequence;
representing a virtual register living interval by using the program point number, wherein the virtual register living interval comprises a living interval of the first type of virtual register and a living interval of the second type of virtual register;
when the first-type virtual register overflows, inserting a first-type overflow interval into a corresponding first-type overflow point, and establishing a first-type overflow point number for the first-type overflow interval;
when the second type virtual register overflows, inserting a second type overflow interval into a corresponding second type overflow point, establishing a second type overflow point number for the second type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other.
According to another aspect of the present application, there is provided a method for compiling, including:
obtaining an intermediate representation of a source file, the intermediate representation comprising a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register;
establishing ascending or descending program point numbers for the instructions in sequence;
representing a virtual register living interval by using the program point number, wherein the virtual register living interval comprises a living interval of the first type of virtual register and a living interval of the second type of virtual register;
allocating a first type register for the survival interval of the first type virtual register;
when the first-type virtual register overflows, inserting a first-type overflow interval into a corresponding first-type overflow point to add a first-type overflow pseudo instruction and a second-type virtual register, and establishing a first-type overflow point number for the first-type overflow interval;
allocating a second type register for the survival interval of the second type virtual register and allocating a second type register for the survival interval of the added second type virtual register;
when the second type virtual register overflows, inserting a second type overflow interval at a corresponding second type overflow point to add a second type overflow pseudo instruction, establishing a second type overflow point number for the second type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, the first attribute, the second attribute, and the third attribute are incompatible with each other, and
wherein the width of the first type register is greater than the width of the second type register.
According to another aspect of the present application, there is provided an electronic device including: one or more processing units; and a storage unit for storing one or more programs which, when executed by the one or more processing units, cause the one or more processing units to implement the aforementioned methods.
According to another aspect of the application, a computer-readable medium is provided, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the aforementioned method.
According to example embodiments of the application, the program point numbers are integrated into the commonly used survival interval analysis to form a unified survival interval representation, and this representation is combined with the analysis flow, so that compile time can be reduced while higher-quality object code is generated.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1A illustrates an example of register allocation by a graph coloring algorithm.
FIG. 1B shows an example of register allocation by a linear scan algorithm.
Fig. 2 illustrates a method of establishing a virtual register live interval according to an example embodiment of the present application.
Fig. 3 illustrates a compiling method according to an example embodiment of the present application.
Fig. 4 illustrates a method of allocating registers according to an example embodiment of the present application.
Fig. 5 illustrates a method of allocating registers according to another example embodiment of the present application.
Fig. 6 illustrates an apparatus for establishing a virtual register lifetime interval according to an exemplary embodiment of the present application.
Fig. 7 illustrates a compiling apparatus according to an example embodiment of the present application.
Fig. 8 shows a block diagram of an electronic device according to an example embodiment of the present application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the subject matter of the present application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the application.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. One skilled in the art will appreciate that the embodiments described herein can be combined with other embodiments.
Before describing the embodiments of the present application, some terms referred to in the embodiments of the present application are explained.
Compiling: compilation is the conversion of one intermediate representation (IR) into another IR. Compilation is divided into a front end, a middle end, and a back end. The object code produced by the back end is closely tied to the target architecture.
Scalar register: registers for performing scalar calculations.
Vector register: a register used for special purposes; it is generally wider than a scalar register and can be viewed as being made up of a fixed number of scalar registers.
Physical register: a register that actually exists in the target architecture; physical registers may include scalar registers and vector registers.
Intermediate Representation (IR): compilers are typically organized as a series of passes. As the compiler derives knowledge about the code being compiled, that knowledge must be carried from one pass to the next. The compiler therefore needs a representation for all the facts it derives about the program, called the intermediate representation, abbreviated IR. The IR may include pseudo instructions. IR is divided into multiple layers. IR at higher (more abstract) layers may have no notion of virtual registers, only variables, while IR at lower layers (more specific, closer to the target machine) may have no notion of variables, only virtual and physical registers.
Virtual register: a numbering of the operands in the source file during compilation. In the process of converting from the higher-layer IR to the lower-layer IR, the concept of "variable" is mapped to the concept of "virtual register". Therefore, in the present application, the terms "virtual register" and "variable" are used interchangeably with the same meaning, and those skilled in the art can understand their use from the context. There may be an unlimited number of virtual registers, but they are ultimately mapped to a limited number of physical registers on the target machine.
Register allocation: the process of mapping virtual registers to physical registers. In code compilation for a processor of a certain architecture, register allocation is a method for increasing the execution speed of a program by allocating virtual registers to registers (physical registers) as much as possible, i.e., making as many variables as possible reside in the registers. The register allocation process can be understood as establishing a many-to-one mapping relationship between virtual registers and physical registers. Register allocation is one of the most important tasks in compilation, and good register allocation can greatly improve the execution speed of a program.
Overflow (spill): when there are not enough physical registers during register allocation, the value in a register is placed in another address space (usually the stack) and is fetched back from there when it is needed. Generally, overflowing a register onto the stack is slow, whereas moving values between registers is faster. A simple overflow (spill) algorithm works as follows: for a variable whose survival interval has been selected to overflow, a store instruction is inserted after each definition of the variable, and a load instruction is inserted before each use of the variable. Obviously, this algorithm inserts a large number of redundant store and load instructions.
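The simple overflow algorithm described above can be sketched as follows. This is a minimal illustration only, not the method of the application: the `(dest, op, sources)` tuple format, the `load`/`store` operation names, and the stack slot name `slot0` are all assumptions introduced for the example.

```python
def naive_spill(instrs, var, slot="slot0"):
    """Insert a load before every use of `var` and a store after every definition."""
    out = []
    for dest, op, srcs in instrs:
        if var in srcs:
            out.append((var, "load", (slot,)))   # reload the value before it is used
        out.append((dest, op, srcs))
        if dest == var:
            out.append((slot, "store", (var,)))  # save the value right after it is defined
    return out

# Hypothetical three-instruction program in which `a` is chosen to overflow.
program = [
    ("a", "const", ()),
    ("b", "add", ("a", "a")),
    ("c", "mul", ("a", "b")),
]
spilled = naive_spill(program, "a")
```

Every definition and every using instruction of `a` gains a memory access, which is the redundancy the definition points out.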
Program point: in the IR, the positions on either side of an instruction are called program points; they can be represented and distinguished by numbers.
Survival interval: assuming that the instructions in the IR are numbered in sequence (e.g., in the order in which the pseudo instructions appear in the intermediate representation; these numbers are sometimes also referred to as program point numbers), the survival interval of a variable v is the interval [i, j] such that there is no instruction numbered j1 (j1 > j) at which v is active and no instruction numbered i1 (i1 < i) at which v is active. The technical solution of the present application is not limited to the conventional survival interval; it can also be applied to survival intervals in a stricter or optimized sense, such as a set of live ranges. A live range is, in the strict sense, the set of all program points at which the variable is active.
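The difference between a survival interval and a live range can be shown with a small sketch. The liveness table `live_at` and the function name are hypothetical, chosen only to illustrate the definition above.

```python
def survival_interval(live_at, var):
    """Return (i, j): the smallest and largest instruction numbers at which var is live."""
    points = sorted(n for n, live in live_at.items() if var in live)
    return (points[0], points[-1]) if points else None

# Hypothetical liveness facts: instruction number -> set of variables live there.
live_at = {1: {"a"}, 2: {"a", "b"}, 3: {"b"}, 4: {"a", "b"}}

interval_a = survival_interval(live_at, "a")
# The live range keeps the exact set of points, including the hole at 3.
range_a = sorted(n for n, live in live_at.items() if "a" in live)
```

Here `a` is not live at instruction 3, yet its survival interval still covers 3 because the interval only records the two end points.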
Data flow analysis: data flow analysis is a compile-time technique that collects semantic information from program code and determines, by algebraic means, where variables are defined and used (def-use). Survival information for variables, i.e., which variables are "alive" at a given program point, can be obtained by data flow analysis. The survival information indicates where survival intervals begin and end, so the survival interval of a variable can easily be established by scanning the program points of the intermediate representation. Data flow analysis reveals the run-time behavior of a program without actually running it, which helps in understanding the program. Dataflow analysis is used to solve problems with compilation optimization, program verification, debugging, testing, parallelism, vectorization, and chip-line programming environments.
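As a rough sketch of how survival information can be derived, the following backward pass computes the set of variables live before each instruction. It assumes straight-line code with no branches, and the `(dest, sources)` instruction format is a hypothetical simplification; a full data flow analysis would iterate over a control-flow graph.

```python
def liveness(instrs):
    """Backward liveness for straight-line code.

    Returns, for each instruction, the set of variables live just before it:
    live_in = uses | (live_out - definition).
    """
    live = set()
    live_in = [None] * len(instrs)
    for idx in range(len(instrs) - 1, -1, -1):
        dest, srcs = instrs[idx]
        live = (live - {dest}) | set(srcs)
        live_in[idx] = set(live)
    return live_in

# Hypothetical program: r1 = 1; r2 = 2; v2 = f(r1, r2); v1 = v2 + v1
instrs = [
    ("r1", ()),
    ("r2", ()),
    ("v2", ("r1", "r2")),
    ("v1", ("v2", "v1")),
]
live_in = liveness(instrs)
```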
Graph coloring algorithm: the register allocation problem is abstracted as a graph coloring problem. Each node in the graph represents the survival interval of a variable, which runs from the point where the variable is first defined (assigned) to its last use before it is assigned again. An edge between two nodes indicates that the two survival intervals overlap and therefore conflict (interfere) with each other. Generally, if two variables are live at the same time at some point in the function, they conflict and cannot occupy the same register. To use a graph coloring algorithm, a conflict graph is constructed in which nodes represent survival intervals and edges represent conflicts. If two nodes are not connected by an edge, they can be assigned to the same register. For example, FIG. 1A shows 4 registers (r0-r3) being allocated to 6 variables (a-f) without conflict.
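The idea can be illustrated with a simple greedy coloring heuristic. This is a sketch under stated assumptions: the conflict graph below is hypothetical (it is not the graph of FIG. 1A), and production allocators use more elaborate schemes with simplification and spilling phases.

```python
def color_graph(nodes, edges, k):
    """Greedily assign one of k registers to each node so that neighbors differ.

    Returns a dict node -> register index, or None if the heuristic runs out of registers.
    """
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    assignment = {}
    # Color the most constrained (highest-degree) nodes first.
    for n in sorted(nodes, key=lambda n: -len(adj[n])):
        used = {assignment[m] for m in adj[n] if m in assignment}
        free = [r for r in range(k) if r not in used]
        if not free:
            return None  # a real allocator would pick a node to overflow here
        assignment[n] = free[0]
    return assignment

# Hypothetical conflict graph: a, b, c conflict pairwise; d conflicts only with c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
coloring = color_graph(nodes, edges, 3)
```

Because `a` and `d` share no edge, they end up in the same register, while the triangle `a`, `b`, `c` cannot be colored with only two registers.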
Linear scanning algorithm: the linear scanning algorithm allocates registers based on the survival intervals of the variables. It keeps all variables sorted by the start of their survival intervals, and keeps the variables that currently occupy registers sorted by the end of their survival intervals. The number of overlapping intervals changes only at the start point or end point of some survival interval, so a register is considered for allocation each time the scan reaches a new survival interval. When the end point of one variable's survival interval is earlier than the start point of another's, the register allocated to the first variable can be released and reallocated to the other. FIG. 1B shows an example of allocating 2 registers to the survival intervals of 5 variables, where A-E denote the variables, the horizontal lines to the right of the variables denote their survival intervals, and the numbers 1-5 denote the steps in which the linear scanning algorithm allocates registers. In step 1, a register is allocated to the survival interval of variable A. In step 2, a register is allocated to the survival interval of variable B, so registers are now allocated to the survival intervals of variables A and B. In step 3, variable C overflows because no register is available. In step 4, the register allocated to A can be released, so it is reallocated to the survival interval of variable D; registers are now allocated to the survival intervals of variables D and B. In step 5, the register allocated to B can be released, so it is reallocated to the survival interval of variable E; registers are now allocated to the survival intervals of variables D and E.
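The five steps can be reproduced with a simplified linear scan. The interval end points below are assumptions chosen to match the narrative (the figure's actual values are not given in the text), and this sketch always overflows the newly scanned interval when no register is free, whereas common formulations may instead overflow the active interval that ends last.

```python
def linear_scan(intervals, num_regs):
    """Simplified linear scan over {name: (start, end)} with num_regs registers."""
    free = list(range(num_regs))
    active = []                 # (end, name) pairs, kept sorted by end point
    assignment, overflowed = {}, []
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers of intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(assignment[old])
        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
            active.sort()
        else:
            overflowed.append(name)
    return assignment, overflowed

# Hypothetical survival intervals for variables A-E.
intervals = {"A": (1, 4), "B": (2, 6), "C": (3, 5), "D": (5, 8), "E": (7, 9)}
assignment, overflowed = linear_scan(intervals, 2)
```

With these intervals, C overflows, D reuses the register released by A, and E reuses the register released by B.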
The technical solution of the present application is explained in detail below.
For multi-register architectures, such as computing architectures that include both vector and scalar registers, there are two common ways of register allocation, the first being to allocate the various registers individually, and the second being to allocate the various registers in a mixed manner.
The first approach is simple to implement and, by exploiting the similarity of the allocation procedures for different kinds of registers, can be generalized to all architectures. Taking an architecture comprising scalar registers and vector registers as an example, the main idea of the first approach is to treat all types of registers as scalar registers, based on that similarity, and to handle a register type differently only where special treatment is needed.
Compared with the first approach, the second approach can produce higher-performance code and require less compile time on some architectures. Still taking an architecture comprising scalar registers and vector registers as an example, this approach treats vector registers like scalar registers as long as no overflow occurs, and each kind is allocated within its own register class in the same way. When overflow occurs, virtual scalar registers are overflowed onto the stack, while virtual vector registers are first overflowed to scalar registers. Overflow code is then inserted and new virtual scalar registers to be allocated are introduced; the whole process forms a loop that runs until no virtual registers awaiting allocation remain in the queue.
With the first approach, relatively low-performance code is generated because the characteristics of the architecture are not taken into account. For example, when vector registers are insufficient they are generally overflowed directly to memory, which is much slower than first overflowing to scalar registers and only then considering whether to overflow to memory. In addition, the inventors found that, when different kinds of registers are analyzed separately during compilation, data flow analysis and the creation of survival intervals or a conflict graph are generally performed multiple times, which lengthens compile time.
The second approach, which allocates the various registers together, takes architectural characteristics into account and produces better code than the first approach in the case described above. However, the inventors found that when resources such as vector registers are not under particular pressure, either in general or at certain times, this approach still considers only overflowing vector registers to scalar registers and overlooks the fact that overflowing scalar registers to vector registers may also improve performance. The existing algorithm framework cannot support the latter overflow mode in any case. This is because, when vector and scalar register allocation are mixed, a scalar register may be overflowed into one field of a vector register and thereby force all fields of that vector register to be overflowed further later on. Thus, when a scalar register is to be overflowed into a vector register, it is not known which vector register is "safe" to overflow into. In addition, the inventors found that during compilation this approach introduces new expressions into the original intermediate representation as part of the analysis. This dynamic, incremental modification is better than complete reconstruction (the first approach), but it can still significantly increase compile time. Incremental modification also means that the analysis result cannot be fine-tuned, because the intermediate representation changes during the analysis and the final allocation is already fixed, and cannot be adjusted, once the analysis finishes. This places higher demands on the cost model and selection strategy of the main allocation algorithm if high-performance results are to be obtained. However, meeting such demands tends to further increase the convergence time of the algorithm and thus compile time.
Based on the above, the present application provides a method for establishing a virtual register living interval and a compiling method, which can reduce the compiling time of register allocation and improve the performance of compiling generated codes.
The following describes a method for establishing a virtual register lifetime interval and a compiling method according to an embodiment of the present application in detail.
Fig. 2 illustrates a method of establishing a virtual register live interval according to an example embodiment of the present application. The method according to the present embodiment may be used to establish a virtual register live interval when allocating registers of different widths.
Referring to fig. 2, at S201, an intermediate representation of a source file is obtained. The intermediate representation includes a first type of virtual register, a second type of virtual register, and instructions that act on the first type of virtual register and the second type of virtual register. For example, the first type of virtual register is a virtual vector register, and the second type of virtual register is a virtual scalar register.
At S203, increasing or decreasing program point numbers are established for the instructions in order.
According to an example embodiment, a uniform program point numbering is established for the instructions acting on the first type of virtual register and the second type of virtual register, so that survival intervals can later be established using the same representation logic. According to an exemplary embodiment, the increasing or decreasing program point numbers are established, for example, in the order in which the pseudo instructions appear in the intermediate representation, but the application is not limited thereto.
According to some embodiments, the program point number may be established as $n \cdot m^{P}$, where n is the sequence number of the program point, and m and P are predetermined natural numbers greater than 1.
For example, in the following intermediate representation, m is 2 and P is 2, so each program point number is a multiple of 4 (2 to the power of 2).
(28)r1=1 (32)
(20)r2=2 (24)
(12)v2=f(r1,r2) (16)
(4)v1=v2+v1 (8)
r1 and r2 are virtual registers of the second type (e.g., virtual scalar registers), v1 and v2 are virtual registers of the first type (e.g., virtual vector registers), and f is a function, indicating that the value on the left side is computed from the two virtual registers on the right side.
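The numbering in this listing can be reproduced with a short sketch. The function name and the `(left, text, right)` tuple layout are assumptions made for illustration; the sketch gives each instruction two program points and numbers them in descending order as $n \cdot m^{P}$, matching the listing above.

```python
def number_program_points(instructions, m=2, P=2):
    """Attach descending program point numbers n * m**P to both sides of each instruction."""
    step = m ** P
    total = 2 * len(instructions)        # two program points per instruction
    numbered = []
    for i, text in enumerate(instructions):
        right = (total - 2 * i) * step       # number printed to the right of the instruction
        left = (total - 2 * i - 1) * step    # number printed to the left of the instruction
        numbered.append((left, text, right))
    return numbered

numbered = number_program_points(["r1=1", "r2=2", "v2=f(r1,r2)", "v1=v2+v1"])
```

Because every number is a multiple of 4, the values strictly between two adjacent program points stay unused and remain available for points inserted later.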
In S204, a virtual register living interval is represented by the program point number, where the virtual register living interval includes a living interval of the first type virtual register and a living interval of the second type virtual register.
According to some embodiments, after the liveness information of the first-type and second-type virtual registers has been obtained, for example by data flow analysis, the survival intervals of the first-type and second-type virtual registers can easily be represented with program point numbers, for example by performing a single scan of the intermediate representation of the source file. For example, assuming that the instructions in the IR are numbered in sequence (e.g., in the order in which the pseudo instructions appear in the IR), i.e., given program point numbers, the survival interval [i, j] of a variable v can be obtained by traversing the program point numbers of the IR, such that there is no instruction numbered j1 (j1 > j) at which v is active and no instruction numbered i1 (i1 < i) at which v is active.
For example, in the previous example, the instruction at program point number 32 assigns the value 1 to r1; r1 becomes live at program point number 28 and is no longer live at program point number 12, the program point that follows program point number 16, so the survival interval of r1 is [28, 16]. Similarly, the survival interval of r2 is [20, 16], and the survival interval of v2 is [12, 8].
Thus, for example, when a scalar register needs to overflow, because the survival intervals are established with the same representation logic, a vector register that is safe to overflow into can easily be selected based on the consistent survival interval representation, so that the various registers are fully utilized, code performance is effectively improved, and program execution is accelerated.
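The intervals quoted above can be recomputed with a single scan of the numbered listing. The tuple layout `(left, dest, sources, right)` is a hypothetical encoding of the example, and the rule used here (start at the left number of the defining instruction, end at the right number of the last using instruction) is inferred from the three intervals given in the text.

```python
# The example listing, encoded as (left number, destination, sources, right number).
ir = [
    (28, "r1", (), 32),
    (20, "r2", (), 24),
    (12, "v2", ("r1", "r2"), 16),
    (4, "v1", ("v2", "v1"), 8),
]

def survival_intervals(ir):
    """One pass over the listing: record where each value is defined and last used."""
    start, end = {}, {}
    for left, dest, srcs, right in ir:
        for s in srcs:
            if s in start:          # only track values defined earlier in this listing
                end[s] = right      # a later use overwrites an earlier one
        start.setdefault(dest, left)
    return {v: (start[v], end[v]) for v in start if v in end}

intervals = survival_intervals(ir)
```

v1 is omitted from the result because the listing shows no use after its definition and no definition before its use.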
In S205, when the first-type virtual register overflows, a first-type overflow interval is inserted into a corresponding first-type overflow point, and a first-type overflow point number is established for the first-type overflow interval.
According to some embodiments, in the process of allocating the first type register and the second type register, if the first type virtual register overflows, a first type overflow interval is inserted into a corresponding first type overflow point to add a first type overflow dummy instruction and a second type virtual register, and a first type overflow point number is established for the first type overflow interval.
For example, according to some embodiments, the first type overflow point number differs from the preceding adjacent program point number by $m^K$, where K is a predetermined integer less than P.
For example, in the following intermediate representation, K is 1 and the first type overflow point number differs from the preceding adjacent program point number by $2^1$, i.e., the first type overflow point number is a multiple of $2^1$ but not a multiple of $2^2$.
(28)r1=1 (32)
(20)r2=2 (24)
(12)v2=f(r1,r2) (16)
(10)r3,r4=spill(v1) (10)
(4)v1=v2+v1 (8)
r3 and r4 are the added second type virtual registers (e.g., virtual scalar registers), and spill(v1) is the first type overflow dummy instruction.
Thus, according to an example embodiment, when the allocation of several kinds of registers has to be analyzed, regenerating the live range for each kind of register is undesirable, so a strategy similar to incremental modification is adopted: based on the current analysis results, only the affected survival intervals are partially modified, rather than all survival intervals being re-established. When the exponent-based numbering of intervals is used, the original survival intervals do not need to be changed when overflow code is inserted in a simulated manner.
For example, according to the hierarchical register allocation algorithm, the vector registers are found to be insufficient during the first-layer allocation and are overflowed to scalar registers. (In actual operation, no spill pseudo instruction is inserted at this stage; instead, overflow intervals for r3 and r4 are inserted symbolically, and these overflow intervals enter the second-layer scalar register allocation together with the survival intervals of r1 and r2.) Since every two adjacent program points differ by 4, the spill instruction is assumed to be inserted at a program point whose number, like 10, is a multiple of 2 but not a multiple of 4. It can be seen that generating such an overflow interval does not change original survival intervals such as those of r1 and r2, because the positions with these specific properties have been "reserved".
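The numbering of the reserved slot can be illustrated with a short sketch. The function name `first_type_overflow_point` and its parameters are hypothetical, and the sketch assumes that real program point numbers are multiples of m**P with m = 2 and P = 2, as in the example; it is not taken from the application itself.

```python
def first_type_overflow_point(prev_point, m=2, k=1, decreasing=True):
    """Number of a first-type overflow point placed right after prev_point.

    The number differs from the preceding adjacent program point number by
    m**k; the sign follows the direction of the numbering.
    """
    step = m ** k
    return prev_point - step if decreasing else prev_point + step


# Example from the text: program points are multiples of 4, K = 1,
# and the spill follows program point number 12 (decreasing numbering).
point = first_type_overflow_point(12)

# The slot is "reserved": it is a multiple of 2 but not of 4, so it can never
# coincide with a real program point and no existing interval has to change.
is_reserved_slot = point % 2 == 0 and point % 4 != 0
```

Under these assumptions `point` is 10, the number used for the spill(v1) line in the example.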
According to the example embodiment, the live interval of a new virtual scalar register generated when a vector register overflows to scalar registers does not need to be computed separately; it can be obtained directly from the live interval of the original vector register (for example, it may be set equal to the live interval of the original virtual vector register, or to the set formed by the overflow interval of the original virtual vector register and the adjacent program points, and so on; a person skilled in the art can adjust or optimize this according to the actual situation and the specific algorithm). The scalar registers produced by a vector register overflow can therefore be processed in the same way as the original scalar registers.
In S207, when the second type virtual register overflows, a second type overflow interval is inserted into a corresponding second type overflow point, and a second type overflow point number is established for the second type overflow interval.
According to some embodiments, when the second type virtual register overflows, a second type overflow interval is inserted at a corresponding second type overflow point to add a second type overflow dummy instruction, and a second type overflow point number is established for the second type overflow interval. The second type overflow point number differs from the preceding adjacent program point number or first type overflow point number by $m^Q$, where Q is a predetermined integer less than K.
For example, in the following intermediate representation, Q is 0 and the second type overflow point number differs from the preceding adjacent first type overflow point number by $2^0$, i.e., the second type overflow point number is a multiple of $2^0$ but not a multiple of $2^1$.
(28)r1=1 (32)
(20)r2=2 (24)
(12)v2=f(r1,r2) (16)
(10)r3,r4=spill(v1) (10)
(9)ram3,ram4=spill(r3,r4) (9)
(4)v1=v2+v1 (8)
r3 and r4 are the added second type virtual registers (e.g., virtual scalar registers), and spill(r3, r4) is the second type overflow dummy instruction.
In this example, when scalar registers are allocated in the second layer according to the hierarchical register allocation algorithm, each line (including the overflow lines) is treated as original code. (Of course, no spill is actually inserted during the first-layer allocation; here, all survival intervals corresponding to program points that are divisible by 2 but not by 4 are simply treated as real scalar intervals in the register allocation analysis.) Likewise, when a scalar register "overflows", no code is actually inserted; only the overflow interval of the scalar register is generated.
According to the above-described embodiment of the present application, the program point number, the first type overflow point number, and the second type overflow point number are associated with the position of the corresponding program point. For example, the program point number, the first type overflow point number, and the second type overflow point number are numbered according to the sequence of the positions where the corresponding program points appear. The program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other. For example, a program point is expressed as a multiple of 4, a vector register overflow point is a multiple of 2 but not a multiple of 4, and a scalar register overflow point is an odd number. In this way, increasing or decreasing program point numbers are established for the instructions in the intermediate representation in sequence, registers of different types (widths) can be compatible, and the whole establishment process of the survival interval of the variable can use the same representation logic.
According to some embodiments of the present application described above, increasing or decreasing program point numbers are established for the instructions in the intermediate representation in a unified manner. For example, a program point is expressed as a multiple of 4, a vector register overflow point as a multiple of 2 but not a multiple of 4, and a scalar register overflow point as an odd number. Different kinds of registers are thereby compatible, so that the whole process of establishing the survival intervals of the variables can use the same representation logic.
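The three mutually exclusive attributes can be checked with a simple divisibility test. The following sketch is an illustration only; `classify_point` and its parameters m, p, k, q are hypothetical names mirroring m, P, K and Q in the text, with the assumption Q < K < P.

```python
def classify_point(number, m=2, p=2, k=1, q=0):
    """Return which kind of point a number denotes under the example scheme.

    Assumes q < k < p. With the defaults, real program points are multiples
    of 4, first-type overflow points are multiples of 2 but not of 4, and
    second-type overflow points are odd numbers.
    """
    if number % (m ** p) == 0:
        return "program point"
    if number % (m ** k) == 0:
        return "first-type overflow point"
    if number % (m ** q) == 0:
        return "second-type overflow point"
    return "unassigned"   # only reachable when q > 0


# Numbers taken from the example intermediate representation.
kinds = [classify_point(n) for n in (12, 10, 9)]
```

Because the checks go from the coarsest to the finest divisor, every number falls into exactly one class, which is what lets the three kinds of points share one numbering.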
According to the example embodiment, no storage and loading codes are inserted at the overflow point in the process of establishing the survival interval, so that fine tuning and optimization can be performed after the analysis is completed.
According to some embodiments, in the register allocation process, if the first type virtual register overflows, then after a first type overflow interval has been inserted at the corresponding first type overflow point to add the first type overflow dummy instruction and the second type virtual register, the survival interval of the added second type virtual register can be obtained directly from the survival interval of the original first type virtual register. Thus, no additional analysis is required to give the scalar interval generated by the overflow. For example, the survival interval of a new virtual scalar register generated when a vector register overflows to scalar registers does not need to be computed separately and can be obtained directly from the survival interval of the original vector register, so that the scalar registers produced by the overflow can be processed in the same way as the original scalar registers.
For example, for the added second type virtual registers r3 and r4, the survival interval [12,8] of the original first type virtual register v2 can be directly used to determine the survival interval of r3 and r4 as [12,8], or a more precise interval [10, 8] can be used. Of course, the adjustment or optimization can be performed according to the actual situation and the specific algorithm, and will not be described herein again.
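As a rough illustration of the two options just mentioned, the sketch below copies the original interval or, for the more precise variant, starts the interval at the overflow point. The function `added_register_intervals` and its arguments are hypothetical and only reproduce the numbers given in the text.

```python
def added_register_intervals(original_interval, overflow_point, new_regs,
                             precise=False):
    """Survival intervals for the registers added by a first-type overflow.

    original_interval: [start, end] of the original first type virtual register.
    overflow_point:    number of the first-type overflow point.
    precise=False reuses the original interval unchanged; precise=True starts
    the interval at the overflow point instead.
    """
    start, end = original_interval
    if precise:
        start = overflow_point
    return {reg: [start, end] for reg in new_regs}


coarse = added_register_intervals([12, 8], 10, ["r3", "r4"])
fine = added_register_intervals([12, 8], 10, ["r3", "r4"], precise=True)
```

Neither variant requires a new data flow analysis; both only read values that are already known when the overflow interval is inserted.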
According to an example embodiment, when the second type register allocation is performed, the second type register is also allocated for the survival interval [12,8] of the added second type virtual register.
In an example embodiment, the overflow of a scalar register results in an overflow interval for the scalar register. Then, the survival interval of the allocated vector register can be combined for analysis, and under the condition that the survival interval of the allocated vector register is not conflicted, the idle vector register is utilized for overflowing, so that the performance of the compiled code is improved. If the survival interval unified representation logic compatible with various registers is not adopted, the survival interval needs to be recalculated by performing data flow analysis with high cost again.
Thus, according to an example embodiment, the allocation of vector registers and the allocation of scalar registers are not mixed, yet they can share a common set of live-range representation logic. For example, to distinguish them, a program point whose number is divisible by 4 is a real program point in the analysis; a point whose number is divisible by 2 but not by 4 is a virtual program point of the vector registers (an overflow point generated by the overflow of a virtual vector register); and an odd-numbered point (e.g., as in the previous example, a number divisible by $2^0$ but not by $2^1$) is a virtual program point of the scalar registers (a program point generated by the overflow of a virtual scalar register). Thus, by specifying the overflow points in this way, the corresponding overflow interval analysis can be performed and the locations at which overflow code is to be inserted can be recorded.
According to the example embodiment, the program point numbers are integrated into the commonly used survival interval analysis to form a unified survival interval representation mode, and the survival interval representation mode is combined with the analysis flow, so that the compiling time can be reduced, and simultaneously, the object code with higher quality can be generated.
Fig. 3 illustrates a compiling method according to an example embodiment of the present application. The compiling method can use the method for establishing the virtual register living interval.
As described above, during the compilation process, according to the actual computing architecture, registers of different widths, such as a first type register and a second type register, need to be allocated, and the width of the first type register is greater than that of the second type register. For example, the first type of register may be a 512-bit wide vector register and the second type of register may be a 48-bit wide scalar register.
Referring to FIG. 3, and with reference to the foregoing description of FIG. 2, at S301, an intermediate representation of a source file is obtained. The intermediate representation includes a first type of virtual register, a second type of virtual register, and instructions that act on the first type of virtual register and the second type of virtual register.
At S303, increasing or decreasing program point numbers are established for the instructions in order. Referring to the foregoing description of S203 with reference to fig. 2, a detailed description thereof is omitted here.
In S305, the virtual register lifetime section is represented by the program point number. According to an example embodiment, the virtual register lifetime interval comprises a lifetime interval of the first type of virtual register and a lifetime interval of the second type of virtual register. Referring to the foregoing description of S204 with reference to fig. 2, a detailed description thereof is omitted here.
In S307, a first type register is allocated to the live range of the first type virtual register.
Based on the established lifetime interval, the first type of register may be allocated by a linear scanning algorithm. It will be readily appreciated that, here, the first type virtual registers correspond to the same variable type as the first type registers.
In S309, when the first-type virtual register overflows, a first-type overflow interval is inserted into a corresponding first-type overflow point to add a first-type overflow dummy instruction and a second-type virtual register, and a first-type overflow point number is established for the first-type overflow interval. Referring to the foregoing description of S205 with reference to fig. 2, a detailed description thereof is omitted herein.
In S311, a second type register is allocated for the living interval of the second type virtual register and a second type register is allocated for the living interval of the added second type virtual register.
Similar to the allocation of the first type of registers, the second type of registers may be allocated by, for example, a linear scanning algorithm based on the established live range of the second type of virtual registers. It will be readily appreciated that the second type of virtual register and the second type of register correspond to the same variable type here.
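A linear scan over such intervals might look like the sketch below. It is a deliberately simplified variant (when no register is free it spills the incoming interval rather than choosing a victim), it assumes decreasing program point numbers as in the example, and the names `linear_scan`, `s0` and `s1` are hypothetical; it is not presented as the allocator of the application.

```python
def linear_scan(intervals, registers):
    """Simplified linear scan over survival intervals.

    intervals: {virtual register: (start, end)} with start >= end, because
               program point numbers decrease in program order.
    registers: list of physical register names.
    Returns (assignment, spilled).
    """
    assignment, spilled = {}, []
    active = []                # (end, vreg) pairs currently holding a register
    free = list(registers)
    # Visit intervals in program order: larger start numbers come first.
    for vreg, (start, end) in sorted(intervals.items(),
                                     key=lambda kv: -kv[1][0]):
        # Expire intervals that ended before this one starts.
        for item in list(active):
            if item[0] > start:
                active.remove(item)
                free.append(assignment[item[1]])
        if free:
            assignment[vreg] = free.pop(0)
            active.append((end, vreg))
        else:
            spilled.append(vreg)   # simplification: spill the incoming interval
    return assignment, spilled


# Scalar intervals from the example, including the added register r3.
scalar_intervals = {"r1": (28, 16), "r2": (20, 16), "r3": (12, 8)}
two_regs = linear_scan(scalar_intervals, ["s0", "s1"])
one_reg = linear_scan(scalar_intervals, ["s0"])
```

With two physical registers nothing is spilled and r3 reuses the register released by r1; with a single register, r2 is reported as spilled and would receive a second type overflow interval.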
As mentioned above, the survival interval of the added second type virtual register generated when the first type register overflows to the second type register does not need to be computed separately and can be obtained directly from the survival interval of the original first type virtual register. Therefore, the scalar interval generated by the overflow can be given without extra work such as repeated data flow analysis.
According to some embodiments, when a second type register is allocated for the living interval of the second type virtual register and the second type virtual register overflows, an available first type register is selected first as the overflow target; if no first type register is available, the overflow goes to the stack space. Because the survival intervals are established with the same representation logic, this decision can easily be based on a consistent survival interval representation when, for example, a scalar register needs to overflow. In addition, the different kinds of registers are allocated in a certain order; for example, the scalar registers are allocated after the vector registers have been allocated. A "safe" vector register can therefore be selected as the overflow target, so that the various registers are fully utilized, which effectively improves code performance and program execution speed. In other words, reusing the first type registers avoids overflowing to memory, such as the stack space, and thus improves the running efficiency of the compiled code.
In S313, when the second-type virtual register overflows, a second-type overflow interval is inserted into a corresponding second-type overflow point for adding a second-type overflow dummy instruction, and a second-type overflow point number is established for the second-type overflow interval. Referring to the foregoing description of S207 with reference to fig. 2, a detailed description thereof is omitted herein.
According to the above-described embodiment of the present application, the program point number, the first type overflow point number, and the second type overflow point number are associated with the position of the corresponding program point. For example, the program point number, the first type overflow point number, and the second type overflow point number are numbered according to the sequence of the positions where the corresponding program points appear. The program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other. For example, a program point is expressed as a multiple of 4, a vector register overflow point is a multiple of 2 but not a multiple of 4, and a scalar register overflow point is an odd number. In this way, the increasing or decreasing program point numbers are uniformly established for the instructions in the intermediate representation, registers of different types (widths) can be compatible, and the whole establishment process of the survival interval of the variable can use the same representation logic.
According to the example embodiment, no storage and loading codes are inserted at the overflow point in the process of establishing the survival interval, so that fine tuning and optimization can be performed after the analysis is completed.
According to some embodiments, in the register allocation process, if the first type virtual register overflows, after a first type overflow interval is inserted at a corresponding first type overflow point for adding the first type overflow dummy instruction and the second type virtual register, the survival interval of the added second type virtual register can be directly obtained by using the survival interval of the original first type virtual register. Thus, no additional analysis is required to give the interval of the scalar where overflow occurs.
For example, for the added second type virtual registers r3 and r4, the survival interval [12,8] of the original first type virtual register v2 can be directly utilized to determine the survival interval of r3 and r4 to be [12,8], for example. Of course, the adjustment or optimization can be performed according to the actual situation and the specific algorithm, and will not be described herein again.
According to an example embodiment, when the second type register allocation is performed, the second type register is also allocated for the survival interval [12,8] of the added second type virtual register.
In an example embodiment, the overflow of a scalar register results in an overflow interval for the scalar register. Then, the survival interval of the allocated vector register can be combined for analysis, and under the condition that the survival interval of the allocated vector register is not conflicted, the idle vector register is utilized for overflowing, so that the performance of the compiled code is improved. If the survival interval unified representation logic compatible with various registers is not adopted, the survival interval needs to be recalculated by performing data flow analysis with high cost again.
Thus, according to an example embodiment, the allocation of vector registers and the allocation of scalar registers are not mixed, yet they can share a common set of live-range representation logic. For example, to distinguish them, a program point whose number is divisible by 4 is a real program point in the analysis; a point whose number is divisible by 2 but not by 4 is a virtual program point of the vector registers (an overflow point generated by the overflow of a virtual vector register); and an odd-numbered point (e.g., as in the previous example, a number divisible by $2^0$ but not by $2^1$) is a virtual program point of the scalar registers (a program point generated by the overflow of a virtual scalar register). Thus, by specifying the overflow points in this way, the corresponding overflow interval analysis can be performed and the locations at which overflow code is to be inserted can be recorded.
According to the example embodiment, the program point numbers are integrated into the commonly used survival interval analysis to form a unified survival interval representation mode, and the survival interval representation mode is combined with the analysis flow, so that the compiling time can be reduced, and simultaneously, the object code with higher quality can be generated.
According to some embodiments, the compiling method may further include performing code rewriting based on the allocation result and the optimized code, including rewriting a register and inserting an overflow code, and the like. For example, according to the allocation results of the first type register and the second type register, the corresponding first type virtual register and second type virtual register are rewritten. Additionally, code rewriting may also include inserting respective overflow codes at the first type of overflow point and the second type of overflow point, and so on. The register rewriting and inserting of overflow code should be well known to those skilled in the art and will not be described in detail here.
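The register rewriting part of this step can be pictured with a small sketch. It only covers the substitution of allocated registers for virtual ones and does not insert overflow code; the textual IR format, the function `rewrite_registers` and the physical register names `s0`, `s1`, `vr0` are assumptions made for illustration.

```python
import re


def rewrite_registers(ir_lines, allocation):
    """Replace each virtual register name by its allocated physical register.

    ir_lines:   list of IR lines as plain strings (assumed textual form).
    allocation: {virtual register name: physical register name}.
    """
    if not allocation:
        return list(ir_lines)
    # Whole-word match so that, e.g., r1 is not rewritten inside r10.
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, allocation)) + r")\b")
    return [pattern.sub(lambda match: allocation[match.group(1)], line)
            for line in ir_lines]


ir = ["r1=1", "r2=2", "v2=f(r1,r2)"]
rewritten = rewrite_registers(ir, {"r1": "s0", "r2": "s1", "v2": "vr0"})
```

Applied to the first three lines of the example, the sketch yields `s0=1`, `s1=2` and `vr0=f(s0,s1)`.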
Fig. 4 illustrates a method of allocating registers according to an example embodiment of the present application. The method according to the present embodiment may be used for allocation of registers of different widths, for example for allocating vector registers and scalar registers. The method according to the present embodiment may be used for register allocation during source program compilation.
Referring to fig. 4, in S401, a method for allocating registers according to an exemplary embodiment of the present application includes allocating a first type of register based on a lifetime of the first type of virtual register.
As previously described, according to an example embodiment, live information for the first type of virtual register, i.e., which variables are "live" at a given program point, is obtained using, for example, dataflow analysis. The survival information gives clues to the survival interval. The live-range of the first type of virtual register can then be easily established by scanning the intermediate representation of the source file once. Based on the established lifetime interval, the first type of register may be allocated by, for example, a linear scanning algorithm. It will be readily appreciated that, here, the first type virtual registers correspond to the same variable type as the first type registers.
Referring to fig. 4, in S403, after the allocation of the first type register is completed, the allocation of the second type register is performed based on the living interval of the first type virtual register and the living interval of the second type virtual register.
As described above, the live interval of the first type virtual register and the live interval of the second type virtual register can be obtained by one scan of the intermediate representation by using the same representation logic, so that the register allocation time can be reduced.
Similar to the allocation of the first type of registers, the second type of registers may be allocated by, for example, a linear scanning algorithm based on the established live range of the second type of virtual registers.
Referring to fig. 4, in S405, it is determined whether an overflow of the second type virtual register occurs. If overflow occurs, go to S407; otherwise, subsequent other processing may be performed, such as performing other register allocation or performing code optimization processing based on the register allocation result.
Referring to fig. 4, in S407, when allocating the second type register, if the second type virtual register overflows, the available first type register is selected to overflow first. Therefore, the first type register is multiplexed to avoid overflowing to a memory, such as a stack space, and the running efficiency of the compiled code can be improved.
According to this embodiment, the allocations of the different kinds of registers are independent to a certain degree, which allows the same set of core algorithms to be applied to different registers. At the same time, the different kinds of registers are allocated in a certain order; for example, the scalar registers are allocated after the vector registers have been allocated, so that the allocation result of the vector registers is already available. When a scalar register needs to overflow, a safe strategy is selected, such as choosing a vector register that can hold the overflowed scalar register without causing a vector register to overflow. Thus, because the various registers are fully utilized, code performance and program execution speed can be effectively improved.
Fig. 5 illustrates a method of allocating registers according to another example embodiment of the present application.
The method shown in fig. 5 is substantially similar to the method shown in fig. 4, except that fig. 5 also shows the case where the first type of register overflows and the case where the second type of register overflows into the stack space. Therefore, a detailed description of the same or similar steps will be omitted herein.
Referring to fig. 5, in S501, a first-type register is allocated based on a lifetime of the first-type virtual register.
At S502A, if the first type register overflows, go to S502B; otherwise, go to S503.
At S502B, overflow to the second type register.
In S503, after the allocation of the first type register is completed, the second type register is allocated based on the living interval of the first type virtual register and the living interval of the second type virtual register.
At S505, it is determined whether the second type virtual register overflows. If overflow occurs, go to S506; otherwise, subsequent other processing may be performed, such as performing other register allocation or performing code optimization processing based on the register allocation result.
At S506, it is determined whether there is a first type register available. If there is a first type register available, go to S507; otherwise go to S509.
At S507, the available first type register is selected for overflow. Therefore, the first type register is multiplexed to avoid overflowing to a memory, such as a stack space, and the running efficiency of the compiled code can be improved.
At S509, when there is no available register of the first type, overflow to the stack space is performed.
According to this embodiment, when an overflow occurs, vector registers may first be overflowed to scalar registers, and scalar registers may be overflowed to the stack when no vector register is available. Therefore, a vector register can overflow to scalar registers first instead of overflowing directly to memory, which improves the execution efficiency of the code.
The embodiments of the present application have been described above primarily from a method perspective. Those of skill in the art will readily appreciate that the operations or steps of the various examples described in connection with the embodiments disclosed herein can be implemented in hardware or in a combination of hardware and computer software. Skilled artisans may implement the described functionality in varying ways for each particular operation or method, and such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
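The decision made in S506 to S509 can be sketched as a conflict check on survival intervals. This is a minimal illustration under stated assumptions: intervals are [start, end] pairs with start >= end (decreasing numbering), and the names `overlaps`, `choose_overflow_target`, `vr0` and `vr1` are hypothetical.

```python
def overlaps(a, b):
    """True if two intervals [start, end] (start >= end) share a point."""
    return a[1] <= b[0] and b[1] <= a[0]


def choose_overflow_target(spill_interval, vector_usage):
    """Pick where an overflowing second type virtual register goes.

    vector_usage: {first type register: [intervals already allocated to it]}.
    Returns the name of a first type register whose allocated intervals do
    not conflict with spill_interval, or "stack" if there is none.
    """
    for vreg, used in vector_usage.items():
        if not any(overlaps(spill_interval, u) for u in used):
            return vreg        # S507: reuse an available first type register
    return "stack"             # S509: fall back to the stack space


# Assumed allocation result of the first layer.
vector_usage = {"vr0": [[12, 8]], "vr1": [[28, 24]]}
target_free = choose_overflow_target([20, 16], vector_usage)
target_busy = choose_overflow_target([26, 9], vector_usage)
```

Because both kinds of intervals use the same numbering, the check needs only integer comparisons; with the assumed usage, the first call selects `vr0` and the second falls back to the stack.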
Embodiments of the apparatus of the present application are described below. For details not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 6 illustrates an apparatus for establishing a virtual register lifetime interval according to an exemplary embodiment of the present application. The device can be used for executing the method for establishing the virtual register survival interval.
As shown in FIG. 6, the register allocation apparatus 600 according to an example embodiment includes an intermediate representation module 601, a program point module 603, a first lifetime interval module 604, a first overflow module 605, and a second overflow module 607.
Referring to fig. 6 and with reference to the foregoing description, the intermediate representation module 601 is configured to obtain an intermediate representation of a source file, the intermediate representation including a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register.
The program point module 603 is used to establish an increasing or decreasing program point number for the instruction in sequence.
The first lifetime interval module 604 is configured to use the program point number to represent a virtual register lifetime interval, where the virtual register lifetime interval includes a lifetime interval of the first type of virtual register and a lifetime interval of the second type of virtual register.
The first overflow module 605 is configured to insert a first-type overflow interval into a corresponding first-type overflow point when the first-type virtual register overflows, and establish a first-type overflow point number for the first-type overflow interval.
The second overflow module 607 is configured to insert a second type overflow interval into a corresponding second type overflow point when the second type virtual register overflows, and establish a second type overflow point number for the second type overflow interval.
According to an example embodiment, the program point number, the first type overflow point number, the second type overflow point number are associated with a position of the respective program point, the program point number having a first attribute, the first type overflow point number having a second attribute, the second type overflow point number having a third attribute, the first attribute, the second attribute and the third attribute being incompatible with each other.
The register allocation apparatus according to this embodiment can perform similar functions to the method provided above, and other functions can be referred to the above description and will not be described herein again.
Fig. 7 illustrates a compiling apparatus according to an example embodiment of the present application. The apparatus may be adapted to perform the compiling method described above.
As shown in fig. 7, the compiling apparatus 700 according to an example embodiment includes an intermediate representation module 701, a program point module 703, a first life interval module 705, a first allocation module 707, a first overflow module 709, a second allocation module 713, and a second overflow module 715.
Referring to fig. 7 and with reference to the previous description, the intermediate representation module 701 is configured to obtain an intermediate representation of a source file, the intermediate representation including a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register.
The program point module 703 is used to establish an increasing or decreasing program point number for the instruction in sequence.
The first life interval module 705 is configured to use the program point number to represent a virtual register life interval, where the virtual register life interval includes a life interval of the first type of virtual register and a life interval of the second type of virtual register.
The first allocating module 707 is configured to allocate a first type register for a lifetime interval of the first type virtual register.
The first overflow module 709 is configured to insert a first-type overflow interval at a corresponding first-type overflow point when the first-type virtual register overflows, so as to add a first-type overflow dummy instruction and a second-type virtual register, and establish a first-type overflow point number for the first-type overflow interval.
The second allocating module 713 is configured to allocate a second type register to the living interval of the second type virtual register and allocate a second type register to the living interval of the added second type virtual register, where a width of the first type register is greater than a width of the second type register.
The second overflow module 715 is configured to insert a second-type overflow interval at a corresponding second-type overflow point when the second-type virtual register overflows, so as to add a second-type overflow dummy instruction, and establish a second-type overflow point number for the second-type overflow interval.
According to an example embodiment, the program point number, the first type overflow point number, the second type overflow point number are associated with a position of the respective program point, the program point number having a first attribute, the first type overflow point number having a second attribute, the second type overflow point number having a third attribute, the first attribute, the second attribute and the third attribute being incompatible with each other.
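One way to read the mutually incompatible attributes is as divisibility classes. The sketch below assumes that reading, together with the numbering recited in the clauses (program point numbers of the form n × m^P, first-type overflow offsets of m^K, second-type offsets of m^Q, with Q < K < P). The concrete values m = 2, P = 4, K = 2, Q = 0 and all function names are illustrative assumptions, not taken from the application.

```python
# Illustrative parameters (assumptions): m = 2, P = 4, K = 2, Q = 0.
M, P, K, Q = 2, 4, 2, 0


def program_point_number(n: int) -> int:
    """Number of the n-th instruction: n * m**P."""
    return n * M**P


def first_type_overflow_number(prev: int) -> int:
    """First-type overflow point placed after the adjacent program point `prev`."""
    return prev + M**K


def second_type_overflow_number(prev: int) -> int:
    """Second-type overflow point placed after a program point or a first-type overflow point."""
    return prev + M**Q


def classify(number: int) -> str:
    """Recover the kind of point from its number alone (assumed divisibility attribute)."""
    if number % M**P == 0:
        return "program"
    if number % M**K == 0:
        return "first-type overflow"
    return "second-type overflow"
```

With these values the fourth instruction (n = 3) is numbered 48, a first-type overflow point after it is numbered 52, and a second-type overflow point after either of them is numbered 49 or 53, so the kind of point can be read back from the number without any side table.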
The compiling apparatus according to this embodiment performs functions similar to those of the methods described above; for its other functions, refer to the foregoing description, which is not repeated here.
Fig. 8 shows a block diagram of an electronic device according to an example embodiment of the present application.
An electronic device 200 according to this embodiment of the present application is described below with reference to fig. 8. The electronic device 200 shown in fig. 8 is only an example and should not be taken to limit the functions or the scope of use of the embodiments of the present application.
As shown in fig. 8, the electronic device 200 is embodied in the form of a general purpose computing device. The components of the electronic device 200 may include, but are not limited to: at least one processing unit 210, at least one storage unit 220, a bus 230 connecting different system components (including the storage unit 220 and the processing unit 210), a display unit 240, and the like.
The storage unit 220 stores program code, which can be executed by the processing unit 210, so that the processing unit 210 executes the methods according to the embodiments of the present application described in the present specification.
The storage unit 220 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM) 2201 and/or a cache memory unit 2202, and may further include a read only memory unit (ROM) 2203.
The storage unit 220 may also include a program/utility 2204 having a set (at least one) of program modules 2205, such program modules 2205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 230 may be one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
The electronic device 200 may also communicate with one or more external devices 300 (e.g., keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 200, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 200 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 250. The electronic device 200 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 260. The network adapter 260 may communicate with other modules of the electronic device 200 via the bus 230. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.
Embodiments of the present application also provide a computer program product, which is operable to cause a computer to perform some or all of the steps as described in the above method embodiments.
It is clear to a person skilled in the art that the solution of the present application can be implemented by means of software and/or hardware. The "unit" and "module" in this specification refer to software and/or hardware that can perform a specific function independently or in cooperation with other components, where the hardware may be, for example, a Field-Programmable Gate Array (FPGA), an Integrated Circuit (IC), or the like.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The embodiments of the present application have been described and illustrated in detail above. It should be clearly understood that this application describes how to make and use particular examples, but the application is not limited to any details of these examples. Rather, these principles can be applied to many other embodiments based on the teachings of the present disclosure.
Those skilled in the art will readily appreciate from the description of exemplary embodiments that register allocation methods and compilation methods according to embodiments of the present application may have at least one or more of the following advantages.
According to some embodiments, compilation time may be reduced by reducing unnecessary dataflow analysis and avoiding introducing overly complex dependencies (which may even lead to non-convergence).
According to some embodiments, the performance of the code generated by compilation may be improved by appropriate mutual overflow between different kinds of registers.
According to some embodiments, assigning increasing or decreasing program point numbers to the instructions in the intermediate representation is compatible with different kinds of registers, so that the whole process of establishing the live intervals of variables can use the same representation logic.
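The shared representation can be sketched for straight-line code as follows. The instruction encoding (pairs of defined and used register names), the `v`/`s` name prefixes for the two register kinds, and the spacing of 16 between program points are hypothetical. A real compiler would also have to handle control flow, which this sketch ignores.

```python
from typing import Dict, List, Tuple

SCALE = 16  # assumed m**P with m = 2, P = 4

# Each instruction: (defined registers, used registers). Hypothetical encoding.
Instr = Tuple[List[str], List[str]]


def live_intervals(instrs: List[Instr]) -> Dict[str, Tuple[int, int]]:
    """Map every virtual register to its [first, last] program point number."""
    intervals: Dict[str, Tuple[int, int]] = {}
    for n, (defs, uses) in enumerate(instrs):
        point = n * SCALE
        for reg in defs + uses:
            # Keep the first point at which the register appears, extend the last.
            start, _ = intervals.get(reg, (point, point))
            intervals[reg] = (start, point)
    return intervals


program = [
    (["v0"], []),            # v0 = vector load
    (["s0"], []),            # s0 = scalar constant
    (["v1"], ["v0", "s0"]),  # v1 = v0 scaled by s0
    ([], ["v1"]),            # store v1
]
intervals = live_intervals(program)
```

Vector and scalar virtual registers end up in the same table, keyed by the same program point numbers, so later passes can compare their intervals directly.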
According to some embodiments, no store or load code is inserted at the overflow point while the live intervals are being established; instead, an overflow pseudo instruction is inserted, so that fine-tuning and optimization can be performed after the analysis is complete.
According to some embodiments, if the first type virtual register overflows, after the first-type overflow pseudo instruction and the second type virtual register are added at the corresponding first type overflow point, the live interval of the added second type virtual register is obtained directly from the live interval of the original first type virtual register, so that the compiling efficiency can be improved.
According to some embodiments, the live interval of the new virtual scalar register created when a vector register overflows to a scalar register need not be computed separately; it can be obtained directly from the live interval of the original vector register, so a vector-to-scalar overflow can be handled in the same way as an original scalar register. Similar intervals are generated when a scalar register overflows, so the analysis can be combined with the live intervals of the vector registers already allocated, and an idle vector register is used for the overflow provided it does not conflict with those intervals. Without a unified live interval representation compatible with the various kinds of registers, the live intervals would have to be recomputed by repeating the costly dataflow analysis.
According to some embodiments, the allocation of vector registers and scalar registers are not mixed, but may use a common set of live-interval representation logic. In this way, by specifying the indicated overflow point, the location of the insertion of the overflow code can be recorded.
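Because overflow point numbers fall strictly between neighbouring program point numbers, recording a pseudo instruction under its number is enough to recover its position later. A minimal sketch, assuming ascending numbering, an illustrative spacing of 16 between program points, and hypothetical pseudo instruction names:

```python
from typing import Dict, List


def rewrite(instrs: List[str], pseudo: Dict[int, str], scale: int = 16) -> List[str]:
    """Merge numbered pseudo instructions into the instruction stream by point number."""
    numbered = {n * scale: text for n, text in enumerate(instrs)}
    overlap = numbered.keys() & pseudo.keys()
    if overlap:
        raise ValueError(f"overflow point collides with program point: {sorted(overlap)}")
    numbered.update(pseudo)
    return [numbered[k] for k in sorted(numbered)]


code = ["v0 = vload a", "v1 = vadd v0, v0", "vstore v1, b"]
# Recorded during allocation; no store/load code has been emitted yet.
spills = {4: "PSEUDO_SPILL_1 v0", 17: "PSEUDO_SPILL_2 s9"}
final = rewrite(code, spills)
```

Here the pseudo instruction numbered 4 lands after the first instruction (number 0) and the one numbered 17 lands after the second (number 16). A later pass could replace each pseudo instruction with the actual overflow code.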
According to the example embodiment, the program point numbers are integrated into the commonly used survival interval analysis to form a unified survival interval representation mode, and the survival interval representation mode is combined with the analysis flow, so that the compiling time can be reduced, and simultaneously, the object code with higher quality can be generated.
According to embodiments of the present application, the different kinds of registers are allocated with a degree of independence, so the same core algorithm can be applied to each kind. At the same time, the allocation is layered, that is, it follows a fixed order: for example, scalar registers are allocated after vector registers, so that the vector allocation result is already available. When a scalar register needs to overflow, a safe policy is selected, such as overflowing the scalar register into a vector register in a way that does not cause any vector register to overflow. Because the various registers are fully utilized, code performance and program execution speed can be effectively improved.
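The safe policy for a scalar overflow can be sketched as an interval-overlap test against the vector allocation result. The data layout (a mapping from physical vector register name to the intervals already assigned to it) and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # inclusive [start, end] in program point numbers


def overlaps(a: Interval, b: Interval) -> bool:
    """True when the two inclusive intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def pick_spill_target(scalar: Interval,
                      vector_alloc: Dict[str, List[Interval]]) -> Optional[str]:
    """Return a vector register that is idle over `scalar`, or None to use the stack."""
    for reg, busy in vector_alloc.items():
        if not any(overlaps(scalar, iv) for iv in busy):
            return reg
    return None


vector_alloc = {"V0": [(0, 64)], "V1": [(0, 16), (80, 96)]}
target = pick_spill_target((32, 48), vector_alloc)    # V1 is idle over this range
fallback = pick_spill_target((0, 96), vector_alloc)   # nothing idle, fall back to the stack
```

Since the vector allocation is finished before scalar allocation starts, the test only reads the vector result and never forces a vector register to overflow in turn.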
The foregoing may be better understood in light of the following clauses:
Clause 1, a method of establishing a virtual register live interval, the method comprising:
obtaining an intermediate representation of a source file, the intermediate representation comprising a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register;
establishing ascending or descending program point numbers for the instructions in sequence;
representing a virtual register living interval by using the program point number, wherein the virtual register living interval comprises a living interval of the first type of virtual register and a living interval of the second type of virtual register;
when the first-type virtual register overflows, inserting a first-type overflow interval into a corresponding first-type overflow point, and establishing a first-type overflow point number for the first-type overflow interval;
when the second type virtual register overflows, inserting a second type overflow interval into a corresponding second type overflow point, establishing a second type overflow point number for the second type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other.
Clause 2, the method according to clause 1, wherein the program point number is $n \times m^{P}$, n is the sequence number of the program point, and m and P are predetermined natural numbers greater than 1.
Clause 3, the method according to clause 2, wherein the first type overflow point number differs from the previous adjacent program point number by $m^{K}$, K being a predetermined integer less than P.
Clause 4, the method according to clause 3, wherein the second type overflow point number differs from the previous adjacent program point number or the first type overflow point number by $m^{Q}$, Q being a predetermined integer less than K.
Clause 5, a method for compiling, the method comprising:
obtaining an intermediate representation of a source file, the intermediate representation comprising a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register;
establishing ascending or descending program point numbers for the instructions in sequence;
representing a virtual register living interval by using the program point number, wherein the virtual register living interval comprises a living interval of the first type of virtual register and a living interval of the second type of virtual register;
allocating a first type register for the survival interval of the first type virtual register;
when the first-type virtual register overflows, inserting a first-type overflow interval into a corresponding first-type overflow point to increase a first-type overflow pseudo instruction and a second-type virtual register, and establishing a first-type overflow point number for the first-type overflow interval;
allocating a second type register for the survival interval of the second type virtual register and allocating a second type register for the survival interval of the added second type virtual register;
when the second type virtual register overflows, inserting a second type overflow interval at a corresponding second type overflow point to increase a second type overflow pseudo instruction, establishing a second type overflow point number for the second type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, the first attribute, the second attribute, and the third attribute are incompatible with each other, and
wherein the width of the first type register is greater than the width of the second type register.
Clause 6, the method according to clause 5, wherein the program point number is $n \times m^{P}$, n is the sequence number of the program point, and m and P are predetermined natural numbers greater than 1.
Clause 7, the method of clause 6, wherein the first-type overflow point number differs from the previous adjacent program point number by $m^{K}$, K being a predetermined integer less than P.
Clause 8, the method according to clause 7, wherein the second-type overflow point number differs from the previous adjacent program point number or the first-type overflow point number by $m^{Q}$, Q being a predetermined integer less than K.
Clause 9, the method according to clause 5, wherein, when a second type register is allocated for the live interval of the second type virtual register, if the second type virtual register overflows, an available first type register is preferentially selected as the overflow target; if no first type register is available, the overflow goes to stack space.
Clause 10, the method of clause 9, further comprising:
performing code optimization based on the allocation results of the first type register and the second type register;
and performing code rewriting based on the allocation result and the optimized code.
Clause 11, the method of clause 10, wherein the performing code rewriting based on the allocation result and the optimized code comprises:
rewriting the corresponding first type virtual register and second type virtual register according to the allocation result of the first type register and the second type register; and
inserting corresponding overflow codes at the first type overflow point and the second type overflow point.
Clause 12, an apparatus for establishing a virtual register live interval, the apparatus comprising:
the intermediate representation module is used for obtaining intermediate representation of a source file, and the intermediate representation comprises a first type virtual register, a second type virtual register and an instruction acting on the first type virtual register and the second type virtual register;
the program point module is used for establishing an increasing or decreasing program point number for the instruction in sequence;
the first survival interval module is used for representing a virtual register survival interval by using the program point number, and the virtual register survival interval comprises a survival interval of the first type of virtual register and a survival interval of the second type of virtual register;
the first overflow module is used for inserting a first type overflow interval into a corresponding first type overflow point when the first type virtual register overflows, and establishing a first type overflow point number for the first type overflow interval;
a second overflow module, configured to insert a second-type overflow interval into a corresponding second-type overflow point when the second-type virtual register overflows, and establish a second-type overflow point number for the second-type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other.
Clause 13, a compiling apparatus characterized in that the compiling apparatus comprises:
the intermediate representation module is used for obtaining intermediate representation of a source file, and the intermediate representation comprises a first type virtual register, a second type virtual register and an instruction acting on the first type virtual register and the second type virtual register;
the program point module is used for establishing an increasing or decreasing program point number for the instruction in sequence;
the first survival interval module is used for representing a virtual register survival interval by using the program point number, and the virtual register survival interval comprises a survival interval of the first type of virtual register and a survival interval of the second type of virtual register;
the first allocation module is used for allocating a first type register for the survival interval of the first type virtual register;
the first overflow module is used for inserting a first type overflow interval into a corresponding first type overflow point to increase a first type overflow pseudo instruction and a second type virtual register when the first type virtual register overflows, and establishing a first type overflow point number for the first type overflow interval;
a second allocating module, configured to allocate a second type register to a living interval of the second type virtual register and allocate a second type register to a living interval of the added second type virtual register, where a width of the first type register is greater than a width of the second type register;
a second overflow module, configured to insert a second-type overflow interval at a corresponding second-type overflow point when the second-type virtual register overflows, so as to add a second-type overflow pseudo instruction, and establish a second-type overflow point number for the second-type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other.
Clause 14, an electronic device, characterized in that the electronic device comprises:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of clauses 1-11.
Clause 15, a computer-readable medium, characterized in that the computer-readable medium has stored thereon a computer program which, when executed by a processor, implements the method according to any of clauses 1-11.
Exemplary embodiments of the present application are specifically illustrated and described above. It is to be understood that the application is not limited to the details of construction, arrangement, or method of implementation described herein; on the contrary, the intention is to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (13)

1. A method for compiling, the method comprising:
obtaining an intermediate representation of a source file, the intermediate representation comprising a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register;
establishing ascending or descending program point numbers for the instructions in sequence;
representing a virtual register living interval by using the program point number, wherein the virtual register living interval comprises a living interval of the first type of virtual register and a living interval of the second type of virtual register;
allocating a first type register for the survival interval of the first type virtual register;
when the first-type virtual register overflows, inserting a first-type overflow interval into a corresponding first-type overflow point to increase a first-type overflow pseudo instruction and a second-type virtual register, and establishing a first-type overflow point number for the first-type overflow interval;
allocating a second type register for the survival interval of the second type virtual register and allocating a second type register for the survival interval of the added second type virtual register;
when the second type virtual register overflows, inserting a second type overflow interval at a corresponding second type overflow point to increase a second type overflow pseudo instruction, establishing a second type overflow point number for the second type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, the first attribute, the second attribute, and the third attribute are incompatible with each other, and
wherein the width of the first type register is greater than the width of the second type register.
2. The method of claim 1, wherein the program point number is $n \times m^{P}$, n is the sequence number of the program point, and m and P are predetermined natural numbers greater than 1.
3. The method of claim 2, wherein the first type overflow point number differs from the previous adjacent program point number by $m^{K}$, K being a predetermined integer less than P.
4. The method of claim 3, wherein the second type overflow point number differs from the previous adjacent program point number or the first type overflow point number by $m^{Q}$, Q being a predetermined integer less than K.
5. The method according to claim 1, wherein, when a second type register is allocated for the live interval of the second type virtual register, if the second type virtual register overflows, an available first type register is preferentially selected as the overflow target; if no first type register is available, the overflow goes to stack space.
6. The method of claim 5, further comprising:
performing code optimization based on the allocation results of the first type register and the second type register;
and performing code rewriting based on the allocation result and the optimized code.
7. The method of claim 6, wherein performing code rewriting based on the allocation result and the optimized code comprises:
rewriting the corresponding first type virtual register and second type virtual register according to the allocation result of the first type register and the second type register; and
inserting corresponding overflow codes at the first type overflow point and the second type overflow point.
8. A method for establishing a virtual register live interval, the method comprising:
obtaining an intermediate representation of a source file, the intermediate representation comprising a first type of virtual register, a second type of virtual register, and instructions acting on the first type of virtual register and the second type of virtual register;
establishing ascending or descending program point numbers for the instructions in sequence;
representing a virtual register living interval by using the program point number, wherein the virtual register living interval comprises a living interval of the first type of virtual register and a living interval of the second type of virtual register;
when the first-type virtual register overflows, inserting a first-type overflow interval into a corresponding first-type overflow point, and establishing a first-type overflow point number for the first-type overflow interval;
when the second type virtual register overflows, inserting a second type overflow interval into a corresponding second type overflow point, establishing a second type overflow point number for the second type overflow interval,
wherein the program point number, the first type overflow point number, the second type overflow point number are associated with a location of the corresponding program point, the program point number has a first attribute, the first type overflow point number has a second attribute, the second type overflow point number has a third attribute, and the first attribute, the second attribute, and the third attribute are incompatible with each other.
9. The method of claim 8, wherein the program point number is $n \times m^{P}$, n is the sequence number of the program point, and m and P are predetermined natural numbers greater than 1.
10. The method of claim 9, wherein the first type overflow point number differs from the previous adjacent program point number by $m^{K}$, K being a predetermined integer less than P.
11. The method of claim 10, wherein the second type overflow point number differs from the previous adjacent program point number or the first type overflow point number by $m^{Q}$, Q being a predetermined integer less than K.
12. An electronic device, characterized in that the electronic device comprises:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-11.
13. A computer-readable medium, characterized in that the computer-readable medium has stored thereon a computer program which, when being executed by a processor, carries out the method according to any one of claims 1-11.
CN201911244068.3A 2019-12-06 2019-12-06 Method and device for establishing virtual register survival interval and compiling method and device Active CN112925566B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911244068.3A CN112925566B (en) 2019-12-06 2019-12-06 Method and device for establishing virtual register survival interval and compiling method and device

Publications (2)

Publication Number Publication Date
CN112925566A true CN112925566A (en) 2021-06-08
CN112925566B CN112925566B (en) 2024-06-25

Family

ID=76161808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911244068.3A Active CN112925566B (en) 2019-12-06 2019-12-06 Method and device for establishing virtual register survival interval and compiling method and device

Country Status (1)

Country Link
CN (1) CN112925566B (en)

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant