CN102193811A - Compiling device for eliminating memory access conflict and realizing method thereof - Google Patents

Compiling device for eliminating memory access conflict and realizing method thereof Download PDF

Info

Publication number
CN102193811A
CN102193811A CN201110073683XA CN201110073683A CN102193811A CN 102193811 A CN102193811 A CN 102193811A CN 201110073683X A CN201110073683X A CN 201110073683XA CN 201110073683 A CN201110073683 A CN 201110073683A CN 102193811 A CN102193811 A CN 102193811A
Authority
CN
China
Prior art keywords
memory
variable
internal memory
unit
allocation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201110073683XA
Other languages
Chinese (zh)
Other versions
CN102193811B (en
Inventor
肖贺
孔吉
刘佩林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Fujitsu Ltd
Original Assignee
Shanghai Jiaotong University
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University, Fujitsu Ltd filed Critical Shanghai Jiaotong University
Priority to CN201110073683.XA priority Critical patent/CN102193811B/en
Publication of CN102193811A publication Critical patent/CN102193811A/en
Application granted granted Critical
Publication of CN102193811B publication Critical patent/CN102193811B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a compiling device for eliminating a memory access conflict and a realizing method thereof, belonging to the technical field of computers. The device comprises a front end language analysis unit, a memory variable analysis unit, a memory conflict eliminating unit and a conversion output unit, wherein the front end language analysis unit is connected with the memory variable analysis unit and the memory conflict eliminating unit and is used for transmitting an intermediate language sequence converted by a source program, a memory model and a memory operating information and source file functional dependence tree; the memory variable analysis unit is connected with the memory conflict eliminating unit and is used for transmitting variable memory allocation and memory block operating information; the memory conflict eliminating unit is connected with the conversion output unit and is used for transmitting intermediate language information optimized by memory allocation; and the conversion output unit is used for outputting a finial executable code. Due to the adoption of the method, additional overhead caused by memory conflict is reduced greatly.

Description

Eliminate compilation device and its implementation of internal storage access conflict
Technical field
What the present invention relates to is a kind of device and method of field of computer technology, specifically is a kind of compilation device and its implementation of eliminating the internal storage access conflict.
Background technology
But compiler is the program that higher level lanquage is converted to the machine effective language.Compiler is in transfer process, need distribute the deposit position of the variable in the program at internal memory, with the pipeline stall of avoiding the internal memory instruction to cause because of the internal memory conflict in the process of implementation, the final execution efficient that improves the final machine-executable program that generates.
In existing processor, dsp processor particularly, instruction set can be supported internal memory operation number and various addressing mode usually.Because in data processing and in using, the internal storage access operation is a key link, therefore, how to carry out the variable branch and utilizes streamline to have great important with being equipped with top efficiency.On the other hand, DSP framework for majority, in order to improve memory bandwidth as much as possible, usually support multidata memory architecture (generally being the Double Data memory architecture), i.e. instruction can still can not repeatedly be visited a data internal memory in the same clock period in a plurality of different datarams of same clock period concurrent access.
The method that solves this internal memory conflict has two big classes, one class is the memory location that the programmer manually arranges variable, especially in Embedded Application, the manual allocation variable is a kind of very effective memory allocation method, it makes calling program eliminate the internal memory conflict on the data structure algorithm aspect, but when program structure becomes complexity, artificially distribute the method for variable to be absorbed in the vicious circle of suboptimization through regular meeting; Another kind of is to be optimized by compiler, and this method can be carried out the variable storage allocation of space from overall angle, but present stage compiler for the variable allocation strategy of multidata memory architecture still imperfection.
Through the literature survey of prior art is found, G.Grewal etc. adopted genetic algorithm and have proposed a kind of Memory Allocation strategy at the DSP structure of this class Double Data internal memory of M56K on the IEEE Congress on Evolutionary Computation in 2006.But this allocation strategy is for powerless more than twin-channel internal storage structure, and the source operand situation identical with destination operand occur when internal storage access, and the internal memory conflict will take place inevitably; Simultaneously, this allocation algorithm is that the situation of unitary variant is effective at memory variable only, and when variable is continuous structures such as array, this allocation algorithm will lose efficacy.
Summary of the invention
The present invention is directed to the prior art above shortcomings, a kind of compilation device and its implementation of eliminating the internal storage access conflict is provided, make the overhead that produces because of the internal memory conflict reduce greatly.
The present invention is achieved by the following technical solutions:
The present invention relates to a kind of compilation device of eliminating the internal storage access conflict, comprise: front end language analysis unit, the memory variable analytic unit, unit and conversion output unit are eliminated in the internal memory conflict, wherein: front end language analysis unit is connected with memory variable analytic unit and internal memory conflict elimination unit and transmits the intermediate language sequence of being changed by source program, memory model and internal memory operation information and source file functional dependence tree, the memory variable analytic unit is connected and transmission variables Memory Allocation and memory block operation information with internal memory conflict elimination unit, the intermediate language information that the unit is connected with the conversion output unit and transmits the optimization of process Memory Allocation, the final executable code of conversion output unit output are eliminated in the internal memory conflict.
Described front end language analysis unit comprises: the memory configurations assembly, language analysis assembly and documentation function relationship component, wherein: the memory configurations assembly is connected with the memory variable analytic unit and transmits internal memory operation information and memory model by internal memory operation information table and memory model description list, the language analysis assembly is converted to the intermediate language sequence with source program and passes to the memory variable analytic unit respectively and internal memory conflict elimination unit, and the documentation function relationship component is analyzed the functional dependence relation of each file in the source program and passed to the internal memory conflict with the form that the source file functional dependence is set and eliminates the unit.
Described internal memory operation information table comprises: mathematical operation command function, operand number, operand type, addressing mode, data width and executive overhead;
Described memory model description list comprises total size of internal memory, the number of memory partitioning, the start address and the size of each memory partitioning.
Described memory variable analytic unit reads and the analytic record interlude comprises all memory variables and operation thereof, this memory variable analytic unit comprises: memory variable divider and variable analysis assembly, wherein: the memory variable information in the intermediate language sequence that the analysis of memory variable divider is generated by front end language analysis unit also adopts dynamic assignment or static allocation passes to the variable analysis assembly with the form of variable Memory Allocation table, and the memory variable analytic unit is analyzed the internal memory operation information in each program fundamental block by variable Memory Allocation table and recorded in the memory block operation table.
Described dynamic assignment is meant: the memory variable divider by unique determine and record in the variable Memory Allocation table allocation identification with the title of track record memory variable and life cycle information;
Described static allocation is meant: the memory variable divider identifies this variable and is the overall situation and records in the variable Memory Allocation table life cycle;
Described variable Memory Allocation table comprises: variable sign, variable name, variable size, variable life cycle, variable is cut apart and Memory Allocation, and wherein: the variable sign is the unique identification of the different variablees of difference; The size information of variable size record variable; The life cycle of variable record variable life cycle, the distribution that comprises variable constantly and the release of variable constantly, the starting stage for global variable, distributing constantly is that program begins, discharging constantly is that program stops; Fresh information after the variable cutting recording variable process variable cutting transformation module comprises and cuts apart quantity, the variable rename that each is cut apart, and the Memory Allocation of respectively cutting apart;
The relevant information of internal memory operation in each list item sign fundamental block in the described memory block operation table; Relevant information comprises: the number of source internal memory operation number, the size of each source internal memory operation number, types of variables and operand update mode, the number of target internal memory operation number, the size of each target internal memory operation number, type and operand update mode, the type of internal memory operation.
Described internal memory conflict is eliminated the unit and comprised: variable is cut apart assembly, global memory's allocation component and addressing mode selection assembly, wherein: variable is cut apart assembly and is received the intermediate language sequence of being transmitted by the front end linguistic unit, by analyzing the internal memory conflict that exists in the interlude, the variable that meets parted pattern is cut apart and the rename operation, and more new variables Memory Allocation table and memory block operation table information also export global memory's allocation component and addressing mode respectively to and select assembly simultaneously; Global memory's allocation component read after the renewal the Memory Allocation table and according to the size of wherein memory variable, adopt life cycle optimized Algorithm that all memory variables are distributed and export addressing mode to and select assembly; Addressing mode is selected the information of assembly according to the memory block operation table, is that unit is the variable distribution addressing mode of array form with the fundamental block.After the conversion by above-mentioned three assemblies, interlude and variable chained file that unit output is optimized through Memory Allocation eliminated in the internal memory conflict to interlude successively.
Described variable chained file comprises: the sign of each memory partitioning, the start address of memory partitioning, be assigned to the variable information of this internal memory; In order to reduce the fragment of internal memory, the variable that is assigned to each memory partitioning distributes successively according to variable size order from big to small; Variable information comprises variable name, the alignment pattern of variable, the size of variable.
Described conversion output unit is input as variable chained list and interlude, by changing the executable program code of final output device.
The present invention relates to the implementation method of said apparatus, may further comprise the steps:
The first step: the user compiles source program, needs is carried out the variable structure of Memory Allocation scheduling and distributes by the memory variable divider, make compiler can explicit identification all need variablees of Memory Allocation scheduling;
Second step: the user use of the present invention a kind of succinctly, easy row and internal memory operation and memory model describing method are described the model of internal memory operation, addressing mode and the internal memory of processor support intuitively; The user only need insert normalized form according to standard just can finish description to internal memory operation and memory model;
The 3rd step: the user is extracted the descriptor of internal memory operation and memory model, and be organized into respectively and have good interface, help the internal memory operation message structure that compiler resolves and the memory model of particular memory simultaneously;
The 4th step: the user imports the source program of needs compiling and the file dependence of source program; Source file functional dependence tree of file dependence structure according to user's compiling;
The 5th step: source program converts the general a kind of interlude of compiler to by the compiling of front end linguistic unit, in this step, can implement multiple compile optimization strategy, comprise scheduling, to eliminate the conflict of the factor generation that exists between the internal memory operation according to being correlated with to internal memory operation;
The 6th step: will leave in the variable Memory Allocation table by the memory variable that the memory variable divider distributes;
The 7th step: the internal memory operation message structure that generates in conjunction with the 3rd step, and the variable Memory Allocation table that generates of the 6th step, analyze the operation behavior of corresponding memory variable in the interlude, and make up the memory block operation table with the form of fundamental block;
The 8th step: the action type of each memory variable in the internal memory operation table that generates according to the 7th step, cut apart module by variable and detect each memory block operation list item and whether mate parted pattern, carry out for the memory variable that meets model that variable is cut apart and rename;
The 9th step: set according to the source file functional dependence that the 4th step generated, find fundamental block and the variable of cutting apart rename, and find all references of in all the follow-up fundamental blocks of this fundamental block in the dependence this being cut apart the rename variable, they are also upgraded successively;
The tenth step: repeated for the 8th step and the 9th step, cut apart and rename up to finishing all variablees that satisfy model, the while is new variables Memory Allocation table more;
The 11 step: carry out the memory variable global optimization according to variable Memory Allocation table, memory block operation table by the global optimization module and distribute; A kind of method of global optimization is to adopt genetic algorithm to calculate the optimization solution of Memory Allocation; According to optimum solution that algorithm found more new variables Memory Allocation table and memory block operation table;
The 12 step: according to the update mode of the source and destination operand of each memory block operation in the information of memory block operation table, in conjunction with the addressing mode of describing in the internal memory operation structural information in the 3rd step, select suitable addressing mode by the addressing mode allocation component;
The 13 step: generate the variable chained file according to final variable Memory Allocation table;
The 14 step: interlude is generated the final machine-executable program that generates according to the variable chained file.
Compared with prior art, the present invention has following useful effect:
The invention provides instruction description of general structured internal memory and internal storage structure model collocation method, only need just can finish memory variable allocation optimized based on the user memory model according to the configuration information of user's input.
1) with respect to the Memory Allocation strategy that only was confined to the Double Data memory model in the past, allocation algorithm provided by the present invention is not subjected to this condition restriction, can support the multidata memory model, therefore can be optimized distribution at user-defined memory model.
2) compilation device that proposes of the present invention not only is confined to the source operand conflict analysis of prior art to internal memory operation, and can be simultaneously between the source operand to internal memory operation, carry out conflict analysis so more perfect function between the source and destination operand.
Description of drawings
Fig. 1 is a system schematic of the present invention.
Fig. 2 is the inventive method process flow diagram.
Fig. 3 is the structural representation that is used to describe the normalized form of internal memory operation.
Fig. 4 is the structural representation that is used to describe the normalized form of memory model.
Fig. 5 is a variable Memory Allocation hoist pennants.
Fig. 6 is a memory block operation table synoptic diagram.
Fig. 7 is a variable chained file synoptic diagram.
Fig. 8 be the 3rd class parted pattern typical case cut apart synoptic diagram.
Fig. 9 is the synoptic diagram of cutting apart of the 3rd class parted pattern general type.
Figure 10 be the 4th class parted pattern typical case cut apart synoptic diagram.
Figure 11 is the internal memory global optimization process flow diagram that adopts genetic algorithm.
Figure 12 is that the embodiment Memory Allocation is separated the reparation synoptic diagram.
Embodiment
Below embodiments of the invention are elaborated, present embodiment is being to implement under the prerequisite with the technical solution of the present invention, provided detailed embodiment and concrete operating process, but protection scope of the present invention is not limited to following embodiment.
As shown in Figure 1, present embodiment comprises: front end language analysis unit, the memory variable analytic unit, unit and conversion output unit are eliminated in the internal memory conflict, wherein: front end language analysis unit is connected with memory variable analytic unit and internal memory conflict elimination unit and transmits the intermediate language sequence of being changed by source program, memory model and internal memory operation information and source file functional dependence tree, the memory variable analytic unit is connected and transmission variables Memory Allocation and memory block operation information with internal memory conflict elimination unit, the intermediate language information that the unit is connected with the conversion output unit and transmits the optimization of process Memory Allocation, the final executable code of conversion output unit output are eliminated in the internal memory conflict.
Described front end language analysis unit comprises: the memory configurations assembly, language analysis assembly and documentation function relationship component, wherein: the memory configurations assembly is connected with the memory variable analytic unit and transmits internal memory operation information and memory model by internal memory operation information table and memory model description list, the language analysis assembly is converted to the intermediate language sequence with source program and passes to the memory variable analytic unit respectively and internal memory conflict elimination unit, and the documentation function relationship component is analyzed the functional dependence relation of each file in the source program and passed to the internal memory conflict with the form that the source file functional dependence is set and eliminates the unit.
Described internal memory operation information table comprises: mathematical operation command function, operand number, operand type, addressing mode, data width and executive overhead.
Described memory model description list comprises total size of internal memory, the number of memory partitioning, the start address and the size of each memory partitioning.
Described memory variable analytic unit reads and the analytic record interlude comprises all memory variables and operation thereof, this memory variable analytic unit comprises: memory variable divider and variable analysis assembly, wherein: the memory variable information in the intermediate language sequence that the analysis of memory variable divider is generated by front end language analysis unit also adopts dynamic assignment or static allocation passes to the variable analysis assembly with the form of variable Memory Allocation table, and the memory variable analytic unit is analyzed the internal memory operation information in each program fundamental block by variable Memory Allocation table and recorded in the memory block operation table.
Described dynamic assignment is meant: the memory variable divider by unique determine and record in the variable Memory Allocation table allocation identification with the title of track record memory variable and life cycle information.
Described static allocation is meant: the memory variable divider identifies this variable and is the overall situation and records in the variable Memory Allocation table life cycle.
Described variable Memory Allocation table comprises: variable sign, variable name, variable size, variable life cycle, variable is cut apart and Memory Allocation, and wherein: the variable sign is the unique identification of the different variablees of difference; The size information of variable size record variable; The life cycle of variable record variable life cycle, the distribution that comprises variable constantly and the release of variable constantly, the starting stage for global variable, distributing constantly is that program begins, discharging constantly is that program stops; Fresh information after the variable cutting recording variable process variable cutting transformation module comprises and cuts apart quantity, the variable rename that each is cut apart, and the Memory Allocation of respectively cutting apart.
The relevant information of internal memory operation in each list item sign fundamental block in the described memory block operation table; Relevant information comprises: the number of source internal memory operation number, the size of each source internal memory operation number, types of variables and operand update mode, the number of target internal memory operation number, the size of each target internal memory operation number, type and operand update mode, the type of internal memory operation.
Described internal memory conflict is eliminated the unit and comprised: variable is cut apart assembly, global memory's allocation component and addressing mode selection assembly, wherein: variable is cut apart assembly and is received the intermediate language sequence of being transmitted by the front end linguistic unit, by analyzing the internal memory conflict that exists in the interlude, the variable that meets parted pattern is cut apart and the rename operation, and more new variables Memory Allocation table and memory block operation table information also export global memory's allocation component and addressing mode respectively to and select assembly simultaneously; Global memory's allocation component read after the renewal the Memory Allocation table and according to the size of wherein memory variable, adopt life cycle optimized Algorithm that all memory variables are distributed and export addressing mode to and select assembly; Addressing mode is selected the information of assembly according to the memory block operation table, is that unit is the variable distribution addressing mode of array form with the fundamental block.After the conversion by above-mentioned three assemblies, interlude and variable chained file that unit output is optimized through Memory Allocation eliminated in the internal memory conflict to interlude successively.
Whether described variable is cut apart assembly and is at first searched interlude and exist the common memory that exists because of source operand identical with target operand (perhaps overlapping) to distribute insurmountable internal memory conflict operation, analyze these internal memory operations, and these variablees are carried out specific cutting apart and rename according to parted pattern; Cut apart finish after, set more new variables according to the source file functional dependence, simultaneously updating memory block operations table and variable Memory Allocation table.
Described parted pattern comprises: first kind of model source operand is the internal memory operation number more than 2, the internal memory operation model of cutting apart the type is at first introduced an interim intermediate variable, former internal memory operation can split into the internal memory operation of two 2 operands, the destination operand of first operation is interim intermediate variable, and a source operand of second operation is temporary variable for this reason; Two source operands of second kind of model are same variablees, and internal memory operation is addition, subtraction or division, the internal memory operation model of cutting apart the type be about to shifting function replace addition behaviour, with 0 replace subtraction, when memory value be not 0 usefulness, 1 replacement result of division; Two source operands of the third model are the different elements of an array, the internal memory operation model of cutting apart the type is promptly cut apart array location with the unit of being spaced apart of these two elements, and array that will be separately cuts apart sub-piece and is stitched together, and finally is divided into two sub-pieces of array; Or a source operand of the 4th kind of model internal memory operation is identical with destination operand to be the different elements of an array, and the internal memory operation model of cutting apart the type is about to destination operand and carries out rename.
Described global memory allocation component reads the information of variable Memory Allocation table, distributes cost to calculate the pairing variable allocation strategy of value that variable distributes the cost minimum by global optimization approach according to variable, generates the variable chained file according to allocation strategy.
Described variable distributes cost to be meant: by the execution cost three parts decision of the frequent degree of the size of variable, internal memory operation and internal memory operation, and variable distributes three parts of cost and this to be directly proportional.
Described variable chained file comprises: the sign of each memory partitioning, the start address of memory partitioning, be assigned to the variable information of this internal memory; In order to reduce the fragment of internal memory, the variable that is assigned to each memory partitioning distributes successively according to variable size order from big to small; Variable information comprises variable name, the alignment pattern of variable, the size of variable.
Described addressing mode is selected assembly, and at first the update mode of source operand and destination operand in each memory block in the mark memory block operation table is described according to the internal memory instruction and the addressing mode of user's input, distributes to the suitable addressing mode of each memory block.
Described addressing mode support is normal from increasing and decreasing addressing mode, symmetrical addressing mode, bit reversal addressing mode and cyclic addressing pattern.
Described conversion output unit is input as variable chained list and interlude, by changing the executable program code of final output device.
As shown in Figure 2, above-mentioned compilation device is specifically realized compiling in the following manner:
The first step: the user compiles source program, needs is carried out the variable structure of Memory Allocation scheduling and distributes by the memory variable divider, make compiler can explicit identification all need variablees of Memory Allocation scheduling;
Second step: the user uses present embodiment a kind of succinctly, easy row and internal memory operation and memory model describing method are described the model of internal memory operation, addressing mode and the internal memory of processor support intuitively; The user only need insert normalized form according to standard just can finish description to internal memory operation and memory model;
The 3rd step: the user is extracted the descriptor of internal memory operation and memory model, and be organized into respectively and have good interface, help the internal memory operation message structure that compiler resolves and the memory model of particular memory simultaneously;
The 4th step: the user imports the source program of needs compiling and the file dependence of source program; Source file functional dependence tree of file dependence structure according to user's compiling;
The 5th step: source program converts the general a kind of interlude of compiler to by the compiling of front end linguistic unit, in this step, can implement multiple compile optimization strategy, comprise scheduling, to eliminate the conflict of the factor generation that exists between the internal memory operation according to being correlated with to internal memory operation;
The 6th step: will leave in the variable Memory Allocation table by the memory variable that the memory variable divider distributes;
The 7th step: the internal memory operation message structure that generates in conjunction with the 3rd step, and the variable Memory Allocation table that generates of the 6th step, analyze the operation behavior of corresponding memory variable in the interlude, and make up the memory block operation table with the form of fundamental block;
The 8th step: the action type of each memory variable in the internal memory operation table that generates according to the 7th step, cut apart module by variable and detect each memory block operation list item and whether mate parted pattern, carry out for the memory variable that meets model that variable is cut apart and rename;
The 9th step: set according to the source file functional dependence that the 4th step generated, find fundamental block and the variable of cutting apart rename, and find all references of in all the follow-up fundamental blocks of this fundamental block in the dependence this being cut apart the rename variable, they are also upgraded successively;
The tenth step: repeated for the 8th step and the 9th step, cut apart and rename up to finishing all variablees that satisfy model, the while is new variables Memory Allocation table more;
The 11 step: carry out the memory variable global optimization according to variable Memory Allocation table, memory block operation table by the global optimization module and distribute; A kind of method of global optimization is to adopt genetic algorithm to calculate the optimization solution of Memory Allocation; According to optimum solution that algorithm found more new variables Memory Allocation table and memory block operation table;
The 12 step: according to the update mode of the source and destination operand of each memory block operation in the information of memory block operation table, in conjunction with the addressing mode of describing in the internal memory operation structural information in the 3rd step, select suitable addressing mode by the addressing mode allocation component;
The 13 step: generate the variable chained file according to final variable Memory Allocation table;
The 14 step: interlude is generated the final machine-executable program that generates according to the variable chained file.
Be illustrated in figure 1 as the method flow diagram of present embodiment, that is: the user describes step, and the user carries out the description of internal memory operation and memory model by internal memory operation and the memory model description list that present embodiment provides; Front end language conversion step converts source program to unified compiler intermediate language; The memory variable analytical procedure is analyzed middle program, writes down the information of all memory variables, and the operation information of each memory block; Internal memory conflict removal process, the internal memory operation of search in the interlude carries out variable respectively according to parted pattern and cuts apart with rename, global memory's allocation optimized and generate the link information of each memory variable and analyze in the memory block operation update mode of source operand and target operand and select suitable addressing mode; Conversion output step converts interlude to machine executable code according to link information.
As Fig. 3~shown in Figure 14, in this example, the user adopts the mode of filling a vacancy and filling up a form to the description of internal memory operation and memory model configuration, wherein the synoptic diagram of internal memory operation description list and memory model description list respectively as shown in Figure 3 and Figure 4, in this example the user to dispose be described below shown in:
User's processor is supported 32 multiplying order, and this instruction has 3 operands, and all is the internal memory operation number, supports common from adding and subtracting addressing.The internal memory operation description list is as follows:
Figure BDA0000052119300000081
The memory model structure of handling is, the total big or small 64kB of datarams, and start address is 0x2000, has 4 memory partitionings, each subregion 16k.The memory model description list is as follows:
Figure BDA0000052119300000091
A cutting transformation that gordian technique is a variable of present embodiment.Variable is cut apart to change and is carried out variable by parted pattern and cut apart rename; These models are respectively, source operand is the parted pattern of the internal memory operation number more than 2, two source operands are parted patterns of same variable, two source operands are parted patterns of the different elements of an array, or a source operand of internal memory operation identical with destination operand be the parted pattern of the different elements of an array.For first kind and second kind of parted pattern, the equivalence that only relates to the operation of this memory block is replaced, and does not have the rename of variable.Below, inquire into the third and the 4th kind of variable parted pattern.
[two source operands are parted patterns of the different elements of an array]
A typical example of this model is a C code array manipulation as shown in Figure 8.In this section code, A[i], F[i] all be the integer element of 32 bits, wherein array A[i] and value by array F[i] and F[i-1] multiplying each other obtains.Describe according to user-defined internal memory operation, can know that this operation will be converted into the multiplying order of one 32 bit, so this operation will record the memory block operation table, array A and array B also will be recorded in the variable Memory Allocation table simultaneously.But, can cause the internal memory conflict because two source operands of this multiplying order are positioned at the continuous position of internal memory.According to originally cutting apart model, two operands be spaced apart 1, therefore can split into two by F, as shown in Figure 9, wherein, the element of odd index is formed one, and another piece of the composition of even index is in time this deposit data of two of Memory Allocation generation that do not conflict when different positions can guarantee to carry out multiply operation.
General situation, as shown in figure 10, the address of two source operands is spaced apart k, and be that size is cut apart array F this moment with k, can also be implemented in the visit moment two source operands and can not clash.
[or a source operand is identical with destination operand to be the parted pattern of the different elements of an array]
An example that meets this model as shown in figure 11, the source operand and the destination operand of internal memory multiplication are all F[i], when the performance period of multiplication is 3, read F[i+3] and write F[i] can produce the internal memory conflict.According to parted pattern, the F[i of target operand] understand by RNTO F_1[i], in storage allocation, F leaves different positions in F_1 and can guarantee the internal memory conflict can not take place in the moment of carrying out multiply operation, cuts apart synoptic diagram as shown in figure 12.
When a source operand of internal memory multiplication is the different element of same array with destination operand, for example source operand is F[i], destination operand is F[i+2], similar to the analysis that the analysis of this model is identical with the source and destination operand, by destination operand is called by name, can eliminate access conflict.
Behind the variable cutting transformation, the situation that the source and destination operand produces conflict can be resolved.File dependence according to user's input generates source file functional dependence tree below, as shown in Figure 7.According to source file functional dependence tree, compilation device will upgrade all subsequent node to being cut apart quoting of variable, simultaneously updating memory allocation table and memory block operation table.
Global memory's allocation component can adopt specific optimized Algorithm.A kind of algorithm of global optimization is a genetic algorithm, as shown in figure 13.Generate a set of dispense at first, at random and separate set.Calculate the cost that each distribution is separated according to the execution cost of internal memory operation, the number of times of internal memory operation and the size of variable.For separating of preventing to obtain is wrong, a for example a certain memory partitioning overflows, and need separate the distribution that generates and repair.The strategy of repairing has a lot, a kind of mode as shown in figure 14, can the variable taking-up in the memory partitioning that this method will be overflowed be tested this variable and be placed on other subregions and not cause conflict, and then redistribute this variable.After Global Algorithm was finished and obtained optimum distribution and separate, more new variables Memory Allocation table and memory block operation table generated the variable chained file according to variable Memory Allocation table simultaneously; A kind of organizational form of variable chained file as shown in Figure 7.
After the global assignment module, the Memory Allocation position of all variablees all is determined, in the memory block operation table, can analyze the update mode of source operand and destination operand, in conjunction with the addressing mode of operand in the internal memory operation description list of user's input, can the suitable addressing mode of storage allocation variable.
At last, the output converting unit is converted into final machine-executable program according to the variable chained file with interlude.
Compared with prior art, present embodiment has following useful effect:
Present embodiment provides the general instruction description of structuring internal memory and internal storage structure model collocation method, only needs just can finish memory variable allocation optimized based on the user memory model according to the configuration information of User input.
1) with respect to the Memory Allocation Strategy that only was confined to the Double Data memory model in the past, the allocation algorithm that present embodiment provides is not subjected to this condition restriction, can support majority according to memory model, therefore can be optimized distribution for user-defined memory model.
2) compilation device of present embodiment proposition not only is confined to prior art to the source operand conflict analysis of internal memory operation, and can be simultaneously between the source operand to internal memory operation, carry out conflict analysis so more perfect function between the source and destination operand.

Claims (6)

1. eliminate the compilation device that internal storage access conflicts for one kind, it is characterized in that, comprise: front end language analysis unit, the memory variable analytic unit, unit and conversion output unit are eliminated in the internal memory conflict, wherein: front end language analysis unit is connected with memory variable analytic unit and internal memory conflict elimination unit and transmits the intermediate language sequence of being changed by source program, memory model and internal memory operation information and source file functional dependence tree, the memory variable analytic unit is connected and transmission variables Memory Allocation and memory block operation information with internal memory conflict elimination unit, the intermediate language information that the unit is connected with the conversion output unit and transmits the optimization of process Memory Allocation, the final executable code of conversion output unit output are eliminated in the internal memory conflict.
2. the compilation device of elimination internal storage access conflict according to claim 1, it is characterized in that, described front end language analysis unit comprises: the memory configurations assembly, language analysis assembly and documentation function relationship component, wherein: the memory configurations assembly is connected with the memory variable analytic unit and transmits internal memory operation information and memory model by internal memory operation information table and memory model description list, the language analysis assembly is converted to the intermediate language sequence with source program and passes to the memory variable analytic unit respectively and internal memory conflict elimination unit, and the documentation function relationship component is analyzed the functional dependence relation of each file in the source program and passed to the internal memory conflict with the form that the source file functional dependence is set and eliminates the unit; Described internal memory operation information table comprises: mathematical operation command function, operand number, operand type, addressing mode, data width and executive overhead; Described memory model description list comprises total size of internal memory, the number of memory partitioning, the start address and the size of each memory partitioning.
3. the compilation device of elimination internal storage access conflict according to claim 1, it is characterized in that, described memory variable analytic unit reads and the analytic record interlude comprises all memory variables and operation thereof, this memory variable analytic unit comprises: memory variable divider and variable analysis assembly, wherein: the memory variable information in the intermediate language sequence that the analysis of memory variable divider is generated by front end language analysis unit also adopts dynamic assignment or static allocation passes to the variable analysis assembly with the form of variable Memory Allocation table, the memory variable analytic unit is analyzed the internal memory operation information in each program fundamental block by variable Memory Allocation table and recorded in the memory block operation table, wherein: dynamic assignment is meant: the memory variable divider by unique determine and record in the variable Memory Allocation table allocation identification with the title of track record memory variable and life cycle information; Static allocation is meant: the memory variable divider identifies this variable and is the overall situation and records in the variable Memory Allocation table life cycle; Described variable Memory Allocation table comprises: variable sign, variable name, variable size, variable life cycle, variable is cut apart and Memory Allocation, and wherein: the variable sign is the unique identification of the different variablees of difference; The size information of variable size record variable; The life cycle of variable record variable life cycle, the distribution that comprises variable constantly and the release of variable constantly, the starting stage for global variable, distributing constantly is that program begins, discharging constantly is that program stops; Fresh information after the variable cutting recording variable process variable cutting transformation module comprises and cuts apart quantity, the variable rename that each is cut apart, and the Memory Allocation of respectively cutting apart; The relevant information of internal memory operation in each list item sign fundamental block in the described memory block operation table; Relevant information comprises: the number of source internal memory operation number, the size of each source internal memory operation number, types of variables and operand update mode, the number of target internal memory operation number, the size of each target internal memory operation number, type and operand update mode, the type of internal memory operation.
4. the compilation device of elimination internal storage access conflict according to claim 1, it is characterized in that, described internal memory conflict is eliminated the unit and comprised: variable is cut apart assembly, global memory's allocation component and addressing mode are selected assembly, wherein: variable is cut apart assembly and is received the intermediate language sequence of being transmitted by the front end linguistic unit, by analyzing the internal memory conflict that exists in the interlude, the variable that meets parted pattern is cut apart and the rename operation, and more new variables Memory Allocation table and memory block operation table information also export global memory's allocation component and addressing mode respectively to and select assembly simultaneously; Global memory's allocation component read after the renewal the Memory Allocation table and according to the size of wherein memory variable, adopt life cycle optimized Algorithm that all memory variables are distributed and export addressing mode to and select assembly; Addressing mode is selected the information of assembly according to the memory block operation table, with the fundamental block is that unit is the variable distribution addressing mode of array form, interlude is successively after the conversion by above-mentioned three assemblies, interlude and variable chained file that unit output is optimized through Memory Allocation eliminated in the internal memory conflict, and this variable chained file comprises: the sign of each memory partitioning, the start address of memory partitioning, be assigned to the variable information of this internal memory; In order to reduce the fragment of internal memory, the variable that is assigned to each memory partitioning distributes successively according to variable size order from big to small; Variable information comprises variable name, the alignment pattern of variable, the size of variable.
5. the compilation device of elimination internal storage access conflict according to claim 1 is characterized in that described conversion output unit is input as variable chained list and interlude, by changing the executable program code of final output device.
6. the implementation method according to the described device of above-mentioned arbitrary claim is characterized in that, may further comprise the steps:
The first step: the user compiles source program, needs is carried out the variable structure of Memory Allocation scheduling and distributes by the memory variable divider, make compiler can explicit identification all need variablees of Memory Allocation scheduling;
Second step: the user use of the present invention a kind of succinctly, easy row and internal memory operation and memory model describing method are described the model of internal memory operation, addressing mode and the internal memory of processor support intuitively; The user only need insert normalized form according to standard just can finish description to internal memory operation and memory model;
The 3rd step: the user is extracted the descriptor of internal memory operation and memory model, and be organized into respectively and have good interface, help the internal memory operation message structure that compiler resolves and the memory model of particular memory simultaneously;
The 4th step: the user imports the source program of needs compiling and the file dependence of source program; Source file functional dependence tree of file dependence structure according to user's compiling;
The 5th step: source program converts the general a kind of interlude of compiler to by the compiling of front end linguistic unit, in this step, can implement multiple compile optimization strategy, comprise scheduling, to eliminate the conflict of the factor generation that exists between the internal memory operation according to being correlated with to internal memory operation;
The 6th step: will leave in the variable Memory Allocation table by the memory variable that the memory variable divider distributes;
The 7th step: the internal memory operation message structure that generates in conjunction with the 3rd step, and the variable Memory Allocation table that generates of the 6th step, analyze the operation behavior of corresponding memory variable in the interlude, and make up the memory block operation table with the form of fundamental block;
The 8th step: the action type of each memory variable in the internal memory operation table that generates according to the 7th step, cut apart module by variable and detect each memory block operation list item and whether mate parted pattern, carry out for the memory variable that meets model that variable is cut apart and rename;
The 9th step: set according to the source file functional dependence that the 4th step generated, find fundamental block and the variable of cutting apart rename, and find all references of in all the follow-up fundamental blocks of this fundamental block in the dependence this being cut apart the rename variable, they are also upgraded successively;
The tenth step: repeated for the 8th step and the 9th step, cut apart and rename up to finishing all variablees that satisfy model, the while is new variables Memory Allocation table more;
The 11 step: carry out the memory variable global optimization according to variable Memory Allocation table, memory block operation table by the global optimization module and distribute; A kind of method of global optimization is to adopt genetic algorithm to calculate the optimization solution of Memory Allocation; According to optimum solution that algorithm found more new variables Memory Allocation table and memory block operation table;
The 12 step: according to the update mode of the source and destination operand of each memory block operation in the information of memory block operation table, in conjunction with the addressing mode of describing in the internal memory operation structural information in the 3rd step, select suitable addressing mode by the addressing mode allocation component;
The 13 step: generate the variable chained file according to final variable Memory Allocation table;
The 14 step: interlude is generated the final machine-executable program that generates according to the variable chained file.
CN201110073683.XA 2010-12-08 2011-03-25 Compiling device for eliminating memory access conflict and realizing method thereof Expired - Fee Related CN102193811B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110073683.XA CN102193811B (en) 2010-12-08 2011-03-25 Compiling device for eliminating memory access conflict and realizing method thereof

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201010577313.5 2010-12-08
CN2010105773135A CN102043659A (en) 2010-12-08 2010-12-08 Compiling device for eliminating memory access conflict and implementation method thereof
CN201110073683.XA CN102193811B (en) 2010-12-08 2011-03-25 Compiling device for eliminating memory access conflict and realizing method thereof

Publications (2)

Publication Number Publication Date
CN102193811A true CN102193811A (en) 2011-09-21
CN102193811B CN102193811B (en) 2014-04-16

Family

ID=43909817

Family Applications (2)

Application Number Title Priority Date Filing Date
CN2010105773135A Withdrawn CN102043659A (en) 2010-12-08 2010-12-08 Compiling device for eliminating memory access conflict and implementation method thereof
CN201110073683.XA Expired - Fee Related CN102193811B (en) 2010-12-08 2011-03-25 Compiling device for eliminating memory access conflict and realizing method thereof

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN2010105773135A Withdrawn CN102043659A (en) 2010-12-08 2010-12-08 Compiling device for eliminating memory access conflict and implementation method thereof

Country Status (1)

Country Link
CN (2) CN102043659A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106506594A (en) * 2016-09-30 2017-03-15 科大讯飞股份有限公司 Parallel computing resource allocation method and device

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2015121749A (en) * 2012-11-07 2016-12-27 Конинклейке Филипс Н.В. COMPILATOR GENERATING NON-OPERATING CODE
CN103235745B (en) * 2013-03-27 2016-08-10 华为技术有限公司 A kind of address conflict detecting method and device
CN103955406A (en) * 2014-04-14 2014-07-30 浙江大学 Super block-based based speculation parallelization method
CN108874532B (en) * 2017-06-01 2020-11-06 北京旷视科技有限公司 Memory allocation method and device
CN108197027B (en) * 2017-12-29 2021-07-16 广州景派科技有限公司 Software performance optimization method, storable medium, computer program
CN110895512A (en) * 2018-09-13 2020-03-20 普天信息技术有限公司 Memory management method
CN110471862A (en) * 2019-07-16 2019-11-19 杭州电子科技大学 A kind of the address of variable distribution system and method for programmable controller
CN112631610B (en) * 2020-11-30 2022-04-26 上海交通大学 Method for eliminating memory access conflict for data reuse of coarse-grained reconfigurable structure
CN115826946B (en) * 2023-02-17 2023-05-12 苏州浪潮智能科技有限公司 Program exception vector space optimization system, method, equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1740989A (en) * 2004-08-23 2006-03-01 中兴通讯股份有限公司 Universal inserting system virtual page attribute management method
CN1823322A (en) * 2003-07-15 2006-08-23 可递有限公司 Shared code caching method and apparatus for program code conversion
US20070169059A1 (en) * 2005-12-13 2007-07-19 Poseidon Design Systems Inc. Compiler method for extracting and accelerator template program

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1823322A (en) * 2003-07-15 2006-08-23 可递有限公司 Shared code caching method and apparatus for program code conversion
CN1740989A (en) * 2004-08-23 2006-03-01 中兴通讯股份有限公司 Universal inserting system virtual page attribute management method
US20070169059A1 (en) * 2005-12-13 2007-07-19 Poseidon Design Systems Inc. Compiler method for extracting and accelerator template program

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106506594A (en) * 2016-09-30 2017-03-15 科大讯飞股份有限公司 Parallel computing resource allocation method and device

Also Published As

Publication number Publication date
CN102043659A (en) 2011-05-04
CN102193811B (en) 2014-04-16

Similar Documents

Publication Publication Date Title
CN102193811B (en) Compiling device for eliminating memory access conflict and realizing method thereof
CN100465895C (en) Compiler, compilation method, and compilation program
Wang et al. Performance prediction for apache spark platform
Simitsis et al. Optimizing analytic data flows for multiple execution engines
US8930919B2 (en) Modernization of legacy software systems based on modeled dependencies
US20160098662A1 (en) Apparatus and Method for Scheduling Distributed Workflow Tasks
US9823911B2 (en) Method and apparatus for compiling code based on a dependency tree
CN102750150B (en) Method for automatically generating dense matrix multiplication assembly code based on x86 architecture
CN105786864A (en) Offline analysis method for massive data
Ogilvie et al. Fast automatic heuristic construction using active learning
CN106126601A (en) A kind of social security distributed preprocess method of big data and system
WO2012099625A1 (en) Architecture optimizer
CN109408493A (en) A kind of moving method and system of data source
CN103645930B (en) Assembly level is across the construction method of file Scheduling Framework
CN107851003A (en) For improving the field specialization system and method for program feature
CN102289362A (en) Segmented symbolic execution device and working method thereof
CN115756451A (en) Method, device, equipment and storage medium for reusing multi-project code file
CN105404611B (en) A kind of automatic selecting method of more computing engines based on matrix model
CN104750533A (en) C program compiling method and C program compiler
CN117009038B (en) Graph computing platform based on cloud native technology
Popov et al. Piecewise holistic autotuning of compiler and runtime parameters
Chiba et al. Towards selecting best combination of sql-on-hadoop systems and jvms
US11262989B2 (en) Automatic generation of efficient vector code with low overhead in a time-efficient manner independent of vector width
Xiong et al. ShenZhen transportation system (SZTS): a novel big data benchmark suite
Shi et al. Performance models of data parallel DAG workflows for large scale data analytics

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140416

Termination date: 20170325

CF01 Termination of patent right due to non-payment of annual fee