CN104572234A - Method for generating source code for a parallel computing architecture, and source-to-source compiler

Publication number
CN104572234A
Authority
CN
China
Application number
CN201410841579.4A
Other languages
Chinese (zh)
Inventor
崔世强
叶寒栋
胡焰林
Original Assignee
杭州华为数字技术有限公司
Application filed by 杭州华为数字技术有限公司
Priority to CN201410841579.4A
Publication of CN104572234A

Abstract

An embodiment of the invention provides a method for generating source code for a parallel computing architecture, and a source-to-source compiler, which can improve both efficiency and portability when such source code is generated. The method includes: receiving a first source code, which is vector-matrix source code input by a user; parsing the first source code to obtain a first abstract syntax tree (AST) that is independent of the parallel computing architecture; obtaining information about the parallel computing architecture and, according to that information, converting the first AST into a second AST that is specific to the parallel computing architecture; and converting the second AST into a second source code for the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile. The method and the source-to-source compiler are applicable to the computer field.

Description

Method for generating source code for a parallel computing architecture, and source-to-source compiler

Technical Field

The present invention relates to the computer field, and in particular to a method for generating source code for a parallel computing architecture and to a source-to-source compiler.

Background

Single instruction, multiple data (SIMD) refers to a single instruction that operates on multiple data items in parallel. For example, the SIMD instruction corresponding to "add" can perform eight 16-bit additions in parallel.
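The "add" example can be modelled in scalar code. The following is only a minimal sketch of the semantics, not of any particular hardware instruction; the function name `simd_add_u16x8` and the wraparound behaviour on overflow are assumptions made for illustration.

```python
# Scalar model of one SIMD "add" over eight 16-bit lanes.
# The function name and the wrap-on-overflow rule are illustrative assumptions.
LANES = 8
MASK = 0xFFFF  # keep every lane within 16 bits


def simd_add_u16x8(a, b):
    """Add two 8-lane vectors lane by lane, wrapping each lane at 16 bits."""
    if len(a) != LANES or len(b) != LANES:
        raise ValueError("expected exactly 8 lanes per operand")
    return [(x + y) & MASK for x, y in zip(a, b)]


a = [1, 2, 3, 4, 5, 6, 7, 65535]
b = [10, 20, 30, 40, 50, 60, 70, 1]
result = simd_add_u16x8(a, b)
print(result)
```

A real SIMD unit performs all eight additions with one instruction; the list comprehension here only mimics the lane-wise result.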

At present, computer hardware implements a variety of parallelization strategies that support SIMD. To make full use of these performance features, software can apply source-to-source parallelizing transformations, choosing a different parallelization strategy for each kind of computation according to the characteristics of the computing architecture on which the program runs. This generally involves "source-to-source compiler" technology.

In the prior art, there is a scheme that uses source-to-source compiler technology to generate parallel SIMD code for an arbitrary parallel computing architecture. As shown in Fig. 1, the scheme predefines an annotation standard that is independent of the parallel computing architecture; using this standard, the user can specify the environment for which source code is to be generated. It works as follows. The source-to-source compiler 120 of an automatic parallel code generator (APCG) receives the APCG-annotated parallel application source code 110, evaluates the annotations in it against the predefined annotation standard, and generates native parallel application source code for each annotated parallel computing architecture, such as x86 parallel application source code 130, Cell Power Processing Unit (PPU) parallel application source code 132 and Cell Synergistic Processing Unit (SPU) parallel application source code 134. The native parallel application source codes 130, 132 and 134 are then received by the x86 compiler 140, the Cell PPU compiler 142 and the Cell SPU compiler 144 respectively, which generate the object code for each parallel computing architecture: x86 parallel application object code 150, Cell PPU parallel application object code 152 and Cell SPU parallel application object code 154.

Although this scheme achieves a degree of high-level software abstraction through the APCG-annotated parallel application source code, and thereby makes the source-to-source compiler cross-platform, the APCG-annotated parallel application source code 110 corresponds mechanically, one to one, with the native source code. A programmer must be familiar with the native source code and rewrite its algorithm as the corresponding APCG-annotated parallel application source code 110. The conversion is therefore inefficient and poorly portable, and the performance features mentioned above cannot be used effectively.

Summary of the Invention

Embodiments of the present invention provide a method for generating source code for a parallel computing architecture, and a source-to-source compiler, to at least solve the low efficiency and poor portability of the prior art when source-to-source compiler technology is used to generate source code for a parallel computing architecture, and thereby to improve efficiency and portability when such source code is generated.

To achieve the above object, embodiments of the present invention adopt the following technical solutions:

In a first aspect, a method for generating source code for a parallel computing architecture is provided, the method comprising:

receiving a first source code, the first source code being vector-matrix source code input by a user;

parsing the first source code to obtain a first abstract syntax tree (AST), wherein the first AST is independent of the parallel computing architecture;

obtaining information about the parallel computing architecture and, according to that information, converting the first AST into a second AST, wherein the second AST is specific to the parallel computing architecture; and

converting the second AST into a second source code for the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.

In a first possible implementation of the first aspect, with reference to the first aspect, parsing the first source code to obtain the first AST, wherein the first AST is independent of the parallel computing architecture, comprises:

parsing the first source code to obtain an AST of vector-matrix operations; and

converting, according to a preset rule, the AST of vector-matrix operations into an AST of element operations;

and converting the first AST into the second AST according to the information about the parallel computing architecture comprises:

converting, according to the information about the parallel computing architecture, the AST of element operations into the second AST.

In a second possible implementation of the first aspect, with reference to the first possible implementation of the first aspect, converting the AST of vector-matrix operations into the AST of element operations according to the preset rule comprises:

looking up, in a preset function library of element operations, the function body corresponding to the vector-matrix operation; and

replacing the AST of the vector-matrix operation with the AST of the corresponding function body, to obtain the AST of element operations.

In a third possible implementation of the first aspect, with reference to the first possible implementation of the first aspect, converting the AST of vector-matrix operations into the AST of element operations according to the preset rule comprises:

decomposing the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to that AST; and

replacing the AST of the vector-matrix operation with the AST of the corresponding element-operation function, to obtain the AST of element operations.

In a fourth possible implementation of the first aspect, with reference to any of the first to third possible implementations of the first aspect, after the AST of vector-matrix operations is converted into the AST of element operations according to the preset rule, and before the AST of element operations is converted into the second AST according to the information about the parallel computing architecture, the method further comprises:

optimizing the AST of element operations to obtain an optimized AST of element operations;

and converting the AST of element operations into the second AST according to the information about the parallel computing architecture comprises:

converting, according to the information about the parallel computing architecture, the optimized AST of element operations into the second AST.

In a second aspect, a source-to-source compiler is provided, comprising a receiving unit, a parsing unit, an obtaining unit, a first conversion unit and a second conversion unit.

The receiving unit is configured to receive a first source code, the first source code being vector-matrix source code input by a user.

The parsing unit is configured to parse the first source code to obtain a first abstract syntax tree (AST), wherein the first AST is independent of the parallel computing architecture.

The obtaining unit is configured to obtain information about the parallel computing architecture.

The first conversion unit is configured to convert, according to the information about the parallel computing architecture, the first AST into a second AST, wherein the second AST is specific to the parallel computing architecture.

The second conversion unit is configured to convert the second AST into a second source code for the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.

In a first possible implementation of the second aspect, with reference to the second aspect, the parsing unit comprises a parsing module and a conversion module.

The parsing module is configured to parse the first source code to obtain an AST of vector-matrix operations.

The conversion module is configured to convert, according to a preset rule, the AST of vector-matrix operations into an AST of element operations.

The first conversion unit is specifically configured to:

convert, according to the information about the parallel computing architecture, the AST of element operations into the second AST.

In a second possible implementation of the second aspect, with reference to the first possible implementation of the second aspect, the conversion module is specifically configured to:

look up, in a preset function library of element operations, the function body corresponding to the vector-matrix operation; and

replace the AST of the vector-matrix operation with the AST of the corresponding function body, to obtain the AST of element operations.

In a third possible implementation of the second aspect, with reference to the first possible implementation of the second aspect, the conversion module is specifically configured to:

decompose the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to that AST; and

replace the AST of the vector-matrix operation with the AST of the corresponding element-operation function, to obtain the AST of element operations.

In a fourth possible implementation of the second aspect, with reference to any of the first to third possible implementations of the second aspect, the source-to-source compiler further comprises an optimization unit.

The optimization unit is configured to optimize the AST of element operations, to obtain an optimized AST of element operations, after the conversion module converts the AST of vector-matrix operations into the AST of element operations according to the preset rule and before the first conversion unit converts the AST of element operations into the second AST according to the information about the parallel computing architecture.

The first conversion unit is specifically configured to:

convert, according to the information about the parallel computing architecture, the optimized AST of element operations into the second AST.

In the prior art, the source-to-source compiler receives APCG-annotated parallel application source code. Because that source code corresponds mechanically, one to one, with the native source code, the programmer must be familiar with the native source code and rewrite its algorithm as the corresponding APCG-annotated parallel application source code, so efficiency is low and portability is poor. In the embodiments of the present invention, after receiving the vector-matrix source code input by the user, the source-to-source compiler parses it to obtain an AST that is independent of the parallel computing architecture; obtains the information about the parallel computing architecture according to the identifier of the parallel computing architecture input by the user; converts, according to that information, the architecture-independent AST into an AST specific to the parallel computing architecture; and finally converts the architecture-specific AST into the second source code for the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile. In other words, the source-to-source compiler converts the vector-matrix source code input by the user into source code that the compiler of the parallel computing architecture can compile. On the one hand, compared with the prior art, vector-matrix source code is simple and highly abstract: the programmer does not need to be familiar with the algorithm of the native source code and only needs a basic grounding in matrix operations, so efficiency is higher. On the other hand, because vector-matrix source code is independent of the parallel computing architecture, the programmer does not need to rewrite the algorithm of the native source code as APCG-annotated parallel application source code; that is, the embodiments of the present invention convert architecture-independent source code into architecture-specific source code, so portability is better.

Brief Description of the Drawings

Fig. 1 is a schematic architecture diagram of source-to-source translation in the prior art;

Fig. 2 is a first schematic flowchart of the method for generating source code for a parallel computing architecture according to an embodiment of the present invention;

Fig. 3 is a schematic architecture diagram of source-to-source translation according to an embodiment of the present invention;

Fig. 4 is a second schematic flowchart of the method for generating source code for a parallel computing architecture according to an embodiment of the present invention;

Fig. 5 is a third schematic flowchart of the method for generating source code for a parallel computing architecture according to an embodiment of the present invention;

Fig. 6 is a fourth schematic flowchart of the method for generating source code for a parallel computing architecture according to an embodiment of the present invention;

Fig. 7 is a fifth schematic flowchart of the method for generating source code for a parallel computing architecture according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of the source-to-source compiler parsing the first source code to obtain the AST of element operations according to an embodiment of the present invention;

Fig. 9 is a first schematic structural diagram of the source-to-source compiler according to an embodiment of the present invention;

Fig. 10 is a second schematic structural diagram of the source-to-source compiler according to an embodiment of the present invention;

Fig. 11 is a third schematic structural diagram of the source-to-source compiler according to an embodiment of the present invention;

Fig. 12 is a fourth schematic structural diagram of the source-to-source compiler according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

To describe the technical solutions clearly, the embodiments of the present invention use words such as "first" and "second" to distinguish identical or similar items whose functions and effects are substantially the same. A person skilled in the art will understand that words such as "first" and "second" do not limit quantity or execution order.

Embodiment 1

This embodiment of the present invention provides a method for generating source code for a parallel computing architecture, applied in particular to a source-to-source compiler on a general-purpose computer. As shown in Fig. 2, the method comprises:

S201: The source-to-source compiler receives a first source code, the first source code being vector-matrix source code input by a user.

Specifically, the vector-matrix source code is written in an extended matrix language, which may be based on a high-level language such as C or Java. This extended matrix language is independent of the parallel computing architecture. It is readable and easy to use, and the code is concise, clear and close to the literal meaning of the algorithm, so the amount of code is greatly reduced.

S202: The source-to-source compiler parses the first source code to obtain a first abstract syntax tree (AST), wherein the first AST is independent of the parallel computing architecture.

Specifically, the parsing may include support for the extensions used in the first source code, such as matrix types, generic data types, new operations, subtrees and templates; the embodiments of the present invention impose no specific limitation on this.

S203: The source-to-source compiler obtains information about the parallel computing architecture.

Specifically, the source-to-source compiler may obtain the information about the parallel computing architecture according to an identifier of the parallel computing architecture input by the user. The information may include, for example, the types and operations that the parallel computing architecture supports; the embodiments of the present invention impose no specific limitation on this.
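Step S203 can be pictured as a table lookup keyed by the user-supplied identifier. The sketch below is a hypothetical illustration: the `ArchInfo` record, the identifier strings and the table entries are all assumptions, and the listed types and operations are a small illustrative subset rather than a complete feature list for either architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchInfo:
    """Hypothetical record of what a parallel computing architecture supports."""
    name: str
    vector_bits: int    # width of one vector register
    lane_types: tuple   # element types usable in vector lanes (subset)
    operations: tuple   # vector operations available (subset)


# Illustrative table only; a real compiler would describe far more detail.
ARCH_TABLE = {
    "armv8-neon": ArchInfo("ARMv8 NEON", 128,
                           ("int16", "int32", "float32"),
                           ("add", "mul", "load", "store")),
    "x86-sse2": ArchInfo("x86 SSE2", 128,
                         ("float32", "float64"),
                         ("add", "mul", "load", "store")),
}


def get_arch_info(identifier):
    """Return the architecture information for a user-supplied identifier."""
    key = identifier.strip().lower()
    if key not in ARCH_TABLE:
        raise KeyError(f"unknown parallel computing architecture: {identifier!r}")
    return ARCH_TABLE[key]


def lanes_for(info, element_bits):
    """Number of elements of the given width that fit in one vector register."""
    return info.vector_bits // element_bits


info = get_arch_info(" ARMv8-NEON ")
print(info.name, lanes_for(info, 32))
```

Keeping this information in data rather than in code is one way to let the later conversion step stay generic while the target changes.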

S204: The source-to-source compiler converts, according to the information about the parallel computing architecture, the first AST into a second AST, wherein the second AST is specific to the parallel computing architecture.

S205: The source-to-source compiler converts the second AST into a second source code for the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.

Specifically, the second source code is source code that the compiler of the parallel computing architecture can compile; after it is compiled by the compiler of the corresponding parallel computing architecture, executable code for that architecture is generated, as shown in Fig. 3. The parallel computing architecture may be an Advanced RISC Machine (ARM) NEON architecture, where RISC stands for reduced instruction set computer, for example the ARMv7 NEON architecture or the ARMv8 NEON architecture. ARM NEON is an extension of the ARM processor architecture that allows a single instruction to operate on multiple data items in parallel, which can significantly increase program execution speed.

Of course, the parallel computing architecture in the embodiments of the present invention may also be another architecture, such as the x86 architecture or the MIPS (Microprocessor without Interlocked Pipelined Stages) architecture; the embodiments of the present invention impose no specific limitation on this.
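Steps S201 to S205 can be sketched end to end on a deliberately tiny input. Everything below is an assumption made for illustration: the one-line `dst = lhs op rhs` input form, the dictionary-based AST and the function names are hypothetical, and only the NEON intrinsic name `vaddq_f32` is taken from the real `arm_neon.h` header.

```python
def parse(first_source):
    """S202: parse 'dst = lhs op rhs' into an architecture-independent AST."""
    dst, expr = [part.strip() for part in first_source.split("=")]
    lhs, op, rhs = expr.split()
    return {"kind": "assign", "dst": dst, "op": op, "lhs": lhs, "rhs": rhs}


def get_arch_info(identifier):
    """S203: look up what the target supports (hypothetical one-entry table)."""
    table = {"armv8-neon": {"add": "vaddq_f32", "lanes": 4}}
    return table[identifier]


def to_arch_ast(first_ast, info):
    """S204: attach architecture-specific choices to obtain the second AST."""
    generic_ops = {"+": "add"}
    generic = generic_ops[first_ast["op"]]
    return {**first_ast, "kind": "vector_assign",
            "intrinsic": info[generic], "lanes": info["lanes"]}


def emit(second_ast):
    """S205: print the second AST as source text for the target compiler."""
    node = second_ast
    return f"{node['dst']} = {node['intrinsic']}({node['lhs']}, {node['rhs']});"


def translate(first_source, arch_id):
    """S201 to S205 chained together."""
    first_ast = parse(first_source)
    info = get_arch_info(arch_id)
    second_ast = to_arch_ast(first_ast, info)
    return emit(second_ast)


second_source = translate("c = a + b", "armv8-neon")
print(second_source)
```

The point of the split is that `parse` never sees the target, so the same first AST could be handed to a different architecture table without changing the front end.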

Further, as shown in Fig. 4, in this embodiment of the present invention, the step in which the source-to-source compiler parses the first source code to obtain the first AST, the first AST being independent of the parallel computing architecture (step S202), may specifically comprise:

S202a: The source-to-source compiler parses the first source code to obtain an AST of vector-matrix operations.

S202b: The source-to-source compiler converts, according to a preset rule, the AST of vector-matrix operations into an AST of element operations.

The step in which the source-to-source compiler converts the first AST into the second AST according to the information about the parallel computing architecture, the second AST being specific to the parallel computing architecture (step S204), specifically comprises:

S204a: The source-to-source compiler converts, according to the information about the parallel computing architecture, the AST of element operations into the second AST, wherein the second AST is specific to the parallel computing architecture.

That is, in this embodiment of the present invention, when parsing the first source code to obtain the first AST, the source-to-source compiler first obtains the AST of high-level vector-matrix operations and then, according to a preset rule, converts it into the AST of low-level element operations; the embodiments of the present invention impose no specific limitation on this.

In one possible implementation, as shown in Fig. 5, the step in which the source-to-source compiler converts the AST of vector-matrix operations into the AST of element operations according to the preset rule (step S202b) comprises:

S202b1: The source-to-source compiler looks up, in a preset function library of element operations, the function body corresponding to the vector-matrix operation.

S202b2: The source-to-source compiler replaces the AST of the vector-matrix operation with the AST of the corresponding function body, to obtain the AST of element operations.

That is, a Matrix Lib (matrix library) is stored in advance in the source-to-source compiler; in it, high-level vector-matrix operations are implemented as simple functions, for example in C. The source-to-source compiler can use the Matrix Lib to convert the AST of vector-matrix operations into the AST of element operations.
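Steps S202b1 and S202b2 can be sketched with a small table standing in for the Matrix Lib. This is a hypothetical illustration: the tuple-based AST shape, the `MATRIX_LIB` table and the function names are assumptions, and the library bodies are built in Python here rather than parsed from C functions as the description suggests.

```python
# AST nodes are plain tuples (an assumption of this sketch):
#   ("matop", op, dst, lhs, rhs, dims)  a vector-matrix operation
#   ("for", var, bound, body)           a loop over 0..bound-1
#   ("assign", target, expr)            one element operation


def matadd_body(dst, lhs, rhs, dims):
    """Element-operation AST for dst = lhs + rhs on m x n matrices."""
    stmt = ("assign", f"{dst}[i][j]", f"{lhs}[i][j] + {rhs}[i][j]")
    return ("for", "i", dims["m"], [("for", "j", dims["n"], [stmt])])


def matmul_body(dst, lhs, rhs, dims):
    """Element-operation AST for dst += lhs * rhs (m x k times k x n).

    Assumes dst has been zero-initialized before the loops run.
    """
    stmt = ("assign", f"{dst}[i][j]",
            f"{dst}[i][j] + {lhs}[i][p] * {rhs}[p][j]")
    inner = ("for", "p", dims["k"], [stmt])
    return ("for", "i", dims["m"], [("for", "j", dims["n"], [inner])])


# Stand-in for the preset function library of element operations.
MATRIX_LIB = {"+": matadd_body, "*": matmul_body}


def lower_with_library(statements):
    """Replace every vector-matrix node by the AST of its library body."""
    lowered = []
    for node in statements:
        if node[0] == "matop":
            _, op, dst, lhs, rhs, dims = node
            body = MATRIX_LIB[op]                      # S202b1: look up
            lowered.append(body(dst, lhs, rhs, dims))  # S202b2: substitute
        else:
            lowered.append(node)
    return lowered


program = [("matop", "*", "C", "A", "B", {"m": 4, "n": 4, "k": 4})]
element_ast = lower_with_library(program)
print(element_ast)
```

Adding a new vector-matrix operation in this scheme means adding one library entry; the lowering routine itself does not change.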

In another possible implementation, as shown in Fig. 6, the step in which the source-to-source compiler converts the AST of vector-matrix operations into the AST of element operations according to the preset rule (step S202b) comprises:

S202b3: The source-to-source compiler decomposes the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to that AST.

S202b4: The source-to-source compiler replaces the AST of the vector-matrix operation with the AST of the corresponding element-operation function, to obtain the AST of element operations.

That is, the source-to-source compiler uses the Matrix Lower approach to convert the AST of vector-matrix operations into the AST of element operations.

Note that in the compiler field, "lowering" generally means converting a higher-level representation into a lower-level one; for example, converting a high-level language such as C into assembly language can be called lowering. A higher-level representation is abstract and closer to human understanding; a lower-level representation is concrete and closer to machine execution.
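The Matrix Lower approach (steps S202b3 and S202b4) derives the element operations from the operator and operands themselves instead of looking up a ready-made body. The sketch below is a hypothetical illustration restricted to element-wise operators; the tuple-based AST, the `.*` spelling for element-wise multiplication and the `shapes` table are all assumptions.

```python
def element_expr(node, shapes):
    """Rewrite a whole-matrix expression into the expression for element [i][j]."""
    if isinstance(node, str):
        # Operand: matrices are indexed, scalars (shape ()) are used as they are.
        return f"{node}[i][j]" if shapes[node] != () else node
    op, lhs, rhs = node
    if op not in ("+", "-", ".*"):
        raise NotImplementedError(f"no lowering rule for operator {op!r}")
    c_op = "*" if op == ".*" else op
    return f"({element_expr(lhs, shapes)} {c_op} {element_expr(rhs, shapes)})"


def lower_assign(dst, expr, shapes):
    """Decompose 'dst = expr' into a loop nest of element operations."""
    rows, cols = shapes[dst]
    body = ("assign", f"{dst}[i][j]", element_expr(expr, shapes))
    return ("for", "i", rows, [("for", "j", cols, [body])])


# D = A + B .* s, where s is a scalar and the matrices are 2 x 3.
shapes = {"D": (2, 3), "A": (2, 3), "B": (2, 3), "s": ()}
lowered = lower_assign("D", ("+", "A", (".*", "B", "s")), shapes)
print(lowered)
```

Because the whole expression is decomposed at once, a compound expression yields a single loop nest rather than one loop nest per operator.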

Further, as shown in Fig. 7, in this embodiment of the present invention, after the source-to-source compiler converts the AST of vector-matrix operations into the AST of element operations according to the preset rule (step S202b), and before the source-to-source compiler converts the AST of element operations into the second AST according to the information about the parallel computing architecture (step S204a), the method may further comprise:

S206: The source-to-source compiler optimizes the AST of element operations to obtain an optimized AST of element operations.

The step in which the source-to-source compiler converts the AST of element operations into the second AST according to the information about the parallel computing architecture (step S204a) may then specifically comprise:

S204a1: The source-to-source compiler converts, according to the information about the parallel computing architecture, the optimized AST of element operations into the second AST.

That is, in this embodiment of the present invention, after the source-to-source compiler converts the AST of vector-matrix operations into the AST of element operations according to the preset rule, it may also optimize the AST, which improves the execution efficiency of the program on the processor. There are many compiler optimization methods and theories. The optimization of the element-operation AST in this embodiment is usually related to the parallel operation of matrices; common examples include loop fusion, register promotion and peephole optimization. The embodiments of the present invention impose no specific limitation on this.
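Loop fusion, one of the optimizations named above, can be sketched on a tuple-based element-operation AST. The AST shape and the function name `fuse_loops` are assumptions of this sketch, and it only compares loop headers: it assumes that fusing is legal, whereas a real compiler would first check data dependences between the two loop bodies.

```python
def fuse_loops(statements):
    """Merge adjacent loops that have the same variable and bound.

    Nodes are ("for", var, bound, body) or ("assign", target, expr).
    Assumes fusion is legal; no dependence analysis is performed here.
    """
    fused = []
    for node in statements:
        if node[0] == "for":
            kind, var, bound, body = node
            node = (kind, var, bound, fuse_loops(body))
            prev = fused[-1] if fused else None
            if prev and prev[0] == "for" and prev[1:3] == node[1:3]:
                # Same header as the previous loop: concatenate the bodies
                # and try to fuse again one level down.
                fused[-1] = ("for", var, bound, fuse_loops(prev[3] + node[3]))
                continue
        fused.append(node)
    return fused


# T = A + B followed by D = T .* C, each lowered to its own loop.
program = [
    ("for", "i", 4, [("assign", "T[i]", "A[i] + B[i]")]),
    ("for", "i", 4, [("assign", "D[i]", "T[i] * C[i]")]),
]
optimized = fuse_loops(program)
print(optimized)
```

After fusion the two statements share one loop, so each `T[i]` is consumed right after it is produced instead of after a full pass over the array.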

Note that in this embodiment of the present invention, the source-to-source compiler may also optimize the second AST, which can further improve the execution efficiency of the program on the processor. This optimization is specific to the parallel computing architecture; for details, refer to prior-art solutions, which are not repeated here.

Note also that in this embodiment of the present invention, the source-to-source compiler may optimize the AST of vector-matrix operations, which can likewise further improve the execution efficiency of the program on the processor. This is matrix-level optimization; a common example is selecting the optimal algorithm according to information such as how the matrix operators combine and the dimensions of the matrices. Details are not repeated here.
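One reading of this matrix-level optimization, offered here as an assumption rather than as the method of the embodiment, is to use the associativity of matrix multiplication together with the matrix dimensions to pick the cheaper evaluation order. The sketch below compares the two orders for a product of three matrices by counting scalar multiplications; the function names are hypothetical.

```python
def multiply_cost(rows, inner, cols):
    """Scalar multiplications needed for a (rows x inner) by (inner x cols) product."""
    return rows * inner * cols


def best_order(a, b, c):
    """Choose between (A*B)*C and A*(B*C) from the matrix dimensions alone.

    Each argument is a (rows, cols) pair.
    """
    (m, k), (k2, n), (n2, p) = a, b, c
    if k != k2 or n != n2:
        raise ValueError("inner dimensions do not match")
    left = multiply_cost(m, k, n) + multiply_cost(m, n, p)    # (A*B)*C
    right = multiply_cost(k, n, p) + multiply_cost(m, k, p)   # A*(B*C)
    return ("(A*B)*C", left) if left <= right else ("A*(B*C)", right)


choice = best_order((10, 100), (100, 5), (5, 50))
print(choice)
```

Both orders give the same result mathematically, so a decision like this can be made on the vector-matrix AST before any element-level loops exist.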

Below for the method for the above-mentioned generation provided for the source code of parallel computation framework, suppose a matrix multiplication scene of ARM-v8 NEON, its source code from vector matrix (the first source code) is as follows to the source-to-source translation process of the source code (the second source code) generating ARM-v8 NEON parallel computation framework:

The first, source-to-source compiler receives the first source code, shown in this first source code is specific as follows:

The second, source-to-source compiler resolves the first source code, obtains the AST of vector matrix operation, and utilizes Matrix Lib storehouse the AST that vector matrix operates to be changed into the AST of element operation.

Concrete, Fig. 8 provides source-to-source compiler and resolves the first source code, obtains the structural representation of the AST of element operation, comprising:

First, resolve the first source code, obtain the AST80 of vector matrix operation;

Secondly, in the function library 81 of the element operation preset, search function body corresponding to vector matrix operation;

Finally, with the AST80 of the AST82 substituting vector matrix manipulation of function body, obtain the AST83 of element operation.

Wherein, in this example, shown in the AST class C of element operation is expressed as follows:

3rd, source-to-source compiler obtains the information of parallel computation framework.

In this example, the information of the parallel computation framework that source-to-source compiler obtains is specially the information of ARM-v8NEON framework.

4th, the AST of element operation, according to the information of parallel computation framework, is converted into the 2nd AST by source-to-source compiler.

In this example, the AST of element operation is converted into the AST relevant to ARM-v8 NEON framework by source-to-source compiler.

Fifth, the source-to-source compiler converts the second AST into the second source code of the parallel computing architecture.

Specifically, the source-to-source compiler can generate the ARM NEON parallel application source code corresponding to the second AST according to the types and operations of the ARM-v8 NEON architecture. This second source code is as follows:

This completes the source-to-source translation, in the matrix multiplication scenario on ARM-v8 NEON, from the vector-matrix source code (the first source code) to the source code for the ARM-v8 NEON parallel computing architecture (the second source code).
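
The fifth step, emitting compilable text from the second AST, can be sketched as a recursive printer. The node classes `Call` and `ForLoop` and the functions `emit_expr` and `emit` are hypothetical names, the example tree is an element-wise multiply rather than the matrix multiplication of this example, and the emitted fragment is a loop only, without the enclosing function that a complete translation unit would need.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Intrinsic call node of the architecture-related AST."""
    name: str
    args: tuple


@dataclass
class ForLoop:
    """C-style counting loop with a fixed step."""
    var: str
    bound: str
    step: int
    body: list


def emit_expr(node):
    """Render a call tree (or a plain string leaf) as a C expression."""
    if isinstance(node, Call):
        return f"{node.name}({', '.join(emit_expr(a) for a in node.args)})"
    return str(node)


def emit(node, depth=0):
    """Render a statement node as a list of indented C source lines."""
    pad = "    " * depth
    if isinstance(node, ForLoop):
        v = node.var
        lines = [f"{pad}for (int {v} = 0; {v} < {node.bound}; {v} += {node.step}) {{"]
        for child in node.body:
            lines += emit(child, depth + 1)
        lines.append(pad + "}")
        return lines
    return [pad + emit_expr(node) + ";"]


second_ast = ForLoop("i", "n", 4, [
    Call("vst1q_f32", ("&c[i]", Call("vmulq_f32", (
        Call("vld1q_f32", ("&a[i]",)),
        Call("vld1q_f32", ("&b[i]",)),
    )))),
])
second_source = "\n".join(["#include <arm_neon.h>", ""] + emit(second_ast))
print(second_source)
```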

In the prior art, the source-to-source compiler receives parallel application source code with APCG annotations. However, parallel application source code with APCG annotations corresponds mechanically, one to one, to the native source code, and the programmer must be familiar with the algorithm of the native source code in order to rewrite it into the corresponding parallel application source code with APCG annotations. Efficiency is therefore low and portability is poor.

In the embodiment of the present invention, after receiving the vector-matrix source code input by the user, the source-to-source compiler parses that source code to obtain an AST that is independent of the parallel computing architecture. It then obtains the information of the parallel computing architecture according to the identifier of the parallel computing architecture input by the user and, according to that information, converts the architecture-independent AST into an AST that is related to the parallel computing architecture. Finally, it converts the architecture-related AST into the second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile. That is, the source-to-source compiler can convert the vector-matrix source code input by the user into source code that the compiler of the parallel computing architecture can compile.

On the one hand, compared with the prior art, vector-matrix source code is simple, high-level source code: the programmer does not need to be familiar with the algorithm of the native source code and only needs a basic grounding in matrix operations, so efficiency is higher. On the other hand, because vector-matrix source code is independent of the parallel computing architecture, the programmer does not need to rewrite it into the corresponding parallel application source code with APCG annotations according to the algorithm of the native source code. That is, the embodiment of the present invention can convert source code that is independent of the parallel computing architecture into source code that is related to the parallel computing architecture, so portability is better.

Embodiment Two

An embodiment of the present invention provides a source-to-source compiler 90. As shown in Fig. 9, the source-to-source compiler 90 comprises a receiving unit 901, a parsing unit 902, an acquiring unit 903, a first conversion unit 904, and a second conversion unit 905.

The receiving unit 901 is configured to receive a first source code, the first source code being the vector-matrix source code input by a user.

The parsing unit 902 is configured to parse the first source code to obtain a first abstract syntax tree (AST), the first AST being an AST that is independent of the parallel computing architecture.

The acquiring unit 903 is configured to obtain the information of the parallel computing architecture.

The first conversion unit 904 is configured to convert the first AST into a second AST according to the information of the parallel computing architecture, the second AST being an AST that is related to the parallel computing architecture.

The second conversion unit 905 is configured to convert the second AST into a second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.
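
The cooperation of units 901 to 905 can be pictured as a four-stage pipeline behind a single entry point. The sketch below is only an illustration of that structure: the class `SourceToSourceCompiler`, the `toy_*` stage functions and the output name `neon_matmul` are hypothetical placeholders, and `neon_matmul` is not a real NEON intrinsic.

```python
class SourceToSourceCompiler:
    """Pipeline sketch; each stage is injected so it can be swapped independently."""

    def __init__(self, parse, get_arch_info, to_arch_ast, emit):
        self.parse = parse                  # role of parsing unit 902
        self.get_arch_info = get_arch_info  # role of acquiring unit 903
        self.to_arch_ast = to_arch_ast      # role of first conversion unit 904
        self.emit = emit                    # role of second conversion unit 905

    def compile(self, first_source, arch_id):
        # Role of receiving unit 901: the first source code arrives as an argument.
        first_ast = self.parse(first_source)
        arch_info = self.get_arch_info(arch_id)
        second_ast = self.to_arch_ast(first_ast, arch_info)
        return self.emit(second_ast)


# Toy stages, purely illustrative.
def toy_parse(source):
    """Parse 'C = A * B' into an architecture-independent tuple."""
    dst, expr = (part.strip() for part in source.split("="))
    lhs, rhs = (part.strip() for part in expr.split("*"))
    return ("matmul", dst, lhs, rhs)


def toy_arch_info(arch_id):
    return {"id": arch_id, "prefix": "neon_"}


def toy_to_arch_ast(first_ast, arch_info):
    op, dst, lhs, rhs = first_ast
    return (arch_info["prefix"] + op, dst, lhs, rhs)


def toy_emit(second_ast):
    name, dst, lhs, rhs = second_ast
    return f"{name}({dst}, {lhs}, {rhs});"


compiler = SourceToSourceCompiler(toy_parse, toy_arch_info, toy_to_arch_ast, toy_emit)
second_source = compiler.compile("C = A * B", "armv8-neon")
print(second_source)
```

Only the second and third stages see the architecture information, which mirrors the split between the architecture-independent first AST and the architecture-related second AST.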

Further, as shown in Fig. 10, the parsing unit 902 comprises a parsing module 9021 and a conversion module 9022.

The parsing module 9021 is configured to parse the first source code to obtain the AST of the vector-matrix operation.

The conversion module 9022 is configured to convert the AST of the vector-matrix operation into the AST of element operations according to a preset rule.

The first conversion unit 904 is specifically configured to:

convert the AST of element operations into the second AST according to the information of the parallel computing architecture.

In one possible implementation, the conversion module 9022 is specifically configured to:

look up, in the preset function library of element operations, the function body corresponding to the vector-matrix operation; and

replace the AST of the vector-matrix operation with the AST of the function body corresponding to the vector-matrix operation, to obtain the AST of element operations.

In another possible implementation, the conversion module 9022 is specifically configured to:

decompose the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to the AST of the vector-matrix operation; and

replace the AST of the vector-matrix operation with the AST of that element-operation function, to obtain the AST of element operations.
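
The second implementation, in which the element-level function is derived from the operator and the operands rather than looked up in a library, might be sketched as follows. This is an assumption-laden illustration: the functions `decompose` and `evaluate` are hypothetical, the operands are described only by their shapes, and the element-level function is returned as a Python callable instead of an AST to keep the example short.

```python
def decompose(op, lhs_shape, rhs_shape):
    """Derive (result_shape, element_function) from an operator and operand shapes."""
    if op == "+":
        if lhs_shape != rhs_shape:
            raise ValueError("addition needs operands of equal shape")

        def add_elem(A, B, i, j):
            return A[i][j] + B[i][j]

        return lhs_shape, add_elem

    if op == "*":
        (m, k), (k2, n) = lhs_shape, rhs_shape
        if k != k2:
            raise ValueError("inner dimensions do not match")

        def mul_elem(A, B, i, j):
            # One output element is a dot product of row i and column j.
            return sum(A[i][p] * B[p][j] for p in range(k))

        return (m, n), mul_elem

    raise NotImplementedError(f"unsupported operator {op!r}")


def evaluate(op, A, B):
    """Apply the derived element-level function at every output position."""
    def shape_of(M):
        return (len(M), len(M[0]))

    (rows, cols), elem = decompose(op, shape_of(A), shape_of(B))
    return [[elem(A, B, i, j) for j in range(cols)] for i in range(rows)]


print(evaluate("*", [[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Compared with the library lookup, this route needs no prewritten function body per operation, at the cost of encoding the meaning of each operator in the decomposition rules.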

Further, as shown in Fig. 11, the source-to-source compiler 90 also comprises an optimization unit 906.

The optimization unit 906 is configured to optimize the AST of element operations, obtaining an optimized AST of element operations, after the conversion module 9022 converts the AST of the vector-matrix operation into the AST of element operations according to the preset rule and before the first conversion unit 904 converts the AST of element operations into the second AST according to the information of the parallel computing architecture.

The first conversion unit 904 is specifically configured to:

convert the optimized AST of element operations into the second AST according to the information of the parallel computing architecture.
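
The embodiment does not say which optimizations the optimization unit 906 performs. As one plausible example only, the sketch below applies constant folding and removal of identity operations to a small expression AST. The tuple encoding (`("const", v)`, `("var", name)`, `("add", l, r)`, `("mul", l, r)`) and the function `optimize` are assumptions made for this illustration.

```python
def optimize(node):
    """Fold constants and drop additions of 0 and multiplications by 1, bottom-up."""
    if node[0] in ("const", "var"):
        return node  # leaves are already in simplest form

    op = node[0]
    lhs, rhs = optimize(node[1]), optimize(node[2])

    # Constant folding: both operands are known at compile time.
    if lhs[0] == "const" and rhs[0] == "const":
        value = lhs[1] + rhs[1] if op == "add" else lhs[1] * rhs[1]
        return ("const", value)

    # Identity elimination: x + 0 == x and x * 1 == x.
    if op == "add":
        if lhs == ("const", 0):
            return rhs
        if rhs == ("const", 0):
            return lhs
    if op == "mul":
        if lhs == ("const", 1):
            return rhs
        if rhs == ("const", 1):
            return lhs

    return (op, lhs, rhs)


expr = ("add", ("mul", ("const", 1), ("var", "a")), ("const", 0))
print(optimize(expr))
```

Running such a pass on the architecture-independent AST means the simplification is done once, before any architecture-specific conversion.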

Specifically, for the method by which the source-to-source compiler 90 provided by the embodiment of the present invention generates source code for the parallel computing architecture, reference may be made to the description of Embodiment One; details are not repeated here.

In the prior art, the source-to-source compiler receives parallel application source code with APCG annotations. However, parallel application source code with APCG annotations corresponds mechanically, one to one, to the native source code, and the programmer must be familiar with the algorithm of the native source code in order to rewrite it into the corresponding parallel application source code with APCG annotations. Efficiency is therefore low and portability is poor.

In the embodiment of the present invention, after receiving the vector-matrix source code input by the user, the source-to-source compiler parses that source code to obtain an AST that is independent of the parallel computing architecture. It then obtains the information of the parallel computing architecture and, according to that information, converts the architecture-independent AST into an AST that is related to the parallel computing architecture. Finally, it converts the architecture-related AST into the second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile. That is, the source-to-source compiler can convert the vector-matrix source code input by the user into source code that the compiler of the parallel computing architecture can compile.

On the one hand, compared with the prior art, vector-matrix source code is simple, high-level source code: the programmer does not need to be familiar with the algorithm of the native source code and only needs a basic grounding in matrix operations, so efficiency is higher. On the other hand, because vector-matrix source code is independent of the parallel computing architecture, the programmer does not need to rewrite it into the corresponding parallel application source code with APCG annotations according to the algorithm of the native source code. That is, the source-to-source compiler provided by the embodiment of the present invention can convert source code that is independent of the parallel computing architecture into source code that is related to the parallel computing architecture, so portability is better.

Embodiment Three

An embodiment of the present invention provides a source-to-source compiler 120. As shown in Fig. 12, the source-to-source compiler 120 comprises an input interface 1201 and a processor 1202.

The input interface 1201 is configured to receive a first source code, the first source code being the vector-matrix source code input by a user.

The processor 1202 is configured to parse the first source code to obtain a first abstract syntax tree (AST), the first AST being an AST that is independent of the parallel computing architecture.

The processor 1202 is further configured to obtain the information of the parallel computing architecture and to convert the first AST into a second AST according to that information, the second AST being an AST that is related to the parallel computing architecture.

The processor 1202 is further configured to convert the second AST into a second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.

Further, the processor 1202 is specifically configured to:

parse the first source code to obtain the AST of the vector-matrix operation; and

convert the AST of the vector-matrix operation into the AST of element operations according to a preset rule.

The processor 1202 is also specifically configured to:

convert the AST of element operations into the second AST according to the information of the parallel computing architecture.

In one possible implementation, the processor 1202 is further specifically configured to:

look up, in the preset function library of element operations, the function body corresponding to the vector-matrix operation; and

replace the AST of the vector-matrix operation with the AST of the function body corresponding to the vector-matrix operation, to obtain the AST of element operations.

In another possible implementation, the processor 1202 is further specifically configured to:

decompose the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to the AST of the vector-matrix operation; and

replace the AST of the vector-matrix operation with the AST of that element-operation function, to obtain the AST of element operations.

Further, the processor 1202 is also configured to:

optimize the AST of element operations, obtaining an optimized AST of element operations, after converting the AST of the vector-matrix operation into the AST of element operations according to the preset rule and before converting the AST of element operations into the second AST according to the information of the parallel computing architecture.

The processor 1202 is also specifically configured to:

convert the optimized AST of element operations into the second AST according to the information of the parallel computing architecture.

Specifically, for the method by which the source-to-source compiler 120 provided by the embodiment of the present invention generates source code for the parallel computing architecture, reference may be made to the description of Embodiment One; details are not repeated here.

In the prior art, the source-to-source compiler receives parallel application source code with APCG annotations. However, parallel application source code with APCG annotations corresponds mechanically, one to one, to the native source code, and the programmer must be familiar with the algorithm of the native source code in order to rewrite it into the corresponding parallel application source code with APCG annotations. Efficiency is therefore low and portability is poor.

In the embodiment of the present invention, after receiving the vector-matrix source code input by the user, the source-to-source compiler parses that source code to obtain an AST that is independent of the parallel computing architecture. It then obtains the information of the parallel computing architecture and, according to that information, converts the architecture-independent AST into an AST that is related to the parallel computing architecture. Finally, it converts the architecture-related AST into the second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile. That is, the source-to-source compiler can convert the vector-matrix source code input by the user into source code that the compiler of the parallel computing architecture can compile.

On the one hand, compared with the prior art, vector-matrix source code is simple, high-level source code: the programmer does not need to be familiar with the algorithm of the native source code and only needs a basic grounding in matrix operations, so efficiency is higher. On the other hand, because vector-matrix source code is independent of the parallel computing architecture, the programmer does not need to rewrite it into the corresponding parallel application source code with APCG annotations according to the algorithm of the native source code. That is, the source-to-source compiler provided by the embodiment of the present invention can convert source code that is independent of the parallel computing architecture into source code that is related to the parallel computing architecture, so portability is better.

Those skilled in the art will clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation: for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical or of another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for generating source code for a parallel computing architecture, characterized in that the method comprises:
receiving a first source code, the first source code being the vector-matrix source code input by a user;
parsing the first source code to obtain a first abstract syntax tree (AST), wherein the first AST is an AST that is independent of the parallel computing architecture;
obtaining information of the parallel computing architecture, and converting the first AST into a second AST according to the information of the parallel computing architecture, wherein the second AST is an AST that is related to the parallel computing architecture; and
converting the second AST into a second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.
2. The method according to claim 1, characterized in that the parsing of the first source code to obtain the first abstract syntax tree (AST), wherein the first AST is an AST that is independent of the parallel computing architecture, comprises:
parsing the first source code to obtain an AST of a vector-matrix operation; and
converting the AST of the vector-matrix operation into an AST of element operations according to a preset rule;
and the converting of the first AST into the second AST according to the information of the parallel computing architecture comprises:
converting the AST of element operations into the second AST according to the information of the parallel computing architecture.
3. The method according to claim 2, characterized in that the converting of the AST of the vector-matrix operation into the AST of element operations according to the preset rule comprises:
looking up, in a preset function library of element operations, the function body corresponding to the vector-matrix operation; and
replacing the AST of the vector-matrix operation with the AST of the function body corresponding to the vector-matrix operation, to obtain the AST of element operations.
4. The method according to claim 2, characterized in that the converting of the AST of the vector-matrix operation into the AST of element operations according to the preset rule comprises:
decomposing the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to the AST of the vector-matrix operation; and
replacing the AST of the vector-matrix operation with the AST of the element-operation function corresponding to the AST of the vector-matrix operation, to obtain the AST of element operations.
5. The method according to any one of claims 2 to 4, characterized in that, after the converting of the AST of the vector-matrix operation into the AST of element operations according to the preset rule and before the converting of the AST of element operations into the second AST according to the information of the parallel computing architecture, the method further comprises:
optimizing the AST of element operations to obtain an optimized AST of element operations;
and the converting of the AST of element operations into the second AST according to the information of the parallel computing architecture comprises:
converting the optimized AST of element operations into the second AST according to the information of the parallel computing architecture.
6. A source-to-source compiler, characterized in that the source-to-source compiler comprises a receiving unit, a parsing unit, an acquiring unit, a first conversion unit, and a second conversion unit;
the receiving unit is configured to receive a first source code, the first source code being the vector-matrix source code input by a user;
the parsing unit is configured to parse the first source code to obtain a first abstract syntax tree (AST), wherein the first AST is an AST that is independent of the parallel computing architecture;
the acquiring unit is configured to obtain information of the parallel computing architecture;
the first conversion unit is configured to convert the first AST into a second AST according to the information of the parallel computing architecture, wherein the second AST is an AST that is related to the parallel computing architecture; and
the second conversion unit is configured to convert the second AST into a second source code of the parallel computing architecture, the second source code being source code that the compiler of the parallel computing architecture can compile.
7. The source-to-source compiler according to claim 6, characterized in that the parsing unit comprises a parsing module and a conversion module;
the parsing module is configured to parse the first source code to obtain an AST of a vector-matrix operation;
the conversion module is configured to convert the AST of the vector-matrix operation into an AST of element operations according to a preset rule; and
the first conversion unit is specifically configured to:
convert the AST of element operations into the second AST according to the information of the parallel computing architecture.
8. The source-to-source compiler according to claim 7, characterized in that the conversion module is specifically configured to:
look up, in a preset function library of element operations, the function body corresponding to the vector-matrix operation; and
replace the AST of the vector-matrix operation with the AST of the function body corresponding to the vector-matrix operation, to obtain the AST of element operations.
9. The source-to-source compiler according to claim 7, characterized in that the conversion module is specifically configured to:
decompose the AST of the vector-matrix operation according to the operator and operands of the vector-matrix operation, to obtain the element-operation function corresponding to the AST of the vector-matrix operation; and
replace the AST of the vector-matrix operation with the AST of the element-operation function corresponding to the AST of the vector-matrix operation, to obtain the AST of element operations.
10. The source-to-source compiler according to any one of claims 7 to 9, characterized in that the source-to-source compiler further comprises an optimization unit;
the optimization unit is configured to optimize the AST of element operations, obtaining an optimized AST of element operations, after the conversion module converts the AST of the vector-matrix operation into the AST of element operations according to the preset rule and before the first conversion unit converts the AST of element operations into the second AST according to the information of the parallel computing architecture; and
the first conversion unit is specifically configured to:
convert the optimized AST of element operations into the second AST according to the information of the parallel computing architecture.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410841579.4A CN104572234A (en) 2014-12-29 2014-12-29 Method for generating source codes used for parallel computing architecture and source-to-source compiler

Publications (1)

Publication Number Publication Date
CN104572234A (en) 2015-04-29

Family

ID=53088391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410841579.4A CN104572234A (en) 2014-12-29 2014-12-29 Method for generating source codes used for parallel computing architecture and source-to-source compiler

Country Status (1)

Country Link
CN (1) CN104572234A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000038087A1 (en) * 1998-12-22 2000-06-29 Celoxica Limited Hardware/software codesign system
CN1672132A (en) * 2002-07-25 2005-09-21 皇家飞利浦电子股份有限公司 Source-to-source partitioning compilation
CN102750150A (en) * 2012-06-14 2012-10-24 中国科学院软件研究所 Method for automatically generating dense matrix multiplication assembly code based on x86 architecture
CN103631632A (en) * 2013-11-29 2014-03-12 华为技术有限公司 Transplantation method and source to source compiler
CN104182267A (en) * 2013-05-21 2014-12-03 中兴通讯股份有限公司 Compiling method, interpreting method, interpreting device and user equipment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017035748A1 (en) * 2015-08-31 2017-03-09 华为技术有限公司 Code compiling method and code complier
CN107851002A (en) * 2015-08-31 2018-03-27 华为技术有限公司 A kind of code compiling method and code encoder
WO2017107154A1 (en) * 2015-12-24 2017-06-29 华为技术有限公司 Method of converting source code to another source code for matrix operation and source-to-source compiler

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150429

RJ01 Rejection of invention patent application after publication