CN103119561A - Systems and methods for compiler-based vectorization of non-leaf code - Google Patents


Info

Publication number
CN103119561A
Authority
CN
China
Prior art keywords
function
dependence
vector
compiler
called
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011800455834A
Other languages
Chinese (zh)
Other versions
CN103119561B (en
Inventor
J·E·戈尼诺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apple Inc
Original Assignee
Apple Computer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/888,658 external-priority patent/US8949808B2/en
Priority claimed from US12/888,644 external-priority patent/US8621448B2/en
Application filed by Apple Computer Inc filed Critical Apple Computer Inc
Publication of CN103119561A publication Critical patent/CN103119561A/en
Application granted granted Critical
Publication of CN103119561B publication Critical patent/CN103119561B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/45 Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/456 Parallelism detection

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

Systems and methods for the vectorization of software applications are described. In some embodiments, source code dependencies can be expressed in ways that extend a compiler's ability to vectorize otherwise scalar functions. For example, when compiling a called function, a compiler may identify dependencies of the called function on variables other than the parameters passed to it. The compiler may record these dependencies, e.g., in a dependency file. Later, when compiling a calling function that calls the called function, the same (or another) compiler may reference the previously identified dependencies and use them to determine whether and how to vectorize the calling function. In particular, these techniques may facilitate the vectorization of non-leaf loops. Because non-leaf loops are relatively common, the techniques described herein can increase the amount of vectorization that can be applied to many applications.

Description

Systems and methods for compiler-based vectorization of non-leaf code
Technical Field
This disclosure relates to computer systems and, more particularly, to systems and methods for enabling the universal vectorization of software applications.
Background
The typical software development paradigm is well known. A computer programmer writes source code in a high-level programming language (e.g., Basic, C++, etc.). At some point, the programmer uses a compiler to transform the source code into object code. After being transformed into executable code (e.g., after linking or other compile-time or run-time processing), the resulting object code can be executed by a computer or computing device.
Computers now have multiple processing units and can execute instructions in parallel. To take advantage of this architecture, modern compilers may attempt to "parallelize" or "vectorize" certain software functions so that, instead of a single processing unit executing one instruction at a time in sequence, multiple processing units execute multiple instructions simultaneously.
During compilation, the compiler analyzes a software function to determine whether there are any obstacles to vectorization. One such obstacle is the presence of a true data dependency, which arises when a present instruction refers to data produced by the execution of a preceding instruction. In that case, the later instruction can execute only after the earlier one, so the two instructions cannot execute in parallel. Another potential obstacle is the presence of a function call. For example, if the function being compiled calls an external function, the compiler may be unable to vectorize the calling function.
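The true data dependency described above can be illustrated with a small check. This is only a sketch in Python; the function name and the offset-based model of array accesses are assumptions made for illustration and are not taken from the disclosure.

```python
def has_true_dependency(write_offset, read_offsets):
    """Report whether a loop body carries a true (read-after-write) dependency.

    Hypothetical model: offsets are relative to the loop index i, for
    accesses to one array in a loop that counts upward. Writing
    a[i + write_offset] and reading a[i + r] with r < write_offset means
    a later iteration reads what an earlier iteration wrote.
    """
    return any(r < write_offset for r in read_offsets)


# a[i] = a[i - 1] + 1: each iteration needs the previous iteration's result.
print(has_true_dependency(0, [-1]))
# a[i] = b[i] + 1: a[] is never read, so the iterations are independent.
print(has_true_dependency(0, []))
```

In the first case the iterations must run in order; in the second they could, in principle, be executed side by side.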
Summary
This disclosure provides systems and methods for enabling the universal vectorization of software applications. To that end, the systems and methods disclosed herein provide expressions of dependencies and/or interfaces that extend a compiler's ability to vectorize functions.
In one non-limiting embodiment, a compiler may examine the memory and/or data dependencies of a function (the "called function") while compiling it, and may express those dependencies in a dependency database (for example, a dependency file). Once compiled, the called function may become, for example, a library function. At a later point in time, another function (the "calling function") may be created such that it calls the called function. While compiling the calling function, the compiler may access the dependency file associated with the called function and identify its dependencies. Based on the dependencies of the called function, the compiler can decide whether to vectorize the calling function.
Additionally or alternatively, the compiler may decide to vectorize only a portion of the calling function. The visibility provided by dependency files may allow the compiler to vectorize a higher percentage of functions than would otherwise be possible.
For example, implementing dependency files allows the vectorization of functions that contain non-leaf loops (i.e., loops that call external functions whose source code is not visible). Because the vast majority of software functions today contain one or more non-leaf loops, these systems and methods can increase the amount of vectorization that can be applied to any application.
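The two-phase workflow described in this embodiment can be sketched as follows. This is a simplified Python model, not the disclosed implementation: the dictionary standing in for the dependency database, the function names, and the rule used to decide are all assumptions chosen for illustration.

```python
# Hypothetical in-memory stand-in for a dependency database.
dependency_db = {}


def record_dependencies(function_name, reads, writes):
    """Phase 1, while compiling the called function: store what it touches."""
    dependency_db[function_name] = {"reads": set(reads), "writes": set(writes)}


def may_vectorize_caller(loop_reads, loop_writes, called_functions):
    """Phase 2, while compiling the calling function: consult the database."""
    loop_reads, loop_writes = set(loop_reads), set(loop_writes)
    for name in called_functions:
        deps = dependency_db.get(name)
        if deps is None:
            return False  # nothing recorded for this callee: stay conservative
        if deps["writes"] & (loop_reads | loop_writes):
            return False  # callee writes data the loop also uses
        if deps["reads"] & loop_writes:
            return False  # callee reads data the loop writes
    return True


# Recorded when a (hypothetical) library containing foo and bar is compiled.
record_dependencies("foo", reads=["counter"], writes=["counter"])
record_dependencies("bar", reads=["A"], writes=[])

# Consulted later, when compiling a loop that writes array A.
print(may_vectorize_caller([], ["A"], ["foo"]))
print(may_vectorize_caller([], ["A"], ["bar"]))
```

The point of the sketch is the separation in time: the information is captured when the called function is compiled and is still available when the calling function is compiled later.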
In another non-limiting embodiment, a compiler may generate both scalar and vector versions of a function from a single source code description. The scalar version of the function may use the scalar interface originally specified by the source code. The vector version of the function, meanwhile, may implement a vector interface to the function that accepts vector parameters and generates vector return values.
For example, the vector interface may be exposed in the dependency file associated with the function. The presence of this alternative vector interface allows the compiler, for example, to make a single vector function call from within a vectorized loop, rather than making multiple serialized scalar function calls from within the vectorized loop.
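The difference between the scalar and vector interfaces can be sketched in Python as follows. The function names, the arithmetic in the function body, and the vector length of 4 are assumptions for illustration; a list comprehension stands in for what would be vector instructions in generated code.

```python
def foo_scalar(x):
    """Scalar interface: one argument in, one result out."""
    return x * x + 1


def foo_vector(xs):
    """Vector interface derived from the same source: a vector of arguments
    in, a vector of results out (modelled here with a list comprehension)."""
    return [x * x + 1 for x in xs]


def loop_with_vector_call(data, vec_len=4):
    """Vectorized loop that makes one vector call per group of elements
    instead of vec_len serialized scalar calls."""
    out = []
    for i in range(0, len(data), vec_len):
        out.extend(foo_vector(data[i:i + vec_len]))
    return out


print(loop_with_vector_call([1, 2, 3, 4, 5]))
```

With five elements and a vector length of 4, the loop makes two calls to the vector version where a serialized loop would make five calls to the scalar version.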
Various combinations of the techniques disclosed herein also permit the vectorization of functions that contain no loops, which runs contrary to accepted wisdom yet still offers many advantages. In particular, these techniques can increase the total amount of vectorization within a software application.
Brief Description of the Drawings
Fig. 1 is a block diagram illustrating a computer system operable to implement techniques for enabling the universal vectorization of software applications, according to certain embodiments.
Fig. 2 is a block diagram illustrating a compiler that, when executed by a computer system, may generate executable code, according to certain embodiments.
Fig. 3 is a flowchart illustrating a method of expressing dependencies in a dependency database, according to certain embodiments.
Fig. 4 is a flowchart illustrating a method of vectorizing a function, according to certain embodiments.
Fig. 5 is a flowchart illustrating a full-function vectorization method, according to certain embodiments.
Fig. 6 is a flowchart illustrating a method of using a vectorized function, according to certain embodiments.
While this disclosure is susceptible to various modifications and alternative forms, the specific embodiments discussed in this specification are shown by way of example in the drawings and are described in detail herein. It should be understood, however, that the drawings and detailed description are not intended to limit the disclosure to the particular forms disclosed. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure as defined by the appended claims.
Detailed Description
Introduction
The following specification first discusses an exemplary computer system or device. It then describes an illustrative compiler, which may be configured to execute on, and/or generate executable code for, the computer system. Finally, the specification presents several techniques for enabling non-leaf loop and full-function vectorization.
Exemplary computer system
Fig. 1 depicts an exemplary computer system operable to implement techniques for enabling the universal vectorization of software applications, according to certain embodiments. In this non-limiting example, computer system 100 includes one or more processors 110a-110n coupled to memory 120 via I/O interface 130. Computer system 100 also includes network interface 140 and storage interface 150, both coupled to I/O interface 130. Storage interface 150 connects external storage device 155 to I/O interface 130. In addition, network interface 140 may connect system 100 to a network (not shown) or to another computer system (not shown).
In some embodiments, computer system 100 may be a single-processor system including only one processor 110a. In other embodiments, computer system 100 may include two or more processors 110a-110n. Processors 110a-110n may include any processor capable of executing instructions. For example, processors 110a-110n may be general-purpose or embedded processors implementing any suitable instruction set architecture (ISA), such as the x86, PowerPC™, SPARC™, or MIPS™ ISAs. In one embodiment, processors 110a-110n may include various features of the Macroscalar processors described in U.S. Patent No. 7617496 and U.S. Patent No. 7395419.
System memory 120 may be configured to store instructions and data accessible by processors 110a-110n. For example, system memory 120 may be static random access memory (SRAM), synchronous dynamic RAM (SDRAM), non-volatile/flash-type memory, or any other suitable type of memory technology. A portion of the program instructions and/or data implementing the desired functions or applications described in detail below may be shown stored in system memory 120. Additionally or alternatively, a portion of those program instructions and/or data may be stored in storage device 155, in cache memory within one or more of processors 110a-110n, or may arrive from a network via network interface 140.
I/O interface 130 is operable to manage data traffic between processors 110a-110n, system memory 120, and any device in or attached to the system, including network interface 140, storage interface 150, or other peripheral interfaces. For example, I/O interface 130 may convert data or control signals from one component into a format suitable for use by another component. In some embodiments, I/O interface 130 may include support for devices attached through various types of peripheral buses, such as the Peripheral Component Interconnect (PCI) bus or the Universal Serial Bus (USB). Also, in some embodiments, some or all of the functionality of I/O interface 130 may be incorporated into processors 110a-110n.
Network interface 140 is configured to allow data to be exchanged between computer system 100 and other devices attached to a network, such as other computer systems. For example, network interface 140 may support communication via wired or wireless general data networks, telecommunications/telephony networks, storage area networks such as Fibre Channel SANs, and the like.
Storage interface 150 is configured to allow computer system 100 to interface with a storage device such as storage device 155. Storage interface 150 may support standard storage interfaces such as one or more suitable versions of the Advanced Technology Attachment Packet Interface (ATAPI) standard (which may also be referred to as Integrated Drive Electronics (IDE)), the Small Computer System Interface (SCSI) standard, the IEEE 1394 "Firewire" standard, the USB standard, or another standard or proprietary interface suitable for interconnecting a mass storage device with computer system 100. For example, storage device 155 may include magnetic, optical, or solid-state media that may be fixed or removable. Storage device 155 may also correspond to a hard disk drive or drive array, a CD or DVD drive, or a device based on non-volatile memory (e.g., flash memory).
System memory 120 and storage device 155 represent illustrative embodiments of a computer-accessible or computer-readable storage medium configured to store program instructions and data. In other embodiments, program instructions and/or data may be received, sent, or stored on different types of computer-accessible media. In general, a computer-accessible medium or storage medium may include any type of mass storage media or memory media, such as magnetic or optical media. A computer-accessible medium or storage medium may also include any volatile or non-volatile media, such as RAM (e.g., SDRAM, DDR SDRAM, RDRAM, SRAM, etc.), ROM, and the like, whether included in computer system 100 as system memory 120 or as another type of memory. Program instructions and data stored via a computer-accessible medium may be transmitted by transmission media or signals such as electrical, electromagnetic, or digital signals, which may be conveyed via a communication medium such as a network and/or a wireless link, such as may be implemented via network interface 140.
Typically, computer system 100 may take the form of a desktop or laptop computer. As will be readily understood in light of this disclosure, however, computer system 100 may be any suitable device capable of executing software. For example, computer system 100 may be a tablet computer, a phone, or the like.
Exemplary compiler
Generally speaking, a compiler may correspond to a software application (e.g., one or more modules of computer-executable instructions) that is configured to translate or transform source code, which may be represented in a high-level programming language such as C, C++, or any other suitable programming language, into object code. The language in which the source code is expressed may be referred to as the source code language or simply the source language. Typically, object code may be represented in the form of instructions and data suitable for processing by a target computing architecture, although in some embodiments additional processing (e.g., linking) may be performed on the generated object code to transform it into machine-executable code. In various embodiments, such additional processing may be performed by the compiler or by separate applications.
Object code may be represented in machine-readable form (e.g., binary form), in human-readable form (e.g., assembly language) that requires additional processing to generate machine-readable code, or in a combination of human-readable and machine-readable forms. The target architecture for the object code may be the same as the ISA implemented by processors 110a-110n on which the compiler is configured to execute. In some cases, however, a compiler may be configured to generate object code for a different ISA than the one on which the compiler executes (a "cross-compiler").
Fig. 2 depicts an illustrative compiler that, when executed by computer system 100 or another suitable computer system, may generate executable code, according to certain embodiments. Compiler 200 includes front end 220 and back end 230, which may in turn include optimizer 240 and code generator 250. As shown, front end 220 receives source code 210, and back end 230 produces object code such as scalar object code 260, vectorized object code 270, or a combination thereof. Compiler 200 may also produce one or more dependency databases 280 associated with object code 260 and/or 270.
While source code 210 is typically written in a high-level programming language, source code 210 may alternatively correspond to a machine-level language such as assembly language. For example, compiler 200 may be configured to apply its optimization techniques to assembly language code in addition to code written in high-level programming languages. Also, compiler 200 may include a number of different instances of front end 220, each configured to process source code 210 written in a different respective language and to produce a similar intermediate representation for processing by back end 230. In such embodiments, compiler 200 may effectively function as a multi-language compiler.
In one embodiment, front end 220 may be configured to perform preliminary processing of source code 210 to determine whether the source is lexically and/or syntactically correct, and to perform any transformation suitable to ready source code 210 for further processing by back end 230. For example, front end 220 may be configured to process any compiler directives present within source code 210, such as conditional compilation directives that cause some portions of source code 210 to be included in the compilation process while other portions are excluded. Front end 220 may also be variously configured to convert source code 210 into tokens (e.g., according to whitespace and/or other delimiters defined by the source language), to determine whether source code 210 includes any characters or tokens that are disallowed for the source language, and to determine whether the resulting stream of tokens obeys the syntax rules that define well-formed expressions in the source language. In different situations, front end 220 may be configured to perform different combinations of these processing activities, may omit certain actions described above, or may include different actions, depending on the implementation of front end 220 and the source language it targets. For example, if a source language does not provide a syntax for defining compiler directives, front end 220 may omit the processing action of scanning source code 210 for compiler directives.
If front end 220 encounters errors while processing source code 210, it may abort processing and report the errors (e.g., by writing an error message to a log file or to a display). Otherwise, upon sufficiently analyzing the syntactic and semantic content of source code 210, front end 220 may provide an intermediate representation of source code 210 to back end 230. Generally speaking, this intermediate representation may include one or more data structures that represent the structure and semantic content of source code 210, such as syntax trees, graphs, symbol tables, or other suitable data structures. The intermediate representation may be configured to preserve information identifying the syntactic and semantic features of source code 210, and may also include additional annotation information generated through the parsing and analysis of source code 210. For example, the intermediate representation may include control flow graphs that explicitly identify the control relationships among different blocks or segments of source code 210. Back end 230 may use such control flow information, for example, to determine how functional portions of source code 210 may be rearranged (e.g., by optimizer 240) to improve performance while preserving the necessary execution-ordering relationships within source code 210.
Back end 230 may generally be configured to transform the intermediate representation into one or more of scalar code 260, vectorized code 270, or a combination of both. Specifically, in the illustrated embodiment, optimizer 240 may be configured to transform the intermediate representation in an attempt to improve some aspect of the resulting scalar code 260 or vectorized code 270. For example, optimizer 240 may be configured to analyze the intermediate representation to identify memory or data dependencies. In some embodiments, optimizer 240 may be configured to perform a variety of other types of code optimization, such as vectorization, loop optimization (e.g., loop fusion, loop unrolling, etc.), data flow optimization (e.g., common subexpression elimination, constant folding, etc.), or any other suitable optimization technique. Optimizer 240 may also be configured to generate dependency database 280. As described in greater detail below, dependency database 280 may express an indication of memory and/or data dependencies within source code 210. Additionally or alternatively, in connection with the vectorization of source code 210, dependency database 280 may expose a vector interface associated with vectorized object code 270.
Code generator 250 may be configured to process the intermediate representation, as transformed by optimizer 240, in order to produce scalar code 260, vectorized code 270, or a combination of both types of code. For example, code generator 250 may be configured to generate vectorized machine instructions defined by the ISA of the target architecture such that execution of the generated instructions by a processor implementing the target architecture (e.g., one of processors 110a-110n, or a different processor) implements the functional behavior specified by source code 210. In one embodiment, code generator 250 may also be configured to generate instructions corresponding to operations that were not inherent in source code 210 but were added by optimizer 240 during the optimization process.
In other embodiments, compiler 200 may be partitioned into more, fewer, or different components than those shown. For example, compiler 200 may include a linker (not shown) configured to take one or more object files or libraries as input and combine them to produce a single (usually executable) file. Alternatively, the linker may be an entity separate from compiler 200. As mentioned above, any of the components of compiler 200, and any of the methods or techniques performed thereby, including those described below with respect to Figs. 3-6, may be implemented partially or entirely as software code stored within a suitable computer-accessible storage medium.
Source code 210 may represent, for example, a software function or algorithm. The resulting object code 260 and/or 270 may be, for example, a library or external function that can be called by other functions. Illustrative techniques employed by compiler 200 during operation, and in particular during its vectorization operation, are discussed in more detail below.
Vectorization of non-leaf loops
Many modern computers can perform some type of parallel processing of a computational workload by concurrently executing two or more different operations. For example, a superscalar processor may allow a computer to attempt to execute multiple independent instructions at once. Another technique, generally referred to as "vector computing" (which may be considered a special case of parallel computing), allows a computer to attempt to execute a single instruction that operates on multiple data items at once. Various examples of vector computing can be found in the single instruction, multiple data (SIMD) instruction sets now available in various processors, including, for example, IBM's AltiVec™ and SPE™ instruction set extensions for PowerPC™ processors and Intel's variants of the MMX™ and SSE™ instruction set extensions. Such SIMD instructions are examples of vector instructions that may be targeted by a vectorizing compiler, although other types of vector instructions or operations (including variable-length vector operations, predicated vector operations, and vector operations that operate on combinations of vectors and scalars/immediates) are also possible and contemplated.
Generally speaking, the process of transforming source code into vectorized object code may be referred to as "vectorization." When performed by a compiler (as opposed to, for example, vectorizing source code by hand), vectorization may be referred to as "compiler auto-vectorization." One particular type of auto-vectorization is loop auto-vectorization. Loop auto-vectorization may convert procedural loops that iterate over multiple data items into code that is capable of concurrently processing multiple data items within separate processing units (e.g., processors 110a-110n of computer system 100 in Fig. 1, or separate functional units within a processor). For example, to add together two arrays of numbers A[] and B[], a procedural loop may iterate through the arrays, adding one pair of array elements during each iteration. When compiling such a loop, a vectorizing compiler may take advantage of the fact that the target processor implements vector operations capable of concurrently processing a fixed or variable number of vector elements. For example, the compiler may auto-vectorize the array-addition loop so that, at each iteration, multiple elements of arrays A[] and B[] are added concurrently, reducing the number of iterations needed to complete the addition. A typical program spends a significant amount of its execution time within such loops. As such, auto-vectorization of loops can yield performance improvements without programmer intervention.
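The array-addition example can be modelled in a few lines of Python. This is a conceptual sketch only: the function names and the vector length of 4 are assumptions, and the slice-based addition stands in for a single hardware vector instruction.

```python
def add_scalar(a, b):
    """Procedural loop: one pair of elements is added per iteration."""
    out = []
    for i in range(len(a)):
        out.append(a[i] + b[i])
    return out


def add_vectorized(a, b, vec_len=4):
    """Model of the vectorized loop: each iteration handles up to vec_len
    element pairs at once. Returns the result and the iteration count."""
    out = []
    iterations = 0
    for i in range(0, len(a), vec_len):
        # Stands in for one vector add over a[i:i+vec_len] and b[i:i+vec_len].
        out.extend(x + y for x, y in zip(a[i:i + vec_len], b[i:i + vec_len]))
        iterations += 1
    return out, iterations


a = list(range(10))
b = [10] * 10
result, iterations = add_vectorized(a, b)
print(result == add_scalar(a, b), iterations)
```

For ten elements and a vector length of 4, the modelled vectorized loop needs three iterations where the scalar loop needs ten, while producing the same result.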
In some embodiments, compiler auto-vectorization is limited to leaf loops, that is, loops that make no calls to other functions. Vectorization of non-leaf loops (i.e., loops that call other functions) is ordinarily very difficult because the side effects of external function calls are usually opaque, especially when their source code is unavailable for inter-procedural analysis, as is the case with libraries, for example. For purposes of illustration, consider the following loop:
[Figure BDA00002949554100091: source code listing of the example loop; image not reproduced in this text]
To vectorize this loop, compiler 200 may determine whether function foo() interacts with (e.g., reads or writes) array A[]. Here, three possibilities exist: (1) function foo() does not interact with A[]; (2) function foo() does interact with A[]; or (3) function foo() might interact with A[] (e.g., depending on a compile-time or run-time condition, foo() may or may not interact with A[]). The case in which function foo() might interact with A[] presents problems similar to the case in which function foo() does in fact interact with A[]. In the case in which there is no interaction between foo() and A[], the following vectorizable code is equivalent to the loop above:
[Figure BDA00002949554100101: source code listing of the vectorized loop; image not reproduced in this text]
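Because the two code listings are not reproduced in this text, the following Python sketch illustrates the kind of equivalence being described. The loop body (storing x into A[x] and then calling foo(x)), the function names, and the vector length are assumptions and are not the patent's own listings.

```python
def scalar_loop(size, foo):
    """Hypothetical non-leaf loop: writes A[] and calls foo() each iteration."""
    A = [0] * size
    for x in range(size):
        A[x] = x
        foo(x)
    return A


def vectorized_loop(size, foo, vec_len=4):
    """Model of the vectorized form, valid only if foo() never touches A[]."""
    A = [0] * size
    for x in range(0, size, vec_len):
        hi = min(x + vec_len, size)
        A[x:hi] = range(x, hi)      # stands in for one vector store
        for v in range(x, hi):      # calls to foo() stay in original order
            foo(v)
    return A


calls_scalar, calls_vector = [], []
A1 = scalar_loop(10, calls_scalar.append)
A2 = vectorized_loop(10, calls_vector.append)
print(A1 == A2, calls_scalar == calls_vector)
```

The stores to A[] are grouped while the calls keep their order, so the two loops behave identically here. If foo() read or wrote A[], the grouped stores would change what each call observes, which is why the compiler needs to know about that interaction.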
This example shows that, in the process of vectorizing a non-leaf loop, compiler 200 benefits from knowing which memory a function accesses and/or whether that memory is read and/or written. Because most loops typically contain function calls, vectorizing non-leaf loops and the functions they call is preferable for achieving high levels of vectorization. To enable this level of vectorization, various embodiments of the techniques and systems described herein increase the compile-time visibility of dependencies and potential dependencies across libraries and modules that may have been previously compiled. For example, this information may be available when the calling function is compiled, regardless of when (or where) the library or module was originally compiled. Accordingly, certain techniques described herein establish an illustrative compiler infrastructure to create this visibility and to explore the types of vectorization it enables.
Dependency database
When compiling code that calls an external function, it may be desirable to determine the interface of the external function (e.g., the number and/or types of the parameters it takes, and/or the number and/or types of the results it returns). Such interface information may be useful, for example, in determining whether the calling code has correctly implemented the call to the external function. Externally callable functions typically expose their interface definitions in header files. However, such header files may not expose the details of variables that are not part of an external function's interface to a calling function but that may nevertheless affect code vectorization. For example, in the loop illustrated above, vectorization of the for loop may depend on how function foo() interacts with array A[]. Because foo() does not take A[] as a parameter, however, the header file corresponding to foo() may not adequately indicate this dependency to compiler 200.
A dependency database, which may also be referred to herein as a "persistent dependency database," may describe the dependencies of the externally callable functions in a library. That is, a dependency database may expose to a calling function various dependencies of a called function that are not necessarily apparent from the called function's interface alone. This database may be accessed when functions that call into the library are compiled. Generally speaking, a dependency database may persistently store indications of the dependencies of callable code so that the dependencies are visible across compiler invocations. For example, in some embodiments, a dependency database may be implemented as a dependency file (analogous to a header file) that includes human-readable and/or machine-readable content indicative of various dependencies. In other embodiments, a dependency database may be implemented using other techniques, such as a table-based relational database, semi-structured data (e.g., formatted using Extensible Markup Language (XML)), or any other suitable technique. For simplicity of exposition, the following discussion refers to embodiments that employ dependency files. It should be noted, however, that this is merely one non-limiting example of a dependency database.
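One way to picture a dependency file with both human-readable and machine-readable content is a small structured text document. The sketch below uses JSON purely as an assumption; the disclosure does not specify a file format, and the field names and the "file_static:" and "global:" prefixes are invented for illustration.

```python
import json


def dump_dependency_file(entries):
    """Serialize per-function dependency records to text (hypothetical format)."""
    return json.dumps(entries, indent=2, sort_keys=True)


def parse_dependency_file(text):
    """Read the records back, e.g. in a later compiler invocation."""
    return json.loads(text)


# Hypothetical records: variables that are not parameters of the functions
# but that the functions read or write.
example = {
    "foo": {"reads": ["file_static:counter"], "writes": ["file_static:counter"]},
    "bar": {"reads": ["global:A"], "writes": []},
}

text = dump_dependency_file(example)
print(text)
```

Because the records are stored as text rather than held in compiler memory, they persist between the compilation of the library and the later compilation of any code that calls it.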
In one embodiment, compiler 200 automatically accesses a dependency file (if one exists) when the corresponding header file (for example, stdlib.h) is included. This mechanism may allow vectorizing compilers (for instance, Macroscalar compilers) to compile existing code without modification while gaining the advantage of knowing the dependencies of external libraries. Compiler 200 may then generate dependency files automatically when libraries are compiled.
The information contained in a dependency file may form an Application Compiler Interface (ACI) that provides information compiler 200 can use to understand the constraints of a function. Specifically, a dependency file may express information about variables that are not normally within the scope of the calling function. For example, the variables expressed in a dependency file may include data items that are not parameters of the called function (that is, the called function's programming interface may not define these variables as parameters of the called function). By using dependency files, a calling function may become aware, for example, of whether a called function reads or writes function-static or file-static variables. Dependency files may also allow compiler 200 to distinguish between variables that share the same name but have different scopes.
As a non-limiting example, when a library stdlib is compiled, a compiler would ordinarily generate only the object file stdlib.o. Using the techniques described herein, compiler 200 may also generate a dependency file stdlib.d, for example at compile time. Dependency file stdlib.d exposes the memory dependencies associated with the public functions defined in stdlib.h. Other programs that include stdlib.h in their source code may trigger compiler 200 to search for the associated dependency file stdlib.d in corresponding locations. This dependency file may be distributed and installed together with stdlib.h and stdlib.o. In one implementation, the absence of a dependency file means that no additional information about the library is available, which may be the default state for legacy libraries and does not cause any compile errors.
Dependency databases may enable vectorization of non-leaf loops by exposing the data dependency characteristics of a previously compiled library function (or any function in a program) in a manner that is visible to compiler 200 when the code that calls the library function is compiled. This information may be made available without requiring that the source code of the library be revealed.
In some embodiments, the dependency information may be generated when the library is compiled. For example, for each function that is compiled, compiler 200 may note the types of accesses to function-static variables, file-static variables, global variables, and/or pointers passed into the function being compiled. Compiler 200 may then record which symbols are read or written, and export this information in the form of a dependency file that can be accessed at the compile time of other code that references the library.
As another non-limiting example, if function foo() is defined in file foo.c and its interface is defined in header file foo.h, then at the compile time of foo.c the memory dependency characteristics of function foo() may be stored in dependency file foo.hd. (It should be noted that any suitable naming convention for dependency files may be employed.) A calling function that uses function foo() may include header file foo.h but may not have access to file foo.c. When foo.h is referenced during compilation of the calling function, compiler 200 may automatically search for dependency file foo.hd to see whether it exists. Because the existence of dependency file foo.hd is optional, the absence of this file may imply that the dependency characteristics of the functions defined in foo.h are unknown, which suggests that compiler 200 should make pessimistic assumptions when vectorizing the calling function. If the dependency file exists, however, compiler 200 can use the dependency characteristics it contains to make more accurate and more aggressive assumptions when vectorizing the calling function.
Referring to Fig. 3, a flowchart representing a method of expressing dependencies in a dependency file is depicted according to some embodiments. In block 300, compiler 200 receives a function to be compiled. For example, compiler 200 may receive the function while processing source code for compilation (such as during compilation of a library that includes the function). In block 310, compiler 200 analyzes the function and identifies an expressed dependency within the function. This expressed dependency may be, for example, a memory or data dependency associated with a data item that is not a parameter of the called function. More generally, a function's expressed dependency on a particular data item may indicate whether the function only reads the data item, only writes it, or both reads and writes it. In various embodiments, analyzing the function may include activities such as performing lexical, syntactic, and/or semantic analysis of the function. Analysis may also include generating a parse tree, a symbol table, an intermediate code representation, and/or any other suitable data structure or representation that indicates some aspect of the operations and/or data references of the code being compiled.
In block 320, compiler 200 stores an indication of the expressed dependency in a dependency database associated with the function. For example, while analyzing the function, compiler 200 may identify variables used by the function that are not necessarily local or private to the function and that can therefore be read or written by code outside the function. Such variables may be examples of expressed dependencies that compiler 200 might identify, and compiler 200 may store indications of these variables in the dependency database. (It should be noted that in some embodiments compiler 200 may also identify and indicate dependencies that are local or private to the function.) In various embodiments, the indication of an expressed dependency may include information that identifies the dependency, such as the name of the variable depended upon. The indication may also include information that characterizes the dependency, such as whether the function reads or writes the variable, and/or information about the data type or scope of the variable (for example, whether the variable is global, private, static, and so on). As will be readily apparent in light of this disclosure, the dependency file may be created or updated in any suitable format (for instance, Extensible Markup Language (XML) or the like). Moreover, in some embodiments, dependencies may be indicated in a negative fashion instead of, or in addition to, an affirmative fashion. For example, a dependency file may explicitly indicate that a given variable does not depend on external code, in addition to or instead of indicating those expressed dependencies that do exist.
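As a minimal sketch of the kind of indication block 320 stores, the following C fragment models one dependency entry (identifying name, access type, optional scope qualifier) and formats it as a line of text. The type names, the function format_dep(), and the exact text layout are assumptions made for illustration; the disclosure leaves the file format open.

```c
#include <stdio.h>

/* Hypothetical in-memory form of one dependency-file entry. */
typedef enum { ACCESS_READ, ACCESS_WRITE } AccessKind;
typedef enum { SCOPE_UNSPECIFIED, SCOPE_PUBLIC, SCOPE_PRIVATE } Scope;

typedef struct {
    const char *symbol;  /* name or addressing expression, e.g. "e" or "A[g]" */
    AccessKind  access;  /* whether the function reads or writes the item */
    Scope       scope;   /* optional visibility qualifier */
} DepRecord;

/* Format one record as text (assumed layout: "<verb>[ <qualifier>] <symbol>;").
   Returns what snprintf returns: the length the full line needs. */
int format_dep(const DepRecord *r, char *out, size_t n)
{
    const char *verb = (r->access == ACCESS_READ) ? "read" : "write";
    const char *qual = (r->scope == SCOPE_PUBLIC)  ? " public"
                     : (r->scope == SCOPE_PRIVATE) ? " private" : "";
    return snprintf(out, n, "%s%s %s;", verb, qual, r->symbol);
}
```

A compiler following this sketch would emit one such line per symbol it observed being read or written while compiling the function.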
For example, consider the following case, in which func1.c is to be compiled:
Figure BDA00002949554100141
In this case, func1.c calls the external function in foo1.c, shown below:
Figure BDA00002949554100142
The source code of the called function in foo1.c is reproduced for illustrative purposes only. It should be understood that, as long as a dependency database (in this example, a dependency file) exists for foo1.c, its source code need not be available when the calling function in func1.c is compiled. In this example, the expressed dependency information stored in dependency file foo1.hd (which is generated when file foo1.c is compiled) may express the fact that the function-static variable "e" is both read and written. A non-limiting example of the corresponding dependency file is shown below:
At the compile time of file func1.c, the inclusion of header file foo1.h may cause dependency file foo1.hd to be read by compiler 200. This information informs the compiler of the expressed dependencies of called function foo1(): namely, that static variable "e" is read and written. It also allows compiler 200 to detect that global variables "A" and "F" are not referenced by called function foo1(), even though they are used in calling function func1(). This knowledge allows compiler 200 to vectorize the loop in function func1(), because it can determine that parallel operation will not cause incorrect behavior. In this case, the loop in func1() calls foo1() once for each element of the vector being processed.
If function foo1() wrote to global "A", compiler 200 might not vectorize the loop in func1(), or it might use this information to vectorize only part of the function. In that case the compiler may, for example, serialize the call to function foo1() and the memory reference to "A" while allowing the remainder of the loop to execute in parallel.
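The listings for func1.c and foo1.c are not reproduced in this text. The following is a hypothetical sketch that is merely consistent with the description above (a calling loop that uses globals A and F, and a called function whose only dependency is a function-static variable e); the array sizes, loop bound, and arithmetic are assumptions.

```c
/* Hypothetical reconstruction; sizes, bounds and arithmetic are assumed. */
int A[1000];
int F[1000];

/* Called function: reads and writes only the function-static variable e. */
int foo1(int d)
{
    static int e = 0;
    e = e + d;
    return e;
}

/* Calling function: a non-leaf loop that calls foo1() and touches A and F,
   neither of which foo1() references. */
int func1(int b)
{
    int c = 0;
    for (int x = 0; x < 100; ++x) {
        c = c + foo1(x) + A[x + b];
        F[x] = c;
    }
    return c;
}
```

With a dependency file stating only that foo1() reads and writes e, a compiler can see that the call does not touch A or F without inspecting foo1.c.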
Referring to Fig. 4, a flowchart representing one embodiment of a method of vectorizing a function is depicted. In block 400, compiler 200 identifies a calling function. In one non-limiting embodiment, the calling function may include a non-leaf loop, in which case the calling function includes a call to an external or called function. Referring to the code example just given, compiler 200 may process the func1.c source code and identify func1() as a calling function that includes a non-leaf for loop that calls foo1().
In block 410, compiler 200 may attempt to access a dependency database associated with the called function. In some cases, the dependency database (for example, a dependency file) may be indicated to compiler 200 explicitly, for example via a command-line parameter, a compiler directive embedded in the source code, or another suitable technique. In other cases, compiler 200 may attempt to infer the name of the dependency file from other data according to a naming convention. For example, if a header file is included in the source code, compiler 200 may search for a dependency file whose name is derived from the name of the header file. In some embodiments, compiler 200 may search for dependency files based on the name of the called function.
If the dependency database exists, it may indicate an expressed dependency within the called function. As noted above, this expressed dependency may be, for example, a memory or data dependency associated with a data item that is not a parameter of the called function. In some cases, compiler 200 may check a number of different naming conventions to determine whether a dependency file exists.
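The lookup in block 410 can be sketched as follows, assuming the naming convention suggested by the foo.h and foo.hd example (append "d" to the header file name). The helper names dep_path_from_header() and dep_file_exists() are hypothetical, and a real compiler would also consult its include search paths.

```c
#include <stdio.h>

/* Derive "foo.hd" from "foo.h" by appending "d" (assumed convention).
   Returns 0 on success, -1 if the output buffer is too small. */
int dep_path_from_header(const char *header, char *out, size_t n)
{
    int len = snprintf(out, n, "%sd", header);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Returns 1 if the dependency file can be opened for reading, 0 otherwise.
   A missing file is not an error: the caller simply falls back to
   pessimistic assumptions about the called function. */
int dep_file_exists(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}
```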
In block 420, compiler 200 then determines whether the calling function interacts with the called function, based at least in part on the expressed dependency (or the absence of a dependency). For example, upon accessing the dependency file associated with function foo1(), compiler 200 may determine that foo1() depends on variable "e" but not on variables "A" or "F". Thus, compiler 200 may determine that calling function func1() does not interact with called function foo1(), at least with respect to variable "e".
In block 430, depending on the determination of whether the calling function interacts with the called function, compiler 200 may determine whether to vectorize at least a portion of the calling function. For example, based on the expressed dependency information discussed above, compiler 200 may attempt to vectorize calling function func1() by generating vector code that operates concurrently on multiple data items (for example, array elements) and/or multiple loop iterations.
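The determination made in blocks 420 and 430 can be illustrated with a simplified sketch. Here symbols are compared as plain strings and a conflict is reported only when both sides touch the same symbol and at least one of them writes it; the structure and function names are hypothetical, and scope handling (variables with the same name but different scopes) is deliberately left out.

```c
#include <string.h>

typedef struct {
    const char *symbol;  /* symbol named in the dependency file */
    int reads;           /* nonzero if the called function reads it */
    int writes;          /* nonzero if the called function writes it */
} CalleeDep;

typedef struct {
    const char *symbol;  /* symbol referenced in the calling loop */
    int reads;
    int writes;
} CallerRef;

/* Returns 1 if some symbol is touched by both sides and at least one side
   writes it; returns 0 if no such conflict is expressed. */
int functions_interact(const CalleeDep *deps, int ndeps,
                       const CallerRef *refs, int nrefs)
{
    for (int i = 0; i < ndeps; ++i)
        for (int j = 0; j < nrefs; ++j)
            if (strcmp(deps[i].symbol, refs[j].symbol) == 0 &&
                (deps[i].writes || refs[j].writes))
                return 1;
    return 0;
}
```

Applied to the example above, a called function that only reads and writes "e" does not interact with a loop that uses "A" and "F", whereas one that also writes "A" does.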
In various embodiments, the dependence database can be expressed various types of information, and this information can be useful to compiler 200 when determining whether the vector quantization function.example comprises follows the tracks of reading and writing following item: data object, pointer, directed data object, known offset in directed object, enter the unknown skew (it can consist of quoting for whole object effectively) of directed object, variable offset (directed object and data object in object, it can utilize the variable of discussing to enable to realize dependence analysis working time), and enter in the object of the unknown skew in high-level objects more known offset (for example, in the known offset of quoting unknown number but other skew when still keeping unreferenced).
Known-offset information may enable compiler 200 to vectorize without generating additional dependency-checking instructions, while variable-offset information may be used to generate dependency-checking instructions that analyze the variable dependencies at run time, which may allow increased vector parallelism while still maintaining program correctness.
As explained above, a dependency database may express information about a called function that is useful to compiler 200 when vectorizing a calling function. In this regard, a dependency database may store information such as the type of memory access, the addressing mode, and/or additional qualifiers.
In some embodiments, memory accesses by a function generally fall into two types: reads and writes. Thus, as shown in the examples given above, a dependency database may explicitly store an indication of whether a data item is read or written.
Addressing modes describe the memory accesses within the called function as seen by the calling function. Some embodiments may define three addressing modes (constant, variable, and unknown), although alternative embodiments are possible and contemplated. These three addressing modes are distinguished by whether the addressing can be established by the compiler at compile time, by the calling function at run time, or by the called function at run time, respectively. In addition, some embodiments may define two orthogonal qualifiers for the addressing modes: public and private. These designate whether the associated variable is visible to external modules.
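One way to picture how the three addressing modes could steer hazard handling is the sketch below. The enumerations and the mapping function strategy_for() are illustrative assumptions: constant addressing is resolved at compile time, variable addressing leads to run-time checks in the calling function, and unknown addressing forces the conservative assumption that a conflict may exist.

```c
/* Addressing modes as seen by the calling function. */
typedef enum { ADDR_CONSTANT, ADDR_VARIABLE, ADDR_UNKNOWN } AddrMode;

/* Hypothetical hazard-handling strategies for the calling loop. */
typedef enum {
    CHECK_AT_COMPILE_TIME,  /* no run-time checking instructions needed */
    CHECK_AT_RUN_TIME,      /* emit dependency-checking instructions */
    ASSUME_CONFLICT         /* caller cannot evaluate the addressing */
} Strategy;

/* Map an addressing mode to the strategy assumed in this sketch. */
Strategy strategy_for(AddrMode mode)
{
    switch (mode) {
    case ADDR_CONSTANT:
        return CHECK_AT_COMPILE_TIME;
    case ADDR_VARIABLE:
        return CHECK_AT_RUN_TIME;
    case ADDR_UNKNOWN:
    default:
        return ASSUME_CONFLICT;
    }
}
```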
According to some embodiments, constant addressing describes addressing that can be resolved from outside the module at compile time. This includes references to named variables, to named structure elements within named structures, and to array indexes that can be resolved at compile time. For example, g (a named variable), str.g (a named structure element within a named structure), h[5] (an array indexed by a constant), and str[5].h (a named structure element within a named array of structures indexed by a constant) are examples of constant addressing. These examples may represent static variables or global variables. (Automatic storage is usually temporary, for example allocated on entry to a module and deallocated on exit from it, and is therefore generally not visible outside the module.) The following example illustrates the dependencies of a function that uses constant addressing:
Figure BDA00002949554100181
In some embodiments, variable addressing describes addressing that is not constant but is also not modified by the called function. It can therefore be evaluated by the calling function at run time. Examples include references to pointed-to objects and to arrays where the addressing can be observed by the calling function. Consider the following function:
Figure BDA00002949554100182
This function would export the following dependencies to the dependency file, declaring that the function writes A[g] and reads A[x] (both of which are variably addressed array references):
Figure BDA00002949554100183
In this example, dependency checking (which may also be referred to as hazard checking) may be unnecessary if function assignA() is called only once per iteration of the calling function. The called function assignA() can itself determine whether g and x overlap and partition the vector accordingly, for example by using Macroscalar techniques.
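A hedged sketch of this situation follows. The body of assignA() is an assumption that is merely consistent with the stated dependencies (it writes A[g] and reads A[x]), and first_dependent_lane() is a hypothetical helper showing one simple way a vector of calls could be partitioned: it finds the first lane whose read index matches the write index of an earlier lane.

```c
/* File-static array: not exported, hence "private" to callers. */
static int A[1000];

/* Assumed body: writes A[g] and reads A[x], as the dependencies state. */
void assignA(int g, int x)
{
    A[g] = A[x];
}

/* Returns the first lane j whose read index x[j] equals the write index
   g[i] of an earlier lane i < j, or n if there is none. Lanes before the
   returned position can be processed together; only this read-after-write
   case is considered in this sketch. */
int first_dependent_lane(const int *g, const int *x, int n)
{
    for (int j = 1; j < n; ++j)
        for (int i = 0; i < j; ++i)
            if (g[i] == x[j])
                return j;
    return n;
}
```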
Now consider a situation in which each iteration of an outer loop calls assignA() twice:
Figure BDA00002949554100191
Although hazards may exist between g1 and x, or between g2 and y, these dependencies pertain to a single invocation of the function. In this particular case, the calling loop need only check for potential hazards between g1 and y and between g2 and x, which it can identify from the information in the dependency file.
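The cross-call check can be sketched as follows, assuming that iteration i of the calling loop performs assignA(g1[i], x[i]) followed by assignA(g2[i], y[i]), and that a vectorized loop issues all first calls before all second calls. The function name and argument layout are hypothetical, and only the two index pairs named above are compared; a complete implementation would likely also need to consider write ordering between g1 and g2.

```c
/* Returns 1 if running the first call for every lane before the second
   call for any lane could reorder a dependent pair of accesses, checking
   only g1 against y and g2 against x across different iterations. */
int cross_call_hazard(const int *g1, const int *x,
                      const int *g2, const int *y, int n)
{
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            /* Later iteration's first-call write hits an element that the
               earlier iteration's second call still has to read. */
            if (g1[j] == y[i])
                return 1;
            /* Later iteration's first-call read needs a value that the
               earlier iteration's second call has not yet written. */
            if (x[j] == g2[i])
                return 1;
        }
    }
    return 0;
}
```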
In some embodiments, unknown addressing is similar to variable addressing as described above, but generally applies to situations in which the run-time addressing cannot be evaluated by the calling function. This may occur, for example, when the called function modifies the values of address variables in a manner that the calling function cannot see using the information in the dependency file.
The additional qualifiers "public" and "private" may designate whether the linker exports a symbol so that the variable can be inspected by calling functions. For example, the references to A[] in the next-to-last example given above are designated "private" because A[] is declared as a file-static variable that is not exported to functions that call assignA(). In this example, compiler 200 can determine from the dependency information how the assignA() function addresses A[], but may be unable to generate code that actually reads the values of A[].
Full-function vectorization
As described in detail above, compiler auto-vectorization may be employed to generate vectorized code from non-vectorized source code in a manner that is transparent to programmers or other users. Such compiler auto-vectorization may enable source code to take advantage of the performance improvements offered by vector computing hardware with little or no programmer intervention.
However, if non-leaf functions (that is, functions that call other functions) are to be vectorized effectively, it may be desirable to provide versions of called functions that expose a vector interface to the calling function, rather than the scalar interface that may be represented in the original source code.
Moreover, an application developer may wish to target an application to a variety of computing platforms, not all of which may offer vector resources. For example, a mobile version of a processor family might omit vector operations to reduce die size and power consumption, whereas a desktop version of the same processor family might be developed to emphasize processing power over power consumption. In this scenario, an application might need to be compiled using only scalar functions in order to execute on the mobile processor, whereas it might use either scalar or vector functions when executing on the desktop processor. As with the auto-vectorization described above, however, it may be desirable to allow the application to execute efficiently on both vector and non-vector platforms while reducing or eliminating programmer intervention.
Accordingly, when vectorizing a function, a compiler according to some embodiments described herein may generate both scalar and vector versions of the function from a single source code description. The function may be, for example, a library function, although more generally it may correspond to any callable procedure or method. In some embodiments, the scalar version of the function may use the scalar interface as originally specified by the source code. The vector version of the function, meanwhile, may implement a vector interface to the function, accepting vector parameters and/or generating vector return values. By generating both scalar and vector versions of the function, the compiler may enable code to be tailored more flexibly to the available resources, either at compile time or at run time. Moreover, by generating a vectorized version of a called function and exposing the resulting vector interface to calling functions, the compiler may facilitate the vectorization of calling functions, thereby propagating opportunities for vectorization hierarchically upward from leaf functions.
The vector interface may be expressed, for example, in a dependency database associated with the function (such as a dependency file). For example, consider the following function shell, in which the internal details of the function have been omitted:
Figure BDA00002949554100201
A scalar interface for this function may be represented (for example, within a dependency file) as:
int foo(int A)
This representation reflects that, in this version, foo() takes a scalar parameter and returns a scalar result.
The same function, when vectorized to operate on multiple data items at a time, may become, for example:
Figure BDA00002949554100211
Accordingly, a vector interface for this function may be represented (for example, within a dependency file) as:
Vector foo(Vector A)
Unlike the previous representation, this representation indicates that this version of foo() takes a vector parameter and returns a vector result.
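To make the two interfaces concrete, the sketch below emulates a Vector as a small fixed-length array and gives foo() a placeholder body, since the actual function code is omitted from the description. The vector length VLEN, the body, and the name foo_vec() (C has no overloading, so the vector version needs a distinct name here) are all assumptions.

```c
#define VLEN 4  /* assumed number of elements per vector */

/* Software stand-in for a hardware vector of integers. */
typedef struct { int lane[VLEN]; } Vector;

/* Scalar version, interface: int foo(int A). Placeholder body. */
int foo(int A)
{
    int B = A * 2 + 1;
    return B;
}

/* Vector version, interface: Vector foo(Vector A). Applies the same
   placeholder computation to every element of the input vector. */
Vector foo_vec(Vector A)
{
    Vector B;
    for (int i = 0; i < VLEN; ++i)
        B.lane[i] = A.lane[i] * 2 + 1;
    return B;
}
```

Both versions are produced from one source description, so for any element the vector result matches what the scalar version returns for the same input.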
Referring to Fig. 5, a flowchart representing one embodiment of a full-function vectorization method is depicted. In block 500, compiler 200 receives a function to be compiled. In block 510, compiler 200 may compile a scalar version of the function. In block 520, compiler 200 may compile a vector version of the function. In block 530, compiler 200 may express the vector interface associated with the vector version of the function in the dependency database.
The presence of this alternative vector interface allows compiler 200 to make vector function calls from within vectorized loops, rather than making multiple serialized scalar function calls from within a vectorized loop. For example, consider the following loop in a calling function that calls external function foo():
Figure BDA00002949554100221
If foo() had only a scalar interface, the opportunities for vectorizing this loop might be limited, for example to vectorization of the assignment. The presence of a vector version of foo(), however, may increase the opportunities for loop vectorization. For example, a vectorized version of the above loop might call foo() with vector parameters and might receive vector results, enabling more concurrent execution and reducing serialization within the loop. Furthermore, unlike previous approaches, this technique permits the vectorization of functions that do not contain loops. This may increase the overall amount of vectorization in an application.
Loops in both versions of a function may be vectorized. Generally speaking, "horizontal" vectorization refers to a type of vectorization in which the iterations of a loop are mapped to corresponding elements of a vector. "Vertical" vectorization refers to a type of vectorization in which the iterative nature of the loop is preserved (that is, as opposed to being mapped to vector elements as in horizontal vectorization) but in which scalar variables are replaced with vector variables, so that each iteration operates concurrently on more data than the scalar version of the code.
Loops in the scalar version of a function may be vectorized horizontally using Macroscalar techniques, while loops in the vector version of the function may be vectorized either horizontally or vertically. This may increase the opportunities for vectorization in an application. In addition to the performance and efficiency benefits of vectorizing function calls, this technique may increase the number of loops in an application that are vectorized vertically, thereby reducing the overhead incurred when loops are vectorized horizontally.
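The difference between the two styles can be illustrated with a small emulation. The summation function below is a hypothetical example (it does not appear in the description), and vector operations are modeled as short inner loops over VLEN lanes.

```c
#define VLEN 4  /* assumed number of elements per vector */

typedef struct { int lane[VLEN]; } Vector;

/* Scalar source: sums a[0..n-1]. */
int sum_scalar(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Horizontal: VLEN consecutive iterations of the loop are mapped onto the
   lanes of one vector; partial sums are combined afterwards. */
int sum_horizontal(const int *a, int n)
{
    Vector acc = { { 0 } };
    int i = 0;
    for (; i + VLEN <= n; i += VLEN)
        for (int l = 0; l < VLEN; ++l)   /* models one vector add */
            acc.lane[l] += a[i + l];
    int s = 0;
    for (int l = 0; l < VLEN; ++l)
        s += acc.lane[l];
    for (; i < n; ++i)                   /* scalar tail */
        s += a[i];
    return s;
}

/* Vertical: the loop still advances one iteration at a time, but every
   scalar becomes a vector, so each lane serves a different call. */
Vector sum_vertical(const int *a[VLEN], Vector n)
{
    Vector s = { { 0 } };
    int max_n = 0;
    for (int l = 0; l < VLEN; ++l)
        if (n.lane[l] > max_n)
            max_n = n.lane[l];
    for (int i = 0; i < max_n; ++i)
        for (int l = 0; l < VLEN; ++l)   /* models one predicated vector add */
            if (i < n.lane[l])
                s.lane[l] += a[l][i];
    return s;
}
```

In the vertical variant, one invocation does the work of VLEN scalar calls, which is the shape a vector version of a function called from a vectorized loop would take.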
Referring to Fig. 6, a flowchart representing one embodiment of a method of using a vectorized function is depicted. In block 600, compiler 200 identifies a calling function that makes a call to a called function. For example, the calling function may include a loop that calls a function in a precompiled library. In block 610, compiler 200 accesses the dependency database associated with the called function. In block 620, compiler 200 checks the dependency database to determine whether a vector variant of the called function is available. In one implementation, if the vector version is available, then in block 630 compiler 200 compiles the calling function to use the vector variant of the called function. If the vector version is not available, compiler 200 compiles the calling function to use the scalar version (for example, by calling the scalar version of the function iteratively).
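The choice made in blocks 620 and 630 happens at compile time, but its effect can be modeled at run time with a small sketch: a table entry records whether a vector variant exists, and the caller uses it for full vectors and otherwise falls back to iterating over the scalar version. The FuncEntry structure, the apply() helper, and the sample functions twice() and twice_vec() are hypothetical.

```c
#include <stddef.h>

#define VLEN 4  /* assumed number of elements per vector */

typedef struct { int lane[VLEN]; } Vector;

/* Models what the dependency database says about a called function. */
typedef struct {
    int    (*scalar)(int);     /* scalar interface, always present */
    Vector (*vector)(Vector);  /* NULL when no vector interface is recorded */
} FuncEntry;

/* Sample called function in scalar and vector form (placeholders). */
static int twice(int a) { return 2 * a; }
static Vector twice_vec(Vector a)
{
    for (int l = 0; l < VLEN; ++l)
        a.lane[l] *= 2;
    return a;
}

/* Applies the function to n inputs. Uses one vector call per full vector
   when a vector variant exists; everything else goes through the scalar
   version. Returns the number of vector calls made. */
int apply(const FuncEntry *f, const int *in, int *out, int n)
{
    int i = 0, vcalls = 0;
    if (f->vector != NULL) {
        for (; i + VLEN <= n; i += VLEN) {
            Vector a, r;
            for (int l = 0; l < VLEN; ++l)
                a.lane[l] = in[i + l];
            r = f->vector(a);
            for (int l = 0; l < VLEN; ++l)
                out[i + l] = r.lane[l];
            ++vcalls;
        }
    }
    for (; i < n; ++i)
        out[i] = f->scalar(in[i]);
    return vcalls;
}
```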
For example, consider again the following loop:
Figure BDA00002949554100231
When vectorizing this loop, the compiler may check the dependency database associated with foo() to determine whether a vector interface associated with foo() exists. If no vector interface for foo() exists, compiler 200 may vectorize the loop only partially, for example by vectorizing the assignment while leaving the function call in scalar form.
If, on the other hand, foo() has a vectorized interface expressed in its dependency database, then in some instances compiler 200 may vectorize the loop in its entirety (for example, by replacing or otherwise transforming both the assignment and the function call into vector operations).
When the compiler checks the dependency database of foo() to determine whether a vectorized interface exists for the called function, the compiler may additionally or alternatively examine any memory dependencies associated with the called function, which may be expressed in the same dependency database associated with foo() or in another one.
In some implementations, the addressing of each dimension of an array is tracked independently in order to minimize uncertainty. This concept generally applies to all aggregate data types, such as structures and arrays. The following example illustrates in more detail how a compiler (such as compiler 200) may use dependency database information to enable vectorization, and how it may use vector versions of functions in place of scalar versions where possible. (It should be noted that, in other embodiments, a dependency database may be used independently of determining whether vector function interfaces exist, and vice versa.)
Figure BDA00002949554100241
In this example, function bar() would export dependencies indicating that it writes to p.ptr[] and reads from p.b and j (for example, via a dependency file generated by compiler 200 when function bar() is compiled, as described above):
Figure BDA00002949554100251
It should be noted that, in this particular case, it may be unnecessary to identify whether the parameters are referenced as "public" or "private." Likewise, it may be unnecessary to declare that the function reads from p or j, since, at least in this example, it can be assumed that a function uses its own parameters. The type definition of myStruct may be included in the dependency database in order to expose it to functions that call foo() but that do not necessarily have the definition of myStruct exposed to them through a header file.
During compilation, compiler 200 may compile function bar() without vectorizing it, because it contains no loop over which to vectorize. In doing so, it may generate a scalar version of bar() with the following interface:
int bar(myStruct *p, int j)
In this example, bar() takes a single pointer to a structure and a single integer as parameters, and returns a single integer as its result. Thus, this version of bar() is scalar in both its inputs and its outputs.
However, compiler 200 may also compile a vector function with the following interface, which may also be exported in the dependency database:
Vector bar(Vector p, Vector j, Vector pred)
In this example, the predicate vector pred designates which vector elements should be processed by the function. For example, assuming that a vector contains a defined number of elements, a predicate vector may contain the same defined number of bits, with each bit corresponding to a respective element. Each bit may serve as a Boolean predicate that determines whether its corresponding vector element should be processed (for example, "yes" if the predicate bit is "1" and "no" if it is "0", or vice versa). Predicates allow the calling function to make conditional function calls, and to handle the tail of the loop if the loop does not end on a vector-length boundary. It should be noted that other embodiments may employ different predicate formats, such as non-Boolean predicates.
Also in this example, vector p is a vector of pointers to structures, although in this example they all point to the same instance. Vector j is a vector of simple integers. The compiler can infer this type information from the scalar function declaration.
One possible vector variant of function bar() computes p.b+j for each element of the input vectors and writes the results into the appropriate array indexes of p.ptr. It also returns a vector of results based on a comparison of p.b and j. In this particular example, the compiler vectorizes the function vertically. That is, because bar() contains no loop, there are no loop iterations to be transformed into vector elements, as would be the case in horizontal vectorization. Instead, the vectorized version of bar() may operate concurrently on the different elements of its vector inputs.
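The listing for bar() is not reproduced in this text, so the following is only a sketch under stated assumptions: the field layout of myStruct, the value stored through p->ptr (zero), and the direction of the comparison are all guesses that are merely consistent with the description. The vector variant is emulated lane by lane with plain arrays standing in for the Vector parameters, and the name bar_vec() is hypothetical.

```c
#define VLEN 4  /* assumed number of elements per vector */

/* Assumed layout; only the members a, b and ptr are named in the text. */
typedef struct { int a, b, c; int *ptr; } myStruct;

/* Scalar version, interface: int bar(myStruct *p, int j).
   Storing 0 and comparing with '>' are assumptions. */
int bar(myStruct *p, int j)
{
    p->ptr[p->b + j] = 0;
    return p->b > j;
}

/* Vector variant, emulated lane by lane. Lanes whose predicate is 0 are
   skipped entirely and report a result of 0. */
void bar_vec(myStruct *p[VLEN], const int j[VLEN],
             const unsigned char pred[VLEN], int result[VLEN])
{
    for (int l = 0; l < VLEN; ++l) {
        result[l] = 0;
        if (!pred[l])
            continue;                       /* inactive element */
        p[l]->ptr[p[l]->b + j[l]] = 0;      /* write p.ptr[p.b+j] */
        result[l] = p[l]->b > j[l];         /* compare p.b with j */
    }
}
```

Because the predicate masks both the store and the result, a caller can pass a partially filled vector without the inactive lanes touching memory.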
When compiling foo(), compiler 200 may read the dependency information for function bar(), which need not be located in the same source file, and determine that called function bar() has no dependency on g.a, even though the calling function passes it a pointer to structure g. Because it has this information, compiler 200 can horizontally vectorize the loop in function foo(). Moreover, compiler 200 can make a single function call to the vector variant of bar() for each vector processed, rather than calling the scalar variant in every iteration of the loop. Finally, compiler 200 may create a vector variant of foo() with a vector interface. In this particular case, vertical vectorization may not be applicable, because the full range of x cannot be analyzed for dependencies. Horizontal vectorization of the loop may be applied instead, with the loop contained within another loop that iterates over the vector elements passed to the vector variant of function foo().
Under these supposition, function f oo () may export following dependence:
(the unknown addressing of@symbolic representation).Because function bar () output dependence " writep.ptr[p.b+j] ", so compiler 200 can posit structure member ptr[] be written as the function of x.Thus, the index that compiler 200 can write to the called side report of foo () is unknown, because it can not be determined by the called side of foo ().
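The way a compiler might consult expressed dependencies when deciding whether a call interferes with a loop can be sketched as a simple table lookup. The record layout and the functions callee_access and call_is_independent_of are hypothetical and are not the patent's file format. The sketch also leaves out the mapping from the formal parameter p to the actual argument g, so items are named in the called function's terms.

```c
#include <assert.h>
#include <string.h>

/* Access kinds a dependency entry can express. */
enum access { ACC_READ = 1, ACC_WRITE = 2 };

/* One expressed dependency of a called function (hypothetical layout). */
struct dep {
    const char *func;   /* called function, e.g. "bar" */
    const char *item;   /* data item, e.g. "p.ptr" */
    int         access; /* bitwise OR of enum access values */
};

/* Returns the combined access bits that `func` has on `item`,
   or 0 when the database lists no dependency between them. */
int callee_access(const struct dep *db, int n, const char *func, const char *item)
{
    int bits = 0;
    assert(db != NULL || n == 0);
    for (int i = 0; i < n; ++i)
        if (strcmp(db[i].func, func) == 0 && strcmp(db[i].item, item) == 0)
            bits |= db[i].access;
    return bits;
}

/* A loop that uses `item` can treat the call to `func` as independent
   of that item when the callee neither reads nor writes it. */
int call_is_independent_of(const struct dep *db, int n, const char *func, const char *item)
{
    return callee_access(db, n, func, item) == 0;
}
```

With entries recording that bar reads p.b and writes p.ptr, a query for p.a reports no access, which corresponds to the conclusion in the text that bar() has no dependency on g.a.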
Additional Implementation Techniques
This section describes additional non-limiting compiler techniques that can be used to implement non-leaf and full-function vectorization. The description below is based on Macroscalar compiler technology, but those of ordinary skill in the art will recognize, in view of this disclosure, that other compiler technologies can be used.
The previous example illustrated that addressing can include mathematical expressions. This generally holds as long as the expression involves no function call and contains only terms visible to the calling function. It can include indirect addressing, such as look-up tables used when computing indexes into other arrays.
Indirect addressing is one situation in which configuring the compiler and linker to export static arrays as public can help vectorize more loops. Consider the following example:
The dependencies generated for foo() can differ depending on whether the compiler and linker are configured to export static symbols publicly. In the examples below, the first dependency file expresses private static variables, and the second expresses public static variables:
Figure BDA00002949554100291
Note that the type declaration of A is required in the dependency file when A is exported publicly. When static variables are private, the addressing of B[] is unknown, because it cannot be determined from outside the function. Because hazard checking is then impossible, the loop in bar() cannot be vectorized. When the tools are configured to export static variables publicly, however, the compiler can emit instructions that read the contents of A[x] and check for hazards between B[A[x]] and B[x], which enables vectorization of the loop.
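A run-time hazard check of the kind described above can be sketched as follows. The loop body B[A[x]] = B[x] + 1 is an assumption made for this illustration, since the original listing is not reproduced in this text, and VLEN and safe_lanes are hypothetical names. The sketch checks only read-after-write hazards inside one vector.

```c
#include <assert.h>

#define VLEN 4 /* hypothetical vector length */

/* Assumed loop body: B[A[x]] = B[x] + 1, for x = x0 .. x0+VLEN-1.
   Lane k reads B[x0+k]; lane i writes B[A[x0+i]].
   Returns how many leading lanes can run together before a lane would
   read a location that an earlier lane in the same vector writes. */
int safe_lanes(const int *A, int x0)
{
    assert(A != 0);
    for (int k = 1; k < VLEN; ++k)
        for (int i = 0; i < k; ++i)
            if (A[x0 + i] == x0 + k)
                return k; /* lanes 0..k-1 are hazard-free */
    return VLEN;
}
```

The check is only possible because the contents of A[] are readable by the caller, which is the point of exporting the static array publicly. When A[] is private, the index into B[] is unknown and no such comparison can be emitted.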
Naturally, when static variables are exported publicly and addressed externally, the chance of name conflicts rises. To help avoid such conflicts, the names of static variables can be mangled with the function and file in which they are declared.
Some hazards involve memory operations that occur conditionally, or addressing that differs depending on a conditional calculation. To support vectorization of loops that call functions with conditional dependencies, a mechanism can be provided to express how the condition affects the dependencies.
For example, consider following code:
if(A[x]<c)d=B[x];
This code can be expressed in the dependency database as:
Conditional expressions can also appear in the calculation of an address. For example, consider the following code:
Figure BDA00002949554100302
This code can be expressed in the dependency file as:
Figure BDA00002949554100303
Alternatively, the latter conditional expression above can be expressed as:
read public B[A[x] < c ? x : x+c];
In some cases, unknowns can creep into a dependency expression. One illustrative example is:
A[x] < c ? read public B[x] : read public B[@];
If the condition is true, this expression informs the compiler of a specific dependency on B; if the condition is false, it informs the compiler of an unknown dependency on B.
An unknown that creeps into the condition itself can produce unconditional dependencies that behave as though the condition were both true and false. For example:
A[x] < B[@] ? read public f : read public g;
can be expressed as:
read public f;
read public g;
And:
read public A[x > @ ? x : x+y];
can be expressed as:
read public A[x];
read public A[x+y];
Because calling functions usually cannot evaluate the unknown condition, they can make the conservative assumption that both possible indexes into A[] are accessed.
In some implementations, circular dependencies can also be expressed in the dependency database. For example, consider the following function:
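The conservative treatment of an unknown condition can be sketched in C. The types cond_state and cond_index and the function possible_indexes are hypothetical, and the index expressions are held as plain strings purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* What the caller knows about the condition guarding an access. */
enum cond_state { COND_FALSE, COND_TRUE, COND_UNKNOWN };

/* A conditional index expression such as A[c ? x : x+y] (hypothetical layout). */
struct cond_index {
    enum cond_state cond;
    const char *if_true;   /* index text used when the condition holds */
    const char *if_false;  /* index text used otherwise */
};

/* Writes the index expressions the caller must assume are accessed into
   out[0..1] and returns how many there are (1 or 2). */
size_t possible_indexes(const struct cond_index *ci, const char *out[2])
{
    assert(ci != NULL && out != NULL);
    switch (ci->cond) {
    case COND_TRUE:
        out[0] = ci->if_true;
        return 1;
    case COND_FALSE:
        out[0] = ci->if_false;
        return 1;
    default: /* unknown: conservatively assume both indexes are accessed */
        out[0] = ci->if_true;
        out[1] = ci->if_false;
        return 2;
    }
}
```

For the expression read public A[x > @ ? x : x+y], the condition is unknown, so both x and x+y are reported, matching the two unconditional reads listed above.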
if(A[x]>b)b=A[x]
In one implementation, this function can be expressed as:
read public A[x];
read public b;
A[x] > b ? write public b;
When a pointer or reference is passed to a function (also called "passing by reference"), the function can modify its calling parameters. This differs from modifying parameters passed by value, because modifications of parameters passed by reference can affect the operation of the calling function. Modifications of parameters passed by reference can be recorded in the same way as modifications of static and global storage. Modifications of parameters passed by value can be treated as modifications of local automatic storage. In some cases they need not be recorded, because they are invisible to the calling function.
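The difference between the two kinds of parameter modification can be seen in a minimal C sketch. The function names bump_by_value and bump_by_reference are hypothetical and serve only as illustration.

```c
#include <assert.h>

/* Parameter passed by value: the increment changes only the callee's local
   copy, so the caller cannot observe it. */
int bump_by_value(int n)
{
    n += 1;
    return n;
}

/* Parameter passed by reference: the write goes through the pointer and is
   visible to the caller, so it would be recorded in the same way as a write
   to static or global storage. */
void bump_by_reference(int *n)
{
    assert(n != 0);
    *n += 1;
}
```

Only the second function would need a write entry for its parameter, because only its modification can change what the calling loop later reads.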
In some implementations, functions that meet a set of criteria can be called speculatively where software speculation would be needed to vectorize the calling loop. Accordingly, a speculation-safe indicator can be expressed in the dependency file to indicate that the corresponding code can be called safely in a speculative manner. In one non-limiting example, vector functions that can be called speculatively fall into one of two categories: type-A and type-B. Type-A functions can be vector functions with the normal vector interface described here. Type-A functions can be called speculatively without harmful side effects if they meet the following criteria. First, the function accesses no memory other than local automatic non-array storage. Second, the function calls no function that is not also a type-A function. Examples of type-A functions might be transcendental functions or other iterative convergence algorithms.
In addition to any return value specified by the source code, a type-B function can return a predicate vector indicating which elements were processed. In one embodiment, the criteria for calling a type-B function speculatively are as follows. First, any read from non-local storage or local array storage uses a first-faulting read instruction. Second, the function does not write to non-local storage or static local storage. Third, the function calls no function that is neither a type-A nor a type-B function.
Calling a type-A function from a loop can be similar to calling a non-speculative function. Typically, the calling loop needs no special action when it calls a type-A function speculatively. Calling a type-B function, however, can require the calling loop to check the returned vector to determine which elements were processed, and to adjust its behavior in response.
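The interaction between a type-B function and its calling loop can be emulated in plain C. This is a simplified sketch, not the Macroscalar instruction semantics: an out-of-range index stands in for a faulting read, and VLEN, load_speculative, and lanes_left are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define VLEN 4 /* hypothetical vector length */

/* Emulated type-B vector function. It loads table[idx[i]] into out[i] for
   each active lane. An out-of-range index stands in for a faulting read:
   processing stops there, and the returned predicate has a 1 bit for every
   lane that was actually processed. Nothing outside `out` is written. */
uint8_t load_speculative(uint8_t pred, const int *table, int table_len,
                         const int *idx, int *out)
{
    uint8_t done = 0;
    assert(table != 0 && idx != 0 && out != 0);
    for (int i = 0; i < VLEN; ++i) {
        if (!(pred & (1u << i)))
            continue;
        if (idx[i] < 0 || idx[i] >= table_len)
            break; /* would fault: leave this and later lanes unprocessed */
        out[i] = table[idx[i]];
        done |= (uint8_t)(1u << i);
    }
    return done;
}

/* Calling side: compares the returned predicate with the requested one and
   reports how many requested lanes were left unprocessed. */
int lanes_left(uint8_t requested, uint8_t done)
{
    uint8_t rest = (uint8_t)(requested & ~done);
    int count = 0;
    for (int i = 0; i < VLEN; ++i)
        count += (rest >> i) & 1;
    return count;
}
```

A calling loop would use the returned predicate to decide which lanes to retry or to handle on a later pass, which is the adjustment of behavior the text describes.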
A compiler such as compiler 200 can choose to have all callers of type-B vector functions adjust their behavior to accommodate the number of elements actually processed, regardless of whether software speculation is used in the calling loop. Alternatively, compiler 200 can create two vector functions for each type-B function: one speculative and one non-speculative. The criteria for type-B loops can generally be designed so that the qualifying loops are few and small, making the code-size impact of this approach negligible.
Type-A and type-B vector functions can be identified by their declarations in the dependency database, as shown below. In one implementation, the absence of a designator implies that the function cannot be called speculatively.
Figure BDA00002949554100331
Aliasing can sometimes be a problem for vectorizing compilers. Although the Macroscalar architecture addresses this problem through run-time alias analysis, that approach has overhead. Overhead in Macroscalar programs contributes to the serial component in Amdahl's law, which can limit the benefit of wider vectors. Moreover, aliasing with external or static variables can affect behavior across function calls. Therefore, in one implementation, compile-time alias analysis is performed, and aliasing indicators are exported to the dependency file.
For example, one approach is to separate aliasing events into two categories, such as inbound and outbound aliasing. From the perspective of the called function, inbound aliasing concerns addresses that come into a function, such as those passed in as parameters, read from external variables, or calculated by the function by taking the address of an external variable. Outbound aliasing, meanwhile, concerns pointers that the function puts out. These can be return values (that is, values the function writes to external variables or dereferenced pointers).
Furthermore, at least two types of aliasing can be tracked. "Copies aliasing" can indicate that a pointer may be a copy of another pointer and may alias anything that pointer can alias. "Pointer aliasing" can indicate that a pointer may affect another variable. Alias information in the dependency file is an affirmative statement that an alias may exist. It need not be used, for example, when the compiler simply cannot tell, for lack of information, whether two pointers reference the same memory.
Aliasing for variables can be declared in much the same way as aliasing for return values. For example, consider the following function:
Figure BDA00002949554100341
In one implementation, this function can express the following dependencies:
For clarity, the foregoing distinguishes between pointers and copies, although the two concepts could be combined in an alternative syntax. As with other dependency information, aliasing information typically propagates upward through the chain of calling functions.
Values returned by a function can also cause aliasing, for example through the return value itself or through information returned by modifying variables passed by reference. These can also be tracked in the dependency file. For example, consider the following function:
Figure BDA00002949554100351
In one implementation, this function can export the following dependencies:
Figure BDA00002949554100352
The dependency declaration can inform the calling loop that the pointer returned by foo() may be a copy of the pointer that was passed in. This lets the calling loop take measures to ensure correct operation regardless of any aliasing that occurs. This knowledge can also enable the compiler to make better use of ANSI aliasing rules when faced with code that is not ANSI C compliant.
As a further consideration, casting of pointers can affect address calculations. For example, consider the following function:
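One measure a calling loop might take when a returned pointer may be a copy of an argument is a run-time overlap test. The sketch below is an illustration under stated assumptions: choose_buffer is a hypothetical function, not the foo() of the original listing, and ranges_overlap compares addresses as integers.

```c
#include <assert.h>
#include <stdint.h>

/* Fallback storage used when no buffer is supplied (hypothetical). */
static int fallback[4];

/* Hypothetical function whose return value may be a copy of its argument. */
int *choose_buffer(int *p)
{
    return p ? p : fallback;
}

/* Run-time overlap test between the n-element ranges starting at a and b.
   Addresses are compared as integers to avoid relational comparison of
   pointers into unrelated objects. */
int ranges_overlap(const int *a, const int *b, int n)
{
    assert(n >= 0);
    uintptr_t lo_a = (uintptr_t)a;
    uintptr_t lo_b = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)n * sizeof(int);
    return lo_a < lo_b + bytes && lo_b < lo_a + bytes;
}
```

Because the dependency declaration only says that the result may be a copy, the caller cannot skip the test. It can, however, restrict the test to the pointers the declaration names instead of checking every pointer in the loop.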
Figure BDA00002949554100353
In one implementation, this function can export the following dependencies:
Calls through function pointers ordinarily cannot be vectorized, because it is not known at compile time which function will be called or whether the called function supports a vector interface. A function that calls other functions through pointers may export no dependency information, reflecting the uncertainty about the dependencies of the pointed-to function. This can cause the compiler to treat such a function as a scalar function with unknown dependencies.
In one implementation, a versioning scheme allows dependencies to be expressed using best practices at any point in time. For example, one embodiment can permit backward compatibility with dependency files generated by older compilers, and another can permit bidirectional compatibility, in which older compilers can also read files generated by newer compilers. Where backward compatibility is the only requirement, a version designator for the dependency file is used to inform older compilers that a given file is unreadable and should be ignored.
Bidirectional compatibility can be implemented as follows. Suppose, for example, that compiler version 1 does not support calculations in array indexes and compiler version 2 does. A write to B[x+y] can be expressed by a version 1 compiler as:
Figure BDA00002949554100361
A version 2 compiler, on the other hand, can additionally export the same function using version 2 syntax:
Figure BDA00002949554100362
With this approach, not only can a version 2 compiler read version 1 files, but version 2 declarations can also override version 1 declarations. A version 1 compiler knows to ignore any declaration with a version greater than 1, which gives it as much dependency information as it can understand. This capability becomes significant as compiler technology matures.
In general, if developers are required to change their software to enable vectorization, relatively little code is likely to be vectorized. To address this problem, the techniques described here provide the ability to perform large-scale vectorization without requiring developers to modify their source code.
Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. The following claims are intended to be interpreted to embrace all such variations and modifications.
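The override-and-ignore rule for versioned declarations can be sketched as a small selection routine. The struct decl layout and the function select_decl are hypothetical. The declaration strings in the usage note are assumed forms, since the original listings are not reproduced in this text.

```c
#include <assert.h>
#include <stddef.h>

/* One declaration in a dependency file, tagged with the syntax version
   needed to understand it (hypothetical layout). */
struct decl {
    int version;
    const char *text;
};

/* Returns the declaration that a compiler supporting `max_version` should
   use: the highest-versioned one it can read. Declarations with a newer
   version are skipped; NULL means nothing readable was found. */
const struct decl *select_decl(const struct decl *d, size_t n, int max_version)
{
    const struct decl *best = NULL;
    assert(d != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (d[i].version > max_version)
            continue;               /* too new for this compiler: ignore */
        if (best == NULL || d[i].version > best->version)
            best = &d[i];           /* newer readable form overrides older */
    }
    return best;
}
```

Given an assumed version 1 declaration such as "write public B[@]" and an assumed version 2 declaration such as "write public B[x+y]" for the same function, a version 1 compiler selects the first and a version 2 compiler selects the second.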

Claims (20)

1. A method, comprising:
performing, by one or more computers:
identifying a calling function, the calling function including a call to a called function;
accessing a persistent dependency database associated with the called function, wherein the persistent dependency database indicates an expressed dependency of the called function, and wherein the expressed dependency indicates whether the called function only reads a data item, only writes the data item, or both reads and writes the data item; and
generating, based at least in part on the expressed dependency, a determination of whether the calling function interacts with the called function.
2. The method according to claim 1, wherein the call to the called function occurs within a loop of the calling function.
3. The method according to claim 1, wherein the performing further comprises:
determining, from the persistent dependency database, the existence of a vector version of the called function; and
in the calling function, converting a call to a scalar version of the called function into a call to the vector version of the called function.
4. The method according to claim 1, wherein the performing further comprises: determining whether to vectorize at least a portion of the calling function based on one or more of the following, as indicated by the persistent dependency database: whether a variable is read or written by the calling function, whether the variable is public or private to the calling function, and an addressing mode associated with the variable.
5. The method according to claim 1, wherein the performing further comprises:
compiling source code corresponding to a function;
during the compiling, identifying an expressed dependency of the function on a data item, wherein the expressed dependency indicates whether the function only reads the data item, only writes the data item, or both reads and writes the data item; and
storing an indication of the expressed dependency in the persistent dependency database.
6. The method according to claim 5, wherein storing the indication of the expressed dependency comprises: in addition to storing a name of a variable, storing in the persistent dependency database an indication of one or more of the following: whether the variable is public or private to the function, and an addressing mode associated with the variable.
7. The method according to claim 5, wherein the performing further comprises: generating a vector version of the function having a vector interface, and storing an indication of the vector interface in the persistent dependency database.
8. The method according to claim 5, wherein the performing further comprises: creating the persistent dependency database upon compilation of the function.
9. The method according to claim 5, wherein storing the indication comprises expressing one or more of the following: an addressing mode associated with a data item in the function, a public or private qualifier associated with a data item in the function, a speculation-safe indicator associated with the function, and an aliasing indicator associated with a data item in the function.
10. The method according to claim 5, wherein storing the indication comprises expressing one or more of the following: an indication that the function reads from or writes to a known offset within a pointed-to object, an indication that the function reads from or writes to a variable offset within an object, or an indication that the function reads from or writes to an unknown offset within an object.
11. The method according to claim 1 or claim 5, wherein the data item is not a parameter passed to the function through a programming interface of the function.
12. The method according to claim 1, wherein the performing further comprises: vectorizing code in the calling function based at least in part on the determination.
13. The method according to claim 12, wherein vectorizing the code in the calling function further comprises: vectorizing a loop in the calling function based at least in part on the determination.
14. The method according to claim 12, wherein vectorizing the code in the calling function further comprises: modifying the call to reference a vector version of the called function.
15. The method according to claim 1, wherein the performing further comprises:
determining whether to vectorize at least a portion of the calling function, based at least in part on the determination, made from the expressed dependency, of whether the calling function interacts with the called function; and
in response to determining to vectorize at least a portion of the calling function, generating vector code that, when executed, causes vector operations to execute concurrently on a plurality of data items referenced in the calling function.
16. The method according to claim 15, wherein the called function comprises a function in precompiled code, and wherein the determining determines to vectorize at least a portion of the calling function even though source code for the precompiled code is unavailable.
17. The method according to claim 1, wherein the calling function comprises a non-leaf loop, and the non-leaf loop includes the call to the called function.
18. The method according to claim 17, wherein the performing further comprises:
vectorizing a first portion of the non-leaf loop; and
serializing a second portion of the non-leaf loop.
19. A computer-readable storage medium having program instructions stored therein, wherein the program instructions, in response to execution by a computer system, cause the computer system to perform operations implementing the method according to any one of claims 1-18.
20. A system, comprising:
one or more memories that, during operation, store instructions; and
one or more processors that, during operation, retrieve the instructions from the one or more memories and execute the instructions to cause the system to perform operations implementing the method according to any one of claims 1-18.
CN201180045583.4A 2010-09-23 2011-09-07 Systems and methods for compiler-based vectorization of non-leaf code Expired - Fee Related CN103119561B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US12/888,658 US8949808B2 (en) 2010-09-23 2010-09-23 Systems and methods for compiler-based full-function vectorization
US12/888,644 2010-09-23
US12/888,658 2010-09-23
US12/888,644 US8621448B2 (en) 2010-09-23 2010-09-23 Systems and methods for compiler-based vectorization of non-leaf code
PCT/US2011/050713 WO2012039937A2 (en) 2010-09-23 2011-09-07 Systems and methods for compiler-based vectorization of non-leaf code

Publications (2)

Publication Number Publication Date
CN103119561A true CN103119561A (en) 2013-05-22
CN103119561B CN103119561B (en) 2016-03-09

Family

ID=44937720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201180045583.4A Expired - Fee Related CN103119561B (en) 2010-09-23 2011-09-07 Systems and methods for compiler-based vectorization of non-leaf code

Country Status (9)

Country Link
KR (1) KR101573586B1 (en)
CN (1) CN103119561B (en)
AU (1) AU2011305837B2 (en)
BR (1) BR112013008640A2 (en)
DE (1) DE112011103190T5 (en)
GB (1) GB2484000A (en)
MX (1) MX2013003339A (en)
TW (1) TWI446267B (en)
WO (1) WO2012039937A2 (en)

Also Published As

Publication number Publication date
AU2011305837A1 (en) 2013-03-28
WO2012039937A2 (en) 2012-03-29
AU2011305837B2 (en) 2015-05-14
TW201224933A (en) 2012-06-16
TWI446267B (en) 2014-07-21
KR101573586B1 (en) 2015-12-01
DE112011103190T5 (en) 2013-06-27
KR20130096738A (en) 2013-08-30
GB201116429D0 (en) 2011-11-02
MX2013003339A (en) 2013-06-24
CN103119561B (en) 2016-03-09
WO2012039937A3 (en) 2012-09-20
BR112013008640A2 (en) 2016-06-21
GB2484000A (en) 2012-03-28

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160309

Termination date: 20200907