WO2023280078A1 - Program compiling method and device - Google Patents

Program compiling method and device

Info

Publication number
WO2023280078A1
Authority
WO
WIPO (PCT)
Prior art keywords
loop
statement
variable
program
loop statement
Prior art date
Application number
PCT/CN2022/103415
Other languages
English (en)
French (fr)
Inventor
刘志康
吴凌飞
陆敬磊
徐子明
程琛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP22836829.6A (EP4332752A1, en)
Publication of WO2023280078A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4434 Reducing the memory space required by the program code
    • G06F8/4435 Detection or removal of dead or redundant code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4441 Reducing the execution time required by the program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/45 Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/451 Code distribution
    • G06F8/452 Loops

Definitions

  • The present application relates to the field of information technology, and in particular to a program compiling method and device.
  • Multi-dimensional data is often processed on AI chips, and in a program such data is represented by multiple layers of nested loops. During processing, the multi-dimensional data is mapped into instructions, which is called tensorization.
  • The instruction mapping label vmuls is originally placed under the H_o axis, and all program code under the instruction mapping label is mapped to one instruction with a deterministic execution process. However, because of the if condition (that is, program execution is uncertain), the instruction mapping label has to be moved down into the if conditional branch during multi-dimensional data mapping, so the number of generated instructions is far greater than the theoretical 9.
  • The multi-layer nested loops in the multi-dimensional data program are fully expanded (expanded in units of 1) to eliminate branch jumps, thereby improving instruction issue efficiency.
  • This full expansion method greatly increases the code length of the program compilation result, thereby increasing the mapping time of the program compilation result.
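To make the code-length concern concrete, here is a minimal sketch, not taken from the application, of what full expansion does to a two-layer loop nest guarded by a condition. The loop bounds (8 and 4), the condition `i + j < limit`, and the helper names `emit_fully_unrolled` and `run` are all assumptions chosen for illustration.

```python
def emit_fully_unrolled(n_i, n_j, limit):
    """Emit one straight-line statement per (i, j) that passes the condition.

    Hypothetical stand-in for full expansion in units of 1: the branch is
    resolved while generating code, so no conditional survives in the output.
    """
    lines = []
    for i in range(n_i):
        for j in range(n_j):
            if i + j < limit:  # decided at "compile time"
                lines.append(f"out.append(({i}, {j}))")
    return lines


def run(lines):
    """Execute the generated statements and return what they produce."""
    out = []
    exec("\n".join(lines), {"out": out})
    return out


generated = emit_fully_unrolled(8, 4, 8)
print(len(generated), "generated statements")
```

For this 8 x 4 example the generator emits 26 statements, one per iteration that passes the condition, so the emitted code grows with the iteration space rather than with the size of the loop source.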
  • The embodiments of the present application provide a program compiling method and device that can reduce the code length of a program's compilation result, which shortens the time needed to map that result, while also improving instruction issue efficiency so that the compilation result runs with higher performance.
  • The present application provides a method for compiling a program, the method comprising: obtaining a first program, wherein the first program includes a multi-layer loop statement, the loop condition of each layer of loop statement in the multi-layer loop statement includes a variable and the value range of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement; the multi-layer loop statement includes a first loop statement, wherein the first loop statement is one layer of loop statement in the multi-layer loop statement, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables included in a first conditional statement in the at least one conditional statement; processing the value range of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, wherein the at least one loop statement includes a second loop statement, the first loop statement and the second loop statement both include the first variable, the value interval of the first variable in the second loop statement is a first interval, and the first interval is a subset
  • The first program, which includes the multi-layer loop statement, may be used to represent multi-dimensional data, and the compilation result of the first program may correspond to the program representation of the multi-dimensional data after segmentation.
  • The compilation result of the first program is a program representation in a high-level language (for example, C, C++, or Python), and it can be converted into instructions executable on hardware (for example, a CPU or GPU).
  • Processing the value range of the first variable in the first loop statement means expanding the value range of the first variable.
  • If the first program also includes other loop statements that need to be processed, the value intervals of the variables contained in their loop conditions can be expanded in turn, following the expansion mode used for the first variable in the first loop statement. After the last loop statement that needs to be processed has been expanded, the compilation result of the first program is obtained; that is, the compilation result of the first program is related to the at least one loop statement.
  • The loop statements to be processed may be expanded sequentially, either from the outermost loop statement to the innermost loop statement or from the innermost loop statement to the outermost loop statement.
  • The outermost loop statement is the uppermost loop statement of the first program, and the innermost loop statement is the lowermost loop statement of the first program.
  • When the first variable takes a value in the first interval, the first conditional statement always holds. Because the value interval of the first variable in the second loop statement is the first interval, the first conditional statement in the second loop statement can be eliminated. This reduces the number of instructions obtained when the compilation result of the first program (the segmented multi-dimensional data) is mapped, improves instruction issue efficiency during mapping, and gives the compilation result of the first program higher runtime performance.
  • Compared with the full-expansion approach, in which the value interval of the first variable is expanded in units of 1 to eliminate the first conditional statement, in this application the value interval of the first variable in the first loop statement can be expanded in the form of intervals, because the first conditional statement always holds when the first variable takes a value in the first interval. This effectively reduces the code length of the second loop statement obtained after expansion, and therefore the code length of the compilation result of the first program, which in turn shortens the subsequent mapping of that result.
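The interval-based expansion described above can be sketched as follows. This is only an illustration under assumed details: the condition `i + j < limit`, the default bounds, and both function names are hypothetical, with `i` playing the role of the first variable and `j` the other variable in the condition.

```python
def with_condition(n_i=8, n_j=4, limit=8):
    """Original loop nest: the condition is tested on every iteration."""
    out = []
    for i in range(n_i):
        for j in range(n_j):
            if i + j < limit:
                out.append((i, j))
    return out


def with_interval_split(n_i=8, n_j=4, limit=8):
    """Same computation with the range of i split into two intervals."""
    out = []
    # First interval: every i with i + (n_j - 1) < limit makes the
    # condition true for all j, so the test can be dropped there.
    first_end = max(0, min(n_i, limit - n_j + 1))
    for i in range(0, first_end):          # condition eliminated
        for j in range(n_j):
            out.append((i, j))
    # Remaining values of i: the outcome still depends on j,
    # so the condition is kept for now.
    for i in range(first_end, n_i):
        for j in range(n_j):
            if i + j < limit:
                out.append((i, j))
    return out
```

With the default values the first interval is `[0, 5)`. The split version contains two loop nests regardless of how large that interval is, whereas expansion in units of 1 would emit one copy of the body per value of `i`.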
  • The first program includes an instruction mapping label. When the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement. The estimated mapping time is the time required to map the compilation result of the first program into instructions executable on hardware. When the estimated mapping time is less than or equal to the preset time, the first variable is one of the variables contained both in the loop conditions of the loop statements in the multi-layer loop statement and in the at least one conditional statement.
  • The estimated mapping time of the compilation result of the first program is determined based on one or more of: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statement, the fuzzy intervals of the variables in the at least one conditional statement, or the number of conditional statements. When a variable takes a value in its fuzzy interval, whether the conditional statement containing that variable holds also depends on other factors.
  • The other factors are the other variables in the conditional statement; that is, whether the conditional statement containing the variable holds is determined jointly by that variable and the other variables in the same conditional statement.
  • The code length of the compilation result of the first program can be calculated from the number of variables in the at least one conditional statement, the number of layers of the multi-layer loop statement, the fuzzy intervals of the variables, and the number of conditional statements. A more accurate mapping time can therefore be estimated from these parameters, and the number of variables to process (that is, the loop layers in which those variables appear) can be decided based on it. This balances the instruction issue efficiency and the mapping time of the compilation result of the first program, so that the result has a relatively short mapping time while still issuing instructions efficiently.
  • The at least one loop statement corresponding to the first loop statement further includes a third loop statement, wherein the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
  • The third loop statement is obtained together with the second loop statement when the first loop statement is expanded.
  • The first interval and the second interval are subsets of the value interval of the first variable in the first loop statement.
  • The second interval is the fuzzy interval of the first variable.
  • The first conditional statement in the second loop statement always holds, whereas whether the first conditional statement in the third loop statement holds is determined jointly by the first variable and the other variables in the first conditional statement. The first loop statement is therefore expanded into a second loop statement and a third loop statement to distinguish the two cases, which eliminates the first conditional statement in the second loop statement.
  • The third loop statement can subsequently be expanded further to eliminate the first conditional statement.
  • The first conditional statement further includes a second variable, and the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement; or, when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables contained both in the loop conditions of the loop statements in the multi-layer loop statement and in the at least one conditional statement, processing the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, wherein the value interval of the first variable in each fourth loop statement is a subset of the value interval of the first variable in the third loop statement.
  • Processing the second interval of the first variable in the third loop statement specifically includes expanding the second interval in units of 1.
  • The second interval of the first variable can be expanded first, and the value of the second variable in each fourth loop statement determined accordingly, so that the first conditional statement in each fourth loop statement always holds and can be eliminated. This further reduces the instructions generated when the compilation result is mapped and improves instruction issue efficiency, yielding a high-performance program compilation result.
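The unit-by-unit expansion of the second interval can be pictured with the sketch below. It is an illustration only: the condition `i + j < limit`, the example interval `[5, 8)`, and the names `expand_second_interval` and `run_fourth_loops` are assumptions rather than anything specified in the application.

```python
def expand_second_interval(i_lo, i_hi, n_j, limit):
    """Expand the fuzzy interval [i_lo, i_hi) of i in units of 1.

    Each returned pair (i, j_end) stands for one hypothetical "fourth loop
    statement": i is fixed, and j runs over [0, j_end), the part of its
    range where the assumed condition i + j < limit is guaranteed to hold.
    """
    pieces = []
    for i in range(i_lo, i_hi):
        j_end = max(0, min(n_j, limit - i))
        pieces.append((i, j_end))
    return pieces


def run_fourth_loops(pieces):
    """Execute the expanded loops; no conditional statement remains."""
    out = []
    for i, j_end in pieces:
        for j in range(j_end):
            out.append((i, j))
    return out


pieces = expand_second_interval(5, 8, n_j=4, limit=8)
print(pieces)
```

Only the fuzzy interval is expanded value by value here; the first interval stays a single loop, which is what keeps the overall code shorter than expanding the entire range of `i`.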
  • The first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval. The method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable included in the loop condition of a loop statement above the instruction mapping label, processing the third interval in the third loop statement to obtain one or more fifth loop statements, wherein the value range of the second variable in each fifth loop statement is a subset of the third interval.
  • Processing the third interval in the third loop statement specifically includes expanding the third interval in units of 1.
  • The value interval of the second variable in the third loop statement (the third interval) is expanded, and the value of the first variable in each fifth loop statement is determined at the same time, so that the first conditional statement in each fifth loop statement always holds and can be eliminated. This further reduces the instructions generated when the compilation result is subsequently mapped and improves instruction issue efficiency.
  • The two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and the variables contained in the at least one conditional statement are all variables included in the loop conditions of the multi-layer loop statement.
  • Because the endpoints of the value intervals of the variables included in the loop conditions of the multi-layer loop statement are all constants, and the variables contained in each conditional statement are variables contained in the loop conditions of the multi-layer loop statement, the endpoints of the first interval and the second interval, which are obtained from the first conditional statement and the value interval of the first variable, are also constants. The second interval in the third loop statement can therefore be expanded further to eliminate the first conditional statement in each fourth loop statement, which reduces the instructions generated when the compilation result is subsequently mapped and improves instruction issue efficiency.
  • The present application provides a method for compiling a program, the method comprising: obtaining a second program, wherein the second program includes a multi-layer loop statement, the loop condition of each layer of loop statement includes a variable and the value range of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement; the multi-layer loop statement includes a fifth loop statement, wherein the fifth loop statement is one layer of loop statement in the multi-layer loop statement, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables included in a second conditional statement in the at least one conditional statement; updating the first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, wherein the updated first value interval makes the second conditional statement always true; and compiling the second program based on the sixth loop statement to obtain a compilation result of the second program, wherein the compilation result of the second program is related to the sixth loop statement.
  • The second program, which contains the multi-layer loop statement, can be used to represent multi-dimensional data, and the compilation result of the second program can correspond to the program representation of the multi-dimensional data after segmentation.
  • The compilation result of the second program is a program representation in a high-level language (for example, C, C++, or Python), and it can be converted into instructions executable on hardware (for example, a CPU or GPU).
  • Processing the value range of the third variable in the fifth loop statement means expanding the value range of the third variable.
  • If the second program also includes other loop statements that need to be processed, the value intervals of the variables contained in their loop conditions can be updated in turn, following the update method used for the third variable in the fifth loop statement. After the last such loop statement has been updated, the compilation result of the second program is obtained; that is, the compilation result of the second program is related to the sixth loop statement.
  • The loop statements to be processed may be expanded sequentially, either from the outermost loop statement to the innermost loop statement or from the innermost loop statement to the outermost loop statement.
  • The outermost loop statement is the uppermost loop statement of the second program, and the innermost loop statement is the lowermost loop statement of the second program.
  • The embodiments of the present application may adopt any order to update the value intervals of the variables in the loop conditions of all loop statements that need to be processed.
  • When the third variable takes a value in the updated first value interval, the second conditional statement always holds, so the second conditional statement in the fifth loop statement can be eliminated. During the subsequent mapping of the compilation result of the second program (the segmented multi-dimensional data), this reduces the number of instructions obtained through mapping and improves instruction issue efficiency.
  • Compared with fully expanding the loop condition of the third variable (that is, expanding the value range of the third variable in units of 1) to eliminate the conditional statement, the embodiments of the present application update the value interval of the third variable as a whole so that the second conditional statement always holds. This effectively reduces the code length of the sixth loop statement obtained after the update and of the program compilation result obtained from it, shortens the subsequent mapping of the compilation result, and improves its running performance.
  • The second program includes an instruction mapping label; the third variable is one of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement.
  • Updating only the value intervals of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, rather than those of all variables contained both in the loop conditions of the multi-layer loop statement and in the at least one conditional statement, reduces the number of loop statements to be processed. This reduces the code length of the compilation result and lowers the mapping time of the compilation result of the second program.
  • Updating the first value interval of the third variable in the fifth loop statement includes: obtaining a second value interval of the third variable based on the second conditional statement; and updating the first value interval with the intersection of the first value interval and the second value interval.
  • A second value interval that makes the inequality in the second conditional statement always true can be obtained. After the first value interval is updated with the intersection of the first value interval and the second value interval, the updated first value interval makes the second conditional statement in the fifth loop statement always hold.
  • The second conditional statement can thus be eliminated without fully expanding the fifth loop statement, which effectively reduces the code length of the sixth loop statement obtained after the update and improves the running performance of the program compilation result. It also reduces the number of instructions mapped from the compilation result of the second program, thereby improving instruction issue efficiency.
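The intersection step can be sketched as follows. This is a minimal illustration, assuming half-open integer intervals and a made-up condition `2 <= k < 10`; the function names `intersect`, `fifth_loop`, and `sixth_loop` are hypothetical labels that only echo the terminology above.

```python
def intersect(first, second):
    """Intersect two half-open integer intervals given as (lo, hi)."""
    lo = max(first[0], second[0])
    hi = min(first[1], second[1])
    return (lo, max(lo, hi))  # lo == hi encodes an empty interval


def fifth_loop(first_interval):
    """Original loop: the condition is tested on every iteration."""
    out = []
    for k in range(*first_interval):
        if 2 <= k < 10:  # assumed second conditional statement
            out.append(k)
    return out


def sixth_loop(first_interval):
    """Updated loop: iterate over the intersection, with no test needed."""
    second_interval = (2, 10)  # values of k that satisfy 2 <= k < 10
    lo, hi = intersect(first_interval, second_interval)
    out = []
    for k in range(lo, hi):
        out.append(k)
    return out
```

The updated loop has the same shape and length as the original minus the branch, which is the contrast with expansion in units of 1 drawn in the text.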
  • At least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is non-constant, and/or each conditional statement in the at least one conditional statement also includes a fourth variable, wherein the fourth variable is a variable not included in the loop condition of any layer of loop statement.
  • Each conditional statement contains at most one variable whose value range needs to be updated.
  • Each conditional statement also includes a fourth variable, and the fourth variable is not a variable included in the loop condition of any layer of loop statement, so at least one of the two endpoints of the first value interval and the two endpoints of the second value interval is non-constant. Since each conditional statement contains only one variable that needs to be processed, the upper and lower bounds of the first value interval and the second value interval can be compared directly to obtain the updated first value interval without affecting other variables that need to be processed. Since the second conditional statement always holds when the third variable takes a value in the updated first value interval, the conditional statement can be eliminated and the first value interval does not need to be expanded. This reduces both the number of instructions obtained by mapping and the code length of the subsequently obtained compilation result of the second program, and thus the mapping time of that result.
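When an endpoint is not a compile-time constant, the bound comparison can be left as an expression that is evaluated at run time. The sketch below assumes a loop `for k in range(0, n)` guarded by `k < m`, where `m` plays the role of the fourth variable; the names `guarded_loop` and `bounded_loop` and the use of `min` are illustrative assumptions, not the application's stated implementation.

```python
def guarded_loop(n, m):
    """Original form: k < m is tested on every iteration."""
    out = []
    for k in range(0, n):
        if k < m:  # m is not a loop variable
            out.append(k)
    return out


def bounded_loop(n, m):
    """Updated form: the upper bounds n and m are compared directly."""
    out = []
    upper = min(n, m)  # intersection of [0, n) with the values where k < m
    for k in range(0, upper):
        out.append(k)  # the conditional statement has been eliminated
    return out
```

Because only one variable in the condition needs its range updated, a single `min` over the two upper bounds is enough in this example; no value-by-value expansion of the range of `k` is involved.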
  • The present application provides a program compiling device, which includes: an acquisition unit, configured to acquire a first program, wherein the first program includes a multi-layer loop statement, the loop condition of each layer of loop statement in the multi-layer loop statement includes a variable and the value range of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement; the multi-layer loop statement includes a first loop statement, wherein the first loop statement is one layer of loop statement in the multi-layer loop statement, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables included in a first conditional statement in the at least one conditional statement; and a processing unit, configured to process the value range of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, wherein the at least one loop statement includes a second loop statement, both the first loop statement and the second loop statement include the first variable, and the value range of the first variable in the second loop
  • The first program includes an instruction mapping label. When the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement. The estimated mapping time is the time required to map the compilation result of the first program into instructions executable on hardware. When the estimated mapping time is less than or equal to the preset time, the first variable is one of the variables contained both in the loop conditions of the loop statements in the multi-layer loop statement and in the at least one conditional statement.
  • The estimated mapping time of the compilation result of the first program is determined based on one or more of: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statement, the fuzzy intervals of the variables in the at least one conditional statement, or the number of conditional statements. When a variable takes a value in its fuzzy interval, whether the conditional statement containing that variable holds also depends on other factors.
  • The at least one loop statement corresponding to the first loop statement further includes a third loop statement, wherein the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
  • The first conditional statement further includes a second variable, and the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement; or, when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables contained both in the loop conditions of the loop statements in the multi-layer loop statement and in the at least one conditional statement, process the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, wherein the value range of the first variable in each fourth loop statement is a subset of the second interval.
  • The first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval. The processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable included in the loop condition of a loop statement above the instruction mapping label, process the third interval in the third loop statement to obtain one or more fifth loop statements, wherein the value range of the second variable in each fifth loop statement is a subset of the third interval.
  • The two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and the variables contained in the at least one conditional statement are all variables included in the loop conditions of the multi-layer loop statement.
  • The present application provides a program compiling device, the device comprising: an acquisition unit, configured to acquire a second program, wherein the second program includes a multi-layer loop statement, the loop condition of each layer of loop statement includes a variable and the value range of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement; the multi-layer loop statement includes a fifth loop statement, wherein the fifth loop statement is one layer of loop statement in the multi-layer loop statement, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables contained in a second conditional statement in the at least one conditional statement; an updating unit, configured to update the first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, wherein the updated first value interval makes the second conditional statement always true; and a compilation unit, configured to compile the second program based on the sixth loop statement to obtain a compilation result of the second program, wherein the compilation result of the second program is related to the sixth loop statement.
  • The second program includes an instruction mapping label; the third variable is one of the variables contained both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement.
  • The updating unit is specifically configured to: obtain the second value interval of the third variable; and update the first value interval by using the intersection of the first value interval and the second value interval.
  • At least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is non-constant, and/or each conditional statement in the at least one conditional statement also includes a fourth variable, wherein the fourth variable is a variable not included in the loop condition of any layer of loop statement.
  • the present application provides a chip system, the chip system includes at least one processor, a memory, and an interface circuit, the memory, the interface circuit, and the at least one processor are interconnected by wires, and the at least Instructions are stored in a memory; when the instructions are executed by the processor, the method described in any one of the first aspect and/or the second aspect is implemented.
  • the present application provides a compiling device, the compiling device comprising the system-on-a-chip as described in the fifth aspect, and a discrete device coupled to the system-on-a-chip.
  • the present application provides a computer-readable storage medium, where the computer-readable storage medium stores program code for execution by a device, and the program code includes instructions for executing the method described in any one of the first aspect and/or the second aspect.
  • the present application provides a computer program product, wherein the computer program product includes program instructions, and when the program instructions are run on a computer, the method described in any one of the first aspect and/or the second aspect is implemented.
  • FIG. 1 is a schematic representation of a multidimensional data program in an embodiment of the present application
  • FIG. 2 is a schematic diagram of a multidimensional data segmentation process in an embodiment of the present application
  • FIG. 3 is a schematic diagram of a system architecture in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an application scenario in an embodiment of the present application.
  • Fig. 5 is a flow chart of a program compiling method in the embodiment of the present application.
  • Fig. 6 is a flow chart of another program compiling method in the embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a compiling device in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a hardware structure of a compiling device in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a hardware structure of a compiling device in an embodiment of the present application.
  • the technical solutions of the embodiments of the present application can be applied to various computer systems, for example, personal computers (PCs), computer cluster systems, large computer systems, or various supercomputers, and this application imposes no limitation thereon.
  • this application can also be applied to various compilers, for example, the GNU Compiler Collection (GCC) and the Low Level Virtual Machine (LLVM), and this embodiment of the application is not limited thereto.
  • Tensorize: a process in which a program representing multidimensional data is mapped into instructions.
  • Tail block: when multidimensional data is segmented and a certain dimension cannot be divided evenly, the remaining part generated at this time is called a tail block.
  • the multidimensional data may be a matrix.
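  • The notion of a tail block can be illustrated with a minimal sketch that splits one dimension into fixed-size pieces. The function name `split_dimension` and the extent and tile size used here are hypothetical and are not taken from the application.

```python
from typing import List, Tuple


def split_dimension(extent: int, tile: int) -> List[Tuple[int, int]]:
    """Split the index range [0, extent) into (start, size) pieces of at most `tile` elements."""
    return [(start, min(tile, extent - start)) for start in range(0, extent, tile)]


# Hypothetical extent 10 and tile size 4: 10 is not divisible by 4,
# so the last piece is smaller than the others and is the tail block.
pieces = split_dimension(10, 4)
tail = pieces[-1] if 10 % 4 else None
```

  • Here the first two pieces have the same size and the third piece is the smaller tail block.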
  • Compilation: the process of converting one kind of source code into another. In this application, compilation specifically refers to processing a source program that represents multidimensional data to obtain a program that represents the multidimensional data after segmentation.
  • the instruction mapping label is used to indicate that the code under the label will be mapped to an instruction, and the instruction will be processed later to be an executable instruction on the hardware (for example, CPU, etc.).
  • the code under the instruction mapping label can only be a definite program execution process, that is, it cannot contain branch jumps (for example, conditional statements).
  • FIG. 1 is a schematic representation of a multi-dimensional data program in an embodiment of the present application.
  • the multidimensional data may be a matrix.
  • The 9-line program code represents multidimensional data. The program contains five layers of nested for loop statements, and each layer of loop statement corresponds to one dimension of the multidimensional data.
  • Lines 1, 2, and 4 to 6 are the loop conditions of the five layers of loop statements, and line 7 is the conditional statement in the loop body. From top to bottom, for the first-layer (that is, outermost) loop statement, the loop condition is line 1, the loop body is lines 2 to 9, and line 7 is the conditional statement in the loop body.
  • For the second-layer loop statement, the loop condition is line 2 and the loop body is lines 3 to 9.
  • Line 3 is an instruction mapping label in the multidimensional data, which indicates that the code below it will be mapped to an instruction in the subsequent multidimensional data mapping process.
  • Because the program contains a conditional statement (line 7), the instruction mapping label is moved down below the conditional statement in the subsequent multidimensional data mapping process, so that the program execution process under the label is deterministic. As a result, the number of mapped instructions is far greater than the 9 obtained under regular segmentation, which seriously affects instruction issue efficiency in SIMD.
  • FIG. 2 is a schematic diagram of a multi-dimensional data segmentation process in the embodiment of the present application.
  • the multidimensional data segmentation process corresponds to the compilation process of the first program or the second program in the embodiment of the present application.
  • the program shown in Figure 1 characterizes the multidimensional data in Figure 2.
  • The multidimensional data in FIG. 2 also includes five dimensions: C1, W, C0, and H (H comprises H_o and H_i, which are jointly represented by H in FIG. 2).
  • The data in the H direction is divided into three pieces: the first two pieces have the same size in the H direction, and the third piece has a different size in the H direction from the first two.
  • The multidimensional data shown in FIG. 2 is cut into 9 pieces.
  • The multidimensional data is therefore expected to be mapped into 9 instructions, and the 9 instructions complete the operation on the entire multidimensional data.
  • However, the number of instructions finally mapped is much greater than 9, which seriously affects instruction issue efficiency.
  • FIG. 3 is a schematic diagram of a system architecture 300 in an embodiment of the present application.
  • the compiling device 320 compiles the source program 350 to obtain a program compiling result 301 .
  • the source program 350 is the program (the first program or the second program) in the embodiment of the present application, which may be a program for performing multidimensional data calculations in different scenarios, such as image processing, speech recognition, scientific computing, or physical modeling.
  • the compiling device 320 may be any device including a compiler (such as GCC or LLVM).
  • the program compilation result 301 compiled by the compiling device 320 can be applied to different systems or devices, such as the execution device 310 shown in FIG. 3. The execution device 310 may be a terminal, such as an augmented reality (AR)/virtual reality (VR) device or a vehicle terminal, or may be a server, a cloud, or the like.
  • an execution device 310 is configured with an input/output (I/O) interface 312 for data interaction with external devices.
  • the executing device 310 may receive the data input from the database 330 or the client device 340, and use the computing module 311 to execute the relevant computing process in the program compilation result 301 to obtain corresponding processing results.
  • the I/O interface 312 returns the processing results (for example, image processing results or speech recognition results, etc.) to the client device 340 for providing to the user.
  • the compiling device 320 can compile source programs for different goals or different tasks to obtain the corresponding program compilation result 301, and the execution device 310 then executes the relevant calculation process in the program compilation result 301 to obtain the processing results required by those goals or tasks.
  • the user can manually specify the input data, and the manual specification can be operated through the interface provided by the I/O interface 312 .
  • the client device 340 can automatically send the input data to the I/O interface 312 . If automatic sending of the input data by the client device 340 requires the user's authorization, the user can set the corresponding permission in the client device 340 .
  • the user can view the results output by the execution device 310 on the client device 340 , and the specific presentation form may be specific ways such as display, sound, and action.
  • FIG. 3 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • the compiling device 320 is an external device relative to the execution device 310 , and in other cases, the compiling device 320 may also be placed in the execution device 310 .
  • the execution device 310 is an external device with respect to the client device 340, and in other cases, the execution device 310 and the client device 340 may be the same device.
  • FIG. 4 is a schematic diagram of an application scenario in an embodiment of the present application.
  • the compiling methods 500 and 600 in the embodiments of the present application can be applied to scenarios requiring multi-dimensional data operations in fields such as artificial intelligence (such as image processing or speech recognition), scientific computing, and physical modeling.
  • FIG. 4 describes the application process of compiling methods 500 and 600 in the embodiment of the present application by taking image processing scenarios based on deep learning in the field of artificial intelligence (eg, image recognition, object detection, and image segmentation, etc.) as an example.
  • the image data to be processed 410 is obtained and defined in a high-level language (for example, a programming language such as C, C++, or Python) to obtain the multidimensional data program representation 420 corresponding to the image data to be processed 410 (that is, the source program in FIG. 3).
  • The compiling device 320 executes the compiling method 500 and/or 600 in the embodiment of the present application to compile the multidimensional data program representation 420 (that is, the first program or the second program in the following embodiments) and obtain the program compilation result 440 (that is, the program compilation result 301 in FIG. 3). Operations such as optimization and instruction mapping are then performed on the program compilation result 440 to obtain a hardware machine language 450 executable on the hardware (for example, a CPU or GPU).
  • The execution device 310 runs the hardware machine language 450 to perform the corresponding multidimensional data operations and obtain the image processing result 430 (for example, in image recognition the processing result is the category label of the image, in object detection it is the target recognized from the image, and in image segmentation it is the image segmentation result).
  • the compiling device 320 is an external device relative to the executing device 310 , and in other cases, the compiling device 320 may also be placed in the executing device 310 .
  • FIG. 5 is a flowchart of a method for compiling a program in an embodiment of the present application.
  • the method 500 includes steps S510, S520, S530 and S540.
  • Method 500 is applied to static scenes.
  • Step S510: Acquire the first program, wherein the first program includes a multi-layer loop statement, the loop condition of each layer of loop statement in the multi-layer loop statement includes a variable and the value range of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement.
  • Step S520 The multi-layer loop statement includes a first loop statement; wherein, the first loop statement is a layer of loop statement in the multi-layer loop statement, and the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables included in the first conditional statement in the at least one conditional statement.
  • Step S530: Process the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, wherein the at least one loop statement includes a second loop statement, the first loop statement and the second loop statement both include the first variable, the value interval of the first variable in the second loop statement is the first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always true.
  • Step S540 Compile the first program based on the at least one loop statement to obtain a compilation result of the first program, and the compilation result of the first program is related to the at least one loop statement.
  • the above-mentioned first program including multi-layer loop statements may be used to represent multi-dimensional data, and the compilation result of the first program may correspond to a program representation after multi-dimensional data is segmented.
  • the above-mentioned first interval is obtained based on the first conditional statement and the value interval of the first variable in the first loop statement.
  • the compilation result of the above-mentioned first program is a program representation in a high-level language (for example, C, C++, or Python), and the compilation result of the first program can be converted into instructions executable on hardware (for example, a CPU or GPU).
  • the above-mentioned processing of the value range of the first variable in the first loop statement is to expand the value range of the first variable.
  • If the above-mentioned first program also includes other loop statements that need to be processed, the value intervals of the variables contained in the loop conditions of those loop statements can be expanded in turn by referring to the way the value interval of the first variable in the first loop statement is expanded. After the last loop statement that needs to be processed is expanded, the compilation result of the first program is obtained; that is, the compilation result of the first program is related to the at least one loop statement.
  • the above-mentioned loop statement may be a for loop statement.
  • the program for representing multidimensional data includes five layers of nested for loop statements, and each layer of loop statement includes a loop condition; the loop conditions of the five layers of loop statements are lines 1 to 2 and lines 4 to 6 in FIG. 1, respectively.
  • the loop condition of each layer of loop statement includes a variable and the value range of the variable.
  • For example, the variable in the loop condition of the first-layer loop statement is C1 and its value range is (0,3), which means that C1 takes the values 0, 1, and 2, three values in total.
  • the range of the top-layer (outermost) loop statement is lines 1 to 9 of the code; the range of the bottom-layer (innermost) loop statement is lines 6 to 9 of the code.
  • the above-mentioned first loop statement may be a layer of loop statement in the multi-layer loop statement.
  • the first program includes an instruction mapping label; when the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables contained in both the loop condition of a loop statement under the instruction mapping label and the at least one conditional statement, where the estimated mapping time is the time required to map the compilation result of the first program into instructions executable on the hardware; when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables contained in both the loop condition of any layer of loop statement in the multi-layer loop statement and the at least one conditional statement.
  • each first program representing multidimensional data includes an instruction mapping label.
  • Based on the estimated mapping time of the compilation result of the first program, it is determined whether the multi-layer loop statement is expanded in the full segmentation mode or in the partial segmentation mode, that is, which variables are to be processed (have their value ranges expanded) under the different segmentation modes.
  • the variables whose value range needs to be expanded are the variables contained in both the loop condition of a loop statement under the instruction mapping label and the at least one conditional statement.
  • the variables whose value range is expanded are H_o and H_i.
  • the variables whose value range needs to be expanded are the variables contained in both the loop condition of any layer of loop statement in the multi-layer loop statement and the at least one conditional statement.
  • the variable whose value range is expanded is H_i.
  • the first variable in the above-mentioned first loop statement is a variable whose value range needs to be expanded.
  • the above-mentioned preset time may be a time set according to a specific scenario, which is not limited in this application.
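  • The selection rule described above, namely which variables have their value ranges expanded depending on whether the estimated mapping time exceeds the preset time, can be sketched as follows. The function name, the list-based loop model, and the variable names `i`, `j`, `k` are hypothetical; the sketch assumes that loops are listed from outermost to innermost and that `label_pos` is the index of the first loop under the instruction mapping label.

```python
from typing import List, Set


def variables_to_expand(loop_vars: List[str], label_pos: int, cond_vars: Set[str],
                        estimated_time: float, preset_time: float) -> List[str]:
    """Pick the loop variables whose value ranges are expanded.

    loop_vars is ordered outermost to innermost; loops at index >= label_pos
    are assumed to lie under the instruction mapping label.
    """
    if estimated_time > preset_time:
        candidates = loop_vars[label_pos:]   # partial segmentation: loops under the label only
    else:
        candidates = loop_vars               # full segmentation: every layer of loop statement
    # Keep only variables that also appear in a conditional statement.
    return [v for v in candidates if v in cond_vars]


partial = variables_to_expand(["i", "j", "k"], 1, {"i", "k"}, estimated_time=5.0, preset_time=1.0)
full = variables_to_expand(["i", "j", "k"], 1, {"i", "k"}, estimated_time=0.5, preset_time=1.0)
```

  • In this hypothetical nest, only `k` is expanded when the estimate exceeds the preset time, while both `i` and `k` are expanded otherwise.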
  • the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statement, the fuzzy interval of a variable in the at least one conditional statement, or the number of conditional statements; wherein, when the value of a variable lies in the fuzzy interval of the variable, whether the conditional statement containing the variable is true also depends on other factors.
  • the estimated mapping time of the compilation result of the first program is directly proportional to the code length in the compilation result.
  • the above-mentioned other factors refer to other variables in the conditional statement, that is, whether the conditional statement containing the variable is true or not is determined by the variable and other variables in the conditional statement containing the variable.
  • The code length of the compilation result (that is, the program obtained after expanding the first program representing multidimensional data) can be calculated from the number of variables contained in the at least one conditional statement in the multi-layer loop statement, the number of layers of the multi-layer loop statement, the fuzzy intervals of the variables contained in the at least one conditional statement, or the number of conditional statements in the multi-layer loop statement; the estimated mapping time of the compilation result of the first program is then calculated based on that code length.
  • the specific calculation method is not limited in this application.
  • Because the code length of the compilation result of the first program can be calculated from the number of variables in the at least one conditional statement, the number of layers of the multi-layer loop statement, the fuzzy intervals of the variables in the at least one conditional statement, and the number of conditional statements, a more accurate mapping duration can be calculated from these parameters. The number of variables that need to be processed (that is, the layers of loop statements in which those variables are located) is then determined based on the mapping duration, which balances the instruction issue efficiency of the compilation result of the first program against its mapping duration, so that the compilation result has a relatively short mapping duration while retaining high instruction issue efficiency.
  • At least one loop statement corresponding to the first loop statement further includes a third loop statement; wherein, the value range of the first variable in the third loop statement is the second interval, and when the value of the first variable is a value in the second interval, whether the first conditional statement is true or not is also related to other factors.
  • the above-mentioned processing of the value interval of the first variable in the first loop statement includes: decomposing the value interval of the first variable included in the loop condition of the first loop statement, where the decomposed intervals include the first interval, the second interval, and the fourth interval.
  • When the value of the first variable lies in the first interval, the first conditional statement is always true. When the value of the first variable lies in the second interval, whether the first conditional statement holds is determined jointly by the second interval of the first variable and the other parts of the first conditional statement, so the second interval can also be called the fuzzy interval of the first variable. When the value of the first variable lies in the fourth interval, the first conditional statement never holds, so the fourth interval can be eliminated directly. The value interval of the first variable in the first loop statement is then expanded according to the first interval, the second interval, and the fourth interval to obtain the second loop statement and the third loop statement.
  • The value range of the first variable in the second loop statement is the first interval; because the first conditional statement always holds in the second loop statement, the second loop statement need not include the first conditional statement. The value interval of the first variable in the third loop statement is the second interval; because whether the first conditional statement holds in the third loop statement is determined jointly by the second interval of the first variable and the other parts of the first conditional statement, the third loop statement includes the first conditional statement.
  • the above-mentioned process of processing the value range of the first variable in the first loop statement is the process of expanding the value range of the first variable in the first loop statement, that is, the process of expanding the first loop statement.
  • the third loop statement is obtained simultaneously with the second loop statement after expanding the first loop statement.
  • the first interval and the second interval are subsets of the value interval of the first variable in the first loop statement.
  • the second interval is the fuzzy interval of the first variable.
  • Because the first conditional statement always holds in the second loop statement, whereas whether it holds in the third loop statement is determined jointly by the first variable and the other variables in the first conditional statement, the first loop statement is expanded into a second loop statement and a third loop statement to distinguish the two cases, thereby eliminating the first conditional statement in the second loop statement.
  • the third loop statement can be expanded subsequently to eliminate the first conditional statement.
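  • The decomposition into an always-true interval, a fuzzy interval, and an always-false interval can be illustrated by a brute-force check. This is a sketch under stated assumptions rather than the procedure of the application: the function name `classify`, the two-variable condition, and the sample condition `a + b < 10` are hypothetical, and the range of the other variable is assumed to be non-empty.

```python
from typing import Callable, List, Tuple


def classify(first_values: range, second_values: range,
             cond: Callable[[int, int], bool]) -> Tuple[List[int], List[int], List[int]]:
    """Sort each value of the first variable by how the condition behaves over second_values."""
    always_true, fuzzy, always_false = [], [], []
    for a in first_values:
        outcomes = {cond(a, b) for b in second_values}
        if outcomes == {True}:
            always_true.append(a)      # candidate for the first interval
        elif outcomes == {False}:
            always_false.append(a)     # candidate for the fourth interval
        else:
            fuzzy.append(a)            # candidate for the second (fuzzy) interval
    return always_true, fuzzy, always_false


# Hypothetical condition a + b < 10 with b taking the values 0..5.
first, second, fourth = classify(range(0, 12), range(0, 6), lambda a, b: a + b < 10)
```

  • A loop over the values in `first` needs no conditional statement, a loop over the values in `second` keeps it, and the values in `fourth` generate no loop at all.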
  • the first conditional statement further includes a second variable, and the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables contained in both the loop condition of a loop statement under the instruction mapping label and the at least one conditional statement; or, when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables contained in both the loop condition of any layer of loop statement in the multi-layer loop statement and the at least one conditional statement, processing the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, wherein the value range of the first variable in each fourth loop statement is a subset of the second interval.
  • the first conditional statement further includes a second variable, the value interval of the second variable in the third loop statement is a third interval, and the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable included in the loop condition of a loop statement above the instruction mapping label, processing the third interval in the third loop statement to obtain one or more fifth loop statements, wherein the value range of the second variable in each fifth loop statement is a subset of the third interval.
  • the third loop statement can also be expanded, and the expansion of the third loop statement includes two situations:
  • Case (1): the first conditional statement also includes a second variable whose value range needs to be expanded.
  • The value range of the first variable in the third loop statement is the second interval (fuzzy interval), and the second interval can be decomposed (specifically, in units of 1).
  • The third loop statement is further expanded based on the decomposed values of the first variable, so that when the value range of the second variable is subsequently expanded, the first conditional statement can be eliminated based on the determined value of the first variable.
  • After the third loop statement is expanded, the value of the first variable in each resulting fourth loop statement is a fixed value, and each fourth loop statement still includes the first conditional statement.
  • The value interval of the first variable in each fourth loop statement is a subset of the second interval.
  • The first conditional statement may also include multiple variables whose value ranges need to be expanded.
  • In that case, the multiple variables can be taken as a whole, and the third loop statement can be expanded by referring to the situation in which the first conditional statement includes the second variable, which is not described again here.
  • Case (2): the second variable is not a variable whose value range needs to be expanded. The value range of the second variable in the third loop statement can be decomposed in units of 1.
  • The value corresponding to the first variable is then determined, so as to eliminate the first conditional statement.
  • The third loop statement is thereby expanded into one or more fifth loop statements in which the first conditional statement always holds, so that the first conditional statement in each fifth loop statement can be eliminated.
  • The value interval of the first variable in each fifth loop statement is a subset of the second interval.
  • The first conditional statement may also contain multiple variables, none of which needs to have its value range expanded.
  • In that case, the multiple variables can be taken as a whole (that is, treated as the above-mentioned second variable), and the third loop statement can be expanded by referring to the expansion process in case (2), which is not repeated here.
  • The second interval corresponding to the first variable can be expanded first, and the value of the second variable in each fourth loop statement determined accordingly, so that the first conditional statement in each fourth loop statement always holds and can be eliminated. This reduces the instructions generated when the compilation result is mapped and improves instruction issue efficiency, so that a high-performance program compilation result is obtained.
  • Alternatively, the value interval of the second variable in the third loop statement (the third interval) can be expanded, and the value of the first variable in each fifth loop statement determined at the same time, so that the first conditional statement in each fifth loop statement always holds and can be eliminated, thereby reducing the instructions generated when the compilation result is subsequently mapped and improving instruction issue efficiency.
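  • The expansion of the fuzzy interval in units of 1 can be sketched as follows. The sketch assumes a hypothetical conditional statement of the form `a + b < bound` and unit-step ranges; the function names and sample ranges are illustrative only. For each fixed value of the first variable, the range of the second variable is clipped so that the condition always holds and no longer needs to be tested.

```python
from typing import List, Tuple


def original(a_range: range, b_range: range, bound: int) -> List[Tuple[int, int]]:
    """Reference loop nest that keeps the conditional statement a + b < bound."""
    return [(a, b) for a in a_range for b in b_range if a + b < bound]


def expanded(fuzzy: range, b_range: range, bound: int) -> List[Tuple[int, int]]:
    """One inner loop per fuzzy value of a; b's range is clipped so no condition is needed."""
    visited = []
    for a in fuzzy:                                # decomposition in units of 1
        b_stop = min(b_range.stop, bound - a)      # exclusive upper end that keeps a + b < bound
        for b in range(b_range.start, b_stop):     # condition-free loop body
            visited.append((a, b))
    return visited


# Hypothetical fuzzy interval 5..8 for a, with b taking the values 0..5.
with_condition = original(range(5, 9), range(0, 6), 10)
without_condition = expanded(range(5, 9), range(0, 6), 10)
```

  • Both versions visit exactly the same iterations in the same order, but the expanded version contains no branch in the loop body.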
  • the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and the variables contained in the at least one conditional statement are all variables included in the loop conditions of the multi-layer loop statement.
  • the method 500 can be applied to static scenarios: the value interval of the variable contained in the loop condition of each layer of loop statement is a constant interval, and the variables contained in the at least one conditional statement are all variables contained in the loop conditions of the multi-layer loop statement.
  • the first interval, the second interval, the fourth interval, and the third interval determined above are all constant intervals (the two endpoints of each interval are definite integers), so the first loop statement and the third loop statement can be expanded to eliminate the first conditional statement, thereby reducing the number of instructions to which the compilation result is mapped.
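  • The two conditions that characterize a static scenario can be checked with a short sketch. The tuple-based loop model and the convention that a string stands for a non-constant (symbolic) endpoint are assumptions made only for this illustration.

```python
from typing import List, Set, Tuple, Union

Endpoint = Union[int, str]  # an int is a constant; a str names a symbolic, non-constant bound


def is_static(loops: List[Tuple[str, Endpoint, Endpoint]], cond_vars: Set[str]) -> bool:
    """True when every loop bound is a constant and every condition variable is a loop variable."""
    loop_vars = {var for var, _, _ in loops}
    constant_bounds = all(isinstance(lo, int) and isinstance(hi, int) for _, lo, hi in loops)
    return constant_bounds and cond_vars <= loop_vars


static_case = is_static([("A", 0, 8), ("B", 0, 5)], {"A", "B"})
symbolic_bound = is_static([("A", 0, "N"), ("B", 0, 5)], {"A", "B"})
extra_variable = is_static([("A", 0, 8), ("B", 0, 5)], {"A", "B", "M"})
```

  • A symbolic endpoint or a condition variable that is not a loop variable takes the program out of the static scenario.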
  • the first loop statement is simply any one layer of loop statement that needs to be expanded in the multi-layer loop statement; that is, the above-mentioned method 500 describes the processing of a loop statement that needs to be expanded at any layer.
  • For the other loop statements that need to be expanded, the expansion method can refer to the expansion process of the first loop statement, which is not repeated here.
  • the loop statements to be processed may be expanded in sequence from the outermost loop statement to the innermost loop statement, or from the innermost loop statement to the outermost loop statement.
  • the outermost loop statement is the uppermost loop statement of the first program, and the innermost loop statement is the lowermost loop statement of the first program.
  • Because the first interval makes the first conditional statement always hold, and the value interval of the first variable in the second loop statement is the first interval, the first conditional statement in the second loop statement can be eliminated. This reduces the number of instructions obtained when the compilation result of the first program (the multidimensional data after segmentation) is mapped, improves instruction issue efficiency in the mapping process, and gives the compilation result of the first program higher running performance.
  • Compared with a process in which the value interval of the first variable is fully expanded (that is, expanded in units of 1) to eliminate the first conditional statement, in this application, because the first conditional statement always holds when the first variable takes a value in the first interval, the value interval of the first variable in the first loop statement can be expanded in the form of intervals. This effectively reduces the code length of the second loop statement obtained after expansion and thus the code length of the compilation result of the first program, thereby reducing the subsequent mapping duration of the compilation result of the first program.
  • Program 1 contains two layers of nested for loop statements.
  • In the first layer of loop statement (lines 1 to 6), the loop condition is the code on line 1 and the loop body is the code on lines 2 to 6.
  • The variable contained in this loop condition is A, whose value range is [0,8], that is, the nine values 0 to 8; the conditional statement contained in the loop body is A+B<10.
  • In the second layer of loop statement (lines 3 to 6), the loop condition is the code on line 3 and the loop body is the remaining code through line 6.
  • The variable contained in this loop condition is B, whose value range is [0,5], that is, the six values 0 to 5.
  • Program 1 is a program in a static scenario.
  • Table 1 shows the process of compiling Program 1 by full segmentation (segmenting all variables).
  • With full segmentation, the variables whose value ranges need to be expanded are the variables contained both in the loop conditions of the loop statements of the multi-layer loop statement and in the conditional statements, namely variable A and variable B.
  • Here variable A is the first variable, variable B is the second variable, and the conditional statement A+B<10 is the first conditional statement.
  • First expand the first loop statement (that is, the first layer of loop statement): calculate the first interval (constant-true interval), the second interval (fuzzy interval) and the fourth interval (constant-false interval) of the first variable A, which are [0,4], [5,9] and [10,+∞] respectively. The fourth interval is disregarded, and the first loop statement is expanded according to the first interval and the second interval to obtain the second loop statement and the third loop statement.
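  • The interval classification in this step can be illustrated with a small sketch. It is a brute-force check rather than the symbolic interval computation a compiler would perform; the function name `classify_outer` is hypothetical, and the condition is taken to be A + B < 10, which is consistent with the intervals [0,4], [5,9] and [10,+∞] stated above.

```python
def classify_outer(a_values, b_values, cond):
    """Split the outer variable's values into constant-true, fuzzy and
    constant-false groups by testing the condition against every inner value.
    (Hypothetical helper; enumeration stands in for interval arithmetic.)"""
    always_true, fuzzy, always_false = [], [], []
    for a in a_values:
        results = [cond(a, b) for b in b_values]
        if all(results):
            always_true.append(a)      # condition holds for every B
        elif not any(results):
            always_false.append(a)     # condition holds for no B
        else:
            fuzzy.append(a)            # outcome depends on B
    return always_true, fuzzy, always_false


# Program 1: A in [0, 8], B in [0, 5], condition assumed to be A + B < 10.
true_a, fuzzy_a, false_a = classify_outer(
    range(0, 9), range(0, 6), lambda a, b: a + b < 10
)
print(true_a)   # [0, 1, 2, 3, 4]
print(fuzzy_a)  # [5, 6, 7, 8]
print(false_a)  # []
```

  • Within the loop range [0,8], only the values 5 to 8 of the fuzzy interval [5,9] occur, and no value of A falls in the constant-false interval, which is why that interval can be disregarded.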
  • In the code obtained after this expansion, lines 1 to 4 are the second loop statement and lines 5 to 9 are the third loop statement. Because the value interval of the first variable A in the second loop statement (the constant-true interval) makes the first conditional statement A+B<10 always true, the first conditional statement can be eliminated from the second loop statement.
  • The value interval of the first variable A in the third loop statement is the second interval (fuzzy interval). Here, whether the first conditional statement holds is determined jointly by the value of the first variable A and the value of the second variable B, so the first conditional statement cannot be eliminated.
  • The first interval [0,4] is represented by (A,0,5) on line 1 of the code, and the second interval [5,9] is represented by (A,0,4) on line 5.
  • Because the second variable B contained in the first conditional statement is also a variable whose value range needs to be expanded, the second interval [5,9] must also be decomposed in units of 1 so that the value range of the second variable B can be expanded afterwards. The third loop statement is then further expanded, based on each fixed value of the first variable A, to obtain four fourth loop statements.
  • From top to bottom, the four fourth loop statements are lines 6 to 11, lines 12 to 17, lines 18 to 23 and lines 24 to 29 of the code.
  • The values of the first variable in the four fourth loop statements, from top to bottom, are 5, 6, 7 and 8.
  • The second column of Table 1 contains the second loop statement and the four fourth loop statements obtained by expanding the first loop statement.
  • The loop statements containing the second variable B are therefore the second loop statement and the four fourth loop statements.
  • For the second loop statement, the first conditional statement has already been eliminated, so the second variable B does not need to be segmented there; that is, the second loop statement does not need to be expanded further.
  • For each fourth loop statement, the first interval (constant-true interval) and the fourth interval (constant-false interval) of the second variable B can be calculated. From top to bottom, the first intervals of the second variable B in the four fourth loop statements are [0,4], [0,3], [0,2] and [0,1]. Each fourth loop statement is then expanded based on the first interval of the second variable B to obtain the program shown in the third column of Table 1, that is, the compilation result of Program 1.
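  • As a sanity check on the walk-through above, the following sketch compares Program 1 with its fully segmented form. It is an illustration only: the function names are hypothetical, the loops are written as plain Python rather than in the (variable, start, extent) notation of Table 1, and the condition is again taken to be A + B < 10.

```python
def original_program_1():
    """Program 1 as described: the branch is evaluated in every iteration."""
    visited = []
    for a in range(0, 9):          # A in [0, 8]
        for b in range(0, 6):      # B in [0, 5]
            if a + b < 10:
                visited.append((a, b))
    return visited


def segmented_program_1():
    """Fully segmented form: loop ranges are chosen so no branch remains."""
    visited = []
    # Second loop statement: A in its constant-true interval [0, 4].
    for a in range(0, 5):
        for b in range(0, 6):
            visited.append((a, b))
    # Four fourth loop statements: A fixed at 5, 6, 7, 8, with B restricted
    # to its constant-true interval [0, 4], [0, 3], [0, 2], [0, 1].
    for a, b_last in ((5, 4), (6, 3), (7, 2), (8, 1)):
        for b in range(0, b_last + 1):
            visited.append((a, b))
    return visited


# Both forms visit exactly the same (A, B) pairs in the same order.
print(original_program_1() == segmented_program_1())
print(len(segmented_program_1()))
```

  • The segmented form performs the same 44 iterations without evaluating the condition once, at the cost of a longer program text.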
  • Table 2 shows the process of compiling Program 1 by partial segmentation.
  • With partial segmentation, the variables whose value ranges need to be expanded are the variables contained both in the loop condition of the loop statement under the instruction mapping label and in the conditional statements of the multi-layer loop statement, namely variable B.
  • Here the first variable is the variable that requires value range expansion, that is, variable B, and the second variable is the variable that does not require value range expansion, that is, variable A.
  • First expand the first loop statement (that is, the second layer of loop statement in the multi-layer loop statement): because the second variable A is located in the layer outside the first variable B, the value interval of the second variable A is decomposed first.
  • For the resulting sub-intervals of A, the first conditional statement is, respectively, always true, true or false depending on B, and never true. The first interval (constant-true interval), the second interval (fuzzy interval) and the fourth interval (constant-false interval) of the first variable B corresponding to these cases are therefore all [0,5]. The fourth interval is disregarded, and the first loop statement is expanded according to the combinations of the value interval of the first variable B and the value interval of the second variable A to obtain the second loop statement and the third loop statement.
  • In the code obtained after this expansion, lines 2 to 6 are the second loop statement and lines 7 to 12 are the third loop statement. Because the combination of values of the first variable B and the second variable A in the second loop statement makes the first conditional statement A+B<10 always true, the first conditional statement can be eliminated from the second loop statement.
  • In the third loop statement, whether the first conditional statement holds is determined jointly by the values of the first variable B and the second variable A, so the first conditional statement cannot be eliminated.
  • The first interval [0,5] is represented by (B,0,6) on line 4 of the code, and the second interval [0,5] is represented by (B,0,6) on line 9.
  • The value interval of the second variable A in the third loop statement (that is, the third interval) can be decomposed in units of 1. The third loop statement is then further expanded, based on each fixed value of the second variable A, to obtain four fifth loop statements.
  • From top to bottom, the four fifth loop statements are lines 7 to 11, lines 12 to 16, lines 17 to 21 and lines 22 to 26 of the code.
  • The values of the second variable A in the four fifth loop statements, from top to bottom, are 5, 6, 7 and 8, and the corresponding value ranges of the first variable B are [0,4], [0,3], [0,2] and [0,1].
  • After this expansion, the first conditional statement has been eliminated from every fifth loop statement.
  • The second column of Table 2 contains the second loop statement and the four fifth loop statements obtained by expanding the first loop statement, that is, the compilation result of Program 1.
  • The first column is the program representation of the multidimensional data.
  • The second column is the compilation result obtained by full segmentation. Lines 3, 7, 11 and 15 of the second column each stand for lines 3 to 13 of the first column, so the compilation result obtained by full segmentation is about 4 times longer than the original program code.
  • The third column is the compilation result obtained by partial segmentation, which segments only the variable outer2 and eliminates the branch condition within that layer.
  • This compilation result is about 1 to 2 times longer than the original program code.
  • Table 4 compares the number of instructions obtained by mapping the compilation result under the different compilation methods.
  • The first column of Table 4 is the program representation of the multidimensional data. Each time the data represented by this program is processed, 4608 instructions are mapped because the instruction mapping label has to be moved down below the if conditional statement; that is, the instruction is actually run 4608 times.
  • The second column is the compilation result obtained by full segmentation, which is mapped to 9 instructions: the first-level loop statement combined with the second loop statement maps to 6 instructions, and the first-level loop statement combined with the third loop statement maps to 3 instructions.
  • Partial segmentation is an incomplete segmentation method: some if conditional statements remain in the compilation result, which affects instruction emission efficiency to some extent. Therefore, when the mapping time is not a concern, full segmentation can be used to obtain higher-performance instructions.
  • FIG. 6 is a flow chart of another program compiling method in the embodiment of the present application.
  • the method 600 includes steps S610, S620, S630 and S640.
  • Method 600 applies to dynamic scenarios.
  • Step S610: Obtain a second program, wherein the second program includes a multi-layer loop statement, the loop condition of each layer of loop statement includes a variable and the value interval of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement.
  • Step S620: The multi-layer loop statement includes a fifth loop statement, wherein the fifth loop statement is one layer of the multi-layer loop statement, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables included in a second conditional statement of the at least one conditional statement.
  • Step S630: Update the first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, wherein the updated first value interval makes the second conditional statement always true.
  • Step S640: Compile the second program based on the sixth loop statement to obtain a compilation result of the second program; the compilation result of the second program is related to the sixth loop statement.
  • the above-mentioned second program containing multi-layer loop statements can be used to represent multi-dimensional data, and the compilation result of the second program can correspond to the program representation after multi-dimensional data is segmented.
  • The compilation result of the second program is a program representation in a high-level language (for example, C, C++ or Python) and can be converted into instructions executable on hardware (for example, a CPU or GPU).
  • the above-mentioned processing of the value range of the third variable in the fifth loop statement is to expand the value range of the third variable.
  • If the second program also includes other loop statements that need to be processed, the value intervals of the variables contained in their loop conditions can be updated in turn in the same way as the value interval of the third variable in the fifth loop statement.
  • Once all such loop statements have been processed, the compilation result of the second program is obtained; that is, the compilation result of the second program is related to the sixth loop statement.
  • the loop statements to be processed may be sequentially expanded in order from the outermost loop statement to the innermost loop statement or from the innermost loop statement to the outermost loop statement.
  • the outermost loop statement is the uppermost loop statement of the first program
  • the innermost loop statement is the lowermost loop statement of the first program.
  • For example, the program representing multidimensional data in the first column of Table 5 includes two layers of nested for loop statements. Each layer of loop statement includes a loop condition, and the loop conditions of the two layers are the code on line 1 and on line 3 of the first column of Table 5 respectively.
  • The loop condition of each layer of loop statement includes a variable and the value range of the variable.
  • The variable in the loop condition of the first layer of loop statement is A, and its value range is [0,8].
  • In that line of code the range is written as (A,0,9), which indicates that A takes the 9 values from 0 to 8.
  • The loop statement may be a for loop statement.
  • The loop statement on the top (outermost) layer spans lines 1 to 6 of the code, and the loop statement on the bottom (innermost) layer spans lines 3 to 6.
  • the first loop statement may be a layer of loop statements in multi-layer loop statements.
  • the above-mentioned second program includes an instruction mapping label; the third variable is one of the loop condition of the loop statement under the instruction mapping label and the variable contained in the at least one conditional statement.
  • the second program representing each multidimensional data includes an instruction mapping label.
  • The variable whose value interval needs to be updated is a variable contained both in the loop condition of the loop statement under the instruction mapping label and in the conditional statements of the multi-layer loop statement.
  • The third variable is such a variable. For example, in the first column of Table 5, the variable whose value range needs to be updated is variable B.
  • Updating the value range of the variables jointly contained in the loop condition of the loop statement under the instruction mapping label and the at least one conditional statement, rather than of the variables jointly contained in the loop conditions of the whole multi-layer loop statement and the at least one conditional statement, reduces the number of loop statements to be processed. This shortens the code in the compilation result and lowers the mapping time of the compilation result of the second program.
  • Updating the first value interval of the third variable in the fifth loop statement includes: obtaining a second value interval of the third variable based on the second conditional statement; and updating the first value interval to the intersection of the first value interval and the second value interval.
  • Specifically, the terms of the second conditional statement are rearranged to obtain the second value interval of the third variable, and then the intersection of the second value interval and the first value interval is calculated. In a dynamic scenario, at least one of the two endpoints of the second value interval and the two endpoints of the first value interval is non-constant, so the endpoints of the second value interval must be compared with the endpoints of the first value interval.
  • Let the left and right endpoints of the second value interval be the first endpoint and the second endpoint, and let the left and right endpoints of the first value interval be the third endpoint and the fourth endpoint.
  • Let the left and right endpoints of the updated first value interval be the fifth endpoint and the sixth endpoint. The intersection of the first value interval and the second value interval can then be calculated according to formula (1).
  • fifth endpoint = max(first endpoint, third endpoint)
  • sixth endpoint = min(second endpoint, fourth endpoint)    (1)
  • Formula (1) indicates that the larger of the first endpoint and the third endpoint is taken as the fifth endpoint, and the smaller of the second endpoint and the fourth endpoint is taken as the sixth endpoint.
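  • Formula (1) can be expanded into conditional statements, so that the resulting sixth loop statement contains four conditional branches, one for each combination of the max and min results:
  • when the first endpoint is not less than the third endpoint and the second endpoint is not greater than the fourth endpoint, the updated first value range is [first endpoint, second endpoint];
  • when the first endpoint is not less than the third endpoint and the second endpoint is greater than the fourth endpoint, the updated first value range is [first endpoint, fourth endpoint];
  • when the first endpoint is less than the third endpoint and the second endpoint is not greater than the fourth endpoint, the updated first value range is [third endpoint, second endpoint];
  • when the first endpoint is less than the third endpoint and the second endpoint is greater than the fourth endpoint, the updated first value range is [third endpoint, fourth endpoint].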
  • Rearranging the second conditional statement yields a second value interval on which the inequality is always true. After the first value interval has been updated to the intersection of the first value interval and the second value interval, the updated first value range therefore makes the second conditional statement in the fifth loop statement always true.
  • The second conditional statement can thus be eliminated without fully expanding the fifth loop statement. This shortens the code of the sixth loop statement obtained after the update, improves the running performance of the compilation result, and reduces the number of instructions mapped from the compilation result of the second program, which improves instruction emission efficiency.
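  • Formula (1) and its expansion into conditional branches can be sketched as follows. The function and parameter names are hypothetical; the parameters follow the endpoint naming of the text (first and second endpoints for the second value interval, third and fourth for the first value interval), and closed intervals are assumed.

```python
def updated_interval(first_ep, second_ep, third_ep, fourth_ep):
    """Formula (1): intersect [first_ep, second_ep] with [third_ep, fourth_ep]."""
    fifth_ep = max(first_ep, third_ep)
    sixth_ep = min(second_ep, fourth_ep)
    return fifth_ep, sixth_ep


def updated_interval_branches(first_ep, second_ep, third_ep, fourth_ep):
    """The same computation with max and min written out as four branches."""
    if first_ep >= third_ep:
        if second_ep <= fourth_ep:
            return first_ep, second_ep
        return first_ep, fourth_ep
    if second_ep <= fourth_ep:
        return third_ep, second_ep
    return third_ep, fourth_ep


# First value interval [0, 5]; second value interval [-inf, 3].
print(updated_interval(float("-inf"), 3, 0, 5))  # (0, 3)
```

  • If the fifth endpoint ends up greater than the sixth endpoint, the intersection is empty and the loop body would not execute at all.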
  • At least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is a non-constant and/or each condition statement in the at least one condition statement is also A fourth variable is included, wherein the fourth variable is a variable not included in the loop condition of each layer of loop statement.
  • each conditional statement contains at most one variable that needs to update the value range.
  • the dynamic scenario applied by method 600 is specifically: at least one of the two endpoints of the value interval of the variable included in the loop condition is a non-constant and/or each conditional statement also includes a fourth variable, and each conditional statement is at most Contains a variable that needs to update the range of values.
  • Because at least one of the two endpoints of the second value interval and the two endpoints of the first value interval is non-constant, formula (1) and the updated first value interval can be used to eliminate the second conditional statement. At the same time, because neither the endpoints of the second value interval nor the endpoints of the first value interval contain a variable whose value interval needs to be updated, formula (1) can be placed above the instruction mapping label. Conditional statements can then be used to expand formula (1) and update the fifth loop statement into the sixth loop statement; since formula (1) sits above the instruction mapping label, expanding it does not increase the number of instructions generated when the compilation result is mapped.
  • The fifth loop statement is simply any one layer of the multi-layer loop statement whose loop condition has a value range that needs to be updated; that is, method 600 describes how such a loop statement at any layer is processed.
  • Other loop statements that need to be updated can be processed in the same way as the fifth loop statement, which is not repeated here. After all loop statements in the second program whose loop conditions need a value range update have been processed, the compilation result of the second program is obtained, so the compilation result of the second program is related to the sixth loop statement.
  • the embodiments of the present application may update the value intervals of variables in the loop conditions of all loop statements that need to be processed in any order.
  • When the third variable takes a value in the updated first value range, the second conditional statement is always true, so the second conditional statement in the fifth loop statement can be eliminated. This reduces the number of instructions obtained when the compilation result of the second program (the segmented multidimensional data) is subsequently mapped, and improves the instruction emission efficiency of the mapping process.
  • Compared with fully expanding the loop condition of the third variable (that is, expanding its value range in units of 1) to eliminate the conditional statement, the embodiment of the present application updates the value interval of the third variable as a whole so that the second conditional statement is always true. This shortens the code of the sixth loop statement obtained after the update and of the compilation result based on it, which reduces the subsequent mapping time of the compilation result and improves its running performance.
  • The first column of Table 5 is the program representation of the multidimensional data.
  • The program includes two layers of nested for loop statements and the second conditional statement if(A+B<n), and the second line of code is the instruction mapping label.
  • The second conditional statement contains a fourth variable n.
  • The variable whose value range needs to be updated is the variable contained both in the loop condition of the loop statement under the instruction mapping label and in the conditional statements of the multi-layer loop statement, namely variable B.
  • Lines 3 to 6 of the program are the fifth loop statement of method 600, and variable B is the third variable of method 600.
  • The first value interval and the second value interval of the third variable B are [0,5] and [-∞,n-A] respectively.
  • The intersection is calculated from the first value interval and the second value interval. Because 0 is greater than -∞, only the sixth endpoint B_ext of the updated first value interval needs to be calculated.
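  • The update described for Table 5 can be sketched as below. This is an illustration under stated assumptions, not the code of Table 5: the function names are hypothetical, the condition is taken to be A + B < n, and `b_ext` is treated as an exclusive upper bound for B so that it can be passed directly to `range`.

```python
def original_dynamic(n):
    """Two nested loops with the run-time condition A + B < n in the body."""
    visited = []
    for a in range(0, 9):          # A in [0, 8]
        for b in range(0, 6):      # B in [0, 5]
            if a + b < n:
                visited.append((a, b))
    return visited


def updated_dynamic(n):
    """Same loops, with B's range updated so that the condition is not needed."""
    visited = []
    for a in range(0, 9):
        # Computed once per outer iteration, before the inner loop starts:
        # the min of B's original exclusive bound (6) and the bound n - A
        # implied by the condition.
        b_ext = min(6, n - a)
        for b in range(0, b_ext):  # empty when b_ext <= 0
            visited.append((a, b))
    return visited


# The two forms agree for any value of the run-time variable n.
print(all(original_dynamic(n) == updated_dynamic(n) for n in range(-2, 20)))
```

  • Because `b_ext` depends only on `n` and `A`, it is evaluated outside the inner loop, so the inner loop body contains no branch.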
  • The instruction mapping label is located on the third line of code.
  • In that case the number of instructions mapped is 4608; that is, the instruction is actually executed 4608 times.
  • After the update, the number of instructions mapped from the compilation result is 9.
  • Specifically, the first-level loop statement in the second column of Table 6 combined with the conditional branch under line 3 of the code maps to 6 instructions, and the first-level loop statement combined with the conditional branch under line 7 of the code maps to 3 instructions, 9 instructions in total.
  • The processing of the multidimensional data can therefore be completed after 9 executions.
  • FIG. 7 is a schematic structural diagram of a compiling device 700 provided in an embodiment of the present application.
  • Apparatus 700 includes:
  • An obtaining unit 701, configured to obtain a first program, wherein the first program includes a multi-layer loop statement, the loop condition of each layer of loop statement in the multi-layer loop statement includes a variable and the value interval of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement; the multi-layer loop statement includes a first loop statement, wherein the first loop statement is one layer of the multi-layer loop statement, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables included in a first conditional statement of the at least one conditional statement.
  • a processing unit 702 configured to process the value range of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement; wherein, the at least one loop statement includes The second loop statement, both the first loop statement and the second loop statement include the first variable, the value interval of the first variable in the second loop statement is the first interval, and the first interval is a subset of the value range of the first variable in the first loop statement, and the first range makes the first conditional statement always true.
  • the compiling unit 703 is configured to compile the first program based on the at least one loop statement to obtain a compilation result of the first program, and the compilation result of the first program is related to the at least one loop statement.
  • The first program includes an instruction mapping label. When the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables jointly contained in the loop condition of the loop statement below the instruction mapping label and the at least one conditional statement; the estimated mapping time is the time required to map the compilation result of the first program into instructions executable on the hardware. When the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables jointly contained in the loop conditions of the loop statements of the multi-layer loop statement and the at least one conditional statement.
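  • The selection rule above can be sketched as a small function. Everything named here (`Loop`, `below_label`, `variables_to_expand`) is hypothetical, and the estimated mapping time is simply passed in as a number because the text does not give a formula for it.

```python
from dataclasses import dataclass


@dataclass
class Loop:
    var: str            # variable in the loop condition
    below_label: bool   # True if the loop is below the instruction mapping label


def variables_to_expand(loops, cond_vars, estimated_time, preset_time):
    """Return the loop variables whose value intervals should be processed."""
    if estimated_time > preset_time:
        # Partial segmentation: only loops below the instruction mapping label.
        candidates = [loop.var for loop in loops if loop.below_label]
    else:
        # Full segmentation: the loops of every layer.
        candidates = [loop.var for loop in loops]
    # Keep only variables that also appear in a conditional statement.
    return [var for var in candidates if var in cond_vars]


# Program 1: loop A is above the label, loop B below it; the condition uses A and B.
loops = [Loop("A", below_label=False), Loop("B", below_label=True)]
print(variables_to_expand(loops, {"A", "B"}, estimated_time=10.0, preset_time=5.0))  # ['B']
print(variables_to_expand(loops, {"A", "B"}, estimated_time=1.0, preset_time=5.0))   # ['A', 'B']
```

  • The two calls reproduce the choices made for Program 1: only variable B under partial segmentation, and both A and B under full segmentation.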
  • The estimated mapping time of the compilation result of the first program is determined from one or more of: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statement, the fuzzy intervals of the variables in the at least one conditional statement, or the number of conditional statements; when a variable takes a value in its fuzzy interval, whether the conditional statement containing that variable holds also depends on other factors.
  • The at least one loop statement corresponding to the first loop statement further includes a third loop statement, wherein the value interval of the first variable in the third loop statement is the second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
  • The first conditional statement further includes a second variable, and the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables jointly contained in the loop condition of the loop statement under the instruction mapping label and the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables jointly contained in the loop conditions of the loop statements of the multi-layer loop statement and the at least one conditional statement, process the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, wherein the value interval of the first variable in each fourth loop statement is a subset of the second interval.
  • The first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval; the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable included in the loop condition of a loop statement above the instruction mapping label, process the third interval in the third loop statement to obtain one or more fifth loop statements, wherein the value interval of the second variable in each fifth loop statement is a subset of the third interval.
  • The two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and the variables contained in the at least one conditional statement are all variables included in the loop conditions of the multi-layer loop statement.
  • FIG. 8 is a schematic structural diagram of a compiling device 800 provided in an embodiment of the present application.
  • Apparatus 800 includes:
  • An acquisition unit 801, configured to acquire a second program, wherein the second program includes a multi-layer loop statement, the loop condition of each layer of loop statement includes a variable and the value interval of the variable, and the loop body of the multi-layer loop statement includes at least one conditional statement; the multi-layer loop statement includes a fifth loop statement, wherein the fifth loop statement is one layer of the multi-layer loop statement, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables included in a second conditional statement of the at least one conditional statement.
  • An updating unit 802, configured to update the first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, wherein the updated first value interval makes the second conditional statement always true.
  • the compiling unit 803 is configured to compile the second program based on the sixth loop statement to obtain a compilation result of the second program; the compilation result of the second program is related to the sixth loop statement.
  • the second program includes an instruction mapping label; the third variable is one of the loop condition of the loop statement under the instruction mapping label and the variable contained in the at least one conditional statement.
  • The updating unit is specifically configured to: obtain the second value interval of the third variable based on the second conditional statement; and update the first value interval to the intersection of the first value interval and the second value interval.
  • At least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is a non-constant and/or each condition statement in the at least one condition statement is also A fourth variable is included, wherein the fourth variable is a variable not included in the loop condition of each layer of loop statement.
  • the devices 700 and 800 are embodied in the form of functional units.
  • The term "unit" here may refer to an application-specific integrated circuit (ASIC), an electronic circuit, a processor for executing one or more software or firmware programs (such as a shared processor, a dedicated processor, or a group processor) together with memory, merged logic circuits, and/or other suitable components that support the described functionality.
  • The devices 700 and 800 may be used to perform the processes and/or steps corresponding to the compiler in the foregoing method embodiments 500 and/or 600. To avoid repetition, details are not described here again.
  • FIG. 9 is a schematic diagram of a hardware structure of a compiling device provided in an embodiment of the present application.
  • the device may include: a memory 901 , one or more (only one is shown in the figure) processors 902 , a communication interface 903 and a bus 904 .
  • the memory 901 , the processor 902 , and the communication interface 903 are connected to each other through a bus 904 .
  • the memory 901 is used to store instructions, and the processor 902 is used to call the instructions stored in the memory 901; the instructions may be the programs in the foregoing application embodiments 500 and/or 600.
  • the processor 902 is specifically configured to obtain the program in the embodiment 500 and/or 600, so as to execute the corresponding compiling method in the embodiment 500 and/or 600.
  • The compiling device of the embodiment of the present application can use the compiling method in the embodiment of the present application to split multi-dimensional data irregularly in the compiling stage and eliminate the conditional inequalities in the program, so that the obtained program compilation result is mapped more efficiently; in addition, the code length in the program compilation result is reduced, and a shorter mapping time for the program compilation result is obtained.
  • apparatus 900 may specifically be a computer, and it may be used to execute various steps and/or processes corresponding to the compiler in the foregoing method embodiment 500 and embodiment 600.
  • the memory 901 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 901 may store programs, and when the programs stored in the memory 901 are executed by the processor 902, the processor 902 and the communication interface 903 are used to execute each step of the compiling method of the embodiment of the present application.
  • The processor 902 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to realize the functions required by the units in the compiling device of the embodiment of the present application, or to execute the compiling method of the method embodiment of the present application.
  • the processor 902 may also be an integrated circuit chip, which has a signal processing capability. During implementation, each step of the compiling method of the present application may be completed by instructions in the form of software in the processor 902 .
  • The processor 902 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 901, and the processor 902 reads the information in the memory 901, and combines its hardware to complete the functions required by the units included in the compiling device of the embodiment of the present application, or execute the compiling method of the method embodiment of the present application.
  • The communication interface 903 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the apparatus 900 and other devices or communication networks.
  • the program can be acquired through the communication interface 903 .
  • the bus 904 may include pathways for transferring information between various components of the device 900 (eg, memory 901 , processor 902 , communication interface 903 ).
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • Multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • The technical solution of the present application essentially, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

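This application discloses a program compilation method, including: obtaining a first program, where the first program includes multi-layer loop statements, the loop condition of each layer of loop statement in the multi-layer loop statements includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement; processing the value interval of a first variable in a first loop statement included in the multi-layer loop statements to obtain a second loop statement, where the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes a first conditional statement in the at least one conditional statement always hold; and compiling the first program based on at least one loop statement to obtain a compilation result of the first program. With the embodiments of this application, instruction issue efficiency can be improved to obtain higher running performance of the compilation result, while the code length in the compilation result of the program is reduced.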

Description

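Program compilation method and apparatus
This application claims priority to Chinese Patent Application No. 2021107819019, filed with the China National Intellectual Property Administration on July 9, 2021 and entitled "Program compilation method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of information technologies, and in particular, to a program compilation method and apparatus.
Background
With the rapid development of artificial intelligence technologies, conventional graphics processing units (Graphics Processing Unit, GPU) and central processing units (Central Processing Unit, CPU) can no longer meet the ever-growing performance requirements. Researchers in the industry have therefore begun to focus on artificial intelligence (Artificial Intelligence, AI) chips to meet the performance requirements of the AI field. In the AI system stack, because of hardware limitations, multi-dimensional data has to be computed in blocks, and the blocking strategy directly affects the performance of the AI chip. An AI chip usually processes multi-dimensional data, which is represented in a program by multi-layer nested loops. During processing, the multi-dimensional data is mapped to an instruction; this mapping process is called tensorization (tensorize).
In practice, however, branch jump instructions interleaved in the multi-layer nested loops hinder the tensorize process, so the number of instructions obtained by mapping is constrained to some extent. Specifically, as shown in FIG. 1, in the program representation of the multi-dimensional data, the instruction mapping label vmuls is originally placed below the H_o axis, and all program code below the instruction mapping label is expected to generate a single instruction with a deterministic execution process. Because of the if condition (which makes program execution non-deterministic), the instruction mapping label has to be moved down below the if conditional branch during mapping of the multi-dimensional data, so that the number of generated instructions is far greater than the theoretical 9.
Therefore, when multi-dimensional data is split irregularly, the tail-block data cannot be effectively tensorized during mapping, which seriously affects execution in the efficient single instruction, multiple data (Single Instruction, Multiple Data, SIMD) instruction issue mode.
In the prior art, the multi-layer nested loops in the program that represents the multi-dimensional data are fully unrolled (unrolled in units of 1) during compilation to eliminate branch jumps and thereby improve instruction issue efficiency. However, full unrolling greatly increases the code length of the compilation result and therefore increases the mapping time of the compilation result.
Summary
Embodiments of this application provide a program compilation method and apparatus, which can improve instruction issue efficiency to obtain higher running performance of the compilation result while reducing the code length in the compilation result of the program, thereby obtaining a shorter mapping time for the compilation result.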
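According to a first aspect, this application provides a program compilation method, including: obtaining a first program, where the first program includes multi-layer loop statements, the loop condition of each layer of loop statement in the multi-layer loop statements includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement; the multi-layer loop statements include a first loop statement, where the first loop statement is one layer of loop statement in the multi-layer loop statements, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables contained in a first conditional statement in the at least one conditional statement; processing the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, where the at least one loop statement includes a second loop statement, both the first loop statement and the second loop statement include the first variable, the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always hold; and compiling the first program based on the at least one loop statement to obtain a compilation result of the first program, where the compilation result of the first program is related to the at least one loop statement.
The first program that includes the multi-layer loop statements may be used to represent multi-dimensional data, and the compilation result of the first program may correspond to a program representation of the multi-dimensional data after splitting.
The compilation result of the first program is a program representation in a high-level language (for example, C, C++, or Python). It can be converted into instructions executable on hardware (for example, a CPU or a GPU) only after subsequent processing such as instruction mapping.
It should be understood that processing the value interval of the first variable in the first loop statement means expanding the value interval of the first variable. When the first program further includes other loop statements that need to be processed, the value intervals of the variables in the loop conditions of those loop statements may be expanded one by one in the same way as the value interval of the first variable in the first loop statement. After the last loop statement that needs to be processed has been expanded, the compilation result of the first program is obtained; that is, the compilation result of the first program is related to the at least one loop statement.
In addition, in the embodiments of this application, the loop statements that need to be processed may be expanded in order from the outermost loop statement to the innermost loop statement, or from the innermost loop statement to the outermost loop statement. The outermost loop statement is the topmost layer of loop statement in the first program, and the innermost loop statement is the bottommost layer of loop statement in the first program.
In terms of technical effect, when the first variable takes a value in the first interval, the first conditional statement always holds, and the value interval of the first variable in the second loop statement is the first interval. The first conditional statement in the second loop statement can therefore be eliminated. As a result, when the compilation result of the first program (the split multi-dimensional data) is mapped, fewer instructions are produced, instruction issue efficiency during mapping is improved, and the compilation result of the first program runs with higher performance. By contrast, the prior art eliminates the first conditional statement by fully unrolling the value interval of the first variable (that is, unrolling it in units of 1). In this application, because the first conditional statement always holds when the first variable takes a value in the first interval, the value interval of the first variable in the first loop statement can be expanded in the form of intervals. This effectively reduces the code length of the second loop statement obtained after expansion, that is, the code length of the compilation result of the first program, and in turn reduces the subsequent mapping time of the compilation result of the first program.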
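In a possible implementation, the first program includes an instruction mapping label. When an estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement, where the estimated mapping time is the time required to map the compilation result of the first program to instructions executable on hardware. When the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement.
In terms of technical effect, when the estimated mapping time of the compilation result of the first program is greater than the preset time, the mapping time has to be kept under control. In this case, only the value intervals of the variables contained in both the loop conditions of the loop statements below the instruction mapping label and the at least one conditional statement are processed one by one. Fewer loop statements are processed, so the code length in the compilation result is reduced and the mapping time of the compilation result is shortened. When the estimated mapping time is less than or equal to the preset time, the value intervals of all variables contained in both the loop condition of each layer of loop statement and the at least one conditional statement can be processed one by one. In this case, the conditional statements in the first program are eliminated to the greatest extent, which reduces the number of instructions obtained by mapping the compilation result of the first program, that is, yields high-performance instructions.
In a possible implementation, the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statements, the fuzzy intervals of the variables contained in the at least one conditional statement, or the number of conditional statements. When a variable takes a value in its fuzzy interval, whether the conditional statement containing the variable holds also depends on other factors.
The other factors are the other variables in the conditional statement; that is, whether a conditional statement containing a variable holds is jointly determined by that variable and the other variables in the conditional statement.
In terms of technical effect, the code length in the compilation result of the first program can be calculated from the number of variables in the at least one conditional statement, the number of layers of the multi-layer loop statements, the fuzzy intervals of the variables, and the number of conditional statements. A fairly accurate mapping time can therefore be calculated from these parameters, and the number of variables to be processed (that is, the loop layers in which those variables are located) can be determined from that mapping time. This balances the instruction issue efficiency and the mapping time of the compilation result of the first program, so that the compilation result has both high instruction issue efficiency and a short mapping time.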
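In a possible implementation, the at least one loop statement corresponding to the first loop statement further includes a third loop statement, where the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
The other factors are the other variables in the first conditional statement; that is, whether the first conditional statement containing the first variable holds is jointly determined by the first variable and the other variables in the first conditional statement.
The third loop statement is obtained together with the second loop statement when the first loop statement is expanded. The first interval and the second interval are subsets of the value interval of the first variable in the first loop statement. The second interval is the fuzzy interval of the first variable.
In terms of technical effect, the first conditional statement always holds in the second loop statement, whereas in the third loop statement whether it holds is jointly determined by the first variable and the other variables in the first conditional statement. The first loop statement is therefore expanded into the second loop statement and the third loop statement to separate the two cases, so that the first conditional statement can be eliminated from the second loop statement. In addition, when the first conditional statement contains other variables, the third loop statement can be further expanded later to eliminate the first conditional statement.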
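In a possible implementation, the first conditional statement further includes a second variable, and the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement, processing the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, where the value interval of the first variable in each fourth loop statement is a subset of the value interval of the first variable in the third loop statement.
Processing the second interval of the first variable in the third loop statement specifically means expanding the second interval in units of 1.
In terms of technical effect, whether the first conditional statement holds in the third loop statement is jointly determined by the values of the first variable and the second variable. The second interval of the first variable can therefore be expanded first, and the values of the second variable in each fourth loop statement determined accordingly, so that the first conditional statement always holds in each fourth loop statement and can be eliminated from it. This reduces the number of instructions generated when the compilation result is mapped and improves instruction issue efficiency, yielding a high-performance compilation result.
In a possible implementation, the first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval; the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable contained in the loop condition of a loop statement above the instruction mapping label, processing the third interval in the third loop statement to obtain one or more fifth loop statements, where the value interval of the second variable in each fifth loop statement is a subset of the third interval.
Processing the third interval in the third loop statement specifically means expanding the third interval in units of 1.
In terms of technical effect, whether the first conditional statement holds in the third loop statement is jointly determined by the values of the first variable and the second variable. The value interval of the second variable in the third loop statement (the third interval) can therefore be expanded, and the values of the first variable in each fifth loop statement determined at the same time, so that the first conditional statement always holds in each fifth loop statement and can be eliminated from it. This reduces the number of instructions generated when the compilation result is subsequently mapped and improves instruction issue efficiency.
In a possible implementation, both endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and all variables contained in the at least one conditional statement are variables contained in the loop conditions of the multi-layer loop statements.
In terms of technical effect, the endpoints of the value intervals of the variables in the loop conditions of the multi-layer loop statements are all constants, and every variable contained in each conditional statement is a variable contained in a loop condition of the multi-layer loop statements. The endpoints of the first interval and the second interval, which are derived from the first conditional statement and the value interval of the first variable, are therefore also constants. The second interval in the third loop statement can thus be expanded further to eliminate the first conditional statement from each fourth loop statement, which reduces the number of instructions generated when the compilation result is subsequently mapped and improves instruction issue efficiency.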
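According to a second aspect, this application provides a program compilation method, including: obtaining a second program, where the second program includes multi-layer loop statements, the loop condition of each layer of loop statement includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement; the multi-layer loop statements include a fifth loop statement, where the fifth loop statement is one layer of loop statement in the multi-layer loop statements, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables contained in a second conditional statement in the at least one conditional statement; updating a first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, where the updated first value interval makes the second conditional statement always hold; and compiling the second program based on the sixth loop statement to obtain a compilation result of the second program, where the compilation result of the second program is related to the sixth loop statement.
The second program that includes the multi-layer loop statements may be used to represent multi-dimensional data, and the compilation result of the second program may correspond to a program representation of the multi-dimensional data after splitting.
The compilation result of the second program is a program representation in a high-level language (for example, C, C++, or Python). It can be converted into instructions executable on hardware (for example, a CPU or a GPU) only after subsequent processing such as instruction mapping.
It should be understood that processing the value interval of the third variable in the fifth loop statement means updating the value interval of the third variable. When the second program further includes other loop statements that need to be processed, the value intervals of the variables in the loop conditions of those loop statements may be updated one by one in the same way as the value interval of the third variable in the fifth loop statement. After the value interval in the loop condition of the last loop statement that needs to be processed has been updated, the compilation result of the second program is obtained; that is, the compilation result of the second program is related to the sixth loop statement.
In addition, in the embodiments of this application, the loop statements that need to be processed may be handled in order from the outermost loop statement to the innermost loop statement, or from the innermost loop statement to the outermost loop statement. The outermost loop statement is the topmost layer of loop statement in the second program, and the innermost loop statement is the bottommost layer of loop statement in the second program.
In addition, in the embodiments of this application, the value intervals of the variables in the loop conditions of all loop statements that need to be processed may be updated in any order.
In terms of technical effect, when the third variable takes a value in the updated first value interval, the second conditional statement always holds, so the second conditional statement in the fifth loop statement can be eliminated. As a result, when the compilation result of the second program (the split multi-dimensional data) is subsequently mapped, fewer instructions are produced and instruction issue efficiency during mapping is improved. By contrast, the prior art eliminates conditional statements by fully unrolling the loop condition of the third variable (that is, unrolling the value interval of the third variable in units of 1). In the embodiments of this application, the value interval of the third variable is updated as a whole so that the second conditional statement always holds. This effectively reduces the code length of the sixth loop statement obtained after the update and the code length of the compilation result obtained based on the sixth loop statement, which in turn reduces the subsequent mapping time of the compilation result and improves its running performance.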
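In a possible implementation, the second program includes an instruction mapping label, and the third variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement.
In terms of technical effect, only the value intervals of the variables contained in both the loop conditions of the loop statements below the instruction mapping label and the at least one conditional statement are updated, rather than those of all variables contained in both the loop conditions of the multi-layer loop statements and the at least one conditional statement. Fewer loop statements are processed, so the code length in the compilation result is reduced and a shorter mapping time for the compilation result of the second program is obtained.
In a possible implementation, updating the first value interval of the third variable in the fifth loop statement includes: obtaining a second value interval of the third variable based on the second conditional statement; and updating the first value interval with the intersection of the first value interval and the second value interval.
In terms of technical effect, a second value interval in which the inequality always holds can be derived from the second conditional statement. After the first value interval is updated with the intersection of the first value interval and the second value interval, the updated first value interval makes the second conditional statement in the fifth loop statement always hold. The second conditional statement can then be eliminated without fully unrolling the fifth loop statement. This effectively reduces the code length of the sixth loop statement obtained after the update and improves the running performance of the compilation result, while also reducing the number of instructions obtained by mapping the compilation result of the second program and improving instruction issue efficiency.
In a possible implementation, at least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is non-constant, and/or each conditional statement in the at least one conditional statement further includes a fourth variable, where the fourth variable is a variable not included in the loop condition of any layer of loop statement.
Each conditional statement contains at most one variable whose value interval needs to be updated.
In terms of technical effect, at least one of the two endpoints of the value interval of the variable in the loop condition of each layer of loop statement is non-constant, and/or each conditional statement further includes a fourth variable that is not included in the loop condition of any layer of loop statement. Consequently, at least one of the two endpoints of the first value interval and the two endpoints of the second value interval is non-constant. Because each conditional statement contains only one variable that needs to be processed, the upper and lower bounds of the first value interval and the second value interval can be compared directly to obtain the updated first value interval, without affecting other variables that need to be processed. Because the second conditional statement always holds when the third variable takes a value in the updated first value interval, conditional statements can be eliminated during compilation without expanding the first value interval. This not only reduces the number of instructions obtained by mapping but also reduces the code length in the resulting compilation result of the second program, which in turn reduces its subsequent mapping time.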
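According to a third aspect, this application provides a program compilation apparatus, including: an acquisition unit, configured to obtain a first program, where the first program includes multi-layer loop statements, the loop condition of each layer of loop statement in the multi-layer loop statements includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement; the multi-layer loop statements include a first loop statement, where the first loop statement is one layer of loop statement in the multi-layer loop statements, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables contained in a first conditional statement in the at least one conditional statement; a processing unit, configured to process the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, where the at least one loop statement includes a second loop statement, both the first loop statement and the second loop statement include the first variable, the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always hold; and a compiling unit, configured to compile the first program based on the at least one loop statement to obtain a compilation result of the first program, where the compilation result of the first program is related to the at least one loop statement.
In a possible implementation, the first program includes an instruction mapping label. When an estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement, where the estimated mapping time is the time required to map the compilation result of the first program to instructions executable on hardware. When the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement.
In a possible implementation, the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statements, the fuzzy intervals of the variables contained in the at least one conditional statement, or the number of conditional statements. When a variable takes a value in its fuzzy interval, whether the conditional statement containing the variable holds also depends on other factors.
In a possible implementation, the at least one loop statement corresponding to the first loop statement further includes a third loop statement, where the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
In a possible implementation, the first conditional statement further includes a second variable, and the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement, process the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, where the value interval of the first variable in each fourth loop statement is a subset of the second interval.
In a possible implementation, the first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval; the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable contained in the loop condition of a loop statement above the instruction mapping label, process the third interval in the third loop statement to obtain one or more fifth loop statements, where the value interval of the second variable in each fifth loop statement is a subset of the third interval.
In a possible implementation, both endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and all variables contained in the at least one conditional statement are variables contained in the loop conditions of the multi-layer loop statements.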
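According to a fourth aspect, this application provides a program compilation apparatus, including: an acquisition unit, configured to obtain a second program, where the second program includes multi-layer loop statements, the loop condition of each layer of loop statement includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement; the multi-layer loop statements include a fifth loop statement, where the fifth loop statement is one layer of loop statement in the multi-layer loop statements, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables contained in a second conditional statement in the at least one conditional statement; an updating unit, configured to update a first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, where the updated first value interval makes the second conditional statement always hold; and a compiling unit, configured to compile the second program based on the sixth loop statement to obtain a compilation result of the second program, where the compilation result of the second program is related to the sixth loop statement.
In a possible implementation, the second program includes an instruction mapping label, and the third variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement.
In a possible implementation, with respect to updating the first value interval of the third variable in the fifth loop statement, the updating unit is specifically configured to: obtain a second value interval of the third variable based on the second conditional statement; and update the first value interval with the intersection of the first value interval and the second value interval.
In a possible implementation, at least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is non-constant, and/or each conditional statement in the at least one conditional statement further includes a fourth variable, where the fourth variable is a variable not included in the loop condition of any layer of loop statement.
According to a fifth aspect, this application provides a chip system, including at least one processor, a memory, and an interface circuit, where the memory, the interface circuit, and the at least one processor are interconnected through lines, and the memory stores instructions; when the instructions are executed by the processor, the method according to any one of the first aspect and/or the second aspect is implemented.
According to a sixth aspect, this application provides a compiling apparatus, including the chip system according to the fifth aspect and a discrete component coupled to the chip system.
According to a seventh aspect, this application provides a computer-readable storage medium that stores program code for execution by a device, where the program code includes instructions for performing the method according to any one of the first aspect and/or the second aspect.
According to an eighth aspect, this application provides a computer program product, where the computer program product includes program instructions, and when the program instructions are run on a computer, the method according to any one of the first aspect and/or the second aspect is implemented.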
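Brief Description of the Drawings
The following describes the accompanying drawings used in the embodiments of this application.
FIG. 1 is a schematic diagram of a program representation of multi-dimensional data according to an embodiment of this application;
FIG. 2 is a schematic diagram of a splitting process of multi-dimensional data according to an embodiment of this application;
FIG. 3 is a schematic diagram of a system architecture according to an embodiment of this application;
FIG. 4 is a schematic diagram of an application scenario according to an embodiment of this application;
FIG. 5 is a flowchart of a program compilation method according to an embodiment of this application;
FIG. 6 is a flowchart of another program compilation method according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a compiling apparatus according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of another compiling apparatus according to an embodiment of this application;
FIG. 9 is a schematic diagram of a hardware structure of a compiling apparatus according to an embodiment of this application.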
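Detailed Description
The following describes the embodiments of this application with reference to the accompanying drawings in the embodiments of this application.
The technical solutions in the embodiments of this application may be applied to various computer systems, for example, a personal computer (personal computer, PC), a computer cluster system, a mainframe computer system, or various supercomputers (supercomputer). This is not limited in this application. In addition, this application may be applied to various compilers, for example, the GNU Compiler Collection (GNU Compiler Collection, GCC) and the Low Level Virtual Machine (Low Level Virtual Machine, LLVM). This is not limited in the embodiments of this application either.
The related terms in the embodiments of this application are explained first:
(1) Tensorization (tensorize): the process in which a program that represents multi-dimensional data is mapped to an instruction.
(2) Tail block: when multi-dimensional data is split and a dimension cannot be divided evenly, the remaining part is called a tail block. The multi-dimensional data may be a matrix.
(3) Compilation: the process of converting one kind of source code into another kind of source code. In this application, compilation specifically refers to processing a source program that represents multi-dimensional data to obtain a program that represents the split multi-dimensional data.
(4) Instruction mapping label: a label, which may be //pragma_emit_insn="vmuls", where pragma_emit_insn is the keyword of the instruction mapping label and indicates that the label is an instruction mapping label, and vmuls indicates the type, here vector multiplication. The instruction mapping label indicates that the code below the label is mapped to a single instruction, which after subsequent processing becomes an instruction executable on hardware (for example, a CPU). During mapping, the code below the instruction mapping label must be a deterministic program execution process; that is, it cannot contain branch jumps (for example, conditional statements).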
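Refer to FIG. 1. FIG. 1 is a schematic diagram of a program representation of multi-dimensional data according to an embodiment of this application. The multi-dimensional data may be a matrix. As shown in FIG. 1, the nine lines of program code represent one piece of multi-dimensional data. The program contains five layers of nested for loop statements, and each layer of loop statement corresponds to one dimension of the multi-dimensional data. Lines 1, 2, and 4 to 6 are the loop conditions of the five layers of loop statements, and line 7 is the conditional statement in the loop body. From top to bottom, for the first-layer (outermost) loop statement, the loop condition is line 1 and the loop body is lines 2 to 9, with line 7 being the conditional statement in the loop body. For the second-layer loop statement, the loop condition is line 2 and the loop body is lines 3 to 9. Line 3 is the instruction mapping label of the multi-dimensional data, which indicates that the code below it is mapped to a single instruction during subsequent mapping of the multi-dimensional data. However, when the multi-dimensional data is split irregularly, the program contains a conditional statement (line 7). During subsequent mapping, the instruction mapping label is therefore moved down below the conditional statement so that the program execution process below the label is deterministic. This produces a large number of instructions, far more than the 9 obtained with regular splitting, and seriously reduces instruction issue efficiency under SIMD.
Refer to FIG. 2. FIG. 2 is a schematic diagram of a splitting process of multi-dimensional data according to an embodiment of this application. The splitting process corresponds to the compilation process of the first program or the second program in the embodiments of this application. The program shown in FIG. 1 represents the multi-dimensional data in FIG. 2. The multi-dimensional data in FIG. 2 also contains five dimensions: C1, W, C0, and H (which comprises H_o and H_i and is shown as H in FIG. 2).
In practice, because of hardware memory limits, large data often cannot be split into equal blocks. For example, when the multi-dimensional data shown in FIG. 2 is split along the H direction, the data is split into three blocks: the first two blocks have the same size in the H direction, and the third block has a different size in the H direction.
The multi-dimensional data shown in FIG. 2 is split into 9 blocks. Ideally, it would be mapped to 9 instructions that together complete the operation on the entire multi-dimensional data. Because of the irregular split, however, far more than 9 instructions are obtained by mapping, which seriously reduces instruction issue efficiency.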
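The uneven split described above can be illustrated with a short sketch. The function name `split_axis` and the sizes 20 and 8 are hypothetical and are not taken from FIG. 2; the sketch only shows how a tail block arises when an axis length is not a multiple of the block size.

```python
def split_axis(extent, tile):
    """Split the index range [0, extent) into blocks of at most `tile` elements.

    Returns a list of (start, size) pairs. The last pair is a tail block
    whenever `extent` is not a multiple of `tile`.
    """
    blocks = []
    start = 0
    while start < extent:
        size = min(tile, extent - start)  # the final block may be shorter
        blocks.append((start, size))
        start += size
    return blocks


# Hypothetical sizes: an H axis of 20 elements cut with a block size of 8
# gives two full blocks and one shorter tail block.
print(split_axis(20, 8))
```

With these assumed sizes the third block covers only 4 elements, which is the situation that forces a bounds check (the if condition) into a loop nest written for full blocks.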
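In the prior art, when multi-dimensional data is split irregularly, the multi-layer nested loops in the program representing the multi-dimensional data are fully unrolled to eliminate the conditional statements. This greatly increases the code length in the compilation result and increases the mapping time of the compilation result.
The following describes the system architecture and application scenarios of the embodiments of this application.
Refer to FIG. 3. FIG. 3 is a schematic diagram of a system architecture 300 according to an embodiment of this application.
As shown in FIG. 3, a compiling device 320 compiles a source program 350 to obtain a program compilation result 301. The source program 350 is the program in the embodiments of this application (the first program or the second program), and may be a program that performs multi-dimensional data operations in different scenarios, including image processing, speech recognition, scientific computing, and physical modeling. The compiling device 320 may be any device that contains a compiler (for example, GCC or LLVM).
The compilation of the source program 350 by the compiling device 320 is described in detail below based on the embodiments shown in FIG. 5 and FIG. 6.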
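The program compilation result 301 obtained by the compiling device 320 may be applied to different systems or devices, for example, the execution device 310 shown in FIG. 3. The execution device 310 may be a terminal, such as a mobile phone, a tablet computer, a notebook computer, an augmented reality (augmented reality, AR)/virtual reality (virtual reality, VR) device, or an in-vehicle terminal, or may be a server, a cloud, or the like. In FIG. 3, the execution device 310 is provided with an input/output (input/output, I/O) interface 312 for exchanging data with external devices.
The execution device 310 may receive data from a database 330 or data input by a client device 340, and use a computing module 311 to perform the relevant computation in the program compilation result 301 to obtain a corresponding processing result.
Finally, the I/O interface 312 returns the processing result (for example, an image processing result or a speech recognition result) to the client device 340 for the user.
It is worth noting that the compiling device 320 may compile source programs for different goals or tasks to obtain corresponding program compilation results 301, and the execution device 310 then performs the relevant computation in the program compilation result 301 to obtain the processing results required for those goals or tasks.
In the case shown in FIG. 3, a user may manually provide input data through an interface provided by the I/O interface 312. In another case, the client device 340 may automatically send input data to the I/O interface 312. If automatic sending requires the user's authorization, the user may set the corresponding permission on the client device 340. The user may view, on the client device 340, the result output by the execution device 310, which may be presented as display, sound, action, or the like.
It should be noted that FIG. 3 is merely a schematic diagram of a system architecture provided in an embodiment of the present invention, and the positional relationships between the devices, components, and modules shown in the figure do not constitute any limitation. For example, in FIG. 3 the compiling device 320 is external to the execution device 310; in other cases, the compiling device 320 may be placed in the execution device 310. The execution device 310 is external to the client device 340; in other cases, the execution device 310 and the client device 340 may be the same device.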
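Refer to FIG. 4. FIG. 4 is a schematic diagram of an application scenario according to an embodiment of this application. It should be understood that compilation methods 500 and 600 in the embodiments of this application may be applied to scenarios that require multi-dimensional data operations in fields including artificial intelligence (such as image processing or speech recognition), scientific computing, and physical modeling. FIG. 4 uses a deep-learning-based image processing scenario in the artificial intelligence field (for example, image recognition, object detection, or image segmentation) as an example to describe how compilation methods 500 and 600 are applied.
First, to-be-processed image data 410 is obtained and defined in a high-level language (for example, C, C++, or Python) to obtain a multi-dimensional data program representation 420 corresponding to the to-be-processed image data 410 (that is, the source program in FIG. 3).
The compiling device 320 performs compilation method 500 and/or 600 in the embodiments of this application on the multi-dimensional data program representation 420 (that is, the first program or the second program in the following embodiments) to obtain a program compilation result 440 (that is, the program compilation result 301 in FIG. 3). Operations such as optimization and instruction mapping are performed on the program compilation result 440 to obtain hardware machine language 450 executable on hardware (for example, a CPU or a GPU).
The execution device 310 runs the hardware machine language 450 to perform the corresponding multi-dimensional data operations and obtain an image processing result 430 (for example, a class label of the image in image recognition, an object identified from the image in object detection, or a segmentation result of the image in image segmentation).
It should be understood that in FIG. 4 the compiling device 320 is external to the execution device 310; in other cases, the compiling device 320 may be placed in the execution device 310.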
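Refer to FIG. 5. FIG. 5 is a flowchart of a program compilation method according to an embodiment of this application. The method 500 includes steps S510, S520, S530, and S540. Method 500 is applied to a static scenario.
Step S510: Obtain a first program, where the first program includes multi-layer loop statements, the loop condition of each layer of loop statement in the multi-layer loop statements includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement.
Step S520: The multi-layer loop statements include a first loop statement, where the first loop statement is one layer of loop statement in the multi-layer loop statements, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables contained in a first conditional statement in the at least one conditional statement.
Step S530: Process the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, where the at least one loop statement includes a second loop statement, both the first loop statement and the second loop statement include the first variable, the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always hold.
Step S540: Compile the first program based on the at least one loop statement to obtain a compilation result of the first program, where the compilation result of the first program is related to the at least one loop statement.
The first program that includes the multi-layer loop statements may be used to represent multi-dimensional data, and the compilation result of the first program may correspond to a program representation of the multi-dimensional data after splitting.
The first interval is obtained based on the first conditional statement and the value interval of the first variable in the first loop statement.
The compilation result of the first program is a program representation in a high-level language (for example, C, C++, or Python). It can be converted into instructions executable on hardware (for example, a CPU or a GPU) only after subsequent processing such as instruction mapping.
It should be understood that processing the value interval of the first variable in the first loop statement means expanding the value interval of the first variable. When the first program further includes other loop statements that need to be processed, the value intervals of the variables in the loop conditions of those loop statements may be expanded one by one in the same way as the value interval of the first variable in the first loop statement. After the last loop statement that needs to be processed has been expanded, the compilation result of the first program is obtained; that is, the compilation result of the first program is related to the at least one loop statement.
The loop statement may be a for loop statement.
The following uses the program in FIG. 1 as an example to describe the structure of a program that represents multi-dimensional data in the static scenario. In FIG. 1, the program representing the multi-dimensional data includes five layers of nested for loop statements, each with one loop condition; the loop conditions of the five layers are lines 1 to 2 and lines 4 to 6 in FIG. 1. The loop condition of each layer of loop statement includes one variable and the value interval of that variable. For example, the variable in the loop condition of the first-layer loop statement is C1, with value interval (0,3), meaning that C1 takes the three values 0, 1, and 2.
In addition, as shown in FIG. 1, the topmost (outermost) loop statement spans lines 1 to 9, and the bottommost (innermost) loop statement spans lines 6 to 9. The first loop statement may be any one layer of loop statement in the multi-layer loop statements.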
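In a possible implementation, the first program includes an instruction mapping label. When an estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement, where the estimated mapping time is the time required to map the compilation result of the first program to instructions executable on hardware. When the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement.
Each first program that represents multi-dimensional data contains one instruction mapping label.
Specifically, the estimated mapping time of the compilation result of the first program determines whether the multi-layer loop statements are expanded by full splitting or by partial splitting, that is, which variables are processed (have their value intervals expanded) under each splitting mode.
(1) Full splitting
When the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, full splitting is used to eliminate the conditional statements in the first program to the greatest extent, thereby reducing the number of instructions obtained by mapping the compilation result and improving instruction issue efficiency. In this mode, the variables whose value intervals need to be expanded are those contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement. For example, with full splitting, the variables whose value intervals need to be expanded in FIG. 1 are H_o and H_i.
(2) Partial splitting
When the estimated mapping time of the compilation result of the first program is greater than the preset time, partial splitting is used to reduce the code length in the compilation result as far as possible, thereby reducing the mapping time of the compilation result. In this mode, the variables whose value intervals need to be expanded are those contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement. For example, with partial splitting, the variable whose value interval needs to be expanded in FIG. 1 is H_i.
The first variable in the first loop statement is a variable whose value interval needs to be expanded. The preset time may be set according to the specific scenario and is not limited in this application.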
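The selection rule can be sketched as follows, using the rule stated for the first variable: an estimated mapping time above the preset time restricts expansion to loops below the instruction mapping label, and otherwise every loop layer is considered. The function name, the argument names, and the order of the three inner loop variables of FIG. 1 are assumptions made for illustration only; the source states only that C1 is the outermost variable, that the label sits below the H_o loop, and that the example variables are H_o and H_i.

```python
def variables_to_expand(loop_vars, loops_above_label, cond_vars,
                        estimated_time, preset_time):
    """Pick the loop variables whose value intervals are expanded.

    loop_vars         -- loop variables, outermost first
    loops_above_label -- number of loops located above the instruction mapping label
    cond_vars         -- variables that appear in the conditional statements
    """
    if estimated_time > preset_time:
        # Partial splitting: only loops below the instruction mapping label.
        candidates = loop_vars[loops_above_label:]
    else:
        # Full splitting: every loop layer is a candidate.
        candidates = loop_vars
    return [v for v in candidates if v in cond_vars]


# Assumed layout of FIG. 1: C1 and H_o above the label, three loops below it.
FIG1_LOOPS = ["C1", "H_o", "W", "C0", "H_i"]
FIG1_COND = {"H_o", "H_i"}

full = variables_to_expand(FIG1_LOOPS, 2, FIG1_COND,
                           estimated_time=1.0, preset_time=2.0)
partial = variables_to_expand(FIG1_LOOPS, 2, FIG1_COND,
                              estimated_time=3.0, preset_time=2.0)
print(full, partial)
```

The time values passed in are placeholders; how the estimated mapping time is actually computed is left open by the application.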
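In terms of technical effect, when the estimated mapping time of the compilation result of the first program is greater than the preset time, the mapping time has to be kept under control. In this case, only the value intervals of the variables contained in both the loop conditions of the loop statements below the instruction mapping label and the at least one conditional statement are processed one by one. Fewer loop statements are processed, so the code length in the compilation result is reduced and the mapping time of the compilation result is shortened. When the estimated mapping time is less than or equal to the preset time, the value intervals of all variables contained in both the loop condition of each layer of loop statement and the at least one conditional statement can be processed one by one. In this case, the conditional statements in the first program are eliminated to the greatest extent, which reduces the number of instructions obtained by mapping the compilation result of the first program, that is, yields high-performance instructions.
In a possible implementation, the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables contained in the at least one conditional statement, the number of layers of the multi-layer loop statements, the fuzzy intervals of the variables contained in the at least one conditional statement, or the number of conditional statements. When a variable takes a value in its fuzzy interval, whether the conditional statement containing the variable holds also depends on other factors.
The estimated mapping time of the compilation result of the first program is proportional to the code length in the compilation result.
The other factors are the other variables in the conditional statement; that is, whether a conditional statement containing a variable holds is jointly determined by that variable and the other variables in the conditional statement.
Specifically, the code length in the compilation result (that is, the program obtained after the first program representing the multi-dimensional data is expanded) may be calculated from the number of variables contained in the at least one conditional statement in the multi-layer loop statements, the number of layers of the multi-layer loop statements, the fuzzy intervals of the variables contained in the at least one conditional statement, or the number of conditional statements in the multi-layer loop statements. The estimated mapping time of the compilation result of the first program is then calculated from the code length in the compilation result. The specific calculation method is not limited in this application.
In terms of technical effect, the code length in the compilation result of the first program can be calculated from the number of variables in the at least one conditional statement, the number of layers of the multi-layer loop statements, the fuzzy intervals of the variables, and the number of conditional statements. A fairly accurate mapping time can therefore be calculated from these parameters, and the number of variables to be processed (that is, the loop layers in which those variables are located) can be determined from that mapping time. This balances the instruction issue efficiency and the mapping time of the compilation result of the first program, so that the compilation result has both high instruction issue efficiency and a short mapping time.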
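In a possible implementation, the at least one loop statement corresponding to the first loop statement further includes a third loop statement, where the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
Specifically, processing the value interval of the first variable in the first loop statement includes: decomposing the value interval of the first variable included in the loop condition of the first loop statement into a first interval, a second interval, and a fourth interval. When the first variable takes a value in the first interval, the first conditional statement always holds. When the first variable takes a value in the second interval, the first conditional statement may or may not hold; whether it holds is jointly determined by the value of the first variable in the second interval and the other parts of the first conditional statement. The second interval is also called the fuzzy interval of the first variable. When the first variable takes a value in the fourth interval, the first conditional statement never holds, so the fourth interval can be eliminated directly. The value interval of the first variable in the first loop statement is then expanded according to the first interval, the second interval, and the fourth interval to obtain the second loop statement and the third loop statement.
In the second loop statement, the value interval of the first variable is the first interval; because the first conditional statement always holds there, the second loop statement need not contain the first conditional statement. In the third loop statement, the value interval of the first variable is the second interval; because whether the first conditional statement holds there is jointly determined by the value of the first variable in the second interval and the other parts of the first conditional statement, the third loop statement contains the first conditional statement.
It should be understood that processing the value interval of the first variable in the first loop statement is the process of expanding that value interval, that is, of expanding the first loop statement.
The other factors are the other variables in the first conditional statement; that is, whether the first conditional statement containing the first variable holds is jointly determined by the first variable and the other variables in the first conditional statement.
The third loop statement is obtained together with the second loop statement when the first loop statement is expanded. The first interval and the second interval are subsets of the value interval of the first variable in the first loop statement. The second interval is the fuzzy interval of the first variable.
In terms of technical effect, the first conditional statement always holds in the second loop statement, whereas in the third loop statement whether it holds is jointly determined by the first variable and the other variables in the first conditional statement. The first loop statement is therefore expanded into the second loop statement and the third loop statement to separate the two cases, so that the first conditional statement can be eliminated from the second loop statement. In addition, when the first conditional statement contains other variables, the third loop statement can be further expanded later to eliminate the first conditional statement.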
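The decomposition into an always-true interval, a fuzzy interval, and an always-false interval can be sketched for one simple condition shape. The sketch assumes a condition of the form a + b < bound with integer variables and constant inclusive ranges; the function name `classify_range` and this restriction to a single condition shape are illustrative assumptions, not the general procedure of the application.

```python
def classify_range(a_lo, a_hi, b_lo, b_hi, bound):
    """Split the inclusive range [a_lo, a_hi] for the condition a + b < bound,
    where b ranges over the inclusive interval [b_lo, b_hi].

    Returns (always_true, fuzzy, always_false). Each entry is an inclusive
    (lo, hi) pair, or None when that part of the range is empty.
    """
    def clip(lo, hi):
        lo, hi = max(lo, a_lo), min(hi, a_hi)
        return (lo, hi) if lo <= hi else None

    last_true = bound - b_hi - 1   # largest a for which even b_hi satisfies the condition
    first_false = bound - b_lo     # smallest a for which even b_lo violates the condition
    return (clip(a_lo, last_true),
            clip(last_true + 1, first_false - 1),
            clip(first_false, a_hi))


# Illustrative values: a in [0, 8], b in [0, 5], condition a + b < 10.
print(classify_range(0, 8, 0, 5, 10))
```

For the illustrative values, the always-true part is [0, 4], the fuzzy part is [5, 8] (the unclipped fuzzy interval [5, 9] intersected with the loop range), and the always-false part is empty.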
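In a possible implementation, the first conditional statement further includes a second variable, and the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables contained in both the loop condition of a loop statement below the instruction mapping label and the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and the at least one conditional statement, processing the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, where the value interval of the first variable in each fourth loop statement is a subset of the second interval.
In a possible implementation, the first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval; the method further includes: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable contained in the loop condition of a loop statement above the instruction mapping label, processing the third interval in the third loop statement to obtain one or more fifth loop statements, where the value interval of the second variable in each fifth loop statement is a subset of the third interval.
It should be understood that the purpose of expanding the first loop statement (or, equivalently, expanding the value interval of the first variable in the first loop statement) is to eliminate the first conditional statement that contains the first variable.
To eliminate the first conditional statement in the third loop statement, the third loop statement may be further expanded. There are two cases:
(1) The other variable contained in the first conditional statement is a variable whose value interval needs to be expanded
In addition to the first variable, the first conditional statement includes a second variable whose value interval needs to be expanded. Because the value interval of the second variable will also be expanded later, and the value interval of the first variable in the third loop statement is the second interval (the fuzzy interval), the second interval may be decomposed (specifically, in units of 1). The third loop statement is then further expanded based on the decomposed values of the first variable, so that when the value interval of the second variable is expanded later, the first conditional statement can be eliminated based on the fixed value of the first variable.
After the third loop statement is expanded, the first variable in each resulting fourth loop statement takes a single fixed value, and each fourth loop statement still includes the first conditional statement. The value interval of the first variable in each fourth loop statement is a subset of the second interval.
It should be understood that, in addition to the first variable, the first conditional statement may contain multiple variables whose value intervals need to be expanded. In that case, those variables may be treated as a whole, and the third loop statement may be expanded in the same way as when the first conditional statement contains the second variable. Details are not repeated here.
(2) The other variable contained in the first conditional statement is not a variable whose value interval needs to be expanded
Because the second variable is not a variable whose value interval needs to be expanded, its value interval in the third loop statement, that is, the third interval, may be decomposed in units of 1. For each fixed value of the second variable obtained by the decomposition, the corresponding values of the first variable are determined so as to eliminate the first conditional statement. Finally, the third loop statement is expanded based on each combination of a fixed value of the second variable and the corresponding values of the first variable, to obtain one or more fifth loop statements, such that the first conditional statement always holds in each fifth loop statement and can be eliminated from it.
The value interval of the first variable in each fifth loop statement is a subset of the second interval.
It should be understood that, in addition to the first variable, the first conditional statement may contain multiple variables, none of which is a variable whose value interval needs to be expanded. In that case, those variables may be treated as a whole (that is, as the second variable above), and the third loop statement may be expanded in the same way as in case (2). Details are not repeated here.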
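In terms of technical effect, whether the first conditional statement holds in the third loop statement is jointly determined by the values of the first variable and the second variable. The second interval of the first variable can therefore be expanded first, and the values of the second variable in each fourth loop statement determined accordingly, so that the first conditional statement always holds in each fourth loop statement and can be eliminated from it. This reduces the number of instructions generated when the compilation result is mapped and improves instruction issue efficiency, yielding a high-performance compilation result.
Likewise, because whether the first conditional statement holds in the third loop statement is jointly determined by the values of the first variable and the second variable, the value interval of the second variable in the third loop statement (the third interval) can be expanded, and the values of the first variable in each fifth loop statement determined at the same time, so that the first conditional statement always holds in each fifth loop statement and can be eliminated from it. This reduces the number of instructions generated when the compilation result is subsequently mapped and improves instruction issue efficiency.
In a possible implementation, both endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and all variables contained in the at least one conditional statement are variables contained in the loop conditions of the multi-layer loop statements.
Specifically, method 500 may be applied to a static scenario, in which the value intervals of the variables in the loop conditions of each layer of loop statement are constant intervals, and all variables contained in the at least one conditional statement are variables contained in the loop conditions of the multi-layer loop statements.
Therefore, in the static scenario, the first interval, the second interval, the fourth interval, and the third interval determined above are all constant intervals (both endpoints of each interval are definite integers). The first loop statement and the third loop statement can thus be expanded to eliminate the first conditional statement, which reduces the number of instructions obtained by mapping the compilation result.
It should be understood that the first loop statement is merely any one layer of loop statement that needs to be expanded in the multi-layer loop statements; that is, method 500 describes the processing of any one layer of loop statement that needs to be expanded. When the first program representing multi-dimensional data is compiled, other loop statements that need to be expanded may be expanded in the same way as the first loop statement. Details are not repeated here.
In addition, in the embodiments of this application, the loop statements that need to be processed may be expanded in order from the outermost loop statement to the innermost loop statement, or from the innermost loop statement to the outermost loop statement. The outermost loop statement is the topmost layer of loop statement in the first program, and the innermost loop statement is the bottommost layer of loop statement in the first program.
In terms of technical effect, in method embodiment 500, when the first variable takes a value in the first interval, the first conditional statement always holds, and the value interval of the first variable in the second loop statement is the first interval. The first conditional statement in the second loop statement can therefore be eliminated. As a result, when the compilation result of the first program (the split multi-dimensional data) is mapped, fewer instructions are produced, instruction issue efficiency during mapping is improved, and the compilation result of the first program runs with higher performance. By contrast, the prior art eliminates the first conditional statement by fully unrolling the value interval of the first variable (that is, unrolling it in units of 1). In this application, because the first conditional statement always holds when the first variable takes a value in the first interval, the value interval of the first variable in the first loop statement can be expanded in the form of intervals. This effectively reduces the code length of the second loop statement obtained after expansion, that is, the code length of the compilation result of the first program, and in turn reduces the subsequent mapping time of the compilation result of the first program.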
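The following uses Table 1 and Table 2 as examples to describe, for the static scenario, how the first program representing multi-dimensional data is compiled with the two splitting modes. The program representation of the multi-dimensional data targeted by both modes in Table 1 and Table 2 is shown in Program 1 below:
Program 1: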
Figure PCTCN2022103415-appb-000001
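Program 1 contains two layers of nested for loop statements. For the first-layer loop statement (lines 1 to 6), the loop condition is line 1 and the loop body is lines 2 to 6; the variable in the loop condition is A, with value interval [0,8], that is, the nine values 0 to 8; the conditional statement in the loop body is A+B<10. For the second-layer loop statement (lines 3 to 6), the loop condition is line 3 and the loop body is lines 4 to 6; the variable in the loop condition is B, with value interval [0,5], that is, the six values 0 to 5. Because the value intervals of variable A and variable B are both constant intervals and all variables in the conditional statement are variables contained in the loop conditions, Program 1 is a program in the static scenario.
Refer to Table 1, which shows the process of compiling Program 1 with full splitting. In this case, the variables whose value intervals need to be expanded are those contained in both the loop condition of each layer of loop statement in the multi-layer loop statements and all conditional statements in the multi-layer loop statements, that is, variable A and variable B.
In Table 1, the two layers of loop statements are expanded in order from the outermost layer to the innermost layer. Variable A is the first variable, variable B is the second variable, and the conditional statement A+B<10 is the first conditional statement.
First, the first loop statement (the first-layer loop statement) is expanded: the first interval (always-true interval), the second interval (fuzzy interval), and the fourth interval (always-false interval) of the first variable A are calculated as [0,4], [5,9], and [10,+∞), respectively. The fourth interval is disregarded, and the first loop statement is expanded according to the first interval and the second interval to obtain the second loop statement and the third loop statement.
See the first column of Table 1: lines 1 to 4 are the second loop statement, and lines 5 to 9 are the third loop statement. Because the value interval of the first variable A in the second loop statement (the always-true interval) makes the first conditional statement A+B<10 always hold, the first conditional statement can be eliminated from the second loop statement. In the third loop statement, the value interval of the first variable A is the second interval (the fuzzy interval), where whether the first conditional statement holds is jointly determined by the values of the first variable A and the second variable B, so the first conditional statement cannot be eliminated. The first interval [0,4] is represented by (A,0,5) in line 1, and the second interval [5,9] is represented by (A,0,4) in line 5. Because the original value interval of A is [0,8], only the values 5 to 8 of the second interval are iterated, which is why (A,0,4) covers four values.
Because the second variable B contained in the first conditional statement is also a variable whose value interval needs to be expanded, the second interval [5,9] must additionally be decomposed in units of 1 to facilitate the later expansion of the value interval of B. The third loop statement is then further expanded based on the fixed values of the first variable A, yielding four fourth loop statements.
See the second column of Table 1: from top to bottom, the four fourth loop statements are lines 6 to 11, lines 12 to 17, lines 18 to 23, and lines 24 to 29. The values of the first variable in them are 5, 6, 7, and 8, respectively.
In summary, the second column of Table 1 shows the second loop statement and the four fourth loop statements obtained by expanding the first loop statement.
Next, the value interval of the second variable B is expanded. The loop statements that contain the second variable B are the second loop statement and the four fourth loop statements. For the second loop statement, the first conditional statement has already been eliminated, so the second variable B does not need to be split there; that is, the second loop statement does not need to be expanded. For each fourth loop statement, because the first conditional statement contains only two variables and the first variable A has a fixed value, the first interval (always-true interval) and the fourth interval (always-false interval) of the second variable B can be calculated. From top to bottom, the first intervals of B in the four fourth loop statements are [0,4], [0,3], [0,2], and [0,1]. Each fourth loop statement is then expanded based on the first interval of B, yielding the program in the third column of Table 1, which is the compilation result of Program 1.
As shown in the third column of Table 1, the first conditional statement has been eliminated from each expanded fourth loop statement.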
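The result of full splitting can be checked with a small sketch. Program 1 and Table 1 are only available as figures, so the two functions below are reconstructions from the textual description (A in [0, 8], B in [0, 5], condition A+B<10, instruction mapping label omitted); the function names are hypothetical, and the four unit-width loops of the third column are written compactly as one loop over (A, count of B values) pairs.

```python
def program_one():
    """Original loop nest: the condition is tested on every iteration."""
    visited = []
    for a in range(0, 9):          # A in [0, 8]
        for b in range(0, 6):      # B in [0, 5]
            if a + b < 10:         # first conditional statement
                visited.append((a, b))
    return visited


def program_one_full_split():
    """Loop nest after full splitting: no condition remains."""
    visited = []
    for a in range(0, 5):          # always-true interval of A: [0, 4]
        for b in range(0, 6):
            visited.append((a, b))
    # Fuzzy values of A (5 to 8), one loop each. For a fixed A the
    # always-true interval of B is [0, 9 - A], i.e. 10 - A values.
    for a, b_count in ((5, 5), (6, 4), (7, 3), (8, 2)):
        for b in range(0, b_count):
            visited.append((a, b))
    return visited


print(program_one() == program_one_full_split())
```

Both versions visit the same (A, B) pairs in the same order, while the split version contains five loop nests rather than the nine that unrolling A in units of 1 would produce.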
Table 1: Compilation process with full splitting in the static scenario
Figure PCTCN2022103415-appb-000002
Table 2: Compilation process with partial splitting in the static scenario
Figure PCTCN2022103415-appb-000003
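Refer to Table 2, which shows the process of compiling Program 1 with partial splitting. In this case, the variables whose value intervals need to be expanded are those contained in both the loop condition of a loop statement below the instruction mapping label and all conditional statements in the multi-layer loop statements, that is, variable B.
The first variable is the variable whose value interval needs to be expanded, namely variable B; the second variable is the variable whose value interval does not need to be expanded, namely variable A.
First, the first loop statement (the second-layer loop statement of the multi-layer loop statements) is expanded. Because the second variable A is in the layer outside the first variable B, the value interval of the second variable A is decomposed first. When the value of A lies in [0,4], [5,9], and [10,+∞), the first conditional statement, whatever value the first variable B takes, respectively always holds, may or may not hold, and never holds. Thus the first interval (always-true interval), the second interval (fuzzy interval), and the fourth interval (always-false interval) of the first variable B are all [0,5]. The fourth interval is disregarded, and the first loop statement is expanded according to the combinations of the value intervals of the first variable B and the second variable A to obtain the second loop statement and the third loop statement.
See the first column of Table 2: lines 2 to 6 are the second loop statement, and lines 7 to 12 are the third loop statement. Because the combination of values of the first variable B and the second variable A in the second loop statement makes the first conditional statement A+B<10 always hold, the first conditional statement can be eliminated from the second loop statement. In the third loop statement, whether the first conditional statement holds is jointly determined by the values of the first variable B and the second variable A, so the first conditional statement cannot be eliminated. The first interval [0,5] is represented by (B,0,6) in line 4, and the second interval [0,5] is represented by (B,0,6) in line 9.
Further, to eliminate the first conditional statement in the third loop statement, the value interval of the second variable A in the third loop statement (that is, the third interval) may be decomposed in units of 1. The third loop statement is then further expanded based on the fixed values of the second variable A, yielding four fifth loop statements.
See the second column of Table 2: from top to bottom, the four fifth loop statements are lines 7 to 11, lines 12 to 16, lines 17 to 21, and lines 22 to 26. The values of the second variable A in them are 5, 6, 7, and 8, and the corresponding value intervals of the first variable B are [0,4], [0,3], [0,2], and [0,1]. As shown in the second column of Table 2, the first conditional statement has been eliminated from each expanded fifth loop statement.
In summary, the second column of Table 2 shows the second loop statement and the four fifth loop statements obtained by expanding the first loop statement, which is the compilation result of Program 1.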
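The partial-splitting steps can be sketched as a small generator of loop descriptors. The sketch assumes the condition shape a + b < bound with constant inclusive ranges, where a is the outer (second) variable and b is the inner (first) variable below the label; the names `inner_range_for`, `partial_split`, and `iterate` are hypothetical helpers, not part of the application.

```python
def inner_range_for(a, b_lo, b_hi, bound):
    """Inclusive range of b in [b_lo, b_hi] with a + b < bound, or None if empty."""
    hi = min(b_hi, bound - a - 1)
    return (b_lo, hi) if b_lo <= hi else None


def partial_split(a_lo, a_hi, b_lo, b_hi, bound):
    """Return ((a_lo, a_hi), (b_lo, b_hi)) loop pairs that need no condition."""
    loops = []
    last_true = min(a_hi, bound - b_hi - 1)   # end of the always-true part of a
    if a_lo <= last_true:
        loops.append(((a_lo, last_true), (b_lo, b_hi)))
    # Remaining values of a are taken one at a time; b is narrowed for each.
    for a in range(max(a_lo, last_true + 1), a_hi + 1):
        b_range = inner_range_for(a, b_lo, b_hi, bound)
        if b_range is not None:
            loops.append(((a, a), b_range))
    return loops


def iterate(loops):
    """Enumerate the (a, b) pairs visited by the generated loops, in order."""
    return [(a, b)
            for (a_lo, a_hi), (b_lo, b_hi) in loops
            for a in range(a_lo, a_hi + 1)
            for b in range(b_lo, b_hi + 1)]


loops = partial_split(0, 8, 0, 5, 10)
print(loops)
```

For the ranges of Program 1, the descriptors consist of one loop pair for A in [0, 4] with B in [0, 5], followed by four pairs for A = 5, 6, 7, 8 with B limited to [0, 4], [0, 3], [0, 2], and [0, 1], matching the intervals listed for the fifth loop statements.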
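The following uses the examples in Table 3 and Table 4 to describe how the code length in the compilation result and the number of instructions obtained by subsequent mapping differ between full splitting and partial splitting.
As shown in Table 3, the first column is the program representation of the multi-dimensional data, and the second column is the compilation result obtained with full splitting. Lines 3, 7, 11, and 15 of the second column each stand for lines 3 to 13 of the first column, so the compilation result obtained with full splitting expands the code length by a factor of about 4 relative to the original program. The third column is the compilation result obtained with partial splitting, that is, splitting the variable outer2 and eliminating the branch condition within the layer; this compilation result expands the code length by a factor of about 1 to 2 relative to the original program.
Table 3: Comparison of code length in the compilation result under different compilation modes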
Figure PCTCN2022103415-appb-000004
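Table 4 compares the number of instructions obtained by mapping the compilation result under different compilation modes. The first column of Table 4 is the program representation of the multi-dimensional data. Each time the data represented by this program is processed, the instruction mapping label has to be moved down below the if conditional statement, so 4608 instructions are obtained by mapping; that is, the instruction actually runs 4608 times. The second column is the compilation result obtained with full splitting, which is mapped to 9 instructions: specifically, the combination of the first-layer loop statement and the second loop statement is mapped to 6 instructions, and the combination of the first-layer loop statement and the third loop statement is mapped to 3 instructions. Similarly, the compilation result obtained with partial splitting in the third column of Table 4 is also mapped to 9 instructions. Compared with the original instruction mapping, both compilation modes improve instruction issue efficiency by a factor of 4608/9 = 512.
It should be noted that partial splitting is an incomplete splitting mode: some if conditional statements still remain in the compilation result, which affects instruction issue efficiency to some extent. Therefore, when the mapping time is not a concern, full splitting may be used to obtain higher-performance instructions.
Table 4: Comparison of the number of instructions obtained by mapping the compilation result under different compilation modes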
Figure PCTCN2022103415-appb-000005
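Refer to FIG. 6. FIG. 6 is a flowchart of another program compilation method according to an embodiment of this application. The method 600 includes steps S610, S620, S630, and S640. Method 600 is applied to a dynamic scenario.
Step S610: Obtain a second program, where the second program includes multi-layer loop statements, the loop condition of each layer of loop statement includes a variable and a value interval of the variable, and the loop body of the multi-layer loop statements includes at least one conditional statement.
Step S620: The multi-layer loop statements include a fifth loop statement, where the fifth loop statement is one layer of loop statement in the multi-layer loop statements, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables contained in a second conditional statement in the at least one conditional statement.
Step S630: Update a first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, where the updated first value interval makes the second conditional statement always hold.
Step S640: Compile the second program based on the sixth loop statement to obtain a compilation result of the second program, where the compilation result of the second program is related to the sixth loop statement.
The second program that includes the multi-layer loop statements may be used to represent multi-dimensional data, and the compilation result of the second program may correspond to a program representation of the multi-dimensional data after splitting.
The compilation result of the second program is a program representation in a high-level language (for example, C, C++, or Python). It can be converted into instructions executable on hardware (for example, a CPU or a GPU) only after subsequent processing such as instruction mapping.
It should be understood that processing the value interval of the third variable in the fifth loop statement means updating the value interval of the third variable. When the second program further includes other loop statements that need to be processed, the value intervals of the variables in the loop conditions of those loop statements may be updated one by one in the same way as the value interval of the third variable in the fifth loop statement. After the value interval in the loop condition of the last loop statement that needs to be processed has been updated, the compilation result of the second program is obtained; that is, the compilation result of the second program is related to the sixth loop statement.
In addition, in the embodiments of this application, the loop statements that need to be processed may be handled in order from the outermost loop statement to the innermost loop statement, or from the innermost loop statement to the outermost loop statement. The outermost loop statement is the topmost layer of loop statement in the second program, and the innermost loop statement is the bottommost layer of loop statement in the second program.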
下面以表5第一列中程序为例来描述动态场景中,用于表征多维数据的程序结构:在表5中,表征多维数据的程序包括两层嵌套的for循环语句,每层循环语句包括一个循环条件,该两层循环语句的循环条件分别为表5第一列中的第1行和第3行代码。每层循环语句的循环条件包括一个变量和该一个变量的取值区间,例如,在第一层循环语句的循环条件中的变量为A,其取值区间为[0,8],在第一行代码中的表示为(A,0,9),可以表示A的取值为0-9共9个数值。
可选的,该循环语句可以是for循环语句。
此外,如表5第一列所示,最上方的一层(即最外层)循环语句的范围为:第1行-第6行代码;最下方一层(最内层)循环语句的范围为:第3行-第6行代码。第一循环语句可以是多层循环语句中的一层循环语句。
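In a feasible implementation, the second program includes an instruction mapping label, and the third variable is one of the variables that are included both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement.
The second program representing each piece of multidimensional data contains one instruction mapping label.
Specifically, in the dynamic scenario, the variables whose value intervals need to be updated are those included both in the loop conditions of the loop statements below the instruction mapping label and in the conditional statements of the multiple layers of loop statements; the third variable is one such variable. For example, in the first column of Table 5, the variable whose value interval needs to be updated is variable B.
In terms of technical effect, only the value intervals of the variables shared by the loop conditions below the instruction mapping label and the at least one conditional statement are updated, rather than those of the variables shared by the loop conditions of all layers of loop statements and the at least one conditional statement. This reduces the number of loop statements to be processed, which shortens the code of the compilation result and yields a shorter mapping duration for the compilation result of the second program.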
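The selection rule described here can be sketched as a small Python helper. The function name, the list-based program model, and the `label_index` parameter are hypothetical illustrations, not part of the application.

```python
def variables_to_update(loop_vars, label_index, condition_vars):
    """Return the loop variables whose value intervals would be updated.

    loop_vars: loop variables ordered from outermost to innermost.
    label_index: number of loop layers located above the instruction
        mapping label (hypothetical way to model the label position).
    condition_vars: variables appearing in the conditional statements.
    """
    below_label = loop_vars[label_index:]
    return [v for v in below_label if v in condition_vars]


# Table 5 example: loops over A and B, label between them, guard A + B < n.
print(variables_to_update(["A", "B"], 1, {"A", "B", "n"}))
```

For the Table 5 program this selects only B, because A's loop sits above the label and n is not a loop variable.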
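In a feasible implementation, updating the first value interval of the third variable in the fifth loop statement includes: obtaining a second value interval of the third variable based on the second conditional statement; and updating the first value interval with the intersection of the first value interval and the second value interval.
Specifically, the terms of the second conditional statement are rearranged to obtain the second value interval of the third variable, and the intersection of the second value interval and the first value interval is then computed. In the dynamic scenario, at least one of the four endpoints of the second and first value intervals is not a constant, so the endpoint values have to be compared. Let the left and right endpoints of the second value interval be the first endpoint and the second endpoint, let the left and right endpoints of the first value interval be the third endpoint and the fourth endpoint, and let the left and right endpoints of the updated first value interval be the fifth endpoint and the sixth endpoint. The intersection of the first value interval and the second value interval can then be computed according to formula (1).
fifth endpoint = max(first endpoint, third endpoint)
sixth endpoint = min(second endpoint, fourth endpoint)    (1)
Formula (1) means that the larger of the first endpoint and the third endpoint is taken as the fifth endpoint, and the smaller of the second endpoint and the fourth endpoint is taken as the sixth endpoint.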
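Formula (1) is the usual intersection of two closed intervals. A minimal Python sketch follows; the function name and the tuple representation are hypothetical, and returning `None` for an empty intersection is an added assumption that the text does not discuss.

```python
def intersect(first_interval, second_interval):
    """Intersect two closed intervals given as (left, right) tuples.

    Returns None when the intervals do not overlap (an assumption added
    for this sketch).
    """
    third, fourth = first_interval    # endpoints of the first value interval
    first, second = second_interval   # endpoints of the second value interval
    fifth = max(first, third)         # left endpoint of the updated interval
    sixth = min(second, fourth)       # right endpoint of the updated interval
    return (fifth, sixth) if fifth <= sixth else None


print(intersect((0, 5), (float("-inf"), 3)))
```

Here the first value interval is [0,5] and the second is unbounded on the left, so only the right endpoint changes.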
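While the first value interval is updated with the intersection of the first value interval and the second value interval, formula (1) is inserted into the program above the instruction mapping label, so as to eliminate the second conditional statement.
Further, because formula (1) is located above the instruction mapping label, it can be expanded with if conditional expressions to obtain the sixth loop statement.
Optionally, formula (1) may be expanded so that the resulting sixth loop statement contains four conditional branches, each comparing the two left endpoints (first and third) and the two right endpoints (second and fourth):
1) if (the first endpoint is greater than the third endpoint, and the second endpoint is greater than the fourth endpoint), the updated first value interval is [first endpoint, fourth endpoint].
2) if (the first endpoint is greater than the third endpoint, and the second endpoint is less than or equal to the fourth endpoint), the updated first value interval is [first endpoint, second endpoint].
3) if (the first endpoint is less than or equal to the third endpoint, and the second endpoint is less than or equal to the fourth endpoint), the updated first value interval is [third endpoint, second endpoint].
4) if (the first endpoint is less than or equal to the third endpoint, and the second endpoint is greater than the fourth endpoint), the updated first value interval is [third endpoint, fourth endpoint].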
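Expanding formula (1) into explicit branches amounts to comparing the two left endpoints with each other and the two right endpoints with each other. The following Python sketch (function name hypothetical) uses the endpoint naming defined with formula (1): the first and second endpoints belong to the second value interval, and the third and fourth endpoints belong to the first value interval.

```python
def intersect_by_branches(first, second, third, fourth):
    """Compute (max(first, third), min(second, fourth)) with four branches."""
    if first > third and second > fourth:
        return (first, fourth)
    if first > third and second <= fourth:
        return (first, second)
    if first <= third and second <= fourth:
        return (third, second)
    return (third, fourth)  # first <= third and second > fourth
```

Each branch fixes both endpoints without calling max or min, which is the form the text describes placing above the instruction mapping label.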
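In terms of technical effect, a second value interval in which the inequality always holds can be derived from the second conditional statement. After the first value interval is updated with the intersection of the first value interval and the second value interval, the updated first value interval makes the second conditional statement in the fifth loop statement always hold. The second conditional statement can therefore be eliminated without fully unrolling the fifth loop statement. This effectively shortens the code of the resulting sixth loop statement and improves the runtime performance of the compilation result, and it also reduces the number of instructions obtained by mapping the compilation result of the second program, improving instruction issue efficiency.
In a feasible implementation, at least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is not a constant, and/or each of the at least one conditional statement further includes a fourth variable, where the fourth variable is a variable not included in the loop condition of any layer of loop statement.
Each conditional statement contains at most one variable whose value interval needs to be updated.
Specifically, the dynamic scenario to which method 600 applies is as follows: at least one of the two endpoints of the value interval of a variable in a loop condition is not a constant, and/or each conditional statement further includes a fourth variable; in addition, each conditional statement contains at most one variable whose value interval needs to be updated. In this scenario, at least one of the four endpoints of the second and first value intervals is not a constant, so formula (1) and the updated first value interval can be used to eliminate the second conditional statement. Moreover, because none of these endpoints contains a variable whose value interval needs to be updated, formula (1) can be placed above the instruction mapping label. Formula (1) can then be expanded with conditional statements to update the fifth loop statement and obtain the sixth loop statement. Because formula (1) is above the instruction mapping label, expanding it does not increase the number of instructions generated by mapping the compilation result.
It should be understood that the fifth loop statement is simply any layer of the multiple layers of loop statements whose loop-condition value interval needs to be updated; that is, method 600 describes how any such loop statement is processed. When a program representing multidimensional data is compiled in the dynamic scenario, the other loop statements whose value intervals need to be updated are processed in the same way as the fifth loop statement, and details are not repeated here. After all such loop statements in the second program have been processed, the compilation result of the second program is obtained; therefore, the compilation result of the second program is related to the sixth loop statement.
In addition, in the dynamic scenario, the value intervals of the variables in the loop conditions of all loop statements that need to be processed may be updated in any order.
In terms of technical effect, in method embodiment 600, the second conditional statement always holds when the third variable takes a value in the updated first value interval, so the second conditional statement in the fifth loop statement can be eliminated. This reduces the number of instructions obtained when the compilation result of the second program (the split multidimensional data) is subsequently mapped, and improves instruction issue efficiency during mapping. Furthermore, the prior art eliminates the conditional statement by fully unrolling the loop condition of the third variable (that is, expanding the value interval of the third variable in steps of 1). By contrast, the embodiments of this application update the value interval of the third variable as a whole so that the second conditional statement always holds. This effectively shortens the code of the resulting sixth loop statement and of the compilation result derived from it, which in turn reduces the subsequent mapping duration and improves the runtime performance of the compilation result.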
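The compilation process of the second program in the dynamic scenario is described below with reference to the example in Table 5. The first column of Table 5 is the program representation of the multidimensional data.
As shown in the first column of Table 5, the program includes two nested layers of for loop statements and the second conditional statement if(A+B<n), and line 2 of the code is the instruction mapping label. The second conditional statement contains a fourth variable n. The variables whose value intervals need to be updated are those included both in the loop conditions of the loop statements below the instruction mapping label and in the conditional statements of the multiple layers of loop statements, which here is variable B.
In summary, lines 3 to 6 of the program are the fifth loop statement in method 600, and variable B is the third variable in method 600. The first value interval and the second value interval of the third variable B are [0,5] and [-∞,n-A], respectively.
The intersection of the first value interval and the second value interval is then computed. Because 0 is greater than -∞, only the sixth endpoint B_ext of the updated first value interval needs to be computed; it is the smaller of 5 and n-A, that is, B_ext=min(n-A,5), and the intersection of the first value interval and the second value interval is [0,B_ext]. The program is updated with both B_ext=min(n-A,5) and [0,B_ext] to eliminate the second conditional statement, giving the program code shown in the second column of Table 5.
Finally, line 2 of the code in the second column of Table 5 is expanded with if conditional statements to obtain lines 2 to 12 of the code in the third column of Table 5, that is, the sixth loop statement.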
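The three forms of this example can be sketched in Python. Table 5 is an image and is not reproduced here, so the code below is an illustration under stated assumptions rather than the table's actual content: the function names are hypothetical, and the bound is written in half-open form (exclusive upper bound 6 for the interval [0,5]) so that the strict inequality A+B<n is preserved exactly.

```python
def guarded(n):
    """First form: the guard is evaluated in every iteration."""
    visited = []
    for a in range(0, 9):              # (A, 0, 9)
        for b in range(0, 6):          # B in [0, 5]
            if a + b < n:              # second conditional statement
                visited.append((a, b))
    return visited


def updated(n):
    """Second form: the bound of B is narrowed, so the guard is dropped."""
    visited = []
    for a in range(0, 9):
        b_ext = min(n - a, 6)          # exclusive bound, once per value of A
        for b in range(0, b_ext):      # a + b < n holds for every b here
            visited.append((a, b))
    return visited


def expanded(n):
    """Third form: the min() is expanded into explicit branches."""
    visited = []
    for a in range(0, 9):
        if n - a < 6:
            for b in range(0, n - a):
                visited.append((a, b))
        else:
            for b in range(0, 6):
                visited.append((a, b))
    return visited
```

All three functions visit the same (A, B) pairs for any integer n, including values of n for which the narrowed interval is empty.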
Table 5: Program compilation process in the dynamic scenario
Figure PCTCN2022103415-appb-000006
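The program in Table 6 is used below as an example to compare, in the dynamic scenario, the number of instructions generated by mapping the compilation results of the prior art and of this solution. As shown in the first column of Table 6, the instruction mapping label is on line 3 of the code. When the for loop statements are not expanded, 4608 instructions are obtained by mapping, that is, instructions are actually run 4608 times. With the embodiments of this application, the compilation result maps to 9 instructions: in the second column of Table 6, the combination of the first-layer loop statement and the conditional branch under line 3 of the code maps to 6 instructions, and the combination of the first-layer loop statement and the conditional branch under line 7 maps to 3 instructions.
That is, with the method of the embodiments of this application, processing of the multidimensional data is completed in 9 executions, and instruction issue efficiency improves by a factor of 4608/9 = 512.
Table 6: Number of instructions generated by mapping the compilation result in the dynamic scenario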
Figure PCTCN2022103415-appb-000007
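Refer to FIG. 7, which is a schematic structural diagram of a compilation apparatus 700 according to an embodiment of this application. The apparatus 700 includes:
an obtaining unit 701, configured to obtain a first program, where the first program includes multiple layers of loop statements, the loop condition of each layer of loop statement includes a variable and a value interval of the variable, and the loop body of the multiple layers of loop statements includes at least one conditional statement; the multiple layers of loop statements include a first loop statement, where the first loop statement is one layer of the multiple layers of loop statements, the variable included in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables included in a first conditional statement of the at least one conditional statement;
a processing unit 702, configured to process the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, where the at least one loop statement includes a second loop statement, both the first loop statement and the second loop statement include the first variable, the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always hold; and
a compilation unit 703, configured to compile the first program based on the at least one loop statement to obtain a compilation result of the first program, where the compilation result of the first program is related to the at least one loop statement.
In a feasible implementation, the first program includes an instruction mapping label. When the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables included both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, where the estimated mapping time is the time required to map the compilation result of the first program to instructions executable on hardware. When the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables included both in the loop conditions of the layers of the multiple layers of loop statements and in the at least one conditional statement.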
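The threshold rule in this implementation can be sketched as follows. The function name, the list-based program model, and the `label_index` parameter are hypothetical, and how the estimated mapping time is actually computed is left open here; the sketch only shows how the comparison against the preset time changes the set of candidate variables.

```python
def choose_split_variables(loop_vars, label_index, condition_vars,
                           estimated_time, preset_time):
    """Pick the loop variables that are candidates for splitting.

    If the estimated mapping time exceeds the preset time, only loops below
    the instruction mapping label are considered (partial splitting);
    otherwise every loop layer is considered (full splitting).
    """
    if estimated_time > preset_time:
        candidates = loop_vars[label_index:]
    else:
        candidates = loop_vars
    return [v for v in candidates if v in condition_vars]


# Hypothetical two-layer program with the label between the loops.
print(choose_split_variables(["A", "B"], 1, {"A", "B"}, 10.0, 5.0))
print(choose_split_variables(["A", "B"], 1, {"A", "B"}, 3.0, 5.0))
```

With a long estimated mapping time only B is split; with a short one both A and B are candidates.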
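In a feasible implementation, the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables included in the at least one conditional statement, the number of layers of the multiple layers of loop statements, the fuzzy intervals of the variables included in the at least one conditional statement, or the number of conditional statements. When a variable takes a value in its fuzzy interval, whether a conditional statement containing the variable holds also depends on other factors.
In a feasible implementation, the at least one loop statement corresponding to the first loop statement further includes a third loop statement, where the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
In a feasible implementation, the first conditional statement further includes a second variable, and the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables included both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables included both in the loop conditions of the layers of the multiple layers of loop statements and in the at least one conditional statement, process the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements, where the value interval of the first variable in each fourth loop statement is a subset of the second interval.
In a feasible implementation, the first conditional statement further includes a second variable, and the value interval of the second variable in the third loop statement is a third interval; the processing unit is further configured to: when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable included in the loop condition of a loop statement above the instruction mapping label, process the third interval in the third loop statement to obtain one or more fifth loop statements, where the value interval of the second variable in each fifth loop statement is a subset of the third interval.
In a feasible implementation, both endpoints of the value interval of the variable included in the loop condition of each layer of loop statement are constants, and all variables included in the at least one conditional statement are variables included in the loop conditions of the multiple layers of loop statements.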
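Refer to FIG. 8, which is a schematic structural diagram of a compilation apparatus 800 according to an embodiment of this application. The apparatus 800 includes:
an obtaining unit 801, configured to obtain a second program, where the second program includes multiple layers of loop statements, the loop condition of each layer of loop statement includes a variable and a value interval of the variable, and the loop body of the multiple layers of loop statements includes at least one conditional statement; the multiple layers of loop statements include a fifth loop statement, where the fifth loop statement is one layer of the multiple layers of loop statements, the variable included in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables included in a second conditional statement of the at least one conditional statement;
an updating unit 802, configured to update a first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, where the updated first value interval makes the second conditional statement always hold; and
a compilation unit 803, configured to compile the second program based on the sixth loop statement to obtain a compilation result of the second program, where the compilation result of the second program is related to the sixth loop statement.
In a feasible implementation, the second program includes an instruction mapping label, and the third variable is one of the variables included both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement.
In a feasible implementation, in the aspect of updating the first value interval of the third variable in the fifth loop statement, the updating unit is specifically configured to: obtain a second value interval of the third variable based on the second conditional statement; and update the first value interval with the intersection of the first value interval and the second value interval.
In a feasible implementation, at least one of the two endpoints of the value interval of the variable included in the loop condition of each layer of loop statement is not a constant, and/or each of the at least one conditional statement further includes a fourth variable, where the fourth variable is a variable not included in the loop condition of any layer of loop statement.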
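It should be understood that apparatuses 700 and 800 are embodied in the form of functional units. The term "unit" here may refer to an application-specific integrated circuit (ASIC), an electronic circuit, a processor (for example, a shared processor, a dedicated processor, or a group processor) and memory for executing one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functions. In an optional example, a person skilled in the art can understand that apparatuses 700 and 800 may be configured to perform the procedures and/or steps corresponding to the compiler in method embodiments 500 and/or 600. To avoid repetition, details are not described here again.
Refer to FIG. 9, which is a schematic diagram of the hardware structure of a compilation apparatus according to an embodiment of this application. As shown in FIG. 9, the apparatus may include a memory 901, one or more processors 902 (only one is shown in the figure), a communication interface 903, and a bus 904. The memory 901, the processor 902, and the communication interface 903 are communicatively connected to one another through the bus 904.
The memory 901 is configured to store instructions, and the processor 902 is configured to invoke the instructions stored in the memory 901. The instructions may be the programs in embodiments 500 and/or 600 above.
The processor 902 is specifically configured to obtain the programs in embodiments 500 and/or 600 and to perform the corresponding compilation methods in embodiments 500 and/or 600.
The compilation apparatus of the embodiments of this application can, at the compilation stage, use the compilation method of the embodiments of this application to split multidimensional data irregularly so as to eliminate the conditional inequalities in the program. The resulting compilation result has high instruction issue efficiency during mapping, while its code is shorter and its mapping duration is lower.
It should be understood that the apparatus 900 may specifically be a computer and may be configured to perform the steps and/or procedures corresponding to the compiler in method embodiments 500 and 600.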
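The memory 901 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 901 may store a program. When the program stored in the memory 901 is executed by the processor 902, the processor 902 and the communication interface 903 are configured to perform the steps of the compilation method of the embodiments of this application.
The processor 902 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs so as to implement the functions to be performed by the units in the compilation apparatus of the embodiments of this application, or to perform the compilation method of the method embodiments of this application.
The processor 902 may alternatively be an integrated circuit chip with signal processing capability. During implementation, the steps of the compilation method of this application may be completed by instructions in the form of software in the processor 902. The processor 902 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed with reference to the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. A software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 901; the processor 902 reads the information in the memory 901 and, in combination with its hardware, completes the functions to be performed by the units included in the compilation apparatus of the embodiments of this application, or performs the compilation method of the method embodiments of this application.
The communication interface 903 uses a transceiver apparatus such as, but not limited to, a transceiver to implement communication between the apparatus 900 and other devices or communication networks. For example, a program may be obtained through the communication interface 903.
The bus 904 may include a path for transferring information between the components of the apparatus 900 (for example, the memory 901, the processor 902, and the communication interface 903).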
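A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be understood by referring to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.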

Claims (26)

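  1. A program compilation method, wherein the method comprises:
    obtaining a first program, wherein the first program comprises multiple layers of loop statements, the loop condition of each layer of loop statement in the multiple layers of loop statements comprises a variable and a value interval of the variable, and the loop body of the multiple layers of loop statements comprises at least one conditional statement;
    the multiple layers of loop statements comprise a first loop statement, wherein the first loop statement is one layer of the multiple layers of loop statements, the variable comprised in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables comprised in a first conditional statement of the at least one conditional statement;
    processing the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, wherein the at least one loop statement comprises a second loop statement, both the first loop statement and the second loop statement comprise the first variable, the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always hold; and
    compiling the first program based on the at least one loop statement to obtain a compilation result of the first program, wherein the compilation result of the first program is related to the at least one loop statement.
  2. The method according to claim 1, wherein the first program comprises an instruction mapping label;
    when the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables comprised both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, wherein the estimated mapping time is the time required to map the compilation result of the first program to instructions executable on hardware; and
    when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables comprised both in the loop conditions of the layers of the multiple layers of loop statements and in the at least one conditional statement.
  3. The method according to claim 2, wherein the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables comprised in the at least one conditional statement, the number of layers of the multiple layers of loop statements, the fuzzy intervals of the variables comprised in the at least one conditional statement, or the number of conditional statements;
    wherein when a variable takes a value in the fuzzy interval of the variable, whether a conditional statement comprising the variable holds also depends on other factors.
  4. The method according to claim 2 or 3, wherein the at least one loop statement corresponding to the first loop statement further comprises a third loop statement;
    wherein the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
  5. The method according to claim 4, wherein the first conditional statement further comprises a second variable, and the method further comprises:
    when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables comprised both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables comprised both in the loop conditions of the layers of the multiple layers of loop statements and in the at least one conditional statement, processing the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements;
    wherein the value interval of the first variable in each fourth loop statement is a subset of the second interval.
  6. The method according to claim 4, wherein the first conditional statement further comprises a second variable, and the value interval of the second variable in the third loop statement is a third interval; and the method further comprises:
    when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable comprised in the loop condition of a loop statement above the instruction mapping label, processing the third interval in the third loop statement to obtain one or more fifth loop statements;
    wherein the value interval of the second variable in each fifth loop statement is a subset of the third interval.
  7. The method according to any one of claims 1 to 6, wherein both endpoints of the value interval of the variable comprised in the loop condition of each layer of loop statement are constants, and all variables comprised in the at least one conditional statement are variables comprised in the loop conditions of the multiple layers of loop statements.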
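  8. A program compilation method, wherein the method comprises:
    obtaining a second program, wherein the second program comprises multiple layers of loop statements, the loop condition of each layer of loop statement comprises a variable and a value interval of the variable, and the loop body of the multiple layers of loop statements comprises at least one conditional statement;
    the multiple layers of loop statements comprise a fifth loop statement, wherein the fifth loop statement is one layer of the multiple layers of loop statements, the variable comprised in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables comprised in a second conditional statement of the at least one conditional statement;
    updating a first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, wherein the updated first value interval makes the second conditional statement always hold; and
    compiling the second program based on the sixth loop statement to obtain a compilation result of the second program, wherein the compilation result of the second program is related to the sixth loop statement.
  9. The method according to claim 8, wherein the second program comprises an instruction mapping label; and
    the third variable is one of the variables comprised both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement.
  10. The method according to claim 8 or 9, wherein the updating a first value interval of the third variable in the fifth loop statement comprises:
    obtaining a second value interval of the third variable based on the second conditional statement; and updating the first value interval with the intersection of the first value interval and the second value interval.
  11. The method according to any one of claims 8 to 10, wherein at least one of the two endpoints of the value interval of the variable comprised in the loop condition of each layer of loop statement is not a constant, and/or each of the at least one conditional statement further comprises a fourth variable, wherein the fourth variable is a variable not comprised in the loop condition of any layer of loop statement.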
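  12. A program compilation apparatus, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a first program, wherein the first program comprises multiple layers of loop statements, the loop condition of each layer of loop statement in the multiple layers of loop statements comprises a variable and a value interval of the variable, and the loop body of the multiple layers of loop statements comprises at least one conditional statement; the multiple layers of loop statements comprise a first loop statement, wherein the first loop statement is one layer of the multiple layers of loop statements, the variable comprised in the loop condition of the first loop statement is a first variable, and the first variable is one of the variables comprised in a first conditional statement of the at least one conditional statement;
    a processing unit, configured to process the value interval of the first variable in the first loop statement to obtain at least one loop statement corresponding to the first loop statement, wherein the at least one loop statement comprises a second loop statement, both the first loop statement and the second loop statement comprise the first variable, the value interval of the first variable in the second loop statement is a first interval, the first interval is a subset of the value interval of the first variable in the first loop statement, and the first interval makes the first conditional statement always hold; and
    a compilation unit, configured to compile the first program based on the at least one loop statement to obtain a compilation result of the first program, wherein the compilation result of the first program is related to the at least one loop statement.
  13. The apparatus according to claim 12, wherein the first program comprises an instruction mapping label;
    when the estimated mapping time of the compilation result of the first program is greater than a preset time, the first variable is one of the variables comprised both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, wherein the estimated mapping time is the time required to map the compilation result of the first program to instructions executable on hardware; and
    when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time, the first variable is one of the variables comprised both in the loop conditions of the layers of the multiple layers of loop statements and in the at least one conditional statement.
  14. The apparatus according to claim 12, wherein the estimated mapping time of the compilation result of the first program is determined based on one or more of the following: the number of variables comprised in the at least one conditional statement, the number of layers of the multiple layers of loop statements, the fuzzy intervals of the variables comprised in the at least one conditional statement, or the number of conditional statements;
    wherein when a variable takes a value in the fuzzy interval of the variable, whether a conditional statement comprising the variable holds also depends on other factors.
  15. The apparatus according to claim 13 or 14, wherein the at least one loop statement corresponding to the first loop statement further comprises a third loop statement;
    wherein the value interval of the first variable in the third loop statement is a second interval, and when the first variable takes a value in the second interval, whether the first conditional statement holds also depends on other factors.
  16. The apparatus according to claim 15, wherein the first conditional statement further comprises a second variable, and the processing unit is further configured to:
    when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is one of the variables comprised both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement, or when the estimated mapping time of the compilation result of the first program is less than or equal to the preset time and the second variable is one of the variables comprised both in the loop conditions of the layers of the multiple layers of loop statements and in the at least one conditional statement, process the second interval of the first variable in the third loop statement to obtain one or more fourth loop statements;
    wherein the value interval of the first variable in each fourth loop statement is a subset of the second interval.
  17. The apparatus according to claim 15, wherein the first conditional statement further comprises a second variable, and the value interval of the second variable in the third loop statement is a third interval; and the processing unit is further configured to:
    when the estimated mapping time of the compilation result of the first program is greater than the preset time and the second variable is a variable comprised in the loop condition of a loop statement above the instruction mapping label, process the third interval in the third loop statement to obtain one or more fifth loop statements;
    wherein the value interval of the second variable in each fifth loop statement is a subset of the third interval.
  18. The apparatus according to any one of claims 12 to 17, wherein both endpoints of the value interval of the variable comprised in the loop condition of each layer of loop statement are constants, and all variables comprised in the at least one conditional statement are variables comprised in the loop conditions of the multiple layers of loop statements.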
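  19. A program compilation apparatus, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a second program, wherein the second program comprises multiple layers of loop statements, the loop condition of each layer of loop statement comprises a variable and a value interval of the variable, and the loop body of the multiple layers of loop statements comprises at least one conditional statement; the multiple layers of loop statements comprise a fifth loop statement, wherein the fifth loop statement is one layer of the multiple layers of loop statements, the variable comprised in the loop condition of the fifth loop statement is a third variable, and the third variable is one of the variables comprised in a second conditional statement of the at least one conditional statement;
    an updating unit, configured to update a first value interval of the third variable in the fifth loop statement to obtain a sixth loop statement, wherein the updated first value interval makes the second conditional statement always hold; and
    a compilation unit, configured to compile the second program based on the sixth loop statement to obtain a compilation result of the second program, wherein the compilation result of the second program is related to the sixth loop statement.
  20. The apparatus according to claim 19, wherein the second program comprises an instruction mapping label; and
    the third variable is one of the variables comprised both in the loop conditions of the loop statements below the instruction mapping label and in the at least one conditional statement.
  21. The apparatus according to claim 20, wherein in the aspect of updating the first value interval of the third variable in the fifth loop statement, the updating unit is specifically configured to:
    obtain a second value interval of the third variable based on the second conditional statement; and update the first value interval with the intersection of the first value interval and the second value interval.
  22. The apparatus according to any one of claims 19 to 21, wherein at least one of the two endpoints of the value interval of the variable comprised in the loop condition of each layer of loop statement is not a constant, and/or each of the at least one conditional statement further comprises a fourth variable, wherein the fourth variable is a variable not comprised in the loop condition of any layer of loop statement.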
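  23. A chip system, wherein the chip system comprises at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected through lines; the memory stores instructions; and when the instructions are executed by the processor, the method according to any one of claims 1 to 11 is implemented.
  24. A compilation apparatus, wherein the compilation apparatus comprises the chip system according to claim 23 and a discrete device coupled to the chip system.
  25. A computer-readable storage medium, wherein the computer-readable medium stores program code to be executed by a device, and the program code comprises instructions for performing the method according to any one of claims 1 to 11.
  26. A computer program product, wherein the computer program product comprises program instructions, and when the program instructions are run on a computer, the method according to any one of claims 1 to 11 is implemented.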
PCT/CN2022/103415 2021-07-09 2022-07-01 程序编译方法和装置 WO2023280078A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22836829.6A EP4332752A1 (en) 2021-07-09 2022-07-01 Program compiling method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110781901.9A CN113672232B (zh) 2021-07-09 2021-07-09 程序编译方法和装置
CN202110781901.9 2021-07-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/407,383 Continuation US20240201968A1 (en) 2021-07-09 2024-01-08 Program compilation method and apparatus

Publications (1)

Publication Number Publication Date
WO2023280078A1 (zh) 2023-01-12

Family

ID=78539289

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/103415 WO2023280078A1 (zh) 2021-07-09 2022-07-01 程序编译方法和装置

Country Status (3)

Country Link
EP (1) EP4332752A1 (zh)
CN (1) CN113672232B (zh)
WO (1) WO2023280078A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113672232B (zh) * 2021-07-09 2024-06-11 华为技术有限公司 程序编译方法和装置
CN115220727B (zh) * 2022-06-07 2024-05-28 清华大学 面向利用Python语言编写的不规则张量程序的优化方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160147511A1 (en) * 2014-11-26 2016-05-26 Markus Eble Pre-compiler
CN107301079A (zh) * 2017-05-22 2017-10-27 南京南瑞继保电气有限公司 一种计算机程序语言的编译方法和编译器
CN109408034A (zh) * 2018-03-17 2019-03-01 东南大学 一种面向对象程序的控制流图构造方法
CN111949269A (zh) * 2020-07-14 2020-11-17 华中科技大学 一种COStream语法分析过程中符号表和静态数据流图生成方法
CN113672232A (zh) * 2021-07-09 2021-11-19 华为技术有限公司 程序编译方法和装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6988266B2 (en) * 2001-05-08 2006-01-17 Sun Microsystems, Inc. Method of transforming variable loops into constant loops
JP4819442B2 (ja) * 2005-09-01 2011-11-24 富士通株式会社 コンパイル処理方法、コンパイル処理装置及びコンパイル処理プログラム
US20130227533A1 (en) * 2008-11-06 2013-08-29 Albert Donald Tonkin Code transformation
US9465591B2 (en) * 2012-12-17 2016-10-11 Unisys Corporation Syntax language generator for compiler validation
KR102147355B1 (ko) * 2013-09-27 2020-08-24 삼성전자 주식회사 프로그램 변환 방법 및 장치
JP6554959B2 (ja) * 2015-07-14 2019-08-07 富士通株式会社 情報処理装置、コンパイル方法、およびコンパイルプログラム
CN110825386B (zh) * 2019-11-01 2023-07-14 腾讯科技(深圳)有限公司 代码的编译方法和装置、存储介质

Also Published As

Publication number Publication date
CN113672232B (zh) 2024-06-11
EP4332752A1 (en) 2024-03-06
CN113672232A (zh) 2021-11-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22836829

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022836829

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022836829

Country of ref document: EP

Effective date: 20231128

NENP Non-entry into the national phase

Ref country code: DE