CN116974572A - Memory access address calculation optimization method and device based on cyclic stripping - Google Patents
- Publication number
- CN116974572A; CN202310825932.9A
- Authority
- CN
- China
- Prior art keywords
- calculation expression
- base address
- expression
- offset
- address calculation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The application discloses a memory access address calculation optimization method and device based on loop stripping. The application strips the memory access address calculation at the linear assembly level out of the core loop, so that the loop body can be implemented with fewer statements. This effectively reduces redundant address calculation in the core loop body, lowers the total number of executed instructions and the execution time of the core loop, and improves program efficiency. The method is applicable to all loops.
Description
Technical Field
The application relates to the technical field of program compilation for linear assembly language, and in particular to a memory access address calculation optimization method and device based on loop stripping.
Background
A linear assembly language is a programming language that sits between assembly language and high-level programming languages: its syntax is simpler than that of assembly language, and it is more efficient than a high-level programming language. Linear assembly language has the following three advantages: (1) registers do not need to be allocated manually; (2) the assignment of functional units, the arrangement of instruction beats, and the filling of delay slots do not need to be considered; (3) the parallel scheduling of the code does not need to be designed by hand, because it is completed automatically by the linear assembly compiler, which improves coding efficiency. At present, the common linear assembly optimization methods are loop unrolling and software pipelining. Loop unrolling unrolls the loop, which reduces the total number of branch instructions and improves instruction parallelism. Software pipelining shortens the critical path of the loop body and improves code execution efficiency. However, the existing linear assembly optimization methods still produce code that is not concise enough, and their optimization effect is poor.
Disclosure of Invention
The technical problem to be solved by the application is as follows: in view of the above problems in the prior art, the application provides a memory access address calculation optimization method and device based on loop stripping. By analyzing and reconstructing the memory access address calculation, the application strips that calculation out of the core loop at the linear assembly level, so that the loop body can be implemented with fewer statements. This effectively reduces redundant address calculation in the core loop body, lowers the total number of executed instructions and the execution time of the core loop, and improves program efficiency. The method is applicable to all loops.
In order to solve the technical problems, the application adopts the following technical scheme:
A memory access address calculation optimization method based on loop stripping comprises the following steps:
S101, determining the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and analyzing the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
S102, if the initial value and step size of the dependent variable are determinate values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstructing the base address calculation expression and the offset calculation expression into a sum or difference of several factors, where each factor contains at most one dependent variable, and proceeding to the next step; otherwise, ending and exiting;
S103, when every factor in the reconstructed offset calculation expression contains only a single dependent variable: first substituting the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoisting the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body, each represented by an intermediate variable; and finally replacing the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
Optionally, when all factors in the reconstructed offset calculation expression contain only a single dependent variable, the original base address calculation expression and offset calculation expression determined in step S103 are first added together and then split into a sum or difference of several factors; the factors that do not contain the dependent variable of the current loop are then merged, hoisted to before the loop body, and represented by an intermediate variable, and the corresponding part of the loop body is replaced by that intermediate variable.
Optionally, hoisting to before the loop body means hoisting from the loop body into the last code block preceding the loop body.
Optionally, using intermediate variables to represent the base address calculation expression and the offset calculation expression in step S103 means using the intermediate variable AR to represent the result of the base address calculation expression and the intermediate variable OR to represent the result of the offset calculation expression.
Optionally, after the base address calculation expression and the offset calculation expression in the loop body are replaced with the corresponding intermediate variables in step S103, the method further includes replacing the memory access address addr of the variable with the form AR++[OR], where addr is the original memory access address of the variable, AR is the intermediate variable representing the base address calculation expression, and OR is the intermediate variable representing the offset calculation expression.
Optionally, step S101 is preceded by a step of semantically lowering the C-like language code corresponding to the original linear assembly code to be optimized to the linear assembly language level, to obtain the original linear assembly code to be optimized.
Optionally, step S103 is followed by a step of converting the optimized linear assembly code into assembly code and compiling the assembly code to obtain an executable program.
In addition, the application also provides a memory access address calculation optimization device based on loop stripping, which comprises:
an assembly code analysis program unit, configured to determine the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and to analyze the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
a dependent variable judging program unit, configured to, if the initial value and step size of the dependent variable are determinate values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstruct the base address calculation expression and the offset calculation expression into a sum or difference of several factors, where each factor contains at most one dependent variable, and then pass execution to the expression replacement program unit; otherwise, to end and exit;
an expression replacement program unit, configured to, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substitute the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoist the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body, each represented by an intermediate variable; and finally replace the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
In addition, the application also provides a memory access address calculation optimization device based on loop stripping, which comprises a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to execute the memory access address calculation optimization method based on loop stripping.
Furthermore, the application provides a computer-readable storage medium storing a computer program, the computer program being programmed or configured to be executed by a microprocessor to perform the memory access address calculation optimization method based on loop stripping.
Compared with the prior art, the application has the following advantages: by analyzing and reconstructing the memory access address calculation, the application strips that calculation out of the core loop at the linear assembly level, so that the loop body can be implemented with fewer statements. The core loop body becomes shorter, the total execution length falls accordingly, and redundant address calculation in the core loop body is greatly reduced. The resulting code runs faster than ordinary loop-based linear assembly, with a lower total number of executed instructions and a lower execution time for the core loop, which improves program efficiency. The method is applicable to all loops and has good reliability.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present application.
FIG. 2 is a schematic diagram of the complete upstream and downstream flow of the method according to an embodiment of the present application.
Detailed Description
As shown in FIG. 1, the memory access address calculation optimization method based on loop stripping in this embodiment includes:
S101, determining the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and analyzing the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable. Here a dependent variable is a variable whose value is affected by the loop iteration count, and the base address expression and the offset expression are the expressions that calculate the base address and the offset, respectively. For example, in C a value is read by indexing, as in a[i+1]; once this is lowered to the linear assembly level, a is the base address expression, i+1 is the offset expression, a+i+1 gives the address to read, and the value of a[i+1] is finally loaded from that address. The base address calculation expression and the offset calculation expression are defined in the loop body of the original linear assembly code and can be obtained from those definitions. For example, in this embodiment, for the loop body of a certain original linear assembly code, the extracted base address calculation expression is:
Conv2d_NCHW_ft_out_transform_write_cach,
the offset calculation expression is:
16*a_inner+7168*j_inner+64*k_inner,
where a_inner is the dependent variable of the current loop.
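The relationship between base address, offset, and dependent variable in this example can be illustrated with a minimal Python sketch. The numeric base address and the fixed values of j_inner and k_inner are assumptions chosen only for illustration; the function names are hypothetical and do not appear in the application.

```python
# Illustrative model only: BASE and the loop values are made-up numbers.

def offset(a_inner: int, j_inner: int, k_inner: int) -> int:
    """Offset calculation expression from the example loop body."""
    return 16 * a_inner + 7168 * j_inner + 64 * k_inner


def access_address(base: int, a_inner: int, j_inner: int, k_inner: int) -> int:
    """Memory access address = base address + offset."""
    return base + offset(a_inner, j_inner, k_inner)


# Assumed stand-in for the symbol Conv2d_NCHW_ft_out_transform_write_cach.
BASE = 0x1000

# Only a_inner changes inside the current loop; j_inner and k_inner are fixed.
addresses = [access_address(BASE, a, j_inner=2, k_inner=3) for a in range(4)]

# Difference between consecutive addresses: the constant 16 contributed by a_inner.
deltas = [later - earlier for earlier, later in zip(addresses, addresses[1:])]
```

Because only a_inner varies, consecutive addresses differ by a constant, which is what makes the later hoisting step possible.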
S102, if the initial value and step size of the dependent variable are determinate values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstructing the base address calculation expression and the offset calculation expression into a sum or difference of several factors, where each factor contains at most one dependent variable (that is, one or zero; every use of the same dependent variable is counted separately, so in a*a the variable a is used twice and the count is two), and proceeding to the next step; otherwise, ending and exiting.
For the example base address and offset calculation expressions above, the reconstructed base address expression is:
Conv2d_NCHW_ft_out_transform_write_cach+7168*j_inner+64*k_inner+16*a_inner,
and the offset expression is:
16*a_inner
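One way to picture the factor check in S102 is the Python sketch below. The representation of a factor as a coefficient plus a tuple of variable uses is an assumption made for illustration, as are all function names; the application does not specify a data structure.

```python
from typing import List, Tuple

# Hypothetical representation: (coefficient, variables used in the factor).
# Every use is listed, so a*a is written as ("a", "a").
Factor = Tuple[int, Tuple[str, ...]]


def dependent_uses(factor: Factor, dependent: str) -> int:
    """Count how many times the dependent variable is used in one factor."""
    return factor[1].count(dependent)


def is_reconstructible(factors: List[Factor], dependent: str) -> bool:
    """True when every factor uses the dependent variable at most once."""
    return all(dependent_uses(f, dependent) <= 1 for f in factors)


def split_by_dependence(factors: List[Factor], dependent: str):
    """Separate loop-invariant factors from those using the dependent variable."""
    invariant = [f for f in factors if dependent_uses(f, dependent) == 0]
    variant = [f for f in factors if dependent_uses(f, dependent) > 0]
    return invariant, variant


# Offset expression from the example: 16*a_inner+7168*j_inner+64*k_inner
offset_expr: List[Factor] = [
    (16, ("a_inner",)),
    (7168, ("j_inner",)),
    (64, ("k_inner",)),
]

invariant, variant = split_by_dependence(offset_expr, "a_inner")
```

With this split, the invariant factors are the ones that move into the base address expression, and the single remaining factor, 16*a_inner, stays as the offset expression.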
S103, when every factor in the reconstructed offset calculation expression contains only a single dependent variable: first substituting the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoisting the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body, each represented by an intermediate variable; and finally replacing the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
In this embodiment, the original linear assembly code defines an initial value of 0 and a step size of 1 for the dependent variable (it increases by 1 with each loop iteration), so the final base address expression is:
Conv2d_NCHW_ft_out_transform_write_cach+7168*j_inner+64*k_inner+16*0,
and the offset expression is 16*1.
At this point the calculations of the memory access base address and offset no longer depend on the current loop. The two expressions are then hoisted out of the loop body to a position before it, each represented by an intermediate variable, and the expressions in the loop body are replaced by those intermediate variables. For example, in this embodiment the intermediate variables AR_EX_76 and OR_EX_76 represent the base address calculation expression and the offset calculation expression, respectively; once the expressions in the loop body are replaced by AR_EX_76 and OR_EX_76, the base address and the offset no longer need to be calculated inside the loop body.
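The effect of S103 on this example can be checked with a small Python sketch that compares the address sequence computed inside the loop with the sequence produced after hoisting. Python stands in for linear assembly here, and the function names, the numeric base address, and the trip count are assumptions for illustration only.

```python
def original_addresses(base: int, j_inner: int, k_inner: int, trip_count: int):
    """Before optimization: the full address is recomputed in every iteration."""
    return [
        base + 16 * a_inner + 7168 * j_inner + 64 * k_inner
        for a_inner in range(trip_count)
    ]


def hoisted_addresses(base: int, j_inner: int, k_inner: int, trip_count: int,
                      init: int = 0, step: int = 1):
    """After optimization: base and offset are computed once, before the loop."""
    # Analogue of AR_EX_76: initial value of the dependent variable substituted.
    ar = base + 7168 * j_inner + 64 * k_inner + 16 * init
    # Analogue of OR_EX_76: step size of the dependent variable substituted.
    offset_reg = 16 * step

    result = []
    for _ in range(trip_count):
        result.append(ar)   # use the current address
        ar += offset_reg    # advance by the constant offset
    return result


before = original_addresses(0x1000, 2, 3, 8)
after = hoisted_addresses(0x1000, 2, 3, 8)
```

Both functions yield the same addresses, but the second performs only one addition per iteration inside the loop.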
The method of this embodiment optimizes memory access address calculation at the linear assembly level. If the dependent variable appears within a polynomial factor and is used there no more than once, optimal optimization can be performed, and the memory access address calculation is stripped out of the loop body entirely. Otherwise, only primary optimization can be performed: the polynomial terms that are independent of the dependent variable are analyzed, recombined, and stripped out of the loop body. Both optimal and primary optimization aim to reduce the code in the current loop body as far as possible, lowering the time cost of code execution, reducing memory access time during execution, and improving the overall execution efficiency of the program.
Needless to say, the above steps do not depend on whether, or how, optimization is performed when the condition in step S103, namely that all factors in the reconstructed offset calculation expression contain only a single dependent variable, does not hold. Referring to FIG. 1, as an optional implementation, when all factors in the reconstructed offset calculation expression contain only a single dependent variable, step S103 further includes: adding the original base address calculation expression and offset calculation expression determined in step S103; splitting the sum into a sum or difference of several factors; merging the factors that do not contain the dependent variable of the current loop; hoisting the merged result to before the loop body and representing it with an intermediate variable; and replacing the corresponding part of the loop body with that intermediate variable.
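The merge-and-hoist step just described can be sketched in Python as follows. The 4*a_inner*a_inner term is hypothetical: it is not from the application and is included only because it uses the dependent variable twice, so it cannot be reduced to a constant step and must stay in the loop. Function names and numeric values are likewise assumptions.

```python
def naive_addresses(base: int, j_inner: int, k_inner: int, trip_count: int):
    """Every term, including the loop-invariant ones, is evaluated per iteration."""
    return [
        base + 7168 * j_inner + 64 * k_inner + 4 * a_inner * a_inner
        for a_inner in range(trip_count)
    ]


def partially_hoisted_addresses(base: int, j_inner: int, k_inner: int,
                                trip_count: int):
    """Loop-invariant factors are merged and computed once before the loop."""
    # Intermediate variable holding the merged factors without a_inner.
    invariant = base + 7168 * j_inner + 64 * k_inner

    result = []
    for a_inner in range(trip_count):
        # Only the factor that depends on a_inner remains inside the loop.
        result.append(invariant + 4 * a_inner * a_inner)
    return result


naive = naive_addresses(0x2000, 1, 5, 6)
hoisted = partially_hoisted_addresses(0x2000, 1, 5, 6)
```

The addresses are unchanged, while two multiplications and two additions per iteration move out of the loop.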
In this embodiment, hoisting to before the loop body means hoisting from the loop body into the last code block preceding the loop body, which makes the hoisted code easy to locate and read.
In this embodiment, using intermediate variables to represent the base address calculation expression and the offset calculation expression in step S103 means using the intermediate variable AR to represent the result of the base address calculation expression and the intermediate variable OR to represent the result of the offset calculation expression. After the base address calculation expression and the offset calculation expression in the loop body are replaced with the corresponding intermediate variables in step S103, the method further includes replacing the memory access address addr of the variable with the form AR++[OR], where addr is the original memory access address of the variable, AR is the intermediate variable representing the base address calculation expression, and OR is the intermediate variable representing the offset calculation expression; AR and OR can be regarded as constants within the loop. For example, var_11_59 is replaced with the form AR_EX_76++[OR_EX_76], where the intermediate variables AR_EX_76 and OR_EX_76 represent the base address calculation expression and the offset calculation expression.
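The AR++[OR] form can be pictured with the following Python model. It assumes that AR++[OR] means "use the address currently held in AR, then advance AR by OR"; the application does not spell out the exact semantics on the target processor (for example, whether OR is scaled by an element size), so this reading, the class name, and the flat memory model are all assumptions.

```python
class AddressRegister:
    """Toy model of an address register with post-increment addressing."""

    def __init__(self, value: int) -> None:
        self.value = value

    def post_increment(self, offset: int) -> int:
        """Return the current address, then advance the register by offset."""
        current = self.value
        self.value += offset
        return current


# Assumed flat memory in which address n holds the value 100 + n.
memory = list(range(100, 200))

ar = AddressRegister(8)   # plays the role of AR (base address, set before the loop)
OR = 16                   # plays the role of OR (constant offset, set before the loop)

# Loop body: each access is a single post-increment load, with no address arithmetic.
loaded = [memory[ar.post_increment(OR)] for _ in range(4)]
```

Under this reading, the loop body contains only the load itself, and the address update is folded into the addressing mode.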
As shown in FIG. 2, step S101 in this embodiment is preceded by a step of semantically lowering the C-like language code corresponding to the original linear assembly code to be optimized to the linear assembly language level, to obtain the original linear assembly code to be optimized. The C-like language code is first converted into FTIR, and the FTIR is then converted into the original linear assembly code.
As shown in FIG. 2, step S103 in this embodiment is followed by a step of converting the optimized linear assembly code into assembly code, compiling the assembly code to obtain an executable program, and then executing that program.
It should be noted that two conditions must be met for the method of this embodiment to perform its optimization: 1) in every factor of the index expression obtained after splitting that contains a dependent variable, the number of dependent variables is no more than one; 2) the initial value and step size of the dependent variable must be determinate values, or recordable expressions that do not contain the dependent variable of the current loop. When these conditions are not met, only the loop-independent part of the calculation can be stripped out, while the loop-dependent part must still be calculated inside the loop.
To verify the effect of the method of this embodiment, the number of code lines of a certain linear assembly code before and after optimization with the method was compared. The experimental results are shown in Table 1.
Table 1: Comparison of the code before and after optimization with the method of this embodiment.
| | Number of lines of code | Execution beats |
| Before optimization | 43 | 40 |
| After optimization | 15 | 24 |
Referring to Table 1, the method of this embodiment reduces the number of code lines by about two thirds (from 43 to 15) and the number of execution beats from 40 to 24, so the execution efficiency of the code is significantly improved.
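The reductions implied by Table 1 can be worked out directly. The short Python calculation below uses only the four figures from the table; the variable names are illustrative.

```python
# Figures taken from Table 1.
lines_before, lines_after = 43, 15
beats_before, beats_after = 40, 24

# Relative reduction = (before - after) / before
line_reduction = (lines_before - lines_after) / lines_before
beat_reduction = (beats_before - beats_after) / beats_before

print(f"Code lines reduced by {line_reduction:.1%}")       # 65.1%
print(f"Execution beats reduced by {beat_reduction:.1%}")  # 40.0%
```

The code line count falls by roughly 65 percent, which matches the "about two thirds" figure, and the execution beats fall by 40 percent.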
In summary, the method of this embodiment optimizes by analyzing and reconstructing the memory access address calculation through a series of steps: address calculation analysis, loop stripping, value substitution, and code replacement. All memory access address calculation at the linear assembly level can thereby be stripped out of the core loop, leaving only the essential instructions of the algorithm inside it. This effectively avoids redundant address calculation in the core loop body, shortens the core loop body, and reduces the total execution length accordingly. After the calculation, the address base and the offset value only need to be assigned to the corresponding registers. The resulting code runs faster than ordinary loop-based linear assembly, with a lower total number of executed instructions and a lower execution time for the core loop, which improves code execution efficiency with good effect and reliability.
In addition, this embodiment also provides a memory access address calculation optimization device based on loop stripping, which comprises:
an assembly code analysis program unit, configured to determine the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and to analyze the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
a dependent variable judging program unit, configured to, if the initial value and step size of the dependent variable are determinate values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstruct the base address calculation expression and the offset calculation expression into a sum or difference of several factors, where each factor contains at most one dependent variable, and then pass execution to the expression replacement program unit; otherwise, to end and exit;
an expression replacement program unit, configured to, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substitute the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoist the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body, each represented by an intermediate variable; and finally replace the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
In addition, this embodiment also provides a memory access address calculation optimization device based on loop stripping, which comprises a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to execute the memory access address calculation optimization method based on loop stripping. This embodiment also provides a computer-readable storage medium storing a computer program, the computer program being programmed or configured to be executed by a microprocessor to perform the memory access address calculation optimization method based on loop stripping.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. 
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application, and the protection scope of the present application is not limited to the above examples, and all technical solutions belonging to the concept of the present application belong to the protection scope of the present application. It should be noted that modifications and adaptations to the present application may occur to one skilled in the art without departing from the principles of the present application and are intended to be within the scope of the present application.
Claims (10)
1. The access address calculation optimization method based on cyclic stripping is characterized by comprising the following steps of:
s101, determining a base address calculation expression and an offset calculation expression in an original linear assembly code loop body to be optimized, and analyzing the loop body to obtain dependent variables used in the base address calculation expression and the offset calculation expression, and initial values and step sizes of the dependent variables;
s102, if the initial value and the step length of the dependent variable are determined values which can be recorded or the expression which does not contain the dependent variable in the circulation, reconstructing the base address calculation expression and the offset calculation expression into a form of adding and subtracting a plurality of factors, wherein each factor contains at most one dependent variable, and jumping to the next step; otherwise, ending and exiting;
s103, when all factors in the reconstructed offset calculation expression only contain a single dependent variable, firstly bringing an initial value of the dependent variable into a base address calculation expression, bringing a step length of the dependent variable into the offset calculation expression, and then respectively using intermediate variables to represent the base address calculation expression and the offset calculation expression before the base address calculation expression and the offset calculation expression are put out from a circulating body to the circulating body; and finally, replacing the base address calculation expression and the offset calculation expression in the loop body with corresponding intermediate variables, so that the calculation of the base address and the offset is not needed in the loop body.
2. The optimization method according to claim 1, wherein step S103 further comprises: when all factors in the reconstructed offset calculation expression contain only a single dependent variable, adding the original base address calculation expression and the offset calculation expression determined in step S103, splitting the sum into a form in which a plurality of factors are added and subtracted, merging the factors that do not contain a dependent variable of the present loop, extracting the merged factors to before the loop body, representing the result by an intermediate variable, and replacing the corresponding part in the loop body with that intermediate variable.
3. The optimization method of claim 2, wherein extracting to before the loop body means extracting from the loop body to the last code block preceding the loop body.
4. The optimization method according to claim 3, wherein representing the base address calculation expression and the offset calculation expression by intermediate variables in step S103 means that the base address calculation expression is represented by the intermediate variable AR and the offset calculation expression is represented by the intermediate variable OR.
5. The optimization method of claim 4, wherein, after the base address calculation expression and the offset calculation expression in the loop body are replaced with the corresponding intermediate variables in step S103, the method further comprises replacing the address addr of the variable with the form AR++[OR], where addr is the original address of the variable, AR is the intermediate variable representing the base address calculation expression, and OR is the intermediate variable representing the offset calculation expression.
6. The optimization method of claim 1, further comprising, before step S101, the step of semantically lowering C-like language code corresponding to the original linear assembly code to be optimized to the linear assembly language level, to obtain the original linear assembly code to be optimized.
7. The optimization method of claim 6, further comprising, after step S103, the step of converting the optimized linear assembly code into assembly code and compiling the assembly code to obtain an executable program.
8. A memory access address calculation optimization device based on cyclic stripping, characterized by comprising:
an assembly code analysis program unit, configured to determine a base address calculation expression and an offset calculation expression in the loop body of the original linear assembly code to be optimized, and to analyze the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
a dependent variable judging program unit, configured to, if the initial value and step size of each dependent variable are determinate values that can be recorded, or expressions that contain no dependent variable of the loop, reconstruct the base address calculation expression and the offset calculation expression into a form in which a plurality of factors are added and subtracted, each factor containing at most one dependent variable, and then jump to execute the expression replacement program unit; otherwise, to end and exit;
an expression replacement program unit, configured to, when every factor in the reconstructed offset calculation expression contains only a single dependent variable: first substitute the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then extract the base address calculation expression and the offset calculation expression from the loop body to before the loop body and represent each of them by an intermediate variable; and finally replace the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
9. A cyclic stripping-based memory access address calculation optimization device comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the cyclic stripping-based memory access address calculation optimization method of any one of claims 1 to 7.
10. A computer readable storage medium having a computer program stored therein, wherein the computer program is for being programmed or configured by a microprocessor to perform the cyclic stripping-based memory access address calculation optimization method of any one of claims 1 to 7.
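The transformation recited in claims 1, 4 and 5 can be illustrated with a minimal sketch. It is written in Python rather than linear assembly, and it is not the patented implementation: the function names, the flat list `mem`, the parameter `scale`, and the single dependent variable `i` are all assumptions made for illustration. The first function recomputes the full address on every iteration; the second substitutes the initial value of `i` into the base address expression and the step size of `i` into the offset expression before the loop, then walks memory in the spirit of the AR++[OR] form.

```python
def sum_naive(mem, base, start, step, count, scale):
    """Recompute the address expression inside the loop on every iteration."""
    total = 0
    i = start                          # dependent variable with a known initial value
    for _ in range(count):
        addr = base + i * scale        # base address plus offset, recomputed each time
        total += mem[addr]
        i += step                      # known step size
    return total


def sum_hoisted(mem, base, start, step, count, scale):
    """Compute both address pieces once, before the loop body."""
    ar_reg = base + start * scale      # plays the role of AR: initial value substituted
    or_reg = step * scale              # plays the role of OR: step size substituted
    total = 0
    for _ in range(count):
        total += mem[ar_reg]           # access through AR ...
        ar_reg += or_reg               # ... then post-increment AR by OR
    return total


mem = list(range(100))                 # hypothetical memory image: mem[k] == k
print(sum_naive(mem, 3, 2, 3, 5, 4), sum_hoisted(mem, 3, 2, 3, 5, 4))
```

Both functions visit the same addresses, so they return the same total; the second simply performs no multiplication inside the loop. The rewrite is only valid under the condition of step S102, namely that the initial value and step size are known before the loop is entered.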
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310825932.9A CN116974572A (en) | 2023-07-06 | 2023-07-06 | Memory access address calculation optimization method and device based on cyclic stripping |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310825932.9A CN116974572A (en) | 2023-07-06 | 2023-07-06 | Memory access address calculation optimization method and device based on cyclic stripping |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116974572A (en) | 2023-10-31 |
Family
ID=88484214
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310825932.9A Pending CN116974572A (en) | 2023-07-06 | 2023-07-06 | Memory access address calculation optimization method and device based on cyclic stripping |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116974572A (en) |
2023-07-06: CN application CN202310825932.9A, published as CN116974572A (en), status: active, Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117850881A (en) * | 2024-01-18 | 2024-04-09 | 上海芯联芯智能科技有限公司 | Instruction execution method and device based on pipelining |
CN117850881B (en) * | 2024-01-18 | 2024-06-18 | 上海芯联芯智能科技有限公司 | Instruction execution method and device based on pipelining |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7784039B2 (en) | Compiler, compilation method, and compilation program | |
JP3664473B2 (en) | Program optimization method and compiler using the same | |
US9946523B2 (en) | Multiple pass compiler instrumentation infrastructure | |
CN116974572A (en) | Memory access address calculation optimization method and device based on cyclic stripping | |
US20040025152A1 (en) | Compiling method, apparatus, and program | |
TWI463404B (en) | Compiling systems and methods | |
EP0428560A4 (en) | Machine process for translating programs in binary machine language into another binary machine language | |
US8943484B2 (en) | Code generation method and information processing apparatus | |
Baek et al. | A flexible proof format for SAT solver-elaborator communication | |
CN108197027A (en) | Software performance optimization method, can storage medium, computer, computer program | |
WO2012104907A1 (en) | Test data production method for evaluating execution performance of program | |
CN105988854A (en) | Dynamic compilation method and apparatus | |
Ramirez et al. | Trace cache redundancy: Red and blue traces | |
Su et al. | An improvement of trace scheduling for global microcode compaction | |
US8200469B2 (en) | Method for reconstructing statement, and computer system having the function therefor | |
CN117555548A (en) | Code generation method and device and electronic equipment | |
Bradel et al. | Automatic trace-based parallelization of java programs | |
CN116630040A (en) | Intelligent contract transaction rapid execution method based on fine-granularity read-write analysis | |
CN111309329B (en) | Instruction address self-adaptive repositioning method and program compiling method | |
CN113791770A (en) | Code compiler, code compiling method, code compiling system, and computer medium | |
US20090112568A1 (en) | Method for Generating a Simulation Program Which Can Be Executed On a Host Computer | |
US20050060692A1 (en) | Method and apparatus for reducing time to generate a build for a software product from source-files | |
CN109725904B (en) | Low-power-consumption program instruction compiling method and system | |
JPH02176938A (en) | Machine language instruction optimizing system | |
US20230266950A1 (en) | Methods and devices for compiler function fusion | |
Legal Events
Date | Code | Title | Description |
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |