CN116974572A - Memory access address calculation optimization method and device based on loop stripping - Google Patents

Info

Publication number
CN116974572A
Authority
CN
China
Prior art keywords
calculation expression
base address
expression
offset
address calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310825932.9A
Other languages
Chinese (zh)
Inventor
王耀华
刘昕睿
郭阳
扈啸
李哲
文梅
陈照云
时洋
张天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202310825932.9A
Publication of CN116974572A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4441 Reducing the execution time required by the program code

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The application discloses a memory access address calculation optimization method and device based on loop stripping. By analyzing and reconstructing the memory access address calculation at the linear assembly level, the application strips that calculation out of the core loop, so that the loop body can be implemented with simpler statements. This effectively reduces redundant address calculation in the core loop body, reduces the total number of executed instructions and the execution time of the core loop, and improves program efficiency. The method is applicable to all loops.

Description

Memory access address calculation optimization method and device based on loop stripping
Technical Field
The application relates to the technical field of program compilation for linear assembly languages, and in particular to a memory access address calculation optimization method and device based on loop stripping.
Background
Linear assembly language is a programming language that sits between assembly language and high-level programming languages: its syntax is simpler than that of assembly language, and it is more efficient than a high-level programming language. Linear assembly language has the following three advantages: (1) registers do not need to be allocated manually; (2) the assignment of functional units, the arrangement of instruction beats (cycles), and the filling of delay slots do not need to be considered; (3) parallel scheduling of the code does not need to be designed by hand, because the linear assembly compiler completes it automatically, which improves coding efficiency. At present, the common linear assembly optimization methods are loop unrolling and software pipelining. Loop unrolling unrolls the loop, which reduces the total number of branch instructions and improves instruction-level parallelism. Software pipelining shortens the critical path of the loop body and improves code execution efficiency. However, the code produced by the existing linear assembly optimization methods is still not concise enough, and the optimization effect is limited.
Disclosure of Invention
The technical problem to be solved by the application is as follows: in view of the above problems in the prior art, the application provides a memory access address calculation optimization method and device based on loop stripping. By analyzing and reconstructing the memory access address calculation, the application strips that calculation out of the core loop at the linear assembly level, so that the loop body can be implemented with simpler statements. This effectively reduces redundant address calculation in the core loop body, reduces the total number of executed instructions and the execution time of the core loop, and improves program efficiency. The method is applicable to all loops.
In order to solve the technical problems, the application adopts the following technical scheme:
A memory access address calculation optimization method based on loop stripping comprises the following steps:
S101, determining the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and analyzing the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
S102, if the initial value and step size of each dependent variable are known values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstructing the base address calculation expression and the offset calculation expression into a sum or difference of multiple factors, each factor containing at most one dependent variable, and proceeding to the next step; otherwise, ending and exiting;
S103, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substituting the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoisting the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body and representing each with an intermediate variable; and finally replacing the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
Optionally, when all factors in the reconstructed offset calculation expression only contain a single dependent variable, step S103 further comprises: first adding the original base address calculation expression and offset calculation expression determined in step S103, then splitting the sum into a sum or difference of multiple factors, and finally merging the factors that do not contain the dependent variable of the current loop, hoisting the merged factors to before the loop body, representing the result with an intermediate variable, and replacing the corresponding part of the loop body with that intermediate variable.
Optionally, hoisting to before the loop body means moving the code from the loop body into the last code block before the loop body.
Optionally, representing the base address calculation expression and the offset calculation expression with intermediate variables in step S103 means using an intermediate variable AR to represent the result of the base address calculation expression and an intermediate variable OR to represent the result of the offset calculation expression.
Optionally, after the base address calculation expression and the offset calculation expression in the loop body are replaced with the corresponding intermediate variables in step S103, the method further comprises replacing the memory access address addr of the variable with the form AR++[OR], where addr is the original memory access address of the variable, AR is the intermediate variable representing the base address calculation expression, and OR is the intermediate variable representing the offset calculation expression.
Optionally, before step S101, the method further comprises semantically lowering the C-like language code corresponding to the original linear assembly code to be optimized to the linear assembly language level, so as to obtain the original linear assembly code to be optimized.
Optionally, after step S103, the method further comprises converting the optimized linear assembly code into assembly code and compiling the assembly code to obtain an executable program.
In addition, the application also provides a memory access address calculation optimization device based on loop stripping, which comprises:
an assembly code analysis program unit, configured to determine the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and to analyze the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
a dependent variable judging program unit, configured to, if the initial value and step size of each dependent variable are known values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstruct the base address calculation expression and the offset calculation expression into a sum or difference of multiple factors, each factor containing at most one dependent variable, and jump to the expression replacement program unit; otherwise, end and exit;
an expression replacement program unit, configured to, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substitute the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoist the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body and represent each with an intermediate variable; and finally replace the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
In addition, the application also provides a memory access address calculation optimization device based on loop stripping, which comprises a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to execute the memory access address calculation optimization method based on loop stripping.
Furthermore, the application provides a computer-readable storage medium storing a computer program that is programmed or configured to be executed by a microprocessor to perform the memory access address calculation optimization method based on loop stripping.
Compared with the prior art, the application has the following advantages: by analyzing and reconstructing the memory access address calculation, the application strips that calculation out of the core loop at the linear assembly level, so that the loop body can be implemented with simpler statements. This shortens the core loop body and correspondingly reduces the total execution length, greatly reduces redundant address calculation in the core loop body, and runs faster than ordinary loop-based linear assembly. The total number of executed instructions and the execution time of the core loop are reduced, and program efficiency is improved. The method is applicable to all loops and has good reliability.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present application.
FIG. 2 is a schematic diagram of the complete flow of the method, including its upstream and downstream steps, according to an embodiment of the present application.
Detailed Description
As shown in FIG. 1, the memory access address calculation optimization method based on loop stripping in this embodiment includes:
S101, determining the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and analyzing the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable. Here, a dependent variable is a variable whose value is affected by the iteration count of the loop, and the base address expression and the offset expression are the expressions used to calculate the base address and the offset, respectively. For example, when C code reads a value through the index a[i+1], after lowering to the linear assembly level a is the base address expression and i+1 is the offset expression; a+i+1 gives the memory access address, from which the value of a[i+1] is finally obtained. The base address calculation expression and the offset calculation expression are defined in the loop body of the original linear assembly code and can be obtained from those definitions. For example, in this embodiment, the base address calculation expression extracted from the loop body of a given piece of original linear assembly code is:
Conv2d_NCHW_ft_out_transform_write_cach,
the offset calculation expression is:
16*a_inner+7168*j_inner+64*k_inner,
where a_inner is the dependent variable of the current loop.
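To make the redundancy concrete, the following minimal Python sketch (not taken from the patent) models the unoptimized loop, in which every iteration recomputes the full offset. The numeric base address, the fixed outer values of j_inner and k_inner, and the trip count are illustrative assumptions.

```python
# Illustrative model of the unoptimized inner loop (assumed values, not patent data).
BASE = 0x1000  # hypothetical stand-in for Conv2d_NCHW_ft_out_transform_write_cach


def naive_addresses(j_inner, k_inner, trip_count):
    """Recompute base + offset from scratch in every iteration."""
    addresses = []
    multiplies = 0
    for a_inner in range(trip_count):
        offset = 16 * a_inner + 7168 * j_inner + 64 * k_inner
        multiplies += 3  # three multiplications are repeated per iteration
        addresses.append(BASE + offset)
    return addresses, multiplies


addrs, mults = naive_addresses(j_inner=1, k_inner=2, trip_count=4)
print(addrs, mults)
```

Two of the three products depend only on j_inner and k_inner, which do not change inside this loop, so they are recomputed needlessly in each iteration.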
S102, if the initial value and step size of each dependent variable are known values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstructing the base address calculation expression and the offset calculation expression into a sum or difference of multiple factors, each factor containing at most one dependent variable (that is, zero or one dependent variable; repeated uses of the same dependent variable are counted separately, so that, for example, a factor a*a uses a twice and therefore contains two dependent variables), and proceeding to the next step; otherwise, ending and exiting;
For the example base address calculation expression and offset calculation expression above, the reconstructed base address expression is:
Conv2d_NCHW_ft_out_transform_write_cach+7168*j_inner+64*k_inner+16*a_inner,
and the reconstructed offset expression is:
16*a_inner
S103, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substituting the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoisting the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body and representing each with an intermediate variable; and finally replacing the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
In this embodiment, the original linear assembly code defines an initial value of 0 and a step size of 1 for the dependent variable (it increases by 1 with each iteration of the loop), so the final base address expression is:
Conv2d_NCHW_ft_out_transform_write_cach+7168*j_inner+64*k_inner+16*0,
and the offset expression is 16*1.
The calculations of the memory access base address and offset are now independent of the current loop. The base address calculation expression and the offset calculation expression are then hoisted out of the loop body to a position before the loop body, and each is represented by an intermediate variable. Finally, the base address calculation expression and the offset calculation expression in the loop body are replaced with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body. In this embodiment, for example, the intermediate variables AR_EX_76 and OR_EX_76 represent the base address calculation expression and the offset calculation expression, respectively, and replace them in the loop body.
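The substitution and hoisting in S103 can be checked with a small Python sketch. It assumes a hypothetical numeric base address and assumes that the access is made at the current base, which then advances by the offset each iteration; ar_reg and or_reg play the roles of AR_EX_76 and OR_EX_76.

```python
BASE = 0x1000  # hypothetical stand-in for Conv2d_NCHW_ft_out_transform_write_cach


def reference_addresses(j_inner, k_inner, trip_count):
    """Addresses as computed by the original expression, with a_inner = 0, 1, 2, ..."""
    return [BASE + 16 * a_inner + 7168 * j_inner + 64 * k_inner
            for a_inner in range(trip_count)]


def stripped_addresses(j_inner, k_inner, trip_count, init=0, step=1):
    """Addresses after hoisting: the loop body contains no multiplication."""
    # Computed once, before the loop body.
    ar_reg = BASE + 7168 * j_inner + 64 * k_inner + 16 * init  # initial value substituted
    or_reg = 16 * step                                          # step size substituted
    addresses = []
    for _ in range(trip_count):
        addresses.append(ar_reg)  # access at the current base address
        ar_reg += or_reg          # assumed post-increment by the offset
    return addresses


ref = reference_addresses(1, 2, 4)
opt = stripped_addresses(1, 2, 4)
print(ref == opt, opt)
```

The loop body of the second function performs only one addition per iteration, while producing the same address sequence as the original expression for the initial value 0 and step size 1 used in this embodiment.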
The method of this embodiment optimizes memory access address calculation at the linear assembly level. If each polynomial factor that contains a dependent variable contains it no more than once, optimal optimization is possible: the memory access address calculation is stripped out of the loop body entirely. Otherwise, only primary optimization is possible: the polynomial terms that are independent of the dependent variable are analyzed, recombined, and stripped out of the loop body. Both optimal and primary optimization aim to reduce the code in the current loop body as far as possible, reduce the time cost of code execution, reduce memory access time during execution, and improve the overall execution efficiency of the program.
It should be noted that the above steps do not depend on whether or how optimization is performed when the condition in step S103, namely that every factor in the reconstructed offset calculation expression contains only a single dependent variable, does not hold. Referring to FIG. 1, as an optional embodiment, when all factors in the reconstructed offset calculation expression only contain a single dependent variable, step S103 further includes: adding the original base address calculation expression and offset calculation expression determined in step S103, splitting the sum into a sum or difference of multiple factors, merging the factors that do not contain the dependent variable of the current loop, hoisting the merged factors to before the loop body, representing the result with an intermediate variable, and replacing the corresponding part of the loop body with that intermediate variable.
In this embodiment, hoisting to before the loop body means moving the code from the loop body into the last code block before the loop body, which makes it easy to locate and read.
In this embodiment, representing the base address calculation expression and the offset calculation expression with intermediate variables in step S103 means using an intermediate variable AR to represent the result of the base address calculation expression and an intermediate variable OR to represent the result of the offset calculation expression. In this embodiment, after the base address calculation expression and the offset calculation expression in the loop body are replaced with the corresponding intermediate variables in step S103, the method further includes replacing the memory access address addr of the variable with the form AR++[OR], where addr is the original memory access address of the variable, AR is the intermediate variable representing the base address calculation expression, and OR is the intermediate variable representing the offset calculation expression; AR and OR can be regarded as constants within the loop. For example, var_11_59 is replaced with the form AR_EX_76++[OR_EX_76], where the intermediate variables AR_EX_76 and OR_EX_76 represent the base address calculation expression and the offset calculation expression, respectively.
As shown in FIG. 2, in this embodiment the method further includes, before step S101, semantically lowering the C-like language code corresponding to the original linear assembly code to be optimized to the linear assembly language level, so as to obtain the original linear assembly code to be optimized. The C-like language code is first converted into FTIR, and the FTIR is then converted into the original linear assembly code.
As shown in FIG. 2, in this embodiment the method further includes, after step S103, converting the optimized linear assembly code into assembly code, compiling the assembly code to obtain an executable program, and then running that program.
It should be noted that two conditions apply when optimizing with the method of this embodiment: 1) in the index expression obtained after splitting, every factor that contains a dependent variable contains no more than one dependent variable; 2) the initial value and step size of the dependent variable must each be a known value or an expression that can be recorded, and that expression must not contain the dependent variable of the current loop. When these conditions are not met, only the loop-independent part of the calculation can be stripped out, and the loop-dependent calculation must still be performed inside the loop.
To verify the effect of the method of this embodiment, the number of code lines and the number of execution beats of a piece of linear assembly code were compared before and after optimization with the method. The experimental results are shown in Table 1.
Table 1: Code line count and execution beats before and after optimization with the method of this embodiment.
                       Number of lines of code    Execution beats
Before optimization    43                         40
After optimization     15                         24
Referring to Table 1, the method of this embodiment reduces the number of code lines by about two thirds (from 43 to 15) and the number of execution beats from 40 to 24, which significantly improves the execution efficiency of the code.
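As a quick arithmetic check of the figures in Table 1, the following Python snippet computes the relative reductions. The helper name reduction is introduced here for illustration only.

```python
def reduction(before, after):
    """Relative reduction, as a fraction of the original value."""
    return (before - after) / before


line_reduction = reduction(43, 15)  # code lines, from Table 1
beat_reduction = reduction(40, 24)  # execution beats, from Table 1
print(f"lines: {line_reduction:.1%}, beats: {beat_reduction:.1%}")
```

The line count falls by roughly 65 percent, which is close to two thirds, and the execution beats fall by 40 percent.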
In summary, the method of this embodiment analyzes and reconstructs the memory access address calculation and achieves the optimization through a series of steps: analysis and recombination of the memory access address calculation, loop stripping, data substitution, and code replacement. All memory access address calculation at the linear assembly level can thus be stripped out of the core loop, so that only the essential instructions of the algorithm remain in the core loop. This effectively avoids redundant address calculation in the core loop body, shortens the core loop body, and correspondingly reduces the total execution length: once calculated, the base address and the offset only need to be assigned to the corresponding registers. The resulting code runs faster than ordinary loop-based linear assembly, with a lower total number of executed instructions and a lower execution time for the core loop, so code execution efficiency is improved with good effect and reliability.
In addition, this embodiment also provides a memory access address calculation optimization device based on loop stripping, which comprises:
an assembly code analysis program unit, configured to determine the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and to analyze the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
a dependent variable judging program unit, configured to, if the initial value and step size of each dependent variable are known values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstruct the base address calculation expression and the offset calculation expression into a sum or difference of multiple factors, each factor containing at most one dependent variable, and jump to the expression replacement program unit; otherwise, end and exit;
an expression replacement program unit, configured to, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substitute the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoist the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body and represent each with an intermediate variable; and finally replace the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
In addition, this embodiment also provides a memory access address calculation optimization device based on loop stripping, which comprises a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to execute the memory access address calculation optimization method based on loop stripping. This embodiment further provides a computer-readable storage medium storing a computer program that is programmed or configured to be executed by a microprocessor to perform the memory access address calculation optimization method based on loop stripping.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. 
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application, and the protection scope of the present application is not limited to the above examples, and all technical solutions belonging to the concept of the present application belong to the protection scope of the present application. It should be noted that modifications and adaptations to the present application may occur to one skilled in the art without departing from the principles of the present application and are intended to be within the scope of the present application.

Claims (10)

1. A memory access address calculation optimization method based on loop stripping, characterized by comprising the following steps:
S101, determining the base address calculation expression and the offset calculation expression in the loop body of the original linear assembly code to be optimized, and analyzing the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and step size of each dependent variable;
S102, if the initial value and step size of each dependent variable are known values that can be recorded, or expressions that do not contain the dependent variable of the current loop, reconstructing the base address calculation expression and the offset calculation expression into a sum or difference of multiple factors, each factor containing at most one dependent variable, and proceeding to the next step; otherwise, ending and exiting;
S103, when every factor in the reconstructed offset calculation expression contains only a single dependent variable, first substituting the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then hoisting the base address calculation expression and the offset calculation expression out of the loop body to a position before the loop body and representing each with an intermediate variable; and finally replacing the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
2. The optimization method according to claim 1, wherein step S103 further comprises adding the original base address calculation expression and the offset calculation expression determined in step S103 when all factors in the reconstructed offset calculation expression only contain a single dependent variable, splitting the added and subtracted multiple factors, merging the factors that do not contain the dependent variable in the present loop, extracting the merged factors to the front of the loop, representing the result by an intermediate variable, and replacing the corresponding part of the intermediate variable in the loop body.
3. The optimization method of claim 2, wherein the step of extracting the memory address before the loop body refers to extracting the memory address from the loop body to a last code block before the loop body.
4. The method according to claim 3, wherein representing the base address calculation expression and the offset calculation expression with intermediate variables in step S103 means that the base address calculation expression is represented by the intermediate variable AR and the offset calculation expression is represented by the intermediate variable OR.
5. The optimization method of claim 4, wherein, after replacing the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables in step S103, the method further comprises replacing the address addr of each variable with the form AR++[OR], where addr is the original address of the variable, AR is the intermediate variable representing the base address calculation expression, and OR is the intermediate variable representing the offset calculation expression.
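A small Python model of the AR++[OR] access form follows. The semantics assumed here, namely that the access uses the current value of AR and AR is then advanced by OR, are an assumption for illustration; the claim itself does not spell out the addressing mode of the target processor. The register dictionary, the word-addressed memory list, and the function name are likewise hypothetical.

```python
def load_post_increment(memory, regs):
    """Model one AR++[OR] load: read at AR, then advance AR by OR."""
    value = memory[regs["AR"]]
    regs["AR"] += regs["OR"]
    return value


if __name__ == "__main__":
    # A 4 x 8 matrix stored row by row in word-addressed memory.
    memory = list(range(100, 132))
    # Walk down column 1: start at word 1, advance one row (8 words) per access.
    regs = {"AR": 1, "OR": 8}
    column = [load_post_increment(memory, regs) for _ in range(4)]
    print(column)
```

Under this assumption the loop body contains no explicit address arithmetic at all, since the update of AR is part of the access itself.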
6. The optimization method of claim 1, further comprising, before step S101, the step of lowering the C-like language code corresponding to the original linear assembly code to be optimized to the linear assembly language level, so as to obtain the original linear assembly code to be optimized.
7. The memory access address calculation optimization method based on cyclic stripping according to claim 6, further comprising, after step S103, the step of converting the optimized linear assembly code into assembly code and compiling the assembly code to obtain an executable program.
8. A memory access address calculation optimization device based on cyclic stripping, characterized by comprising:
an assembly code analysis program unit, configured to determine the base address calculation expression and the offset calculation expression in a loop body of original linear assembly code to be optimized, and to analyze the loop body to obtain the dependent variables used in the base address calculation expression and the offset calculation expression, together with the initial value and the step size of each dependent variable;
a dependent variable judging program unit, configured to, if the initial value and the step size of each dependent variable are determinate values that can be recorded, or expressions that contain no dependent variable of the present loop, reconstruct the base address calculation expression and the offset calculation expression into a form in which a plurality of factors are added and subtracted, each factor containing at most one dependent variable, and jump to execute the expression replacement program unit; otherwise, end and exit;
an expression replacement program unit, configured to, when all factors in the reconstructed offset calculation expression contain only a single dependent variable, first substitute the initial value of the dependent variable into the base address calculation expression and the step size of the dependent variable into the offset calculation expression; then extract the base address calculation expression and the offset calculation expression from the loop body to before the loop body, each represented by an intermediate variable; and finally replace the base address calculation expression and the offset calculation expression in the loop body with the corresponding intermediate variables, so that the base address and the offset no longer need to be calculated inside the loop body.
9. A memory access address calculation optimization device based on cyclic stripping, comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the memory access address calculation optimization method based on cyclic stripping of any one of claims 1 to 7.
10. A computer readable storage medium having a computer program stored therein, wherein the computer program is to be programmed or configured by a microprocessor to perform the memory access address calculation optimization method based on cyclic stripping of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310825932.9A CN116974572A (en) 2023-07-06 2023-07-06 Memory access address calculation optimization method and device based on cyclic stripping

Publications (1)

Publication Number Publication Date
CN116974572A (en) 2023-10-31

Family

ID=88484214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310825932.9A Pending CN116974572A (en) 2023-07-06 2023-07-06 Memory access address calculation optimization method and device based on cyclic stripping

Country Status (1)

Country Link
CN (1) CN116974572A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117850881A (en) * 2024-01-18 2024-04-09 上海芯联芯智能科技有限公司 Instruction execution method and device based on pipelining
CN117850881B (en) * 2024-01-18 2024-06-18 上海芯联芯智能科技有限公司 Instruction execution method and device based on pipelining

Similar Documents

Publication Publication Date Title
US7784039B2 (en) Compiler, compilation method, and compilation program
JP3664473B2 (en) Program optimization method and compiler using the same
US9946523B2 (en) Multiple pass compiler instrumentation infrastructure
CN116974572A (en) Memory access address calculation optimization method and device based on cyclic stripping
US20040025152A1 (en) Compiling method, apparatus, and program
TWI463404B (en) Compiling systems and methods
EP0428560A4 (en) Machine process for translating programs in binary machine language into another binary machine language
US8943484B2 (en) Code generation method and information processing apparatus
Baek et al. A flexible proof format for SAT solver-elaborator communication
CN108197027A (en) Software performance optimization method, can storage medium, computer, computer program
WO2012104907A1 (en) Test data production method for evaluating execution performance of program
CN105988854A (en) Dynamic compilation method and apparatus
Ramirez et al. Trace cache redundancy: Red and blue traces
Su et al. An improvement of trace scheduling for global microcode compaction
US8200469B2 (en) Method for reconstructing statement, and computer system having the function therefor
CN117555548A (en) Code generation method and device and electronic equipment
Bradel et al. Automatic trace-based parallelization of java programs
CN116630040A (en) Intelligent contract transaction rapid execution method based on fine-granularity read-write analysis
CN111309329B (en) Instruction address self-adaptive repositioning method and program compiling method
CN113791770A (en) Code compiler, code compiling method, code compiling system, and computer medium
US20090112568A1 (en) Method for Generating a Simulation Program Which Can Be Executed On a Host Computer
US20050060692A1 (en) Method and apparatus for reducing time to generate a build for a software product from source-files
CN109725904B (en) Low-power-consumption program instruction compiling method and system
JPH02176938A (en) Machine language instruction optimizing system
US20230266950A1 (en) Methods and devices for compiler function fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination