CN117707995A - Optimization device for data pre-reading and operation method - Google Patents

Info

Publication number
CN117707995A
Authority
CN
China
Prior art keywords
instruction
sequence table
instructions
execution
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410145958.3A
Other languages
Chinese (zh)
Other versions
CN117707995B (en
Inventor
陈振华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huilang Times Technology Co Ltd
Original Assignee
Beijing Huilang Times Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huilang Times Technology Co Ltd
Priority to CN202410145958.3A
Publication of CN117707995A
Application granted
Publication of CN117707995B
Legal status: Active
Anticipated expiration

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an optimization device for data pre-reading and an operation method thereof, and relates to the technical field of data pre-reading optimization. The optimization device comprises a processor, a memory and a server; the server is electrically connected with the processor, and the processor is electrically connected with the memory. The invention addresses the technical problem that, when the dependency relationships among instructions cannot be determined, the pre-reading sequence of the data cannot be determined according to the data dependency relationships among the instructions, so that the pre-reading instructions cannot be placed at the optimal position and the data reading rate cannot be improved. The method comprises: obtaining and analyzing the required execution times of each instruction in a loop to obtain an instruction execution index corresponding to each instruction; analyzing the data dependency relationship and the instruction control relationship between each instruction and the other instructions to generate an instruction execution confirmation sequence table; and determining the pre-reading sequence of the data corresponding to each instruction according to that table, so that the position of each instruction is optimized.

Description

Optimization device for data pre-reading and operation method
Technical Field
The invention relates to the technical field of optimization of data pre-reading, in particular to an optimization device and an operation method of data pre-reading.
Background
The optimizing device for data pre-reading adopts an advanced data pre-reading technology, automatically predicts data possibly needed in the future by analyzing the characteristics and the access mode of the data, and loads the data into a cache in advance. When data is accessed, the device can perform quick response according to predicted data, and does not need to wait for a disk or a network to read, so that the delay of data access can be greatly reduced, and the efficiency and the speed of data reading are improved.
Patent publication No. CN111858400A discloses an optimization method and device for data pre-reading, wherein the method comprises the following steps: collecting operation characteristic information of each function in the CFD program during the operation of the CFD program, analyzing the operation characteristic information to determine the function to be optimized, and writing the function to an analysis log; aiming at each function to be optimized in the analysis log, taking the pre-reading scheduling distance and the pre-reading scheduling position of the function as states and taking the change of the optimizing result as an action construction action cost function; for each action cost function, performing iterative training by using a reinforcement learning algorithm with the single step speed of the CFD solver as a reward until the action cost function converges; and determining the optimal pre-reading scheduling distance and pre-reading scheduling position according to the corresponding states, actions and converged action cost functions so as to execute data pre-reading in the cache.
However, in this scheme, when data is pre-read according to the instructions in a program loop, the data dependency relationships between the instructions cannot be determined, so the pre-reading sequence of the data cannot be determined according to those relationships. As a result, the pre-reading instructions cannot be placed at the optimal position, there is no guarantee that the pre-read data will actually be used, the pre-read data cannot be fully utilized, and the data reading rate cannot be improved.
Disclosure of Invention
The invention aims to provide an optimization device for data pre-reading and an operation method thereof, which solve the technical problem that the dependency relationships among instructions cannot be determined, and the pre-reading sequence of data therefore cannot be determined according to the data dependency relationships among the instructions, so that the pre-reading instructions cannot be placed at an optimal position and the data reading rate cannot be improved.
The aim of the invention can be achieved by the following technical scheme:
an optimization apparatus for data pre-reading, comprising:
a processor, a memory, and a server; the server is electrically connected with the processor, the processor is electrically connected with the memory, the server can run the program codes in the loops, the processor processes the code instructions, and the memory is used for storing data;
the instruction execution index generation module is arranged in the server and is used for acquiring and analyzing the required execution times of each instruction in the loop, obtaining the instruction execution index corresponding to each instruction according to the analysis result, and sending the instruction execution indexes to the initial instruction execution sequence table generation module, wherein the required execution times of an instruction refers to the number of times that instruction needs to be executed within the loop, and can be acquired through a sampler;
the initial instruction execution sequence table generation module is arranged in the server and is used for generating an initial instruction execution sequence table according to the instruction execution indexes corresponding to the instructions and sending the initial instruction execution sequence table to the instruction execution confirmation sequence table generation module;
the instruction execution confirmation sequence table generation module is arranged in the processor and is used for analyzing the data dependency relationship and the instruction control relationship between each instruction and other instructions, generating a judgment result value corresponding to each instruction and other instructions, generating an instruction execution confirmation sequence table through the judgment analysis between each instruction and other instructions, and transmitting the instruction execution confirmation sequence table to the execution module;
data dependency relationship refers to the data dependency relationship existing between each instruction and other instructions in the loop, namely, the relationship that the subsequent instruction needs to use the calculation result of the previous instruction;
an instruction control relationship refers to an instruction control relationship existing between each instruction and other instructions in a loop, namely, a relationship that a subsequent instruction needs to be controlled and implemented by a previous instruction, wherein the instruction control relationship comprises forward control and reverse control, the forward control relationship refers to whether the subsequent instruction needs to be controlled and executed by an execution result of the previous instruction, and the reverse control relationship refers to whether the execution of the previous instruction needs to be controlled and executed by an execution result of the subsequent instruction;
and the execution module is arranged in the server and is used for determining the pre-reading sequence of the data corresponding to each instruction according to the sequence of each instruction in the instruction execution confirmation sequence table.
As a further scheme of the invention, the specific mode for obtaining the instruction execution index corresponding to each instruction is as follows:
the required execution times of the instructions in the loop are respectively marked as F1, F2, ..., Fi, wherein i refers to the number of instructions in the loop, and i ≥ 2;
the instruction execution index Zi corresponding to each instruction is then calculated by the formula [formula not reproduced].
As a further scheme of the invention, the specific mode for generating the initial instruction execution sequence table is as follows:
and ordering the instructions according to the sizes of the instruction execution index Zi values corresponding to the instructions from large to small, so as to generate an initial instruction execution sequence table.
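The ordering step can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `initial_sequence_table`, the instruction labels and the Zi values are all hypothetical, and because the formula for Zi is not available here, the sketch takes the Zi values as given inputs.

```python
from typing import Dict, List


def initial_sequence_table(index_by_instruction: Dict[str, float]) -> List[str]:
    """Order instructions by instruction execution index Zi, largest first.

    Python's sort is stable, so instructions with equal Zi keep their
    original relative order (a tie-breaking choice the source does not specify).
    """
    return sorted(
        index_by_instruction,
        key=lambda name: index_by_instruction[name],
        reverse=True,
    )


# Hypothetical Zi values for four instructions in a loop.
zi = {"I1": 0.10, "I2": 0.40, "I3": 0.30, "I4": 0.20}
print(initial_sequence_table(zi))  # ['I2', 'I3', 'I4', 'I1']
```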
As a further scheme of the invention, the specific mode for generating the instruction execution confirmation sequence table is as follows:
S1: selecting the first instruction in the initial instruction execution sequence table as the first specified instruction;
S2: quantizing the data dependency relationship between the first specified instruction and each other instruction located behind it in the instruction execution sequence table as a characteristic value of that instruction, wherein the characteristic value is defined as follows:
the absence of a data dependency relationship between the first specified instruction and the corresponding instruction is represented by '00', and the existence of a data dependency relationship between the first specified instruction and the corresponding instruction is represented by '11';
S3: a corresponding instruction with the characteristic value '00' is not processed; for a corresponding instruction with the characteristic value '11', the control relationship between it and the first specified instruction is judged as follows:
when the corresponding instruction with the characteristic value 11 and the first specified instruction are in a reverse control relationship, the characteristic value 11 of the corresponding instruction is multiplied by -1 to obtain the judgment result value -11 of the corresponding instruction;
when the corresponding instruction with the characteristic value 11 and the first specified instruction are in a forward control relationship, the characteristic value 11 of the corresponding instruction is multiplied by +1 to obtain the judgment result value +11 of the corresponding instruction;
S4: re-determining the position of each instruction relative to the first specified instruction according to the sign of the judgment result value of that instruction and the positions of the first specified instruction and that instruction in the initial instruction execution sequence table, so as to obtain a first instruction execution judgment sequence table;
S5: taking the second instruction in the initial instruction execution sequence table as the second specified instruction, repeating steps S1-S4, and analyzing and judging the data dependency relationship and the instruction control relationship between the second specified instruction and the other instructions located behind it in the instruction execution sequence table, so as to obtain a second instruction execution judgment sequence table;
the remaining instructions are then analyzed and judged one by one in the same way; the analysis stops once the next-to-last instruction in the initial instruction execution sequence table has been analyzed and judged, and the finally obtained instruction execution judgment sequence table is taken as the instruction execution confirmation sequence table.
As a further scheme of the invention, the specific mode for obtaining the first instruction execution judging sequence table is as follows:
when the judgment result value corresponding to the instruction is positive, if the position of the corresponding instruction in the initial instruction execution sequence table is positioned behind the first appointed instruction, the reordering is not needed, and if the position of the corresponding instruction in the initial instruction execution sequence table is positioned in front of the first appointed instruction, the position of the corresponding instruction is inserted behind the position of the first appointed instruction;
when the judgment result value corresponding to the instruction is negative, if the position of the corresponding instruction in the initial instruction execution sequence table is before the first specified instruction, no reordering is needed, and if the position of the corresponding instruction in the initial instruction execution sequence table is after the first specified instruction, the corresponding instruction is inserted before the position of the first specified instruction, whereby the first instruction execution judgment sequence table is obtained.
As a further scheme of the present invention, when each specified instruction is analyzed in step S2, only the data dependency relationship and the control relationship between the other instructions located after each specified instruction and the corresponding specified instruction in the initial instruction execution sequence table are analyzed and judged, and the data dependency relationship and the control relationship between the other instructions located before the corresponding specified instruction in the initial instruction execution sequence table and the corresponding specified instruction are not analyzed.
As a further scheme of the present invention, when the position of each instruction is re-determined in step S4, the instruction execution judgment sequence table obtained from the analysis of the previous specified instruction is used as the basis for judging the positions of the current specified instruction and the corresponding other instructions; the first specified instruction is the exception, for which the initial instruction execution sequence table is used.
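The quantization in steps S2-S3 and the repositioning rule in step S4 can be sketched as two small Python functions. This is an illustrative sketch only: the names `judgment_value`, `reposition`, `FORWARD` and `REVERSE` are hypothetical, '00' is modelled as the integer 0, and "inserted before/behind" is read as "immediately before/behind" the specified instruction.

```python
from typing import List

FORWARD, REVERSE = +1, -1  # hypothetical encodings of the two control relationships


def judgment_value(has_dependency: bool, control: int) -> int:
    """S2-S3: characteristic value 11 (dependency) or 0 (no dependency, '00'),
    multiplied by +1 for forward control or -1 for reverse control."""
    characteristic = 11 if has_dependency else 0
    return characteristic * control


def reposition(table: List[str], specified: str, other: str, value: int) -> List[str]:
    """S4: return a new table in which `other` is moved, if necessary,
    to the side of `specified` indicated by the sign of `value`."""
    result = list(table)
    i_spec, i_other = result.index(specified), result.index(other)
    if value > 0 and i_other < i_spec:
        # Positive value but currently in front: insert right behind.
        result.remove(other)
        result.insert(result.index(specified) + 1, other)
    elif value < 0 and i_other > i_spec:
        # Negative value but currently behind: insert right in front.
        result.remove(other)
        result.insert(result.index(specified), other)
    return result  # value == 0 ('00') leaves the table unchanged


value = judgment_value(True, REVERSE)  # -11
print(reposition(["I1", "I2", "I3", "I4"], "I1", "I3", value))
# ['I3', 'I1', 'I2', 'I4']
```

Returning a new list rather than mutating the input keeps each intermediate judgment sequence table available for inspection.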
A method of operating an optimization device for pre-reading data, comprising the steps of:
step one: the method comprises the steps of obtaining and analyzing the required execution times of each instruction in a cycle, and obtaining instruction execution indexes corresponding to each instruction according to analysis results;
step two: generating an initial instruction execution sequence table according to the instruction execution index corresponding to each instruction;
step three: analyzing the data dependency relationship and the instruction control relationship between each instruction and other instructions, generating a judgment result value corresponding to each instruction and other instructions, and generating an instruction execution confirmation sequence table through judgment analysis between each instruction and other instructions;
step four: and determining the pre-reading sequence of the data corresponding to each instruction according to the sequence of each instruction in the instruction execution confirmation sequence table.
The invention has the beneficial effects that:
according to the invention, the required execution times of the instructions in the loop are acquired and analyzed to obtain the instruction execution index corresponding to each instruction; the data dependency relationship and the instruction control relationship between each instruction and the other instructions are analyzed to generate the instruction execution confirmation sequence table; and the pre-reading sequence of the data corresponding to each instruction is determined according to the order of the instructions in the instruction execution confirmation sequence table. The position of each instruction is thereby optimized, the dependent data is guaranteed to be available when data is pre-read, and the data that will be used is read from the memory into the cache in advance, so that the pre-read data can be fully utilized, the data access time is shortened, the performance of the program is optimized, and the data reading rate is improved.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a frame structure of an optimizing apparatus for data pre-reading according to the present invention;
FIG. 2 is a schematic block diagram of a method framework of an operation method of an optimizing apparatus for data pre-reading according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to FIGS. 1-2, the invention is an optimization device for data pre-reading, which includes a processor, a memory and a server; the server is electrically connected with the processor, and the processor is electrically connected with the memory;
the server is used for running the program code in the loop, the processor processes the code instructions, and the memory is used for storing data; the loop refers to a loop body or loop block in which a plurality of instructions are executed in a certain order, which belongs to the prior art and is not described in detail herein;
the instruction execution index generation module is arranged in the server and is used for acquiring and analyzing the required execution times of each instruction in the loop, obtaining the instruction execution index corresponding to each instruction according to the analysis result, and sending the instruction execution indexes to the initial instruction execution sequence table generation module, wherein the required execution times of an instruction refers to the number of times that instruction needs to be executed within the loop, and can be acquired through a sampler; the instruction execution index corresponding to each instruction is generated as follows:
the required execution times of the instructions in the loop are respectively marked as F1, F2, ..., Fi, wherein i refers to the number of instructions in the loop, and i ≥ 2;
the instruction execution index Zi corresponding to each instruction is then calculated by the formula [formula not reproduced];
the initial instruction execution sequence table generation module is arranged in the server and is used for generating an initial instruction execution sequence table according to the instruction execution indexes corresponding to the instructions and sending the initial instruction execution sequence table to the instruction execution confirmation sequence table generation module, wherein the specific mode for generating the initial instruction execution sequence table is as follows:
sequencing all the instructions according to the size of the Zi value of the instruction execution index corresponding to each instruction from large to small, and further generating an initial instruction execution sequence table;
the initial instruction execution sequence table carries out initial sequencing on each instruction according to the instruction execution index Zi, so that the instruction with higher instruction execution index preferentially carries out data pre-reading, and the data access efficiency is improved;
the instruction execution confirmation sequence table generation module is arranged in the processor and is used for analyzing the data dependency relationship and the instruction control relationship between each instruction and other instructions, generating a judgment result value corresponding to each instruction and other instructions, generating an instruction execution confirmation sequence table through the judgment analysis between each instruction and other instructions, and sending the instruction execution confirmation sequence table to the execution module, wherein the specific mode for generating the instruction execution confirmation sequence table is as follows:
the data dependency relationship refers to a data dependency relationship existing between each instruction and other instructions in the loop, namely, a relationship that a subsequent instruction needs to use a calculation result of a previous instruction, and the data dependency relationship is used for helping to determine which data needs to be pre-read and ensuring the availability of the data when needed;
an instruction control relationship refers to an instruction control relationship existing between each instruction and other instructions in a loop, namely, a relationship that a subsequent instruction needs to be controlled and implemented by a previous instruction, wherein the instruction control relationship comprises forward control and reverse control, the forward control relationship refers to whether the subsequent instruction needs to be controlled and executed by an execution result of the previous instruction, and the reverse control relationship refers to whether the execution of the previous instruction needs to be controlled and executed by an execution result of the subsequent instruction;
it should be noted that, the data dependency relationship and the instruction control relationship are obtained through instruction analysis and static code analysis, which belong to the prior art and are not described in detail;
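The source treats dependency detection as prior art and gives no procedure for it. As a rough illustration of the definition used here (a subsequent instruction needs the calculation result of a previous one), the following Python sketch assumes each instruction is described by the sets of variables it reads and writes; the `Instruction` class, its fields and `has_data_dependency` are hypothetical names, and real static analysis is considerably more involved.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction summary: the variables it reads and writes."""
    name: str
    reads: FrozenSet[str] = frozenset()
    writes: FrozenSet[str] = frozenset()


def has_data_dependency(earlier: Instruction, later: Instruction) -> bool:
    """True when `later` reads something that `earlier` writes,
    i.e. it needs the calculation result of the previous instruction."""
    return bool(earlier.writes & later.reads)


i1 = Instruction("I1", reads=frozenset({"x"}), writes=frozenset({"a"}))
i2 = Instruction("I2", reads=frozenset({"y"}), writes=frozenset({"b"}))
i3 = Instruction("I3", reads=frozenset({"a"}), writes=frozenset({"c"}))

print(has_data_dependency(i1, i2))  # False
print(has_data_dependency(i1, i3))  # True
```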
S1: selecting the first instruction in the initial instruction execution sequence table as the first specified instruction;
S2: quantizing the data dependency relationship between the first specified instruction and each other instruction located behind it in the instruction execution sequence table as a characteristic value of that instruction, wherein the characteristic value is defined as follows:
the absence of a data dependency relationship between the first specified instruction and the corresponding instruction is represented by '00', and the existence of a data dependency relationship between the first specified instruction and the corresponding instruction is represented by '11';
S3: a corresponding instruction with the characteristic value '00' is not processed; for a corresponding instruction with the characteristic value '11', the control relationship between it and the first specified instruction is judged as follows:
when the corresponding instruction with the characteristic value 11 and the first specified instruction are in a reverse control relationship, the characteristic value 11 of the corresponding instruction is multiplied by -1 to obtain the judgment result value -11 of the corresponding instruction;
when the corresponding instruction with the characteristic value 11 and the first specified instruction are in a forward control relationship, the characteristic value 11 of the corresponding instruction is multiplied by +1 to obtain the judgment result value +11 of the corresponding instruction;
S4: re-determining the position of each instruction relative to the first specified instruction according to the sign of the judgment result value of that instruction and the positions of the first specified instruction and that instruction in the initial instruction execution sequence table, so as to obtain a first instruction execution judgment sequence table, wherein the first instruction execution judgment sequence table is obtained as follows:
when the judgment result value corresponding to the instruction is positive, if the position of the corresponding instruction in the initial instruction execution sequence table is behind the first specified instruction, no reordering is needed, and if the position of the corresponding instruction in the initial instruction execution sequence table is in front of the first specified instruction, the corresponding instruction is inserted behind the position of the first specified instruction;
when the judgment result value corresponding to the instruction is negative, if the position of the corresponding instruction in the initial instruction execution sequence table is before the first specified instruction, no reordering is needed, and if the position of the corresponding instruction in the initial instruction execution sequence table is after the first specified instruction, the corresponding instruction is inserted before the position of the first specified instruction, whereby the first instruction execution judgment sequence table is obtained;
S5: taking the second instruction in the initial instruction execution sequence table as the second specified instruction, repeating steps S1-S4, and analyzing and judging the data dependency relationship and the instruction control relationship between the second specified instruction and the other instructions located behind it in the instruction execution sequence table, so as to obtain a second instruction execution judgment sequence table;
the remaining instructions are then analyzed and judged one by one in the same way; the analysis stops once the next-to-last instruction in the initial instruction execution sequence table has been analyzed and judged, and the finally obtained instruction execution judgment sequence table is taken as the instruction execution confirmation sequence table;
it should be noted that, when analyzing each specified instruction in step S2, only the data dependency relationship and the control relationship between other instructions located after each specified instruction and the corresponding specified instruction in the initial instruction execution sequence are analyzed and judged, and the data dependency relationship and the control relationship between other instructions located before the corresponding specified instruction in the initial instruction execution sequence and the corresponding specified instruction are not analyzed;
when the position of each instruction is re-determined in step S4, the instruction execution judgment sequence table obtained from the analysis of the previous specified instruction is used as the basis for judging the positions of the current specified instruction and the corresponding other instructions; the first specified instruction is the exception, for which the initial instruction execution sequence table is used;
the execution module is arranged in the server and is used for determining the pre-reading sequence of the data corresponding to each instruction according to the order of the instructions in the instruction execution confirmation sequence table, so that the data that will be used can be pre-read from the memory into the cache in advance according to the dependency relationships among the instructions, which reduces the subsequent data access time and improves the subsequent data reading rate;
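How a confirmed instruction order might translate into a pre-reading order can be sketched as follows. The source does not describe this mapping in detail, so everything here is an assumption: `pre_read_order` and `data_of` are hypothetical names, each instruction is assumed to be associated with a list of data block identifiers, and skipping blocks that were already scheduled is a design choice of this sketch rather than something stated in the text.

```python
from typing import Dict, List


def pre_read_order(confirmation_table: List[str],
                   data_of: Dict[str, List[str]]) -> List[str]:
    """Walk the confirmation sequence table in order and list the data blocks
    of each instruction, skipping blocks that were already scheduled."""
    seen = set()
    order: List[str] = []
    for instruction in confirmation_table:
        for block in data_of.get(instruction, []):
            if block not in seen:
                seen.add(block)
                order.append(block)
    return order


# Hypothetical data blocks touched by each instruction.
data_of = {"I1": ["a", "b"], "I2": ["b"], "I3": ["c", "a"], "I4": ["d"]}
print(pre_read_order(["I3", "I1", "I2", "I4"], data_of))  # ['c', 'a', 'b', 'd']
```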
the above steps S1 to S5 are illustrated: when the instructions in the initial instruction execution sequence table are ordered into an instruction 1, an instruction 2, an instruction 3 and an instruction 4, preferentially selecting the instruction 1 as a first specified instruction, wherein no data dependency relationship between the instruction 2 and the first specified instruction generates a corresponding characteristic value quantization result of '00', and the data dependency relationship between the instruction 3 and the first specified instruction generates a corresponding characteristic value quantization result of '11'; the data dependency relationship between the instruction 4 and the first appointed instruction generates a corresponding characteristic value quantization result of 11;
determining control relationships between the instructions 3 and 4 and the first specified instructions, respectively:
when the instruction 3 and the first appointed instruction are in a reverse control relation, multiplying the characteristic value quantization result 11 by-1 to obtain a judgment result value 11 of the instruction 3; when the instruction 4 and the first appointed instruction have a forward control relation, multiplying the characteristic value quantization result 11 by +1 to obtain a judgment result value +11 of the instruction 4;
because the judgment result value of the instruction 4 is positive and the position in the initial instruction execution sequence table is positioned behind the first appointed instruction, the reordering is not needed;
because the judgment result value of the instruction 3 is negative, and the position in the initial instruction execution sequence table is located after the first specified instruction, inserting the position of the instruction 3 into the position before the first specified instruction, and finally obtaining a first instruction execution judgment sequence table as follows: instruction 3, instruction 1, instruction 2, instruction 4;
then, following the order in the initial instruction execution sequence table, instruction 2 is taken as the second specified instruction, and the dependency relationship and the control relationship between the second specified instruction and instructions 3 and 4 are judged to obtain a judgment result value of '-11' for instruction 3 and a judgment result value of '+11' for instruction 4;
because the judgment result value of instruction 4 is positive and its position in the first instruction execution judgment sequence table is after the second specified instruction, no reordering is needed;
because the judgment result value of instruction 3 is negative and its position in the first instruction execution judgment sequence table is before the second specified instruction, no reordering is needed, and the second instruction execution judgment sequence table is finally obtained as: instruction 3, instruction 1, instruction 2, instruction 4;
then, following the order in the initial instruction execution sequence table, instruction 3 is taken as the third specified instruction, and the dependency relationship and the control relationship between the third specified instruction and instruction 4 are judged to obtain a judgment result value of '+11' for instruction 4;
because the judgment result value of instruction 4 is positive and its position in the second instruction execution judgment sequence table is after the third specified instruction, no reordering is needed, and the third instruction execution judgment sequence table is finally obtained as: instruction 3, instruction 1, instruction 2, instruction 4; the third instruction execution judgment sequence table is then exported and marked as the instruction execution confirmation sequence table;
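The worked example above can be reproduced with a minimal sketch. The function name `confirm_order`, the integer encoding of the characteristic values (11 and 0 standing for '11' and '00'), and the input structures `dependent` and `control` are assumptions made for illustration only; the sketch also assumes that "inserted before" and "inserted after" mean directly adjacent to the specified instruction.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]  # (specified instruction, later instruction)


def confirm_order(initial: List[int],
                  dependent: Set[Pair],
                  control: Dict[Pair, int]) -> List[int]:
    """Return the instruction execution confirmation sequence table.

    initial   -- initial instruction execution sequence table
    dependent -- pairs that have a data dependency (characteristic value '11')
    control   -- +1 for forward control, -1 for reverse control (hypothetical input)
    """
    table = list(initial)
    # The last instruction has no later instruction, so it is never specified.
    for idx, spec in enumerate(initial[:-1]):
        # Only instructions after the specified one in the INITIAL table are judged.
        for other in initial[idx + 1:]:
            feature = 11 if (spec, other) in dependent else 0
            if feature == 0:
                continue  # characteristic value '00': not processed
            result = feature * control[(spec, other)]  # +11 or -11
            spec_pos, other_pos = table.index(spec), table.index(other)
            if result > 0 and other_pos < spec_pos:
                table.remove(other)
                table.insert(table.index(spec) + 1, other)  # directly after
            elif result < 0 and other_pos > spec_pos:
                table.remove(other)
                table.insert(table.index(spec), other)  # directly before
    return table


# Relationships taken from the worked example in the text.
dependent = {(1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}
control = {(1, 3): -1, (1, 4): +1, (2, 3): -1, (2, 4): +1, (3, 4): +1}
print(confirm_order([1, 2, 3, 4], dependent, control))  # [3, 1, 2, 4]
```

Positions are looked up in the current judgment table rather than in the initial table, so each pass builds on the result of the previous one.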
A method of operating an optimization device for data pre-reading comprises the following steps:
step one: obtaining and analyzing the required number of executions of each instruction in the loop, and obtaining the instruction execution index corresponding to each instruction according to the analysis result;
step two: generating an initial instruction execution sequence table according to the instruction execution index corresponding to each instruction;
step three: analyzing the data dependency relationship and the instruction control relationship between each instruction and the other instructions, generating the judgment result value corresponding to each instruction and the other instructions, and generating an instruction execution confirmation sequence table through the judgment analysis between each instruction and the other instructions;
step four: determining the pre-reading sequence of the data corresponding to each instruction according to the order of the instructions in the instruction execution confirmation sequence table.
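The flow of steps two to four can be outlined as follows. This is only a sketch under stated assumptions: the instruction execution index values are taken as given inputs, step three is passed in as a caller-supplied function `confirm`, and the names `initial_sequence_table`, `pre_read_order`, `plan` and `data_of` are hypothetical.

```python
from typing import Callable, Dict, List


def initial_sequence_table(exec_index: Dict[str, int]) -> List[str]:
    # Step two: order instructions by execution index, largest first.
    return sorted(exec_index, key=lambda name: exec_index[name], reverse=True)


def pre_read_order(confirmed: List[str], data_of: Dict[str, str]) -> List[str]:
    # Step four: data is pre-read in the order of the confirmed table.
    return [data_of[name] for name in confirmed]


def plan(exec_index: Dict[str, int],
         data_of: Dict[str, str],
         confirm: Callable[[List[str]], List[str]] = lambda table: table) -> List[str]:
    initial = initial_sequence_table(exec_index)
    confirmed = confirm(initial)  # Step three, supplied by the caller
    return pre_read_order(confirmed, data_of)


# Hypothetical index values and instruction-to-data mapping.
exec_index = {"i3": 20, "i1": 40, "i4": 10, "i2": 30}
data_of = {"i1": "A", "i2": "B", "i3": "C", "i4": "D"}
print(plan(exec_index, data_of))  # ['A', 'B', 'C', 'D']
```

Passing the reordering as a function keeps the sketch independent of how the judgment analysis of step three is actually carried out.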
Example two
As embodiment two of the present invention, the technical solution of this embodiment differs from embodiment one in that the number of times each data item in the memory area is read and written in the loop is quantitatively analyzed, the pre-reading index of each data item is calculated, and the data to be pre-read in the memory area is determined according to the pre-reading index of each data item. The specific manner of determining the data to be pre-read in the memory area is as follows:
the memory area is arranged in the memory and is used for storing data;
the number of reads and the number of writes of each data item in the memory area are marked as Di and Xi respectively, where i refers to the number of instructions in the loop and i ≥ 1;
it should be noted that, when the number of reads and the number of writes of each data item in the memory area are obtained, the corresponding time period is the 15 days preceding the time at which the data is obtained, and the day on which the data is obtained is not counted;
the number of reads and the number of writes of each data item can be recorded by a counter or a sampler;
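As an illustration of the counting window, the following sketch tallies reads and writes per data item. It assumes that the window covers the 15 calendar days before the acquisition day and excludes that day; the event tuple layout and the function name `count_accesses` are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Iterable, Tuple

Event = Tuple[date, str, str]  # (day, data id, "r" for read or "w" for write)


def count_accesses(events: Iterable[Event],
                   acquired_on: date,
                   window_days: int = 15) -> Tuple[Counter, Counter]:
    """Return (reads, writes) per data id inside the look-back window."""
    start = acquired_on - timedelta(days=window_days)
    reads: Counter = Counter()
    writes: Counter = Counter()
    for day, data_id, kind in events:
        # Keep events from `start` up to, but not including, the acquisition day.
        if start <= day < acquired_on:
            (reads if kind == "r" else writes)[data_id] += 1
    return reads, writes


events = [
    (date(2024, 2, 1), "a", "r"),   # first day of the window: counted
    (date(2024, 2, 10), "a", "w"),  # inside the window: counted
    (date(2024, 2, 15), "b", "r"),  # last day of the window: counted
    (date(2024, 2, 16), "a", "r"),  # acquisition day: not counted
    (date(2024, 1, 31), "a", "w"),  # before the window: not counted
]
reads, writes = count_accesses(events, acquired_on=date(2024, 2, 16))
```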
the pre-reading index Qi of each data item is calculated by the formula Di × β1 + Xi × β2 = Qi, the data items are sorted from large to small according to the value of the corresponding pre-reading index, and the data item corresponding to the maximum value of the pre-reading index is marked as priority read data, where β1 and β2 are both fixed coefficients whose specific values are set empirically by the relevant personnel;
when a plurality of pre-reading instructions are received at the same time, the pre-reading instruction corresponding to the priority read data is judged for execution first, so that data pre-reading and optimization can be carried out within the loop and the execution efficiency of the program is improved;
by analyzing the number of reads and the number of writes of each data item, the data in the memory area is sorted and marked, and the data item corresponding to the maximum value of the pre-reading index is marked as priority read data, so that the priority read data is read first and the execution efficiency of the program is improved.
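The calculation of the pre-reading index and the selection of the priority read data can be sketched as follows. The coefficient values β1 = 2 and β2 = 1 and the read/write counts are hypothetical, since the text leaves the coefficients to be set empirically; the function names are likewise illustrative.

```python
from typing import Dict, List, Tuple


def pre_read_index(reads: int, writes: int, beta1: int, beta2: int) -> int:
    # Qi = Di × β1 + Xi × β2
    return reads * beta1 + writes * beta2


def rank_data(counts: Dict[str, Tuple[int, int]],
              beta1: int = 2,
              beta2: int = 1) -> List[str]:
    """Sort data ids by pre-reading index, largest first."""
    q = {name: pre_read_index(d, x, beta1, beta2)
         for name, (d, x) in counts.items()}
    return sorted(q, key=lambda name: q[name], reverse=True)


# Hypothetical (Di, Xi) counts per data item.
counts = {"a": (10, 2), "b": (3, 20), "c": (5, 5)}
ranked = rank_data(counts)
priority_read_data = ranked[0]
print(ranked, priority_read_data)  # ['b', 'a', 'c'] b
```

With these values the indexes are 22 for a, 26 for b and 15 for c, so b is marked as the priority read data.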
Example three
As embodiment three of the present invention, the technical solution of this embodiment combines embodiment one and embodiment two.
The working principle of the invention is as follows: the required number of executions of each instruction in the loop is obtained and analyzed to obtain the instruction execution index corresponding to each instruction, and an initial instruction execution sequence table is generated according to these indexes. The data dependency relationship and the instruction control relationship between each instruction and the other instructions are then analyzed to generate the corresponding judgment result values, and an instruction execution confirmation sequence table is generated through this judgment analysis. The pre-reading sequence of the data corresponding to each instruction is determined according to the order of the instructions in the instruction execution confirmation sequence table. Optimizing the positions of the instructions in this way optimizes the performance of the program, ensures that dependent data is available when data is pre-read and that the pre-read data can be used when needed, and allows data that will be used to be read from the memory into the cache in advance.
The above formulas are all dimensionless formulas used for numerical calculation; each formula is the one closest to the real situation, obtained by software simulation over a large amount of collected data, and the preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation.
The foregoing is merely specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. An optimizing apparatus for data pre-reading, comprising:
a processor, a memory, and a server; the server is electrically connected with the processor, the processor is electrically connected with the memory, the server can run the program codes in the loops, the processor processes the code instructions, and the memory is used for storing data;
the instruction execution index generation module is arranged in the server and is used for acquiring and analyzing the required number of executions of each instruction in the loop, obtaining the instruction execution index corresponding to each instruction according to the analysis result, and sending the instruction execution indexes to the initial instruction execution sequence table generation module, wherein the required number of executions of an instruction refers to the number of times that the instruction needs to be executed in the loop, and can be acquired through a sampler;
the initial instruction execution sequence table generation module is arranged in the server and is used for generating an initial instruction execution sequence table according to the instruction execution indexes corresponding to the instructions and sending the initial instruction execution sequence table to the instruction execution confirmation sequence table generation module;
the instruction execution confirmation sequence table generation module is arranged in the processor and is used for analyzing the data dependency relationship and the instruction control relationship between each instruction and other instructions, generating a judgment result value corresponding to each instruction and other instructions, generating an instruction execution confirmation sequence table through the judgment analysis between each instruction and other instructions, and transmitting the instruction execution confirmation sequence table to the execution module;
data dependency relationship refers to the data dependency relationship existing between each instruction and other instructions in the loop, namely, the relationship that the subsequent instruction needs to use the calculation result of the previous instruction;
instruction control relation refers to the instruction control relation existing between each instruction and other instructions in the loop, namely the relation that the subsequent instruction needs to be controlled and implemented through the previous instruction, wherein the instruction control relation comprises forward control and reverse control;
and the execution module is arranged in the server and is used for determining the pre-reading sequence of the data corresponding to each instruction according to the sequence of each instruction in the instruction execution confirmation sequence table.
2. The optimizing device for data pre-reading according to claim 1, wherein the specific manner of obtaining the instruction execution index corresponding to each instruction is:
the required execution times of all instructions in the loop are respectively marked as F1, F2, … … and Fi, wherein i refers to the number of the instructions in the loop, and i is more than or equal to 2;
calculating, by the formula (formula image not reproduced in this text), the instruction execution index Zi corresponding to each instruction.
3. The optimizing device for data pre-reading according to claim 2, wherein the specific way of generating the initial instruction execution sequence table is:
and ordering the instructions according to the sizes of the instruction execution index Zi values corresponding to the instructions from large to small, so as to generate an initial instruction execution sequence table.
4. The optimizing device for data pre-reading according to claim 3, wherein the specific manner of generating the instruction execution confirmation sequence table is:
S1: selecting the first instruction in the initial instruction execution sequence table as a first specified instruction;
S2: defining the data dependency relationship between the first specified instruction and each of the other instructions located after the first specified instruction in the instruction execution sequence table as a characteristic value of the corresponding instruction, wherein the characteristic value is defined as follows:
the absence of a data dependency relationship between the first specified instruction and the corresponding instruction is represented by '00', and the presence of a data dependency relationship between the first specified instruction and the corresponding instruction is represented by '11';
S3: a corresponding instruction with the characteristic value '00' is not processed, and the control relation between a corresponding instruction with the characteristic value '11' and the first specified instruction is judged, in the following manner:
when the corresponding instruction with the characteristic value '11' and the first specified instruction are in a reverse control relation, the characteristic value '11' of the corresponding instruction is multiplied by -1 to obtain a judgment result value of '-11' for the corresponding instruction;
when the corresponding instruction with the characteristic value '11' and the first specified instruction are in a forward control relation, the characteristic value '11' of the corresponding instruction is multiplied by +1 to obtain a judgment result value of '+11' for the corresponding instruction;
S4: re-determining the positional relationship between each instruction and the first specified instruction according to the sign of the judgment result value of each instruction and the positional relationship between the first specified instruction and each instruction in the initial instruction execution sequence table, so as to obtain a first instruction execution judgment sequence table;
S5: taking the second instruction in the initial instruction execution sequence table as a second specified instruction, repeating steps S1 to S4, and analyzing and judging the data dependency relationship and the instruction control relationship between the second specified instruction and the other instructions located after it in the instruction execution sequence table, so as to obtain a second instruction execution judgment sequence table;
the remaining subsequent instructions are then analyzed and judged one by one, the judgment and analysis stop after the next-to-last instruction in the execution sequence table has been analyzed and judged, and the finally obtained instruction execution judgment sequence table is taken as the instruction execution confirmation sequence table.
5. The optimizing apparatus for pre-reading data according to claim 4, wherein the specific manner of obtaining the first instruction execution judgment sequence table is:
when the judgment result value corresponding to the instruction is positive, if the position of the corresponding instruction in the initial instruction execution sequence table is positioned behind the first appointed instruction, the reordering is not needed, and if the position of the corresponding instruction in the initial instruction execution sequence table is positioned in front of the first appointed instruction, the position of the corresponding instruction is inserted behind the position of the first appointed instruction;
when the judgment result value corresponding to the instruction is negative, if the position of the corresponding instruction in the initial instruction execution sequence table is before the first specified instruction, no reordering is needed, and if the position of the corresponding instruction in the initial instruction execution sequence table is after the first specified instruction, the corresponding instruction is inserted before the position of the first specified instruction, whereby the first instruction execution judgment sequence table is obtained.
6. The optimizing apparatus for pre-reading data according to claim 5, wherein in the analyzing of each of the specified instructions in step S2, only the data dependency and the control relationship between the other instructions located after each of the specified instructions in the initial instruction execution sequence table and the corresponding specified instructions are analyzed and judged, and the data dependency and the control relationship between the other instructions located before the corresponding specified instructions in the initial instruction execution sequence table and the corresponding specified instructions are not analyzed.
7. The optimizing apparatus for pre-reading data according to claim 5, wherein in step S4, when the positions of the respective instructions are redetermined, the corresponding instruction execution judgment sequence table obtained by analyzing the previous instruction is used as a basis for analyzing and judging the positional relationship between the corresponding specified instruction and the corresponding other respective instructions, and the case when the first specified instruction is analyzed is excluded.
8. A method of operating an optimization device for pre-reading data, comprising the steps of:
step one: obtaining and analyzing the required number of executions of each instruction in the loop, and obtaining the instruction execution index corresponding to each instruction according to the analysis result;
step two: generating an initial instruction execution sequence table according to the instruction execution index corresponding to each instruction;
step three: analyzing the data dependency relationship and the instruction control relationship between each instruction and other instructions, generating a judgment result value corresponding to each instruction and other instructions, and generating an instruction execution confirmation sequence table through judgment analysis between each instruction and other instructions;
step four: and determining the pre-reading sequence of the data corresponding to each instruction according to the sequence of each instruction in the instruction execution confirmation sequence table.
CN202410145958.3A 2024-02-02 Optimization device for data pre-reading and operation method Active CN117707995B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410145958.3A CN117707995B (en) 2024-02-02 Optimization device for data pre-reading and operation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410145958.3A CN117707995B (en) 2024-02-02 Optimization device for data pre-reading and operation method

Publications (2)

Publication Number Publication Date
CN117707995A true CN117707995A (en) 2024-03-15
CN117707995B CN117707995B (en) 2024-04-26

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5704053A (en) * 1995-05-18 1997-12-30 Hewlett-Packard Company Efficient explicit data prefetching analysis and code generation in a low-level optimizer for inserting prefetch instructions into loops of applications
US6981129B1 (en) * 2000-11-02 2005-12-27 Intel Corporation Breaking replay dependency loops in a processor using a rescheduled replay queue
CN108027767A (en) * 2015-09-19 2018-05-11 微软技术许可有限责任公司 Register read/write-in sequence
CN111796869A (en) * 2020-09-07 2020-10-20 华夏芯(北京)通用处理器技术有限公司 Program instruction block processing method and device
CN117348935A (en) * 2023-10-11 2024-01-05 腾讯科技(深圳)有限公司 Instruction processing method, apparatus, computer device and storage medium

Similar Documents

Publication Publication Date Title
CN109242105B (en) Code optimization method, device, equipment and medium
JP2024023651A5 (en) Computer systems and computer programs for machine learning
CN109891438B (en) Numerical quantum experiment method and system
CN108595815B (en) Artificial intelligence body training system and passive circuit optimization design system and method
CN110704336B (en) Data caching method and device
KR102161192B1 (en) Method and apparatus for data mining from core trace
CN109242099A (en) Training method, device, training equipment and the storage medium of intensified learning network
CN111191879A (en) Comprehensive evaluation method and system
CN113792920A (en) Hospital treatment sequence optimization method and device for single-examination room
CN112200310B (en) Intelligent processor, data processing method and storage medium
CN117707995B (en) Optimization device for data pre-reading and operation method
CN117707995A (en) Optimization device for data pre-reading and operation method
CN107967335B (en) Distributed SQL processing method and system
CN110888909B (en) Data statistical processing method and device for evaluation content
CN114492251B (en) Low-speed flow field divergence processing method, device, equipment and medium in supercomputing environment
CN116382622A (en) Tensor data processing method based on tensor calculation core and tensor calculation core
CN114021733B (en) Model training optimization method, device, computer equipment and storage medium
CN111552652B (en) Data processing method and device based on artificial intelligence chip and storage medium
CN111523685B (en) Method for reducing performance modeling overhead based on active learning
CN114201369A (en) Server cluster management method and device, electronic equipment and storage medium
CN106933665A (en) The method for predicting MPI program runtimes
CN117195568B (en) Simulation engine performance analysis method and device based on discrete event
CN110298742B (en) Data processing method and device
CN110347506B (en) Data processing method and device based on LSTM, storage medium and electronic equipment
CN113835984B (en) Many-core application performance evaluation method based on domestic super-computing micro-architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant