CN110178123B - Performance index evaluation method and device - Google Patents

Performance index evaluation method and device

Info

Publication number: CN110178123B
Application number: CN201780083763.9A
Authority: CN (China)
Original language: Chinese (zh)
Other versions: CN110178123A
Inventors: 程捷, 朱冠宇, 赵俊峰
Assignee: Huawei Technologies Co., Ltd.
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment

Abstract

A performance index evaluation method and apparatus are provided. The method comprises the following steps: acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments (301); counting the jump probability between every two basic blocks in each of the N instruction segments to form N jump matrices with M rows and M columns (302); converting the N M×M jump matrices into a first feature matrix A with M² rows and N columns (303); selecting p column vectors from the first feature matrix A and combining them into a second feature matrix A_p with M² rows and p columns (304); sending the p corresponding instruction fragments to a simulator and receiving the simulation results of the p instruction fragments from the simulator to determine a performance indicator vector C_p of the p instruction fragments (305); and determining a performance indicator vector C of the N instruction fragments according to the performance indicator vector C_p, the first feature matrix A, and the second feature matrix A_p (306). With this method, the performance indicator of each instruction fragment of the test program can be effectively evaluated.

Description

Performance index evaluation method and device
Technical Field
The present application relates to the field of computers, and more particularly, to a performance index evaluation method and apparatus.
Background
Personnel engaged in the design and development of processor architectures often need to run a test program in a simulator of a given architecture and then collect the relevant performance indicators, such as Instructions Per Cycle (IPC), Level 2 cache (L2 Cache) hit rate, and energy consumption, so as to find the bottleneck of the current processor architecture. After the architecture is improved, the design is redeployed in the simulator, the test program is run again, data are collected, the performance of the same test program under the new and old architectures is compared, and the bottleneck is then sought again. It can be seen that a great deal of design and test work is done with a software simulator before deployment in hardware.
However, one of the major disadvantages of a software simulation platform is that the runtime of the same test program is much longer than on a hardware platform. Especially when running large, comprehensive test suites, such as SPEC CPU 2006, it often takes weeks or even months to obtain the needed data. Moreover, after every change to the architecture, the test program needs to be run again to collect data under the new architecture. Such repeated runs and waits seriously affect development efficiency.
Disclosure of Invention
The application discloses a performance index evaluation method and apparatus. Based on a linear transformation between local simulation results and local jump matrices, the performance indicator of each instruction segment in an instruction stream is evaluated by simulating only selected instruction segments. This enables effective evaluation of the performance indicator of each instruction segment of a test program, saves simulation cost, and shortens simulation time.
In a first aspect, an embodiment of the present application provides a performance index evaluation method, including: obtaining an instruction stream and dividing it into N instruction segments; counting the jump probability between every two basic blocks in each of the N instruction segments to form N jump matrices with M rows and M columns, where M is the number of basic block types; converting the N M×M jump matrices into a first feature matrix A with M² rows and N columns, where each column of the first feature matrix A represents the jump matrix of one instruction fragment; selecting p column vectors from the first feature matrix A and combining them into a second feature matrix A_p with M² rows and p columns, where the p column vectors represent the jump matrices of p instruction fragments; sending the p instruction fragments to a simulator and receiving the simulation results of the p instruction fragments from the simulator to determine a performance indicator vector C_p of the p instruction fragments; and determining a performance indicator vector C of the N instruction segments according to the first feature matrix A and an indicator contribution vector Y, where the indicator contribution vector Y represents the linear relationship between the performance indicator vector C_p and the second feature matrix A_p.
By simulating only a selected small subset of the instruction fragments, the linear relationship between the simulation results of these fragments and the second feature matrix, namely the indicator contribution vector Y, is obtained. In theory, this linear relationship also holds between the first feature matrix of all instruction fragments and the simulation results of all instruction fragments. The simulation results of all instruction fragments can therefore be evaluated from the indicator contribution vector Y and the feature matrix of all instruction fragments; that is, the simulation results of all fragments are predicted from the simulation results of a small subset, which greatly saves simulation cost and shortens simulation time.
With reference to the first aspect, in a first possible implementation manner of the first aspect, determining the performance indicator vector C of the N instruction segments according to the first feature matrix A and the indicator contribution vector Y specifically includes: determining the indicator contribution vector Y according to the equation A_pᵀY = C_p, and determining the performance indicator vector C of the N instruction fragments according to the equation AᵀY = C.
Since the performance index vector C includes the performance indexes of the N instruction fragments, effective evaluation of the performance index of each instruction fragment of the test program can be achieved.
With reference to the first aspect or the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, selecting p column vectors from the first feature matrix A specifically includes: averaging each row of the first feature matrix A to obtain a column vector B, and selecting p column vectors in the first feature matrix A that are suitable for fitting the column vector B, where

B ≈ HD,

H is the M²×p matrix formed by the p column vectors and D is the coefficient vector required for the fit.
When p column vectors are selected from the first feature matrix A, if the selected p column vectors can linearly express the mean column vector B, the selected p column vectors are close to the average level and are therefore better suited to represent the first feature matrix A.
With reference to the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, determining the indicator contribution vector Y according to the equation A_pᵀY = C_p specifically includes: performing an inner product operation on the performance indicator vector C_p and the coefficient vector D to obtain the total performance indicator value c of the instruction stream.

The coefficient vector D reflects the fitting lengths with which the selected column vectors express the first feature matrix A, so the inner product of the performance indicator vector C_p and the coefficient vector D yields the total performance indicator value c.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, determining the indicator contribution vector Y according to the equation A_pᵀY = C_p specifically includes: taking the performance indicator vector C_p and the total performance indicator value c as constraints, taking the second feature matrix A_p as an input parameter, and determining the contribution vector Y according to a compressed sensing algorithm.
In a possible implementation manner of the first aspect, converting the N M×M jump matrices into the first feature matrix A with M² rows and N columns specifically includes: multiplying the data in each column of the N M×M jump matrices by the corresponding basic block weight to form N M×M feature matrices, converting each of the N M×M feature matrices into a column vector with M² rows and 1 column, and combining the N column vectors into the first feature matrix A with M² rows and N columns. The basic block weight is the ratio of the number of occurrences of the basic block represented by the data in a column to the number of all basic blocks in the instruction segment where that basic block is located.
Since the constructed feature matrix further incorporates the basic block weight in combination with the basic block order, it describes the basic blocks from both aspects at the same time, thereby further improving the accuracy of the evaluation.
In a possible implementation manner of the first aspect, the simulation result includes at least one of a number of instructions per cycle, a branch prediction success rate, a branch prediction failure rate, a second level cache hit rate, and energy consumption.
In a second aspect, an embodiment of the present application provides a performance index evaluation apparatus, including: an instruction stream segmentation module, configured to acquire an instruction stream and divide it into N instruction segments; a jump matrix generation module, configured to count the jump probability between every two basic blocks in each of the N instruction segments and form N jump matrices with M rows and M columns, where M is the number of basic block types; a first feature matrix acquisition module, configured to convert the N M×M jump matrices into a first feature matrix A with M² rows and N columns, where each column of the first feature matrix A represents the jump matrix of one instruction segment; a second feature matrix acquisition module, configured to select p column vectors in the first feature matrix A and combine them into a second feature matrix A_p with M² rows and p columns, where the p column vectors represent the jump matrices of p instruction segments; a first performance indicator vector acquisition module, configured to send the p instruction segments to the simulator and receive the simulation results of the p instruction segments from the simulator to determine the performance indicator vector C_p of the p instruction segments; and a second performance indicator vector acquisition module, configured to determine the performance indicator vector C of the N instruction segments according to the first feature matrix A and an indicator contribution vector Y, where the indicator contribution vector Y represents the linear relationship between the performance indicator vector C_p and the second feature matrix A_p.
Any implementation manner of the second aspect or the second aspect is an apparatus implementation manner corresponding to any implementation manner of the first aspect or the first aspect, and the description in any implementation manner of the first aspect or the first aspect is applicable to any implementation manner of the second aspect or the second aspect, and is not described herein again.
In a third aspect, an embodiment of the present application provides a performance index evaluation apparatus, which includes a processor and a memory, where the memory stores program instructions, and the processor executes the program instructions to perform the first aspect and the steps of various possible implementation manners of the first aspect.
In a fourth aspect, there is provided a computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of the above aspects.
In a fifth aspect, a computer program product is provided, which, when run on a computer, causes the computer to perform the method of the above aspects.
Drawings
FIG. 1 is a diagrammatic illustration of a segment of instructions according to an embodiment of the present application;
fig. 2 is a schematic diagram of an application scenario to which the performance index evaluation method provided in the embodiment of the present application is applied;
FIG. 3 is a schematic flow chart diagram of a performance indicator evaluation method according to an embodiment of the application;
FIG. 4 is a schematic diagram of a sliced instruction stream according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a hopping matrix according to an embodiment of the present application;
FIG. 6 is a diagram illustrating a jump matrix after weighting processing according to an embodiment of the present application;
FIG. 7 is a schematic diagram of column vectors obtained after shift processing is performed on a jump matrix according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a first feature matrix and a column vector obtained by averaging each row of the first feature matrix according to an embodiment of the present application;
FIG. 9 is a schematic diagram of column vector fitting according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a second feature matrix according to an embodiment of the present application;
FIG. 11 is a schematic diagram of the performance indicator vector C_p according to an embodiment of the present application;
FIG. 12 is a diagram of a performance indicator vector C according to an embodiment of the present application;
FIG. 13 is a schematic flowchart of the method of selecting A_p according to an embodiment of the present application;
fig. 14 shows a schematic flow chart of a method of picking a column vector from a first feature matrix a in S402 according to an embodiment of the present application;
FIG. 15 is a fitting graph of column vector B according to an embodiment of the present application;
FIG. 16 is a schematic flowchart of the method of deleting vector A_j from Z according to a constraint in S4025 according to an embodiment of the present application;
FIG. 17 shows a schematic flow diagram of a method of solving for Y in accordance with an embodiment of the present application;
FIG. 18 is a schematic diagram showing a data curve after rearrangement of BB blocks;
FIG. 19 shows a schematic representation of the data curve of FIG. 18 after a wavelet basis transform (Fourier transform);
fig. 20 is a sub-flowchart of solving for Y in S605;
FIG. 21 is a schematic diagram of an apparatus structure of a performance index evaluation apparatus according to an embodiment of the present application;
fig. 22 is a schematic hardware configuration diagram of a performance index evaluation apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
To facilitate understanding of the embodiments of the present application, several elements that will be introduced in the description of the embodiments of the present application are first introduced herein.
An instruction stream file: a file recording instruction stream information is called an instruction stream file; each line of the file represents the information of one executed instruction and conforms to a uniform format. Typically, the size of an instruction stream file is fixed. For example, an instruction stream file typically contains 100 million instructions, and a complete test program can be regarded as an instruction stream composed of many such 100-million-instruction pieces, each of which is referred to as an instruction fragment; in other words, a complete test program is composed of multiple instruction fragments. The full set of the test program is all instruction fragments, and a subset of the test program is a portion of the instruction fragments. The purpose of simplifying the test program is to select representative instruction fragments from the full set; the fewer the selected instruction fragments, the better, provided that the running results of these fragments approximate those of the original test program. An instruction may include the following information: program pointer, assembly instruction, operation type, and memory address. The memory address is optional.
Program pointer: the program pointer of each instruction line is the address in memory of that line's assembly instruction, a hexadecimal number beginning with "0x".
Assembly instruction: the instruction code of the instruction, which must meet assembly syntax requirements.
Operation type: assembly instructions can be divided into the following categories: arithmetic logic unit (ALU) operations, memory reads, memory writes, and control instructions.
Memory address: if the instruction is an ALU operation, no memory address information is needed; if the instruction is a memory read or write operation, a memory address is required.
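The instruction format above can be parsed with a short routine. Below is a minimal sketch, assuming a space-separated field layout (program pointer, assembly mnemonic, operation type, optional memory address); real instruction stream files may use a different layout, and the sample lines are hypothetical.

```python
def parse_instruction(line):
    """Split one trace line into its fields (hypothetical layout)."""
    parts = line.split()
    return {
        "pc": parts[0],        # program pointer, hexadecimal, begins with "0x"
        "asm": parts[1],       # assembly mnemonic
        "op_type": parts[2],   # ALU / load / store / control
        # memory address is optional: absent for ALU and control instructions
        "mem_addr": parts[3] if len(parts) > 3 else None,
    }

load = parse_instruction("0x400a1c LDR load 0x7ffe0010")
alu = parse_instruction("0x400a20 ADD alu")
```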
Basic Block (BB): a piece of sequentially executed instructions. In general, an instruction stream may be divided into a plurality of BBs with a control instruction as a boundary. Each segment of the test program is composed of BB.
Specifically, the control instruction may be a jump instruction, for example, a jump instruction in an assembly language such as JMP, JE, JNE, JZ, JNZ, JS, JNS, JC, and JNC.
Basic Block feature indicator Vector (BBV): the number of executions of each type of BB in each fragment is counted according to the different control instructions, and the vector constructed from the BB types and their execution counts is called the BBV. FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application. As shown in the instruction code fragment of FIG. 1, if the type IDs of the BBs are {1, 2, 3, 4, 5} and the corresponding execution counts are {1, 20, 0, 5, 0}, the BBV of the fragment can be recorded as {1:1, 2:20, 3:0, 4:5, 5:0}.
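The BBV construction just described can be sketched in a few lines. The BB id sequence below is a toy stand-in chosen to reproduce the example counts, not data from the patent.

```python
from collections import Counter

def build_bbv(bb_sequence, bb_types):
    """Count how many times each BB type executes in one fragment."""
    counts = Counter(bb_sequence)
    # record a count (possibly 0) for every known BB type, as in the example
    return {t: counts.get(t, 0) for t in bb_types}

# reproduces the example BBV {1:1, 2:20, 3:0, 4:5, 5:0}
bbv = build_bbv([1] + [2] * 20 + [4] * 5, bb_types=[1, 2, 3, 4, 5])
```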
Fig. 2 is a schematic diagram of an application scenario to which the performance index evaluation method provided in the embodiment of the present application is applied. As shown in fig. 2, the test system includes a performance index evaluation apparatus 10 and a simulator 20.
While the test program is running on the performance index evaluation apparatus 10, the apparatus 10 fetches the binary code of the test program and stores it. The performance index evaluation apparatus 10 selects an instruction segment from the test program and sends it to the simulator 20 for a simulation test. In the simulation test, the simulator 20 runs the instruction segment and obtains simulation results, for example, Instructions Per Cycle (IPC), branch prediction success rate, branch prediction failure rate, second-level cache hit rate, energy consumption, and the like. The simulator 20 then sends the simulation results to the performance index evaluation apparatus 10, which performs the performance index evaluation based on them.
The performance index evaluation apparatus 10 may be a computer or an integrated circuit. The simulator 20 may be a hardware simulator or a software simulator, which is not limited in this application.
Fig. 3 shows a schematic flowchart of the performance index evaluation method according to an embodiment of the present application. The method is executed by the performance index evaluation apparatus 10 in fig. 2 and, as shown in fig. 3, includes:
step 301: an instruction stream of a test program is obtained and the instruction stream is divided into N instruction fragments.
Specifically, reference may be made to fig. 4, which is a schematic diagram of the instruction stream after division according to an embodiment of the present application; for convenience of description, in this embodiment, N = 6.
It is noted that in other embodiments of the present application, N may be any positive integer.
Also, in this step, the performance index evaluation apparatus 10 may divide the instruction stream according to the number of instructions. For example, the apparatus may divide the instruction stream into N instruction fragments in units of 100 million instructions in the instruction stream, that is, each instruction fragment includes 100 million instructions.
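Step 301 amounts to chunking the stream into consecutive fixed-size pieces. A sketch with a toy fragment size (the document's fragments hold 100 million instructions each):

```python
def split_into_fragments(instructions, fragment_size):
    """Divide an instruction sequence into consecutive fixed-size fragments."""
    return [instructions[i:i + fragment_size]
            for i in range(0, len(instructions), fragment_size)]

# toy stream of 10 instructions split into N = 2 fragments of 5 each
fragments = split_into_fragments(list(range(10)), fragment_size=5)
```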
Step 302: count the jump probability between every two basic blocks in each of the N instruction segments to form N jump matrices with M rows and M columns, where M is the number of basic block types.
Specifically, reference may be made to fig. 5, which is a schematic diagram of a jump matrix according to an embodiment of the present application; for convenience of description, in this embodiment, the number of basic block types M is 3, that is, 3 types of basic blocks are set for each fragment.
It is noted that M may be any positive integer in other embodiments of the present application.
As shown in fig. 5, there are 3 kinds of basic blocks, BB1, BB2, and BB3, and E1 to E6 represent the jump matrices of instruction fragments 1 to 6, respectively. The jump matrix is a Markov (stochastic) matrix; in this example, each column sums to 1.

Taking the jump matrix E1 as an example, the value 0.3 in row 1, column 1 of E1 indicates that the probability of BB1 jumping to BB1 is 30%; the value 0.2 in row 2, column 1 indicates that the probability of BB1 jumping to BB2 is 20%; and the value 0.5 in row 3, column 1 indicates that the probability of BB1 jumping to BB3 is 50%.

In other examples, the row convention may be used instead: the value 0.3 in row 1, column 1 of E1 would indicate that the probability of BB1 jumping to BB1 is 30%, the value 0.4 in row 1, column 2 that the probability of BB1 jumping to BB2 is 40%, and the value 0.3 in row 1, column 3 that the probability of BB1 jumping to BB3 is 30%.

The values in the second column of E1 represent the probabilities of BB2 jumping to BB1, BB2, and BB3, respectively, and the values in the third column of E1 represent the probabilities of BB3 jumping to BB1, BB2, and BB3.
E2 to E6 are similar to E1, and the jump probability between basic blocks is different due to the difference of instruction fragments, and specific values can be seen in fig. 5, which is not described herein for brevity.
In the embodiment of the application, the jump matrix completely reflects the sequence information of the basic block of the instruction fragment, and the accuracy of performance index evaluation can be greatly improved due to the introduction of the sequence information of the basic block.
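The counting in step 302 can be sketched as follows, assuming the column convention of fig. 5 (column j holds the probabilities of jumping from BBj); the BB id sequence is illustrative.

```python
def jump_matrix(bb_sequence, m):
    """Build an m-by-m column-stochastic jump matrix from a BB id sequence."""
    counts = [[0.0] * m for _ in range(m)]
    for src, dst in zip(bb_sequence, bb_sequence[1:]):
        counts[dst - 1][src - 1] += 1.0  # row = destination, column = source
    for j in range(m):  # normalise each column so it sums to 1
        col_sum = sum(counts[i][j] for i in range(m))
        if col_sum:
            for i in range(m):
                counts[i][j] /= col_sum
    return counts

E = jump_matrix([1, 2, 3, 1, 3, 3, 1, 2], m=3)
```

With this toy sequence, BB1 jumps to BB2 twice and to BB3 once, so the first column of E is [0, 2/3, 1/3].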
Step 303: convert the N M×M jump matrices into a first feature matrix A with M² rows and N columns, where each column of the first feature matrix A represents the jump matrix of one instruction segment.
Optionally, in this step, the proportion of the basic blocks in the instruction fragment may be further introduced, so that the order of the basic blocks and the proportion of the basic blocks are combined to perform comprehensive performance index evaluation, and the accuracy of performance index evaluation may be greatly improved.
Specifically, the data of each column of the N M rows and M columns of the hopping matrix may be multiplied by the basic block weight, respectively, to form N M rows and M columns of the feature matrix. Referring to fig. 6, fig. 6 is a schematic diagram of a jump matrix after weighting processing according to an embodiment of the present application, and fig. 6 shows feature matrices F1 to F6 obtained after weighting processing is performed on E1 to E6, respectively.
For F1, the basic block weight of a column is the ratio of the number of occurrences of the basic block represented by that column's data in instruction fragment 1 to the number of all basic blocks in the fragment. Assume that the weight of BB1 is 0.1, the weight of BB2 is 0.4, and the weight of BB3 is 0.5.

The weighted jump matrix F1 is obtained by multiplying the first column of E1 by the basic block weight 0.1 of BB1, the second column of E1 by the basic block weight 0.4 of BB2, and the third column of E1 by the basic block weight 0.5 of BB3.
Similarly, in the instruction fragment 2, the basic block weight of BB1 is 0.2, the basic block weight of BB2 is 0.4, and the basic block weight of BB3 is 0.4, so that F2 is obtained after weighting E2.
In instruction fragment 3, since the basic block weight of BB1 is 0.3, that of BB2 is 0.1, and that of BB3 is 0.5, F3 is obtained by weighting E3. Assuming that the basic block weights of BB1, BB2, and BB3 in instruction fragment 4 are 0.5, 0.2, and 0.3, respectively, F4 is obtained by weighting E4. Assume that in instruction fragment 5 the basic block weight of BB1 is 0.3, that of BB2 is 0.4, and that of BB3 is 0.3; therefore, F5 is obtained by weighting E5. Assume that in instruction fragment 6 the basic block weight of BB1 is 0.2, that of BB2 is 0.2, and that of BB3 is 0.6; therefore, F6 is obtained by weighting E6.
Further, in this step, each of the N M×M feature matrices may be converted into a column vector with M² rows and 1 column. Specifically, referring to fig. 7, fig. 7 is a schematic diagram of the column vectors obtained after shift processing is performed on the jump matrices according to an embodiment of the present application. As shown in fig. 7, taking column vector A1 as an example, the second column of F1 shown in fig. 6 is shifted below the first column, and the third column is shifted below the second column, forming the single column vector A1; A2 to A6 are obtained in a similar manner.
Further, in this step, the N column vectors with M² rows are combined into the first feature matrix A with M² rows and N columns. Referring to fig. 8, fig. 8 is a schematic diagram of the first feature matrix and of the column vector obtained by averaging each row of the first feature matrix according to an embodiment of the present application; the matrix A shown in fig. 8 is the combination of the column vectors A1 to A6 of fig. 7.
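The weighting and shift processing of step 303 can be sketched as below; the toy jump matrix and weights are illustrative rather than the values of figs. 5 and 6.

```python
def weight_and_flatten(jump, weights):
    """Multiply column j of a jump matrix by the weight of BBj, then flatten
    column by column (column 2 below column 1, etc., as in fig. 7)."""
    m = len(jump)
    weighted = [[jump[i][j] * weights[j] for j in range(m)] for i in range(m)]
    return [weighted[i][j] for j in range(m) for i in range(m)]

def first_feature_matrix(jumps, weight_lists):
    """Stack the N flattened vectors side by side: A has M² rows, N columns."""
    cols = [weight_and_flatten(e, w) for e, w in zip(jumps, weight_lists)]
    return [list(row) for row in zip(*cols)]

E1 = [[0.3, 0.0, 0.5],
      [0.2, 0.0, 0.5],
      [0.5, 1.0, 0.0]]
A = first_feature_matrix([E1], [[0.1, 0.4, 0.5]])  # single-fragment example
```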
Step 304: select p column vectors in the first feature matrix A and combine them into a second feature matrix A_p with M² rows and p columns, where the p column vectors represent the jump matrices of p instruction fragments.
In this step, specifically, each row of the first feature matrix A may be averaged to obtain a column vector B (as shown in fig. 8), and p column vectors suitable for fitting the column vector B are selected from the first feature matrix A, such that

B ≈ HD,

where H is the M²×p matrix formed by the p column vectors and D is the coefficient vector required for the fit.
Referring to fig. 9 in particular, fig. 9 is a schematic diagram of column vector fitting according to an embodiment of the present application, and as shown in fig. 9, D1, D2, and D3 are coefficient values, which are real numbers and represent fitting lengths.
In the present embodiment, the column vector selection is implemented by selecting a vector that can linearly express the column vector B from among the column vectors a1 to a6, and in the present embodiment, it is assumed that a1, a2, and A3 can fit the column vector B.
Thus, in this embodiment, p may be 3.
Referring to fig. 10, fig. 10 is a schematic diagram of the second feature matrix according to an embodiment of the present application; the second feature matrix A_p of fig. 10 is obtained by combining the column vectors A1, A2, and A3 shown in fig. 9.
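Once the p columns are chosen, the coefficient vector D of the fit B ≈ HD can be obtained by ordinary least squares through the normal equations HᵀH·D = HᵀB. This is a generic stand-in sketch; the patent's own selection procedure (figs. 13 to 16) imposes further constraints that are not reproduced here.

```python
def solve(mat, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(mat)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        for r in range(k + 1, n):
            f = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= f * aug[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (aug[k][n] - sum(aug[k][c] * x[c]
                                for c in range(k + 1, n))) / aug[k][k]
    return x

def fit_coefficients(columns, b):
    """Least-squares D for B ≈ HD; `columns` are the p selected M²-vectors."""
    p, n = len(columns), len(b)
    hth = [[sum(columns[i][k] * columns[j][k] for k in range(n))
            for j in range(p)] for i in range(p)]
    htb = [sum(columns[i][k] * b[k] for k in range(n)) for i in range(p)]
    return solve(hth, htb)
```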
Step 305: send the p instruction fragments to the simulator, and receive the simulation results of the p instruction fragments from the simulator to determine the performance indicator vector C_p of the p instruction fragments.
Specifically, since the column vectors A1, A2, and A3 were selected in step 304, the instruction fragments 1, 2, and 3 corresponding to the column vectors A1, A2, and A3 may be sent to the simulator for simulation.
For example, assuming the simulation result is the number of instructions per cycle, which is 2.1, 1.7, and 2.3 for the three fragments respectively, the performance indicator vector C_p = [2.1, 1.7, 2.3]ᵀ can be constructed from the simulation results.
Step 306: determine the performance indicator vector C of the N instruction segments according to the first feature matrix A and the indicator contribution vector Y, where the indicator contribution vector Y represents the linear relationship between the performance indicator vector C_p and the second feature matrix A_p.
Specifically, in this step, equation A may be followedp TY=CpDetermining M2An index contribution vector Y of the line jump feature according to equation ATY-C determines a performance indicator vector C for the N instruction fragments.
Referring now to FIG. 11, FIG. 11 is a schematic diagram of the performance indicator vector C_p according to an embodiment of the present disclosure. FIG. 11 shows the relation between A_p^T Y and C_p: as shown in FIG. 11, A_p^T Y = C_p, where Y represents the indicator contribution vector of the M^2 row jump features of the second feature matrix A_p; in this example, M^2 = 3^2 = 9.
That is, Y represents the linear relationship between each row of A_p and the actual simulation result C_p.
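The two equations A_p^T Y = C_p and A^T Y = C can be illustrated numerically. The matrices below are random stand-ins, and the minimum-norm least-squares solve is only one hypothetical way to obtain Y; the embodiment's preferred route is the compressed-sensing solve described in step 306:

```python
import numpy as np

# Illustrative shapes: M^2 = 9 jump features, N = 6 fragments, p = 3 simulated.
np.random.seed(1)
A  = np.random.rand(9, 6)         # first feature matrix (all fragments)
Ap = A[:, :3]                     # second feature matrix (simulated fragments)
Cp = np.array([2.1, 1.7, 2.3])    # simulated instructions-per-cycle values

# Ap^T Y = Cp is underdetermined (3 equations, 9 unknowns); lstsq returns
# the minimum-norm solution that satisfies the equations exactly.
Y, *_ = np.linalg.lstsq(Ap.T, Cp, rcond=None)

# With Y in hand, the indicators of all N fragments follow from A^T Y = C.
C = A.T @ Y
print(C.shape)
```

This shows only the linear-algebra skeleton; it does not enforce the sparsity that the compressed-sensing formulation of step 306 adds.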
Alternatively, Y may be obtained specifically by:
performing an inner product operation on the performance indicator vector C_p and the coefficient vector D obtained in step 304 to obtain the total performance indicator value c of the instruction stream; then, using the performance indicator vector C_p and the total performance indicator value c as constraints and the second feature matrix A_p as an input parameter, determining the contribution vector Y by a compressed sensing algorithm.
After Y is obtained, the performance indicator vector C of the 6 instruction fragments can further be determined according to equation A^T Y = C. Reference may be made to FIG. 12, which is a schematic diagram of the performance indicator vector C according to an embodiment of the present application; in FIG. 12, the performance indicators C1, C2, C3, C4, C5, and C6 of each instruction fragment in the instruction stream are obtained by multiplying the transposed first feature matrix A by the Y obtained in this step.
In summary, in the embodiment of the present application, the first feature matrix is obtained from the jump matrices, vectors are selected from the first feature matrix to form the second feature matrix, the instruction fragments corresponding to the second feature matrix are sent to a simulator for partial simulation, and the performance indicator of each instruction fragment is obtained from the simulation result, the second feature matrix, and the first feature matrix. Since the jump matrix reflects the order between the basic blocks, the first feature matrix includes information reflecting the order of the basic blocks. Moreover, the simulator only needs to simulate part of the instruction fragments; by using the linear relation between the simulation result of those fragments and the second feature matrix, the performance indicator of every instruction fragment can be obtained from that linear relation and the first feature matrix, so that simulation cost is saved and simulation time is shortened.
A specific application scenario is listed below to further describe picking A_p as shown in step 304.
Referring first to FIG. 13, FIG. 13 shows a schematic flow chart of the method of picking A_p according to an embodiment of the present application. As shown in FIG. 13, picking A_p can be carried out by the following steps.
S401, initializing the residual R = B, Z as an empty set, and J = A;
s402, selecting column vectors from A and adding the column vectors into Z;
S403, judging whether a convergence condition is met; if the convergence condition is not met, jumping to S402; if it is met, calculating the weights of Z to obtain the sparse solution vector X1.
Alternatively, fig. 14 shows a schematic flowchart of a method for picking column vectors from the first feature matrix a in S402 according to an embodiment of the present application, and as shown in fig. 14, the picking column vectors from a and adding to Z may be implemented by the following steps.
S4021, calculating a correlation coefficient between J and R;
S4022, taking the vector A_i with the maximum correlation coefficient as the advancing direction u, and adding A_i to Z;
s4023, walking a first distance U along U according to a first strategy;
This can be understood in conjunction with FIG. 15, where FIG. 15 is a fitting graph of the column vector B according to an embodiment of the present application. In FIG. 15, A1*D1 = U, where D1 is a real number; the fitting error between U and B is R, i.e., U - B = R. R is then used as a new vector to be fitted, and a further A_i*D_i close to R is searched for, until the fitting error meets an acceptable condition; the selected column vectors are the required column vectors. An acceptable condition is, for example, that the angle between B and A_i is smaller than a preset threshold, for example 1 degree.
S4024, judging whether the time is greater than t, and if yes, jumping to S4025; if not, jumping to S4026;
S4025, removing the vector A_j from Z according to the constraint, and jumping to S4026;
s4026, calculating an expression Y closest to B;
S4027, updating the residual R = B - Y.
Alternatively, FIG. 16 shows a schematic flow chart of deleting the vector A_j from Z according to the constraint in S4025 according to an embodiment of the present application. As shown in FIG. 16, deleting the vector A_j from Z according to the constraint in S4025 can be carried out by the following steps.
S40251, determining, by the least squares method, the fit Y of the selected vectors A_i that is closest to B;
S40252, calculating the value of the objective function I(X) with each vector removed;
S40253, deleting the vector A_j corresponding to the minimum objective function value.
It will be appreciated that the basic idea of the algorithm in FIG. 13 to FIG. 16 is to find a set of vectors and linearly fit B with all the vectors in that set, subject to the simulator usage time, i.e., the total time cannot be greater than t.
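Under the simplification that the time budget t and the vector-deletion step S4025 are omitted (stopping instead on a column budget or a small residual), the greedy selection of FIGS. 13 and 14 can be sketched as a matching-pursuit-style loop. All data below are illustrative:

```python
import numpy as np

def pick_columns(A, B, max_cols=3, tol=1e-6):
    """Greedy (matching-pursuit-style) selection of columns of A fitting B.

    Simplified sketch: the time budget t and the deletion step S4025 are
    omitted; the loop stops on max_cols or on a small residual instead.
    """
    R = B.copy()                  # residual, initialised to B (S401)
    chosen = []                   # selected index set Z
    for _ in range(max_cols):
        # correlation of every column with the residual (S4021)
        corr = np.abs(A.T @ R)
        if chosen:
            corr[chosen] = -np.inf   # never pick a column twice
        i = int(np.argmax(corr))     # most correlated column (S4022)
        chosen.append(i)
        # least-squares fit of B on the chosen columns (S4026)
        H = A[:, chosen]
        D, *_ = np.linalg.lstsq(H, B, rcond=None)
        R = B - H @ D                # residual update (S4027)
        if np.linalg.norm(R) < tol:
            break
    return chosen, R

np.random.seed(2)
A = np.random.rand(9, 6)
B = A.mean(axis=1)
chosen, R = pick_columns(A, B)
print(chosen, np.linalg.norm(R))
```

Because each iteration refits B on all chosen columns, the residual never grows, which mirrors the convergence check of S403.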
Further, a specific application scenario is listed below to further describe the method for solving Y in step 306.
Referring first to fig. 17, fig. 17 shows a schematic flow chart of a method for solving for Y according to an embodiment of the present application, and as shown in fig. 17, the method for solving for Y may be determined by the following steps.
S501: and sorting the types of the basic blocks BB of each instruction segment according to the sequence of the predefined index values from large to small.
Optionally, the predefined index value is at least one of the CPI, the cache miss rate (cache miss), and the branch prediction failure rate (branch miss).
For example, FIG. 18 shows a schematic diagram of the data curve after BB block rearrangement. As shown in FIG. 18, an instruction fragment includes 1000 BB blocks, and the 1000 BB blocks are monotonically arranged according to the index value of each BB (e.g., CPI), resulting in the data curve shown in FIG. 18.
S502: determining an optimal wavelet basis matrix Ψ, where Ψ is a wavelet basis matrix of size M^2 × M^2.
Specifically, the wavelet basis matrix Ψ can be determined in the following two ways.
The first way: determining a smooth performance curve of the basic blocks BB of each instruction fragment according to the types and the jump probabilities of the sorted basic blocks BB, where the smooth performance curve is monotonic, and determining the wavelet basis matrix Ψ according to the monotonic smooth performance curve;
the second way: determining the wavelet basis matrix Ψ according to experimental results.
Alternatively, determining the wavelet basis matrix according to experimental results may include the following two ways:
the first way: performing an experiment according to the execution delay of the instructions to determine the wavelet basis matrix Ψ;
the second way: determining the wavelet basis matrix Ψ by a BB instruction test experiment.
For example, FIG. 19 shows a schematic diagram of the data curve of FIG. 18 after the wavelet basis transform (here, a Fourier transform); it can be seen that, after the transform, there are few positions where the frequency-domain coefficients on the wavelet basis are non-zero.
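The sparsity claim can be checked numerically with a stand-in curve. A sigmoid is used here as a hypothetical smooth, monotone performance curve, and a plain FFT stands in for the wavelet basis transform; neither is data from the embodiment:

```python
import numpy as np

# A smooth, monotone "performance curve" standing in for the sorted
# 1000-BB data of FIG. 18 (the sigmoid is only an illustrative choice).
n = 1000
y = 1.0 / (1.0 + np.exp(-np.linspace(-6.0, 6.0, n)))

# After an orthogonal transform, most of the energy of such a curve sits
# in a handful of coefficients -- the sparsity compressed sensing needs.
F = np.fft.fft(y)
energy = np.abs(F) ** 2
top10 = np.sort(energy)[::-1][:10].sum()
ratio = top10 / energy.sum()      # energy fraction in the 10 largest bins
print(round(ratio, 3))
```

The ten largest coefficients carry almost all of the spectral energy, which is the property the constrained model of S504 exploits.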
S503: sorting the second feature matrix A_p according to the predefined index values;
S504: establishing a constrained compressed sensing model according to the second feature matrix A_p, the performance indicator vector C_p, and the wavelet basis matrix Ψ.
Compressed Sensing (also called Compressive Sampling or Sparse Sampling) exploits the sparsity characteristics of a signal: discrete samples of the signal are obtained by random sampling at a rate far below the Nyquist sampling rate, and the signal is then perfectly reconstructed by a nonlinear reconstruction algorithm.
In the embodiment of the application, a compressed sensing method is utilized in index estimation of the instruction segment, and index estimation of each instruction segment is obtained by utilizing a small number of discrete BB blocks and adopting the compressed sensing method.
In this step, the performance index evaluation device takes the first feature matrix A, the number N of instruction fragments, the performance indicator vector C_p, the total performance indicator value c, and the optimal wavelet basis Ψ as input parameters, sets an optimization variable column vector Z, and establishes the constrained compressed sensing model as follows:

minimize ||Z||_1
subject to A_p^T Ψ Z = C_p
(1/N) I A^T Ψ Z = c
where Ψ is a wavelet basis matrix of size M^2 × M^2, Ψ Z = Y, and c is an indicator of the entire test procedure and is a real number.
I is an all-ones matrix (i.e., a matrix whose elements are all 1) of size 1 × N, N is the number of fragments, and the optimization objective is set according to the sparsity of Z.
Optimization objective: the number of non-zero coefficients in Z is minimized (i.e., the 1-norm of Z is minimized);
Constraint 1:

A_p^T Ψ Z = C_p

According to the assumption that the feature vector Y = Ψ Z is a smooth curve, a sparse vector Z is obtained under the optimized wavelet basis Ψ.
Constraint 2:

(1/N) I A^T Ψ Z = c

Here, the calculated mean value of the indicators of all instruction fragments is required to be equal to the total performance indicator value.
S505: solving the optimization model obtained in S504 by the Alternating Direction Method of Multipliers (ADMM) algorithm to obtain the sparse solution vector Z, and obtaining the feature vector Y = Ψ S according to Z. Reference may be made to FIG. 20, which is a sub-flowchart of solving Y in S505; as shown in FIG. 20, step S505 specifically includes:
S5051: according to the constrained compressed sensing model, introducing a slack variable S and adding the further constraint Z = S, i.e., the slack variable S must coincide with Z:

minimize ||Z||_1
subject to A_p^T Ψ S = C_p
(1/N) I A^T Ψ S = c
Z = S
Using the Lagrange multiplier method, Lagrange multipliers U, V, and W are introduced and the corresponding augmented Lagrange function g is established, where Z is the required sparse solution and μ, ν, and ξ are penalty parameters (positive real numbers):

g(Z, S, U, V, W) = ||Z||_1 + U^T (A_p^T Ψ S - C_p) + (μ/2) ||A_p^T Ψ S - C_p||_2^2 + V^T ((1/N) I A^T Ψ S - c) + (ν/2) ||(1/N) I A^T Ψ S - c||_2^2 + W^T (Z - S) + (ξ/2) ||Z - S||_2^2
S5052: setting initial values of Z, S, U, V, W, μ, ν, and ξ (for example, zero for the vectors and multipliers), and setting the conditions for completing the optimization;
S5053: calculating the optimal value of S by the least squares method;
S5054: fixing S, U, V, W, μ, ν, and ξ, and calculating the optimal value of Z by the soft-thresholding method;
S5055: updating the Lagrange multipliers according to the residuals of the constraints:

U = U + μ (A_p^T Ψ S - C_p)
V = V + ν ((1/N) I A^T Ψ S - c)
W = W + ξ (Z - S)

and finally increasing the penalty coefficients μ, ν, and ξ by a fixed multiple ρ > 1;
S5056: judging whether the convergence condition is satisfied; if not, jumping back to S5053 and repeating S5053 to S5056, judging at the end of each cycle whether the optimization condition is satisfied, until convergence, at which point the process ends.
Alternatively, the convergence condition is, for example, a limit on the execution time or on the number of iterations.
Finally, the vector Y can be obtained according to the equation Y = Ψ S.
In summary, the index estimation method and device of the embodiments of the present application help improve the precision of the simulation test program, reduce the measurement error, and provide performance indicator estimates for all instruction fragments; according to these indicators, linear division can be performed or other program performance indicators can be estimated at each stage.
Referring to fig. 21, fig. 21 is a schematic structural diagram of a performance index evaluation device according to an embodiment of the present application, and as shown in fig. 21, the performance index evaluation device 10 includes:
an instruction stream segmentation module 601, configured to obtain an instruction stream of a test program and segment the instruction stream into N instruction segments;
a skip matrix generation module 602, configured to count skip probabilities between every two basic blocks in each instruction segment of the N instruction segments, and form N skip matrices with M rows and M columns, where M is a type of a basic block;
a first feature matrix obtaining module 603, configured to convert the N jump matrices of M rows and M columns into a first feature matrix A of M^2 rows and N columns, where each column of the first feature matrix A represents the jump matrix of one instruction fragment;
a second feature matrix obtaining module 604, configured to select p column vectors in the first feature matrix A and combine the p column vectors to form a second feature matrix A_p of M^2 rows and p columns, where the p column vectors represent the jump matrices of p instruction fragments;
a first performance indicator vector obtaining module 605, configured to send the p instruction fragments to the simulator respectively, and receive the simulation results of the p instruction fragments from the simulator to determine the performance indicator vector C_p of the p instruction fragments;
a second performance indicator vector obtaining module 606, configured to determine the performance indicator vector C of the N instruction fragments according to the performance indicator vector C_p, the first feature matrix A, and the second feature matrix A_p.
Optionally, the second performance indicator vector obtaining module 606 is specifically configured to:
determine the indicator contribution vector Y of the M^2 row jump features according to equation A_p^T Y = C_p;
determine the performance indicator vector C of the N instruction fragments according to equation A^T Y = C.
Optionally, the second feature matrix obtaining module 604 is specifically configured to:
averaging each row of the first feature matrix A to obtain a column vector B;
selecting, in the first feature matrix A, p column vectors suitable for fitting the column vector B, where

B ≈ Σ_{i=1}^{p} H_i · D_i

H_i is the i-th of the p column vectors and D_i is the corresponding coefficient in the coefficient vector D required for the fit.
Optionally, the second performance indicator vector obtaining module 606 is specifically configured to:
performing an inner product operation on the performance indicator vector C_p and the coefficient vector D to obtain the total performance indicator value c of the instruction stream.
Optionally, the second performance indicator vector obtaining module 606 is specifically configured to:
using the performance indicator vector C_p and the total performance indicator value c as constraints and the second feature matrix A_p as an input parameter, and determining the contribution vector Y by a compressed sensing algorithm.
Optionally, the first feature matrix obtaining module 603 is specifically configured to:
multiplying the data of each column of the N jump matrices of M rows and M columns by the corresponding basic block weight to form N feature matrices of M rows and M columns;
converting the N feature matrices of M rows and M columns into N column vectors of M^2 rows and 1 column;
combining the N column vectors of M^2 rows and 1 column into the first feature matrix A of M^2 rows and N columns;
where the basic block weight is the ratio of the number of occurrences, in the instruction fragment, of the basic block represented by the column in which the data is located, to the number of all basic blocks in that instruction fragment.
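The conversion performed by the first feature matrix obtaining module can be sketched as follows; the jump matrices and basic block weights below are random placeholders for illustration only:

```python
import numpy as np

# Illustrative sizes: N = 6 instruction fragments, M = 3 basic-block types.
np.random.seed(4)
N, M = 6, 3
jump = [np.random.rand(M, M) for _ in range(N)]   # N jump matrices
# hypothetical per-column basic-block weights for each fragment
weights = [np.random.rand(M) for _ in range(N)]

cols = []
for J, w in zip(jump, weights):
    # multiply each column of the jump matrix by its basic-block weight ...
    F = J * w                    # w broadcasts over the M columns of J
    # ... then flatten the M x M feature matrix into an M^2-element vector
    cols.append(F.reshape(M * M))

# combine the N column vectors into the M^2 x N first feature matrix A
A = np.column_stack(cols)
print(A.shape)                   # (9, 6)
```

In the embodiment each weight would be the occurrence ratio of the basic block within its fragment rather than a random value; only the shapes and the column-wise scaling are shown here.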
Optionally, the simulation result includes at least one of an instruction number per cycle, a branch prediction success rate, a branch prediction failure rate, a second level cache hit rate, and an energy consumption.
In summary, in the embodiment of the present application, a first feature matrix is obtained through a skip matrix, a vector is selected from the first feature matrix to form a second feature matrix, an instruction segment corresponding to the second feature matrix is sent to a simulator for local simulation, and a performance index of each instruction segment is obtained according to a simulation result, the second feature matrix, and the first feature matrix.
Referring to fig. 22, fig. 22 is a schematic diagram of a hardware structure of a performance index evaluation device according to an embodiment of the present application, and as shown in fig. 22, the performance index evaluation device 10 includes: the memory 701, the processor 702, and the bus 703, wherein the memory 701 and the processor 702 are respectively connected to the bus 703, the memory stores program instructions, and the processor executes the program instructions to perform the performance index estimation method disclosed above.
In summary, by simulating a selected small part of instruction fragments and obtaining a linear relationship between the simulation result and a second feature matrix of the part of instruction fragments, that is, the indicator contribution degree vector Y, the linear relationship can also be theoretically applied to the first feature matrix of all the instruction fragments and the simulation result of all the instruction fragments, so that the simulation result of all the instruction fragments is evaluated according to the indicator contribution degree vector Y and the feature matrix of all the instruction fragments, thereby realizing prediction of the simulation result of all the instruction fragments through the simulation result of the small part of instruction fragments, and greatly saving the cost.
In the embodiment of the present application, the Processor may be a Central Processing Unit (CPU), a Network Processor (NP), or a combination of the CPU and the NP. The processor may further include a hardware chip. The hardware chip may be an Application-Specific Integrated Circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof. The PLD may be a Complex Programmable Logic Device (CPLD), a Field-Programmable Gate Array (FPGA), General Array Logic (GAL), or any combination thereof.
The memory may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The non-volatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash Memory. Volatile Memory can be Random Access Memory (RAM), which acts as external cache Memory.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product may include one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that incorporates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: u disk, removable hard disk, read only memory, random access memory, magnetic or optical disk, etc. for storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (15)

1. A performance index evaluation method is characterized by comprising the following steps:
acquiring an instruction stream and dividing the instruction stream into N instruction segments;
counting the jump probability between every two basic blocks in each instruction segment in the N instruction segments to form N jump matrixes with M rows and M columns, wherein M is the type of the basic blocks;
converting the N jump matrices of M rows and M columns into a first feature matrix A of M^2 rows and N columns, wherein each column of the first feature matrix A represents the jump matrix of one instruction segment;
selecting p column vectors in the first feature matrix A, and combining the p column vectors to form a second feature matrix A_p of M^2 rows and p columns, wherein the p column vectors represent the jump matrices of p instruction fragments;
sending the p instruction fragments to a simulator respectively, and receiving simulation results of the p instruction fragments from the simulator to determine a performance indicator vector C_p of the p instruction fragments;
determining a performance indicator vector C of the N instruction segments according to the first feature matrix A and an indicator contribution vector Y, wherein the indicator contribution vector Y represents a linear relationship between the performance indicator vector C_p and the second feature matrix A_p.
2. The method of claim 1, wherein determining the performance indicator vector C of the N instruction fragments according to the first feature matrix A and the indicator contribution vector Y comprises:
determining the indicator contribution vector Y according to equation A_p^T Y = C_p;
determining the performance indicator vector C of the N instruction fragments according to equation A^T Y = C.
3. The method according to claim 2, wherein selecting p column vectors in the first feature matrix A comprises:
averaging each row of the first feature matrix A to obtain a column vector B;
selecting, in the first feature matrix A, p column vectors suitable for fitting the column vector B, wherein

B ≈ Σ_{i=1}^{p} H_i · D_i

H_i is the i-th of the p column vectors, and D_i is the corresponding fitting coefficient.
4. The method of claim 3, wherein determining the indicator contribution vector Y according to equation A_p^T Y = C_p specifically includes:
performing an inner product operation on the performance indicator vector C_p and the coefficient vectors D_i to obtain a total performance indicator value c of the instruction stream.
5. The method of claim 4, wherein determining the indicator contribution vector Y according to equation A_p^T Y = C_p specifically includes:
using the performance indicator vector C_p and the total performance indicator value c as constraints and the second feature matrix A_p as an input parameter, and determining the indicator contribution vector Y by a compressed sensing algorithm.
6. The method according to any one of claims 1 to 5, wherein converting the N jump matrices of M rows and M columns into the first feature matrix A of M^2 rows and N columns specifically includes:
multiplying the data of each column of the N jump matrices of M rows and M columns by the corresponding basic block weight to form N feature matrices of M rows and M columns;
converting the N feature matrices of M rows and M columns into N column vectors of M^2 rows and 1 column;
combining the N column vectors of M^2 rows and 1 column into the first feature matrix A of M^2 rows and N columns;
wherein the basic block weight is the ratio of the number of occurrences, in the instruction segment, of the basic block represented by the column in which the data is located, to the number of all basic blocks in that instruction segment.
7. The method of any of claims 1 to 5, wherein the simulation results comprise at least one of number of instructions per cycle, branch prediction success rate, branch prediction failure rate, level two cache hit rate, and power consumption.
8. A performance index evaluation device, comprising:
the instruction stream segmentation module is used for acquiring an instruction stream and segmenting the instruction stream into N instruction segments;
a skip matrix generation module, configured to count skip probabilities between every two basic blocks in each of the N instruction segments, and form N skip matrices with M rows and M columns, where M is a type of a basic block;
a first feature matrix obtaining module, configured to convert the N jump matrices of M rows and M columns into a first feature matrix A of M^2 rows and N columns, wherein each column of the first feature matrix A represents the jump matrix of one instruction segment;
a second feature matrix obtaining module, configured to select p column vectors in the first feature matrix A and combine the p column vectors to form a second feature matrix A_p of M^2 rows and p columns, wherein the p column vectors represent the jump matrices of p instruction fragments;
a first performance indicator vector obtaining module, configured to send the p instruction fragments to a simulator respectively, and receive simulation results of the p instruction fragments from the simulator to determine a performance indicator vector C_p of the p instruction fragments;
a second performance indicator vector obtaining module, configured to determine a performance indicator vector C of the N instruction fragments according to the first feature matrix A and the indicator contribution vector Y, wherein the indicator contribution vector Y represents a linear relationship between the performance indicator vector C_p and the second feature matrix A_p.
9. The apparatus of claim 8, wherein the second performance indicator vector obtaining module is specifically configured to:
determine the indicator contribution vector Y according to equation A_p^T Y = C_p;
determine the performance indicator vector C of the N instruction fragments according to equation A^T Y = C.
10. The apparatus of claim 9, wherein the second feature matrix obtaining module is specifically configured to:
respectively averaging each row of the first feature matrix A to obtain a column vector B;
selecting, in the first feature matrix A, p column vectors suitable for fitting the column vector B, wherein

B ≈ Σ_{i=1}^{p} H_i · D_i

H_i is the i-th of the p column vectors, and D_i is the corresponding fitting coefficient.
11. The apparatus of claim 10, wherein the second performance indicator vector obtaining module is specifically configured to:
perform an inner product operation on the performance indicator vector C_p and the coefficient vectors D_i to obtain a total performance indicator value c of the instruction stream.
12. The apparatus of claim 11, wherein the second performance indicator vector obtaining module is specifically configured to:
use the performance indicator vector C_p and the total performance indicator value c as constraints and the second feature matrix A_p as an input parameter, and determine the indicator contribution vector Y by a compressed sensing algorithm.
13. The apparatus according to any one of claims 8 to 12, wherein the first feature matrix obtaining module is specifically configured to:
multiplying the data of each column of the N jump matrices of M rows and M columns by the corresponding basic block weight to form N feature matrices of M rows and M columns;
converting each of the N feature matrices of M rows and M columns into a vector of M² rows and 1 column;
combining the N feature vectors of M² rows and 1 column into the first feature matrix A of M² rows and N columns;
wherein the basic block weight is the ratio of the number of occurrences, within its instruction segment, of the basic block represented by the data in each column of the N jump matrices of M rows and M columns to the total number of basic blocks in that instruction segment.
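The construction of the first feature matrix A in claim 13 can be sketched as follows; the jump-matrix values and per-column basic block weights below are random placeholders, since the claim only defines how they are combined:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 4, 5                              # M basic blocks, N instruction fragments

# N jump matrices of shape M x M (illustrative random data).
jump = rng.random((N, M, M))
# Per-fragment, per-column basic block weights (assumed given; claim 13 defines
# them as occurrence ratios within each instruction segment).
weights = rng.random((N, M))

# Scale column j of each jump matrix by its basic block weight, flatten the
# weighted M x M matrix to an M^2-row vector, and stack the N vectors as the
# columns of the first feature matrix A (M^2 rows, N columns).
cols = []
for i in range(N):
    weighted = jump[i] * weights[i][np.newaxis, :]  # column j scaled by weights[i, j]
    cols.append(weighted.reshape(M * M))
A = np.stack(cols, axis=1)

print(A.shape)                           # (16, 5), i.e. (M^2, N)
```

Flattening each weighted M×M matrix into a single column is what lets all N fragments share one matrix A with a common row layout, so the linear model A^T Y = C of the earlier claims applies directly.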
14. The apparatus of any of claims 8 to 12, wherein the simulation results comprise at least one of number of instructions per cycle, branch prediction success rate, branch prediction failure rate, level two cache hit rate, and power consumption.
15. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method according to any one of claims 1 to 7.
CN201780083763.9A 2017-07-12 2017-07-12 Performance index evaluation method and device Active CN110178123B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/092662 WO2019010656A1 (en) 2017-07-12 2017-07-12 Method and device for evaluating performance indicator

Publications (2)

Publication Number Publication Date
CN110178123A CN110178123A (en) 2019-08-27
CN110178123B true CN110178123B (en) 2020-12-01

Family

ID=65001410

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780083763.9A Active CN110178123B (en) 2017-07-12 2017-07-12 Performance index evaluation method and device

Country Status (2)

Country Link
CN (1) CN110178123B (en)
WO (1) WO2019010656A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112825058A (en) * 2019-11-21 2021-05-21 阿里巴巴集团控股有限公司 Processor performance evaluation method and device
CN111739646A (en) * 2020-06-22 2020-10-02 平安医疗健康管理股份有限公司 Data verification method and device, computer equipment and readable storage medium
CN111897707B (en) * 2020-07-16 2024-01-05 中国工商银行股份有限公司 Optimization method and device for business system, computer system and storage medium
CN113203920B (en) * 2021-05-11 2023-03-24 国网山东省电力公司临沂供电公司 Power distribution network single-phase earth fault positioning system and method
CN115543719B (en) * 2022-11-24 2023-04-07 飞腾信息技术有限公司 Component optimization method and device based on chip design, computer equipment and medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN101763291A (en) * 2009-12-30 2010-06-30 中国人民解放军国防科学技术大学 Method for detecting error of program control flow
CN102110013A (en) * 2009-12-23 2011-06-29 英特尔公司 Method and apparatus for efficiently generating processor architecture model

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
US7904870B2 (en) * 2008-04-30 2011-03-08 International Business Machines Corporation Method and apparatus for integrated circuit design model performance evaluation using basic block vector clustering and fly-by vector clustering
CN103902443B (en) * 2012-12-26 2017-04-26 华为技术有限公司 Program running performance analysis method and device
CN103049310B (en) * 2012-12-29 2016-12-28 中国科学院深圳先进技术研究院 A kind of multi-core simulation parallel acceleration method based on sampling
CN104424101B (en) * 2013-09-10 2017-08-11 华为技术有限公司 The determination method and apparatus of program feature interference model
CN105654120B (en) * 2015-12-25 2019-06-21 东南大学苏州研究院 A kind of software load feature extracting method based on SOM and K-means two-phase analyzing method
CN105630458B (en) * 2015-12-29 2018-03-02 东南大学—无锡集成电路技术研究所 The Forecasting Methodology of average throughput under a kind of out-of order processor stable state based on artificial neural network
CN105677521B (en) * 2015-12-29 2019-06-18 东南大学苏州研究院 A kind of benchmark synthetic method towards mobile intelligent terminal processor

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN102110013A (en) * 2009-12-23 2011-06-29 英特尔公司 Method and apparatus for efficiently generating processor architecture model
CN101763291A (en) * 2009-12-30 2010-06-30 中国人民解放军国防科学技术大学 Method for detecting error of program control flow

Also Published As

Publication number Publication date
WO2019010656A1 (en) 2019-01-17
CN110178123A (en) 2019-08-27

Similar Documents

Publication Publication Date Title
CN110178123B (en) Performance index evaluation method and device
Diaz-Uriarte et al. Testing hypotheses of correlated evolution using phylogenetically independent contrasts: sensitivity to deviations from Brownian motion
CN111274134A (en) Vulnerability identification and prediction method and system based on graph neural network, computer equipment and storage medium
JP4627674B2 (en) Data processing method and program
CN107038457B (en) Telemetry data compression batch processing method based on principal component signal-to-noise ratio
US10248462B2 (en) Management server which constructs a request load model for an object system, load estimation method thereof and storage medium for storing program
Tousi et al. Comparative analysis of machine learning models for performance prediction of the spec benchmarks
CN110990603B (en) Method and system for format recognition of segmented image data
CN111260419A (en) Method and device for acquiring user attribute, computer equipment and storage medium
CN111897707B (en) Optimization method and device for business system, computer system and storage medium
CN108008999B (en) Index evaluation method and device
CN110348581B (en) User feature optimizing method, device, medium and electronic equipment in user feature group
KR20210143460A (en) Apparatus for feature recommendation and method thereof
WO2020099606A1 (en) Apparatus and method for creating and training artificial neural networks
CN116149917A (en) Method and apparatus for evaluating processor performance, computing device, and readable storage medium
CN108664368B (en) Processor performance index evaluation method and device
Ganesan et al. A case for generalizable DNN cost models for mobile devices
Shantharam et al. Exploiting dense substructures for fast sparse matrix vector multiplication
CN108037979B (en) Virtual machine performance degradation evaluation method based on Bayesian network containing hidden variables
CN114745366A (en) Method and apparatus for continuous monitoring telemetry in the field
CN107678734B (en) CPU benchmark test program set construction method based on genetic algorithm
CN112733433A (en) Equipment testability strategy optimization method and device
CN107025462B (en) real-time parallel compression processing method for telemetering data of unmanned aerial vehicle
CN115543719B (en) Component optimization method and device based on chip design, computer equipment and medium
KR102413753B1 (en) Information processing apparatus, information processing method, and information processing program stored in a recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant