CN108008999B - Index evaluation method and device - Google Patents

Index evaluation method and device Download PDF

Info

Publication number
CN108008999B
CN108008999B CN201610944177.6A CN201610944177A CN108008999B CN 108008999 B CN108008999 B CN 108008999B CN 201610944177 A CN201610944177 A CN 201610944177A CN 108008999 B CN108008999 B CN 108008999B
Authority
CN
China
Prior art keywords
coefficient
product
norm
optimization
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610944177.6A
Other languages
Chinese (zh)
Other versions
CN108008999A (en
Inventor
程捷
刘旭东
项定义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201610944177.6A priority Critical patent/CN108008999B/en
Publication of CN108008999A publication Critical patent/CN108008999A/en
Application granted granted Critical
Publication of CN108008999B publication Critical patent/CN108008999B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508Runtime interpretation or emulation, e g. emulator loops, bytecode interpretation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites

Abstract

The embodiment of the application provides an index evaluation method and device, and the method comprises the following steps: acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length; determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments; determining an object optimization model of the test program according to the feature matrix A, the column vector B, the weight vector W and a plurality of optimization objects, and solving an optimization column vector of the object optimization model; determining a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector; simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors; and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.

Description

Index evaluation method and device
Technical Field
The embodiment of the application relates to the field of computers, in particular to an index evaluation method and device.
Background
Personnel involved in the design and development of processor architectures often need to run test programs in an emulator of a certain architecture and then collect performance-related data indicators, such as Instruction Per Clock (IPC), hit rate of a second level Cache (L2 Cache), energy consumption, etc., to find the bottleneck of the current processor architecture. After the system architecture is improved, the design is redeployed in the simulator, the simulator is used again to run the test program, data are collected, the performance difference of the same test program which runs under the new and old system architectures is compared, then the bottleneck is found again. It can be seen that a great deal of design test work is done with a software simulator before deployment in hardware.
However, one of the major disadvantages of the software simulation platform is: the same test program is run at a much longer runtime than the hardware platform. Especially when running large, comprehensive test suite programs, such as SPEC CPU 2006, it often takes weeks or even months to get the data needed by itself. And after the system architecture is changed every time, the test program needs to be operated again to collect data under the new architecture. Therefore, such repeated operations and waits will seriously affect the development efficiency.
Research shows that the running process of the application program has obvious stage characteristics. The developer needs to shorten the time required by simulation from the perspective of simplifying test programs, and the core thought of the method is as follows: a user can obtain a group of artificial programs only by running the original test program once, each artificial program is very small and can be simulated quickly, and the time for simulating the group of artificial programs is far shorter than that for simulating the original test program; and (3) multiplying the data obtained after each artificial program simulation by a specific weight by a user, wherein the data obtained after weighted averaging is very close to the data obtained after the original artificial program runs.
In the existing Simpoint technology, the steps are as follows: grabbing an instruction stream of a test program component, cutting the instruction stream into segments (intervals) with equal length, and calculating a Basic Block Vector (BBV) of each Interval; performing K-Means clustering on all BBVs, and selecting an Interval closest to the center of each type as a representative Interval of the type; and simulating and testing the indexes of the selected representative intervals, and multiplying the indexes measured by the representative intervals of each category by the proportion of the Interval of the category to the total Interval to calculate a weighted average to represent the indexes of the original test program components. The simulation test in the prior art needs too many selected fragments, long simulation time and uncontrollable error.
How to reduce the instruction fragments selected in the simulation test to shorten the simulation time and/or reduce the error of the simulation index is a technical problem to be solved by the embodiment of the present application.
Disclosure of Invention
The embodiment of the application provides an index evaluation method and device, which can reduce instruction segments selected in simulation test to shorten simulation time and reduce simulation index errors, so that the simulation efficiency or precision of a processor can be improved.
In a first aspect, an index evaluation method is provided, including:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
With reference to the first aspect, in a first possible implementation manner, the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:
Figure GDA0003030818070000031
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:
Figure GDA0003030818070000041
wherein d represents the second threshold, | X | | non-volatile memoryt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm of W.AX-W.B, | X | | non-magnetic cellst3Represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the optimization objective is: the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X
Figure GDA0003030818070000042
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000043
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000044
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000045
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the first aspect, in a sixth possible implementation manner of the first aspect, the optimization objective is: in that
Figure GDA0003030818070000046
And/or the eighth norm of X is less than or equal to the second threshold, the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimized, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the product of the second coefficient and the first norm of X
Figure GDA0003030818070000047
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000048
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000049
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA00030308180700000410
and a column vector formed by the average values of the rows of the sparse matrix S, wherein the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the first aspect and the foregoing implementation manner, in a seventh possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect, reducing one non-zero element in X and replacing X is specifically implemented as: and on the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a ninth possible implementation manner of the first aspect, the characteristic index includes at least one of the following: IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.
In a second aspect, an index evaluation apparatus is provided, including:
the device comprises an acquisition unit, a processing unit and a control unit, wherein the acquisition unit is used for acquiring an instruction stream of a test program and dividing the instruction stream into N instruction fragments with equal length;
a first determining unit, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row of the feature matrix a in a test program;
a second determining unit, configured to determine an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index of the test program and a feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit, configured to determine a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction segment corresponding to the non-zero position in the optimized column vector;
the simulation unit is used for simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain the characteristic index vectors of the instruction segments corresponding to the number vectors, and the characteristic index vectors are used for expressing the characteristic indexes of the instruction segments corresponding to the number vectors;
and the fourth determination unit is used for determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
With reference to the second aspect, in a first possible implementation manner, the optimization goal is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector by using the following target optimization formula:
Figure GDA0003030818070000071
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X.
With reference to the second aspect, in a third possible implementation manner of the second aspect, the optimization goal is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector through the following target optimization formula:
Figure GDA0003030818070000072
wherein d represents the second threshold, | X | | non-volatile memoryt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm W.AX-W.B,||X||t3represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
With reference to the second aspect, in a fifth possible implementation manner of the second aspect, the optimization goal is: the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X
Figure GDA0003030818070000073
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000074
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000075
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000081
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the second aspect, in a sixth possible implementation manner of the second aspect, the optimization goal is: a sixth norm at W.AX-W.B or a seventh norm at AX-B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second normWhen the threshold value is used, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimized, the first product is the product of a first coefficient and a first norm of X, and the second product is the product of a second coefficient and a first norm of X
Figure GDA0003030818070000082
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000083
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000084
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000085
and a column vector formed by the average values of the rows of the sparse matrix S, wherein the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, the second determining unit is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect, reducing one non-zero element in X and replacing X is specifically implemented as: and on the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a ninth possible implementation manner of the second aspect, the characteristic index includes at least one of the following: IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.
In a third aspect, there is provided another index evaluation apparatus, including a processor and a channel interface, where the processor is configured to execute the method in the first aspect or any possible implementation manner of the first aspect through the channel interface.
In a fourth aspect, a computer-readable storage medium is presented for storing a computer program comprising instructions for performing the method of the first aspect or any possible implementation manner of the first aspect.
Based on the above technical solutions, the index evaluation method and apparatus in the embodiments of the present application construct a target optimization model of a test program based on a feature matrix, a column vector, and a weight vector of an instruction segment of the test program, determine an optimized column vector based on the target optimization model and an optimization target, determine an instruction segment and a coefficient vector corresponding to a non-zero position according to the optimized column vector, simulate the instruction segment corresponding to the optimized column vector in the simulation apparatus to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and obtain a feature index for evaluating the test program according to the feature index vector and the coefficient vector, thereby reducing the number of instruction segments required for simulation work of the test program or reducing a simulation index error of the test program.
Drawings
FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application.
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application.
FIG. 3 is an interaction flow diagram of an index evaluation method according to an embodiment of the present application.
FIG. 4 is a schematic diagram of a target optimization model of an embodiment of the present application.
FIG. 5 is a flow diagram of a method of determining an optimized column vector according to one embodiment of the present application.
Fig. 6 is a schematic structural diagram of an index evaluation device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation device according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
To facilitate understanding of the embodiments of the present application, several elements that will be introduced in the description of the embodiments of the present application are first introduced herein.
An instruction stream file: the file for recording the instruction stream information is called an instruction stream file, and each line of the instruction stream file represents information related to an executed instruction and conforms to a uniform format. Typically, the instruction stream file size is fixed. For example, an instruction stream file generally contains information such as 1 hundred million instructions, and a complete test program can be regarded as an instruction stream composed of a plurality of 1 hundred million instructions, each 1 hundred million instruction being referred to as an Interval (Interval), in other words, a complete test program is composed of a plurality of segments. The full set of test programs is all fragments and the subset of test programs is part of fragments. The purpose of simplifying the test program is to select representative partial segments from the corpus, and it is required that the number of the selected segments is as small as possible, and the operation results of the segments are as high as possible in similarity with the operation results of the original test program. An instruction may include the following information: program pointer, assembly instruction, operation type, and memory address. The memory address is a selectable item.
And (3) program pointer: the program pointer for each line of instructions is the address in memory of the line assembler instruction, which is a hexadecimal number beginning with "0 x".
Assembling instructions: the binary instruction code of the instruction needs to meet assembly syntax requirements.
The operation type is as follows: all assembly instructions can be divided into three categories: the arithmetic logic unit operates, reads the memory, writes the memory and controls the instruction.
Memory address: if the instruction is an arithmetic logic unit operation, no memory address information is needed; if the instruction is a memory read-write operation, a memory address is required.
Basic Block (BB): a piece of sequentially executed instructions. In general, an instruction stream may be divided into a plurality of BBs with a control instruction as a boundary. Each segment of the test program is composed of BB.
Basic Block feature indicator Vector (BBV): according to different control instructions, the execution times of BB of different types in each segment are counted, and a vector constructed based on the types and the execution times of the BB is called BBV. FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application. As shown in the instruction code fragment of FIG. 1, if the type ID of BB is {1,2,3,4,5}, and the corresponding execution times is {1,20,0,5,0}, then BBV of the fragment can be recorded as {1:1,2:20,3:0,4:5,5:0 }.
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application. The method of FIG. 2 is performed by an index evaluation device of a processor. Specifically, the index evaluation device may be a host apparatus or the like.
An instruction stream of a test program is obtained and divided into N instruction fragments of equal length 201.
202, determining a feature matrix a, a mean column vector B, and a weight vector W corresponding to the N instruction fragments.
The feature matrix a is a matrix with M rows and N columns for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature index vectors of each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indexes of the test program by types of the basic blocks BB corresponding to each row of the feature matrix a.
Specifically, in the embodiment of the present application, the characteristic index may be any one of the following: instruction Per Clock (IPC), predicted branch success/failure, L2 Cache hit, energy consumption, etc.
203, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W and an optimization goal.
The optimization target comprises reducing the number of instruction segments used for simulation and/or reducing errors between characteristic indexes of the evaluation test program and characteristic indexes of the test program, and the optimization column vector is used for representing instruction segments used for simulation in the N instruction segments and weights corresponding to the instruction segments used for simulation.
Specifically, for example, an objective optimization model may be established according to the feature matrix a, the column vector B, the weight vector W and an optimization objective, the objective optimization model is used to represent the optimization objective through a relationship between A, B, W and an optimization variable X of the objective optimization model, and the optimization column vector is a value of the optimization variable X satisfying the constraint relationship.
And 204, determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector.
The number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.
205, the instruction segments corresponding to the number vectors are simulated in the simulation device to obtain the feature index vectors of the instruction segments corresponding to the number vectors.
The feature indicator vector is used to indicate the feature indicator of the instruction segment corresponding to the number vector.
And 206, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
Optionally, as an embodiment, the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.
It should be understood that in the embodiments of the present application, the firstThe norm, the second norm, the third norm, etc. may be any one of the following norms: 0 norm, 1 norm,
Figure GDA0003030818070000121
Norm, p-norm or nuclear norm, etc., where p is a positive integer. It should be understood that the first to eighth norms, etc. mentioned elsewhere in this application may also be any of the following norms: 0 norm, 1 norm,
Figure GDA0003030818070000122
Norm, p-norm, or nuclear norm, etc. Taking X as an example, 0 norm | X | of X is non-woven01 norm of X (| X | | non-woven phosphor) of X1The kernel norm X of X is not counting*X of
Figure GDA0003030818070000123
Norm of
Figure GDA0003030818070000124
P norm | X | Y | of X non-woven gridpAnd p is a positive integer.
Specifically, step 203 may determine the optimized column vector by equation 1:
Figure GDA0003030818070000125
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X. It is understood that the operator symbol, represents an inner product, e.g., [ x ]1,x2,x3]·[y1,y2,y3]=x1 y1+x2 y2+x3 y3
It is not assumed that the first norm of W.AX-W.B in equation 1 is | | | W.AX-W.B | | survival0The second norm of X | | non-conducting phosphor1A 0 norm | | X | | non-woven phosphor of X0Then equation 1 can now be as shown in equation 2:
Figure GDA0003030818070000131
the method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
Optionally, as another embodiment, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
Specifically, step 203 may determine the optimized column vector by equation 3:
Figure GDA0003030818070000132
wherein d represents the second threshold, | X | | ceilingt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm of W.AX-W.B, | X | | non-magnetic cellst3Represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
It should be understood that, in the embodiment of the present application, the first norm of X and the third norm of X may be any one of the following norms: 0 norm of X | | non-woven phosphor 01 norm of X (| X | | non-woven phosphor) of X1X of
Figure GDA0003030818070000133
Norm of
Figure GDA0003030818070000134
P norm | X | Y | of X non-woven gridpAnd p is a positive integer. The first norm of X and the third norm of X may be the same or different。
It should be understood that in the embodiment of the present application, the second norm of W · AX-W · B may be any one of the following norms: W.AX-W.B has a 0 norm | | | W.AX-W.B | | luminance 01 norm of W.AX-W.B
||W·AX-W·B||1W.AX-W.B
Figure GDA0003030818070000137
Norm of
Figure GDA0003030818070000135
P norm of W.AX-W.B | | | W.AX-W.B | | luminancepAnd p is a positive integer.
It should be understood that, in the first coefficient and the second coefficient of the embodiments of the present application, at least 1 term is not 0. For example, λ ≠ 0.
It is not assumed that the first norm of X in equation 3 | | | X | | luminancetA 0 norm | | X | | non-woven phosphor of X0The third norm of X | | non-conducting phosphortA 1-norm | | X | | non-woven phosphor of X1The second norm of W.AX-W.B is 2 norms of W.AX-W.B | | tormenting2If the values of the first coefficient and the second coefficient are both 1, formula 3 can be as shown in formula 4:
Figure GDA0003030818070000136
it should be understood that, in the embodiment of the present application, the values of the first coefficient and the second coefficient are determined by the optimization objective. The values of the first coefficient and the second coefficient determine whether the optimization target is smaller instruction fragments for simulation or smaller simulation errors.
Optionally, as another embodiment, the optimization objective is: so that the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimum, wherein the first product is a product of a first coefficient and a first norm of X, the second product is a product of a second coefficient and a second norm of LX-L, the third product is a product of a third coefficient and a third norm of SX-S, the fourth product is a product of a fourth coefficient and a fourth norm of L, the fifth product is a product of a fifth coefficient and a fifth norm of S, A is L + S, L is a low-rank matrix, S is a sparse matrix, L represents a column vector formed by an average value of each row of the low-rank matrix L, and S represents a column vector formed by an average value of each row of the sparse matrix S, wherein, X is an optimization variable used for representing instruction segments used for simulation among the N instruction segments and a weight corresponding to each instruction segment, and the optimized column vector is a value of X that satisfies the optimization objective.
Specifically, step 203 may specifically determine the optimized column vector by equation 5:
Figure GDA0003030818070000141
wherein | X | Y luminancet1A first norm representing X is given,
Figure GDA0003030818070000142
to represent
Figure GDA0003030818070000143
Is measured in a first direction of the first norm,
Figure GDA0003030818070000144
to represent
Figure GDA0003030818070000145
Is a third norm, | L | | non-woven phosphort4Represents the fourth norm of L, | S | | non-woven phosphort5Denotes a fifth norm of S, λ denotes a first coefficient, δ denotes a second coefficient, α denotes a third coefficient, β denotes a fourth coefficient, and γ denotes a fifth coefficient.
It should be understood that at least one of the first coefficient, the second coefficient, the third coefficient, the fourth coefficient, and the fifth coefficient in the embodiments of the present application is not 0.
Specifically, when the first norm of X is 1 norm, the second norm of LX-L is 2 norms, the third norm of SX-S is 2 norms, the fourth norm of L is 1 norm, and the fifth norm of S is 1 norm, equation 5 can be expressed as shown in equation 6:
Figure GDA0003030818070000146
optionally, as an embodiment, the optimization objective is: when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold value and/or the eighth norm of X is less than or equal to a second threshold value, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and the first norm of X, and the second product is the product of a second coefficient and the first norm of X
Figure GDA0003030818070000147
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000148
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000149
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000151
and representing a column vector formed by the average values of all rows of the sparse matrix S, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold, step 203 may determine the optimized column vector by equation 7:
Figure GDA0003030818070000152
wherein c represents a first threshold, | W.AX-W.B | | calceit6Represents the sixth norm of W.AX-W.B, | | AX-B | | non-combustible cellst7Represents the seventh norm of AX-B, and the remaining individual parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint condition of the optimization objective is that the sixth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 8:
Figure GDA0003030818070000153
wherein d represents a second threshold, | X | | non-volatile memoryt6Denotes the sixth norm of X, and the remaining respective parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold and the eighth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 9:
Figure GDA0003030818070000154
wherein c represents a first threshold, d represents a second threshold, and | X | | magnetism is not present when a sixth norm of W.AX-W.B or a seventh norm of AX-B is less than or equal to the first thresholdt8Represents the eighth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
Specifically, the detailed process of step 203 may be as follows:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the steps 3-7.
In a specific implementation manner, the reduction of one non-zero element in X and the replacement of X in step 5 can be specifically implemented as: the lagrange multiplier Z is updated according to the residual of equation X ═ Y and the penalty factor μ is increased by a fixed multiple ρ, where ρ > 1. Of course, it should be understood that there may be other specific implementation manners of step 5, for example, decreasing the non-zero elements in X and replacing X one by one in the order of decreasing the non-zero elements from small to large or from large to small until X satisfies the constraint condition of the target optimization model, and so on.
In order to facilitate understanding of the method of the embodiments of the present application, the method of the embodiments of the present application will be further described with reference to specific embodiments.
FIG. 3 is a flowchart illustrating evaluation of metrics by a processor according to an embodiment of the present disclosure. It should be understood that in the embodiment of the present application, BBV is used as the feature index vector of the test program. The method of the embodiment of the present application is executed by an index evaluation apparatus of a processor disposed on a host device, where the apparatus may be an analog system or an execution chip, and the like, and the embodiment of the present application is not limited herein.
301, an instruction stream is fetched.
In one particular example, the host device may use an instruction stream fetching software tool to fetch an instruction stream of the test program, the instruction stream being composed of instructions. The host device can segment the instruction stream into BBs according to the jump instruction, and thus the instruction stream can be seen as being composed of BB sequences.
302, the instruction stream is partitioned into instruction fragments, and the BBV of each instruction fragment is calculated.
The host device may divide the instruction stream acquired in step 301 into a plurality of instruction fragments (intervals). For example, the instruction stream may be divided into instruction fragments of equal length, and the number and entry address of each instruction fragment may be recorded. Specifically, for example, each instruction fragment contains 1 billion BB, and so on. Of course, the division may be performed according to other instruction lengths, for example, 2 hundred million BBs, 10 hundred million BBs, etc., and the embodiment of the present application is not limited herein.
It is not assumed that the number of divided instruction fragments is N.
303, construct a matrix A, B, W.
The host device may construct the feature matrix a according to the BBV of each instruction segment after the instruction stream is split. Where each row represents the BBV of one instruction fragment.
The host device may calculate a column vector B based on the average value of each row in the feature matrix a. Specifically, the value of each row in B can be expressed by the following equation 10:
Figure GDA0003030818070000171
wherein, i is more than or equal to 1 and less than or equal to N, and M represents the number of BB types in the instruction segment.
In addition, a weight vector W may be set according to the difference in weight of the type of BB in the characteristic index of the test program. W is the column vector. Wherein each value of W represents a weight of the corresponding type of BB in the characteristic index of the test program.
FIG. 4 is a schematic diagram of a target optimization model according to an embodiment of the present application. As shown in FIG. 4, the rows of the feature matrix A correspond to the types of BB, the columns correspond to the divided instruction fragments, the feature matrix A is a matrix of M rows × N columns, and the element A in Ai,jIndicates the j-th instructionThe characteristic index value of the ith BB in the segment. The sparse solution matrix X is an optimization variable X in the embodiment shown in fig. 2, a position number of a non-zero coefficient in X is used to indicate a number of a selected instruction segment, a value of the non-zero coefficient indicates a weight of a finally calculated characteristic index of a characteristic index of the instruction segment corresponding to the non-zero coefficient, and the sparse solution matrix X is an N row × 1 column matrix. The column mean matrix B of a is a column vector formed by the column means of each column in a, and the matrix B is a matrix of M rows × 1 column, and its calculation method is shown in equation 10. The weight matrix W is a matrix of M rows by 1 column, and the ith element W in the weight matrixiAnd (3) representing the weight of the feature vector of the ith BB in the finally calculated feature index. As shown in fig. 4, X is a value satisfying AX · W — B · W. It is understood that AX.W and W.AX are equivalent; B.W and W.B are equivalent.
An optimized column vector is determined 304 based on A, B, W and the optimization objective.
Specifically, an objective optimization model may be constructed from A, B, W and the optimization objectives, and the optimization column vectors of the objective optimization model may be solved.
An optimization variable X can be introduced, wherein a non-zero element in X represents a selected instruction segment, and the value of the non-zero element in X represents the value of the weight corresponding to the instruction segment. That is, X is used to represent the instruction segment for simulation among the N instruction segments of the test program and the weight corresponding to each instruction segment.
It should be understood that in practical applications, W · AX and W · B will have a certain error, and the smaller the error, the smaller the simulation error; further, the smaller the number of non-zero coefficients in X, the fewer instruction fragments to be selected for the simulation test, and the shorter the time required for the simulation test. Thus, an objective optimization model may be constructed based on A, B, W according to the optimization objectives.
It should be appreciated that if the cost of the number of instruction fragments is high, then a higher weight may be placed on the optimization objective that affects the number of X non-zero coefficients. At this point, the optimization objective may include limiting the number of non-zero coefficients in X.
Specifically, for example, the number of non-zero coefficients in X is minimizedAnd (4) transforming. At this time, it can be achieved by minimizing the norm of X. The X norm may include the 0 norm | | | X | | luminance of X 01 norm of X (| X | | non-woven phosphor) of X1X of
Figure GDA0003030818070000181
Figure GDA0003030818070000182
Or, for example, such that the 1 norm of X is less than d. At this time, | | X | | non-woven phosphor can be restricted1≤d,d∈R+. Reducing non-zero numbers in X is achieved by constraining the 1-norm reduction of X to be less than a value that results in a reduction of non-zero numbers in X.
Of course, other constraint formulas may exist, and the embodiments of the present application are not listed here.
It should be appreciated that if certain simulation error accuracy is required, higher weights may be placed on the optimization objectives affecting W.AX-W.B.
Specifically, for example, the value of W · AX-W · B is minimized, or W · AX-W · B is smaller than d, or the norm of W · AX-W · B is smaller than d, or the like.
Based on the difference of the optimization objectives, different objective optimization models can be set.
For example, assume that the optimization goal of the target optimization model is to minimize the third norm of X when either the first norm of W.AX-W.B or the second norm of AX-B is less than or equal to the first threshold. At this time, the objective optimization model may be expressed by equation 1 above.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product and a second product when a first norm of X is smaller than or equal to a second threshold, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number. At this time, the target optimization model may be expressed by equation 3 above.
As another example, assume that the optimization objective of the objective optimization model is such that the first product,The sum of a second product, a third product, a fourth product and a fifth product is minimum, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a fifth product
Figure GDA0003030818070000191
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000192
A fourth product is a product of a fourth coefficient and a fourth norm of L, a fifth product is a product of a fifth coefficient and a fifth norm of S, a is L + S, L is a low-rank matrix, S is a sparse matrix, L represents a column vector formed by an average value of each row of the low-rank matrix L, and S represents a column vector formed by an average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equation 5.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold and/or the eighth norm of X is less than or equal to a second threshold, wherein the first product is the product of a first coefficient and the first norm of X, and the second product is the product of a second coefficient and the first norm of X
Figure GDA0003030818070000193
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000194
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000195
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000196
a column vector representing an average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equations 7, 8, 9, and the like.
Without assuming that the formula of the objective optimization model of the embodiment of the present application is shown in formula 4, a method for establishing the objective optimization model and solving the optimized column vectors of the embodiment of the present application is shown in fig. 5. Fig. 5 is a flowchart of a method for determining an optimized column vector according to an embodiment of the present application.
S501, establishing a target optimization model, and determining a Lagrangian function f (X, Y, Z and mu) according to the target optimization model.
Specifically, an objective optimization model may be established based on A, B, W and an optimization objective, where the objective optimization model is used to represent a relationship between A, B, W and an optimization variable X under the optimization objective, and a value of X that satisfies the objective optimization model is an optimization column vector.
To solve for the optimized column vector, the lagrangian function f (X, Y, Z, μ) may be determined. Specifically, a relaxation variable Y can be introduced, a lagrangian multiplier method is used, a lagrangian multiplier Z is introduced, μ is a penalty parameter, and the value is a positive real number. The variable X of the Lagrange function is the optimized variable X of the target optimization model.
S502, setting initial values of variables X, Y, Z and mu in the Lagrangian function, and setting a convergence condition of the optimization completion.
For example, a convergence condition for completion of optimization when the profit of optimization is less than a predetermined threshold may be set. Specifically, taking the target optimization model shown in formula 4 as an example, if the obtained | | W · AX-W · B | | survival rate2+||X||1The value of (A) is less than the (| W.AX-W.B |) of the last circulation2+||X||1The magnitude of the decrease in the value of (b) is less than a predetermined threshold V, the optimization is deemed complete.
For another example, a maximum number of loop executions may be set, and when the loop reaches the maximum number, a convergence condition for completion of optimization may be set.
Of course, it should be understood that a plurality of determination conditions may be set, and any one of them may be satisfied. For example, if | | W · AX-W · B | | Wy calculation is obtained2+||X||1The value of (A) is less than the (| W.AX-W.B |) of the last circulation2+||X||1The amplitude of the decrease of the value of (b) is less than a predetermined threshold V, or the number of cycles reaches the maximum number, which is a convergence condition for completing optimization, and so on.
S503, an optimal value of Y is calculated using the least square method.
For a specific implementation of calculating Y by using the least square method, reference may be made to the prior art, and details of the embodiment of the present application are not repeated herein.
S504, the optimal value of X is calculated by using a soft threshold method.
For a specific implementation of calculating X by using the soft threshold method, reference may be made to the prior art, and details of the embodiment of the present application are not described herein again.
And S505, when the value of the X does not meet the constraint condition of the target optimization model, reducing a non-zero element in the X and replacing the X until the X meets the constraint condition of the target optimization model.
It should be understood that step S505 has different implementations according to different target optimization models.
For example, assuming that the target optimization model is the expression shown in formula 4, in step S505, the non-zero elements of X may be reduced to not more than d.
There are many ways to reduce non-zero elements in X.
Preferably, on the premise that X is reduced by one non-zero element, when X formed by reducing a first non-zero element in X minimizes the function value of the target optimization model, the first non-zero element is reduced and replaces X.
For example, in the target optimization model assumed in equation 4, in order to reduce the non-zero elements of X to no more than d, in order to make | | W · AX-W · B | | white magnetism2+||X||1The value of (A) is taken as the minimum value as much as possible, and when one nonzero element is reduced each time, the (| | W.AX-W.B | |) of each nonzero element can be reduced relatively2+||X||1Selecting | | | W.AX-W.B | | non-woven cells2+||X||1The minimum value of (a) is used as a principle of reducing the non-zero elements until the number of the non-zero elements does not exceed d.
Of course, the non-zero elements may be reduced according to the principle that the value of the non-zero elements in X is from large to small or from small to large. The method has the advantages of simple calculation, and smaller function value of the target optimization model due to the fact that multiple comparisons for reducing non-zero elements are not needed.
S506, the lagrange multiplier is updated according to the residual error of the equation X ═ Y, and the penalty coefficient μ is increased by a fixed multiple ρ, where ρ > 1.
S507, judging whether the convergence condition of the Lagrangian function f (X, Y, Z and mu) is met.
If the convergence condition of the Lagrangian function f (X, Y, Z, mu) is met, outputting the current value of X, and ending the execution;
if the convergence condition of the Lagrangian function f (X, Y, Z, μ) is not satisfied, step 503 is performed.
Steps S503 to S507 are executed in a loop until the convergence condition of the lagrangian function f (X, Y, Z, μ) is satisfied.
Of course, it should be understood that there may be other methods of solving the objective optimization model, and the embodiments of the present application are not limited thereto.
The number vector and coefficient vector of the instruction fragment to which the optimized column vector corresponds are determined 305.
The optimized column vector determined in step 304 is not recorded as X0
It is understood that X0The numbering of the non-zero position-corresponding segments is recorded in a numbering vector D, X0The non-zero coefficients are recorded in a coefficient vector G. The number vector D is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector G is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.
And 306, simulating the instruction segments corresponding to the number vector D in the simulation device respectively to obtain the characteristic index vector of the instruction segment corresponding to the number vector D.
And the simulation test system simulates the instruction segment corresponding to the number vector D by using a simulation tool according to the number vector D to obtain the characteristic indexes of each instruction segment, and the characteristic indexes of each instruction segment corresponding to the number vector D form a characteristic index vector C.
The simulation tool may be a software or hardware emulator, such as LiveSP, Checkpoint, or the like. The characteristic indexes of the instruction segment obtained by the simulation tool can be IPC, predicted branch success/failure rate, energy consumption and the like.
307, determining the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program.
From the coefficient vector D of step 305 and the characteristic index vector C of step 306, a characteristic index C · D of the evaluation test program can be obtained.
And finishing the index evaluation of the test program.
Fig. 6 is a schematic structural diagram of an index evaluation device 600 according to an embodiment of the present application. As shown in fig. 6, the index evaluation device 600 may include:
an obtaining unit 601, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal lengths;
a first determining unit 602, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row in the feature matrix a in a test program;
a second determining unit 603, configured to determine an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal, where the optimization goal includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index of the test program and a feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit 604, configured to determine a number vector and a coefficient vector of the instruction fragment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction fragment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction fragment corresponding to the non-zero position in the optimized column vector;
a simulation unit 605, configured to simulate the instruction segments corresponding to the number vector in a simulation device, respectively, to obtain a feature indicator vector of the instruction segment corresponding to the number vector, where the feature indicator vector is used to indicate a feature indicator of the instruction segment corresponding to the number vector;
a fourth determining unit 606, configured to determine an inner product of the feature indicator vector and the coefficient vector as a feature indicator for evaluating the test program.
In the embodiment of the application, a target optimization model of a test program is constructed based on a feature matrix, a column vector and a weight vector of an instruction segment of the test program, an optimized column vector is determined based on the target optimization model and an optimized target, an instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in a simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
The index evaluation device 600 may also execute the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation apparatus 700 according to an embodiment of the present application. The metric evaluation device 700 may include a channel interface 701, a processor 702, and a memory 703.
The channel interface 701, the processor 702 and the memory 703 are interconnected by a bus 704 system. Bus 704 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but this does not indicate only one bus or one type of bus.
The memory 703 is used for storing programs. In particular, the program may include program code comprising computer operating instructions. Memory 703 may include both read-only memory and random-access memory, and provides instructions and data to processor 702. The memory 703 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor 702 executes the program stored in the memory 703, and is specifically configured to perform the following operations:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
The method performed by the index estimation apparatus disclosed in any one of fig. 2, fig. 3, and fig. 5 may be applied to the processor 702, or implemented by the processor 702. The processor 702 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 702. The Processor 702 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in the memory 703, and the processor 702 reads the information in the memory 703 and performs the steps of the above method in combination with the hardware thereof.
The index evaluation device 700 may also perform the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this application.
The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program includes instructions for executing the method of the embodiment shown in fig. 2.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1. An index evaluation method, comprising:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimized target, wherein the optimized target comprises reducing the number of instruction fragments for simulation and/or reducing errors between characteristic indexes of the evaluation test program and the characteristic indexes of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector, wherein the number vector is used for representing the number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
2. The method of claim 1, wherein the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
3. The method of claim 2, wherein the determining an optimized column vector based on the feature matrix a, the mean column vector B, the weight vector W, and an optimization objective comprises:
determining the optimized column vector by the following target optimization formula:
Figure FDA0003030818060000021
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X.
4. The method of claim 1, wherein the optimization objective is:
when a first norm of X is smaller than or equal to a second threshold, minimizing the sum of a first product and a second product, wherein the first product is a product of a second norm of W.AX-W.B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, a value of the first coefficient is a non-zero real number, X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and a weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
5. The method of claim 4,
the determining an optimized column vector according to the feature matrix a, the mean column vector B, the weight vector W, and an optimization objective includes:
determining the optimized column vector by the following target optimization formula:
min λ||W·AX-W·B||t2+δ||X||t3
s.t.||X||t1≤d
wherein d represents the second threshold, | X | | ceilingt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm of W.AX-W.B, | X | | non-magnetic cellst3Represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
6. The method of claim 1, wherein the optimization objective is:
the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X
Figure FDA0003030818060000022
Is the product of a third coefficient and a second norm of
Figure FDA0003030818060000023
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure FDA0003030818060000024
a column vector representing the average of the rows of the low rank matrix L,
Figure FDA0003030818060000025
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
7. The method of claim 1, wherein the optimization objective is:
maximizing the sum of the first product, the second product, the third product, the fourth product, and the fifth product when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold, and/or the eighth norm of X is less than or equal to a second thresholdSmall, wherein the first product is a product of a first coefficient and a first norm of X, and the second product is a product of a second coefficient and
Figure FDA0003030818060000031
is the product of a third coefficient and a second norm of
Figure FDA0003030818060000032
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure FDA0003030818060000033
a column vector representing the average of the rows of the low rank matrix L,
Figure FDA0003030818060000034
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
8. The method of any of claims 1-7, wherein the determining an optimized column vector based on the feature matrix A, the mean column vector B, the weight vector W, and an optimization objective comprises:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the mean column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of mu, and setting the optimization target as a convergence condition of the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not meet the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X meets the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.
9. The method of claim 8, wherein the reducing one non-zero element of X and replacing X comprises:
and on the premise that X is reduced by one non-zero element, when X formed by reducing a first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
10. The method of claim 1, wherein the characteristic indicator comprises at least one of:
IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.
11. An index evaluation apparatus of a processor, comprising:
the device comprises an acquisition unit, a processing unit and a control unit, wherein the acquisition unit is used for acquiring an instruction stream of a test program and dividing the instruction stream into N instruction fragments with equal length;
a first determining unit, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is a number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB types in each row of the feature matrix a in a test program;
a second determining unit, configured to determine an optimized column vector according to the feature matrix a, the mean column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing an error between an evaluation result of the feature indicator of the test program and the feature indicator of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit, configured to determine a number vector and a coefficient vector of an instruction segment corresponding to the optimized column vector, where the number vector is used to represent a number of an instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to represent a weight of a feature index of the instruction segment corresponding to the non-zero position in the optimized column vector;
the simulation unit is used for simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, and the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and the fourth determination unit is used for determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
12. The apparatus of claim 11, wherein the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
13. The apparatus of claim 12, wherein the second determining unit is to determine the optimized column vector by the following target optimization formula:
Figure FDA0003030818060000051
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X.
14. The apparatus of claim 11, wherein the optimization objective is:
when a first norm of X is smaller than or equal to a second threshold, minimizing the sum of a first product and a second product, wherein the first product is a product of a second norm of W.AX-W.B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, the value of the first coefficient is a non-zero real number, X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
15. The apparatus of claim 14, wherein the second determining unit is to determine the optimized column vector by the following target optimization formula:
Figure FDA0003030818060000052
wherein d represents the second threshold, | X | | ceilingt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents W.AX-W.A second norm of B, | X | | non-woven phosphort3Represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
16. The apparatus of claim 11, wherein the optimization objective is:
the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X
Figure FDA0003030818060000053
Is the product of a third coefficient and a second norm of
Figure FDA0003030818060000054
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure FDA0003030818060000055
a column vector representing the average of the rows of the low rank matrix L,
Figure FDA0003030818060000056
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
17. The apparatus of claim 11, wherein the optimization objective is:
a sixth norm at W.AX-W.B or a seventh norm at AX-B is less than or equal to a first threshold, and/orWhen the eighth norm of X is less than or equal to the second threshold, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the product of the second coefficient and the first norm of X
Figure FDA0003030818060000061
Is the product of a third coefficient and a second norm of
Figure FDA0003030818060000062
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure FDA0003030818060000063
a column vector representing the average of the rows of the low rank matrix L,
Figure FDA0003030818060000064
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
18. The apparatus according to any of claims 11-17, wherein the second determining unit determines an optimized column vector from the feature matrix a, the mean column vector B, the weight vector W, and an optimization goal, embodied as:
step 1: establishing a target optimization model of the optimization target according to the feature matrix A, the mean column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of mu, and setting the optimization target as a convergence condition of the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not meet the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X meets the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.
19. The apparatus of claim 18, wherein the second determining unit reduces one non-zero element in X and replaces X by:
and on the premise that X is reduced by one non-zero element, when X formed by reducing a first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
20. The apparatus of claim 11, wherein the characteristic indicator comprises at least one of:
IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.
CN201610944177.6A 2016-11-02 2016-11-02 Index evaluation method and device Active CN108008999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610944177.6A CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610944177.6A CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Publications (2)

Publication Number Publication Date
CN108008999A CN108008999A (en) 2018-05-08
CN108008999B true CN108008999B (en) 2021-07-20

Family

ID=62048196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610944177.6A Active CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Country Status (1)

Country Link
CN (1) CN108008999B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023015560A1 (en) * 2021-08-13 2023-02-16 Huawei Technologies Co.,Ltd. Systems and methods for sparsity-aware vector processing in general purpose cpus
CN115543719B (en) * 2022-11-24 2023-04-07 飞腾信息技术有限公司 Component optimization method and device based on chip design, computer equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6751792B1 (en) * 2000-10-04 2004-06-15 Sun Microsystems, Inc. Using value-expression graphs for data-flow optimizations
CN103049310A (en) * 2012-12-29 2013-04-17 中国科学院深圳先进技术研究院 Multi-core simulation parallel accelerating method based on sampling
CN104268085A (en) * 2014-10-24 2015-01-07 重庆邮电大学 Software vulnerability discovery system and method based on attribute extraction
CN105654120A (en) * 2015-12-25 2016-06-08 东南大学—无锡集成电路技术研究所 Two-step cluster software load feature extraction method based on SOM and K-means

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9098352B2 (en) * 2013-07-17 2015-08-04 Deja Vu Security, Llc Metaphor based language fuzzing of computer code

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6751792B1 (en) * 2000-10-04 2004-06-15 Sun Microsystems, Inc. Using value-expression graphs for data-flow optimizations
CN103049310A (en) * 2012-12-29 2013-04-17 中国科学院深圳先进技术研究院 Multi-core simulation parallel accelerating method based on sampling
CN104268085A (en) * 2014-10-24 2015-01-07 重庆邮电大学 Software vulnerability discovery system and method based on attribute extraction
CN105654120A (en) * 2015-12-25 2016-06-08 东南大学—无锡集成电路技术研究所 Two-step cluster software load feature extraction method based on SOM and K-means

Also Published As

Publication number Publication date
CN108008999A (en) 2018-05-08

Similar Documents

Publication Publication Date Title
US7549069B2 (en) Estimating software power consumption
CN110178123B (en) Performance index evaluation method and device
US20150347666A1 (en) Micro-benchmark analysis optimization for microprocessor designs
CN114004352B (en) Simulation implementation method, neural network compiler and computer readable storage medium
CN109726764A (en) A kind of model selection method, device, equipment and medium
CN108008999B (en) Index evaluation method and device
Abate et al. Adaptive aggregation of Markov chains: Quantitative analysis of chemical reaction networks
CN114818600A (en) Chip verification method and device, electronic equipment and storage medium
CN116149917A (en) Method and apparatus for evaluating processor performance, computing device, and readable storage medium
CN111176253B (en) Monte Carlo-based flight control law automatic evaluation method
JP5153724B2 (en) Processing time estimation device and processing time estimation program
CN108304352B (en) Accelerated life test sample distribution method and terminal equipment
CN108664368B (en) Processor performance index evaluation method and device
CN107769987B (en) Message forwarding performance evaluation method and device
CN107678734B (en) CPU benchmark test program set construction method based on genetic algorithm
CN109815104B (en) GPGPU program approximate analysis system and method based on soft error perception
EP3518153A1 (en) Information processing method and information processing system
WO2016181549A1 (en) Performance evaluation apparatus and performance evaluation program
CN115543719B (en) Component optimization method and device based on chip design, computer equipment and medium
CN112602059A (en) Generating vector predicate digests
Herglotz et al. Estimation of non-functional properties for embedded hardware with application to image processing
KR20200072391A (en) Method and apparatus for predicting game indicator information
CN110750856B (en) Effective instruction window size assessment method based on machine learning
KR102413753B1 (en) Information processing apparatus, information processing method, and information processing program stored in a recording medium
CN108628731B (en) Method for selecting test instruction and processing equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant