CN110197026A - Processor core optimization method and system based on near-threshold computing - Google Patents
Processor core optimization method and system based on near-threshold computing
- Publication number
- CN110197026A CN201910449741.0A
- Authority
- CN
- China
- Prior art keywords
- degree
- voltage
- data group
- approximation
- approximation data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Test And Diagnosis Of Digital Computers (AREA)
- Power Sources (AREA)
Abstract
The invention discloses a processor core optimization method and system based on near-threshold computing. The method comprises: obtaining multiple voltage-approximation degree data groups; using the data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each data group; taking the performance, energy consumption and output quality prediction values of all data groups as input and solving an objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation degree data group; and determining the voltage in the optimal data group as the optimal voltage in the near-threshold computing state and the approximation degree value in the optimal data group as the optimal approximation degree. The invention selects the voltage level and approximation degree automatically, achieving a three-dimensional optimization in which energy consumption, performance and output quality are jointly optimal, with high reliability.
Description
Technical field
The present invention relates to the technical field of processor core system optimization, and in particular to a processor core optimization method and system based on near-threshold computing.
Background art
The emergence of the "power wall" has become a challenge that hinders the development of computers. As manufacturing processes continue to advance, a large portion of the chip must be left unused to keep power consumption within an acceptable range. This unused portion is commonly called "dark silicon", and it in turn brings the problem of the "utilization wall". Data show that at the 10 nm process node, 52% of the chip area is in the "dark silicon" state. To solve the "utilization wall" problem, researchers proposed near-threshold computing (NTC); in the NTC state, all transistors operate in the near-threshold voltage range. Near-threshold computing offers a good trade-off between performance and power consumption: compared with super-threshold computing (STC), it incurs only a small performance loss for the same energy saving. For example, a 50% energy saving in the near-threshold state causes a 20% performance loss, whereas obtaining the same energy saving under super-threshold computing brings a higher loss.
However, near-threshold computing faces new reliability challenges, which become particularly evident as manufacturing processes advance. Process variation affects the basic characteristics of the devices on a chip, and this has become an unavoidable problem for the industry. Process variation is especially prominent under near-threshold computing, mainly because the average error rate of each unit rises while the variation across the whole chip also increases. To address the reliability problem, researchers have proposed many fault-tolerance techniques, such as error correction, reconfiguration and hardware redundancy. These techniques aim to eliminate errors completely, which inevitably brings additional fault-tolerance overhead.
Many current applications, such as pattern recognition, data mining and speech recognition, are inherently error-tolerant. In these applications, inexact computation and inexact data are acceptable. For example, search engine results that do not exactly match the query are still acceptable, and because human perception is limited, some frames in a video can be skipped. Users care only about the results of these applications, not about whether the intermediate process executes correctly. Errors that occur in these applications therefore need not be corrected, which reduces the unnecessary performance and time overhead caused by fault tolerance. Because these applications are insensitive to errors, they are friendly to near-threshold computing and can alleviate its reliability problem. To exploit this error tolerance, researchers have proposed approximation methods at the hardware layer and the software layer. These approximations deliberately introduce errors, but their effect on final output quality is limited, and the resulting quality loss stays within the range the user can accept.
Currently, processor core optimization methods generally take one of two approaches. 1) Voltage regulation techniques, which consider how to adjust the voltage to improve energy efficiency. This approach considers only voltage: it considers neither the selection of an approximation technique nor the three-dimensional optimization of energy consumption, performance and output quality, which results in low reliability. For example, dynamic voltage and frequency scaling (DVFS) dynamically adjusts the operating frequency and voltage of a chip according to the differing computing demands of the applications it runs (for a given chip, a higher frequency requires a higher voltage), thereby saving energy. This popular technique considers only voltage adjustment and cannot achieve multi-dimensional optimization. 2) Techniques that consider how the output accuracy of an application changes under different approximation degrees; none of them, however, automatically selects a suitable combination of approximation degree and voltage. Existing optimization methods therefore usually achieve only one-dimensional optimization and treat near-threshold computing and approximate computing in complete isolation. What is lacking is a method that automatically selects the voltage level and approximation degree to obtain a three-dimensional optimization in which energy consumption, performance and output quality are jointly optimal.
Summary of the invention
On this basis, it is necessary to provide a processor core optimization method and system based on near-threshold computing that selects the voltage level and approximation degree automatically, ensures that the processor core runs with optimal energy consumption, performance and output quality, achieves multi-dimensional optimization, and improves the reliability of processor core optimization.
To achieve the above object, the present invention provides the following scheme:
A processor core optimization method based on near-threshold computing, comprising:
Obtaining multiple voltage-approximation degree data groups; each voltage-approximation degree data group includes a voltage and a corresponding approximation degree value;
Using the multiple voltage-approximation degree data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each data group;
Constructing an objective optimization function;
Taking the performance prediction values, energy consumption prediction values and output quality prediction values of all voltage-approximation degree data groups as input, and solving the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation degree data group;
Determining the voltage in the optimal voltage-approximation degree data group as the optimal voltage in the near-threshold computing state, and taking the approximation degree value in the optimal data group as the optimal approximation degree; the processor core runs at the optimal voltage and the optimal approximation degree.
Optionally, using the multiple voltage-approximation degree data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each data group specifically includes:
Using the multiple voltage-approximation degree data groups as the input of a performance predictor, and obtaining, by an approximate computing method, the performance prediction value corresponding to each data group:
$IPS_i = A v_i + \Delta IPS_i$,
where $v_i$ is the voltage in the i-th voltage-approximation degree data group; $A$ is a constant that depends on the processor core configuration and on the application executed on the processor core; and $\Delta IPS_i$ is the degree to which the approximate computing method affects performance;
Using the multiple voltage-approximation degree data groups as the input of an energy consumption predictor, and obtaining, by an approximate computing method, the energy consumption prediction value corresponding to each data group:
$Energy_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation degree data group and depends on the user's demand on voltage; $C$ is a constant that depends on the processor core configuration; $m_i$ is the approximation degree value in the i-th data group; and $D$ depends on the degree to which the approximate computing method affects energy consumption;
Using the multiple voltage-approximation degree data groups as the input of an output quality predictor, and obtaining, by an approximate computing method and a fault injection method, the output quality prediction value corresponding to each data group.
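The two closed-form predictors above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` class, the function names and every numeric value are hypothetical, and the bounds check on `beta` assumes the interval from 0 to 1 is inclusive, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One voltage-approximation degree data group (v_i, m_i). Hypothetical type."""
    voltage: float        # v_i
    approx_degree: float  # m_i


def predict_ips(c: Candidate, A: float, delta_ips: float) -> float:
    """Performance predictor: IPS_i = A * v_i + delta_IPS_i."""
    return A * c.voltage + delta_ips


def predict_energy(c: Candidate, beta: float, C: float, D: float) -> float:
    """Energy predictor: Energy_i = (beta_i v_i)^2 C + (beta_i v_i)^2 m_i D."""
    if not 0.0 <= beta <= 1.0:  # assumption: inclusive bounds
        raise ValueError("beta must lie between 0 and 1")
    scaled_sq = (beta * c.voltage) ** 2
    return scaled_sq * C + scaled_sq * c.approx_degree * D


# Illustrative constants only; real values of A, C, D depend on the core,
# the application and the approximation technique.
cand = Candidate(voltage=0.5, approx_degree=2.0)
ips = predict_ips(cand, A=4.0, delta_ips=0.25)
energy = predict_energy(cand, beta=1.0, C=2.0, D=1.0)
print(ips, energy)
```

Both terms of the energy formula share the factor $(\beta_i v_i)^2$, so the sketch computes it once; the approximation degree only scales the second term.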
Optionally, using the multiple voltage-approximation degree data groups as the input of the output quality predictor and obtaining, by an approximate computing method and a fault injection method, the output quality prediction value corresponding to each data group specifically includes:
Using the multiple voltage-approximation degree data groups as the input of the output quality predictor, and classifying the instructions in the application executed on the processor core to obtain multiple instruction classes; each instruction class includes multiple instructions with similar propagation paths;
Sampling each instruction class by an approximate computing method to obtain multiple sampled instructions;
Injecting a fault into each sampled instruction by a fault injection method to obtain sampled faulty instructions;
Calculating error values from each sampled instruction and its corresponding sampled faulty instruction; the error values include the maximum clustering error between the output of the sampled instruction and the output of the corresponding sampled faulty instruction, the maximum relative error between those outputs, and the matrix error between those outputs;
Obtaining, from the error values, the output quality prediction value corresponding to each voltage-approximation degree data group.
Optionally, injecting a fault into each sampled instruction by a fault injection method to obtain sampled faulty instructions specifically includes:
Constructing a fault injection platform; the fault injection platform integrates debugging control software, fault injection software, a hardware emulator and an emulation backplane;
Injecting a fault into each sampled instruction using the fault injection platform to obtain the sampled faulty instructions.
Optionally, constructing the objective optimization function specifically includes:
Constructing a function with the performance parameter as the target and the energy consumption parameter and output quality parameter as constraints; this function is the objective optimization function.
Optionally, taking the performance prediction values, energy consumption prediction values and output quality prediction values of all voltage-approximation degree data groups as input and solving the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation degree data group specifically includes:
Taking the performance prediction values, energy consumption prediction values and output quality prediction values of all voltage-approximation degree data groups as input, and using the simulated annealing algorithm to obtain the voltage-approximation degree data group that satisfies the optimization condition; the optimization condition is that the performance prediction value is maximal while the energy consumption prediction value is below a preset energy consumption value and the output quality prediction value is below a preset output quality value, or that the performance prediction value decreases at a preset frequency as the annealing temperature decreases;
Determining the voltage-approximation degree data group that satisfies the optimization condition as the optimal voltage-approximation degree data group.
The present invention also provides a processor core optimization system based on near-threshold computing, comprising:
A data acquisition module, configured to obtain multiple voltage-approximation degree data groups; each voltage-approximation degree data group includes a voltage and a corresponding approximation degree value;
A prediction value acquisition module, configured to use the multiple voltage-approximation degree data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each data group;
An objective function construction module, configured to construct an objective optimization function;
A solving module, configured to take the performance prediction values, energy consumption prediction values and output quality prediction values of all voltage-approximation degree data groups as input and solve the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation degree data group;
An optimal group determination module, configured to determine the voltage in the optimal voltage-approximation degree data group as the optimal voltage in the near-threshold computing state and to take the approximation degree value in the optimal data group as the optimal approximation degree; the processor core runs at the optimal voltage and the optimal approximation degree.
Optionally, the prediction value acquisition module specifically includes:
A performance prediction unit, configured to use the multiple voltage-approximation degree data groups as the input of a performance predictor and obtain, by an approximate computing method, the performance prediction value corresponding to each data group:
$IPS_i = A v_i + \Delta IPS_i$,
where $v_i$ is the voltage in the i-th voltage-approximation degree data group; $A$ is a constant that depends on the processor core configuration and on the application executed on the processor core; and $\Delta IPS_i$ is the degree to which the approximate computing method affects performance;
An energy consumption prediction unit, configured to use the multiple voltage-approximation degree data groups as the input of an energy consumption predictor and obtain, by an approximate computing method, the energy consumption prediction value corresponding to each data group:
$Energy_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation degree data group and depends on the user's demand on voltage; $C$ is a constant that depends on the processor core configuration; $m_i$ is the approximation degree value in the i-th data group; and $D$ depends on the degree to which the approximate computing method affects energy consumption;
An output quality prediction unit, configured to use the multiple voltage-approximation degree data groups as the input of an output quality predictor and obtain, by an approximate computing method and a fault injection method, the output quality prediction value corresponding to each data group.
Optionally, the output quality prediction unit specifically includes:
A classification subunit, configured to use the multiple voltage-approximation degree data groups as the input of the output quality predictor and classify the instructions in the application executed on the processor core to obtain multiple instruction classes; each instruction class includes multiple instructions with similar propagation paths;
A sampling subunit, configured to sample each instruction class by an approximate computing method to obtain multiple sampled instructions;
A fault injection subunit, configured to inject a fault into each sampled instruction by a fault injection method to obtain sampled faulty instructions;
An error calculation subunit, configured to calculate error values from each sampled instruction and its corresponding sampled faulty instruction; the error values include the maximum clustering error between the output of the sampled instruction and the output of the corresponding sampled faulty instruction, the maximum relative error between those outputs, and the matrix error between those outputs;
An output quality prediction subunit, configured to obtain, from the error values, the output quality prediction value corresponding to each voltage-approximation degree data group.
Optionally, the objective function construction module specifically includes:
A performance objective construction unit, configured to construct a function with the performance parameter as the target and the energy consumption parameter and output quality parameter as constraints; this function is the objective optimization function;
The solving module specifically includes:
An optimization unit, configured to take the performance prediction values, energy consumption prediction values and output quality prediction values of all voltage-approximation degree data groups as input and use the simulated annealing algorithm to obtain the voltage-approximation degree data group that satisfies the optimization condition; the optimization condition is that the performance prediction value is maximal while the energy consumption prediction value is below a preset energy consumption value and the output quality prediction value is below a preset output quality value, or that the performance prediction value decreases at a preset frequency as the annealing temperature decreases;
A determination unit, configured to determine the voltage-approximation degree data group that satisfies the optimization condition as the optimal voltage-approximation degree data group.
Compared with the prior art, the beneficial effects of the present invention are:
The invention proposes a processor core optimization method and system based on near-threshold computing. The method comprises: obtaining multiple voltage-approximation degree data groups, each including a voltage and a corresponding approximation degree value; using the data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each data group; taking the prediction values of all data groups as input and solving the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation degree data group; and determining the voltage in the optimal data group as the optimal voltage in the near-threshold computing state and the approximation degree value in the optimal data group as the optimal approximation degree. The present invention considers voltage and approximation degree simultaneously, considers energy consumption, performance and output quality jointly, and selects the optimal voltage-approximation degree data group by simulated annealing. It thereby achieves multi-dimensional optimization and makes the processor core run at the optimal voltage and optimal approximation degree in the near-threshold computing state, with high reliability.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the processor core optimization method based on near-threshold computing according to Embodiment 1 of the present invention;
Fig. 2 is a block diagram of the processor core optimization method based on near-threshold computing according to Embodiment 2 of the present invention;
Fig. 3 is a block diagram of the output quality predictor of Embodiment 2 of the present invention;
Fig. 4 is a system framework diagram of the fault injection platform of Embodiment 2 of the present invention;
Fig. 5 is a structural schematic diagram of the processor core optimization system based on near-threshold computing according to Embodiment 3 of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
To make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flow chart of a processor core optimization method based on near-threshold computing according to an embodiment of the present invention.
Referring to Fig. 1, the processor core optimization method based on near-threshold computing of this embodiment comprises:
Step S1: obtaining multiple voltage-approximation degree data groups.
Each voltage-approximation degree data group includes a voltage and a corresponding approximation degree value. Because of area overhead limits, the scaling range of the voltage level is limited, and voltage levels do not vary continuously; the selectable voltage values are therefore fixed. In a system that uses approximate computing, different approximation degrees can be chosen: for example, a neural network approximation technique changes the approximation degree by changing the network topology, and a mantissa truncation technique changes it by changing the number of bits truncated. The selectable range of approximation degrees is also limited: the selectable network topologies in neural network approximation and the number of bits that can be truncated in mantissa truncation are both fixed. The combinations of voltage and approximation degree value are therefore also limited; that is, the number of voltage-approximation degree data groups is finite.
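Because both the voltage levels and the approximation degrees come from fixed, discrete sets, the candidate data groups can be enumerated as a Cartesian product. The sketch below illustrates this; the specific voltage levels and truncated-bit counts are hypothetical values chosen for the example, not figures from the patent.

```python
from itertools import product

# Hypothetical discrete voltage levels (volts) available to the core.
voltage_levels = [0.45, 0.50, 0.55, 0.60]

# Hypothetical approximation degrees, here the number of mantissa bits truncated.
truncated_bits = [0, 4, 8]

# Every (voltage, approximation degree) pair is one candidate data group.
candidates = list(product(voltage_levels, truncated_bits))

print(len(candidates))  # 4 voltage levels x 3 degrees
print(candidates[0], candidates[-1])
```

The finite size of this set is what makes a search heuristic such as simulated annealing applicable without any continuous relaxation.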
Step S2: using the multiple voltage-approximation degree data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each data group.
Step S2 specifically includes:
1) Using the multiple voltage-approximation degree data groups as the input of a performance predictor, and obtaining, by an approximate computing method, the performance prediction value corresponding to each data group:
$IPS_i = A v_i + \Delta IPS_i$,
where $v_i$ is the voltage in the i-th voltage-approximation degree data group; $A$ is a constant that depends on the processor core configuration and on the application executed on the processor core; and $\Delta IPS_i$ is the degree to which the approximate computing method affects performance.
2) Using the multiple voltage-approximation degree data groups as the input of an energy consumption predictor, and obtaining, by an approximate computing method, the energy consumption prediction value corresponding to each data group:
$Energy_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation degree data group and depends on the user's demand on voltage; $C$ is a constant that depends on the processor core configuration; $m_i$ is the approximation degree value in the i-th data group; and $D$ depends on the degree to which the approximate computing method affects energy consumption.
3) Using the multiple voltage-approximation degree data groups as the input of an output quality predictor, and obtaining, by an approximate computing method and a fault injection method, the output quality prediction value corresponding to each data group. This specifically includes:
31) Using the multiple voltage-approximation degree data groups as the input of the output quality predictor, and classifying the instructions in the application executed on the processor core to obtain multiple instruction classes; each instruction class includes multiple instructions with similar propagation paths.
32) Sampling each instruction class by an approximate computing method to obtain multiple sampled instructions.
33) Injecting a fault into each sampled instruction by a fault injection method to obtain sampled faulty instructions. Specifically, a fault injection platform is first constructed, integrating debugging control software, fault injection software, a hardware emulator and an emulation backplane; the platform is then used to inject a fault into each sampled instruction to obtain the sampled faulty instructions.
34) Calculating error values from each sampled instruction and its corresponding sampled faulty instruction; the error values include the maximum clustering error between the output of the sampled instruction and the output of the corresponding sampled faulty instruction, the maximum relative error between those outputs, and the matrix error between those outputs.
35) Obtaining, from the error values, the output quality prediction value corresponding to each voltage-approximation degree data group.
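Sub-steps 31) to 34) can be illustrated with a purely software stand-in. The patent performs fault injection on a hardware platform; the sketch below instead assumes a single-bit-flip fault model on IEEE-754 doubles, groups instructions by a hypothetical `path` label, and samples the first instruction of each class. Only the maximum relative error is computed, because the text does not define the clustering and matrix errors. All names and data are illustrative.

```python
import struct
from collections import defaultdict


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE-754 double encoding of value (assumed fault model)."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (faulty,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return faulty


def sample_per_class(instructions, per_class=1):
    """Group instructions by propagation-path label, keep the first few of each class."""
    classes = defaultdict(list)
    for inst in instructions:
        classes[inst["path"]].append(inst)
    return [inst for group in classes.values() for inst in group[:per_class]]


def max_relative_error(golden, faulty):
    """Largest |faulty - golden| / |golden| over all non-zero golden outputs."""
    return max(abs(f - g) / abs(g) for g, f in zip(golden, faulty) if g != 0.0)


# Hypothetical instruction trace: a path label plus the fault-free result.
instructions = [
    {"path": "alu->store", "result": 1.0},
    {"path": "alu->store", "result": 3.0},
    {"path": "load->branch", "result": 2.0},
]

sampled = sample_per_class(instructions)
golden = [inst["result"] for inst in sampled]
faulty = [flip_bit(g, 51) for g in golden]  # bit 51 is the top mantissa bit
max_rel = max_relative_error(golden, faulty)
print(len(sampled), faulty, max_rel)
```

Sampling one representative per class keeps the number of injections small, which matches the stated purpose of grouping instructions with similar propagation paths.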
Step S3: construct the objective optimization function.
The objective optimization function can be any of the following: a function with the performance parameter as the target and the energy consumption parameter and the output quality parameter as constraints; a function with the performance parameter as the target and the energy consumption parameter as the constraint; a function with the energy consumption parameter as the target and the performance parameter and the output quality parameter as constraints; or a function with the energy consumption parameter as the target and the performance parameter as the constraint. This embodiment uses the function with the performance parameter as the target and the energy consumption parameter and the output quality parameter as constraints.
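For illustration, the performance-targeted variant selected here can be written as a constrained optimization problem. The notation is an assumption made for this sketch: the index i runs over the candidate voltage-approximation-degree data groups, $OQLOSS_i$ is the predicted output quality loss, and $Energy_{budget}$ and $OQLOSS_{threshold}$ are the preset energy and output quality limits.

```latex
\max_{i} \; IPS_i
\quad \text{subject to} \quad
Energy_i < Energy_{budget},
\qquad
OQLOSS_i < OQLOSS_{threshold}
```

The energy-targeted variants swap the roles: $Energy_i$ is minimized while $IPS_i$ (and, where applicable, $OQLOSS_i$) is held within a preset bound.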
Step S4: take the performance prediction values, energy consumption prediction values and output quality prediction values corresponding to all voltage-approximation-degree data groups as input, and solve the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation-degree data group.
When the objective optimization function is the function with the performance parameter as the target and the energy consumption parameter and the output quality parameter as constraints, the optimal voltage-approximation-degree data group is obtained with the simulated annealing algorithm as follows:
1) The performance prediction values, energy consumption prediction values and output quality prediction values corresponding to all voltage-approximation-degree data groups are taken as input, and the simulated annealing algorithm is used to obtain the voltage-approximation-degree data group that satisfies the optimization condition. The optimization condition is that the performance prediction value is the largest, the energy consumption prediction value is less than the preset energy consumption value and the output quality prediction value is less than the preset output quality value, or that the performance prediction value becomes smaller with a preset probability as the annealing temperature decreases.
2) The voltage-approximation-degree data group that satisfies the optimization condition is determined as the optimal voltage-approximation-degree data group.
Step S5: the voltage in the optimal voltage-approximation-degree data group is determined as the optimal voltage in the near-threshold computing state, and the approximation degree value in the optimal voltage-approximation-degree data group is taken as the optimal approximation degree; the processor core operates at the optimal voltage and the optimal approximation degree.
The processor core optimization method based on near-threshold computing of this embodiment can efficiently find the optimal voltage-approximation-degree data group and thereby obtain the best three-dimensional optimization of performance, energy and output quality; the method is highly reliable. The prediction model in the method consists of three predictors, which predict performance, output quality and energy. The output quality predictor predicts output quality by simulating static faults in the near-threshold computing system through a fault injection method based on hardware/software co-design. A simulated annealing optimization algorithm then finds the optimal voltage level and approximation degree configuration of the processor core in the near-threshold system. This configuration can maximize performance under a given energy/output quality requirement, or minimize energy under a given performance/output quality requirement, which further improves reliability.
Embodiment 2:
To obtain the optimal voltage and approximation degree value, so that the processor core achieves the best three-dimensional optimization of energy consumption, performance and output quality while running, this embodiment first establishes a multi-dimensional optimization model. The model includes three predictors: a performance predictor, an energy consumption predictor and an output quality predictor. A simulated annealing algorithm is then used to obtain the optimal voltage level and approximation degree combination and thereby the best multi-dimensional optimization. As shown in Fig. 2, the inputs of the multi-dimensional optimization model are the output quality threshold, the energy budget, the system configuration and the approximable code regions. The three predictors in the model predict the output quality, energy consumption and performance of the near-threshold computing system at a given voltage level and approximation degree (performance is expressed as IPS, the number of instructions executed per unit time). The outputs of the three predictors are finally passed to the simulated annealing algorithm, which yields the optimal voltage level and approximation degree combination and thus the multi-dimensional optimization.
1. Performance predictor
The performance predictor describes performance in terms of IPS and is given by:
$IPS_i = A v_i + \Delta IPS_i$,
where $IPS_i$ is the performance prediction value; $v_i$ is the voltage in the i-th voltage-approximation-degree data group; A is a constant that depends on the configuration of the processor core and on the application program executed on the processor core; and $\Delta IPS_i$ is the degree to which the approximate computing method affects performance. The larger A is, the more sensitive IPS is to changes in voltage.
2. Energy consumption predictor
The energy consumption predictor is given by:
$Energy_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $Energy_i$ is the energy consumption prediction value; $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation-degree data group, which depends on the user's voltage requirement; C is a constant that depends on the configuration of the processor core; $m_i$ is the approximation degree value in the i-th voltage-approximation-degree data group; and D depends on the degree to which the approximate computing method affects energy consumption.
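The two predictor formulas can be evaluated directly once their constants are known. The following is a minimal sketch, assuming the constants A, $\Delta IPS_i$, $\beta_i$, C and D have already been fitted for a given core and application; the function and parameter names are hypothetical and chosen only for this illustration.

```python
def predict_ips(voltage: float, a: float, delta_ips: float) -> float:
    """Performance predictor: IPS_i = A * v_i + delta_IPS_i."""
    return a * voltage + delta_ips


def predict_energy(voltage: float, beta: float, degree: float,
                   c: float, d: float) -> float:
    """Energy predictor: Energy_i = (beta_i*v_i)^2 * C + (beta_i*v_i)^2 * m_i * D."""
    scaled_sq = (beta * voltage) ** 2  # shared factor (beta_i * v_i)^2
    return scaled_sq * c + scaled_sq * degree * d


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the patent.
    for v in (0.4, 0.5, 0.6):
        for m in (0, 1, 2):
            ips = predict_ips(v, a=2.0, delta_ips=0.1 * m)
            energy = predict_energy(v, beta=0.9, degree=m, c=1.0, d=0.2)
            print(f"v={v:.1f} m={m} IPS={ips:.3f} Energy={energy:.4f}")
```

Because both formulas are closed-form, every candidate data group can be scored cheaply, which is what lets the later search evaluate many combinations.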
3. Output quality predictor
The starting point for the design of the output quality predictor is this: if the data controlled by a set of instructions have "similar" propagation paths, then errors in those data will have a similar effect on the output quality of the program. Following this idea, the instructions are divided into a series of groups, with instructions whose data propagation has "similar" attributes placed in the same group.
As shown in Fig. 3, the first class of instructions does not affect output quality regardless of the voltage level and approximation degree used. It is divided into four instruction groups: ① NOP instructions, which are empty statements that perform no operation and have no effect on the output result; ② performance-enhancing instructions, for which an error may make a prefetch ineffective but does not change the program semantics; ③ predicated-false instructions, whose predicate evaluates to false, so the result of the instruction is discarded without affecting program execution; ④ dynamically dead instructions, whose results are never used by the system and therefore do not affect the output result.
The second class of instructions affects output quality to different degrees depending on the degree of approximate computing applied. Typical groups include instructions that store data related to filters and pixels; instructions involved in the convolution calculation, which are approximable; and instructions that load from the image source and store to the image destination, which are also approximable. Based on this grouping, representative sampled instructions are then selected from each group using the approximate computing method.
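The grouping and sampling step can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how "similar" propagation paths are measured, so the `signature` callback is hypothetical, and plain uniform random sampling stands in for the sampling procedure of the method.

```python
import random
from collections import defaultdict


def group_by_propagation(instructions, signature):
    """Group instructions whose propagation signature is identical.

    `signature` is a hypothetical callback that maps an instruction to a
    hashable description of how its data propagates to the output.
    """
    groups = defaultdict(list)
    for ins in instructions:
        groups[signature(ins)].append(ins)
    return dict(groups)


def sample_representatives(groups, per_group=1, seed=0):
    """Pick up to `per_group` instructions from each group (uniform sampling)."""
    rng = random.Random(seed)
    return {key: rng.sample(members, min(per_group, len(members)))
            for key, members in groups.items()}


if __name__ == "__main__":
    # Toy trace: (opcode, propagation tag); the tags are invented for the demo.
    trace = [("nop", "no-effect"), ("prefetch", "no-effect"),
             ("fmul", "convolution"), ("fadd", "convolution"),
             ("ld", "pixel-copy"), ("st", "pixel-copy")]
    groups = group_by_propagation(trace, signature=lambda ins: ins[1])
    print(sample_representatives(groups, per_group=1))
```

Injecting faults only into the representatives, rather than into every instruction, is what keeps the later fault injection campaign tractable.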
Because of area overhead limits, the scaling range of the voltage level is limited, and the voltage level changes in discrete steps, so the set of selectable voltage levels is fixed. In a system that uses approximate computing, the choice of approximation degree is also limited. The number of voltage and approximation degree combinations is therefore finite as well. Because the number of data groups is finite, the effect of each sampled instruction on output quality can be evaluated through fault injection and propagation analysis at every possible voltage and approximation degree. To simulate the static faults at a given voltage, the fault injection platform injects faults into the representative instructions selected from each group. After fault injection, analysing the propagation of the error yields the effect on output quality.
To quantify the effect on output quality, three output quality metrics are proposed for different types of application: ① max-abs-diff, the maximum absolute difference between the fully correct output and the faulty output; ② max-rel-err, the maximum relative error between the correct output and the faulty output; ③ rel-l2-norm, used to compare the error between two matrices directly. As shown in Fig. 3, the effect of each sampled instruction on the output result after fault injection is recorded, and these effects are quantified with the three metrics above. Specifically, the three quality metrics are abstracted into "output quality buckets" (Quality Bucket) that quantify output quality; the platform then observes which "output quality bucket" the output quality effect of an instruction falls into after fault injection. As shown in Table 1, fault injection yields the "output quality bucket" of each instruction group at each voltage level and approximation degree. For a given program, fault injection and propagation analysis of its instructions give the distribution of all instructions over Table 1, and the corresponding output quality can be predicted from this distribution.
Table 1
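The three metrics and the bucketing step can be sketched in a few lines. The exact definitions are assumptions: the text names the metrics without giving formulas, so rel-l2-norm is taken here as the norm of the difference divided by the norm of the correct matrix, and the bucket edges are hypothetical thresholds.

```python
import bisect
import math


def max_abs_diff(golden, faulty):
    """Largest absolute element-wise difference between two output vectors."""
    return max(abs(g - f) for g, f in zip(golden, faulty))


def max_rel_err(golden, faulty, eps=1e-12):
    """Largest relative error; eps guards against division by zero."""
    return max(abs(g - f) / max(abs(g), eps) for g, f in zip(golden, faulty))


def rel_l2_norm(golden, faulty):
    """Assumed definition: ||golden - faulty||_2 / ||golden||_2 over matrices."""
    diff_sq = sum((g - f) ** 2
                  for row_g, row_f in zip(golden, faulty)
                  for g, f in zip(row_g, row_f))
    ref_sq = sum(g * g for row in golden for g in row)
    if ref_sq == 0.0:
        return 0.0 if diff_sq == 0.0 else math.inf
    return math.sqrt(diff_sq / ref_sq)


def quality_bucket(error, edges):
    """Index of the bucket an error falls into; `edges` must be sorted ascending."""
    return bisect.bisect_right(edges, error)


if __name__ == "__main__":
    golden, faulty = [1.0, 2.0, 4.0], [1.0, 2.5, 3.0]
    edges = [0.01, 0.05, 0.10]  # hypothetical bucket boundaries
    err = max_rel_err(golden, faulty)
    print(max_abs_diff(golden, faulty), err, quality_bucket(err, edges))
```

Mapping each continuous error to a small bucket index lets results for different instruction groups, voltage levels and approximation degrees be tabulated in a form like Table 1.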
4. Fault injection platform
The fault injection platform in this embodiment is designed on the basis of backplane technology; Fig. 4 is the system framework diagram of the fault injection platform of Embodiment 2 of the present invention. Referring to Fig. 4, debugging control software, fault injection software and a hardware simulator are integrated on the fault injection platform.
The debugging control software compiles and configures the software program that runs on the target processor. It can take input and produce output through a graphical interface or a command line, and it communicates with the underlying hardware over a network or serial port. Through the simulation backplane, it sends debugging control commands to the target processor in the hardware simulator and receives the system state returned by the hardware processor for debugging.
The fault injection software receives parameters such as the fault injection time and location through the user interface, and receives the hardware signal list sent by the hardware through the simulation backplane. Using a depth-first traversal, it identifies and records all signals level by level and builds a hierarchical resource pool that is used to locate the injected signal directly during fault injection. It then generates a fault library for the hardware signals, sends the generated faults and simulation control information to the target processor through the simulation backplane, and receives the post-injection state of the target processor through the simulation backplane.
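The construction of the hierarchical resource pool can be illustrated with a short depth-first traversal. This is a sketch under assumptions: the format of the hardware signal list is not given in the text, so a nested dictionary (modules as dictionaries, signals as bit widths) is used as a hypothetical stand-in.

```python
def build_resource_pool(hierarchy):
    """Depth-first traversal that records every signal with its level.

    `hierarchy` is a hypothetical nested dict: keys are module or signal
    names, dict values are sub-modules and int values are signal widths.
    Returns {full_path: (level, width)} so a signal can be located directly.
    """
    pool = {}
    stack = [("", 1, hierarchy)]  # (path prefix, level, subtree)
    while stack:
        prefix, level, node = stack.pop()
        for name, child in node.items():
            path = f"{prefix}/{name}" if prefix else name
            if isinstance(child, dict):
                stack.append((path, level + 1, child))  # descend into module
            else:
                pool[path] = (level, child)  # leaf: record the signal
    return pool


if __name__ == "__main__":
    # Invented example hierarchy; names do not come from the patent.
    signals = {"clk": 1,
               "cpu": {"iu": {"pc": 32, "ir": 32},
                       "dsu": {"trace_en": 1}}}
    for path, (level, width) in sorted(build_resource_pool(signals).items()):
        print(f"level {level}  {path}  [{width} bit]")
```

Keying the pool by full hierarchical path means an injection request that names a signal can be resolved with a single lookup instead of a new search of the design.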
The fault-tolerant processor prototype runs in the hardware simulator, which uses LEON2 as the CPU core. The main technical features of LEON2 are the SPARC V8 architecture, an internal AMBA bus structure, a fault-tolerant design and a VHDL coding style. The processor unit of LEON2 mainly comprises a five-stage pipelined integer unit (5-stage IU), a floating-point unit (FPU) for floating-point operations and a coprocessor unit (CP). The 5-stage IU has separate data cache (Dcache) and instruction cache (Icache), as well as a memory management unit (MMU). The on-chip peripherals of LEON2 include a Peripheral Component Interconnect (PCI) bus, an Ethernet interface (net), dynamic random access memory (DRAM) and a debug support unit (DSU). The DSU can place the processor in debug mode, through which all registers and caches of the processor can be read and written. The DSU also contains a trace buffer that stores the executed instructions and the data transferred on the AHB. Software commands are received and the execution state is sent through this internal debug unit with its trace buffer. The simulation backplane virtualizes hardware devices such as network cards and serial ports and sends control commands and fault injection information to the hardware DSU. According to the information exchange needs of the software and hardware, the information returned by the DSU is distributed by content to different destinations: the debugging information observed by the debugger is sent to the debugging software, and the information needed for fault injection, such as signal states, is sent to the fault injection software. In this way the backplane handles the transmission and scheduling of communication data such as control commands and returned results.
The simulation backplane is implemented by mixed programming in VHDL and C. Using the foreign language interface of the hardware description language, it provides association and mapping between the hardware simulator and the software system, and seamless information exchange between data and control modules. The foreign language interface allows stimuli to be designed in C and simulation verification tasks to be executed in the hardware simulator. The hardware interface part uses the signal monitoring and immediate signal state reading functions of the hardware description language's external interface to extract the signals of interest in a hardware module, and implements fault injection by forcing the logic value of a signal. The logic value of the signal changes immediately after injection and does not affect other signals in the circuit that are logically related to it, which simulates the effect of a single event on the circuit.
Through co-simulation based on backplane technology, the fault injection platform in this embodiment is completely independent of a physical prototype, so designers can find and correct errors as early as possible in the design phase, which reduces development cost. The software part on the simulation backplane corresponds to the sensitive signal instances in the hardware description language module. When the logic circuit triggers a sensitive signal of the module, the simulator calls the program in the corresponding dynamic link library and seamlessly maps the signal defined in the hardware description language into the high-level language environment. Any processing of the signal's logic value can then be implemented by a software algorithm. A high-level language is easier to use than a hardware language or a scripting language for both flow control and function calls, and it is no longer restricted by the hardware language or the simulator's simulation commands, so more complex fault models and algorithms can be supported; high-level languages are also highly portable, which makes it easier to extend functionality. In addition, the software program is not synthesized into the chip at download time, so no extra logic is added and intrusion into and influence on the hardware module are avoided. This both protects the integrity of the target module and makes the test results more realistic.
5. Optimization algorithm
The problem to be optimized is as follows: a group of processor cores PE ($PE_{1 \ldots N}$) can operate at M different voltage levels ($v_{1 \ldots M}$) and K different approximation degrees ($m_{1 \ldots K}$); the best combination is selected from the M × K voltage and approximation degree combinations to obtain the best three-way trade-off among performance, power consumption and output quality. Table 2 gives four different optimization strategies. In case 1, the goal is to maximize performance: P-E+R denotes that performance is the optimization target and energy and output quality are the constraints, and P-E denotes that performance is the optimization target and energy is the constraint. In case 2, the goal is to minimize energy: E-P+R denotes that energy is the objective function and performance and output quality are the constraints, and E-P denotes that energy is the objective function and performance is the constraint, without considering output quality. This embodiment uses the index EEPI to evaluate the trade-off among energy, performance and output quality. EEPI is defined as:
$EEPI = Error \times Energy / IPS$,
where Error is the output quality loss of the system, Energy is the energy consumption of the system and IPS is the performance of the system. The lower the EEPI value, the better the combined optimization of performance, energy consumption and output quality.
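As a small worked example, EEPI can be used to rank candidate configurations. The sketch reads the formula as the product of Error and Energy divided by IPS, which is an assumption about the operator between the first two terms; the function name and the sample numbers are hypothetical.

```python
def eepi(error: float, energy: float, ips: float) -> float:
    """EEPI = Error * Energy / IPS; lower values mean a better trade-off."""
    if ips <= 0:
        raise ValueError("IPS must be positive")
    return error * energy / ips


if __name__ == "__main__":
    # (label, output quality loss, energy, IPS): illustrative values only.
    candidates = [("low voltage, high approximation", 0.08, 30.0, 800.0),
                  ("mid voltage, mid approximation", 0.03, 45.0, 1000.0),
                  ("high voltage, no approximation", 0.01, 90.0, 1200.0)]
    for label, err, energy, ips in sorted(candidates,
                                          key=lambda c: eepi(*c[1:])):
        print(f"{eepi(err, energy, ips):.6f}  {label}")
```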
Table 2
P-E+R is selected as the optimization scheme below, and the scheme is solved with the simulated annealing (SA) algorithm. Simulated annealing is a stochastic optimization algorithm based on a Monte Carlo iterative solution strategy; it solves optimization problems by imitating the annealing of solid matter in physics. As the temperature falls, SA can find the global optimum of the objective function. Its greatest advantage is that it can escape local optima and eventually approach the global optimum. Optimization with simulated annealing proceeds as follows: the temperature and the voltage and approximation degree combination of the processor core are initialized first, and the performance predictor gives the initial performance. The voltage and approximation degree combination is then changed repeatedly. If the energy consumption and output quality meet the requirements and the performance increases, the combination and performance are updated; if the performance decreases, the combination and performance are still updated with a certain probability that depends on the falling annealing temperature. The optimal voltage and approximation degree combination is found through continued iteration. The code for obtaining the optimal solution with the simulated annealing algorithm is as follows:
The above algorithm yields the optimal voltage level and approximation degree combination as the optimal solution. Specifically, the PEs are initialized with a series of configurations $C = \{c_1, c_2, c_3, \ldots, c_n\}$, where $c_b$, corresponding to the b-th PE, is a voltage level and approximation degree combination. The algorithm first explores the entire solution space through the annealing move ANNEAL_MOVE() while maximizing IPS. It then checks whether $IPS_{new}$ is greater than IPS, whether $Energy_{new}$ is lower than $Energy_{budget}$ (the energy budget) and whether $OQLOSS_{new}$ is lower than $OQLOSS_{threshold}$ (the output quality threshold). If these conditions are not all satisfied, $IPS_{new}$ becomes smaller with a certain probability as the annealing temperature decreases. If a combination that satisfies the conditions exists, it is determined as the optimal voltage level and approximation degree combination. Operating the processor core at the optimal voltage level and approximation degree combination maximizes performance while still meeting the user's requirements on output quality and performance. The algorithm above is a concrete example with performance as the target and energy and output quality as constraints (P-E+R); the other three optimization strategies (E-P+R, E-P or P-E) can likewise be combined with simulated annealing as the optimization scheme.
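A minimal sketch of the P-E+R search is given below for illustration. It is not the patent's own listing: it optimizes a single PE rather than a set of configurations, the move function is a simple stand-in for ANNEAL_MOVE(), the standard Metropolis acceptance rule is assumed for moves that reduce IPS, and all names and default parameters (`predict`, `cooling`, `moves_per_temp` and so on) are hypothetical.

```python
import math
import random


def simulated_annealing(voltages, degrees, predict, energy_budget, oq_threshold,
                        start=(0, 0), t_start=1.0, t_end=1e-3, cooling=0.95,
                        moves_per_temp=20, seed=0):
    """Maximize IPS subject to energy and output-quality-loss limits.

    predict(voltage, degree) must return (ips, energy, oq_loss).
    Returns (voltage, degree, ips) of the best feasible combination seen,
    or None if no feasible combination was visited.
    """
    rng = random.Random(seed)

    def evaluate(idx):
        return predict(voltages[idx[0]], degrees[idx[1]])

    def feasible(energy, oq_loss):
        return energy < energy_budget and oq_loss < oq_threshold

    current = start
    ips_cur, e_cur, q_cur = evaluate(current)
    best = (current, ips_cur) if feasible(e_cur, q_cur) else None

    temp = t_start
    while temp > t_end:
        for _ in range(moves_per_temp):
            # Stand-in for ANNEAL_MOVE(): redraw the voltage or the degree index.
            vi, mi = current
            if rng.random() < 0.5:
                vi = rng.randrange(len(voltages))
            else:
                mi = rng.randrange(len(degrees))
            ips_new, e_new, q_new = evaluate((vi, mi))
            if not feasible(e_new, q_new):
                continue  # constraint violated: reject the move
            # Relative IPS change keeps the acceptance test scale-free.
            delta = (ips_new - ips_cur) / max(abs(ips_cur), 1e-12)
            # Accept improvements always, degradations with Metropolis probability.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, ips_cur = (vi, mi), ips_new
            if best is None or ips_new > best[1]:
                best = ((vi, mi), ips_new)
        temp *= cooling  # geometric cooling schedule
    if best is None:
        return None
    (vi, mi), ips = best
    return voltages[vi], degrees[mi], ips


if __name__ == "__main__":
    def toy_predict(v, m):
        # Toy predictors with invented constants, for demonstration only.
        return 2.0 * v + 0.5 * m, v * v * (1.0 + 0.2 * m), 0.1 * m

    print(simulated_annealing([0.4, 0.5, 0.6], [0, 1, 2], toy_predict,
                              energy_budget=0.4, oq_threshold=0.15))
```

Tracking the best feasible combination separately from the current one means that a late, probabilistically accepted downhill move cannot lose a good solution found earlier. An energy-targeted strategy such as E-P+R would swap the objective and the constraint checks.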
The processor core optimization method based on near-threshold computing in this embodiment designs a multi-dimensional optimization model for an NTC system that uses approximate computing. The model includes three predictors: a performance predictor, an energy consumption predictor and an output quality predictor. The optimal voltage level and approximation degree combination is then obtained with the simulated annealing optimization algorithm. The method selects the voltage level and approximation degree automatically, ensures that the processor core runs with the best energy consumption, performance and output quality, achieves multi-dimensional optimization and improves the reliability of the processor core optimization method.
The output quality predictor designed here places instructions whose errors have similar propagation paths in one group, then selects representative sampled instructions from these groups and uses fault injection to analyse the output quality of these instructions at a given voltage level and approximation degree. The fault injection platform is designed around a simulation backplane and is implemented with mixed-language programming in VHDL and C. The output quality predictor further improves the reliability of the method.
In addition, by analysing the energy efficiency and fault tolerance characteristics of different applications at given voltage levels and approximation degrees, the effect of the above optimization method on these applications in an NTC system was evaluated. The evaluation found that the optimization model is aware of application characteristics and adjusts the energy and IPS allocation ratio of each program to reach the best multi-dimensional trade-off for the whole system, which demonstrates the effectiveness of the above optimization method.
Embodiment 3:
The present invention also provides a processor core optimization system based on near-threshold computing. Fig. 5 is a structural schematic diagram of the processor core optimization system based on near-threshold computing of Embodiment 3 of the present invention.
Referring to Fig. 5, the processor core optimization system based on near-threshold computing of this embodiment includes:
A data acquisition module 501, configured to obtain multiple voltage-approximation-degree data groups; each voltage-approximation-degree data group includes a voltage and a corresponding approximation degree value.
A prediction value acquisition module 502, configured to take the multiple voltage-approximation-degree data groups as the input of the processor core and obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each voltage-approximation-degree data group.
An objective function construction module 503, configured to construct the objective optimization function.
A solving module 504, configured to take the performance prediction values, energy consumption prediction values and output quality prediction values corresponding to all voltage-approximation-degree data groups as input and solve the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation-degree data group.
An optimal group determination module 505, configured to determine the voltage in the optimal voltage-approximation-degree data group as the optimal voltage in the near-threshold computing state and to take the approximation degree value in the optimal voltage-approximation-degree data group as the optimal approximation degree; the processor core operates at the optimal voltage and the optimal approximation degree.
As an optional implementation, the prediction value acquisition module 502 specifically includes:
A performance prediction unit, configured to take the multiple voltage-approximation-degree data groups as the input of the performance predictor and obtain, using the approximate computing method, the performance prediction value corresponding to each voltage-approximation-degree data group:
$IPS_i = A v_i + \Delta IPS_i$,
where $v_i$ is the voltage in the i-th voltage-approximation-degree data group; A is a constant that depends on the configuration of the processor core and on the application program executed on the processor core; and $\Delta IPS_i$ is the degree to which the approximate computing method affects performance.
An energy consumption prediction unit, configured to take the multiple voltage-approximation-degree data groups as the input of the energy consumption predictor and obtain, using the approximate computing method, the energy consumption prediction value corresponding to each voltage-approximation-degree data group:
$Energy_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation-degree data group, which depends on the user's voltage requirement; C is a constant that depends on the configuration of the processor core; $m_i$ is the approximation degree value in the i-th voltage-approximation-degree data group; and D depends on the degree to which the approximate computing method affects energy consumption.
An output quality prediction unit, configured to take the multiple voltage-approximation-degree data groups as the input of the output quality predictor and obtain, using the approximate computing method and the fault injection method, the output quality prediction value corresponding to each voltage-approximation-degree data group.
The output quality prediction unit specifically includes:
A classification subunit, configured to take the multiple voltage-approximation-degree data groups as the input of the output quality predictor and classify the instructions in the application program executed on the processor core to obtain multiple instruction categories; each instruction category includes multiple instructions with similar propagation paths.
A sampling subunit, configured to sample each instruction category using the approximate computing method to obtain multiple sampled instructions.
A fault injection subunit, configured to inject a fault into each sampled instruction using the fault injection method to obtain sampled faulty instructions.
An error calculation subunit, configured to calculate error values from each sampled instruction and its corresponding sampled faulty instruction; the error values include the maximum absolute difference between the sampled instruction output and the corresponding sampled faulty instruction output, the maximum relative error between the two outputs, and the matrix error between the two outputs.
An output quality prediction subunit, configured to obtain, from the error values, the output quality prediction value corresponding to each voltage-approximation-degree data group.
As an optional implementation, the objective function construction module 503 specifically includes:
A performance objective construction unit, configured to construct a function with the performance parameter as the target and the energy consumption parameter and the output quality parameter as constraints; this function is the objective optimization function.
The solving module 504 specifically includes:
An optimization unit, configured to take the performance prediction values, energy consumption prediction values and output quality prediction values corresponding to all voltage-approximation-degree data groups as input and use the simulated annealing algorithm to obtain the voltage-approximation-degree data group that satisfies the optimization condition; the optimization condition is that the performance prediction value is the largest, the energy consumption prediction value is less than the preset energy consumption value and the output quality prediction value is less than the preset output quality value, or that the performance prediction value becomes smaller with a preset probability as the annealing temperature decreases;
A determination unit, configured to determine the voltage-approximation-degree data group that satisfies the optimization condition as the optimal voltage-approximation-degree data group.
The processor core optimization system based on near-threshold computing of this embodiment selects the voltage level and approximation degree automatically, ensures that the processor core runs with the best energy consumption, performance and output quality, achieves multi-dimensional optimization and improves the reliability of the processor core optimization method.
Because the system disclosed in Embodiment 3 corresponds to the method disclosed in Embodiment 1 or 2, its description is relatively brief; for relevant details, refer to the description of the method.
Specific examples are used herein to explain the principle and implementation of the present invention. The description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. A person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementation and the scope of application. In summary, the content of this specification shall not be construed as limiting the present invention.
Claims (10)
1. A processor core optimization method based on near-threshold computing, characterized by comprising:
obtaining multiple voltage-approximation-degree data groups, each voltage-approximation-degree data group including a voltage and a corresponding approximation degree value;
taking the multiple voltage-approximation-degree data groups as the input of the processor core to obtain the performance prediction value, energy consumption prediction value and output quality prediction value corresponding to each voltage-approximation-degree data group;
constructing an objective optimization function;
taking the performance prediction values, energy consumption prediction values and output quality prediction values corresponding to all voltage-approximation-degree data groups as input, and solving the objective optimization function with a simulated annealing algorithm to obtain the optimal voltage-approximation-degree data group;
determining the voltage in the optimal voltage-approximation-degree data group as the optimal voltage in the near-threshold computing state, and taking the approximation degree value in the optimal voltage-approximation-degree data group as the optimal approximation degree; the processor core operates at the optimal voltage and the optimal approximation degree.
2. The processor core optimization method based on near-threshold computing according to claim 1, characterized in that taking the multiple voltage-approximation-degree data groups as input to the processor core and obtaining the performance prediction value, energy consumption prediction value, and output quality prediction value corresponding to each voltage-approximation-degree data group specifically comprises:
taking the multiple voltage-approximation-degree data groups as input to a performance predictor, and using an approximate computing method to obtain the performance prediction value corresponding to each voltage-approximation-degree data group:
$\mathrm{IPS}_i = A v_i + \Delta \mathrm{IPS}_i$,
where $v_i$ is the voltage in the i-th voltage-approximation-degree data group; $A$ is a constant that depends on the configuration of the processor core and on the application program executed on the processor core; and $\Delta \mathrm{IPS}_i$ is the degree to which the approximate computing method affects performance;
taking the multiple voltage-approximation-degree data groups as input to an energy consumption predictor, and using an approximate computing method to obtain the energy consumption prediction value corresponding to each voltage-approximation-degree data group:
$\mathrm{Energy}_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation-degree data group and depends on the user's requirement for the voltage; $C$ is a constant that depends on the configuration of the processor core; $m_i$ is the approximation-degree value in the i-th voltage-approximation-degree data group; and $D$ is a constant that depends on the degree to which the approximate computing method affects energy consumption;
taking the multiple voltage-approximation-degree data groups as input to an output quality predictor, and using an approximate computing method and a fault injection method to obtain the output quality prediction value corresponding to each voltage-approximation-degree data group.
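The performance and energy formulas of claim 2 can be illustrated with a minimal Python sketch. The names `DataGroup`, `predict_ips`, and `predict_energy`, and every numeric value a caller passes in, are hypothetical; the claims do not say how A, ΔIPS, β, C, or D are obtained, so they are plain parameters here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataGroup:
    """One voltage/approximation-degree pair (hypothetical container)."""
    voltage: float        # v_i
    approximation: float  # m_i


def predict_ips(group: DataGroup, a: float, delta_ips: float) -> float:
    """Performance model from claim 2: IPS_i = A * v_i + delta_IPS_i."""
    return a * group.voltage + delta_ips


def predict_energy(group: DataGroup, beta: float, c: float, d: float) -> float:
    """Energy model from claim 2:
    Energy_i = (beta_i * v_i)^2 * C + (beta_i * v_i)^2 * m_i * D.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    scaled_sq = (beta * group.voltage) ** 2
    return scaled_sq * c + scaled_sq * group.approximation * d
```

Both terms of the energy model share the factor $(\beta_i v_i)^2$, so the sketch computes it once and reuses it.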
3. The processor core optimization method based on near-threshold computing according to claim 2, characterized in that taking the multiple voltage-approximation-degree data groups as input to the output quality predictor and using an approximate computing method and a fault injection method to obtain the output quality prediction value corresponding to each voltage-approximation-degree data group specifically comprises:
taking the multiple voltage-approximation-degree data groups as input to the output quality predictor, and classifying the instructions in the application program executed on the processor core to obtain multiple instruction classes, each of which comprises multiple instructions with similar propagation paths;
sampling each instruction class using an approximate computing method to obtain multiple sampled instructions;
injecting a fault into each sampled instruction using a fault injection method to obtain sampled faulty instructions;
calculating error values from each sampled instruction and the corresponding sampled faulty instruction; the error values comprise the maximum poly heap error value between the sampled instruction output and the corresponding sampled faulty instruction output, the maximum relative error between the sampled instruction output and the corresponding sampled faulty instruction output, and the matrix error between the sampled instruction output and the corresponding sampled faulty instruction output;
obtaining, from the error values, the output quality prediction value corresponding to each voltage-approximation-degree data group.
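Two of the error values named in claim 3 can be sketched as follows. The claim gives no formulas, so this rests on assumptions: the maximum relative error is taken element-wise against the fault-free output, and the "matrix error" is assumed to be the Frobenius norm of the difference. The function names are hypothetical, and the third error value is left out because the source does not define it.

```python
import math
from typing import Sequence


def max_relative_error(reference: Sequence[float],
                       faulty: Sequence[float],
                       eps: float = 1e-12) -> float:
    """Largest element-wise relative deviation of the faulty output.

    eps guards against division by zero when a reference value is 0.
    """
    return max(abs(r - f) / max(abs(r), eps)
               for r, f in zip(reference, faulty))


def matrix_error(reference: Sequence[Sequence[float]],
                 faulty: Sequence[Sequence[float]]) -> float:
    """Assumed definition: Frobenius norm of (reference - faulty)."""
    return math.sqrt(sum((r - f) ** 2
                         for row_r, row_f in zip(reference, faulty)
                         for r, f in zip(row_r, row_f)))
```

How these values are combined into a single output quality prediction is not stated in the claim, so the sketch stops at the individual metrics.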
4. The processor core optimization method based on near-threshold computing according to claim 3, characterized in that injecting a fault into each sampled instruction using a fault injection method to obtain sampled faulty instructions specifically comprises:
constructing a fault injection platform; the fault injection platform integrates debug control software, fault injection software, a hardware simulator, and a simulation backplane;
injecting a fault into each sampled instruction using the fault injection platform to obtain the sampled faulty instructions.
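The claim describes a hardware-assisted platform and does not specify a fault model. As a purely software illustration, the sketch below assumes a single-bit-flip model applied to the integer output of each sampled instruction; `flip_bit` and `inject_faults` are hypothetical names, not part of the claimed platform.

```python
import random
from typing import List, Sequence


def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Flip one bit of an unsigned value that is `width` bits wide."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


def inject_faults(sampled_outputs: Sequence[int],
                  rng: random.Random,
                  width: int = 32) -> List[int]:
    """Return a faulty copy of each sampled output (assumed model:
    exactly one randomly chosen bit is flipped per value)."""
    return [flip_bit(v, rng.randrange(width), width) for v in sampled_outputs]
```

Passing a seeded `random.Random` makes an injection campaign reproducible, which helps when the fault-free and faulty outputs are compared afterwards.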
5. The processor core optimization method based on near-threshold computing according to claim 1, characterized in that constructing the objective optimization function specifically comprises:
constructing a function that takes the performance parameter as the objective and the energy consumption parameter and the output quality parameter as constraints; this function is the objective optimization function.
6. The processor core optimization method based on near-threshold computing according to claim 5, characterized in that taking the performance prediction values, energy consumption prediction values, and output quality prediction values corresponding to all voltage-approximation-degree data groups as input and solving the objective optimization function using a simulated annealing algorithm to obtain the optimal voltage-approximation-degree data group specifically comprises:
taking the performance prediction values, energy consumption prediction values, and output quality prediction values corresponding to all voltage-approximation-degree data groups as input, and using the simulated annealing algorithm to obtain the voltage-approximation-degree data group that satisfies the optimization condition; the optimization condition is that the performance prediction value is maximal, the energy consumption prediction value is less than a preset energy consumption value, and the output quality prediction value is less than a preset output quality value, or that the performance prediction value decreases at a preset frequency as the annealing temperature decreases;
determining the voltage-approximation-degree data group that satisfies the optimization condition as the optimal voltage-approximation-degree data group.
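The constrained search of claims 5 and 6 can be sketched as simulated annealing over a finite candidate list. This is a minimal illustration under stated assumptions, not the patented procedure: `Prediction`, `anneal`, the geometric cooling schedule, and the default temperatures are hypothetical; the output quality prediction is treated as an error measure (lower is better); and the two constraints are enforced by filtering out infeasible candidates before the search starts.

```python
import math
import random
from typing import NamedTuple, Sequence


class Prediction(NamedTuple):
    """Predicted metrics for one voltage/approximation-degree pair."""
    voltage: float
    approximation: float
    ips: float            # performance prediction (objective, maximised)
    energy: float         # energy prediction (constraint)
    quality_error: float  # output quality prediction (constraint)


def anneal(candidates: Sequence[Prediction], max_energy: float,
           max_quality_error: float, rng: random.Random,
           t_start: float = 1.0, t_end: float = 1e-3,
           cooling: float = 0.95) -> Prediction:
    """Maximise predicted IPS subject to the energy and quality limits."""
    feasible = [c for c in candidates
                if c.energy < max_energy and c.quality_error < max_quality_error]
    if not feasible:
        raise ValueError("no candidate satisfies the constraints")
    current = best = rng.choice(feasible)
    t = t_start
    while t > t_end:
        neighbour = rng.choice(feasible)  # uniform proposal over feasible set
        delta = neighbour.ips - current.ips
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(delta / t), which shrinks as t falls.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = neighbour
        if current.ips > best.ips:
            best = current
        t *= cooling  # geometric cooling (assumed schedule)
    return best
```

The voltage and approximation-degree value of the returned candidate would then serve as the operating point. For a candidate list this small an exhaustive scan gives the same answer; annealing matters when the voltage/approximation space is too large to enumerate.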
7. A processor core optimization system based on near-threshold computing, characterized by comprising:
a data acquisition module, configured to obtain multiple voltage-approximation-degree data groups, each of which comprises a voltage and a corresponding approximation-degree value;
a prediction value acquisition module, configured to take the multiple voltage-approximation-degree data groups as input to the processor core and obtain the performance prediction value, energy consumption prediction value, and output quality prediction value corresponding to each voltage-approximation-degree data group;
an objective function construction module, configured to construct an objective optimization function;
a solving module, configured to take the performance prediction values, energy consumption prediction values, and output quality prediction values corresponding to all voltage-approximation-degree data groups as input and solve the objective optimization function using a simulated annealing algorithm to obtain an optimal voltage-approximation-degree data group;
an optimal group determination module, configured to determine the voltage in the optimal voltage-approximation-degree data group as the optimal voltage in the near-threshold computing state, and the approximation-degree value in the optimal voltage-approximation-degree data group as the optimal approximation degree; the processor core operates at the optimal voltage and the optimal approximation degree.
8. The processor core optimization system based on near-threshold computing according to claim 7, characterized in that the prediction value acquisition module specifically comprises:
a performance prediction unit, configured to take the multiple voltage-approximation-degree data groups as input to a performance predictor and use an approximate computing method to obtain the performance prediction value corresponding to each voltage-approximation-degree data group:
$\mathrm{IPS}_i = A v_i + \Delta \mathrm{IPS}_i$,
where $v_i$ is the voltage in the i-th voltage-approximation-degree data group; $A$ is a constant that depends on the configuration of the processor core and on the application program executed on the processor core; and $\Delta \mathrm{IPS}_i$ is the degree to which the approximate computing method affects performance;
an energy consumption prediction unit, configured to take the multiple voltage-approximation-degree data groups as input to an energy consumption predictor and use an approximate computing method to obtain the energy consumption prediction value corresponding to each voltage-approximation-degree data group:
$\mathrm{Energy}_i = (\beta_i v_i)^2 C + (\beta_i v_i)^2 m_i D$,
where $\beta_i$ is a constant between 0 and 1 corresponding to the i-th voltage-approximation-degree data group and depends on the user's requirement for the voltage; $C$ is a constant that depends on the configuration of the processor core; $m_i$ is the approximation-degree value in the i-th voltage-approximation-degree data group; and $D$ is a constant that depends on the degree to which the approximate computing method affects energy consumption;
an output quality prediction unit, configured to take the multiple voltage-approximation-degree data groups as input to an output quality predictor and use an approximate computing method and a fault injection method to obtain the output quality prediction value corresponding to each voltage-approximation-degree data group.
9. The processor core optimization system based on near-threshold computing according to claim 8, characterized in that the output quality prediction unit specifically comprises:
a classification subunit, configured to take the multiple voltage-approximation-degree data groups as input to the output quality predictor and classify the instructions in the application program executed on the processor core to obtain multiple instruction classes, each of which comprises multiple instructions with similar propagation paths;
a sampling subunit, configured to sample each instruction class using an approximate computing method to obtain multiple sampled instructions;
a fault injection subunit, configured to inject a fault into each sampled instruction using a fault injection method to obtain sampled faulty instructions;
an error calculation subunit, configured to calculate error values from each sampled instruction and the corresponding sampled faulty instruction; the error values comprise the maximum poly heap error value between the sampled instruction output and the corresponding sampled faulty instruction output, the maximum relative error between the sampled instruction output and the corresponding sampled faulty instruction output, and the matrix error between the sampled instruction output and the corresponding sampled faulty instruction output;
an output quality prediction subunit, configured to obtain, from the error values, the output quality prediction value corresponding to each voltage-approximation-degree data group.
10. The processor core optimization system based on near-threshold computing according to claim 7, characterized in that the objective function construction module specifically comprises:
an objective function construction unit, configured to construct a function that takes the performance parameter as the objective and the energy consumption parameter and the output quality parameter as constraints; this function is the objective optimization function;
the solving module specifically comprises:
an optimization unit, configured to take the performance prediction values, energy consumption prediction values, and output quality prediction values corresponding to all voltage-approximation-degree data groups as input and use the simulated annealing algorithm to obtain the voltage-approximation-degree data group that satisfies the optimization condition; the optimization condition is that the performance prediction value is maximal, the energy consumption prediction value is less than a preset energy consumption value, and the output quality prediction value is less than a preset output quality value, or that the performance prediction value decreases at a preset frequency as the annealing temperature decreases;
a determination unit, configured to determine the voltage-approximation-degree data group that satisfies the optimization condition as the optimal voltage-approximation-degree data group.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910449741.0A CN110197026B (en) | 2019-05-28 | 2019-05-28 | Processor core optimization method and system based on near-threshold calculation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910449741.0A CN110197026B (en) | 2019-05-28 | 2019-05-28 | Processor core optimization method and system based on near-threshold calculation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110197026A | 2019-09-03 |
CN110197026B | 2023-03-31 |
Family ID: 67753192
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201910449741.0A | Active (CN110197026B) | 2019-05-28 | 2019-05-28 | Processor core optimization method and system based on near-threshold calculation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110197026B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130096902A1 (en) * | 2011-10-12 | 2013-04-18 | International Business Machines Corporation | Hardware Execution Driven Application Level Derating Calculation for Soft Error Rate Analysis |
US20150185816A1 (en) * | 2013-09-23 | 2015-07-02 | Cornell University | Multi-core computer processor based on a dynamic core-level power management for enhanced overall power efficiency |
WO2015152939A1 (en) * | 2014-04-04 | 2015-10-08 | Empire Technology Development Llc | Instruction optimization using voltage-based functional performance variation |
CN106164810A (en) * | 2014-04-04 | 2016-11-23 | 英派尔科技开发有限公司 | Use the optimization that the performance of function based on voltage changes |
US20160196122A1 (en) * | 2015-01-02 | 2016-07-07 | Reservoir Labs, Inc. | Systems and methods for efficient determination of task dependences after loop tiling |
CN108475099A (en) * | 2015-12-17 | 2018-08-31 | 米尼码处理器公司 | System and method for controlling operating voltage |
US20190005390A1 (en) * | 2017-06-30 | 2019-01-03 | University Of Florida Research Foundation, Inc. | Architecture-independent approximation discovery |
CN107516148A (en) * | 2017-08-22 | 2017-12-26 | 厦门逸圣科智能科技有限公司 | system modelling optimization method and storage medium |
Non-Patent Citations (1)
Title |
---|
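Sun Aolin et al.: "Resource scheduling algorithm for dark-silicon multi-core system chips", Journal of Computer-Aided Design & Computer Graphics * |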
Also Published As
Publication number | Publication date |
---|---|
CN110197026B (en) | 2023-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||