CN103077283B

CN103077283B - C-to-RTL synthesis method based on VFI optimization

Info

Publication number: CN103077283B
Application number: CN201310016186.5A
Authority: CN
Inventors: 李双辰; 何鑫宇; 刘勇攀; 杨华中
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2013-01-16
Filing date: 2013-01-16
Publication date: 2016-05-18
Anticipated expiration: 2033-01-16
Also published as: CN103077283A

Abstract

The present invention relates to the technical field of hardware design automation, and in particular to a C-to-RTL synthesis method for ASIC hardware design that jointly optimizes pipeline partitioning, module parallelism and VFI (voltage-frequency island) assignment. In hardware design, pipelining and parallel structures are two effective means of improving hardware performance, and in large-scale ASIC design the use of VFIs can significantly reduce power consumption. The VFI-based C-to-RTL synthesis method of the present invention optimizes pipeline partitioning, module parallelism and VFI assignment simultaneously during C-to-RTL synthesis. Compared with methods that optimize these three aspects step by step in separate optimization processes, the method of the present invention guarantees global optimality. The present invention therefore improves the practicality and widens the scope of application of C-to-RTL synthesis technology, and provides strong technical support for hardware design.

Description

C-to-RTL synthesis method based on VFI optimization

Technical field

The present invention relates to the technical field of hardware design automation, and in particular to a C-to-RTL synthesis method for ASIC hardware design that optimizes pipeline partitioning, module parallelism and VFI (Voltage-Frequency Island) assignment.

Background art

In the integrated circuit field, an ASIC (Application Specific Integrated Circuit) is an integrated circuit designed for a specific purpose. ASICs are characterized by targeting the needs of specific users; in volume production, compared with general-purpose integrated circuits, an ASIC offers advantages such as smaller size, lower power consumption, higher reliability, higher performance, better confidentiality and lower cost.

For ASIC hardware design, C-to-RTL synthesis is highly advantageous. C-to-RTL synthesis refers to directly converting a C program at the algorithm description level into an HDL (hardware description language) program at the register transfer level (RTL); it is a form of high-level synthesis in hardware design automation. With C-to-RTL synthesis technology, HDL design work that traditionally requires a great deal of manual effort can be completed quickly and automatically. In general, C-to-RTL synthesis has the following advantages: (1) it shortens hardware design time and reduces design difficulty, which makes it an effective way to resolve the contradiction between rapidly growing hardware design demand and slowly growing design capability; (2) it narrows the gap between software design and hardware design, supporting hardware/software co-design. In view of these advantages, C-to-RTL synthesis has received wide attention in both academia and industry.

However, existing C-to-RTL synthesis technology still has many unresolved problems, for example: (1) when synthesizing large-scale C programs, the quality of the synthesis result is very unsatisfactory; (2) the user cannot set optimization targets or specify constraints for the performance of the synthesis result (throughput, area, power consumption, etc.); (3) system-level power optimization measures are very unsatisfactory, which is especially evident in ASIC design; (4) the technology currently does not support VFI design. The root cause of these problems is mainly that the design and optimization of the high-level or system-level hardware architecture is not taken into account, and that the C language cannot express hardware timing, parallelism, architecture and the like.

In summary, a C-to-RTL synthesis method that can optimize pipeline partitioning, module parallelism and VFI assignment is urgently needed.

Summary of the invention

(1) Technical problem to be solved

The object of the present invention is to provide a C-to-RTL synthesis method that optimizes pipeline partitioning, module parallelism and VFI assignment, so that during C-to-RTL synthesis for ASIC design these three aspects are optimized simultaneously, thereby improving the practicality and widening the scope of application of C-to-RTL synthesis technology and providing strong technical support for hardware design.

(2) Technical solution

The technical solution of the present invention is as follows:

A C-to-RTL synthesis method based on VFI optimization, comprising the steps of:

S1. Synthesizing each function to be synthesized in the C program separately and obtaining the post-synthesis function parameters;

S2. Setting the optimization target and constraints;

S3. Determining the pipeline module partition, the module parallelism degree and the VFI assignment from the function parameters, the optimization target and the constraints;

S4. Synthesizing the modules obtained after pipeline partitioning and parallelizing them according to the module parallelism degree;

S5. Connecting the parallel modules into the overall system in accordance with the VFI assignment.

Preferably, the connection topology of the functions to be synthesized is linear.

Preferably, the function parameters comprise the function's operation cycle count, data volume, area, power consumption and highest supported frequency.

Preferably, the optimization target comprises throughput maximization, area minimization and power minimization; the constraints comprise a throughput constraint, an area constraint and a power constraint.

Preferably, in step S3, the pipeline module partition, the module parallelism degree and the VFI assignment are determined by mixed integer linear programming from the function parameters, the optimization target and the constraints.

Preferably, step S3 comprises:

S311. Calculating, from the function parameters, the parameters of all modules that can possibly be obtained after pipeline partitioning;

S312. Building a mixed integer linear programming model from the module parameters, the optimization target and the constraints;

S313. Solving the mixed integer linear programming model to obtain a one-dimensional non-negative integer array and a two-dimensional Boolean array;

If the n-th entry of the one-dimensional non-negative integer array is zero, the n-th function and the (n+1)-th function connected to it are placed in the same module; if the n-th entry is non-zero, it gives the parallelism degree of the module containing the n-th function;

The two-dimensional Boolean array, combined with the one-dimensional non-negative integer array, represents the voltage-frequency value assigned to each module.

Preferably, in step S3, the pipeline module partition, the module parallelism degree and the VFI assignment are determined by a heuristic algorithm from the function parameters, the optimization target and the constraints.

Preferably, step S3 comprises:

S321. Building a topological graph from the function parameters with area minimization or power minimization as the target, and adding a start node and an end node to the graph; wherein the nodes represent all possible voltage-frequency values of all possible modules at the minimum parallelism degree that satisfies the throughput constraint, and the weights of an edge represent the area and power consumption of the edge's source node;

S322. Calculating the shortest distance from each node to the end node;

S323. Using each shortest distance as the estimated cost, solving with the A-Star algorithm for the shortest path from the start node to the end node; the shortest path satisfies the throughput constraint, the area constraint and the power constraint.

Preferably, step S321 is: building a topological graph from the function parameters with throughput maximization as the target, and adding a start node and an end node to the graph; wherein the nodes represent all possible voltage-frequency values of all possible modules at all possible parallelism degrees, and the weights of an edge represent the area and power consumption of the edge's source node.

Preferably, in step S323, the A-Star algorithm is improved as follows:

The rule for transferring a node from the OPEN set to the CLOSED set is modified: the optimal node that satisfies the constraints is transferred from the OPEN set to the CLOSED set; if no node satisfies the constraints, the algorithm backtracks and revises the previous node of the current path.

(3) Beneficial effects

In hardware design, pipelining and parallel structures are two effective means of improving hardware performance, and in large-scale ASIC design the use of VFIs can significantly reduce power consumption. The VFI-based C-to-RTL synthesis method of the present invention optimizes pipeline partitioning, module parallelism and VFI assignment simultaneously during C-to-RTL synthesis. Compared with methods that optimize these three aspects step by step in separate optimization processes, the method of the present invention guarantees global optimality. The present invention therefore improves the practicality and widens the scope of application of C-to-RTL synthesis technology, and provides strong technical support for hardware design.

Brief description of the drawings

Fig. 1 is a flow chart of a C-to-RTL synthesis method based on VFI optimization according to the present invention;

Fig. 2 is an example of the topological graph used to solve the optimization problem of Fig. 1 with the heuristic algorithm;

Fig. 3 is a flow chart of solving the optimization problem of Fig. 1 with the heuristic algorithm;

Fig. 4 is a flow chart of the sub-function that moves a node from the OPEN set to the CLOSED set in the heuristic algorithm of Fig. 1.

Detailed description of the invention

Specific embodiments of the invention are further described below with reference to the drawings and examples. The following embodiments merely illustrate the present invention and do not limit its scope.

As shown in the flow chart of Fig. 1, a C-to-RTL synthesis method with pipeline partitioning and module parallelism optimization mainly comprises the following steps:

S1. Using an existing C-to-RTL tool, each function to be synthesized in the input C program is synthesized in advance, and the post-synthesis function parameters are then extracted or calculated. The C program is required to consist of N functions whose connection topology is linear; this requirement can be met by modifying the coding style of the C program. The function parameters include the function's operation cycle count, data volume, area, power consumption and highest supported frequency, as detailed in Table 1. After a function has been synthesized by the C-to-RTL tool, front-end simulation yields the cycle count and data volume, ASIC logic synthesis yields the area, a power analysis tool run after back-end simulation yields the power consumption, and the highest supported frequency comes from the logic synthesis report.

Table 1: Parameters of the n-th function F_n

Parameter	Description of the parameter for the n-th function F_n
N	Total number of functions in the C program
T_n	Total time for F_n to complete one operation (cycles)
T_n^{in} / T_n^{out}	Input / output time F_n needs to complete one operation (cycles)
S_n^{in} / S_n^{out}	Input / output data volume F_n needs to complete one operation (bytes)
A_n^{le} / A_n^{mem}	Area of F_n: logic resources / storage resources (um²)
CF_n^{max}	Fastest clock frequency F_n can reach (MHz)
{P_{n,k}}	Power consumption of F_n at the voltage-frequency point (V_k, CF_k), a 1×K array (mW)

S2. The optimization target and constraints are set. The optimization target may be throughput maximization, area minimization, power minimization and so on; the constraints include a throughput constraint, an area constraint, a power constraint and so on. Some user-specified system parameters are also included; these preset system parameters are parameters of the overall system finally obtained, and are listed in Table 2.

Table 2: System parameters

Parameter	Description of the system parameter
R_req	Overall throughput constraint (byte/us)
A_req	Overall area constraint (um²)
P_req	Overall power constraint (mW)
OA_para	Area overhead of parallelization (um²)
OP_para	Power overhead of parallelization (mW)
A_fifo	Area occupied by the FIFO per byte (um²/byte)
P_fifo	Power consumed by the FIFO per byte (mW/byte)
ε^A / ε^P	Area / power saving parameter when functions are merged into the same module
{(V_k, CF_k)}	Available voltage-frequency points of the VFI (V, MHz)
K	Number of available voltage-frequency points of the VFI

S3. Based on the function parameters, the optimization target and the constraints, the optimized pipeline module partition, module parallelism degree and VFI assignment are determined. That is, this step determines: which functions should be grouped into one module to form a single pipeline stage (pipeline partitioning), for example partitioning N functions into M modules; how many parallel copies each of these modules needs (module parallelism); and which voltage and frequency each parallelized module should be mapped to for optimal power consumption (VFI assignment), for example selecting the power-optimal group among the K voltage-frequency groups.
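To give a feel for the size of the pipeline-partitioning part of this search space, the following minimal Python sketch enumerates every way of splitting N linearly connected functions into consecutive modules. The function name `linear_partitions` and the (first, last) tuple representation are illustrative assumptions, not part of the patent; parallelism and VFI choices are not enumerated here.

```python
from itertools import product


def linear_partitions(n_functions):
    """Yield every split of F_1..F_N into consecutive modules.

    Each partition is a list of (first, last) pairs with 1-based,
    inclusive function indices, i.e. the modules B_{first,last}.
    """
    # One boolean per boundary between F_n and F_{n+1}: True = cut here.
    for cuts in product([False, True], repeat=n_functions - 1):
        modules, first = [], 1
        for n, cut_after in enumerate(cuts, start=1):
            if cut_after:
                modules.append((first, n))
                first = n + 1
        modules.append((first, n_functions))
        yield modules


if __name__ == "__main__":
    for partition in linear_partitions(3):
        print(partition)
```

With N functions there are 2^(N-1) such partitions before parallelism degrees and voltage-frequency points are even considered, which is why the embodiments below rely on an optimization model or a graph search rather than plain enumeration.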

This step can be implemented in several ways; in this embodiment, the mixed integer linear programming method and a heuristic algorithm are described as examples.

For example, in step S3, the pipeline module partition, the module parallelism degree and the VFI assignment are determined by mixed integer linear programming from the function parameters, the optimization target and the constraints. This step mainly comprises:

S311. The parameters of all modules that can possibly be obtained after pipeline partitioning are calculated from the function parameters. From the parameters of the functions F_n obtained in step S1, the parameters of every module B_{i,j} that pipeline partitioning may produce can be calculated, where module B_{i,j} denotes the module composed of functions F_i, F_{i+1}, ..., F_j. The parameters of B_{i,j} mainly include the module throughput, the module area and the parallelism upper bound, as detailed in Table 3.

Table 3: Parameters of module B_{i,j}

Parameter	Description of the parameter for module B_{i,j}
T_{i,j}	Total time for B_{i,j} to complete one operation (cycles)
T_{i,j}^{in} / T_{i,j}^{out}	Input / output time B_{i,j} needs to complete one operation (cycles)
S_{i,j}^{in} / S_{i,j}^{out}	Input / output data volume B_{i,j} needs to complete one operation (bytes)
A_{i,j}^{le} / A_{i,j}^{mem}	Area of B_{i,j}: logic resources / storage resources (um²)
UP_{i,j}	Parallelism upper bound of B_{i,j}

The parameters in Table 3 are calculated as follows:

T_{i,j} = T_{i}^{in} + T_{j}^{out} + \sum_{n=i}^{j} \left( T_{n} - T_{n}^{in} - T_{n}^{out} \right) \qquad (1)

A_{i,j}^{le} = \begin{cases} \sum_{n=i}^{j} A_{n}^{le} \cdot (1 - \epsilon^{A}) & i < j \\ A_{i}^{le} & i = j \end{cases} \qquad (2)

A_{i,j}^{mem} = \max_{m \in [i,j]} A_{m}^{mem} \qquad (3)

where ε^A is defined in Table 2;
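The three formulas above can be evaluated directly from the per-function parameters of Table 1. The sketch below is a minimal Python illustration; the dictionary keys (`T`, `T_in`, `T_out`, `A_le`, `A_mem`) and the function name `module_params` are assumed names chosen for this example only.

```python
def module_params(funcs, i, j, eps_a):
    """Compute T, A_le and A_mem of module B_{i,j} per formulas (1)-(3).

    funcs: list of dicts, one per function F_1..F_N, with keys
           "T", "T_in", "T_out", "A_le", "A_mem" (assumed names).
    i, j:  1-based, inclusive indices with i <= j.
    eps_a: area saving parameter epsilon^A from Table 2.
    """
    span = funcs[i - 1:j]
    # Formula (1): input time of the first function, output time of the
    # last one, plus the pure computation time of every function.
    t_total = span[0]["T_in"] + span[-1]["T_out"] + sum(
        f["T"] - f["T_in"] - f["T_out"] for f in span
    )
    # Formula (2): merged logic area is discounted only when i < j.
    a_le = sum(f["A_le"] for f in span)
    if i < j:
        a_le *= 1 - eps_a
    # Formula (3): storage area is the largest among the merged functions.
    a_mem = max(f["A_mem"] for f in span)
    return {"T": t_total, "A_le": a_le, "A_mem": a_mem}
```

For two functions with T = 10 and 20 cycles, (T_in, T_out) = (2, 3) and (4, 5), merging them gives T_{1,2} = 2 + 5 + (10 - 2 - 3) + (20 - 4 - 5) = 23 cycles, since the intermediate output and input transfers are no longer counted.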

S312. Suitable variables are defined from the module parameters, the optimization target and the constraints so as to give the problem a mathematical description; a mixed integer linear programming model is then built from this description and can be solved with existing integer linear programming tools. In this embodiment, this specifically comprises:

1) Defining variables:

The variables are defined in Table 4. The definitions of {x_n} and {c_{n,k}} are central. The one-dimensional array {x_n} (1×N) represents the partitioning and parallelism result; each of its elements is a non-negative integer. {x_n} expresses the pipeline partition and module parallelism as follows: if the n-th entry is zero, the n-th function and the (n+1)-th function connected to it are placed in the same module; if the n-th entry is non-zero, it gives the parallelism degree of the module containing the n-th function, and also indicates that the n-th function and the (n+1)-th function are not placed in the same module. The two-dimensional Boolean array {c_{n,k}} (N×K) represents the VFI assignment result; each of its elements is 0 or 1. Given {x_n}, if x_n c_{n,k} is greater than zero, the module containing function n is assigned the k-th voltage-frequency value, i.e. (V_k, CF_k). In addition, the temporary variable {y_{i,j}} (N×N) is a two-dimensional array whose elements are 0 or 1; it is an intermediate representation of the linear partition of the system: y_{i,j} = 1 indicates that module B_{i,j} exists in the partitioned system, and y_{i,j} = 0 indicates that it does not.
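As an illustration of this encoding, the following Python sketch decodes a pair of arrays {x_n} and {c_{n,k}} into an explicit list of modules. The function name `decode_solution` and the output tuple layout are assumptions made for this example; function indices and voltage-frequency indices are 1-based to match the text.

```python
def decode_solution(x, c):
    """Decode the arrays {x_n} and {c_{n,k}} into a list of modules.

    x: list of N non-negative ints (0 = merge F_n with F_{n+1},
       non-zero = parallelism degree of the module ending at F_n).
    c: N x K list of 0/1 rows; row n selects the voltage-frequency
       point of the module ending at F_n.
    Returns tuples (first, last, parallelism, k), all 1-based.
    """
    modules = []
    first = 1
    for n, degree in enumerate(x, start=1):
        if degree == 0:
            continue  # F_n belongs to the same module as F_{n+1}
        k = c[n - 1].index(1) + 1  # the single k with c_{n,k} = 1
        modules.append((first, n, degree, k))
        first = n + 1
    if first != len(x) + 1:
        raise ValueError("the last entry of x must be non-zero")
    return modules
```

For example, x = [0, 2, 1] with c = [[1, 0], [0, 1], [1, 0]] decodes to module B_{1,2} with parallelism 2 at voltage-frequency point 2, followed by module B_{3,3} with parallelism 1 at point 1.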

Table 4: Definition of variables

2) Mathematical description:

Using the variables in Table 4 and the parameters in Tables 1 to 3, the problem of optimizing pipeline partitioning, module parallelism and VFI assignment can be abstracted into the following mathematical expression.

\text{obj: } \max r_{all} \ \text{or} \ \min a_{all} \ \text{or} \ \min p_{all}

\text{s.t.: } a_{all} \leq A_{req} \ \text{and} \ r_{all} \geq R_{req} \ \text{and} \ p_{all} \leq P_{req} \qquad (4)

where the throughput r_all, the area a_all and the power consumption p_all can be expressed in terms of the variables in Table 4 and the parameters in Tables 1 to 3, as follows:

Expression for the throughput r_all:

r_{all} \leq r_{i,j} \quad \text{if } y_{i,j} = 1 \qquad (5)

r_{i,j} = \begin{cases} x_{j} y_{i,j} \, cf_{i,j} / T_{i,j} & x_{j} y_{i,j} < UP_{i,j} \\ cf_{i,j} / \max \{ T_{i,j}^{in}, T_{i,j}^{out} \} & \text{otherwise} \end{cases} \qquad (6)

cf_{i,j} = \sum_{k=1}^{K} c_{j,k} \cdot CF_{k} \qquad (7)
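The throughput expressions (5) to (7) can be read as: each selected module has its own rate, limited either by its parallel compute time or by its input/output time, and the system rate is bounded by the slowest selected module. The Python sketch below evaluates this for modules that are already selected (y_{i,j} = 1) and already have a clock frequency assigned; the function and argument names are assumptions for illustration, and the result is in whatever rate unit the inputs imply.

```python
def module_throughput(degree, cf, t_total, t_in, t_out, up):
    """Rate of one selected module B_{i,j} following formula (6).

    degree: parallelism degree x_j
    cf:     assigned clock frequency cf_{i,j} from formula (7)
    t_total, t_in, t_out: T_{i,j}, T_{i,j}^{in}, T_{i,j}^{out} in cycles
    up:     parallelism upper bound UP_{i,j}
    """
    if degree < up:
        # Below the bound, throughput scales with the number of copies.
        return degree * cf / t_total
    # At or above the bound, input/output transfer becomes the limit.
    return cf / max(t_in, t_out)


def system_throughput(modules):
    """Largest r_all allowed by formula (5): the slowest module's rate.

    modules: list of dicts whose keys match module_throughput's arguments.
    """
    return min(module_throughput(**m) for m in modules)
```

For instance, a module with T = 50 cycles clocked at 100 MHz and replicated twice (below its bound of 4) delivers 2 × 100 / 50 = 4 operations per microsecond.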

Expression for the area a_all:

a_{all} = a_{pe} + a_{fifo} \qquad (8)

a_{pe} = \sum_{i=1}^{N} \sum_{j=i}^{N} \left( (x_{j} - 1) \, OA_{para} + x_{j} \left( A_{i,j}^{le} + A_{i,j}^{mem} \right) \right) y_{i,j} \qquad (9)

a_{fifo} = A_{fifo} \sum_{i=1}^{N-1} \sum_{j=i}^{N} x_{j} \, y_{i,j} \, S_{j}^{out} \qquad (10)
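As a worked illustration of formulas (8) to (10), the Python sketch below sums the area of an explicit list of selected modules. It assumes that the overhead term in formula (9) is the parallel area overhead OA_para of Table 2 and that the parallelism factor in formula (10) is x_j; the dictionary keys and the function name `total_area` are likewise assumptions for this example.

```python
def total_area(modules, n_functions, oa_para, a_fifo):
    """Evaluate a_all = a_pe + a_fifo for the selected modules.

    modules: list of dicts with keys "first", "last", "degree",
             "A_le", "A_mem" and "S_out" (output bytes of F_last).
    n_functions: N, the number of functions in the C program.
    oa_para: area overhead of parallelization (assumed to be the
             overhead term of formula (9)).
    a_fifo:  FIFO area per byte.
    """
    # Formula (9): (x_j - 1) overheads plus x_j copies of the module.
    a_pe = sum(
        (m["degree"] - 1) * oa_para + m["degree"] * (m["A_le"] + m["A_mem"])
        for m in modules
    )
    # Formula (10): the outer sum runs over i = 1..N-1 only, so a module
    # whose first function is F_N contributes no FIFO area.
    fifo_bytes = sum(
        m["degree"] * m["S_out"]
        for m in modules
        if m["first"] <= n_functions - 1
    )
    return a_pe + a_fifo * fifo_bytes
```

With B_{1,2} replicated twice (logic 225, storage 60, 8 output bytes) and B_{3,3} once (logic 50, storage 10), an overhead of 5 and 2 area units per FIFO byte, the result is (5 + 2 × 285) + 60 + 2 × 16 = 667 area units.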

Expression for the power consumption p_all:

p_{all} = p_{pe} + p_{fifo} \qquad (11)

p_{pe} = \sum_{i=1}^{N} \sum_{j=i}^{N} \left( (x_{j} - 1) \, OP_{para} + x_{j} \left( p_{i,j}^{le} + p_{i,j}^{mem} \right) \right) y_{i,j} \qquad (12)

p_{fifo} = P_{fifo} \cdot \sum_{i=1}^{N-1} \sum_{j=i}^{N} \left( x_{j} \, y_{i,j} \right) S_{j}^{out} \qquad (13)

where:

p_{i,j}^{le} = \begin{cases} \sum_{k=1}^{K} \left( c_{j,k} \sum_{n=i}^{j} P_{n,k}^{le} \, T_{n} / T_{i,j} \right) (1 + \epsilon^{P}) & i < j \\ \sum_{k=1}^{K} \left( c_{i,k} \cdot P_{i,k}^{le} \right) & i = j \end{cases} \qquad (14)

p_{i,j}^{mem} = \sum_{k=1}^{K} \left( c_{j,k} \max_{i \leq n \leq j} P_{n,k}^{mem} \right) \qquad (15)

Connectivity constraints: in addition to the area, throughput and power constraints shown in formula (4), the model includes a number of constraints on the connection relations, as follows:

\begin{cases} \sum_{i=1}^{n} y_{i,n} \leq x_{n} \\ x_{n} = 0 \ \text{when} \ \sum_{i=1}^{n} y_{i,n} = 0 \end{cases} \quad x_{n} \in \mathbb{N}, \ y_{i,j} \in \{0, 1\} \qquad (16)

\sum_{i=1}^{j-1} y_{i,j-1} = \sum_{i=j}^{N} y_{j,i} \quad \forall j \in [2, N] \qquad (17)

\sum_{i=1}^{j} y_{i,j} + \sum_{i=j}^{N} y_{j,i} - y_{j,j} \leq 1 \quad \forall j \in [1, N] \qquad (18)

1 \leq \sum_{j=1}^{N} \sum_{i=1}^{j} y_{i,j} \leq N \qquad (19)

y_{i,j} \left( \sum_{k=1}^{K} c_{j,k} \, CF_{k} \right) \leq \min_{i \leq n \leq j} CF_{n}^{\max} \qquad (20)

\sum_{k=1}^{K} c_{n,k} = 1 \quad \forall n \in [1, N], \ c_{n,k} \in \{0, 1\} \qquad (21)

3) Linearizing the problem:

Formulas (4) to (21) above contain many nonlinear terms. The present invention converts them into linear expressions by the following methods, so that the whole problem can be solved by mixed integer linear programming.

Linearizing x_j y_{i,j}: substitute z_{i,j} = x_j y_{i,j} and impose the following linear constraints on z_{i,j}:

-M y_{i,j} \leq z_{i,j} \leq M y_{i,j}

x_{j} - M (1 - y_{i,j}) \leq z_{i,j} \leq x_{j} + M (1 - y_{i,j}) \qquad (22)

where M is an integer greater than all data in the optimization problem;
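The effect of these big-M constraints can be checked by brute force: for every non-negative integer x up to some bound and every binary y, the only integer z that satisfies both constraint pairs should be the product x·y. The Python sketch below performs that check; the helper names `feasible_z` and `big_m_is_exact` are illustrative assumptions.

```python
from itertools import product


def feasible_z(x, y, big_m):
    """Return all integers z in [-M, M] satisfying the two big-M pairs."""
    return [
        z
        for z in range(-big_m, big_m + 1)
        if -big_m * y <= z <= big_m * y
        and x - big_m * (1 - y) <= z <= x + big_m * (1 - y)
    ]


def big_m_is_exact(big_m, max_x):
    """True if, for all x in 0..max_x and y in {0, 1}, z is forced to x*y."""
    return all(
        feasible_z(x, y, big_m) == [x * y]
        for x, y in product(range(max_x + 1), (0, 1))
    )


if __name__ == "__main__":
    print(big_m_is_exact(big_m=10, max_x=5))  # M larger than every x
    print(big_m_is_exact(big_m=3, max_x=5))   # M too small for x = 5
```

When y = 0 the first pair pins z to 0, and when y = 1 the second pair pins z to x. The check fails as soon as M is smaller than some admissible x, which is why M must exceed all data in the problem.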

Linearizing x_j y_{i,j} c_{j,k}: substitute w_{i,j,k} = z_{i,j} c_{j,k} and impose the following linear constraints on w_{i,j,k}:

0 \leq w_{i,j,k} \leq M c_{j,k}

z_{i,j} - M (1 - c_{j,k}) \leq w_{i,j,k} \leq z_{i,j} + M (1 - c_{j,k}) \qquad (23)

where M is an integer greater than all data in the optimization problem;

Linearizing y_{i,j} c_{j,k}: substitute u_{i,j,k} = y_{i,j} c_{j,k} and impose the following linear constraints on u_{i,j,k}:

0 \leq u_{i,j,k} \leq c_{j,k}

0 \leq u_{i,j,k} \leq y_{i,j}, \quad u_{i,j,k} \in \{0, 1\} \qquad (24)

Linearizing formula (5):

r_{all} \leq r_{i,j} + M (1 - y_{i,j}) \quad \forall \, 1 \leq i \leq j \leq N \qquad (25)

Linearizing formula (6):

r_{i,j} \leq \begin{cases} x_{j} y_{i,j} \, cf_{i,j} / T_{i,j} & x_{j} y_{i,j} < UP_{i,j} \\ cf_{i,j} / \max \{ T_{i,j}^{in}, T_{i,j}^{out} \} & \text{otherwise} \end{cases} \qquad (26)

Linearizing formula (16):

\sum_{i=1}^{n} y_{i,n} \leq x_{n} \leq M \cdot \sum_{i=1}^{n} y_{i,n}, \quad x_{n} \in \mathbb{N} \qquad (27)

S313. The model is solved to obtain the one-dimensional non-negative integer array {x_n} and the two-dimensional Boolean array {c_{n,k}}, which constitute the final result.

As another example, considering the limitations of the mixed integer linear programming method of step S3 for large-scale computation, the present invention also proposes a heuristic algorithm for solving the optimization problem of pipeline partitioning, module-level parallelism and VFI assignment. The heuristic algorithm converts the optimization problem into the basic problem of finding a shortest path in a topological graph, then preprocesses the graph, and finally solves it with the A-Star algorithm (A-Star is an effective method for finding shortest paths in a static road network; the present invention improves on the existing A-Star algorithm, as described in step S323). In this embodiment, this step mainly comprises:

S321. A topological graph is built from the function parameters; the graph is a directed acyclic graph. In this embodiment the graph can be built in two ways, described in turn below:

1) Building the topological graph with area minimization or power minimization as the target. The nodes represent all possible voltage-frequency values (V_k, CF_k) of every possible module B_{i,j}, at the minimum parallelism degree that satisfies the throughput constraint R_req; there are K · C_{N+1}^{2} nodes in total, where C_{N+1}^{2} = N(N+1)/2 is the number of modules B_{i,j} with i ≤ j. Connections between nodes follow the linking relationship of the modules, i.e. B_{i,j} is connected to B_{j+1,m}. Each edge carries two weights, representing the area and the power consumption of its source node, i.e. the area and power of B_{i,j} after parallelization and VFI assignment are taken into account. When the area weight exceeds the area constraint A_req or the power weight exceeds the power constraint P_req, the source node is deleted from the graph. Finally, a start node (the Begin node in the figure) is added and connected to every B_{1,j}, and an end node (the End node in the figure) is added with every B_{i,N} connected to it, which completes the topological graph. Fig. 2 shows the topological graph formed for N = 3, K = 2.

The problem is thus converted into finding the shortest path with respect to the area weights while ensuring that the power weight of the path is below the power constraint, or finding the shortest path with respect to the power weights while ensuring that the area weight of the path is below the area constraint.
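The structure of this graph can be sketched in a few lines of Python. The sketch below only builds the nodes and the connectivity (Begin to every B_{1,j}, B_{i,j} to every B_{j+1,m}, every B_{i,N} to End); edge weights and the deletion of nodes that violate A_req or P_req are omitted, and the names `build_graph` and `expected_node_count` are assumptions for illustration.

```python
from math import comb


def build_graph(n, k_points):
    """Build nodes (i, j, k) and successor lists for N functions, K points.

    A node (i, j, k) stands for module B_{i,j} run at the k-th
    voltage-frequency point; indices are 1-based as in the text.
    """
    nodes = [
        (i, j, k)
        for i in range(1, n + 1)
        for j in range(i, n + 1)
        for k in range(1, k_points + 1)
    ]
    # Begin connects to every module that starts with F_1.
    edges = {"Begin": [v for v in nodes if v[0] == 1]}
    for (i, j, k) in nodes:
        if j == n:
            edges[(i, j, k)] = ["End"]  # module ends with F_N
        else:
            # B_{i,j} links to every module starting at F_{j+1}.
            edges[(i, j, k)] = [v for v in nodes if v[0] == j + 1]
    return nodes, edges


def expected_node_count(n, k_points):
    """Node count K * C(N+1, 2) stated in the description."""
    return k_points * comb(n + 1, 2)
```

For N = 3 and K = 2 this yields 2 × 6 = 12 module nodes plus the Begin and End nodes, and every Begin-to-End path corresponds to one pipeline partition with one voltage-frequency choice per module.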

2) Building the topological graph with throughput maximization as the target; the graph is a directed acyclic graph. The nodes represent all possible voltage-frequency values (V_k, CF_k) of every possible module B_{i,j}, at every possible parallelism degree that respects the parallelism upper bound UP_{i,j}; there are at most max{UP_{i,j}} · K · C_{N+1}^{2} nodes. Connections between nodes follow the linking relationship of the modules, i.e. B_{i,j} is connected to B_{j+1,m}. Each edge carries two weights, representing the area and the power consumption of its source node, i.e. the area and power of B_{i,j} after parallelization and VFI assignment are taken into account. When the area weight exceeds the area constraint A_req or the power weight exceeds the power constraint P_req, the source node is deleted from the graph. A start node (the Begin node in the figure) is added and connected to every B_{1,j}, and an end node (the End node in the figure) is added with every B_{i,N} connected to it, which completes the topological graph.

The problem is thus converted into finding the path with the maximum throughput while ensuring that both the power weight and the area weight of the path are below their constraints.

S322. The topological graph is preprocessed by computing the shortest distance from each node to the end node. The shortest distances in terms of the area, power and throughput weights are used by the improved A-Star algorithm as the estimated cost h(n) of the optimal path from that node to the destination node. This computation can be carried out with Dijkstra's algorithm or another algorithm.
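One way to carry out this preprocessing is to run Dijkstra's algorithm backwards from the end node over the reversed graph, which gives the distance of every node to the end node in a single pass. The Python sketch below does this for one additive weight such as area or power; the graph representation (a dict mapping a node to a list of (successor, weight) pairs) and the function name `distance_to_end` are assumptions for illustration.

```python
import heapq


def distance_to_end(edges, end="End"):
    """Shortest distance from every node to `end` for non-negative weights.

    edges: dict mapping node -> list of (successor, weight) pairs.
    Returns a dict node -> distance; unreachable nodes are absent.
    """
    # Reverse the graph so that one search from `end` covers all nodes.
    reverse = {}
    for u, outs in edges.items():
        for v, w in outs:
            reverse.setdefault(v, []).append((u, w))

    dist = {end: 0}
    counter = 0  # tie-breaker so the heap never compares node objects
    heap = [(0, counter, end)]
    while heap:
        d, _, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, w in reverse.get(v, []):
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                counter += 1
                heapq.heappush(heap, (nd, counter, u))
    return dist
```

Because these distances are exact for the unconstrained problem, they never overestimate the remaining cost of a constrained path, which is the property an A-Star estimate h(n) needs.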

S323. In the improved A-Star algorithm, f(n) is the evaluation function of the path from the initial node via node n to the destination node, g(n) is the actual cost in the state space from the start node to node n, and h(n) is the estimated cost of the optimal path from n to the destination node. In this embodiment, the preprocessing result is used as the estimated cost h(n) of the optimal path from node V (an arbitrary node) to the End node, and the length of the existing path from the Begin node to node V is g(n). When maximizing throughput, f(n) = min{g(n), h(n)}; when minimizing area or power, f(n) = g(n) + h(n). Overhead(V) is the consumption of the existing path from the Begin node to node V (when minimizing area, this consumption is the power; when minimizing power, it is the area; when maximizing throughput, it is the area and the power). With these definitions, the shortest path of the topological graph that satisfies the constraints is solved; the flow chart is shown in Fig. 3. The function NodeStructure* nodeOPEN_to_CLOSED(SetStructure* OPEN, SetStructure* CLOSED, ParameterStructure* Para) is the main improvement of the invention over the A-Star algorithm. Its role is to select a suitable node from the OPEN set and transfer it to the CLOSED set (in the traditional A-Star algorithm, the OPEN set holds all nodes that have been generated but not yet examined, and the CLOSED set records the nodes that have been visited). The function transfers the optimal node V that satisfies the constraints from the OPEN set to the CLOSED set; if no node satisfies the constraints, it backtracks and revises the previous node of the current path. The detailed flow is shown in Fig. 4.
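The idea of minimizing one weight while keeping a second weight (the Overhead) within a budget can be illustrated with a simplified search. The Python sketch below is not the OPEN-to-CLOSED rule with backtracking of Fig. 4: as a stand-in, it runs a plain best-first search over partial paths with f(n) = g(n) + h(n) and discards any extension whose accumulated overhead exceeds the budget. All names, the edge representation and the example data are assumptions for illustration.

```python
import heapq
from itertools import count


def constrained_search(edges, h, budget, start="Begin", goal="End"):
    """Cheapest start-to-goal path whose total overhead stays <= budget.

    edges:  dict node -> list of (successor, cost, overhead) triples.
    h:      dict node -> estimate of the remaining cost that never
            overestimates (for example, precomputed distances to goal).
    Returns (cost, path) or None if no path satisfies the budget.
    """
    tie = count()  # unique tie-breaker so paths are never compared
    open_heap = [(h[start], next(tie), 0, 0, [start])]
    while open_heap:
        _, _, g, used, path = heapq.heappop(open_heap)
        node = path[-1]
        if node == goal:
            return g, path
        for succ, cost, overhead in edges.get(node, []):
            if used + overhead > budget:
                continue  # extension would violate the constraint
            g2 = g + cost
            heapq.heappush(
                open_heap,
                (g2 + h[succ], next(tie), g2, used + overhead, path + [succ]),
            )
    return None


if __name__ == "__main__":
    example_edges = {
        "Begin": [("A", 0, 0), ("B", 0, 0)],
        "A": [("End", 4, 10)],  # cheap in cost, expensive in overhead
        "B": [("End", 6, 3)],
    }
    example_h = {"Begin": 4, "A": 4, "B": 6, "End": 0}
    print(constrained_search(example_edges, example_h, budget=5))
```

In the example, the path through A is cheapest in cost but exceeds an overhead budget of 5, so the search returns the path through B. Keeping whole partial paths in the queue is simple but can grow exponentially, which is the kind of cost a dedicated transfer rule with backtracking aims to avoid.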

S4. The modules obtained after pipeline partitioning are synthesized with an existing C-to-RTL synthesis tool and parallelized according to the module parallelism degree; the inputs and outputs of the parallel modules are controlled by multiplexers.

S5. The parallel modules are connected into the overall system through asynchronous FIFOs (First In First Out queues) with level shifters, and the actual voltages and frequencies are assigned according to the VFI assignment result obtained.

The VFI-based C-to-RTL synthesis method of the present invention allows the user to specify constraints on the overall design, filling a gap in automated optimization for C-to-RTL synthesis. More importantly, the proposed method solves the three circuit optimization dimensions of pipeline partitioning, module parallelism and VFI assignment simultaneously in a single optimization process. Compared with methods that optimize step by step in several optimization processes, the method of the present invention guarantees global optimality.

The above embodiments merely illustrate the present invention and do not limit it. Those of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of protection of the present invention.

Claims

1. A C-to-RTL synthesis method based on VFI optimization, characterized by comprising the steps of:

S2. setting the optimization target and constraints;

the optimization target comprising throughput maximization, area minimization and power minimization; the constraints comprising a throughput constraint, an area constraint and a power constraint;

S4. synthesizing the modules obtained after pipeline partitioning and parallelizing them according to the module parallelism degree;

S5. in accordance with the VFI assignment, connecting the parallel modules into the overall system through asynchronous first-in-first-out queues with level shifters, and assigning the actual voltages and frequencies according to the VFI assignment result obtained.

2. The C-to-RTL synthesis method according to claim 1, characterized in that the connection topology of the functions to be synthesized is linear.

3. The C-to-RTL synthesis method according to claim 1, characterized in that the function parameters comprise the function's operation cycle count, data volume, area, power consumption and highest supported frequency.

4. The C-to-RTL synthesis method according to any one of claims 1 to 3, characterized in that, in step S3, the pipeline module partition, the module parallelism degree and the VFI assignment are determined by mixed integer linear programming from the function parameters, the optimization target and the constraints.

5. The C-to-RTL synthesis method according to claim 4, characterized in that step S3 comprises:

S311. calculating, from the function parameters, the parameters of all modules that can possibly be obtained after pipeline partitioning;

S312. building a mixed integer linear programming model from the module parameters, the optimization target and the constraints;

S313. solving the mixed integer linear programming model to obtain a one-dimensional non-negative integer array and a two-dimensional Boolean array;

the two-dimensional Boolean array, combined with the one-dimensional non-negative integer array, representing the voltage-frequency value assigned to each module.

6. The C-to-RTL synthesis method according to any one of claims 1 to 3, characterized in that, in step S3, the pipeline module partition, the module parallelism degree and the VFI assignment are determined by a heuristic algorithm from the function parameters, the optimization target and the constraints.

7. The C-to-RTL synthesis method according to claim 6, characterized in that step S3 comprises:

S321. building a topological graph from the function parameters with area minimization or power minimization as the target, and adding a start node and an end node to the graph; wherein the nodes represent all possible voltage-frequency values of all possible modules at the minimum parallelism degree that satisfies the throughput constraint, and the weights of an edge represent the area and power consumption of the edge's source node;

S322. calculating the shortest distance from each node to the end node;

S323. using each shortest distance as the estimated cost, solving with the A-Star algorithm for the shortest path from the start node to the end node; the shortest distance satisfies the throughput constraint, the area constraint and the power constraint.

8. The C-to-RTL synthesis method according to claim 7, characterized in that step S3 comprises:

step S321 being: building a topological graph from the function parameters with throughput maximization as the target, and adding a start node and an end node to the graph; wherein the nodes represent all possible voltage-frequency values of all possible modules at all possible parallelism degrees, and the weights of an edge represent the area and power consumption of the edge's source node.

9. The C-to-RTL synthesis method according to claim 7 or 8, characterized in that, in step S323, the A-Star algorithm is improved as follows:

the rule for transferring a node from the OPEN set to the CLOSED set is modified: the optimal node that satisfies the constraints is transferred from the OPEN set to the CLOSED set; if no node satisfies the constraints, the algorithm backtracks and revises the previous node of the current path.