CN103077283B - C-to-RTL synthesis method based on VFI optimization - Google Patents

Info

Publication number
CN103077283B
Authority
CN
China
Prior art keywords
module
node
vfi
constraints
rtl
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310016186.5A
Other languages
Chinese (zh)
Other versions
CN103077283A (en
Inventor
李双辰
何鑫宇
刘勇攀
杨华中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201310016186.5A
Publication of CN103077283A
Application granted
Publication of CN103077283B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The present invention relates to the technical field of hardware design automation, and in particular to a C-to-RTL synthesis method for ASIC hardware design that optimizes pipeline partitioning, module parallelism, and VFI allocation. In hardware design, pipelining and parallel structures are two effective means of improving hardware performance, and in large-scale ASIC designs, VFI design can significantly reduce power consumption. The C-to-RTL synthesis method based on VFI optimization of the present invention optimizes pipeline partitioning, module parallelism, and VFI allocation simultaneously during C-to-RTL synthesis. Compared with a method that optimizes in three separate step-by-step passes, the method of the invention ensures global optimality. The invention therefore improves the practicality and range of application of C-to-RTL synthesis and provides strong technical support for hardware design.

Description

C-to-RTL synthesis method based on VFI optimization
Technical field
The present invention relates to the technical field of hardware design automation, and in particular to a C-to-RTL synthesis method for ASIC hardware design that optimizes pipeline partitioning, module parallelism, and VFI (Voltage-Frequency Island) allocation.
Background
In the integrated circuit field, an ASIC (Application-Specific Integrated Circuit) is an integrated circuit designed for a specific purpose. An ASIC targets the needs of a specific user; in volume production, compared with a general-purpose integrated circuit, it offers smaller size, lower power consumption, higher reliability, higher performance, better confidentiality, and lower cost.
C-to-RTL synthesis is very advantageous for ASIC hardware design. C-to-RTL synthesis converts a C program at the algorithm description level directly into an HDL (hardware description language) program at the register transfer level (RTL); it is a form of high-level synthesis in hardware design automation. C-to-RTL synthesis can quickly and automatically complete HDL design work that traditionally takes a great deal of manual effort. In general, C-to-RTL synthesis has the following advantages: (1) it shortens hardware design time and reduces design difficulty, which makes it an effective way to resolve the conflict between rapidly growing hardware design demand and slowly growing design capability; (2) it narrows the gap between software design and hardware design, which supports hardware/software co-design. Because of these advantages, C-to-RTL synthesis has received wide attention in both academia and industry.
However, existing C-to-RTL synthesis techniques still have many unsolved problems, for example: (1) when a large-scale C program is synthesized, the quality of the synthesis result is very unsatisfactory; (2) the user cannot set optimization options or specify constraints on the performance of the synthesis result (throughput, area, power consumption, and so on); (3) system-level power optimization is very unsatisfactory, especially in ASIC design; (4) the technique does not yet support VFI design. The root cause of these problems is mainly that the design and optimization of the high-level or system-level hardware architecture is not considered, and that C cannot express hardware timing, parallelism, architecture, and so on.
In summary, there is an urgent need for a C-to-RTL synthesis method that optimizes pipeline partitioning, module parallelism, and VFI allocation.
Summary of the invention
(1) Technical problem to be solved
The object of the present invention is to provide a C-to-RTL synthesis method that optimizes pipeline partitioning, module parallelism, and VFI allocation. During C-to-RTL synthesis for ASIC design, the method optimizes pipeline partitioning, module parallelism, and VFI allocation simultaneously, thereby improving the practicality and range of application of C-to-RTL synthesis and providing strong technical support for hardware design.
(2) Technical solution
The technical solution of the present invention is as follows:
A C-to-RTL synthesis method based on VFI optimization, comprising the steps of:
S1. synthesizing each function to be synthesized in the C program separately, and obtaining the post-synthesis function parameters;
S2. setting the optimization objective and constraints;
S3. determining the pipeline module partitioning, the module degree of parallelism, and the VFI allocation from the function parameters together with the optimization objective and constraints;
S4. synthesizing the modules obtained after pipeline partitioning, and parallelizing the modules according to the module degree of parallelism;
S5. connecting the parallelized modules into a complete system in accordance with the VFI allocation.
Preferably, the connection topology of the functions to be synthesized is linear.
Preferably, the function parameters include the function's operation cycle count, operation data volume, area, power consumption, and the highest frequency it supports.
Preferably, the optimization objective includes throughput maximization, area minimization, and power minimization; the constraints include a throughput constraint, an area constraint, and a power constraint.
Preferably, in step S3, the pipeline module partitioning, module degree of parallelism, and VFI allocation are determined by mixed-integer linear programming (MILP) from the function parameters together with the optimization objective and constraints.
Preferably, step S3 comprises:
S311. calculating, from the function parameters, the parameters of every module that could be obtained by pipeline partitioning;
S312. building a mixed-integer linear programming model from the module parameters together with the optimization objective and constraints;
S313. solving the mixed-integer linear programming model to obtain a one-dimensional non-negative integer array and a two-dimensional Boolean array;
A zero in position n of the one-dimensional non-negative integer array indicates that the n-th function and the (n+1)-th function connected to it are placed in the same module; a non-zero value in position n gives the degree of parallelism of the module containing the n-th function;
The two-dimensional Boolean array, combined with the one-dimensional non-negative integer array, gives the voltage-frequency value of each module.
Preferably, in step S3, the pipeline module partitioning, module degree of parallelism, and VFI allocation are determined by a heuristic algorithm from the function parameters together with the optimization objective and constraints.
Preferably, step S3 comprises:
S321. building a topological graph from the function parameters, with area minimization or power minimization as the objective, and adding a start node and an end node to the graph; each node represents one possible voltage-frequency value of one possible module at the minimum degree of parallelism that satisfies the throughput constraint, and each edge weight represents the area and power consumption of the edge's source node;
S322. calculating the shortest distance from each node to the end node;
S323. using each shortest distance as the estimated cost, solving for the shortest path from the start node to the end node with the A-Star algorithm; the shortest path satisfies the throughput constraint, the area constraint, and the power constraint.
Preferably, step S321 is instead: building a topological graph from the function parameters, with throughput maximization as the objective, and adding a start node and an end node to the graph; each node represents one possible voltage-frequency value of one possible module at one possible degree of parallelism, and each edge weight represents the area and power consumption of the edge's source node.
Preferably, in step S323, the A-Star algorithm is modified as follows:
The rule for moving a node from the OPEN set to the CLOSED set is changed: the best node that satisfies the constraints is moved from the OPEN set to the CLOSED set; if no node satisfies the constraints, the search backtracks and revises the previous node of the current path.
(3) Beneficial effects
In hardware design, pipelining and parallel structures are two effective means of improving hardware performance, and in large-scale ASIC designs, VFI design can significantly reduce power consumption. The C-to-RTL synthesis method based on VFI optimization of the present invention optimizes pipeline partitioning, module parallelism, and VFI allocation simultaneously during C-to-RTL synthesis. Compared with a method that optimizes in three separate step-by-step passes, the method of the invention ensures global optimality. The invention therefore improves the practicality and range of application of C-to-RTL synthesis and provides strong technical support for hardware design.
Brief description of the drawings
Fig. 1 is a flow chart of the C-to-RTL synthesis method based on VFI optimization of the present invention;
Fig. 2 is an example of the topological graph used to solve the optimization problem of Fig. 1 with the heuristic algorithm;
Fig. 3 is a flow chart of solving the optimization problem of Fig. 1 with the heuristic algorithm;
Fig. 4 is a flow chart of the sub-function that moves a node from the OPEN set to the CLOSED set in the heuristic algorithm of Fig. 1.
Detailed description of the embodiments
Specific embodiments of the invention are further described below with reference to the drawings and examples. The following examples only illustrate the invention and do not limit its scope.
As shown in the flow chart of Fig. 1, the C-to-RTL synthesis method with pipeline partitioning and module parallelism optimization mainly comprises the following steps:
S1. Using an existing C-to-RTL tool, synthesize in advance each function to be synthesized in the input C program, then extract or calculate the post-synthesis function parameters. The C program is required to consist of N functions whose connection topology is linear; this requirement can be met by adjusting the coding style of the C program. The function parameters include the function's operation cycle count, operation data volume, area, power consumption, and the highest supported frequency, as listed in Table 1. After a function is synthesized by the C-to-RTL tool, front-end simulation gives the operation cycle count and operation data volume, ASIC logic synthesis gives the area, a power analysis tool run after back-end simulation gives the power consumption, and the highest supported frequency is taken from the logic synthesis report.
Table 1: Parameters of the n-th function F_n
Parameter             Description of the parameter for the n-th function F_n
N                     Total number of functions in the C program
T_n                   Total time for F_n to complete one operation (cycles)
T_n^in / T_n^out      Input / output time F_n needs to complete one operation (cycles)
S_n^in / S_n^out      Input / output data volume F_n needs to complete one operation (byte)
A_n^le / A_n^mem      Area of F_n: logic resources, memory resources (um^2)
CF_n^max              Highest clock frequency F_n can reach (MHz)
{P_{n,k}}             Power consumption of F_n at voltage-frequency point (V_k, CF_k), a 1*K array (mW)
S2. Set the optimization objective and constraints. The optimization objective includes throughput maximization, area minimization, power minimization, and so on; the constraints include a throughput constraint, an area constraint, a power constraint, and so on. Some user-specified system parameters are also included; these preset system parameters are parameters of the complete system that is finally obtained, and are listed in Table 2.
Table 2: System parameters
Parameter             Description
R_req                 Overall throughput constraint (byte/us)
A_req                 Overall area constraint (um^2)
P_req                 Overall power constraint (mW)
OA_para               Area overhead of parallelization (um^2)
OP_para               Power overhead of parallelization (mW)
A_fifo                Area occupied by the FIFO per byte (um^2/byte)
P_fifo                Power consumed by the FIFO per byte (mW/byte)
ε_A / ε_P             Area / power saving parameter when functions are placed in the same module
{(V_k, CF_k)}         Available voltage-frequency points of the VFI (V, MHz)
K                     Number of available voltage-frequency points of the VFI
S3. From the function parameters together with the optimization objective and constraints, determine the optimized pipeline module partitioning strategy, module degree of parallelism, and VFI allocation result. That is, determine which functions should be grouped into one module as the same pipeline stage (pipeline partitioning), for example dividing N functions into M modules; how many parallel copies each module needs (module parallelism); and which voltage and frequency each parallelized module should be mapped to for the best power consumption (VFI allocation), for example selecting the power-optimal group among K voltage-frequency groups.
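As a rough illustration of the search space that S3 has to cover, the sketch below enumerates every way to split N linearly connected functions into contiguous modules. The name `linear_partitions` is hypothetical, and the sketch covers only pipeline partitioning, not the parallelism or VFI choices.

```python
from itertools import product


def linear_partitions(n):
    """Yield every split of functions 1..n into contiguous modules.

    Each partition is a list of (first_fn, last_fn) pairs, 1-based inclusive.
    """
    # One yes/no decision per boundary between function fn and fn + 1.
    for cuts in product([False, True], repeat=n - 1):
        modules, start = [], 1
        for fn, cut in enumerate(cuts, start=1):
            if cut:
                modules.append((start, fn))
                start = fn + 1
        modules.append((start, n))  # the last module always ends at function n
        yield modules


parts = list(linear_partitions(3))
for p in parts:
    print(p)
```

For N functions there are 2^(N-1) such partitions before the parallelism and voltage-frequency choices multiply the count, which suggests why exhaustive enumeration quickly becomes impractical.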
This step can be implemented in several ways; this embodiment describes a mixed-integer linear programming (MILP) method and a heuristic algorithm as examples.
For example, in step S3, the pipeline module partitioning, module degree of parallelism, and VFI allocation are determined by mixed-integer linear programming from the function parameters together with the optimization objective and constraints. This mainly comprises:
S311. Calculate, from the function parameters, the parameters of every module that could be obtained by pipeline partitioning. From the parameters of the functions F_n in step S1, the parameters of each module B_{i,j} that pipeline partitioning can produce are calculated, where module B_{i,j} is the module composed of functions F_i, F_{i+1}, ..., F_j. The parameters of module B_{i,j} mainly include module throughput, module area, and the parallelism upper limit, as listed in Table 3.
Table 3: Parameters of module B_{i,j}
Parameter                     Description of the parameter for module B_{i,j}
T_{i,j}                       Total time for B_{i,j} to complete one operation (cycles)
T_{i,j}^in / T_{i,j}^out      Input / output time B_{i,j} needs to complete one operation (cycles)
S_{i,j}^in / S_{i,j}^out      Input / output data volume B_{i,j} needs to complete one operation (byte)
A_{i,j}^le / A_{i,j}^mem      Area of B_{i,j}: logic resources, memory resources (um^2)
UP_{i,j}                      Parallelism upper limit of B_{i,j}
The parameters in Table 3 are calculated as follows:
$$T_{i,j} = T_i^{in} + T_j^{out} + \sum_{n=i}^{j} \left( T_n - T_n^{in} - T_n^{out} \right) \tag{1}$$
$$A_{i,j}^{le} = \begin{cases} \sum_{n=i}^{j} A_n^{le} \cdot (1 - \varepsilon_A) & i < j \\ A_i^{le} & i = j \end{cases} \tag{2}$$
$$A_{i,j}^{mem} = \max\{ A_m^{mem} \} \quad \forall m \in [i, j] \tag{3}$$
where ε_A is defined in Table 2;
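A minimal sketch of formulas (1)-(3), assuming the function parameters are held in a small record type. The names `FuncParams` and `module_params` and the sample numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class FuncParams:
    """Post-synthesis parameters of one function F_n (a subset of Table 1)."""
    t: int        # T_n: total cycles per operation
    t_in: int     # T_n^in: input cycles
    t_out: int    # T_n^out: output cycles
    a_le: float   # A_n^le: logic area (um^2)
    a_mem: float  # A_n^mem: memory area (um^2)


def module_params(funcs, i, j, eps_a):
    """Return (T, A_le, A_mem) of module B_{i,j}; i and j are 1-based, inclusive."""
    members = funcs[i - 1:j]
    # Formula (1): input time of the first function, output time of the last,
    # plus the time of every member with its own I/O time removed.
    t = members[0].t_in + members[-1].t_out + sum(
        f.t - f.t_in - f.t_out for f in members)
    # Formula (2): summed logic area, scaled by (1 - eps_a) when functions merge.
    a_le = sum(f.a_le for f in members)
    if i < j:
        a_le *= (1 - eps_a)
    # Formula (3): the largest memory area among the members.
    a_mem = max(f.a_mem for f in members)
    return t, a_le, a_mem


funcs = [
    FuncParams(t=100, t_in=10, t_out=20, a_le=1000.0, a_mem=400.0),
    FuncParams(t=80, t_in=20, t_out=10, a_le=2000.0, a_mem=600.0),
]
merged = module_params(funcs, 1, 2, eps_a=0.25)
single = module_params(funcs, 1, 1, eps_a=0.25)
print(merged, single)
```

For a single-function module, formula (1) reduces to T_n itself, which is a quick sanity check on the reconstruction.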
S312. From the module parameters together with the optimization objective and constraints, define suitable variables and give a mathematical description of the problem to be solved, then build a mixed-integer linear programming model from that description; the model can be solved with an existing integer linear programming tool. In this embodiment, this specifically comprises:
1) Variable definitions:
The variables are defined in Table 4. The definitions of the variables {x_n} and {c_{n,k}} are the core. The one-dimensional array {x_n} (1*N) represents the partitioning and parallelism result; each of its elements is a non-negative integer. {x_n} expresses pipeline partitioning and module parallelism as follows: a zero in position n indicates that the n-th function and the (n+1)-th function connected to it are placed in the same module; a non-zero value in position n gives the degree of parallelism of the module containing the n-th function and also indicates that the n-th function and the (n+1)-th function connected to it are not placed in the same module. The two-dimensional Boolean array {c_{n,k}} (N*K) represents the VFI allocation result; each element of {c_{n,k}} is 0 or 1. Given the result {x_n}, if (x_n c_{n,k}) is greater than zero, the VFI allocation of the module containing function n is the k-th voltage-frequency group, i.e. (V_k, CF_k). In addition, the temporary variable {y_{i,j}} (N*N) is a two-dimensional array whose elements are 0 or 1; {y_{i,j}} temporarily expresses the linear partitioning of the system: for example, y_{i,j} = 1 indicates that module B_{i,j} exists in the partitioned system, and y_{i,j} = 0 indicates that it does not.
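The encoding of {x_n} and {c_{n,k}} can be made concrete with a small decoder. This is an illustrative sketch: the name `decode_solution` is hypothetical, and it assumes that a module's voltage-frequency point is read from the row of c belonging to the module's last function, the one position where x_n is non-zero.

```python
def decode_solution(x, c):
    """Decode the result arrays into a list of modules.

    x: length-N list of non-negative ints. x[n] == 0 means function n is
       merged with function n + 1; a non-zero x[n] closes a module and gives
       its degree of parallelism.
    c: N x K list of 0/1 rows, exactly one 1 per row.
    Returns tuples (first_fn, last_fn, parallelism, vf_index), all 1-based.
    """
    modules, start = [], 1
    for n, xn in enumerate(x, start=1):
        if xn == 0:
            continue  # function n shares a module with function n + 1
        vf_index = c[n - 1].index(1) + 1  # k for which x_n * c_{n,k} > 0
        modules.append((start, n, xn, vf_index))
        start = n + 1
    if start != len(x) + 1:
        raise ValueError("the last entry of x must be non-zero")
    return modules


x = [0, 2, 1]
c = [[1, 0], [0, 1], [1, 0]]
decoded = decode_solution(x, c)
print(decoded)  # [(1, 2, 2, 2), (3, 3, 1, 1)]
```

Here functions 1 and 2 form one module with two parallel copies at voltage-frequency point 2, and function 3 forms a second module with one copy at point 1.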
Table 4: Variable definitions
2) Mathematical description:
Using the variables in Table 4 and the parameters in Tables 1-3, the problem of optimizing pipeline partitioning, module parallelism, and VFI allocation can be abstracted into the following mathematical expressions.
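$$\begin{aligned} \text{obj:}\quad & \max r_{all} \ \text{or}\ \min a_{all} \ \text{or}\ \min p_{all} \\ \text{s.t.:}\quad & a_{all} \le A_{req} \ \text{and}\ r_{all} \ge R_{req} \ \text{and}\ p_{all} \le P_{req} \end{aligned} \tag{4}$$
where the throughput r_all, the area a_all, and the power consumption p_all can be expressed with the variables in Table 4 and the parameters in Tables 1-3, as follows:
Expression for the throughput r_all:
$$r_{all} \le r_{i,j} \quad \text{only if } y_{i,j} = 1 \tag{5}$$
$$r_{i,j} = \begin{cases} x_j y_{i,j} \, cf_{i,j} / T_{i,j} & x_j y_{i,j} < UP_{i,j} \\ cf_{i,j} / \max\{ T_{i,j}^{in}, T_{i,j}^{out} \} & \text{otherwise} \end{cases} \tag{6}$$
$$cf_{i,j} = \sum_{k=1}^{K} c_{j,k} \cdot CF_k \tag{7}$$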
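Formulas (5)-(7) can be evaluated directly for a fixed candidate solution. The sketch below does so for modules that are present in the partition (y_{i,j} = 1); the function names and sample values are hypothetical, and the sketch evaluates the formulas rather than solving the optimization.

```python
def module_frequency(c_row, cf_points):
    """Formula (7): clock frequency selected by the one-hot row c_{j,k}."""
    return sum(c * cf for c, cf in zip(c_row, cf_points))


def module_throughput(x_j, cf, t, t_in, t_out, up):
    """Formula (6) for a module that exists in the partition (y_{i,j} = 1)."""
    if x_j < up:
        return x_j * cf / t           # first case of formula (6)
    return cf / max(t_in, t_out)      # second case of formula (6)


def system_throughput(module_rates):
    """Formula (5): r_all is bounded by the rate of every selected module."""
    return min(module_rates)


cf = module_frequency([0, 1], [100.0, 200.0])
rates = [
    module_throughput(2, cf, t=100, t_in=10, t_out=20, up=5),
    module_throughput(5, cf, t=100, t_in=10, t_out=20, up=5),
]
r_all = system_throughput(rates)
print(cf, rates, r_all)
```

Taking the minimum over the selected modules gives the largest r_all that formula (5) allows, which is the value a throughput-maximizing objective would reach.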
Expression for the area a_all:
$$a_{all} = a_{pe} + a_{fifo} \tag{8}$$
$$a_{pe} = \sum_{i=1}^{N} \sum_{j=i}^{N} \left( (x_j - 1) \, OA_{para} + x_j \left( A_{i,j}^{le} + A_{i,j}^{mem} \right) \right) y_{i,j} \tag{9}$$
$$a_{fifo} = A_{fifo} \sum_{i=1}^{N-1} \sum_{j=i}^{N} x_j y_{i,j} S_j^{out} \tag{10}$$
Expression for the power consumption p_all:
$$p_{all} = p_{pe} + p_{fifo} \tag{11}$$
$$p_{pe} = \sum_{i=1}^{N} \sum_{j=i}^{N} \left( (x_j - 1) \, OP_{para} + x_j \left( p_{i,j}^{le} + p_{i,j}^{mem} \right) \right) y_{i,j} \tag{12}$$
$$p_{fifo} = P_{fifo} \cdot \sum_{i=1}^{N-1} \sum_{j=i}^{N} (x_j y_{i,j}) S_j^{out} \tag{13}$$
where:
$$p_{i,j}^{le} = \begin{cases} \sum_{k=1}^{K} \left( c_{j,k} \sum_{n=i}^{j} P_{n,k}^{le} \, T_n / T_{i,j} \right) (1 + \varepsilon_P) & i < j \\ \sum_{k=1}^{K} \left( c_{i,k} \cdot P_{i,k}^{le} \right) & i = j \end{cases} \tag{14}$$
$$p_{i,j}^{mem} = \sum_{k=1}^{K} \left( c_{j,k} \max\{ P_{n,k}^{mem} \} \right) \quad \forall i \le n \le j \tag{15}$$
Connection constraints: besides the area, throughput, and power constraints shown in formula 4, there are some constraints on the connection relations, as follows:
$$\begin{aligned} & \sum_{i=1}^{n} y_{i,n} \le x_n \\ & x_n = 1 \ \text{when} \ \sum_{i=1}^{n} y_{i,n} = 0 \\ & x_n \in \mathbb{N}, \quad y_{i,j} \in \text{binary} \end{aligned} \tag{16}$$
$$\sum_{i=1}^{j-1} y_{i,j-1} = \sum_{i=j}^{N} y_{j,i} \quad \forall j \in [2, N] \tag{17}$$
$$\sum_{i=1}^{j} y_{i,j} + \sum_{i=j}^{N} y_{j,i} - u_{j,j} \le 1 \quad \forall j \in [1, N] \tag{18}$$
$$1 \le \sum_{j=1}^{N} \sum_{i=1}^{j} y_{i,j} \le N \tag{19}$$
$$y_{i,j} \left( \sum_{k=1}^{K} c_{j,k} CF_k \right) \le \min\{ CF_n^{max} \} \quad \forall i \le n \le j \tag{20}$$
$$\sum_{k=1}^{K} c_{n,k} = 1 \quad \forall n \in [1, N], \quad c_{n,k} \in \text{binary} \tag{21}$$
3) Linearizing the problem:
Formulas 4-21 above contain many nonlinear terms. The present invention converts them into linear expressions by the following methods, so that the whole problem can be solved by mixed-integer linear programming;
Linearizing x_j y_{i,j}: substitute z_{i,j} = x_j y_{i,j} and apply the following linear constraints to z_{i,j}:
$$\begin{aligned} -M y_{i,j} &\le z_{i,j} \le M y_{i,j} \\ x_j - M (1 - y_{i,j}) &\le z_{i,j} \le x_j + M (1 - y_{i,j}) \end{aligned} \tag{22}$$
where M is an integer larger than every data value in the optimization problem;
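The big-M constraints of formula (22) can be checked by brute force: for a binary y and a bounded x, the only integer z they admit should be the product x*y. The sketch below performs that check; the function name `feasible_z` and the value M = 100 are assumptions made for illustration.

```python
def feasible_z(x, y, big_m):
    """Integer values of z allowed by formula (22) for fixed x and binary y."""
    lo = max(-big_m * y, x - big_m * (1 - y))
    hi = min(big_m * y, x + big_m * (1 - y))
    return list(range(lo, hi + 1))


M = 100  # must exceed every value x can take
checks = {(x, y): feasible_z(x, y, M) for x in range(0, 5) for y in (0, 1)}
for (x, y), zs in checks.items():
    print(x, y, zs)
```

When y = 0 the first pair of bounds pins z to 0, and when y = 1 the second pair pins z to x, so the feasible set always collapses to the single value x*y.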
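Linearizing x_j y_{i,j} c_{j,k}: substitute w_{i,j,k} = z_{i,j} c_{j,k} and apply the following linear constraints to w_{i,j,k}:
$$\begin{aligned} 0 &\le w_{i,j,k} \le M c_{j,k} \\ z_{i,j} - M (1 - c_{j,k}) &\le w_{i,j,k} \le z_{i,j} + M (1 - c_{j,k}) \end{aligned} \tag{23}$$
where M is an integer larger than every data value in the optimization problem;
Linearizing y_{i,j} c_{j,k}: substitute u_{i,j,k} = y_{i,j} c_{j,k} and apply the following linear constraints to u_{i,j,k}:
$$\begin{aligned} 0 &\le u_{i,j,k} \le c_{j,k} \\ 0 &\le u_{i,j,k} \le y_{i,j}, \quad u_{i,j,k} \in \text{binary} \end{aligned} \tag{24}$$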
Linearizing formula 4 above:
$$r_{all} \le r_{i,j} + M (1 - y_{i,j}) \quad \forall \, 1 \le i \le j \le N \tag{25}$$
Linearizing formula 5 above:
$$r_{i,j} \le \begin{cases} x_j y_{i,j} \, cf_{i,j} / T_{i,j} & x_j y_{i,j} < UP_{i,j} \\ cf_{i,j} / \max\{ T_{i,j}^{in}, T_{i,j}^{out} \} & \text{otherwise} \end{cases} \tag{26}$$
Linearizing formula 9 above:
$$\sum_{i=1}^{n} y_{i,n} \le x_n \le M \cdot \sum_{i=1}^{n} y_{i,n}, \quad x_n \in \mathbb{N} \tag{27}$$
S313. Solve the model to obtain the one-dimensional non-negative integer array {x_n} and the two-dimensional Boolean array {c_{n,k}}, which are the final result.
As another example, considering the limitations of the mixed-integer linear programming method in step S3 for large-scale computation, the present invention also proposes a heuristic algorithm for the problem of optimizing pipeline partitioning, module-level parallelism, and VFI allocation. The heuristic algorithm converts the optimization problem into the basic problem of finding a shortest path in a topological graph, then preprocesses the graph, and finally solves it with the A-Star algorithm (the A-Star algorithm is an effective method for finding a shortest path in a static road network; the present invention modifies the existing A-Star algorithm, as described in step S323). In this embodiment, this mainly comprises:
S321. Build a topological graph from the function parameters; the graph is a directed acyclic graph. In this embodiment, the graph can be built in two ways, described in turn below:
1) Build the topological graph with area minimization or power minimization as the objective. Each node represents one possible voltage-frequency value (V_k, CF_k) of one possible module B_{i,j} at the minimum degree of parallelism that satisfies the throughput constraint (R_req); there are K*C(N+1, 2) nodes in total. Connections between nodes follow the linking relation of modules, i.e. B_{i,j} is connected to B_{j+1,m}. Each edge carries two weights, the area and the power consumption of its source node, i.e. the area and power consumption of B_{i,j} after parallelization and VFI allocation are taken into account. When the area weight exceeds the area constraint A_req, or the power weight exceeds the power constraint P_req, the source node is deleted from the graph. Finally, a start node (the Begin node in the figure) connected to every B_{1,j} and an end node (the End node in the figure) connected to every B_{i,N} are added, which completes the graph. Fig. 2 shows the graph formed for N=3, K=2.
The problem is thus converted into finding the shortest path by area weight while ensuring that the path's power weight is below the power constraint, or finding the shortest path by power weight while ensuring that the path's area weight is below the area constraint.
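To make the node count and the linking rule concrete, the sketch below enumerates the graph nodes for the N=3, K=2 case. The names `graph_nodes` and `successors` are hypothetical, and the sketch omits edge weights and the deletion of nodes that violate A_req or P_req.

```python
from math import comb


def graph_nodes(n_funcs, n_vf):
    """List (i, j, k) nodes: module B_{i,j} at voltage-frequency point k."""
    return [(i, j, k)
            for i in range(1, n_funcs + 1)
            for j in range(i, n_funcs + 1)
            for k in range(1, n_vf + 1)]


def successors(node, nodes):
    """Nodes that B_{i,j} links to: every B_{j+1,m} at any voltage-frequency point."""
    _, j, _ = node
    return [v for v in nodes if v[0] == j + 1]


nodes = graph_nodes(3, 2)
print(len(nodes), 2 * comb(4, 2))   # both equal K * C(N+1, 2)
print(successors((1, 1, 1), nodes))
```

Nodes with i = 1 would be attached to the Begin node and nodes with j = N to the End node, so every Begin-to-End path corresponds to one complete partition with a voltage-frequency choice per module.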
2) Build the topological graph with throughput maximization as the objective; the graph is a directed acyclic graph. Each node represents one possible voltage-frequency value (V_k, CF_k) of one possible module B_{i,j} at one possible degree of parallelism within the parallelism upper limit UP_{i,j}; there are at most max{UP_{i,j}}*K*C(N+1, 2) nodes. Connections between nodes follow the linking relation of modules, i.e. B_{i,j} is connected to B_{j+1,m}. Each edge carries two weights, the area and the power consumption of its source node, i.e. the area and power consumption of B_{i,j} after parallelization and VFI allocation are taken into account. When the area weight exceeds the area constraint A_req, or the power weight exceeds the power constraint P_req, the source node is deleted from the graph. A start node (the Begin node in the figure) connected to every B_{1,j} and an end node (the End node in the figure) connected to every B_{i,N} are added, which completes the graph.
The problem is thus converted into finding the path with the maximum throughput while ensuring that both the power weight and the area weight of the path are below their constraints.
S322. Preprocess the topological graph by calculating the shortest distance from each node to the end node. The shortest distances by area, power, and throughput weight are used by the modified A-Star algorithm as the estimated cost h(n) of the best path from that node to the destination node. This calculation is carried out with Dijkstra's algorithm or another algorithm.
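One way to carry out this preprocessing is to run Dijkstra's algorithm on the reversed graph, starting from the end node. The sketch below assumes a single non-negative weight per edge and a dictionary-based graph; the function name and the toy graph are hypothetical.

```python
import heapq


def distances_to_end(edges, end):
    """Shortest distance from every node to `end`, via Dijkstra on reversed edges.

    edges: dict mapping node -> list of (successor, weight), with weight >= 0.
    """
    reverse = {}
    for u, outs in edges.items():
        for v, w in outs:
            reverse.setdefault(v, []).append((u, w))
    dist = {end: 0}
    heap = [(0, end)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry, a shorter distance was already found
        for u, w in reverse.get(v, []):
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


edges = {
    "Begin": [("A", 0), ("B", 0)],
    "A": [("C", 5), ("End", 9)],
    "B": [("C", 2)],
    "C": [("End", 3)],
}
h = distances_to_end(edges, "End")
print(h)
```

The resulting table plays the role of h(n): it is an exact lower bound on the remaining cost when constraints are ignored, so it never overestimates the cost of a constrained path.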
S323. In the modified A-Star algorithm, f(n) is the evaluation function of the path from the initial node via node n to the destination node, g(n) is the actual cost in the state space from the start node to node n, and h(n) is the estimated cost of the best path from n to the destination node. In this embodiment, the preprocessing result is used as the estimated cost h(n) of the best path from node V (any node) to the End node; the length of the existing path from the Begin node to node V is g(n). When maximizing throughput, f(n) = min{g(n), h(n)}; when minimizing area or power, f(n) = g(n) + h(n). Overhead(V) is the consumption of the existing path from the Begin node to node V (when minimizing area, this consumption is power; when minimizing power, it is area; when maximizing throughput, it is both area and power). The algorithm solves for the shortest path in the graph that satisfies the constraints; its flow chart is shown in Fig. 3. The function NodeStructure* nodeOPEN_to_CLOSED(SetStructure* OPEN, SetStructure* CLOSED, ParameterStructure* Para) is the invention's main modification of the A-Star algorithm. It selects a suitable node from the OPEN set to move to the CLOSED set (in the traditional A-Star algorithm, the OPEN set holds all nodes that have been generated but not yet examined, and the CLOSED set records the nodes that have been visited). The function moves the best node V that satisfies the constraints from the OPEN set to the CLOSED set; if no node satisfies the constraints, it backtracks and revises the previous node of the current path. The detailed flow is shown in Fig. 4.
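The idea of a cost-ordered search that rejects constraint-violating paths can be sketched as follows for the area-minimization case, with f(n) = g(n) + h(n) and power as the overhead. This is a simplified path-based variant written for illustration, not the OPEN_to_CLOSED routine of Fig. 4: it keeps whole partial paths in the OPEN heap instead of backtracking explicitly, and all names and numbers are hypothetical.

```python
import heapq


def constrained_a_star(edges, h, begin, end, budget):
    """Cheapest begin->end path whose accumulated overhead stays within budget.

    edges: node -> list of (successor, cost, overhead)
    h:     node -> lower bound on the remaining cost to `end`
    Returns (cost, overhead, path), or None when no path meets the budget.
    """
    open_heap = [(h[begin], 0, 0, [begin])]  # entries are (f, g, overhead, path)
    while open_heap:
        f, g, used, path = heapq.heappop(open_heap)
        node = path[-1]
        if node == end:
            return g, used, path
        for nxt, cost, overhead in edges.get(node, []):
            if used + overhead > budget:
                continue  # extension would violate the constraint, drop it
            g2 = g + cost
            heapq.heappush(
                open_heap, (g2 + h[nxt], g2, used + overhead, path + [nxt]))
    return None


# Toy graph: cost is area, overhead is power.
edges = {
    "Begin": [("A", 0, 0), ("B", 0, 0)],
    "A": [("End", 4, 9)],
    "B": [("End", 6, 3)],
}
h = {"Begin": 4, "A": 4, "B": 6, "End": 0}
tight = constrained_a_star(edges, h, "Begin", "End", budget=5)
loose = constrained_a_star(edges, h, "Begin", "End", budget=10)
print(tight, loose)
```

With a power budget of 5 the cheaper-area path through A is rejected and the search falls back to B; with a budget of 10 the path through A is returned.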
S4. Synthesize the modules obtained after pipeline partitioning with an existing C-to-RTL synthesis tool, and parallelize the modules according to the module degree of parallelism; the input and output sections of parallel modules are controlled by multiplexers.
S5. Connect the parallelized modules into a complete system through asynchronous FIFOs (First In First Out queues) with level shifters, and assign the actual voltages and frequencies according to the VFI allocation result obtained.
The C-to-RTL synthesis method based on VFI optimization of the present invention lets the user specify constraints on the global design, filling a gap in automated optimization for C-to-RTL synthesis. More importantly, the method solves three circuit optimization dimensions, namely pipeline partitioning, module parallelism, and VFI allocation, simultaneously in a single optimization process. Compared with a method that optimizes in several separate step-by-step passes, the method of the invention ensures global optimality.
The above embodiments only illustrate the invention and do not limit it. Those of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the invention, so all equivalent technical solutions also fall within the scope of protection of the invention.

Claims (9)

1. a C-to-RTL integrated approach of optimizing based on VFI, is characterized in that, comprisesStep:
S1. respectively in comprehensive c program each until comprehensive function and obtain comprehensive after function ginsengNumber;
S2. set optimization aim and constraints;
S3. in conjunction with described function parameter and optimization aim and constraints, determine streamline mouldPiece division, module degree of concurrence and VFI distribute;
Described optimization aim comprises that throughput maximizes, area minimizes and minimise power consumption;Described constraints comprises throughput constraint, area-constrained and power constraints;
S4. comprehensive streamline obtains module and carries out according to described module degree of concurrence after dividingModule is parallel;
S5. distribute parallel modules is passed through with level translator in conjunction with described VFIAsynchronous First Input First Output mode be connected to total system, and distribute knot according to the VFI that obtainsReally, distribute virtual voltage, frequency.
2. C-to-RTL integrated approach according to claim 1, is characterized in that, instituteState and treat that the connection topological relation of comprehensive function is linear pattern.
3. The C-to-RTL synthesis method according to claim 1, characterized in that the function parameters comprise the function's operating cycle count, the amount of data it processes, its area, its power consumption, and the highest frequency it supports.
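As an illustrative sketch only, the function parameters of claim 3 and the constraints of claim 1 could be represented as plain records. The field names, the absence of units, and the `satisfies` helper are assumptions introduced here for illustration; they do not appear in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionParams:
    """Post-synthesis parameters of one function (hypothetical field names)."""
    cycles: int        # operating cycle count
    data_volume: int   # amount of data processed
    area: float        # synthesized area
    power: float       # power consumption
    f_max: float       # highest supported clock frequency


@dataclass(frozen=True)
class Constraints:
    """Global design constraints set by the user in step S2."""
    min_throughput: float
    max_area: float
    max_power: float


def satisfies(throughput: float, area: float, power: float, c: Constraints) -> bool:
    """Check a candidate design against the three constraints of claim 1."""
    return (throughput >= c.min_throughput
            and area <= c.max_area
            and power <= c.max_power)
```

A candidate design produced in step S3 would be accepted only if `satisfies` returns `True` for its estimated throughput, area, and power.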
4. The C-to-RTL synthesis method according to any one of claims 1-3, characterized in that, in step S3, the pipeline module partitioning, the module degree of parallelism, and the VFI assignment are determined by mixed-integer linear programming, in combination with the function parameters, the optimization objective, and the constraints.
5. The C-to-RTL synthesis method according to claim 4, characterized in that step S3 comprises:
S311. computing, from the function parameters, the parameters of all modules that may be obtained after pipeline partitioning;
S312. constructing a mixed-integer linear programming model from the module parameters, the optimization objective, and the constraints;
S313. solving the mixed-integer linear programming model to obtain a one-dimensional non-negative integer array and a two-dimensional Boolean array;
a zero at position n of the one-dimensional non-negative integer array indicating that the n-th function and the (n+1)-th function connected to it are partitioned into the same module; a non-zero value at position n indicating the degree of parallelism of the module containing the n-th function;
the two-dimensional Boolean array, in combination with the one-dimensional non-negative integer array, indicating the voltage-frequency value corresponding to each module.
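The encoding of the solution in step S313 can be made concrete with a short decoding sketch. It assumes 0-based positions and, since a zero merges a function with its successor, that the non-zero entry sits at the last function of each module. The layout of the Boolean array (one row per function position, one column per candidate voltage-frequency level, exactly one `True` in the row that closes a module) is an assumption made here; the claim does not fix it. The names `decode_solution`, `vf_select`, and `vf_levels` are hypothetical.

```python
from typing import Dict, List, Tuple


def decode_solution(n: List[int],
                    vf_select: List[List[bool]],
                    vf_levels: List[Tuple[float, float]]) -> List[Dict]:
    """Turn the two solver arrays into a list of modules.

    n[i] == 0 : function i is merged with function i + 1.
    n[i] > 0  : function i closes a module whose degree of parallelism is n[i].
    vf_select : assumed one-hot row per closing position, choosing a (V, f) level.
    """
    modules = []
    current = []  # functions collected for the module being built
    for i, p in enumerate(n):
        current.append(i)
        if p != 0:
            row = vf_select[i]
            if sum(row) != 1:
                raise ValueError(f"row {i} must select exactly one V/f level")
            modules.append({
                "functions": current,
                "parallelism": p,
                "vf": vf_levels[row.index(True)],
            })
            current = []
    if current:
        raise ValueError("the last entry of n must be non-zero")
    return modules


# Three functions, two candidate (voltage, frequency) levels.
example_modules = decode_solution(
    n=[0, 2, 1],
    vf_select=[[False, False], [True, False], [False, True]],
    vf_levels=[(1.0, 200e6), (0.8, 100e6)],
)
```

In this example, functions 0 and 1 form one module replicated twice at the first level, and function 2 forms a second, non-replicated module at the second level.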
6. The C-to-RTL synthesis method according to any one of claims 1-3, characterized in that, in step S3, the pipeline module partitioning, the module degree of parallelism, and the VFI assignment are determined by a heuristic algorithm, in combination with the function parameters, the optimization objective, and the constraints.
7. The C-to-RTL synthesis method according to claim 6, characterized in that step S3 comprises:
S321. building a topology graph from the function parameters, with area minimization or power minimization as the objective, and adding a start node and an end node to the topology graph; wherein the nodes represent all possible voltage-frequency values of all possible module partitions at the minimum degree of parallelism that satisfies the throughput constraint, and the weight of an edge represents the area and power consumption of the source node it leaves;
S322. computing the shortest distance from each node to the end node;
S323. taking each shortest distance as the estimated cost, solving with the A-Star algorithm for the shortest path from the start node to the end node; the shortest distance satisfying the throughput constraint, the area constraint, and the power constraint.
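Steps S322 and S323 can be sketched as a backward shortest-distance pass followed by an A-Star search that uses those distances as its estimated cost. This is a minimal sketch under stated assumptions: each edge carries a single scalar weight (the claim's area and power are assumed to be combined or chosen according to the objective), node names such as `"A@hi"` are hypothetical, and the constraint checks of claim 7 are omitted.

```python
import heapq


def distances_to_end(graph, end):
    """S322: shortest distance from every node to `end` (Dijkstra on reversed edges)."""
    reverse = {}
    for u, edges in graph.items():
        for v, w in edges:
            reverse.setdefault(v, []).append((u, w))
    dist = {end: 0.0}
    heap = [(0.0, end)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in reverse.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def a_star(graph, start, end, h):
    """S323: A-Star from `start` to `end` with estimated cost h[node]."""
    if start not in h:
        return None  # end is unreachable from start
    open_heap = [(h[start], 0.0, start, [start])]
    closed = set()
    while open_heap:
        _, g, u, path = heapq.heappop(open_heap)
        if u == end:
            return g, path
        if u in closed:
            continue
        closed.add(u)
        for v, w in graph.get(u, []):
            if v not in closed and v in h:
                heapq.heappush(open_heap, (g + w + h[v], g + w, v, path + [v]))
    return None


# Two pipeline modules (A, B), each with a high and a low V/f option.
# Each edge weight is the cost of the node the edge leaves.
astar_graph = {
    "S": [("A@hi", 0.0), ("A@lo", 0.0)],
    "A@hi": [("B@hi", 5.0), ("B@lo", 5.0)],
    "A@lo": [("B@hi", 3.0), ("B@lo", 3.0)],
    "B@hi": [("E", 4.0)],
    "B@lo": [("E", 2.0)],
    "E": [],
}
astar_h = distances_to_end(astar_graph, "E")
astar_result = a_star(astar_graph, "S", "E", astar_h)
```

Because the estimated cost is the exact remaining distance, the search expands only nodes on a cheapest path; here it selects the low option for both modules.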
8. The C-to-RTL synthesis method according to claim 7, characterized in that, in step S3, step S321 is: building a topology graph from the function parameters, with throughput maximization as the objective, and adding a start node and an end node to the topology graph; wherein the nodes represent all possible voltage-frequency values of all possible module partitions under all possible degrees of parallelism, and the weight of an edge represents the area and power consumption of the source node it leaves.
9. The C-to-RTL synthesis method according to claim 7 or 8, characterized in that, in step S323, the A-Star algorithm is modified as follows:
the rule for transferring nodes from the OPEN set to the CLOSED set is modified: the node that is optimal and satisfies the constraints is transferred from the OPEN set to the CLOSED set; if no node satisfies the constraints, the search backtracks and revises the previous node of the current path.
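One simplified reading of the modified OPEN-to-CLOSED rule is sketched below; it is an interpretation, not the patented procedure itself. It assumes each edge carries a cost to be minimized (here, area) and a resource charged against a budget (here, power). A popped entry is moved to CLOSED only if it is within budget. A violating entry is discarded, so the next pop resumes from an alternative branching off an earlier node, which plays the role of backtracking. The function name, the edge format, and the example values are hypothetical.

```python
import heapq


def constrained_a_star(graph, start, end, budget, h=None):
    """Best-first search returning (cost, resource, path) or None.

    graph[u] is a list of (v, cost, resource) edges; h maps a node to an
    optimistic estimate of the remaining cost (zero if omitted).
    """
    h = h or {}
    open_heap = [(h.get(start, 0.0), 0.0, 0.0, start, [start])]
    closed = set()
    while open_heap:
        _, g, r, node, path = heapq.heappop(open_heap)
        if r > budget:
            continue  # constraint violated: reject, fall back to other OPEN entries
        if node == end:
            return g, r, path
        if (node, r) in closed:
            continue  # same node and resource already accepted at lower cost
        closed.add((node, r))  # optimal and feasible: OPEN -> CLOSED
        for nxt, cost, res in graph.get(node, []):
            g2, r2 = g + cost, r + res
            heapq.heappush(open_heap,
                           (g2 + h.get(nxt, 0.0), g2, r2, nxt, path + [nxt]))
    return None


# Edges carry (next node, area of source node, power of source node).
budget_graph = {
    "S": [("A@hi", 0, 0), ("A@lo", 0, 0)],
    "A@hi": [("B@hi", 2, 6), ("B@lo", 2, 6)],
    "A@lo": [("B@hi", 5, 3), ("B@lo", 5, 3)],
    "B@hi": [("E", 1, 5)],
    "B@lo": [("E", 3, 2)],
    "E": [],
}
# Unconstrained minimum remaining area, as would be computed in step S322.
budget_h = {"S": 3, "A@hi": 3, "A@lo": 6, "B@hi": 1, "B@lo": 3, "E": 0}
```

With a power budget of 9, the cheapest-area path (both modules at the high setting, power 11) is rejected when it reaches the end node, and the search returns the next-best path that stays within budget.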

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310016186.5A CN103077283B (en) 2013-01-16 2013-01-16 The C-to-RTL integrated approach of optimizing based on VFI

Publications (2)

Publication Number Publication Date
CN103077283A CN103077283A (en) 2013-05-01
CN103077283B CN103077283B (en) 2016-05-18

Family

ID=48153812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310016186.5A Active CN103077283B (en) 2013-01-16 2013-01-16 The C-to-RTL integrated approach of optimizing based on VFI

Country Status (1)

Country Link
CN (1) CN103077283B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9449131B2 (en) * 2014-06-02 2016-09-20 Xilinx, Inc. Extracting system architecture in high level synthesis
CN106777503A (en) * 2016-11-19 2017-05-31 天津大学 High-level synthesis optimization method based on code conversion
CN108319459B (en) * 2018-02-12 2022-04-29 芯峰科技(广州)有限公司 CCC compiler for describing behavior level to RTL
CN112559185B (en) * 2020-12-18 2021-12-17 迈普通信技术股份有限公司 Chip resource allocation method, device, network equipment and computer storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102043886A (en) * 2010-12-31 2011-05-04 北京大学深圳研究生院 Underlying hardware mapping method for integrated circuit as well as time sequence constraint method and device for data control flow
CN102419789A (en) * 2011-12-16 2012-04-18 中山大学 High-level synthesis method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7904848B2 (en) * 2006-03-14 2011-03-08 Imec System and method for runtime placement and routing of a processing array

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Introduction to High-Level Synthesis; Philippe Coussy et al.; IEEE Design & Test of Computers; 20090831; pp. 9-12 *
Ten Years of Progress in High-Level Synthesis; Wang Guanjun et al.; 《计算机科学》 (Computer Science); 20070825; main text, Section 2.1.3 *

Also Published As

Publication number Publication date
CN103077283A (en) 2013-05-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant