CN109815545A - Resynthesis method for multi-pipeline-stage sequential circuits based on register retiming - Google Patents


Info

Publication number
CN109815545A
Authority
CN
China
Prior art keywords
pipelining
register
stage
look
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811587490.4A
Other languages
Chinese (zh)
Other versions
CN109815545B (en)
Inventor
李鹏
李运娣
郭小波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Institute of Engineering
Original Assignee
Henan Institute of Engineering


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Advance Control (AREA)

Abstract

The invention proposes a resynthesis method for multi-pipeline-stage sequential circuits based on register retiming. Its steps are: generate a lookup-table (LUT) circuit from the hardware description language using the FPGA design flow; initialize the timing slack of each pipeline stage of the LUT sequential circuit; use a LUT-circuit cone partitioning method to select LUT cone circuits one after another, starting from the input ports and proceeding toward the output ports; resynthesize each LUT cone circuit (if the cone circuit does not cross a register, it is processed with the LUT resynthesis method; if the cone crosses a register, the LUT circuit is handled case by case); and accept or reject the resynthesized circuit according to the timing slack. The invention exploits the timing slack available in pipelined sequential circuit designs to offer more optimization options for LUT resynthesis under the critical-path delay constraint. The optimized circuit occupies fewer resources and has a simpler structure, which considerably reduces the workload of the subsequent FPGA packing, placement, and routing stages.

Description

Resynthesis method for multi-pipeline-stage sequential circuits based on register retiming
Technical field
The present invention relates to the technical field of netlist resynthesis for lookup-table sequential circuits, and in particular to a resynthesis method for multi-pipeline-stage sequential circuits based on register retiming.
Background art
In the FPGA design flow, the hardware description language (Hardware Description Language, HDL) written by the hardware designer is turned into a gate-level netlist (a NAND-gate circuit netlist) by logic synthesis. The gate-level netlist is mapped to a lookup-table (Look Up Table, LUT) circuit, the LUT circuit is packed into the larger logic blocks of the FPGA, and placement and routing finally produce the bitstream file that can be downloaded to the FPGA, as shown in Fig. 1. Resynthesis of the LUT circuit netlist replaces the original circuit with a functionally equivalent circuit, established through Boolean satisfiability, that uses fewer LUTs, and thereby optimizes area. The documents [Ling A, Singh D P, Brown S D. FPGA technology mapping: a study of optimality [C]. Proceedings of Design Automation Conference. New York: ACM, 2005: 427-432.] and [Cong J, Minkovich K. Improved SAT-based boolean matching using implicants for LUT-based FPGAs [C]. Proceedings of 15th International Symposium on Field Programmable Gate Arrays. New York: ACM, 2007: 139-147.] propose Boolean-matching resynthesis algorithms for combinational circuits, but these methods do not take the critical-path delay requirement of sequential circuits into account and cannot be applied to sequential circuits. LUT resynthesis may optimize area while increasing path delay, so that the critical-path delay requirement of the sequential circuit is no longer met; existing schemes simply discard any resynthesis solution that increases path delay. In practice, however, each pipeline stage of a sequential circuit has usable timing slack that can absorb the path delay added by LUT resynthesis. The document [Li Peng. A resynthesis algorithm for sequential circuits with timing slack as a parameter. Journal of Computer-Aided Design & Computer Graphics, vol. 22, no. 9, September 2010] addresses the simple LUT structure in which a LUT output port drives only one path: through register retiming across multi-input single-output LUTs, the timing slack of later pipeline stages can be supplied to earlier stages. That algorithm, however, does not consider register retiming across multi-input multi-output LUTs. A multi-input multi-output LUT restricts the use of the timing slack in pipeline-stage circuits, so not all of the slack present in the pipeline stages can be used.
Summary of the invention
To address the technical problem that existing methods simply discard resynthesis solutions that increase path delay, the present invention proposes a resynthesis method for multi-pipeline-stage sequential circuits based on register retiming. It computes the timing slack of multi-input multi-output LUT circuits and makes effective use of the slack in multi-input multi-output LUT sequential circuits, so that the resynthesized circuit achieves the best area under the critical-path delay constraint.
To achieve the above object, the technical solution of the invention is realized as follows: a resynthesis method for multi-pipeline-stage sequential circuits based on register retiming, with the following steps:
Step 1: Using the FPGA design flow, process the hardware description language of the user's design through the logic synthesis and mapping stages to generate a LUT circuit;
Step 2: Initialize the timing slack of each pipeline stage of the LUT sequential circuit: use the slack calculation method for multi-input multi-output LUT sequential circuits to compute the slack of each pipeline stage and of its internal paths;
Step 3: Use the LUT-circuit cone partitioning method to select LUT cone circuits one after another, from the input ports toward the output ports;
Step 4: Resynthesize the LUT cone circuit:
(1) If the cone circuit does not cross a register, apply the LUT resynthesis method to the LUT cone circuit selected in step 3;
(2) If the cone crosses a register, handle it case by case according to the LUT characteristics of the circuit;
Step 5: Accept or reject the resynthesized circuit according to the timing slack: if the available slack of the current pipeline stage is negative, discard the resynthesis solution of step 4; if the available slack of the current pipeline stage is positive, adopt the resynthesis solution of step 4.
The critical-path delay and the local timing slack of a single pipeline stage are computed as follows:
(a) Compute the delay of every edge between the registers;
(b) Set the $T_{arrival}$ value of the input register nodes of the pipeline stage to 0;
(c) Compute the $T_{arrival}$ value of the other nodes: $T_{arrival}(j) = \max_{i \in fanin(j)} \left( T_{arrival}(i) + delay(i,j) \right)$, where i is the start point of any path in the pipeline, j is the end point of that path, $T_{arrival}(i)$ is the signal arrival time of node i, $T_{arrival}(j)$ is the signal arrival time of node j, fanin(j) denotes any node directly feeding node j, and delay(i,j) is the delay of path (i,j);
(d) Set the $T_{required}$ value of the output-port registers of all pipeline stages to the critical-path delay: $T_{required}(register_{out}) = \max \left( T_{arrival}(register_{out}) \right)$, where $register_{out}$ is the register of any output port;
(e) Compute the $T_{required}$ value of the other nodes with the formula $T_{required}(i) = \min_{j \in fanout(i)} \left( T_{required}(j) - delay(i,j) \right)$, where fanout(i) denotes any node driven by node i, $T_{required}(i)$ is the latest allowed signal arrival time of node i, and $T_{required}(j)$ is the latest allowed signal arrival time of node j;
(f) Compute the timing slack of any connection in the circuit with the formula $slack(i,j) = T_{required}(j) - T_{arrival}(i) - delay(i,j)$;
(g) The local timing slack of any path inside the pipeline stage is $slack(M,i) = CPD - delay(M,i)$, where CPD is the critical-path delay and i is any path in pipeline stage M;
(h) The local timing slack of the pipeline stage is $slack(M) = \min_{i \in M} slack(M,i)$.
For a multi-stage pipeline circuit, the clock period of the whole pipeline circuit must be greater than or equal to the longest pipeline-stage critical-path delay among its stages, as found with the single-stage critical-path delay method; the timing slack of each pipeline stage is then computed with respect to the clock period so set;
The global timing slack available to each stage of an ordinary pipeline circuit is the local slack of that stage plus the local slack of all stages that follow it:
$slack_{global}(M) = \sum_{L=M}^{N} slack(L)$, where N is the last pipeline stage of the pipeline circuit following stage M, and L is a pipeline stage.
For a pipeline stage with an internal output port, the available local slack is determined as follows: let y be the path delay from the output port of the LUT that drives the output port to the output register of the stage; the smaller of the path delay y and the local slack of the stage is the available local slack of the stage;
For a pipeline stage N with an internal output port, the global slack is determined as follows: if the stage that follows N is an ordinary pipeline stage N+1 with no extra output port, the sum of the available local slacks of all subsequent pipeline stages is compared with the path length y, and the smaller value is the slack available to the stage.
The LUT cone circuits in step 3 are selected as follows: starting from the input ports and moving toward the output ports, the original LUT circuit netlist is divided into K-input cones, each made up of several LUTs. K is set according to the number of LUT input ports and the circuit structure: if a LUT has J input ports, then K=3*J-2, and the number of input ports of a cone is less than or equal to K. If the circuit inside a cone is already area-optimal, that is, optimal in area for K inputs so that later resynthesis cannot reduce its LUTs any further, the K-input cone is invalid and selection continues with the next cone circuit toward the output.
When a cone crosses a register in step 4, the circuit is handled case by case according to its LUT characteristics, as follows:
For register retiming across a LUT:
Register retiming across a multi-input single-output LUT: when registers are retimed from the LUT input side to the output side, every input port must have a register; retiming from the LUT output side to the input side is unrestricted;
Register retiming across a multi-input multi-output LUT: when registers are retimed from the LUT input side to the output side, every input port must have a register; when registers are retimed from the LUT output side to the input side, every path driven by the output port must have a register.
Resynthesis of a multi-input single-output LUT cone with register retiming proceeds as follows:
(a1) First move the register toward the input side to create a valid cone, so that the cone circuit contains no register;
(b1) Resynthesize the cone circuit;
(c1) Move the register toward the output side to create more timing slack for the subsequent pipeline stages.
Resynthesis of a multi-input multi-output LUT cone with register retiming proceeds as follows:
(a2) First move the register toward the input side to create a valid cone, so that the cone circuit contains no register. If one of the several paths driven by the output lacks a register, a register is borrowed at that point so that the move can be made. The place where a register is borrowed is marked with -1.
(b2) Resynthesize the cone circuit;
(c2) Accept or reject the resynthesis according to whether the retiming is valid. Check whether every input port of the LUT whose registers must be moved toward the output side after resynthesis has a register:
If so, all registers on the input ports are moved toward the output side, which cancels the register borrowed in step (a2); if not, the resynthesis solution is discarded and the circuit is restored to its state before resynthesis.
In step 5, when the available slack of the current pipeline stage is positive:
If the resynthesis does not cross a register: the resynthesis may lengthen a path and reduce the slack of the corresponding pipeline stage, and the local slack of this stage may even become negative; if the total slack from this stage to the output-side pipeline stages is positive, the slack of the subsequent stages can be supplied to the earlier stage by retiming the subsequent registers;
If the resynthesis crosses a register: if the local slack of the current pipeline stage is positive, the solution is chosen and the register stays in its original position; if the local slack of the current pipeline stage is negative, the register is retimed toward the input side until the local slack is greater than or equal to zero.
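The accept-or-reject rule of step 5 can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name `decide`, its return labels, and the choice to treat a slack of exactly zero as acceptable are not taken from the patent, and the sketch assumes every later stage's local slack is fully usable (no internal output port limits it).

```python
def decide(local_slack_after, later_local_slacks):
    """Classify one resynthesis candidate (hypothetical helper).

    local_slack_after: local slack of the current stage after resynthesis.
    later_local_slacks: local slacks of all stages that follow it.
    """
    if local_slack_after >= 0:
        # The stage still meets timing on its own; registers stay in place.
        return "accept"
    if local_slack_after + sum(later_local_slacks) >= 0:
        # Later stages can lend slack through register retiming.
        return "accept_with_retiming"
    # Not enough slack anywhere downstream: discard the candidate.
    return "reject"


# Hypothetical slack values, in the same time unit as the clock period.
print(decide(1, [0]))
print(decide(-1, [0, 2]))
print(decide(-3, [1, 1]))
```

The three calls exercise the three outcomes: timing met locally, timing met only after borrowing slack from later stages, and timing not recoverable.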
Beneficial effects of the invention: by exploiting the timing slack in pipelined sequential circuit designs, the invention offers more optimization options for LUT resynthesis under the critical-path delay constraint. Resynthesis replaces the original circuit with a functionally equivalent LUT circuit that occupies fewer resources, and so optimizes area. The optimized circuit occupies fewer resources and has a simpler structure, which also means that the workload of the subsequent FPGA packing, placement, and routing stages is considerably reduced.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is the existing FPGA design flow.
Fig. 2 is the flow chart of the invention.
Fig. 3 is a schematic of the timing slack calculation of the invention.
Fig. 4 is a schematic of cone circuit selection in the invention, where (a) shows the selection of valid cone circuits starting from the input ports and (b) shows the successive selection of valid cone circuits toward the rear of the circuit.
Fig. 5 is a schematic of the resynthesis of the circuit inside a cone, where (a) is the circuit before resynthesis and (b) is the circuit after resynthesis.
Fig. 6 is a schematic of the resynthesis of a cone LUT circuit netlist, where (a) is the circuit before resynthesis and (b) is the circuit after resynthesis.
Fig. 7 is a schematic of register retiming across a multi-input single-output LUT, where (a) is retiming toward the preceding pipeline stage, (b) is the valid case of retiming toward the following pipeline stage, and (c) is the invalid case of retiming toward the following pipeline stage.
Fig. 8 is a schematic of register retiming across a multi-input multi-output LUT, where (a) is the valid case of retiming toward the preceding pipeline stage, (b) is the valid case of retiming toward the following pipeline stage, (c) is the invalid case of retiming toward the preceding pipeline stage, and (d) is the invalid case of retiming toward the following pipeline stage.
Fig. 9 is a schematic of resynthesis with register retiming for a multi-input single-output LUT cone, where (a) is a cone that is not valid for resynthesis (it crosses a register), (b) is the valid resynthesis cone (register moved toward the input side), (c) is the resynthesis, and (d) is the register moved back.
Fig. 10 is a schematic of resynthesis with register retiming for a multi-input multi-output LUT cone, where (a) is the multi-output LUT circuit before resynthesis, (b) is the valid resynthesis cone (register moved toward the input side), (c) is an invalid resynthesized circuit, and (d) is a valid resynthesized circuit.
Fig. 11 is the flow chart for computing the critical-path delay and timing slack of a single pipeline stage.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
As shown in Fig. 2, a resynthesis method for multi-pipeline-stage sequential circuits based on register retiming has the following steps:
Step 1: Using the FPGA design flow, process the hardware description language of the user's design through the logic synthesis and mapping stages to generate a LUT circuit.
As shown in Fig. 1, the FPGA design flow turns the hardware description language of the user's design into a LUT circuit: the logic synthesis stage generates a gate-level netlist (a NAND-gate circuit netlist) from the hardware description language, mapping generates a lookup-table (Look Up Table, LUT) circuit from the gate-level netlist, and the LUT circuit is confirmed to meet the requirements of a multi-pipeline-stage sequential circuit.
Step 2: Initialize the timing slack of each pipeline stage of the LUT sequential circuit: use the slack calculation method for multi-input multi-output LUT sequential circuits to compute the slack of each pipeline stage and of its internal paths.
The critical-path delay of a multi-pipeline-stage sequential circuit is determined by the clock period at which its registers operate. The clock period must be set to meet the following requirements:
(1) The critical-path delay (CPD) and local timing slack of a single pipeline stage are computed as shown in Fig. 11, specifically:
(a) Compute the delay of every edge between the registers;
(b) Set the $T_{arrival}$ value of the input register nodes of the pipeline stage to 0;
(c) Compute the $T_{arrival}$ value of the other nodes: $T_{arrival}(j) = \max_{i \in fanin(j)} \left( T_{arrival}(i) + delay(i,j) \right)$, where i is the start point of any path in the pipeline, j is the end point of that path, $T_{arrival}(i)$ is the signal arrival time of node i, $T_{arrival}(j)$ is the signal arrival time of node j, fanin(j) denotes any node directly feeding node j, and delay(i,j) is the delay of path (i,j).
(d) Set the $T_{required}$ value of the output-port registers of all pipeline stages to the CPD: $T_{required}(register_{out}) = \max \left( T_{arrival}(register_{out}) \right)$, where $register_{out}$ is the register of any output port.
(e) Compute the $T_{required}$ value of the other nodes with the formula $T_{required}(i) = \min_{j \in fanout(i)} \left( T_{required}(j) - delay(i,j) \right)$, where fanout(i) denotes any node driven by node i, $T_{required}(i)$ is the latest allowed signal arrival time of node i, and $T_{required}(j)$ is the latest allowed signal arrival time of node j.
(f) Compute the timing slack of any connection in the circuit with the formula $slack(i,j) = T_{required}(j) - T_{arrival}(i) - delay(i,j)$.
(g) The local timing slack of any path inside the pipeline stage is $slack(M,i) = CPD - delay(M,i)$, where i is any path in pipeline stage M.
As shown in Fig. 11, the circuit path input register - a - c - output register has a local slack of 0,
and the circuit path input register - a - b - output register has a local slack of 5.
(h) The local timing slack of the pipeline stage is $slack(M) = \min_{i \in M} slack(M,i)$, where i is any path in pipeline stage M.
In Fig. 11, the smaller of the two path slacks is 0, so the internal timing slack of the pipeline stage is 0.
In practice, the clock period of the circuit design only needs to be greater than or equal to the critical-path delay CPD found by the procedure above. This guarantees that the signals of the sequential circuit, after being processed by the circuit components, reach the output-port registers within one clock period. If the clock period of the circuit in Fig. 11 is set to 12, the available local slack of the pipeline stage is 2.
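Steps (a) to (h) amount to a forward and a backward pass over the stage's timing graph. The following is a minimal sketch of that calculation. The edge delays are hypothetical: Fig. 11 is not reproduced in the text, so the values were chosen only so that the two paths give slacks of 0 and 5 and a CPD of 10, matching the numbers quoted above. The function names and the convention that nodes without fan-in are input registers are likewise assumptions of the sketch.

```python
from collections import defaultdict


def stage_timing(edges, clock_period=None):
    """Steps (a)-(f) for one pipeline stage.

    edges maps (i, j) -> delay(i, j) on an acyclic graph. Nodes without
    fan-in are treated as input registers and nodes without fan-out as
    output registers (a simplifying assumption of this sketch).
    """
    fanin, fanout, nodes = defaultdict(list), defaultdict(list), set()
    for i, j in edges:
        fanin[j].append(i)
        fanout[i].append(j)
        nodes.update((i, j))

    # Topological order (Kahn's algorithm) so fan-in nodes come first.
    pending = {n: len(fanin[n]) for n in nodes}
    ready = [n for n in nodes if pending[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for m in fanout[n]:
            pending[m] -= 1
            if pending[m] == 0:
                ready.append(m)

    # (b), (c): arrival times, 0 at the input registers.
    arrival = {}
    for n in order:
        arrival[n] = max((arrival[i] + edges[(i, n)] for i in fanin[n]), default=0)

    # (d): required time at the output registers is the CPD,
    # or the chosen clock period when one is given.
    cpd = max(arrival[n] for n in nodes if not fanout[n])
    target = cpd if clock_period is None else clock_period

    # (e): required times of the other nodes, walking backward.
    required = {}
    for n in reversed(order):
        required[n] = min((required[j] - edges[(n, j)] for j in fanout[n]), default=target)

    # (f): slack of every connection.
    slack = {(i, j): required[j] - arrival[i] - d for (i, j), d in edges.items()}
    return cpd, arrival, required, slack


def path_slack(path, edges, period):
    """Step (g): period minus the summed delay along one path."""
    return period - sum(edges[(a, b)] for a, b in zip(path, path[1:]))


# Hypothetical delays for the two paths named in the text.
edges = {("in", "a"): 2, ("a", "c"): 5, ("c", "out"): 3,
         ("a", "b"): 2, ("b", "out"): 1}
paths = [["in", "a", "c", "out"], ["in", "a", "b", "out"]]

cpd, arrival, required, slack = stage_timing(edges)
local = min(path_slack(p, edges, cpd) for p in paths)        # step (h)
local_at_12 = min(path_slack(p, edges, 12) for p in paths)   # clock period 12
print(cpd, local, local_at_12)
```

With these delays the stage slack is 0 at the CPD and 2 when the clock period is relaxed to 12, which is the situation the paragraph above describes.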
(2) Setting the critical-path delay of a multi-stage pipeline circuit and computing the local slack of each of its stages.
For a multi-stage pipeline circuit, to guarantee that the sequential circuit works correctly, the clock period of the whole pipeline circuit must be greater than or equal to the longest pipeline-stage critical-path delay among its stages, as found with the single-stage critical-path delay method. The timing slack of each pipeline stage is then computed with respect to the clock period so set.
(3) The global timing slack available to each stage of an ordinary pipeline circuit is computed as:
$slack_{global}(M) = \sum_{L=M}^{N} slack(L)$, where N is the last pipeline stage of the pipeline circuit following stage M.
That is, the global slack of a pipeline stage is the local slack of that stage plus the local slack of all stages that follow it.
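Because the global slack of a stage is its own local slack plus that of every later stage, all stages can be handled in one backward pass. A minimal sketch follows; the function name and the slack values are hypothetical.

```python
def global_slacks(local_slacks):
    """Global slack per stage: a suffix sum over the local slacks.

    local_slacks[k] is the local slack of the k-th stage, ordered from
    the input side to the output side of the pipeline.
    """
    result, running = [], 0
    for s in reversed(local_slacks):
        running += s              # add this stage to everything after it
        result.append(running)
    return result[::-1]


# Hypothetical local slacks of three consecutive stages.
print(global_slacks([2, 0, 3]))
```

The last stage has no successor to draw on, so its global slack equals its local slack.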
(4) The available local slack and global slack of a pipeline stage with an internal output port are computed as follows:
As shown in Fig. 3, the pipeline circuit has three pipeline stages: N, N+1, and N+2. Pipeline stage N has an internal output port.
Constrained by the internal output port of pipeline stage N, the output register D at the end of stage N cannot be retimed any further toward the front of the LUT; no register is available at the output port of stage N.
The available local slack of pipeline stage N is determined as follows: let y be the path delay from the output port of the LUT that drives the output port in stage N to the output register of the stage; the smaller of y and the local slack of the stage is the available local slack of the stage.
If the stage that follows pipeline stage N is an ordinary pipeline stage N+1 (that is, one with no extra output port), the sum of the available local slacks of all subsequent pipeline stages is compared with the path length y, and the smaller value is the slack available to the stage.
The available global slack of pipeline stage N is determined as follows: the sum of the available local slacks of all pipeline stages following stage N is compared with the path length y, and the smaller value is the available global slack of the stage.
As shown in Fig. 3, if the local slacks of pipeline stages N, N+1, and N+2 are x1, x2, and x3 respectively, the global slack available to stage N is the smaller of the sum x1 + x2 + x3 and the path length y.
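The effect of an internal output port is a cap of y on both the local and the global slack of the stage. The sketch below follows the Fig. 3 example, where the sum runs over x1, x2, and x3; the function names and the numeric values of x1, x2, x3, and y are hypothetical.

```python
def available_local_slack(local_slack, y):
    """Local slack of a stage with an internal output port, capped by y."""
    return min(local_slack, y)


def available_global_slack(local_slacks, y):
    """Global slack of that stage, following the Fig. 3 example.

    local_slacks lists the local slack of the stage itself followed by
    those of the later stages (x1, x2, x3 in the text); y is the path
    delay from the LUT driving the output port to the output register.
    """
    return min(sum(local_slacks), y)


# Hypothetical values for x1, x2, x3 and y.
x1, x2, x3, y = 1, 2, 3, 4
print(available_local_slack(x1, y))
print(available_global_slack([x1, x2, x3], y))
```

Here the three stages together offer 6 units of slack, but only y = 4 of them can be used by stage N.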
Step 3: Use the LUT-circuit cone partitioning method to select LUT cone circuits one after another, from the input ports toward the output ports.
A hardware circuit design goes from the hardware description language written by the programmer to the netlist downloaded to the chip through the stages of logic synthesis, mapping, packing, and placement and routing; the LUT circuit generated by logic synthesis can be area-optimized by the resynthesis method. In LUT resynthesis of sequential circuits, register retiming opens up more optimization solutions.
The resynthesis method uses the principle of Boolean matching; for the specific method see: Moskewicz M, Madigan C, Zhao Y, et al. Chaff: Engineering an efficient SAT solver [C]. Proceedings of Design Automation Conference. New York: ACM, 2001: 530-535. Resynthesis replaces the original circuit with a LUT circuit that is functionally equivalent (established by Boolean matching) and occupies fewer resources, and so optimizes area. The optimized circuit occupies fewer resources and has a simpler structure. If resynthesis cannot find a replacement circuit that occupies fewer resources, the original circuit is itself area-optimal and does not need to be replaced.
Starting from the input ports and moving toward the output ports, the original LUT circuit netlist is divided into K-input cones, each made up of several LUTs. K is set according to the number of LUT input ports and the circuit structure (if a LUT has J input ports, then K=3*J-2), and the number of input ports of a cone is less than or equal to K. If the circuit inside a cone is already area-optimal, that is, optimal in area for K inputs so that later resynthesis cannot reduce its LUTs any further, the K-input cone is invalid and selection continues with the next cone circuit toward the output. As shown in Fig. 4(a), two valid four-input LUT cones (the parts inside the dashed boxes) are marked off in the LUT circuit netlist, going from the input ports toward the output ports. After the cones have been resynthesized, the LUT circuit netlist becomes the one shown in Fig. 4(b); the next valid cone (the part inside the dashed box) is then selected toward the output side and resynthesized. The process repeats until the cones have covered all output ports.
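The size rule for a cone, K=3*J-2 with at most K cone inputs, can be checked with a few lines of code. This is a sketch under assumptions: the netlist is represented as a plain dictionary from each LUT to its fan-in signals, and the function names and the small netlist are hypothetical. It checks only the input count, not whether the cone is already area-optimal.

```python
def cone_input_limit(j):
    """K for LUTs with J input ports, as given in the text: K = 3*J - 2."""
    return 3 * j - 2


def cone_inputs(cone, fanin):
    """Signals entering the cone: fan-ins not produced by a LUT inside it."""
    members = set(cone)
    return {src for lut in cone for src in fanin[lut] if src not in members}


def fits_k_input_cone(cone, fanin, j):
    """True if the cone has at most K external inputs."""
    return len(cone_inputs(cone, fanin)) <= cone_input_limit(j)


# Hypothetical netlist of 2-input LUTs: L3 combines the outputs of L1 and L2.
fanin = {"L1": ["a", "b"], "L2": ["c", "d"], "L3": ["L1", "L2"]}
cone = ["L1", "L2", "L3"]

print(cone_input_limit(2), sorted(cone_inputs(cone, fanin)))
print(fits_k_input_cone(cone, fanin, 2))
```

For 2-input LUTs the limit is K = 4, so the three-LUT cone above with inputs a, b, c, d just fits; for 4-input LUTs the same rule gives K = 10.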
In Fig. 5, the circuit inside the cone after resynthesis is shown in Fig. 5(b); it has one LUT fewer than the circuit before resynthesis shown in Fig. 5(a), so area is optimized.
Resynthesis of a cone LUT circuit netlist that optimizes area but increases delay: in Fig. 6, the circuit in the 10-input cone of Fig. 6(a) becomes the circuit of Fig. 6(b) after resynthesis (for the method, see the document cited above). The circuit has one LUT fewer than before, so area is optimized. At the same time, in the resynthesized cone of Fig. 6(b) a signal passes through at most 3 LUTs from the cone input to the cone output, one LUT delay more than in the cone before resynthesis in Fig. 6(a): the longest path now passes through three LUTs, whereas the longest path of the original circuit passes through two.
Step 4: Resynthesize the LUT cone circuit:
(1) If the cone circuit does not cross a register, apply the LUT resynthesis method to the LUT cone circuit selected in step 3;
(2) If the cone crosses a register, handle it case by case according to the LUT characteristics of the circuit:
For register retiming across a LUT:
(a) Register retiming across a multi-input single-output LUT:
When registers are retimed from the LUT input side to the output side, every input port must have a register. Retiming from the LUT output side to the input side is unrestricted.
Fig. 7(a) shows that, for a multi-input single-output LUT circuit, moving the register toward the input side is valid. Fig. 7(b) shows that when every LUT input port has a register, the registers can be moved toward the output side. Fig. 7(c) shows that when not every LUT input port has a register, the registers cannot be moved toward the output side.
(b) Register retiming across a multi-input multi-output LUT:
When registers are retimed from the LUT input side to the output side, every input port must have a register. When registers are retimed from the LUT output side to the input side, every path driven by the output port must have a register.
Fig. 8(a) shows that when every LUT input port has a register, the registers can be moved onto all paths driven by the output side. Fig. 8(b) shows that when not every LUT input port has a register, the registers cannot be moved toward the output side. Fig. 8(c) shows that when every path driven by the output side has a register, the registers can be moved to all ports on the input side. Fig. 8(d) shows that when not every path driven by the output side has a register, the registers cannot be moved toward the input side.
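The two retiming rules reduce to "every port or path on the side the registers leave must already carry a register". A minimal sketch follows, with hypothetical function names and inputs given as lists of booleans (one entry per LUT input port, or per path driven by the LUT output).

```python
def can_retime_to_output(inputs_have_register):
    """Input side -> output side: every LUT input port needs a register."""
    return bool(inputs_have_register) and all(inputs_have_register)


def can_retime_to_input(driven_paths_have_register):
    """Output side -> input side: every driven path needs a register.

    A single-output LUT drives one path, so the check passes as soon as
    that path carries the register being moved.
    """
    return bool(driven_paths_have_register) and all(driven_paths_have_register)


# Hypothetical port and path states.
print(can_retime_to_output([True, True, True]))
print(can_retime_to_output([True, False]))
print(can_retime_to_input([True, True]))
print(can_retime_to_input([True, False]))
```

The `bool(...)` guard makes an empty list fail the check, since `all([])` alone would return `True`.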
The resynthesis method with register retiming for a multiple-input single-output look-up table cone set is:
(a1) first move the registers forward to create a valid cone set, so that the cone-set circuit contains no registers;
(b1) perform resynthesis on the cone-set circuit;
(c1) move the registers toward the output side to create more time margin for subsequent pipeline stages.
As shown in Fig. 9(a), the cone-set circuit crosses a register, so resynthesis cannot be performed directly. The register must first be moved forward to create a valid cone set, as shown in Fig. 9(b); the circuit after resynthesis is shown in Fig. 9(c). Finally, to create more time margin for subsequent pipeline stages, the register is moved as far toward the output side as possible, as shown in Fig. 9(d).
General flow of resynthesis with register retiming for a multiple-input multiple-output look-up table cone set:
(a2) first move the registers forward to create a valid cone set, so that the cone-set circuit contains no registers. If one of the several paths driven by the output lacks a register, a register is borrowed at that location so that the forward move can be carried out. The location of the borrow is marked -1.
(b2) perform resynthesis on the cone-set circuit;
(c2) decide whether to keep the resynthesis result according to the validity of retiming. Determine whether, after resynthesis, every input port of the look-up tables whose registers must be moved to the output side has a register:
If so, the registers on all input ports are moved toward the output side, which offsets the register borrowed in step (a2); if not, the resynthesis scheme is discarded and the circuit is restored to its state before resynthesis.
Fig. 10 is a schematic diagram of resynthesis with register retiming for a multiple-input multiple-output look-up table cone set according to the present invention. Fig. 10(a) shows the multi-output look-up table circuit before resynthesis. It is not a valid cone set for resynthesis (it crosses a register), and the circuit contains an internal output port that has no register. The output of the lower-left look-up table drives two paths: one passes through a register to an input port of the upper look-up table, and the other leads directly to the output port. If the register is to be retimed toward the front of the circuit, the conventional method cannot perform the move, because the other path, the one to the output port, has no register. To complete the retiming, a virtual register can be placed on that path; both paths driven by the output of the lower-left look-up table then have a register, and the forward retiming can be completed. As shown in Fig. 10(b), a register must be borrowed at the internal output port so that the register can be moved forward. Because the register at the output port is virtual, it must later be offset by giving it back, so after the forward retiming the output port is marked -1, meaning that one register is needed there to offset the borrow. After register retiming, resynthesis is performed on the selected cone-set circuit. If the circuit after resynthesis is as shown in Fig. 10(c), resynthesis saves one look-up table, but not every input port of the lower-left look-up table has a register. The registers then cannot be moved toward the rear of the circuit and the register borrowed at the output port cannot be offset, so this resynthesis scheme must be discarded. If the circuit after resynthesis is as shown in Fig. 10(d), every input port of the lower-left look-up table has a register, the registers can be moved toward the rear of the circuit, and the register borrowed at the output port can be offset, so this resynthesis scheme is valid.
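The borrow-and-offset bookkeeping described for Fig. 10 can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names and the per-path integer marks are hypothetical, and the sketch tracks only register counts on the paths driven by one look-up table output, not the netlist itself.

```python
from typing import List, Optional


def move_register_forward(fanout_has_register: List[bool]) -> List[int]:
    """Step (a2): pull one register off every path driven by the output.

    A path that had a register ends at 0; a path without one is marked -1,
    meaning a virtual register was borrowed there.
    """
    return [0 if has_reg else -1 for has_reg in fanout_has_register]


def settle_after_resynthesis(
    marks: List[int], input_has_register: List[bool]
) -> Optional[List[int]]:
    """Step (c2): try to move registers back to the output side.

    Returns the new per-path marks if the scheme is valid, or None if the
    resynthesis scheme has to be discarded.
    """
    if not all(input_has_register):
        return None  # borrow cannot be offset, so restore the old circuit
    # Moving the input registers to the output adds one register per path:
    # a -1 mark becomes 0 (borrow offset), a 0 mark becomes 1 (register back).
    return [m + 1 for m in marks]
```

For the situation of Fig. 10, `move_register_forward([True, False])` gives `[0, -1]`; the case of Fig. 10(d) then settles to `[1, 0]`, while the case of Fig. 10(c) returns `None`.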
Step 5: decide whether to keep the resynthesized circuit according to the time margin:
(1) if the available time margin of the current pipeline stage is negative:
discard the scheme;
(2) if the available time margin of the current pipeline stage is positive:
adopt the scheme:
(a3) if the resynthesis does not cross a register:
Resynthesis that does not cross a register may still lengthen a path and reduce the time margin of the corresponding pipeline stage, and may even make the local time margin of this pipeline stage negative. In that case, if the total time margin from this pipeline stage to the pipeline stages on the output side is positive, the time margin of subsequent pipeline stages can be passed to the earlier pipeline stage by subsequent register retiming.
(b3) if the resynthesis crosses a register:
if the local time margin of the current pipeline stage is positive, the scheme is adopted and the registers keep their original positions;
if the local time margin of the current pipeline stage is negative, the registers are retimed toward the input side until the local time margin is greater than or equal to zero.
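The decision logic of step 5 can be summarized in a short Python sketch. The function name, its parameters, and the returned labels are hypothetical and serve only as an illustration; treating a margin of exactly zero as acceptable is an assumption, since the text only distinguishes negative from positive values.

```python
def step5_decision(available_margin: float,
                   local_margin: float,
                   crosses_register: bool) -> str:
    """Return a label describing what to do with a resynthesis scheme."""
    if available_margin < 0:
        return "discard"  # case (1): no time margin left for this stage
    if local_margin >= 0:
        return "adopt"  # registers keep their original positions
    if crosses_register:
        # case (b3): retime registers toward the input side until the
        # local time margin is no longer negative
        return "adopt_and_retime_toward_input"
    # case (a3): later stages lend margin through subsequent retiming
    return "adopt_and_borrow_from_later_stages"
```

For example, a scheme with an available margin of 1.5 but a local margin of -0.5 is kept in both branches; only the way the missing local margin is recovered differs.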
Step 6: determine whether the end of the look-up table circuit has been reached; if not, repeat steps 3 to 5; if so, terminate.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (9)

1. A resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming, characterized in that its steps are as follows:
Step 1: using the FPGA design flow, process the hardware description language design supplied by the user through the logic synthesis and mapping stages to generate a look-up table circuit;
Step 2: initialize the time margin of each pipeline stage of the look-up table sequential circuit: using the time margin calculation method for multiple-input multiple-output look-up table sequential circuits, calculate the time margin of each pipeline stage and of its internal paths;
Step 3: using the look-up table circuit cone-set partitioning method, select look-up table cone-set circuits one by one, starting from the input ports and proceeding toward the output ports;
Step 4: perform resynthesis on the look-up table cone-set circuit:
(1) if the cone-set circuit does not cross a register, the look-up table cone-set circuit generated in step 2 is processed with the look-up table resynthesis method;
(2) if the cone set crosses a register, it is handled case by case according to the characteristics of the look-up tables in the circuit;
Step 5: decide whether to keep the resynthesized circuit according to the time margin: if the available time margin of the current pipeline stage is negative, discard the resynthesis scheme of step 4; if the available time margin of the current pipeline stage is positive, adopt the resynthesis scheme of step 4.
2. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 1, characterized in that the critical path delay and the local time margin of a single pipeline stage are solved as follows:
(a) calculate the delay of each edge between registers;
(b) set the T_arrival value of the input register nodes of the pipeline stage to 0;
(c) calculate the T_arrival value of the other nodes: T_arrival(j) = max_{i ∈ fanin(j)} [T_arrival(i) + delay(i, j)], where i is the start point of any path in the pipeline, j is the end point of that path, T_arrival(i) is the signal arrival time of node i, T_arrival(j) is the signal arrival time of node j, fanin(j) denotes any node connected ahead of node j, and delay(i, j) denotes the delay of path (i, j);
(d) set the T_required value of every output-port register of the pipeline stage to the critical path delay (CPD): T_required(register_out) = CPD, where register_out is the register of any output port;
(e) calculate the T_required value of the other nodes: T_required(i) = min_{j ∈ fanout(i)} [T_required(j) - delay(i, j)], where fanout(i) denotes any node driven by node i, T_required(i) is the latest signal arrival time of node i, and T_required(j) is the latest signal arrival time of node j;
(f) calculate the time margin of any connection in the circuit: slack(i, j) = T_required(j) - T_arrival(i) - delay(i, j);
(g) the local time margin of any path inside the pipeline stage is: slack(M, i) = CPD - delay(M, i), where i is any path in pipeline stage M;
(h) the local time margin of the pipeline stage is: slack(M) = min(slack(M, i)), i ∈ M.
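Steps (b) to (f) amount to a forward and a backward pass over the combinational graph of one pipeline stage. The following Python sketch illustrates this under stated assumptions: the function name `stage_timing`, the dictionary-based graph format, and the node names are hypothetical, and the critical path delay is assumed to be the latest arrival time at any output register.

```python
from collections import defaultdict


def stage_timing(edges, inputs, outputs):
    """edges maps (i, j) to delay(i, j); inputs/outputs are register nodes."""
    fanin, fanout = defaultdict(list), defaultdict(list)
    for i, j in edges:
        fanin[j].append(i)
        fanout[i].append(j)
    nodes = set(fanin) | set(fanout)
    arrival, required = {}, {}

    def t_arrival(j):
        # Steps (b), (c): 0 at input registers, else latest fan-in arrival.
        if j not in arrival:
            arrival[j] = 0 if j in inputs else max(
                t_arrival(i) + edges[(i, j)] for i in fanin[j])
        return arrival[j]

    # Assumption: CPD is the latest arrival time at any output register.
    cpd = max(t_arrival(o) for o in outputs)

    def t_required(i):
        # Steps (d), (e): CPD at output registers, else earliest fan-out need.
        if i not in required:
            required[i] = cpd if i in outputs else min(
                t_required(j) - edges[(i, j)] for j in fanout[i])
        return required[i]

    for n in nodes:
        t_arrival(n)
        t_required(n)

    # Step (f): time margin of every connection.
    slack = {(i, j): required[j] - arrival[i] - d
             for (i, j), d in edges.items()}
    return cpd, arrival, required, slack


# Small example stage: two input registers, two LUT nodes, one output register.
example_edges = {("r0", "a"): 1, ("r1", "a"): 1, ("a", "b"): 2,
                 ("r1", "b"): 1, ("b", "out"): 1}
cpd, arrival, required, slack = stage_timing(example_edges, {"r0", "r1"}, {"out"})
```

In this example the critical path is r0 (or r1), a, b, out, and only the connection from r1 directly to b has a non-zero time margin.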
3. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 2, characterized in that, for a multi-pipeline-stage pipeline circuit, the clock period of the whole pipeline circuit must be greater than or equal to the longest pipeline-stage critical path delay among its internal pipeline stages, found with the single-pipeline-stage critical path delay method, and the time margin of each pipeline stage is calculated on the basis of the clock period so set;
The global time margin available to an ordinary pipeline stage of the pipeline circuit is the local time margin of this pipeline stage plus the local time margins of all pipeline stages after it:
global time margin of stage M = slack(M) + Σ_{L = M+1}^{N} slack(L), where N is the last pipeline stage of the pipeline circuit after pipeline stage M, and L is a pipeline stage between pipeline stages M+1 and N.
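The global time margin of an ordinary pipeline stage is a plain sum, as the following Python sketch shows. The function name and the list-based representation of the local time margins are assumptions made for illustration; stages are indexed from 0 in pipeline order.

```python
from typing import Sequence


def global_time_margin(local_margins: Sequence[float], m: int) -> float:
    """local_margins[k] holds slack(k); returns the global margin of stage m."""
    # Local margin of stage m plus the local margins of all later stages.
    return local_margins[m] + sum(local_margins[m + 1:])
```

With local margins of -1.0, 2.0 and 0.5, the first stage has a negative local margin but a positive global margin, which is the situation in which margin can be borrowed from later stages.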
4. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 3, characterized in that the available local time margin of a pipeline stage that contains an internal output port is obtained as follows: let y be the path delay from the output port of the look-up table that drives the output port in the pipeline stage to the output register of the pipeline stage; the smaller of the path delay y and the local time margin of the pipeline stage is the available local time margin of the pipeline stage;
The global time margin of a pipeline stage that contains an internal output port is solved as follows: if the pipeline stage following pipeline stage N is an ordinary pipeline stage N+1 with no extra output port, the available local time margins of all subsequent pipeline stages are summed and compared with the path length y; the smaller value is the available time margin of the pipeline stage.
5. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 1, characterized in that the look-up table cone-set circuit is selected in step 3 as follows: the original look-up table circuit netlist is partitioned, from the input ports toward the output ports, into K-input cone sets each made up of several look-up tables; K is set according to the number of input ports of the look-up tables in the circuit and the circuit structure: if a look-up table has J input ports, then K = 3*J - 2, and the number of input ports of a cone set is less than or equal to K; if the circuit in a cone set is already area-optimal, that is, it is the area-optimal K-input circuit and its look-up tables cannot be further optimized by resynthesis, the K-input cone set is invalid and selection continues with the next cone-set circuit.
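The input limit of a cone set follows directly from K = 3*J - 2. A minimal Python sketch, with hypothetical function names, is:

```python
def cone_input_limit(lut_inputs: int) -> int:
    """Return K for look-up tables with J = lut_inputs input ports."""
    return 3 * lut_inputs - 2


def cone_fits(cone_input_count: int, lut_inputs: int) -> bool:
    # A cone set may have at most K input ports.
    return cone_input_count <= cone_input_limit(lut_inputs)
```

For 4-input look-up tables this gives K = 10, so a cone set may have up to 10 input ports.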
6. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 1, characterized in that, when the cone set crosses a register in step 4, the case-by-case handling according to the characteristics of the look-up tables in the circuit is:
Register retiming for look-up tables:
Register retiming for a multiple-input single-output look-up table: when a register is retimed from the input side of the look-up table to its output side, every input port must have a register; retiming a register from the output side of the look-up table to its input side is unrestricted;
Register retiming for a multiple-input multiple-output look-up table: when a register is retimed from the input side of the look-up table to its output side, every input port must have a register; when a register is retimed from the output side of the look-up table to its input side, every path driven by the output port must have a register.
7. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 6, characterized in that the resynthesis method with register retiming for a multiple-input single-output look-up table cone set is:
(a1) first move the registers forward to create a valid cone set, so that the cone-set circuit contains no registers;
(b1) perform resynthesis on the cone-set circuit;
(c1) move the registers toward the output side to create more time margin for subsequent pipeline stages.
8. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to claim 6, characterized in that the resynthesis method with register retiming for the multiple-input multiple-output look-up table cone set is:
(a2) first move the registers forward to create a valid cone set, so that the cone-set circuit contains no registers. If one of the several paths driven by the output lacks a register, a register is borrowed at that location so that the forward move can be carried out. The location of the borrow is marked -1.
(b2) perform resynthesis on the cone-set circuit;
(c2) decide whether to keep the resynthesis result according to the validity of retiming. Determine whether, after resynthesis, every input port of the look-up tables whose registers must be moved to the output side has a register:
If so, the registers on all input ports are moved toward the output side, which offsets the register borrowed in step (a2); if not, the resynthesis scheme is discarded and the circuit is restored to its state before resynthesis.
9. The resynthesis operation method for multi-pipeline-stage sequential circuits based on register retiming according to any one of claims 1, 7 or 8, characterized in that, in step 5, when the available time margin of the current pipeline stage is positive:
If the resynthesis does not cross a register: resynthesis may still lengthen a path and reduce the time margin of the corresponding pipeline stage, and may even make the local time margin of this pipeline stage negative; if the total time margin from this pipeline stage to the pipeline stages on the output side is positive, the time margin of subsequent pipeline stages can be passed to the earlier pipeline stage by subsequent register retiming;
If the resynthesis crosses a register: if the local time margin of the current pipeline stage is positive, the scheme is adopted and the registers keep their original positions; if the local time margin of the current pipeline stage is negative, the registers are retimed toward the input side until the local time margin is greater than or equal to zero.
CN201811587490.4A 2018-12-25 2018-12-25 Register retiming-based multi-pipeline sequential circuit resynthesis operation method Active CN109815545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811587490.4A CN109815545B (en) 2018-12-25 2018-12-25 Register retiming-based multi-pipeline sequential circuit resynthesis operation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811587490.4A CN109815545B (en) 2018-12-25 2018-12-25 Register retiming-based multi-pipeline sequential circuit resynthesis operation method

Publications (2)

Publication Number Publication Date
CN109815545A true CN109815545A (en) 2019-05-28
CN109815545B CN109815545B (en) 2023-04-07

Family

ID=66602448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811587490.4A Active CN109815545B (en) 2018-12-25 2018-12-25 Register retiming-based multi-pipeline sequential circuit resynthesis operation method

Country Status (1)

Country Link
CN (1) CN109815545B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112580286A (en) * 2020-12-18 2021-03-30 广东高云半导体科技股份有限公司 Multithreading synthesis method and device
CN115048889A (en) * 2022-08-16 2022-09-13 井芯微电子技术(天津)有限公司 Asynchronous path extraction method and system based on back-end time sequence convergence simulation
CN117807953A (en) * 2023-12-29 2024-04-02 苏州异格技术有限公司 Chip delay optimization method and device, computer equipment and storage medium
CN118070724A (en) * 2024-03-06 2024-05-24 苏州异格技术有限公司 FPGA delay optimization method and device, computer equipment and storage medium
CN118070724B (en) * 2024-03-06 2024-07-02 苏州异格技术有限公司 FPGA delay optimization method and device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6373279B1 (en) * 2000-05-05 2002-04-16 Xilinx, Inc. FPGA lookup table with dual ended writes for ram and shift register modes
US7120883B1 (en) * 2003-05-27 2006-10-10 Altera Corporation Register retiming technique
CN103324774A (en) * 2012-12-29 2013-09-25 东南大学 Processor performance optimization method based on clock planning deviation algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YANG GUO;: "A Distributed Register File Architecture Based on Dynamic Scheduling for VLIW Machine", 《IEEE》 *
廖启文: "面向5G通信的高速PAM4信号时钟与数据恢复技术", 《中兴通讯技术》 *
李鹏等: "以时间裕量为参数的时序电路再综合算法", 《计算机辅助设计与图形学学报》 *

Also Published As

Publication number Publication date
CN109815545B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
US9041430B2 (en) Operational time extension
US9183344B2 (en) Timing operations in an IC with configurable circuits
US8683410B2 (en) Operational cycle assignment in a configurable IC
US7496879B2 (en) Concurrent optimization of physical design and operational cycle assignment
CN109815545A (en) More pipelining-stage sequence circuits when being reset based on register synthetic operation method again
Singh et al. PITIA: an FPGA for throughput-intensive applications
US7219048B1 (en) Methodology and applications of timing-driven logic resynthesis for VLSI circuits
Xu et al. A design methodology for application-specific networks-on-chip
CN109804385A (en) Binary neural network on programmable integrated circuit
US10615800B1 (en) Method and apparatus for implementing configurable streaming networks
Abbas et al. Latency insensitive design styles for FPGAs
Pontes et al. Hermes-A–an asynchronous NoC router with distributed routing
Muñoz-Martínez et al. STONNE: A detailed architectural simulator for flexible neural network accelerators
Swarbrick et al. Versal network-on-chip (NoC)
Heißwolf A scalable and adaptive network on chip for many-core architectures
Chen et al. Technology mapping and clustering for FPGA architectures with dual supply voltages
US20140347096A1 (en) Non-lut field-programmable gate arrays
CN109800468A (en) A kind of more pipelining-stage sequence circuit incasement operation methods when being reset based on register
Dinh et al. A routing approach to reduce glitches in low power FPGAs
US8386983B1 (en) Parallel signal routing
Pu et al. Power and area efficient router with automated clock gating for neuromorphic computing
Jackson et al. Implementing asynchronous embryonic circuits using AARDVArc
CN109766293A (en) Connect the circuit and System on Chip/SoC of FPGA and artificial intelligence module on chip
Bostelmann et al. A conceptual toolchain for an application domain specific reconfigurable logic architecture
Betz et al. Background and Previous Work

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant