CN116502578A - Construction method of netlist reduction time sequence model and static time sequence analysis method - Google Patents

Info

Publication number
CN116502578A
Authority
CN
China
Prior art keywords
time sequence
fpga
netlist
path
delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310777310.3A
Other languages
Chinese (zh)
Other versions
CN116502578B (en
Inventor
李艳荣
孙亚强
王俊杰
邹炎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Guoweijingrui Technology Co ltd
Original Assignee
Shenzhen Guoweijingrui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Guoweijingrui Technology Co ltd filed Critical Shenzhen Guoweijingrui Technology Co ltd
Priority to CN202310777310.3A priority Critical patent/CN116502578B/en
Publication of CN116502578A publication Critical patent/CN116502578A/en
Application granted granted Critical
Publication of CN116502578B publication Critical patent/CN116502578B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/33 Design verification, e.g. functional simulation or model checking
    • G06F30/3308 Design verification, e.g. functional simulation or model checking using simulation
    • G06F30/331 Design verification, e.g. functional simulation or model checking using simulation with hardware acceleration, e.g. by using field programmable gate array [FPGA] or emulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/33 Design verification, e.g. functional simulation or model checking
    • G06F30/3308 Design verification, e.g. functional simulation or model checking using simulation
    • G06F30/3312 Timing analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/33 Design verification, e.g. functional simulation or model checking
    • G06F30/3315 Design verification, e.g. functional simulation or model checking using static timing analysis [STA]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/34 Circuit design for reconfigurable circuits, e.g. field programmable gate arrays [FPGA] or programmable logic devices [PLD]
    • G06F30/343 Logical level

Abstract

The invention discloses a method for constructing a netlist reduction time sequence model and a static timing analysis method. The construction method, based on a multi-FPGA system, comprises the following steps: step 1, reading the gate-level netlist of each FPGA generated after the overall circuit design is partitioned, and searching each gate-level netlist for timing paths affected by the connection delay between FPGAs; step 2, classifying the timing paths by their start points and, within each class, selecting for each clock domain the timing paths whose delay values are greater than or equal to the corresponding delay threshold; and step 3, generating a simplified time sequence model netlist of the overall circuit design from the selected timing paths. The invention reduces the timing paths of the target netlist, thereby improving the performance of static timing analysis.

Description

Construction method of netlist reduction time sequence model and static time sequence analysis method
Technical Field
The invention relates to the technical field of static time sequence analysis of an FPGA (field programmable gate array), in particular to a method for constructing a netlist simplified time sequence model based on a multi-FPGA system and a static time sequence analysis method.
Background
As chip designs grow in scale and circuit complexity, designs with billions of gates are no longer rare. For such ultra-large digital designs, software simulation takes a long time to produce results, and its CPU time is unacceptable for typical very large scale digital integrated circuits. A hardware-accelerated emulator and prototype verification system based on multiple FPGAs works by partitioning the user's oversized design into several parts according to the number of FPGAs on the emulator; the FPGAs then operate together to run all the logic of the user design and verify its correctness. A multi-FPGA hardware emulation system can greatly shorten the simulation time of a design.
Static timing analysis is a technique in circuit design and verification for analyzing timing requirements such as arrival time, hold time, clock interval, etc. of different signals in a circuit to determine the operational performance and correctness of the circuit. Static timing analysis plays an important role in circuit design and verification, and can detect and diagnose timing faults and bad designs in a circuit, thereby helping designers optimize circuit performance and improve circuit reliability. The static timing analysis for a multi-FPGA system is a critical step and a step with technical challenges.
Because STA (static timing analysis) is an exhaustive method, it computes the setup time, hold time, and other path-based delay requirements of every flip-flop in the circuit and checks whether they are met, based on the requirements of synchronous circuit design and the topology of the circuit netlist. For very large scale digital integrated circuits that need a multi-FPGA system for verification, a single static timing analysis run takes a long CPU time.
For static timing analysis of a multi-FPGA system, a timing analysis model must be built at the system level to obtain the system's timing performance. The prior art reconnects the netlists of the individual FPGAs into one overall system according to the original design and then performs conventional timing analysis on it. This requires checking not only whether the setup and hold times of timing paths inside each FPGA meet the timing requirements, but also whether timing paths that cross two partitioned FPGAs still meet the setup and hold requirements once the interconnect delay between the FPGAs is introduced.
The CPU time and RAM consumed by static timing analysis of a very large scale digital integrated circuit far exceed those of a small design. In a multi-FPGA system, the connection delays between FPGAs change constantly during the iterations of timing-driven tools such as the path planning tool Route and the placement tool. The drawback of the prior art is that CPU time grows linearly with the number of timing analysis iterations performed on such a design. Static timing analysis of a multi-FPGA system for a very large scale integrated circuit therefore struggles to meet CPU time requirements.
Therefore, a method is needed to simplify the target netlist of static timing analysis to reduce CPU time and RAM usage.
Disclosure of Invention
To solve the technical problem that, in the prior art, static timing analysis performed directly on the netlist of the overall circuit design struggles to meet time and performance requirements, the invention provides a method for constructing a netlist reduction time sequence model and a static timing analysis method.
The invention provides a method for constructing a netlist reduction time sequence model based on a multi-FPGA system, which comprises the following steps:
step 1, reading gate-level netlists based on each FPGA, which are generated after the whole circuit design is divided, and searching a time sequence path which is influenced by connection delay between the FPGAs in each gate-level netlist;
step 2, classifying the time sequence paths according to the starting points of the time sequence paths, and selecting the time sequence paths with delay values in each clock domain under each classification being greater than or equal to the corresponding delay threshold value;
and step 3, generating a simplified sequential model netlist of the overall circuit design based on the selected sequential paths.
Further, the step 1 at least includes:
and respectively taking an input port and an output port of each FPGA as starting points, searching a time sequence path from each starting point to the time sequence device, and recording a delay value corresponding to the time sequence path and a corresponding clock domain of the time sequence device.
Further, when the input port of the FPGA is directly connected with the output port of the FPGA, the input port or the output port of each FPGA is taken as each starting point, a time sequence path from each starting point to the other port which is directly connected is searched, and a delay value corresponding to the time sequence path is recorded.
Further, when the input port of each FPGA is used as a starting point, a depth-first search algorithm or a breadth-first search algorithm is adopted to search the time sequence path from each starting point to the time sequence device.
Further, when the output port of each FPGA is used as a starting point, a depth-first search algorithm or a breadth-first search algorithm is adopted to search the time sequence path from each starting point to the time sequence device.
Further, according to the interconnection line delay calculation model and liberty library information of the FPGA device, a delay value corresponding to the time sequence path is obtained.
Further, the interconnection line delay calculation model specifically uses the formula $\text{net delay} = \frac{\text{Rwire}}{N}\left(\frac{\text{Cwire}}{N} + \text{Cpin}\right)$ for calculation.
Further, in the step 2, the corresponding delay threshold in each clock domain under each category is selected as the maximum delay value of all the timing paths of the corresponding clock domain under the corresponding category.
Further, in the step 2, the selected timing path is replaced by a connection with a delay value.
In the static timing analysis method based on a multi-FPGA system provided by the invention, the gate-level netlist corresponding to the overall circuit design is simplified with the above method for constructing the netlist reduction time sequence model to obtain a simplified time sequence model netlist, and static timing analysis is then performed on the simplified time sequence model netlist.
Conventional static timing analysis analyzes all paths of the whole design and merely introduces the inter-FPGA wire delays into the timing graph. This consumes a large amount of CPU time and RAM; for very large scale digital integrated circuits the CPU time is so long that other tools cannot call the static timing analysis function iteratively. By contrast, the invention reduces the overall circuit design to a set of timing paths based on the multiple FPGAs, so the number of analyzed paths is far smaller than in conventional static timing analysis of the original design. CPU time and RAM usage are reduced accordingly, and because each analysis takes much less time, other tools can call static timing analysis iteratively. Those tools can then work in a timing-driven manner, which raises the operating frequency of the final design on the multi-FPGA system.
Drawings
The invention is described in detail below with reference to examples and figures, wherein:
FIG. 1 is a main flow chart of the present invention;
FIG. 2 is a flow chart of a static timing analysis according to an embodiment of the present invention;
FIG. 3 is a simplified schematic of the timing diagram of the present invention;
FIG. 4 is a diagram of an original gate level netlist in accordance with one embodiment of the present invention;
FIG. 5 is a gate level netlist diagram of an FPGA after simplification according to one embodiment of the invention;
FIG. 6 is a diagram of a net list of multiple FPGAs prior to simplification in accordance with an embodiment of the present invention;
FIG. 7 is a simplified netlist diagram of a multi-FPGA according to one embodiment of the invention.
Detailed Description
In order to make the technical problems, technical schemes and beneficial effects to be solved more clear, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Reference in this specification to a feature is intended to describe one embodiment of the invention and does not imply that every embodiment must have that feature. Furthermore, it should be noted that this specification describes a number of features. Although certain features may be combined to illustrate a possible system design, they may also be used in other combinations not explicitly described. Thus, unless otherwise indicated, the illustrated combinations are not intended to be limiting.
Before introducing the method for constructing the netlist simplified time sequence model based on the multi-FPGA system, the implementation background of the invention is introduced.
The method for constructing the netlist reduction time sequence model mainly targets large-scale overall circuit designs, which often have to be placed on several FPGAs for emulation. In the prior art, a partition tool divides the large circuit design according to the hardware resources of the hardware emulation accelerator and, after partitioning, generates data such as the gate-level netlist of each FPGA, the connection relations among all FPGA ports (input ports and output ports), and the SDC (Synopsys Design Constraints) constraints.
The invention discloses a method for constructing a netlist reduction time sequence model based on a multi-FPGA system, which mainly comprises the following three steps.
Step 1, reading gate-level netlists based on each FPGA, which are generated after the whole circuit design is divided, and searching time sequence paths, which are influenced by connection delay between the FPGAs, in each gate-level netlist.
There are many kinds of wiring in the gate level netlist, such as wiring between logic devices, wiring between combinational logic (i.e., a collection of interconnected logic devices) and FPGA ports, and so forth. Meanwhile, as the whole circuit design is divided into different FPGAs, connecting lines exist between the FPGAs, and the simulation of the whole circuit design needs to transmit corresponding signals through the connecting lines between the FPGAs.
The invention selects the timing paths of interest according to the characteristics of the different connections. It assumes that the delay values and timing violations inside each FPGA are fixed, while the delay of a connection between two FPGAs is usually much larger than delays inside an FPGA and often lies on a critical path once the overall circuit design is placed on a multi-FPGA emulation system. The invention therefore mainly searches for timing paths that cross FPGAs, since these are the paths affected by the connection delay between FPGAs. Whether the timing of paths internal to an FPGA meets the requirements is checked by the FPGA vendor's STA tool.
Step 2, after finding all the time sequence paths which are affected by the connection delay between the FPGAs in the step 1, classifying the time sequence paths according to the starting points of the time sequence paths, and selecting the time sequence paths with delay values greater than or equal to the corresponding delay threshold value in each clock domain under each classification.
In a preferred embodiment, the invention selects, for each clock domain under each class, the timing path with the largest delay value, i.e. the longest path. In other embodiments, the longest and second-longest timing paths may be selected for each clock domain under each class, or more generally the N longest timing paths, where N is less than the number of timing paths of that clock domain under that class. In a further embodiment, each selected timing path may be replaced by a connection annotated with a delay value: the path becomes a single wire in the reduced netlist, and its delay value is set in the timing graph only during timing analysis. Keeping only the longest path (i.e. the largest delay) per FPGA port and clock domain greatly lightens the CPU and RAM load. If the second- or third-longest timing paths are also kept, the function is unaffected, but CPU time and RAM usage may multiply. This is still far cheaper than analyzing the whole original circuit, because the number of paths remains much smaller.
Although every timing path that crosses FPGAs is affected by the connection delay between FPGAs, the invention does not treat all of them as critical. At the FPGA interface level, step 2 discards the cross-FPGA paths that are not considered critical, along with most paths inside the FPGAs. Only the longest timing path of each clock domain under each class is regarded as critical and kept, which reduces the number of paths to analyze and hence the CPU time and RAM usage of the subsequent static timing analysis.
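The selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the record layout (start point, end point, clock domain, delay in ns), the function name select_critical_paths, and the sample register names and delay values are all assumptions made for the example.

```python
import heapq
from collections import defaultdict

def select_critical_paths(paths, n=1):
    """Keep the n longest-delay paths for every (start point, clock domain) pair.

    `paths` holds (start_point, end_point, clock_domain, delay_ns) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    groups = defaultdict(list)
    for path in paths:
        start, _end, clock, _delay = path
        groups[(start, clock)].append(path)
    selected = []
    for key in sorted(groups):  # deterministic output order
        selected.extend(heapq.nlargest(n, groups[key], key=lambda p: p[3]))
    return selected

# Hypothetical paths found from one input port
paths = [
    ("in1", "regA", "clk1", 2.0),
    ("in1", "regB", "clk1", 6.0),
    ("in1", "regC", "clk1", 4.5),
    ("in1", "regD", "clk2", 1.0),
]
longest_only = select_critical_paths(paths, n=1)
two_longest = select_critical_paths(paths, n=2)
```

With n=1 this corresponds to the preferred embodiment (one path per start point and clock domain); larger n keeps more paths at the cost of a larger reduced model.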
Step 3, generating a simplified time sequence model netlist of the overall circuit design based on the selected timing paths. This netlist is a simplified timing model used for static timing analysis and differs from the original netlist of the circuit design.
In step 1, the input ports and output ports of each FPGA are used as start points; the timing paths from each start point to the sequential devices are searched, and the delay value of each timing path and the clock domain of its sequential device are recorded. The clock domain of each sequential device can be determined from the SDC constraints while searching paths from the ports into a single FPGA. For example, first take input port in1 of FPGA1 as the start point and traverse every sequential device connected to in1 to obtain the timing paths from in1 to those devices; the delay value of each timing path is calculated during the traversal, and the delay value and the clock domain of the sequential device are then recorded. Next, input port in2 of FPGA1 is used as the start point, and so on until every input port of FPGA1 has been traversed. Each output port of FPGA1 is then used as a start point in the same way. Once FPGA1 is finished, the next FPGA is processed similarly until all FPGAs have been processed.
In some FPGAs, an input port may be directly connected to an output port of the same FPGA. Here, "directly connected" means that the connection between the input port and the output port contains no sequential device; it does not mean that the two ports are joined by a bare wire. If such a connection exists, step 1 must also find the corresponding timing paths, as follows.
When an input port of an FPGA is directly connected to an output port of the same FPGA, the input port or the output port is taken as the start point, the timing path from that start point to the other directly connected port is searched, and the delay value of the timing path is recorded. For example, the input port of the FPGA may be used as the start point, and the timing path from that input port to the directly connected output port of the same FPGA is searched. In another embodiment, the output port of the FPGA may be used as the start point, and the timing path from that output port to the directly connected input port of the same FPGA is searched. Both embodiments find the same timing path, whether the input port or the output port is used as the start point.
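As a rough illustration of the traversal described above, the following Python sketch runs a depth-first search from one input port and records, for each sequential device reached, its clock domain and the longest accumulated delay. The graph encoding, the function name find_port_to_register_paths, the gate names g1 and g2, and the edge delays are hypothetical; the sketch also assumes that the combinational logic between the port and the registers is acyclic.

```python
def find_port_to_register_paths(graph, clock_of, start):
    """Depth-first search from `start` to every reachable sequential device.

    graph:    dict mapping a node to a list of (next_node, edge_delay_ns)
    clock_of: dict mapping each sequential device to its clock domain
    Returns {(register, clock_domain): longest_delay_ns}.
    Assumes the combinational logic is acyclic (no combinational loops).
    """
    longest = {}
    stack = [(start, 0.0)]
    while stack:
        node, delay = stack.pop()
        if node in clock_of:  # a sequential device terminates the path
            key = (node, clock_of[node])
            longest[key] = max(longest.get(key, 0.0), delay)
            continue
        for nxt, edge_delay in graph.get(node, []):
            stack.append((nxt, delay + edge_delay))
    return longest

# Hypothetical fragment of one FPGA: in1 drives two gates and two registers
graph = {
    "in1": [("g1", 1.0), ("g2", 0.5)],
    "g1": [("reg1", 2.0), ("g2", 1.5)],
    "g2": [("reg2", 2.5)],
}
clock_of = {"reg1": "clk1", "reg2": "clk1"}
found = find_port_to_register_paths(graph, clock_of, "in1")
```

The sketch enumerates every path, which is acceptable for a small example; a production tool would more likely propagate arrival times in topological order.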
In step 1, the delay value of each timing path is obtained from the interconnection line delay calculation model and the Liberty library information of the FPGA device. In other embodiments, the delay value may instead be queried from the FPGA manufacturer's tool. The interconnection line delay calculation model computes interconnect delay with the RC distribution model policy_tree formula, namely formula 1 below.
$\text{net delay} = \frac{\text{Rwire}}{N}\left(\frac{\text{Cwire}}{N} + \text{Cpin}\right)$ (formula 1)
The net delay is a delay value of a corresponding connection line, such as a delay value from an input port of a certain FPGA to a certain logic device connected with the same FPGA, or a delay value from an input port of a certain FPGA to an output port of the same FPGA directly connected with the same FPGA, or a delay value from a certain logic device to a certain output port of the FPGA where the logic device is located.
Rwire is the wire resistance of the connection between logic devices. Cwire is the wire capacitance of that connection. N is the number of loads. Cpin is the capacitance of each load pin. If an input port of an FPGA is directly connected to an output port of the same FPGA, the delay between them includes both the delay of the connections between logic devices (calculated with formula 1) and the delay inside the combinational logic devices; the latter is taken from data provided by the FPGA manufacturer or estimated approximately.
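Formula 1 is simple enough to evaluate directly. The sketch below is a minimal Python rendering of it; the function name net_delay, the choice of units (kilo-ohms and picofarads, which give nanoseconds), and the numeric values are assumptions made for illustration only.

```python
def net_delay(r_wire, c_wire, c_pin, n_loads):
    """Formula 1: net delay = (Rwire / N) * (Cwire / N + Cpin).

    With r_wire in kilo-ohms and capacitances in picofarads the result is in
    nanoseconds (kOhm * pF = ns). The unit choice is an assumption.
    """
    if n_loads < 1:
        raise ValueError("a net needs at least one load")
    return (r_wire / n_loads) * (c_wire / n_loads + c_pin)

# Hypothetical net: 2 kOhm wire resistance, 0.4 pF wire capacitance,
# 0.05 pF per load pin, 2 loads
example_delay_ns = net_delay(r_wire=2.0, c_wire=0.4, c_pin=0.05, n_loads=2)
```

For these hypothetical values the wire resistance and capacitance are split evenly across the two loads, so each load sees 1 kOhm driving 0.2 pF + 0.05 pF.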
As can be seen from the description of step 1, the start point (start point) of the timing path in step 2 includes the input port and the output port of each FPGA.
By definition, a timing path consists of a start point, wire delays, cell delays, and an end point. At the start point, data is launched by a clock edge; it propagates through the combinational logic in the timing path and is captured by another clock edge at the end point. Therefore, in step 2, regardless of what happens in the other FPGAs, once a timing path reaches an input port in1 of fpga_n, suppose the longest path to clock clk1 is in1 -> reg2 and the longest path to clock clk2 is in1 -> reg3. The invention treats only these paths as critical timing paths and ignores the others. This is the basic principle behind the invention's timing model simplification.
In the preferred embodiment, step 2 keeps only the longest path from each start point to an end point within each clock domain. A clock domain is the "sphere of influence" of a clock signal. A clock domain contains only one clock signal, but one clock signal can correspond to at most two clock domains, which occurs when some resources are sensitive to its rising edge and others to its falling edge. To determine which clock domain a sequential device belongs to, it is enough to see which clock signal is connected to its clock input port and which edge it is sensitive to. A clock signal directly controls the registers in its clock domain and indirectly controls some combinational logic resources, because the inputs of combinational logic in an FPGA are often register outputs. Since combinational logic is not directly controlled by a clock signal, determining the clock domain of a block of combinational logic requires analyzing all of its inputs. If all inputs come from the outputs of registers in the same clock domain, the output of the block necessarily changes in step with that clock, so the block can be assigned to that clock domain. Otherwise, the block has an asynchronous or cross-clock-domain problem and does not belong to any clock domain.
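The clock domain rule for combinational logic can be expressed as a small check. In the Python sketch below a clock domain is modeled as a (clock signal, active edge) pair, which follows the text's statement that one clock signal may give rise to a rising-edge domain and a falling-edge domain; the function name combinational_domain and this tuple encoding are assumptions for illustration.

```python
def combinational_domain(input_domains):
    """Return the clock domain shared by all inputs of a combinational block.

    input_domains: iterable of (clock_signal, active_edge) pairs, one per
    register output that feeds the block.
    Returns the common domain, or None when the inputs come from different
    domains (an asynchronous or cross-clock-domain block) or when the block
    has no register-driven inputs.
    """
    domains = set(input_domains)
    return domains.pop() if len(domains) == 1 else None

same = combinational_domain([("clk1", "rise"), ("clk1", "rise")])
mixed_edges = combinational_domain([("clk1", "rise"), ("clk1", "fall")])
mixed_clocks = combinational_domain([("clk1", "rise"), ("clk2", "rise")])
```

Here `same` resolves to the rising-edge domain of clk1, while the two mixed cases resolve to no domain at all.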
According to the static time sequence analysis method based on the multi-FPGA system, after the gate-level netlist corresponding to the whole circuit design is obtained, the gate-level netlist corresponding to the whole circuit design is simplified by adopting the construction method of the netlist simplifying time sequence model based on the multi-FPGA system, which is disclosed by the technical scheme, so that the simplified time sequence model netlist is obtained, and then the static time sequence analysis is carried out on the simplified time sequence model netlist.
And in the static time sequence analysis process, determining the highest frequency at which each clock constrained in the SDC can run according to the result of the time sequence analysis.
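One common way to turn a timing result into a maximum frequency is to take the reciprocal of the minimum feasible clock period. The Python sketch below assumes that the minimum period of a clock equals its worst path delay plus a setup time, and ignores clock skew and uncertainty; this simplified relation, the function name max_frequency_mhz, and the sample delays are assumptions and are not taken from the patent text.

```python
def max_frequency_mhz(path_delays_ns, setup_ns=0.0):
    """Estimate the highest clock frequency from the worst path delay.

    Assumes min period = worst path delay + setup time (a simplification
    that ignores clock skew and uncertainty).
    """
    min_period_ns = max(path_delays_ns) + setup_ns
    return 1000.0 / min_period_ns  # a period of 1 ns corresponds to 1000 MHz

# Hypothetical delays of the paths kept for one clock in the reduced model
clk_fmax = max_frequency_mhz([3.0, 5.0])
```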
The ultimate purpose of the invention is to let path planning tools such as Route iteratively call timing analysis on the simplified gate-level netlist while their algorithms run. Such a tool preselects the routing path or TDM ratio (time division multiplexing ratio) of the signal lines between FPGAs and estimates the resulting delay values. These delay values are passed to the static timing analysis tool (STA), which performs one timing analysis and returns the timing result, telling Route and similar tools whether the selected routing path or TDM ratio meets the timing requirements.
The key point of the invention is therefore netlist simplification: only when the iterative timing analysis runs on the simplified netlist can CPU time be reduced. The interaction between the path planning tool Route and the static timing analysis tool STA operates on the simplified timing graph of the simplified time sequence model netlist. The connection delay between FPGAs is set through the API, and the STA tool then returns the critical path based on that delay. Route then adjusts the delay of the corresponding FPGA connection according to the critical path (i.e., by selecting another TDM ratio) until the timing requirement is met. The TDM ratio itself is not selected by the present invention.
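The interaction between the routing tool and the STA tool can be pictured with a toy loop. Everything in the Python sketch below is hypothetical: the text does not specify the API, so the functions sta_worst_delay and choose_tdm_ratio, the linear relation between TDM ratio and link delay, and all numeric values are assumptions used only to show the iterate-until-timing-is-met pattern.

```python
def sta_worst_delay(reduced_paths_ns, link_delay_ns):
    """Toy STA on the reduced model: worst (internal delay + inter-FPGA delay)."""
    return max(internal + link_delay_ns for internal in reduced_paths_ns)

def choose_tdm_ratio(reduced_paths_ns, period_ns, ratios, ns_per_ratio_step):
    """Try TDM ratios from highest to lowest until the clock period is met.

    Assumes link delay grows linearly with the TDM ratio, which is a
    simplification made for this example.
    """
    for ratio in sorted(ratios, reverse=True):
        link_delay = ratio * ns_per_ratio_step
        if sta_worst_delay(reduced_paths_ns, link_delay) <= period_ns:
            return ratio
    return None  # no candidate ratio meets timing

# Hypothetical reduced model with two kept paths and a 20 ns clock period
chosen = choose_tdm_ratio([5.0, 4.0], period_ns=20.0,
                          ratios=[2, 4, 8, 16], ns_per_ratio_step=1.5)
```

Because each call to the toy STA only looks at the few paths kept in the reduced model, repeating it for every candidate ratio stays cheap, which is the point the text makes about iterative calls.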
The inventive concept will be described in detail below in connection with specific circuit designs.
Fig. 2 shows the overall flow of static timing path analysis. After a large circuit design is divided by a partition tool, the resulting gate-level netlist of each FPGA, the connection relations among the FPGA ports (input ports and output ports), and other data are used for the timing model simplification of the invention. After the netlist is simplified by the multi-FPGA netlist simplification method of the invention, the corresponding paths and delay values are passed to the STA tool. The STA tool performs one timing analysis and returns the analysis result (timing result), telling tools such as Route whether the selected routing path or TDM ratio of the signal lines meets the timing requirements.
Fig. 3 shows a timing path reduction schematic illustration of the present invention. The invention is based on a theory similar to the interface timing model, when signals from other FPGAs enter fpga_n through input port in1 of fpga_n, the following 3 paths are formed.
The register reg1, the sequential device under the clock domain of input ports in1 to clk1, has a delay value (delay value) of 3ns.
The register reg2, the sequential device under the clock domain of input ports in1 to clk1, has a delay value (delay value) of 5ns.
The register reg3, the sequential device under the clock domain of the input ports in1 to clk2, has a delay value (delay value) of 4ns.
A timing path is composed of a start point, wire delays, cell delays, and an end point. The start point is where data is launched by a clock edge in the design; the data propagates through the combinational logic in the timing path and is then captured by another clock edge at the end point. Therefore, whatever the situation in the other FPGAs, once a timing path reaches input port in1 of fpga_n, the longest path into clock domain clk1 is in1 -> reg2, and the longest path into clock domain clk2 is in1 -> reg3. The present invention ignores the path in1 -> reg1, because it can never become a critical path. This is the basic theory behind the timing modeling and simplification of the invention.
Consider the paths from input port in1. First look at sequential devices reg1 and reg2, which belong to the same clock domain clk1. Because the route ahead of input port in1 is a common part (the line coming from the other FPGA), the timing path through in1 to reg2 (5 ns + common part) is longer than the path to reg1 (3 ns + common part): 5 ns + common part > 3 ns + common part. The path to reg1 can therefore be removed without any problem.
The clk2 clock domain, by contrast, contains only sequential device reg3, so that path is retained regardless of its delay.
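The comparison above can be stated compactly. In the following sketch, d_c (the delay of the common part ahead of in1) and d_i (the delay from in1 to register i) are symbols introduced here for illustration only; they do not appear in the original text.

```latex
\max_{i \in \mathrm{clk1}} \left( d_c + d_i \right)
  = d_c + \max_{i \in \mathrm{clk1}} d_i ,
\qquad\text{so}\qquad
d_c + 5\,\mathrm{ns} > d_c + 3\,\mathrm{ns}
\quad \text{for every } d_c .
```

Because the common part adds the same amount to every path that enters through in1, the longest path within a clock domain is the same whatever the value of d_c, which is why only in1 -> reg2 needs to be kept for clk1.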
Moreover, a timing path runs from a start point to an end point, and for a given path delay, whether the path is a critical path depends largely on the clock domains of its start point and end point.
Fig. 4 shows an example of an original FPGA gate-level netlist.
Taking the original gate-level netlist as an example, step 1 reads the gate-level netlist file of each FPGA, searches for the required path data, and calculates the corresponding delay information, namely the delay values. In this embodiment, step 1 specifically comprises the following steps.
Step 1.1: take each input port of the FPGA as a seed point (that is, a start point), find all paths from input ports to sequential devices, and record each path length together with the clock domain of the sequential device.
For Fig. 4, a DFS (depth-first search) or breadth-first search algorithm is run with input port in1 and input port in2 as the respective start (seed) points. From in1, the search reaches sequential devices reg1 and reg2 in the clk1 clock domain. The delays from in1 to reg1 and reg2 are calculated from the interconnect delay calculation model (Equation 1) and the liberty library information of the FPGA device. With input port in2 as the seed point, the search reaches sequential device reg2 in the clk1 clock domain and sequential devices reg4 and reg5 in the clk2 clock domain. The corresponding delay values are likewise calculated from the interconnect delay calculation model and the liberty library information of the FPGA device.
Step 1.2: take each output port of the FPGA as a seed point. A DFS (depth-first search) or breadth-first search algorithm searches backward for all paths from sequential devices to output ports, and records the delay value of each path and the clock domain of the sequential device. Starting from output port out1, sequential device reg3 in the clk2 clock domain is found. Starting from output port out2, sequential devices reg4 and reg5 in the clk2 clock domain are found. The delay values are calculated with Equation 1.
Step 1.3: take each input port of the FPGA as a seed point, search all paths from input ports to output ports, and record the delay values of the paths. This step finds the path from input port in2 to output port out2. The delay value is calculated with Equation 1.
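As a hedged illustration of the interconnect delay calculation referred to in these steps, the sketch below assumes that Equation 1 has the form recited in claim 7, read as (Rwire / N) × (Cwire / N + Cpin), which resembles a balanced-tree wire-load estimate. The function name `net_delay` and all numeric values are hypothetical and chosen only for illustration.

```python
def net_delay(r_wire: float, c_wire: float, c_pin: float, n_loads: int) -> float:
    """Net delay as (Rwire / N) * (Cwire / N + Cpin).

    With resistance in kilo-ohms and capacitance in picofarads,
    the product is in nanoseconds (1 kOhm * 1 pF = 1 ns).
    """
    if n_loads < 1:
        raise ValueError("a net needs at least one load")
    return (r_wire / n_loads) * (c_wire / n_loads + c_pin)


# Hypothetical values: 0.4 kOhm wire, 0.2 pF wire, 0.05 pF per load pin, 2 loads.
delay_ns = net_delay(r_wire=0.4, c_wire=0.2, c_pin=0.05, n_loads=2)
print(f"{delay_ns:.3f} ns")
```

The cell delays along a path would come from the liberty library rather than from this formula; the sketch covers only the wire contribution.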
Flip-flop to flip-flop path logic inside the FPGA is ignored, because the timing of these paths is not affected by changes in the interconnect delay between the FPGAs. A sequential device in the present invention is a device in a digital circuit whose value is updated under clock control, including but not limited to registers, counters, and flip-flops.
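The forward search of step 1.1 can be sketched as a depth-first traversal that stops at the first sequential device on each branch and keeps the largest accumulated delay per device. This is a minimal sketch under stated assumptions: the netlist fragment, the node names g1 and g2, and the edge delays are hypothetical and are not the circuit of Fig. 4, and the combinational logic is assumed to contain no loops.

```python
# Hypothetical netlist fragment: node -> [(successor, delay in ns)].
# Each edge delay stands in for the cell and net delay computed in step 1.
EDGES = {
    "in1": [("g1", 1.0), ("g2", 2.0)],
    "g1": [("reg1", 2.0), ("g2", 1.5)],
    "g2": [("reg2", 3.0)],
}
# Sequential devices and their clock domains (assumed for illustration).
REG_CLOCK = {"reg1": "clk1", "reg2": "clk1"}


def max_delay_to_registers(start):
    """DFS from a port; return {register: (clock domain, max delay in ns)}."""
    best = {}

    def dfs(node, delay):
        if node in REG_CLOCK:
            # A sequential device ends the path; keep the longest arrival.
            if node not in best or delay > best[node][1]:
                best[node] = (REG_CLOCK[node], delay)
            return
        for succ, edge_delay in EDGES.get(node, []):
            dfs(succ, delay + edge_delay)

    dfs(start, 0.0)
    return best


result = max_delay_to_registers("in1")
print(result)
```

Enumerating every path this way is simple but can be expensive on large netlists; a production tool would more likely propagate arrival times in topological order. The backward search of step 1.2 would follow the same pattern on the reversed edges.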
Applying the search algorithm of the above steps to Fig. 4 yields the following paths.
Input port in1 to sequential device reg1 in the clk1 clock domain: maximum delay value 5 ns.
Input port in1 to sequential device reg2 in the clk1 clock domain: maximum delay value 7 ns.
Input port in2 to sequential device reg2 in the clk1 clock domain: maximum delay value 7 ns.
Input port in2 to sequential device reg4 in the clk2 clock domain: maximum delay value 9 ns.
Input port in2 to sequential device reg5 in the clk2 clock domain: maximum delay value 4 ns.
Sequential device reg3 in the clk2 clock domain to output port out1: maximum delay value 6 ns.
Sequential device reg4 in the clk2 clock domain to output port out2: maximum delay value 2 ns.
Sequential device reg5 in the clk2 clock domain to output port out2: maximum delay value 6 ns.
Input port in2 to output port out2: maximum delay value 14 ns.
Next, in step 2, critical path data are extracted and simplified based on the path data of step 1, as follows.
Step 2.1: for the paths from each input port to all sequential devices, keep only the longest path per clock domain.
Step 1 recorded, for each input port, the path length to every reachable sequential device and the clock domain of that device. The sequential devices are grouped by clock domain. For each input port and each clock-domain group, only the single record with the longest path is kept.
Step 2.2: for the paths from sequential devices to each output port, keep only the longest path per clock domain.
Step 1 recorded, for each output port, the path length to every sequential device reachable by the backward search and the clock domain of that device. The sequential devices are grouped by clock domain. For each output port and each clock-domain group, only the single record with the longest path is kept.
Step 2.3: for each input port to output port pair, keep only the longest path. Step 1 recorded the path lengths to the output ports that each input port can reach. Only the longest record is kept for each input port to output port pair.
Step 2.4: the key data extracted in the previous steps are simplified further. The logic along each longest path is replaced by a single wire carrying the path's delay value, which yields the timing model data of each FPGA.
The final extracted and reduced data are as follows.
[Deleted] Input port in1 to sequential device reg1 in the clk1 clock domain: maximum delay value 5 ns.
[Retained] Input port in1 to sequential device reg2 in the clk1 clock domain: maximum delay value 7 ns.
[Retained] Input port in2 to sequential device reg2 in the clk1 clock domain: maximum delay value 7 ns.
[Retained] Input port in2 to sequential device reg4 in the clk2 clock domain: maximum delay value 9 ns.
[Deleted] Input port in2 to sequential device reg5 in the clk2 clock domain: maximum delay value 4 ns.
[Retained] Sequential device reg3 in the clk2 clock domain to output port out1: maximum delay value 6 ns.
[Deleted] Sequential device reg4 in the clk2 clock domain to output port out2: maximum delay value 2 ns.
[Retained] Sequential device reg5 in the clk2 clock domain to output port out2: maximum delay value 6 ns.
[Retained] Input port in2 to output port out2: maximum delay value 14 ns.
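The selection rules of steps 2.1 to 2.3 can be sketched as a group-and-keep-maximum pass over the nine path records found in step 1. The record layout (source, destination, clock domain, delay in ns) and the helper name `keep_longest` are assumptions made for this illustration; the delay values are those listed above.

```python
def keep_longest(records, key):
    """Return, for each key, the single record with the largest delay."""
    best = {}
    for rec in records:
        k = key(rec)
        if k not in best or rec[3] > best[k][3]:
            best[k] = rec
    return list(best.values())


# (source, destination, clock domain, delay in ns), taken from step 1.
in_to_reg = [
    ("in1", "reg1", "clk1", 5), ("in1", "reg2", "clk1", 7),
    ("in2", "reg2", "clk1", 7), ("in2", "reg4", "clk2", 9),
    ("in2", "reg5", "clk2", 4),
]
reg_to_out = [
    ("reg3", "out1", "clk2", 6), ("reg4", "out2", "clk2", 2),
    ("reg5", "out2", "clk2", 6),
]
in_to_out = [("in2", "out2", None, 14)]  # port-to-port path, no clock domain

# Step 2.1: group by (input port, clock domain).
kept_in = keep_longest(in_to_reg, key=lambda r: (r[0], r[2]))
# Step 2.2: group by (output port, clock domain).
kept_out = keep_longest(reg_to_out, key=lambda r: (r[1], r[2]))
# Step 2.3: group by (input port, output port).
kept_thru = keep_longest(in_to_out, key=lambda r: (r[0], r[1]))

for rec in kept_in + kept_out + kept_thru:
    print(rec)
```

Applied to these records, the sketch drops in1 -> reg1 (5 ns) and in2 -> reg5 (4 ns). Following step 2.2 as stated, it also drops reg4 -> out2 (2 ns), since reg5 -> out2 (6 ns) is the longest clk2 path into out2.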
Then, in step 3, a simplified timing model netlist is generated for the whole circuit design.
Step 3.1: generate a simplified gate-level netlist for each FPGA from the model data of each FPGA generated in step 2. Taking Fig. 4 as an example, the gate-level netlist produced by the timing modeling simplification is shown in Fig. 5.
Step 3.2: generate a simplified timing model netlist for the whole design from the connection relations between the FPGAs and the simplified FPGA gate-level netlists. The simplified timing model netlist represents the combinational logic paths of the original gate-level netlist as wires annotated with delay values. Fig. 6 shows the connections between the FPGAs before the multi-FPGA timing simplification. Fig. 7 shows the connections between the FPGAs after the simplification. In Fig. 7, when FPGA2 and FPGA3 each need to feed a signal from one output port into FPGA4 at the same time, a logic device, an OR gate, is added to avoid multiple drivers. The OR gate serves only the static timing analysis and does not affect the original design.
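The multi-driver handling of step 3.2 might be sketched as follows. This is only an illustration under assumptions: the port names, the `stitch` helper, and the naming of the inserted OR node are hypothetical, and the connection list is not taken from Fig. 6 or Fig. 7.

```python
from collections import defaultdict


def stitch(connections):
    """Build top-level edges from (driver port, sink port) pairs.

    A sink port with more than one driver gets a synthetic OR node in
    front of it, so that every port in the model has a single driver.
    """
    drivers = defaultdict(list)
    for driver, sink in connections:
        drivers[sink].append(driver)

    edges = []
    for sink, sources in drivers.items():
        if len(sources) == 1:
            edges.append((sources[0], sink))
        else:
            or_node = "or_" + sink.replace(".", "_")
            edges.extend((src, or_node) for src in sources)
            edges.append((or_node, sink))
    return edges


# Hypothetical inter-FPGA connections: FPGA2 and FPGA3 both feed FPGA4.in1.
connections = [
    ("FPGA1.out1", "FPGA2.in1"),
    ("FPGA2.out1", "FPGA4.in1"),
    ("FPGA3.out1", "FPGA4.in1"),
]
top_edges = stitch(connections)
for edge in top_edges:
    print(edge)
```

The inserted node exists only in the timing model, which matches the statement that the OR gate does not affect the original design.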
In the static timing analysis method based on the multi-FPGA system, a timing graph is then generated from the reduced gate-level netlist produced in step 3. This timing graph is far smaller than the graph generated directly from the original design, so the CPU time and RAM usage of timing analysis are greatly reduced, and tools such as the path planning tool Route can call the STA iteratively and interactively. Because the reduced netlist contains far fewer paths than the original design, the reduced timing graph can determine in very little time whether any setup or hold violations exist on the cross-FPGA paths, or report the maximum frequency at which the design can run.
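To illustrate why a small timing graph is cheap to analyze, the sketch below propagates worst-case arrival times through a reduced graph in topological order and derives a slack and a frequency bound. It is a simplified sketch, not the STA tool's algorithm: the node names, the inter-FPGA wire delays, and the 25 ns clock period are hypothetical, and setup time, clock skew, and clock-to-output delay are ignored. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical reduced timing graph: (from, to, delay in ns).
EDGES = [
    ("A.reg3", "A.out1", 6.0),   # register to output port inside FPGA A
    ("A.out1", "B.in1", 10.0),   # inter-FPGA wire (depends on routing and TDM)
    ("B.in1", "B.reg2", 7.0),    # input port to register inside FPGA B
    ("C.reg5", "C.out2", 6.0),
    ("C.out2", "B.in2", 4.0),
    ("B.in2", "B.reg4", 9.0),
]


def arrival_times(edges):
    """Longest-path arrival time at every node of an acyclic timing graph."""
    preds = {}
    for src, dst, delay in edges:
        preds.setdefault(dst, []).append((src, delay))
        preds.setdefault(src, [])
    order = TopologicalSorter(
        {node: [src for src, _ in p] for node, p in preds.items()}
    ).static_order()
    arrival = {}
    for node in order:
        # Nodes without predecessors are launch points with arrival time 0.
        arrival[node] = max((arrival[s] + d for s, d in preds[node]), default=0.0)
    return arrival


arrival = arrival_times(EDGES)
worst_ns = max(arrival.values())
period_ns = 25.0                      # assumed clock period
slack_ns = period_ns - worst_ns       # positive slack: requirement met
fmax_mhz = 1000.0 / worst_ns          # upper bound from the worst path alone
print(worst_ns, slack_ns, round(fmax_mhz, 2))
```

Each node and edge is visited once, so the run time grows linearly with the size of the reduced graph, which is what makes repeated calls from a routing tool practical.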
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (10)

1. A method for constructing a netlist reduction timing model based on a multi-FPGA system, characterized by comprising the following steps:
step 1, reading the gate-level netlist of each FPGA generated after the overall circuit design is partitioned, and searching each gate-level netlist for timing paths that are affected by the connection delay between FPGAs;
step 2, classifying the timing paths according to their start points, and selecting, in each clock domain under each class, the timing paths whose delay value is greater than or equal to the corresponding delay threshold;
and step 3, generating a simplified timing model netlist of the overall circuit design based on the selected timing paths.
2. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 1, wherein step 1 at least comprises:
taking the input ports and the output ports of each FPGA as start points, searching for the timing paths from each start point to the sequential devices, and recording the delay value of each timing path and the clock domain of the corresponding sequential device.
3. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 2, wherein, when an input port of an FPGA is directly connected to an output port of that FPGA, the input port or the output port of each FPGA is taken as a start point, the timing path from each start point to the directly connected other port is searched, and the delay value of that timing path is recorded.
4. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 2, wherein, when an input port of each FPGA is taken as a start point, a depth-first search algorithm or a breadth-first search algorithm is used to search for the timing paths from each start point to the sequential devices.
5. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 2, wherein, when an output port of each FPGA is taken as a start point, a depth-first search algorithm or a breadth-first search algorithm is used to search for the timing paths from each start point to the sequential devices.
6. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 2 or 3, wherein the delay value of a timing path is obtained from an interconnect delay calculation model and the liberty library information of the FPGA device.
7. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 6, wherein the interconnect delay calculation model uses the formula net delay = (Rwire / N) × (Cwire / N + Cpin);
where net delay is the delay value of the corresponding connection, Rwire is the wire resistance of the connection between logic devices, Cwire is the wire capacitance of the connection between logic devices, N is the number of loads, and Cpin is the capacitance on each load pin.
8. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 1, wherein, in step 2, the delay threshold of each clock domain under each class is the maximum delay value of all timing paths of that clock domain under that class.
9. The method for constructing a netlist reduction timing model based on a multi-FPGA system according to claim 1 or 8, wherein, in step 2, each selected timing path is replaced by a wire carrying a delay value.
10. A static timing analysis method based on a multi-FPGA system, characterized in that
the method for constructing a netlist reduction timing model based on a multi-FPGA system according to any one of claims 1 to 9 is used to reduce the gate-level netlist of the overall circuit design and obtain a simplified timing model netlist, and static timing analysis is performed on the simplified timing model netlist.
CN202310777310.3A 2023-06-29 2023-06-29 Construction method of netlist reduction time sequence model and static time sequence analysis method Active CN116502578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310777310.3A CN116502578B (en) 2023-06-29 2023-06-29 Construction method of netlist reduction time sequence model and static time sequence analysis method

Publications (2)

Publication Number Publication Date
CN116502578A true CN116502578A (en) 2023-07-28
CN116502578B CN116502578B (en) 2024-04-16

Family

ID=87321691

Country Status (1)

Country Link
CN (1) CN116502578B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant