CN114282472A - Source code segmentation method and system of FPGA - Google Patents

Publication number
CN114282472A
Authority
CN
China
Prior art keywords
clusters
module instance
module
cluster
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210001037.0A
Other languages
Chinese (zh)
Inventor
叶磊
李艳荣
周立兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Guomicrochip Technology Co ltd
Original Assignee
Shenzhen Guomicrochip Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Guomicrochip Technology Co ltd
Priority to CN202210001037.0A
Publication of CN114282472A

Landscapes

  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The invention provides a source code segmentation method and system for an FPGA (field programmable gate array). The method comprises: according to the FPGA resource constraints, taking the several Module instances ranked highest by input/output delay as initial seeds, and clustering each initial seed with the Module instances interconnected with it to obtain a plurality of clusters; then moving the Module instance with the smallest input/output delay in each cluster to another cluster to form a plurality of new clusters that still satisfy the FPGA resource constraints, repeating this process until the configuration with the fewest IOs between clusters is found, and taking the clusters at that point as the segmentation result. The technical scheme of the invention simplifies the segmentation process and improves segmentation performance.

Description

Source code segmentation method and system of FPGA
Technical Field
The invention relates to the field of FPGAs, and in particular to a source code segmentation method and system for an FPGA.
Background
As modern SoC designs become more complex and transistor counts grow, verifying a design becomes correspondingly more difficult. Emulation is now used to accelerate simulation-based verification and has become the mainstream approach in large-scale and very-large-scale integrated circuit design; it interconnects and cascades multiple FPGAs to accelerate verification of the user's logic design. The user has to divide a large design into several smaller designs and configure them onto multiple FPGAs, while ensuring that the logic function of the whole design remains correct at run time and that its performance meets requirements. Existing approaches partition the user's DUT logic with traditional algorithms that are relatively simple and coarse, and sometimes even require the logic design to be partitioned manually. In addition, most partitioning is performed on the gate-level netlist, so both performance and results are poor.
Disclosure of Invention
The present invention provides a Module-instance-based source code segmentation method and system for an FPGA, in order to overcome the poor performance and poor results of prior-art FPGA source code segmentation methods.
In the embodiment of the invention, a source code segmentation method of an FPGA is provided, which comprises the following steps:
analyzing the whole RTL logic project to obtain an RTL hierarchical instantiation tree;
calculating the resource consumption of each Module instance in the RTL hierarchical instantiation tree;
analyzing the input/output delay of each Module instance and sorting the instances in descending order of that delay;
according to the FPGA resource constraints, taking the several Module instances ranked highest by input/output delay as initial seeds, and clustering each initial seed with the Module instances interconnected with it to obtain a plurality of clusters;
moving the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it to form a plurality of new clusters that satisfy the FPGA resource constraints, repeating this iteration to find the cluster combination with the fewest IOs between clusters, and taking that cluster combination as the segmentation result.
In the embodiment of the invention, the resource consumption of each Module instance comprises the numbers of LUTs, FFs, RAMs and IOs it uses.
In the embodiment of the invention, the input/output delay of a Module instance is the sum of its input delay and its output delay, where the input delay is the delay from the input of the Module instance to its first internal register, and the output delay is the delay from its last internal register to the output of the Module instance.
In the embodiment of the invention, moving the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it, to form a plurality of new clusters that satisfy the FPGA resource constraints, comprises:
moving the Module instance with the smallest input/output delay in a given cluster to another cluster interconnected with it, and then counting the IOs between clusters; if the number of IOs between clusters has increased, restoring the state before the move and proceeding to move the Module instance with the smallest input/output delay in the next cluster; if the number of IOs between clusters has decreased, taking the clusters obtained at that point as the new clusters.
In an embodiment of the present invention, a system for partitioning a source code of an FPGA is further provided, which includes:
an RTL analysis module, configured to analyze the whole RTL logic project to obtain an RTL hierarchical instantiation tree;
a resource calculation module, configured to calculate the resource consumption of each Module instance in the RTL hierarchical instantiation tree;
a delay analysis module, configured to analyze the input/output delay of each Module instance and sort the instances in descending order of that delay;
a clustering module, configured to take, according to the FPGA resource constraints, the several Module instances ranked highest by input/output delay as initial seeds, and to cluster each initial seed with the Module instances interconnected with it to obtain a plurality of clusters;
a segmentation module, configured to move the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it to form a plurality of new clusters that satisfy the FPGA resource constraints, to repeat this iteration to find the cluster combination with the fewest IOs between clusters, and to take that cluster combination as the segmentation result.
In the embodiment of the invention, the resource consumption of each Module instance comprises the numbers of LUTs, FFs, RAMs and IOs it uses.
In the embodiment of the invention, the input/output delay of a Module instance is the sum of its input delay and its output delay, where the input delay is the delay from the input of the Module instance to its first internal register, and the output delay is the delay from its last internal register to the output of the Module instance.
In the embodiment of the invention, moving the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it, to form a plurality of new clusters that satisfy the FPGA resource constraints, comprises:
moving the Module instance with the smallest input/output delay in a given cluster to another cluster interconnected with it, and then counting the IOs between clusters; if the number of IOs between clusters has increased, restoring the state before the move and proceeding to move the Module instance with the smallest input/output delay in the next cluster; if the number of IOs between clusters has decreased, taking the clusters obtained at that point as the new clusters.
Compared with the prior art, the technical scheme of the invention performs logic segmentation with the Module instance as the basic unit, which reduces the complexity of segmentation. It takes the delay between Module instances into account, analyzing only the delay from an instance's input to its first register and the delay from its last register to its output, so that the segmentation is timing-driven. As a result, the number of IOs cut by the segmentation is as small as possible and the delay on the cut connections is as low as possible, which improves the timing performance of the logic design.
Drawings
FIG. 1 is a flowchart of a source code segmentation method of an FPGA according to an embodiment of the present invention.
FIG. 2 is a diagram of an RTL hierarchical instantiation tree according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of resource calculation according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of the delays of a Module instance according to an embodiment of the present invention.
FIG. 5 is a schematic structural diagram of a source code segmentation system of an FPGA according to an embodiment of the present invention.
Detailed Description
As shown in FIG. 1, in the embodiment of the present invention, a source code segmentation method of an FPGA is provided, which includes steps S1 to S5. Each step is described below.
Step S1: analyze the whole RTL logic project to obtain an RTL hierarchical instantiation tree.
As shown in FIG. 2, the RTL hierarchical instantiation tree consists of a plurality of Module instance nodes arranged in a tree. Each Module instance node is associated with at least one other Module instance node.
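The tree described above can be pictured with a small data structure. The following is only a minimal sketch: the class name `InstanceNode`, its fields, and the example instance and module names are hypothetical and do not come from the patent, which does not specify how the tree is stored.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class InstanceNode:
    """One Module instance node in an RTL hierarchical instantiation tree."""
    name: str     # instance name (hypothetical example: "u_core")
    module: str   # name of the module that is instantiated
    children: list[InstanceNode] = field(default_factory=list)

    def add(self, child: InstanceNode) -> InstanceNode:
        """Attach a child instance and return it for chaining."""
        self.children.append(child)
        return child

    def walk(self, prefix: str = ""):
        """Yield (hierarchical_path, node) for this node and all descendants."""
        path = f"{prefix}.{self.name}" if prefix else self.name
        yield path, self
        for child in self.children:
            yield from child.walk(path)


# Hypothetical design hierarchy used only for illustration.
top = InstanceNode("top", "soc_top")
core = top.add(InstanceNode("u_core", "cpu_core"))
core.add(InstanceNode("u_alu", "adder"))
top.add(InstanceNode("u_mem", "sram_ctrl"))

paths = [path for path, _ in top.walk()]
print(paths)
```

Walking the tree depth-first gives every Module instance a unique hierarchical path, which is a convenient key for the resource and delay data gathered in the later steps.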
Step S2: calculate the resource consumption of each Module instance in the RTL hierarchical instantiation tree.
The resource consumption of each Module instance includes the numbers of LUTs (look-up tables), FFs (flip-flops), RAMs (random access memories) and IOs (inputs/outputs) it uses.
For example, the adder Module instance in FIG. 3 consumes the following resources:
LUT (look-up table): 0
Full adder (ADD): 7
FF (flip-flop): 7
RAM: 0
IO: 20
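A per-instance resource record like the one listed for the adder can be sketched as follows. The class `Resources`, its `fits` method, and the device limit used in the example are assumptions made for illustration; only the adder's figures (0 LUTs, 7 full adders, 7 FFs, 0 RAMs, 20 IOs) come from the text.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Resources:
    """Resource consumption of one Module instance (or a sum of several)."""
    lut: int = 0
    adder: int = 0
    ff: int = 0
    ram: int = 0
    io: int = 0

    def __add__(self, other: "Resources") -> "Resources":
        # Field-wise sum. Adding IO counts is a pessimistic upper bound,
        # because nets internal to a group no longer need device IOs.
        return Resources(*(a + b for a, b in zip(astuple(self), astuple(other))))

    def fits(self, limit: "Resources") -> bool:
        # True when no resource type exceeds the given limit.
        return all(a <= b for a, b in zip(astuple(self), astuple(limit)))


# Values of the adder Module instance from FIG. 3.
adder_inst = Resources(lut=0, adder=7, ff=7, ram=0, io=20)

# Hypothetical per-FPGA limit, chosen only to make the example interesting.
limit = Resources(lut=100, adder=10, ff=100, ram=4, io=64)

two_adders = adder_inst + adder_inst
print(adder_inst.fits(limit), two_adders.fits(limit))
```

Keeping the record immutable and summable makes it straightforward to check a candidate cluster against the FPGA resource constraints in the clustering steps.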
Step S3: analyze the input/output delay of each Module instance and sort the instances in descending order of that delay.
As shown in FIG. 4, the input/output delay of a Module instance is the sum of its input delay and its output delay, where the input delay is the delay from the input of the Module instance to its first internal register, and the output delay is the delay from its last internal register to the output of the Module instance. Because an instance usually has many connections, the delays are computed as averages: excluding the clock line, the average delay over all input connections is taken as the input delay, and the average delay over all output connections is taken as the output delay. During segmentation, the higher the delay of an instance, the lower the probability that it is cut.
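The averaging rule can be written out as a short calculation. This is a sketch under stated assumptions: the function name `io_delay`, the dictionary layout, the net names and the delay values are all hypothetical, and how the individual connection delays are obtained is outside its scope.

```python
from statistics import mean


def io_delay(input_delays, output_delays, clock_nets=frozenset()):
    """Return input delay + output delay of one Module instance.

    input_delays / output_delays map a connection name to its delay
    (input to first register, or last register to output).
    Connections listed in clock_nets are excluded from the averages.
    """
    ins = [d for net, d in input_delays.items() if net not in clock_nets]
    outs = [d for net, d in output_delays.items() if net not in clock_nets]
    d_in = mean(ins) if ins else 0.0
    d_out = mean(outs) if outs else 0.0
    return d_in + d_out


# Hypothetical instances and delays (arbitrary time units).
delays = {
    "u_alu": io_delay({"clk": 0.1, "a": 1.0, "b": 2.0}, {"sum": 0.5}, {"clk"}),
    "u_mem": io_delay({"clk": 0.1, "addr": 3.0}, {"q": 1.0}, {"clk"}),
    "u_ctl": io_delay({"clk": 0.1, "en": 0.5}, {"done": 0.5}, {"clk"}),
}

# Descending sort: instances with the largest input/output delay come first.
ranked = sorted(delays, key=delays.get, reverse=True)
print(ranked)
```

The instances at the head of `ranked` are the ones the method prefers not to cut, which is why they are used as seeds in the next step.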
Step S4: according to the FPGA resource constraints, take the several Module instances ranked highest by input/output delay as initial seeds, and cluster each initial seed with the Module instances interconnected with it to obtain a plurality of clusters.
Because a higher delay means a lower probability of being cut during segmentation, the several Module instances ranked highest by input/output delay can be used as initial seeds from which the clusters are generated.
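One possible way to grow clusters from the seeds is sketched below. The patent does not state the growth order or how resources are compared, so this sketch assumes a single scalar size per instance, a single `capacity` per FPGA, and round-robin growth in which each cluster absorbs one interconnected, unassigned instance per pass; the function name `seed_clusters` and all example data are hypothetical.

```python
def seed_clusters(delay, size, edges, num_clusters, capacity):
    """Grow num_clusters clusters from the highest-delay instances.

    delay: instance -> input/output delay
    size:  instance -> scalar resource cost (simplifying assumption)
    edges: iterable of (instance, instance) interconnections
    """
    ranked = sorted(delay, key=lambda n: (-delay[n], n))
    seeds = ranked[:num_clusters]
    clusters = [[s] for s in seeds]
    used = [size[s] for s in seeds]
    assigned = set(seeds)

    neighbours = {n: set() for n in delay}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    grew = True
    while grew:
        grew = False
        for i, cluster in enumerate(clusters):
            # Unassigned instances interconnected with any cluster member.
            frontier = {n for m in cluster for n in neighbours[m]} - assigned
            for cand in sorted(frontier, key=lambda n: (-delay[n], n)):
                if used[i] + size[cand] <= capacity:
                    cluster.append(cand)
                    used[i] += size[cand]
                    assigned.add(cand)
                    grew = True
                    break  # one instance per cluster per pass
    return clusters


# Hypothetical design: six instances of equal size, two FPGAs.
delay = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0, "f": 0.5}
size = dict.fromkeys(delay, 10)
edges = [("a", "c"), ("a", "d"), ("b", "e"), ("b", "f"), ("d", "e")]

clusters = seed_clusters(delay, size, edges, num_clusters=2, capacity=30)
print(clusters)
```

In this sketch an instance that is not connected to any cluster, or that does not fit anywhere, simply stays unassigned; a complete tool would need a policy for such leftovers.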
Step S5: move the Module instance with the smallest input/output delay in each cluster to another cluster to form a plurality of new clusters that satisfy the FPGA resource constraints, repeat this process until the configuration with the fewest IOs between clusters is found, and take the clusters at that point as the segmentation result.
Note that the clusters generated in step S4 are not yet optimal, so they need to be fine-tuned. Moving the Module instance with the smallest input/output delay in each cluster to another cluster, to form a plurality of new clusters that satisfy the FPGA resource constraints, specifically comprises:
moving the Module instance with the smallest input/output delay in a given cluster to another cluster interconnected with it, and then counting the IOs between clusters; if the number of IOs between clusters has increased, restoring the state before the move and proceeding to move the Module instance with the smallest input/output delay in the next cluster; if the number of IOs between clusters has decreased, taking the clusters obtained at that point as the new clusters. This iteration is repeated until the configuration with the fewest IOs between clusters is found; the clusters at that point are taken as the segmentation result, and the HDL files to be configured onto the FPGAs are output according to that result.
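The move-and-check loop can be sketched as follows. Several details are assumptions, because the text leaves them open: the number of IOs between clusters is approximated by the number of interconnections whose endpoints lie in different clusters, a move that leaves the count unchanged is reverted just like one that increases it, a cluster is never emptied, and resources are modelled as one scalar size per instance. The names `cut_size` and `refine` and the example data are hypothetical.

```python
def cut_size(assign, edges):
    """Count interconnections whose endpoints are in different clusters."""
    return sum(1 for a, b in edges if assign[a] != assign[b])


def refine(assign, delay, size, edges, capacity, max_rounds=100):
    """Repeatedly try to move each cluster's lowest-delay instance."""
    assign = dict(assign)  # work on a copy
    for _ in range(max_rounds):
        improved = False
        for cid in sorted(set(assign.values())):
            members = [n for n in assign if assign[n] == cid]
            if len(members) <= 1:
                continue  # assumption: never empty a cluster
            mover = min(members, key=lambda n: (delay[n], n))
            # Other clusters that the mover is interconnected with.
            linked = ({assign[b] for a, b in edges if a == mover}
                      | {assign[a] for a, b in edges if b == mover})
            before = cut_size(assign, edges)
            for tgt in sorted(linked - {cid}):
                load = sum(size[n] for n in assign if assign[n] == tgt)
                if load + size[mover] > capacity:
                    continue  # would violate the resource constraint
                assign[mover] = tgt
                if cut_size(assign, edges) < before:
                    improved = True
                    break  # keep the move
                assign[mover] = cid  # revert: no reduction in IOs
        if not improved:
            break
    return assign


# Hypothetical starting point: instance "c" sits in the wrong cluster.
start = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
delay = {"a": 5, "b": 4, "c": 1, "d": 3, "e": 2}
size = dict.fromkeys(start, 10)
edges = [("a", "b"), ("c", "d"), ("c", "e"), ("d", "e"), ("a", "c")]

result = refine(start, delay, size, edges, capacity=30)
print(cut_size(start, edges), cut_size(result, edges), result)
```

Because a move is kept only when the count strictly decreases, the loop cannot oscillate and stops as soon as a full pass over the clusters yields no improvement.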
As shown in FIG. 5, corresponding to the source code segmentation method of the FPGA, the embodiment of the present invention further provides a source code segmentation system of an FPGA, which includes an RTL analysis module 1, a resource calculation module 2, a delay analysis module 3, a clustering module 4, and a segmentation module 5.
The RTL analysis module 1 is configured to analyze the whole RTL logic project to obtain an RTL hierarchical instantiation tree.
The resource calculation module 2 is configured to calculate the resource consumption of each Module instance in the RTL hierarchical instantiation tree.
The delay analysis module 3 is configured to analyze the input/output delay of each Module instance and sort the instances in descending order of that delay.
The clustering module 4 is configured to take, according to the FPGA resource constraints, the several Module instances ranked highest by input/output delay as initial seeds, and to cluster each initial seed with the Module instances interconnected with it to obtain a plurality of clusters.
The segmentation module 5 is configured to move the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it to form a plurality of new clusters that satisfy the FPGA resource constraints, to find the cluster combination with the fewest IOs between clusters, and to take that cluster combination as the segmentation result.
In summary, the technical solution of the present invention performs logic segmentation with the Module instance as the basic unit, which reduces the complexity of segmentation. It takes the delay between Module instances into account, analyzing only the delay from an instance's input to its first register and the delay from its last register to its output, so that the segmentation is timing-driven. As a result, the number of IOs cut by the segmentation is as small as possible and the delay on the cut connections is as low as possible, which improves the timing performance of the logic design.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (8)

1. A source code segmentation method of an FPGA, characterized by comprising the following steps:
analyzing the whole RTL logic project to obtain an RTL hierarchical instantiation tree;
calculating the resource consumption of each Module instance in the RTL hierarchical instantiation tree;
analyzing the input/output delay of each Module instance and sorting the instances in descending order of that delay;
according to the FPGA resource constraints, taking the several Module instances ranked highest by input/output delay as initial seeds, and clustering each initial seed with the Module instances interconnected with it to obtain a plurality of clusters;
moving the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it to form a plurality of new clusters that satisfy the FPGA resource constraints, repeating this iteration to find the cluster combination with the fewest IOs between clusters, and taking that cluster combination as the segmentation result.
2. The source code segmentation method of an FPGA according to claim 1, wherein the resource consumption of each Module instance comprises the numbers of LUTs, FFs, RAMs and IOs it uses.
3. The source code segmentation method of an FPGA according to claim 1, wherein the input/output delay of a Module instance is the sum of its input delay and its output delay, the input delay being the delay from the input of the Module instance to its first internal register, and the output delay being the delay from its last internal register to the output of the Module instance.
4. The source code segmentation method of an FPGA according to claim 1, wherein moving the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it, to form a plurality of new clusters that satisfy the FPGA resource constraints, comprises:
moving the Module instance with the smallest input/output delay in a given cluster to another cluster interconnected with it, and then counting the IOs between clusters; if the number of IOs between clusters has increased, restoring the state before the move and proceeding to move the Module instance with the smallest input/output delay in the next cluster; if the number of IOs between clusters has decreased, taking the clusters obtained at that point as the new clusters.
5. A source code segmentation system of an FPGA, comprising:
an RTL analysis module, configured to analyze the whole RTL logic project to obtain an RTL hierarchical instantiation tree;
a resource calculation module, configured to calculate the resource consumption of each Module instance in the RTL hierarchical instantiation tree;
a delay analysis module, configured to analyze the input/output delay of each Module instance and sort the instances in descending order of that delay;
a clustering module, configured to take, according to the FPGA resource constraints, the several Module instances ranked highest by input/output delay as initial seeds, and to cluster each initial seed with the Module instances interconnected with it to obtain a plurality of clusters;
a segmentation module, configured to move the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it to form a plurality of new clusters that satisfy the FPGA resource constraints, to repeat this iteration to find the cluster combination with the fewest IOs between clusters, and to take that cluster combination as the segmentation result.
6. The source code segmentation system of an FPGA according to claim 5, wherein the resource consumption of each Module instance comprises the numbers of LUTs, FFs, RAMs and IOs it uses.
7. The source code segmentation system of an FPGA according to claim 5, wherein the input/output delay of a Module instance is the sum of its input delay and its output delay, the input delay being the delay from the input of the Module instance to its first internal register, and the output delay being the delay from its last internal register to the output of the Module instance.
8. The source code segmentation system of an FPGA according to claim 5, wherein moving the Module instance with the smallest input/output delay in each cluster to another cluster interconnected with it, to form a plurality of new clusters that satisfy the FPGA resource constraints, comprises:
moving the Module instance with the smallest input/output delay in a given cluster to another cluster interconnected with it, and then counting the IOs between clusters; if the number of IOs between clusters has increased, restoring the state before the move and proceeding to move the Module instance with the smallest input/output delay in the next cluster; if the number of IOs between clusters has decreased, taking the clusters obtained at that point as the new clusters.
CN202210001037.0A 2022-01-04 2022-01-04 Source code segmentation method and system of FPGA Pending CN114282472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210001037.0A CN114282472A (en) 2022-01-04 2022-01-04 Source code segmentation method and system of FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210001037.0A CN114282472A (en) 2022-01-04 2022-01-04 Source code segmentation method and system of FPGA

Publications (1)

Publication Number Publication Date
CN114282472A (en) 2022-04-05

Family

ID=80880172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210001037.0A Pending CN114282472A (en) 2022-01-04 2022-01-04 Source code segmentation method and system of FPGA

Country Status (1)

Country Link
CN (1) CN114282472A (en)

Similar Documents

Publication Publication Date Title
US5956257A (en) Automated optimization of hierarchical netlists
Yang et al. Balanced partitioning
Rabaey et al. Fast prototyping of datapath-intensive architectures
US7124071B2 (en) Partitioning a model into a plurality of independent partitions to be processed within a distributed environment
US8918748B1 (en) M/A for performing automatic latency optimization on system designs for implementation on programmable hardware
Pasricha et al. Constraint-driven bus matrix synthesis for MPSoC
US7797667B1 (en) Hardware acceleration of functional factoring
US11922106B2 (en) Memory efficient scalable distributed static timing analysis using structure based self-aligned parallel partitioning
US8381142B1 (en) Using a timing exception to postpone retiming
US7370295B1 (en) Directed design space exploration
US8346529B2 (en) Delta retiming in logic simulation
Muñoz-Martínez et al. STONNE: A detailed architectural simulator for flexible neural network accelerators
Moreira et al. Design of NCL gates with the ASCEnD flow
US7191417B1 (en) Method and apparatus for optimization of digital integrated circuits using detection of bottlenecks
CN111159967A (en) FPGA circuit layout and resource allocation method based on webpage ranking algorithm
CN114282472A (en) Source code segmentation method and system of FPGA
US10628543B1 (en) Systems and methods for estimating a power consumption of a register-transfer level circuit design
US7246340B1 (en) Timing-driven synthesis with area trade-off
US8904318B1 (en) Method and apparatus for performing optimization using don't care states
US7006962B1 (en) Distributed delay prediction of multi-million gate deep sub-micron ASIC designs
KR20230109649A (en) poly-bit cells
Turki et al. Towards synthetic benchmarks generator for CAD tool evaluation
Cheng et al. Pushing the limits of machine design: Automated CPU design with AI
Lee et al. K-means clustering-specific lightweight RISC-V processor
TWI759817B (en) Simulation system for soc-level power integrity and method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination