CN109901884B - Method and device for high-level synthesis and code stream generation of FPGA - Google Patents

Info

Publication number
CN109901884B
Authority
CN
China
Prior art keywords
clock frequency
ith
recorded
netlist
cfi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910044685.2A
Other languages
Chinese (zh)
Other versions
CN109901884A (en)
Inventor
刘建洋
王海力
连荣椿
马明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingwei Qili Beijing Technology Co ltd
Original Assignee
Jingwei Qili Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingwei Qili Beijing Technology Co ltd
Priority to CN201910044685.2A
Publication of CN109901884A
Application granted
Publication of CN109901884B
Legal status: Active
Anticipated expiration

Abstract

The embodiments of this specification provide a method and a device for high-level synthesis and code stream generation for a Field Programmable Gate Array (FPGA). The method comprises: acquiring a C/C++ file to be executed; dividing the operations contained in the C/C++ file to obtain n different division schemes Pn, where n is an integer not less than 1 and the number of operations contained in one clock cycle differs among the division schemes Pn; determining a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn; generating a result netlist according to the division scheme Pmax; and generating an FPGA code stream file according to the result netlist. In this way, the operating efficiency of the entire design can be improved.

Description

Method and device for high-level synthesis and code stream generation of FPGA
Technical Field
One or more embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method and an apparatus for high-level synthesis and code stream generation of an FPGA.
Background
In high-level synthesis, a program written in C/C++ does not describe how long different operations take, whereas in a Field-Programmable Gate Array (FPGA) different operations take different lengths of time. Therefore, if the operations in the Verilog hardware description generated by high-level synthesis are not allotted reasonable time, the operating frequency of the whole design ultimately suffers.
In addition, existing FPGA high-level synthesis tools take a C/C++ file as input and generate a Verilog or VHDL (VHSIC Hardware Description Language) file, but cannot go on to generate a code stream file. To generate a code stream file from the Verilog/VHDL file, a separate set of tools must be used: the Verilog/VHDL file is taken as input, and a synthesis tool, a placement and routing (P&R) tool, and a bitstream tool are run in turn. This requires manual intervention and cannot be completed automatically.
Therefore, an improved scheme is desired that allots reasonable time to different operations and automatically executes the whole process from a C/C++ file as input to a code stream file as output, thereby improving the operating efficiency of the whole design.
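The point that different operations occupy different lengths of time, and that this bounds the operating frequency, can be illustrated with a small timing model. The sketch below is an assumption for illustration only: the operation names, the delays in `DELAY_NS`, and the rule that dependent operations in one cycle add up are not taken from the patent. It relies only on the standard relation that the clock period must cover the slowest clock cycle.

```python
# Hypothetical per-operation delays in nanoseconds (not taken from the patent).
DELAY_NS = {"add": 2.0, "mul": 5.0, "div": 12.0}


def cycle_delay_ns(ops, chained=True):
    """Delay of one clock cycle.

    If the operations depend on each other (chained), their delays add up;
    otherwise they run side by side and the slowest one dominates.
    """
    delays = [DELAY_NS[op] for op in ops]
    return sum(delays) if chained else max(delays)


def max_clock_mhz(schedule, chained=True):
    """The clock period must cover the slowest cycle, so fmax = 1 / worst delay."""
    worst_ns = max(cycle_delay_ns(ops, chained) for ops in schedule)
    return 1000.0 / worst_ns  # 1/ns is GHz; multiply by 1000 for MHz


# The same four operations assigned to clock cycles in two different ways.
one_cycle = [["mul", "add", "div", "add"]]
two_cycles = [["mul", "add"], ["div", "add"]]

print(round(max_clock_mhz(one_cycle), 1))   # 47.6
print(round(max_clock_mhz(two_cycles), 1))  # 71.4
```

With these assumed delays, spreading the four operations over two clock cycles raises the achievable frequency from about 47.6 MHz to about 71.4 MHz, which is the kind of difference between division schemes that the method sets out to measure.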
Disclosure of Invention
One or more embodiments of the present disclosure describe a method and an apparatus for high-level synthesis and code stream generation of an FPGA, which can allot reasonable time to different operations and can automatically execute the whole process from a C/C++ file as input to a code stream file as output, thereby improving the operating efficiency of the whole design.
According to a first aspect, a method for high-level synthesis and code stream generation of a field programmable gate array FPGA is provided, which includes:
acquiring a C/C++ file to be executed;
dividing the operations contained in the C/C++ file to obtain n different division schemes Pn, wherein n is an integer not less than 1, and determining a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn, wherein the number of operations contained in one clock cycle differs among the division schemes Pn;
generating a result netlist according to the division scheme Pmax;
and generating an FPGA code stream file according to the result netlist.
In one possible embodiment, the n partitioning schemes include an ith partitioning scheme Pi, 1 ≦ i ≦ n; and
the determining a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained by the n division schemes Pn further includes:
generating a corresponding ith netlist according to the ith partitioning scheme Pi;
generating corresponding ith current clock frequency cFi according to the ith netlist;
comparing the current clock frequency cFi with the recorded clock frequency cF, updating the recorded clock frequency cF with the current clock frequency cFi and the recorded partitioning scheme cP with the ith partitioning scheme Pi in case cFi is greater than cF.
In one example, the initial value cF0 of the recorded clock frequency cF is 0 and the initial value of the recorded partitioning scheme cP is null.
In a possible embodiment, in the case that i is equal to n + 1, the determining the maximum clock frequency Fmax and the corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn further includes:
and taking the recorded clock frequency cF as the maximum clock frequency Fmax and the recorded partitioning scheme cP as the partitioning scheme Pmax.
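Taken together, the update rule amounts to a running maximum over the n schemes. In compact notation (the argmax formulation and the tie-breaking index i* are a restatement for clarity, not wording from the patent, and assume every cFi is greater than the initial value 0):

```latex
\[
F_{\max} = \max_{1 \le i \le n} cF_i,
\qquad
P_{\max} = P_{i^{\ast}},
\qquad
i^{\ast} = \min \{\, i : cF_i = F_{\max} \,\}.
\]
```

The minimum in the definition of i* reflects that the record is updated only when cFi is strictly greater than cF, so among schemes with equal frequency the earliest one is kept.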
In one example, the ith current clock frequency cFi is generated using a synthesis tool.
According to a second aspect, there is provided an apparatus for high-level synthesis and code stream generation of a field programmable gate array FPGA, comprising:
an acquisition unit configured to acquire a C/C++ file to be executed;
a determining unit configured to divide the operations contained in the C/C++ file to obtain n different division schemes Pn, where n is an integer greater than or equal to 1, and to determine a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn, wherein the number of operations contained in one clock cycle differs among the division schemes Pn;
a netlist generation unit configured to generate a result netlist according to the partitioning scheme Pmax;
and a code stream generating unit configured to generate an FPGA code stream file according to the result netlist.
In one possible embodiment, the n partitioning schemes include an ith partitioning scheme Pi, 1 ≦ i ≦ n; and
the determination unit is further configured to:
generating a corresponding ith netlist according to the ith partitioning scheme Pi;
generating corresponding ith current clock frequency cFi according to the ith netlist;
comparing the current clock frequency cFi with the recorded clock frequency cF, updating the recorded clock frequency cF with the current clock frequency cFi and the recorded partitioning scheme cP with the ith partitioning scheme Pi in case cFi is greater than cF.
In one example, the initial value cF0 of the recorded clock frequency cF is 0 and the initial value of the recorded partitioning scheme cP is null.
In one possible embodiment, in the case of i = n + 1, the determining unit is further configured to:
take the recorded clock frequency cF as the maximum clock frequency Fmax and the recorded partitioning scheme cP as the partitioning scheme Pmax.
In one example, the ith current clock frequency cFi is generated using a synthesis tool.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and wherein the processor, when executing the executable code, implements the method of the first aspect.
With the method and the device provided by the embodiments of this specification, reasonable time can be allotted to different operations, and the whole process from a C/C++ file as input to a code stream file as output can be executed automatically, so that the operating efficiency of the whole design is improved.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 illustrates a flow diagram of a method for high-level synthesis and code-stream generation for an FPGA, according to one embodiment;
FIG. 2 illustrates a flow diagram of a method of determining a maximum clock frequency Fmax and a corresponding partitioning scheme Pmax according to one embodiment;
FIG. 3 shows a schematic block diagram of an apparatus for high-level synthesis and code-stream generation for an FPGA according to one embodiment.
Detailed Description
The scheme provided by the specification is described in the following with reference to the attached drawings.
FIG. 1 illustrates a flow diagram of a method for high-level synthesis and code-stream generation for an FPGA, according to one embodiment.
As shown in FIG. 1, in step 10, a C/C++ file to be executed is acquired.
In step 12, the operations contained in the C/C++ file are divided to obtain n different division schemes Pn, where n is an integer greater than or equal to 1, and a maximum clock frequency Fmax and a corresponding division scheme Pmax are determined based on the respective clock frequencies Fn obtained from the n division schemes Pn, wherein the number of operations contained in one clock cycle differs among the division schemes Pn. In one example, the operations contained in the C/C++ file are divided using a scheduling tool.
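The patent does not say how the scheduling tool produces its candidate schemes. As a minimal sketch, assuming the operations are independent and may simply be assigned to clock cycles in order, candidates that differ in the number of operations per clock cycle could be produced as follows. The function names and the chunking strategy are hypothetical; a real scheduler would also have to respect data dependencies and resource limits. The names V1 to V9 follow the naming of the document's worked example.

```python
def chunked_scheme(operations, ops_per_cycle):
    """Assign operations to consecutive clock cycles, ops_per_cycle at a time."""
    return [operations[k:k + ops_per_cycle]
            for k in range(0, len(operations), ops_per_cycle)]


def candidate_schemes(operations, sizes):
    """Build one candidate division scheme per requested cycle size."""
    return {size: chunked_scheme(operations, size) for size in sizes}


ops = [f"V{k}" for k in range(1, 10)]  # V1 .. V9
schemes = candidate_schemes(ops, sizes=[3, 5])
for size, scheme in schemes.items():
    print(size, scheme)
```

Each entry is a list of clock cycles, and each clock cycle is the list of operations executed in it. Different sizes give schemes with different numbers of operations per cycle, which is the property the method requires of the candidates.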
Here, FIG. 2 shows a flow chart of a method of determining the maximum clock frequency Fmax and the corresponding partitioning scheme Pmax according to one embodiment. As shown in FIG. 2, in step 120, a corresponding ith netlist is generated according to the ith partitioning scheme Pi. In step 122, a corresponding ith current clock frequency cFi is generated according to the ith netlist. In one example, the ith current clock frequency cFi is generated using a synthesis tool. In step 124, the current clock frequency cFi is compared with the recorded clock frequency cF, and if cFi is greater than cF, the recorded clock frequency cF is updated with cFi and the recorded partitioning scheme cP is updated with the ith partitioning scheme Pi. In one example, the initial value cF0 of the recorded clock frequency cF is 0 and the initial value of the recorded partitioning scheme cP is null.
In one example, when i = n + 1, the recorded clock frequency cF is used as the maximum clock frequency Fmax, and the recorded partitioning scheme cP is used as the partitioning scheme Pmax.
Returning to FIG. 1, in step 14, a result netlist is generated according to the partitioning scheme Pmax. In step 16, an FPGA code stream file is generated according to the result netlist.
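Steps 120 to 124 can be sketched as a single loop. This is a minimal sketch under stated assumptions: `generate_netlist` and `estimate_fmax` are hypothetical callables standing in for the netlist generator and the synthesis tool, whose real interfaces the patent does not specify.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

Scheme = TypeVar("Scheme")
Netlist = TypeVar("Netlist")


def select_best_scheme(
    schemes: Sequence[Scheme],
    generate_netlist: Callable[[Scheme], Netlist],
    estimate_fmax: Callable[[Netlist], float],
) -> Tuple[float, Optional[Scheme]]:
    """Return (Fmax, Pmax) by keeping a running record of the best scheme."""
    recorded_f = 0.0    # cF, initial value cF0 = 0
    recorded_p = None   # cP, initially null
    for scheme in schemes:                   # i = 1 .. n
        netlist = generate_netlist(scheme)   # step 120
        current_f = estimate_fmax(netlist)   # step 122
        if current_f > recorded_f:           # step 124: update only if strictly greater
            recorded_f, recorded_p = current_f, scheme
    return recorded_f, recorded_p            # reached when i = n + 1


# Stand-in stages with made-up frequencies, for demonstration only.
fake_freq = {"netlist(A)": 120.0, "netlist(B)": 180.0, "netlist(C)": 150.0}
fmax, pmax = select_best_scheme(
    ["A", "B", "C"],
    generate_netlist=lambda s: f"netlist({s})",
    estimate_fmax=fake_freq.__getitem__,
)
print(fmax, pmax)  # 180.0 B
```

Passing the two stages in as callables keeps the selection logic independent of any particular tool, so the same loop works whether the frequency comes from a real synthesis report or from a stub.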
In one possible embodiment, the C/C++ file includes operations V1 through V9; it is assumed that these operations are independent of each other and that different operations take different lengths of time.
The operations V1 through V9 contained in the C/C++ file are divided, with Clk denoting one clock cycle. As shown in Table 1, four division schemes are generated in total, where it is assumed that 0 < f4 < f1 = f2 < f3:
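[Table 1: division schemes P1 to P4 and their clock frequencies f1 to f4. The table image is not reproduced; each scheme is spelled out in the paragraphs that follow.]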
As shown in Table 1, n = 4.
When i = 1, the first partitioning scheme P1 is applied: operations V1, V2, and V3 are performed at Clk1; operations V4, V5, and V6 at Clk2; and operations V7, V8, and V9 at Clk3. A corresponding first netlist is generated, and from it a corresponding first current clock frequency cF1 = f1. The current clock frequency cF1 is compared with the recorded clock frequency cF, which at this point is cF = cF0 = 0. Since f1 > 0, cF1 is greater than cF, so the recorded clock frequency cF is updated with cF1 and the recorded partitioning scheme cP is updated with the first partitioning scheme P1.
When i = 2, the second partitioning scheme P2 is applied: operations V1, V4, and V7 are performed at Clk1; operations V2, V6, and V8 at Clk2; and operations V3, V5, and V9 at Clk3. A corresponding second netlist is generated, and from it a corresponding second current clock frequency cF2 = f2. The current clock frequency cF2 is compared with the recorded clock frequency cF, which at this point is cF = cF1 = f1. Since f2 = f1, cF2 is equal to cF, that is, not greater than cF, so the recorded clock frequency cF and the recorded partitioning scheme cP are not updated.
When i = 3, the third division scheme P3 is applied: operations V1, V4, V5, and V7 are performed at Clk1; and operations V2, V3, V6, V8, and V9 at Clk2. A corresponding third netlist is generated, and from it a corresponding third current clock frequency cF3 = f3. The current clock frequency cF3 is compared with the recorded clock frequency cF, which at this point is cF = cF1 = f1. Since f3 > f1, cF3 is greater than cF, so the recorded clock frequency cF is updated with cF3 and the recorded partitioning scheme cP is updated with the third partitioning scheme P3.
When i = 4, the fourth division scheme P4 is applied: operations V1 and V7 are performed at Clk1; operations V2 and V6 at Clk2; operations V3, V5, and V9 at Clk3; and operations V4 and V8 at Clk4. A corresponding fourth netlist is generated, and from it a corresponding fourth current clock frequency cF4 = f4. The current clock frequency cF4 is compared with the recorded clock frequency cF, which at this point is cF = cF3 = f3. Since f4 < f3, cF4 is smaller than cF, so the recorded clock frequency cF and the recorded partitioning scheme cP are not updated.
When i = 5, i is greater than n, so the recorded clock frequency cF is taken as the maximum clock frequency Fmax and the recorded partition scheme cP is taken as the partition scheme Pmax; that is, Fmax = cF3 = 300 MHz and Pmax = P3.
Generating a result netlist according to the third partition scheme P3; and generating an FPGA code stream file according to the result netlist.
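The walk-through above can be replayed as a short trace. The four schemes are taken from the description of P1 to P4. Of the frequencies, only 300 MHz for P3 appears in the text; the values used for f1, f2, and f4 are assumptions chosen to satisfy 0 < f4 < f1 = f2 < f3.

```python
# Division schemes from the worked example: one inner list per clock cycle.
SCHEMES = {
    "P1": [["V1", "V2", "V3"], ["V4", "V5", "V6"], ["V7", "V8", "V9"]],
    "P2": [["V1", "V4", "V7"], ["V2", "V6", "V8"], ["V3", "V5", "V9"]],
    "P3": [["V1", "V4", "V5", "V7"], ["V2", "V3", "V6", "V8", "V9"]],
    "P4": [["V1", "V7"], ["V2", "V6"], ["V3", "V5", "V9"], ["V4", "V8"]],
}

# Only f3 = 300 MHz is given in the text. The other three values are
# assumed, picked so that 0 < f4 < f1 = f2 < f3 holds.
FREQ_MHZ = {"P1": 200.0, "P2": 200.0, "P3": 300.0, "P4": 100.0}


def trace_selection(freqs):
    """Replay the compare-and-update step, logging every iteration."""
    cf, cp = 0.0, None  # recorded frequency cF and recorded scheme cP
    log = []
    for name, f in freqs.items():  # dicts keep insertion order: P1 .. P4
        updated = f > cf
        if updated:
            cf, cp = f, name
        log.append((name, f, updated, cf, cp))
    return cf, cp, log


fmax, pmax, log = trace_selection(FREQ_MHZ)
for row in log:
    print(row)
print("Fmax =", fmax, "MHz, Pmax =", pmax)
```

The trace shows an update for P1, no update for P2 (equal, not greater), an update for P3, and no update for P4, ending with Pmax = P3 as in the text.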
With this method, reasonable time can be allotted to different operations, and the whole process from a C/C++ file as input to a code stream file as output can be executed automatically, so that the operating efficiency of the whole design is improved.
According to an embodiment of another aspect, a device for high-level synthesis and code stream generation of a Field Programmable Gate Array (FPGA) is also provided. FIG. 3 shows a schematic block diagram of an apparatus for high-level synthesis and code stream generation for an FPGA according to one embodiment. As shown in FIG. 3, the apparatus 30 comprises: an acquisition unit 301 configured to acquire a C/C++ file to be executed; a determining unit 303 configured to divide the C/C++ file to obtain n different division schemes Pn, where n is an integer greater than or equal to 1, and to determine a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn; a netlist generation unit 305 configured to generate a result netlist according to the partitioning scheme Pmax; and a code stream generating unit 307 configured to generate an FPGA code stream file according to the result netlist.
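The way the four units hand results to one another can be sketched as a small pipeline object. This is an illustrative sketch only: the class name, the field names, and the stub stages are hypothetical, and the patent does not prescribe any software structure for the apparatus.

```python
from dataclasses import dataclass
from typing import Callable, List

Scheme = List[List[str]]  # one inner list of operations per clock cycle


@dataclass
class HlsBitstreamDevice:
    """Wires the four units of the apparatus into one automatic flow."""
    acquire: Callable[[str], str]               # unit 301: file path -> C/C++ source
    determine: Callable[[str], Scheme]          # unit 303: source -> scheme Pmax
    generate_netlist: Callable[[Scheme], str]   # unit 305: Pmax -> result netlist
    generate_bitstream: Callable[[str], bytes]  # unit 307: netlist -> code stream

    def run(self, path: str) -> bytes:
        source = self.acquire(path)
        pmax = self.determine(source)
        netlist = self.generate_netlist(pmax)
        return self.generate_bitstream(netlist)


# Stub stages stand in for the real tools.
device = HlsBitstreamDevice(
    acquire=lambda path: f"// contents of {path}",
    determine=lambda source: [["V1", "V2"], ["V3"]],
    generate_netlist=lambda scheme: f"netlist with {len(scheme)} cycles",
    generate_bitstream=lambda netlist: netlist.encode("ascii"),
)
print(device.run("design.cpp"))  # b'netlist with 2 cycles'
```

A single `run` call takes the flow from the input file to the code stream without manual steps in between, which is the automation the apparatus is meant to provide.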
In one embodiment, the n partitioning schemes include an ith partitioning scheme Pi, 1 ≦ i ≦ n; and, the determining unit 303 is further configured to: generating a corresponding ith netlist according to the ith partitioning scheme Pi; generating corresponding ith current clock frequency cFi according to the ith netlist; comparing the current clock frequency cFi with the recorded clock frequency cF, updating the recorded clock frequency cF with the current clock frequency cFi and the recorded partitioning scheme cP with the ith partitioning scheme Pi in case cFi is greater than cF.
In one example, the operations contained in the C/C++ file are divided using a scheduling tool.
In one example, the ith current clock frequency cFi is generated using a synthesis tool.
In one example, the initial value cF0 of the recorded clock frequency cF is 0 and the initial value of the recorded partitioning scheme cP is null.
In one embodiment, in the case that i = n + 1, the determining unit 303 is further configured to take the recorded clock frequency cF as the maximum clock frequency Fmax and the recorded partitioning scheme cP as the partitioning scheme Pmax.
With this device, reasonable time can be allotted to different operations, and the whole process from a C/C++ file as input to a code stream file as output can be executed automatically, so that the operating efficiency of the whole design is improved.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 1 to 2.
According to an embodiment of still another aspect, there is also provided a computing device including a memory and a processor, the memory having stored therein executable code, the processor implementing the method described in conjunction with fig. 1-2 when executing the executable code.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The specific embodiments above describe the objects, technical solutions, and advantages of the present invention in further detail. It should be understood that they are only exemplary embodiments of the present invention and are not intended to limit its scope; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims (10)

1. A method for high-level synthesis and code stream generation of a Field Programmable Gate Array (FPGA) comprises the following steps:
acquiring a C/C++ file to be executed;
dividing the operations contained in the C/C++ file to obtain n different division schemes Pn, wherein n is an integer not less than 1, and determining a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn, wherein the number of operations contained in one clock cycle differs among the division schemes Pn;
generating a result netlist according to the division scheme Pmax;
and generating an FPGA code stream file according to the result netlist.
2. The method of claim 1, wherein the n partitioning schemes include an ith partitioning scheme Pi, 1 ≦ i ≦ n; and
the determining a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained by the n division schemes Pn further includes:
generating a corresponding ith netlist according to the ith partitioning scheme Pi;
generating corresponding ith current clock frequency cFi according to the ith netlist;
comparing the current clock frequency cFi with the recorded clock frequency cF, updating the recorded clock frequency cF with the current clock frequency cFi and the recorded partitioning scheme cP with the ith partitioning scheme Pi in case cFi is greater than cF.
3. The method of claim 2, wherein an initial clock frequency cF0 of the recorded clock frequency cF has a value of 0 and an initial value of the recorded partitioning scheme cP is null.
4. A method as claimed in claim 2 or 3, wherein the ith current clock frequency cFi is generated using a Synthesis tool.
5. A device for high-level synthesis and code stream generation of a Field Programmable Gate Array (FPGA) comprises:
an acquisition unit configured to acquire a C/C++ file to be executed;
a determining unit configured to divide the operations contained in the C/C++ file to obtain n different division schemes Pn, where n is an integer greater than or equal to 1, and to determine a maximum clock frequency Fmax and a corresponding division scheme Pmax based on the respective clock frequencies Fn obtained from the n division schemes Pn, wherein the number of operations contained in one clock cycle differs among the division schemes Pn;
a netlist generation unit configured to generate a result netlist according to the partitioning scheme Pmax;
and a code stream generating unit configured to generate an FPGA code stream file according to the result netlist.
6. The apparatus of claim 5, wherein the n partitioning schemes include an ith partitioning scheme Pi, 1 ≦ i ≦ n; and
the determination unit is further configured to:
generating a corresponding ith netlist according to the ith partitioning scheme Pi;
generating corresponding ith current clock frequency cFi according to the ith netlist;
comparing the current clock frequency cFi with the recorded clock frequency cF, updating the recorded clock frequency cF with the current clock frequency cFi and the recorded partitioning scheme cP with the ith partitioning scheme Pi in case cFi is greater than cF.
7. The apparatus of claim 6, wherein an initial clock frequency cF0 of the recorded clock frequency cF has a value of 0 and an initial value of the recorded partitioning scheme cP is null.
8. An apparatus as claimed in claim 6 or 7, wherein the ith current clock frequency cFi is generated using a Synthesis tool.
9. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-4.
10. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, implements the method of any of claims 1-4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910044685.2A CN109901884B (en) 2019-01-17 2019-01-17 Method and device for high-level synthesis and code stream generation of FPGA

Publications (2)

Publication Number Publication Date
CN109901884A CN109901884A (en) 2019-06-18
CN109901884B true CN109901884B (en) 2022-05-17

Family

ID=66943905

Country Status (1)

Country Link
CN (1) CN109901884B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant