WO2024060446A1 - Fast linear programming method for high-level synthesis - Google Patents

Fast linear programming method for high-level synthesis

Publication number
WO2024060446A1
Authority
WO
WIPO (PCT)
Prior art keywords
linear programming
integer linear
tree
model
compression
Prior art date
Application number
PCT/CN2022/142014
Other languages
English (en)
French (fr)
Inventor
王自鑫
何国勤
陈弟虎
朱立琦
胡胜发
汤锦基
袁悦来
Original Assignee
中山大学
广州安凯微电子股份有限公司
Application filed by 中山大学, 广州安凯微电子股份有限公司
Publication of WO2024060446A1 (zh)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00: Computer-aided design [CAD]
    • G06F30/30: Circuit design
    • G06F30/34: Circuit design for reconfigurable circuits, e.g. field programmable gate arrays [FPGA] or programmable logic devices [PLD]
    • G06F30/343: Logical level

Definitions

  • The invention relates to the technical field of circuit simulation, and in particular to a fast linear programming method for high-level synthesis.
  • High-level synthesis refers to the process of automatically converting logical structures described in high-level languages into circuit models described in low-level abstract languages. HLS tools are efficient and fast, which can reduce the design time of hardware engineers and also allow software engineers to complete hardware design.
  • the purpose of embodiments of the present invention is to provide a method, based on integer linear programming and compression tree construction, for implementing fast algorithms in a high-level synthesis tool.
  • Construct a compression tree library which includes several compression tree models.
  • the compression tree models are used to describe the input and output of the hardware circuit and the area cost;
  • the target hardware circuit description is obtained by integration.
  • the step of constructing the compression tree library includes:
  • an adder dot diagram is constructed from the multi-operand addition in the fast carry chain;
  • the output of the compression tree model is determined from the output dots described in the adder dot diagram, and the input of the compression tree model is determined from the input dots described in the adder dot diagram;
  • the area cost is determined from the compression tree value, the output bit width of the compression tree model, and the area efficiency of the compression tree model; the compression tree value is calculated from the number of input bits of the compression tree model.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, includes:
  • the compression tree models include dedicated compression trees and non-dedicated compression trees, with $i \in [0, i_{max}-1]$ and $s \in [0, S_{max}-1]$; $P_{s,k,i}$ is the number of compression tree models of type k located in column i of stage s;
  • $R_{s,k,i}$ is the number of compression trees of type k, drawn from all compression tree model types, that are used for cascading in column i of stage s;
  • $M_{s,l,i}$ is the least significant component of the row adder at column i in stage s; $M_{s,m,i}$ is the middle significant component of the row adder at column i in stage s; $M_{s,h,i}$ is the most significant component of the row adder at column i in stage s.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, also includes:
  • in the non-output stages, the input of the compression tree model is determined according to the second constraint;
  • the second constraint is:
  • $K_e$ is the total number of compression tree model types
  • $i_k$ is the number of input columns of compression tree model type k
  • $I_{k,i}$ is the number of inputs of compression tree model type k located in column i
  • $N_{s,i}$ is the number of input bits located in column i of stage s
  • the auxiliary index used in the second constraint is defined in the same way as i and likewise denotes the column number.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, also includes:
  • the number of output bits of the compression tree model is determined according to the third constraint; the third constraint is:
  • $K_e$ is the total number of compression tree model types
  • $o_k$ is the number of output columns of compression tree model type k
  • $Q_{k,i}$ is the number of outputs of compression tree model type k located in column i
  • $N_{s,i}$ is the number of input bits located in column i of stage s.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, also includes:
  • the number of bits of the integer linear programming model in the output stage is limited according to the fifth constraint; the fifth constraint is:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, also includes:
  • the cascade relationship of the compression tree models is determined according to the sixth constraint; the sixth constraint is:
  • $K_3$ is the number of dedicated compression tree types used for cascading
  • $K_4$ is the number of all compression tree model types used for cascading
  • $o_k$ is the number of output columns of compression tree model type k.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, also includes:
  • the compression tree model in each cascade stage is determined in the integer linear programming model according to the seventh constraint; the seventh constraint is:
  • $K_3$ is the number of dedicated compression tree types used for cascading, and $K_4$ is the number of all compression tree model types used for cascading; $o_k$ is the number of output columns of compression tree model type k.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, also includes:
  • resource binding is performed on the compression tree models according to the eighth constraint; the eighth constraint is:
  • the objective function of the integer linear programming model is determined by minimizing hardware resources; the objective function is:
  • $K_1$ is the number of dedicated compression tree types used for binding in the first mapping
  • $K_2$ is the number of all compression tree model types used for binding in the second mapping
  • $K_e$ is the total number of compression tree model types
  • $A_k$ is the area of compression tree model type k in equivalent LUT6s.
  • the technical solution of this application provides a fast linear programming scheme for high-level synthesis and proposes a compression tree construction method based on integer linear programming.
  • a compression tree library is constructed first; integer linear programming constraints are then generated from the library, an integer linear programming model is built from those constraints, and the model is solved to obtain a compression tree network description, which yields the description of the target hardware circuit; by cascading and binding generalized parallel counters, the method raises the speed and clock frequency of the synthesized circuit without sacrificing area.
  • the method thereby enables fast linear programming and can be widely used in application scenarios that involve designing fast FPGA algorithms.
  • Figure 1 is a flow chart of the steps of the fast linear programming method for high-level synthesis provided in the technical solution of this application;
  • Figure 2 is a schematic structural diagram of a six-input LUT with a fast carry chain in the technical solution of this application;
  • Figure 3 is a schematic diagram of a multi-operand addition abstracted into a dot diagram in the technical solution of this application;
  • Figure 4 shows dot diagrams of compression trees in the technical solution of this application;
  • Figure 5 is a schematic diagram of the mapping between compression trees and the hardware circuit.
  • building on the generalized parallel counters integrated on FPGAs in related technical solutions, a compression tree can be constructed to overcome the slow circuit response and low clock frequency caused by carry-save adders and single-column parallel counters.
  • the technical solution of this application proposes an integer linear programming method for constructing compression trees based on generalized parallel counters on an FPGA; the method of the embodiment supports cascading and binding between generalized parallel counters, thereby improving the speed of the synthesized circuit without sacrificing area, increasing the clock frequency, and realizing fast algorithms.
  • the technical solution of this application provides a fast linear programming method for high-level synthesis; the method includes steps S100-S400:
  • the compression tree library includes several compression trees.
  • the compression trees are used to describe the input and output and area cost of the hardware circuit;
  • the compression tree library is the basis for constructing the compression tree and has a great influence on the performance of the compression tree.
  • each compression tree (model) is represented by its input, output, and area cost.
  • dedicated compression tree types are created for GPCs because of their potential for cascading or binding. For example, when a (6;3) GPC is cascaded to other GPCs, it can be implemented with two LUT6s with carry logic; when it is used for compression tree binding, the logic is disjoint, so an additional LUT6 is needed and three LUT6s are used.
  • a six-input LUT with a fast carry chain is shown in Figure 2 .
  • a slice is the basic reconfigurable unit of Xilinx FPGAs. It contains 4 six-input LUTs (LUT6), 8 registers, multiplexers, and a 4-bit carry chain; each LUT6 can be configured to implement a single six-input function, two five-input functions with shared inputs, or one six-input function and one five-input function with shared inputs and shared values.
  • the carry chain is a dedicated structure used to implement fast addition or subtraction, and can be cascaded to form a larger functional unit.
  • the core work of this embodiment is therefore to build, with fast algorithms as the application, an integer linear programming model for the carry chain. Solving this integer linear programming model yields the compression tree network description of the target hardware circuit, and the target hardware circuit description is obtained by integrating the inputs and outputs in the compression tree network description.
  • the step S100 of constructing the compression tree library in the method may include steps S101-S103;
  • FIG. 3 shows a dot diagram including four four-bit adders.
  • Each dot represents a binary bit of each operand, which can be 0 or 1.
  • a group of points in a column have the same binary weight, ordered from lowest on the right to highest on the left. The points above the line in Figure 3 describe the input to be added, and the points below the line represent the output.
  • a compression tree can be described as $[m_{k-1}, m_{k-2}, \ldots, m_0; n]$, where $m_i$ denotes the number of input bits in column i and n denotes the bit width of the output.
  • the value M of the compression tree and the output bit width n are calculated by the following formulas:
  • the area efficiency of the compression tree is measured by E, which is defined as the number of removed bits λ divided by the number L of lookup tables (LUTs):
  • the inputs, outputs and area cost of each compression tree in step S100 of the embodiment can be expressed by three parameters: M, n, and E.
  • the function of a compression tree can also be represented by a dot diagram.
  • a (6, 0, 7; 5) GPC has 6 input bits in column 2, 7 input bits in column 0, and 5 output bits.
  • the variables of the constructed integer linear programming model and their meanings are shown in Table 1:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S201: summing all compression trees according to the first constraint.
  • all compression trees, including the generalized parallel counters and the row adders of every column and stage, are summed according to constraint 1.
  • constraint 1 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S202: in the non-output stages, determining the input of the compression tree according to the second constraint.
  • Constraint 2 is expressed as:
  • the auxiliary index used in constraint 2 is defined in the same way as i and likewise denotes the column number,
  • and its value ranges from 0 to $i_k$.
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S203: determining the number of output bits of the compression tree according to the third constraint.
  • Constraint 3 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S204: connecting the row adders according to the fourth constraint.
  • Constraint 4 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S205: limiting the number of bits in the output stage of the integer linear programming model according to the fifth constraint.
  • Constraint 5 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S206: determining the cascade relationship of the compression trees according to the sixth constraint.
  • Constraint 6 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S207: determining, according to the seventh constraint, the compression tree in each cascade stage of the integer linear programming model.
  • Constraint 7 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S208: performing resource binding on the compression trees according to the eighth constraint.
  • Constraint 8 is expressed as:
  • the step of generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints, may include step S209: determining the objective function of the integer linear programming model by minimizing hardware resources.
  • the objective function in the embodiment is to minimize hardware resources; the objective function is:
  • step S300 solves the integer linear programming model using the open-source solver lpsolve; step S400 obtains the hardware circuit description using SHANG, a high-level synthesis tool based on a system of difference constraints.
  • Figure 5 shows the schematics and gate-level mappings (FPGA mapping) of the GPCs (7;3), (6,0,7;5), (1,3,5;4) and (2,1,1,7;5) on a Xilinx FPGA.
  • a, bn, cn, and dn represent the GPC inputs related to columns 0, 1, 2, and 3 respectively
  • Zn represents the GPC output.
  • the Xilinx Virtex6 device is taken as an example.
  • the GPC mapping proposed in the technical solution of this application can also be applied to other similar Xilinx FPGAs.
  • the SystemVerilog code of the compression tree circuit description generated by the SHANG high-level synthesis tool equivalently describes the structure of the compression tree.
  • fast multipliers play an important role in many applications of fast algorithms.
  • Experiments were conducted using fast multipliers to more comprehensively test the compression tree implementation method proposed by the present invention.
  • the embodiment sets the bit width n of the multiplicand equal to the bit width of the multiplier, where n varies from 10 to 24.
  • the proposed compression tree and a two-input adder tree are each used to sum the same partial products generated by the Booth algorithm.
  • the method improves the speed of the synthesized circuit without sacrificing area, increases the clock frequency, enables fast linear programming, and can be widely used in application scenarios that involve designing fast FPGA algorithms.
  • the functions/operations noted in the block diagrams may occur out of the order noted in the operational illustrations.
  • two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending on the functionality/operations involved.
  • the embodiments presented and described in the flow diagrams of the present invention are provided by way of example for the purpose of providing a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logical flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of a larger operation are performed independently.
  • logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered a sequenced list of executable instructions for implementing the logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

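The invention provides a fast linear programming method for high-level synthesis. The method includes the following steps: constructing a compression tree library, the compression tree library including several compression trees that describe the inputs, outputs and area cost of a hardware circuit; generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints; solving the integer linear programming model and generating a compression tree network description from the solution; and integrating the compression tree network description to obtain the target hardware circuit description. By cascading and binding generalized parallel counters, the method improves the speed of the synthesized circuit without sacrificing area, increases the clock frequency, and enables fast linear programming; it can be widely applied in the technical field of circuit simulation.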

Description

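Fast linear programming method for high-level synthesis
Technical Field
The present invention relates to the technical field of circuit simulation, and in particular to a fast linear programming method for high-level synthesis.
Background
High-level synthesis (HLS) refers to the process of automatically converting a logical structure described in a high-level language into a circuit model described in a low-abstraction-level language. HLS tools are efficient and fast; they reduce the design time of hardware engineers and also allow software engineers to complete hardware designs.
However, in related technical solutions, early compression trees based on carry-save adders and single-column parallel counters perform well in ASIC designs but do not adapt well to field programmable gate arrays (FPGAs), and may suffer from fairly obvious defects such as slow circuit response and low clock frequency.
Summary of the Invention
In view of this, in order to at least partially solve one of the above technical problems or defects, the purpose of embodiments of the present invention is to provide a method, based on integer linear programming and compression tree construction, for implementing fast algorithms in a high-level synthesis tool.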
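In one aspect, the technical solution of this application provides a fast linear programming method for high-level synthesis, including the following steps:
constructing a compression tree library, the compression tree library including several compression tree models, the compression tree models being used to describe the inputs, outputs and area cost of a hardware circuit;
generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints;
solving the integer linear programming model, and generating a compression tree network description from the solution;
integrating the compression tree network description to obtain the target hardware circuit description.
In a feasible embodiment of the solution of this application, the step of constructing the compression tree library includes:
constructing an adder dot diagram from the multi-operand addition in the fast carry chain;
determining the output of the compression tree model from the output dots described in the adder dot diagram, and determining the input of the compression tree model from the input dots described in the adder dot diagram;
determining the area cost from the compression tree value, the output bit width of the compression tree model, and the area efficiency of the compression tree model; the compression tree value is calculated from the number of input bits of the compression tree model.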
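In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints includes:
summing all compression tree models according to a first constraint; the first constraint is:
$P_{s,k,i} = E_{s,k,i} + F_{s,k,i} + G_{s,k,i} + H_{s,k,i} + R_{s,k,i} + M_{s,l,i} + M_{s,m,i} + M_{s,h,i}$
where the compression tree models include dedicated compression trees and non-dedicated compression trees; $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$; $P_{s,k,i}$ is the number of compression tree models of type k located in column i of stage s; $E_{s,k,i}$ is the number of non-dedicated compression trees of type k located in column i of stage s; $F_{s,k,i}$ is the number of dedicated compression trees of type k in column i of stage s that are mapped for binding; $G_{s,k,i}$ is the number of compression tree models of type k in column i of stage s that are mapped for binding; $H_{s,k,i}$ is the number of dedicated compression trees of type k in column i of stage s that are used for cascading; $R_{s,k,i}$ is the number of compression trees of type k, drawn from all compression tree model types, in column i of stage s that are used for cascading; $M_{s,l,i}$ is the least significant component of the row adder located in column i of stage s; $M_{s,m,i}$ is the middle significant component of the row adder located in column i of stage s; $M_{s,h,i}$ is the most significant component of the row adder located in column i of stage s; $i_{max}$ is the maximum number of columns of the compression tree model; $S_{max}$ is the maximum number of stages of the compression tree model.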
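In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
in the non-output stages, determining the input of the compression tree model according to a second constraint; the second constraint is:
Figure PCTCN2022142014-appb-000001
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$; $K_e$ is the total number of compression tree model types, $i_k$ is the number of input columns of compression tree model type k, $I_{k,i}$ is the number of inputs of compression tree model type k located in column i, $N_{s,i}$ is the number of input bits located in column i of stage s, and
Figure PCTCN2022142014-appb-000002
is defined in the same way as i and likewise denotes the column number.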
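In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
determining the number of output bits of the compression tree model according to a third constraint; the third constraint is:
Figure PCTCN2022142014-appb-000003
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$; $K_e$ is the total number of compression tree model types; $o_k$ is the number of output columns of compression tree model type k; $Q_{k,i}$ is the number of outputs of compression tree model type k located in column i; $N_{s,i}$ is the number of input bits located in column i of stage s.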
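In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
connecting the row adders according to a fourth constraint; the fourth constraint is:
$M_{s,h,i+1} + M_{s,m,i+1} = M_{s,m,i} + M_{s,l,i}$
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$.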
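In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
limiting the number of bits of the integer linear programming model in the output stage according to a fifth constraint; the fifth constraint is:
Figure PCTCN2022142014-appb-000004
where $i \in [0, i_{max}]$, $N_{s,i}$ is the number of input bits located in column i of stage s, and γ is an integer determined by the final adder.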
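In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
determining the cascade relationship of the compression tree models according to a sixth constraint; the sixth constraint is:
Figure PCTCN2022142014-appb-000005
where $i \in [0, i_{max}]$, $s \in [1, S_{max}-2]$; $K_3$ is the number of dedicated compression tree types used for cascading, and $K_4$ is the number of all compression tree model types used for cascading; $o_k$ is the number of output columns of compression tree model type k.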
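In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
determining, according to a seventh constraint, the compression tree model in each cascade stage of the integer linear programming model; the seventh constraint is:
Figure PCTCN2022142014-appb-000006
where $i \in [0, i_{max}]$, $s \in [1, S_{max}-2]$; $K_3$ is the number of dedicated compression tree types used for cascading, and $K_4$ is the number of all compression tree model types used for cascading; $o_k$ is the number of output columns of compression tree model type k.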
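In a feasible embodiment of the solution of this application, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
performing resource binding on the compression tree models according to an eighth constraint; the eighth constraint is:
Figure PCTCN2022142014-appb-000007
determining the objective function of the integer linear programming model by minimizing hardware resources; the objective function is:
Figure PCTCN2022142014-appb-000008
where $K_1$ is the number of dedicated compression tree types used for binding in the first mapping; $K_2$ is the number of all compression tree model types used for binding in the second mapping; $K_e$ is the total number of compression tree model types; $A_k$ is the area of compression tree model type k in equivalent LUT6s.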
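The advantages and beneficial effects of the present invention are given in part in the following description; the remainder can be understood from the specific embodiments of the invention:
The technical solution of this application provides a fast linear programming scheme for high-level synthesis and proposes a compression tree construction method based on integer linear programming. A compression tree library is constructed first; integer linear programming constraints are then generated from the library, an integer linear programming model is built from those constraints, and the model is solved to obtain a compression tree network description, which yields the description of the target hardware circuit. By cascading and binding generalized parallel counters, the method improves the speed of the synthesized circuit without sacrificing area, increases the clock frequency, enables fast linear programming, and can be widely used in application scenarios that involve designing fast FPGA algorithms.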
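Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flow chart of the steps of the fast linear programming method for high-level synthesis provided in the technical solution of this application;
Figure 2 is a schematic structural diagram of a six-input LUT with a fast carry chain in the technical solution of this application;
Figure 3 is a schematic diagram of a multi-operand addition abstracted into a dot diagram in the technical solution of this application;
Figure 4 shows dot diagrams of compression trees in the technical solution of this application;
Figure 5 is a schematic diagram of the mapping between compression trees and the hardware circuit;
Reference numerals: 501, full adder; 502, half adder; 503, multiplexer; 504, XOR gate.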
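Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be understood as limiting it. The step numbers in the following embodiments are provided only for ease of explanation and do not restrict the order of the steps; the execution order of the steps in the embodiments can be adapted according to the understanding of those skilled in the art.
Building on the generalized parallel counters integrated on FPGAs in current related technical solutions, a compression tree can be constructed to overcome technical defects such as slow circuit response and low clock frequency caused by carry-save adders and single-column parallel counters. To this end, the technical solution of this application proposes an integer linear programming method for constructing compression trees based on generalized parallel counters on an FPGA. The method of the embodiment supports cascading and binding between generalized parallel counters, thereby improving the speed of the synthesized circuit without sacrificing area, increasing the clock frequency, and realizing fast algorithms.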
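In a first aspect, as shown in Figure 1, the technical solution of this application provides a fast linear programming method for high-level synthesis; the method includes steps S100-S400:
S100. Construct a compression tree library, the compression tree library including several compression trees, the compression trees being used to describe the inputs, outputs and area cost of a hardware circuit;
Specifically, in the embodiment, the compression tree library is the basis for constructing the compression tree and strongly influences its performance. Each compression tree (model) is represented by its inputs, outputs and area cost. In the embodiment, dedicated compression tree types are created for GPCs because of their potential for cascading or binding. For example, when a (6;3) GPC is cascaded to other GPCs, it can be implemented with two LUT6s with carry logic; when it is used for compression tree binding, the logic is disjoint, so an additional LUT6 is needed and three LUT6s are used.
S200. Generate integer linear programming constraints based on the compression tree library, and construct an integer linear programming model according to the integer linear programming constraints;
S300. Solve the integer linear programming model, and generate a compression tree network description from the solution;
S400. Integrate the compression tree network description to obtain the target hardware circuit description;
As an example, Figure 2 shows a six-input LUT with a fast carry chain. A slice is the basic reconfigurable unit of Xilinx FPGAs; it contains 4 six-input LUTs (LUT6), 8 registers, multiplexers and a 4-bit carry chain. Each LUT6 can be configured to implement a single six-input function, two five-input functions with shared inputs, or one six-input function and one five-input function with shared inputs and shared values. The carry chain is a dedicated structure for fast addition or subtraction and can be cascaded to form a larger functional unit. The core work of this embodiment is therefore to build, with fast algorithms as the application, an integer linear programming model for the carry chain. Solving this integer linear programming model yields the compression tree network description of the target hardware circuit, and the target hardware circuit description is obtained by integrating the inputs, outputs and related information in the compression tree network description.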
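In some feasible embodiments, step S100 of constructing the compression tree library may include steps S101-S103:
S101. Construct an adder dot diagram from the multi-operand addition in the fast carry chain;
S102. Determine the output of the compression tree from the output dots described in the adder dot diagram, and determine the input of the compression tree from the input dots described in the adder dot diagram;
S103. Determine the area cost from the compression tree value, the output bit width of the compression tree, and the area efficiency of the compression tree; the compression tree value is calculated from the number of input bits of the compression tree.
As an example, as shown in Figure 3, a multi-operand addition can be abstracted into a dot diagram to make analysis and calculation easier; Figure 3 shows a dot diagram containing four four-bit adders. Each dot represents one binary bit of an operand, which can be 0 or 1. The dots in one column share the same binary weight, and the weights are ordered from low on the right to high on the left. The dots above the line in Figure 3 describe the inputs to be added, and the dots below the line represent the output.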
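A minimal Python sketch of the dot-diagram abstraction follows. It only counts the dots per column for a set of unsigned operands and the width needed for their sum; the function names `dot_diagram` and `output_width` are illustrative assumptions and do not appear in the application.

```python
from typing import List


def dot_diagram(operand_widths: List[int]) -> List[int]:
    """Return the number of dots in each column (index 0 = least significant).

    Each operand contributes one dot to every column below its bit width.
    """
    heights = [0] * max(operand_widths)
    for width in operand_widths:
        for col in range(width):
            heights[col] += 1
    return heights


def output_width(operand_widths: List[int]) -> int:
    """Bits needed to hold the largest possible sum of the unsigned operands."""
    max_sum = sum((1 << w) - 1 for w in operand_widths)
    return max_sum.bit_length()


# Four four-bit operands, as in the situation described for Figure 3.
print(dot_diagram([4, 4, 4, 4]), output_width([4, 4, 4, 4]))
```

For four four-bit operands every column holds four dots, and the largest sum (4 × 15 = 60) needs six output bits.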
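Further, in the embodiment, as shown in Figure 4, a compression tree can be described as $[m_{k-1}, m_{k-2}, \ldots, m_0; n]$, where $m_i$ denotes the number of input bits in column i and n denotes the bit width of the output. The value M of the compression tree and the output bit width n are calculated by the following formulas:
Figure PCTCN2022142014-appb-000009
$n = \log_2(M_{max} + 1)$   (2)
The area efficiency of the compression tree is measured by E, defined as the number of removed bits λ divided by the number L of lookup tables (LUTs):
Figure PCTCN2022142014-appb-000010
Therefore, the inputs, outputs and area cost of each compression tree in step S100 of the embodiment can be expressed by the three parameters M, n and E. Figure 4 depicts the dot diagrams of the GPCs (6,0,7;5), (1,5;3) and (6;3); the function of a compression tree can also be represented by a dot diagram. A (6,0,7;5) GPC has 6 input bits in column 2, 7 input bits in column 0, and 5 output bits. When all input bits are set to 1, the (6,0,7;5) GPC reaches its maximum value $M_{max} = 6 \times 2^2 + 7 \times 2^0 = 31$, and its output bit width is $n = \log_2(31 + 1) = 5$. In addition, its hardware cost is 4 six-input LUTs, and it removes 8 bits, measured as the difference between the number of input bits (13) and the number of output bits (5). Therefore, the efficiency of the (6,0,7;5) GPC is E = 8/4 = 2.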
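The worked example can be checked with a short Python sketch. It assumes that the value of a compression tree is the weighted sum of its column inputs (column i has weight 2^i), which is what the worked example implies, and it uses the integer bit length as a stand-in for $\log_2(M_{max}+1)$ rounded up. The function name `gpc_parameters` is hypothetical, and the LUT count is supplied by the caller because it depends on the mapping.

```python
from typing import List, Tuple


def gpc_parameters(inputs_msb_first: List[int], lut_count: int) -> Tuple[int, int, int, float]:
    """Return (M_max, n, removed_bits, E) for a GPC written as (m_{k-1}, ..., m_0; n).

    inputs_msb_first: input bits per column, most significant column first.
    lut_count: number of LUTs L used by the mapping (taken from the text, not derived).
    """
    k = len(inputs_msb_first)
    # Maximum value: every input bit is 1, column i carries weight 2**i.
    m_max = sum(m * 2 ** (k - 1 - pos) for pos, m in enumerate(inputs_msb_first))
    # Output width: smallest n with 2**n > M_max.
    n = m_max.bit_length()
    # Removed bits: difference between input bits and output bits.
    removed = sum(inputs_msb_first) - n
    return m_max, n, removed, removed / lut_count


# (6,0,7;5) GPC mapped to 4 six-input LUTs, as stated in the description.
print(gpc_parameters([6, 0, 7], lut_count=4))
```

With the figures from the description, the sketch reproduces M_max = 31, n = 5, eight removed bits and E = 2.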
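In some feasible embodiments, the variables of the integer linear programming model constructed in step S200 and their meanings are shown in Table 1:
Table 1
Variable | Meaning
$S_{max}$ | Maximum number of stages of the compression tree
$i_{max}$ | Maximum number of columns of the compression tree
$I_{k,i}$ | Number of inputs of compression tree type k located in column i
$Q_{k,i}$ | Number of outputs of compression tree type k located in column i
$P_{s,k,i}$ | Number of compression trees of type k located in column i of stage s
$E_{s,k,i}$ | Number of non-dedicated compression trees of type k located in column i of stage s
$N_{s,i}$ | Number of input bits located in column i of stage s
$K_e$ | Total number of compression tree types
$i_k$ | Number of input columns of compression tree type k
$o_k$ | Number of output columns of compression tree type k
γ | Integer determined by the final adder
$M_{s,l,i}$ | Least significant component of the row adder located in column i of stage s
$M_{s,m,i}$ | Middle significant component of the row adder located in column i of stage s
$M_{s,h,i}$ | Most significant component of the row adder located in column i of stage s
$K_1$ | Number of dedicated compression tree types used for binding
$K_2$ | Number of all compression tree types used for binding
$F_{s,k,i}$ | Number of dedicated compression trees of type k in column i of stage s mapped for binding
$G_{s,k,i}$ | Number of compression trees of type k (over all types) in column i of stage s mapped for binding
$K_3$ | Number of dedicated compression tree types used for cascading
$K_4$ | Number of all compression tree types used for cascading
$H_{s,k,i}$ | Number of dedicated compression trees of type k in column i of stage s used for cascading
$R_{s,k,i}$ | Number of compression trees of type k (over all types) in column i of stage s used for cascading
$A_k$ | Area of compression tree type k in equivalent LUT6s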
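In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S201: summing all compression trees according to the first constraint.
Specifically, the embodiment sums all compression trees according to constraint 1, including the generalized parallel counters and the row adders of every column and stage. Constraint 1 is expressed as:
$P_{s,k,i} = E_{s,k,i} + F_{s,k,i} + G_{s,k,i} + H_{s,k,i} + R_{s,k,i} + M_{s,l,i} + M_{s,m,i} + M_{s,h,i}$
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$.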
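Constraint 1 is a plain bookkeeping identity, which a small Python sketch can make concrete. The class `StageColumnCounts` and the function `satisfies_constraint_1` are hypothetical names; the fields simply mirror the symbols of Table 1 for one stage, type and column.

```python
from dataclasses import dataclass


@dataclass
class StageColumnCounts:
    """Counts for one (stage s, type k, column i) cell; fields mirror Table 1."""
    E: int = 0    # non-dedicated compression trees
    F: int = 0    # dedicated trees mapped for binding
    G: int = 0    # trees (all types) mapped for binding
    H: int = 0    # dedicated trees used for cascading
    R: int = 0    # trees (all types) used for cascading
    M_l: int = 0  # least significant row-adder component
    M_m: int = 0  # middle significant row-adder component
    M_h: int = 0  # most significant row-adder component

    def total(self) -> int:
        """Right-hand side of constraint 1."""
        return (self.E + self.F + self.G + self.H + self.R
                + self.M_l + self.M_m + self.M_h)


def satisfies_constraint_1(P: int, counts: StageColumnCounts) -> bool:
    """True when P equals the sum of all categories, as constraint 1 requires."""
    return P == counts.total()


cell = StageColumnCounts(E=2, H=1, M_l=1)
print(cell.total(), satisfies_constraint_1(4, cell))
```

In an actual ILP these quantities would be integer decision variables and the identity would be added as an equality constraint; the sketch only evaluates it for fixed numbers.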
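In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S202: in the non-output stages, determining the input of the compression tree according to the second constraint.
Specifically, the embodiment uses constraint 2 to ensure that, except in the output stage, all bits of every column and stage are used as inputs of compression trees. Constraint 2 is expressed as:
Figure PCTCN2022142014-appb-000011
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$,
Figure PCTCN2022142014-appb-000012
is defined in the same way as i and likewise denotes the column number, and
Figure PCTCN2022142014-appb-000013
takes values from 0 to $i_k$.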
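In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S203: determining the number of output bits of the compression tree according to the third constraint.
Specifically, the embodiment uses constraint 3 to calculate the number of output bits produced by the compression trees. Constraint 3 is expressed as:
Figure PCTCN2022142014-appb-000014
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$.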
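The formulas for constraints 2 and 3 are given as images and are not reproduced in this text, so the following Python sketch is only one plausible reading of their textual description: every bit of a column must be fed to some compression tree input, and the bits of the next stage are the outputs that the placed compression trees produce. The library contents, the placement format and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Shape of a compression tree type: (inputs per column, outputs per column), LSB first.
Shape = Tuple[List[int], List[int]]
# Placement: (type name, column of its least significant input) -> how many are placed.
Placements = Dict[Tuple[str, int], int]

LIBRARY: Dict[str, Shape] = {
    "(3;2)": ([3], [1, 1]),
    "(6;3)": ([6], [1, 1, 1]),
}


def stage_capacity(placements: Placements, library: Dict[str, Shape], n_cols: int) -> List[int]:
    """Input bits that the placed compression trees can absorb in each column."""
    capacity = [0] * n_cols
    for (name, col), count in placements.items():
        inputs, _ = library[name]
        for offset, m in enumerate(inputs):
            if col + offset < n_cols:
                capacity[col + offset] += m * count
    return capacity


def all_bits_consumed(heights: List[int], placements: Placements, library: Dict[str, Shape]) -> bool:
    """Reading of constraint 2: every bit of every column is an input of some tree."""
    capacity = stage_capacity(placements, library, len(heights))
    return all(c >= h for c, h in zip(capacity, heights))


def stage_outputs(placements: Placements, library: Dict[str, Shape]) -> List[int]:
    """Reading of constraint 3: bits per column produced for the next stage."""
    heights: List[int] = []
    for (name, col), count in placements.items():
        _, outputs = library[name]
        for offset, q in enumerate(outputs):
            idx = col + offset
            while len(heights) <= idx:
                heights.append(0)
            heights[idx] += q * count
    return heights


# One (6;3) GPC per column of a four-column diagram with four dots per column.
placements = {("(6;3)", c): 1 for c in range(4)}
print(all_bits_consumed([4, 4, 4, 4], placements, LIBRARY), stage_outputs(placements, LIBRARY))
```

In this toy stage the sixteen input dots are reduced to twelve dots spread over six columns; unused GPC inputs are assumed to be tied to zero.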
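In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S204: connecting the row adders according to the fourth constraint.
Specifically, the embodiment uses constraint 4 to ensure that the row adders are connected correctly. Constraint 4 is expressed as:
$M_{s,h,i+1} + M_{s,m,i+1} = M_{s,m,i} + M_{s,l,i}$
where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$.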
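The following Python sketch illustrates the row-adder connection rule under the assumption that the operators lost from the extracted formula are additions, i.e. that the number of middle or most significant components in column i+1 equals the number of least significant or middle components in column i. The function name `row_adders_connected` and the list-based encoding are illustrative only.

```python
from typing import List


def row_adders_connected(low: List[int], mid: List[int], high: List[int]) -> bool:
    """Check the connection rule for one stage.

    low[i], mid[i], high[i] are the numbers of least, middle and most
    significant row-adder components placed in column i.
    Assumed rule: high[i+1] + mid[i+1] == mid[i] + low[i] for neighbouring columns.
    """
    for i in range(len(low) - 1):
        if high[i + 1] + mid[i + 1] != mid[i] + low[i]:
            return False
    return True


# One row adder spanning columns 0..3: low part, two middle parts, high part.
print(row_adders_connected([1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]))
# A low part with nothing to continue it in column 1 violates the rule.
print(row_adders_connected([1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]))
```

Intuitively, every component that starts or continues a carry chain in one column must be continued or terminated in the next column.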
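In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S205: limiting the number of bits of the integer linear programming model in the output stage according to the fifth constraint.
Specifically, the embodiment uses constraint 5 to limit the number of bits in the output stage. Constraint 5 is expressed as:
Figure PCTCN2022142014-appb-000015
where $i \in [0, i_{max}]$.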
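In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S206: determining the cascade relationship of the compression trees according to the sixth constraint.
Specifically, the embodiment uses constraint 6 to ensure that a compression tree can be cascaded only once. Constraint 6 is expressed as:
Figure PCTCN2022142014-appb-000016
where $i \in [0, i_{max}]$, $s \in [1, S_{max}-2]$.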
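In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S207: determining, according to the seventh constraint, the compression tree in each cascade stage of the integer linear programming model.
Specifically, the embodiment uses constraint 7 to ensure that cascading is only possible when a suitable compression tree exists in the next stage. Constraint 7 is expressed as:
Figure PCTCN2022142014-appb-000017
where $i \in [0, i_{max}]$, $s \in [1, S_{max}-2]$.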
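In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S208: performing resource binding on the compression trees according to the eighth constraint.
Specifically, the embodiment uses constraint 8 to ensure the correct use of the specific compression trees intended for resource binding. Constraint 8 is expressed as:
Figure PCTCN2022142014-appb-000018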
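In some feasible implementations, the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints may include step S209: determining the objective function of the integer linear programming model by minimizing hardware resources.
Specifically, the objective function in the embodiment is to minimize hardware resources; the objective function is:
Figure PCTCN2022142014-appb-000019
In some other feasible embodiments, step S300 solves the integer linear programming model using the open-source solver lpsolve, and the step of obtaining the hardware circuit description (S400) uses SHANG, a high-level synthesis tool based on a system of difference constraints.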
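As a minimal sketch of how a model could be handed to such a solver, the Python function below writes a toy integer program in the textual LP format that lp_solve reads. The toy model (cover a single column of four bits at minimum LUT cost) is far simpler than the eight-constraint model of the application. The counter names `fa` and `gpc63` and the function `build_lp` are hypothetical, and the cost of one LUT for `fa` is an assumption; the cost of three LUT6s for a bound (6;3) GPC follows the description above.

```python
from typing import Dict, Tuple


def build_lp(column_height: int, counters: Dict[str, Tuple[int, int]]) -> str:
    """Return a toy single-column covering model in lp_solve LP format.

    counters maps a variable name to (input bits absorbed, LUT area).
    """
    names = list(counters)
    # Objective: total LUT area of the chosen counters.
    objective = " + ".join(f"{counters[n][1]} {n}" for n in names)
    # Covering constraint: the chosen counters must absorb every bit of the column.
    cover = " + ".join(f"{counters[n][0]} {n}" for n in names)
    lines = [
        "/* toy single-column covering model */",
        f"min: {objective};",
        f"c1: {cover} >= {column_height};",
        f"int {', '.join(names)};",
    ]
    return "\n".join(lines) + "\n"


print(build_lp(4, {"fa": (3, 1), "gpc63": (6, 3)}))
```

The resulting text can be saved to a file and passed to the lp_solve command-line tool; the sketch itself does not invoke any solver.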
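Figure 5 shows the schematics and gate-level mappings (FPGA mapping) of the GPCs (7;3), (6,0,7;5), (1,3,5;4) and (2,1,1,7;5) on a Xilinx FPGA. In this embodiment, a, bn, cn and dn denote the GPC inputs associated with columns 0, 1, 2 and 3 respectively, and Zn denotes the GPC output. The embodiment takes a Xilinx Virtex6 device as an example; the GPC mapping proposed in the technical solution of this application can also be applied to other similar Xilinx FPGAs.
In some feasible implementations, the SystemVerilog code of the compression tree circuit description generated by the SHANG high-level synthesis tool equivalently describes the structure of the compression tree.
In summary, the technical solution of this application is widely applicable to fast FPGA algorithms that contain a large number of multiplications and additions.
As an example, fast multipliers play an important role in many applications of fast algorithms. Experiments with fast multipliers were carried out to test the proposed compression tree implementation method more thoroughly. In the experiments, the embodiment sets the bit width n of the multiplicand equal to the bit width of the multiplier, with n varying from 10 to 24. The proposed compression tree and a two-input adder tree are each used to sum the same partial products generated by the Booth algorithm.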
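To illustrate where the partial products come from, the sketch below recodes a signed multiplier with the standard radix-4 (modified) Booth scheme and forms the partial products that a compression tree or adder tree would then sum. The application does not state which Booth variant was used, so radix-4 is an assumption, and the function names are illustrative.

```python
from typing import List


def booth_radix4_digits(multiplier: int, width: int) -> List[int]:
    """Recode a two's-complement multiplier into radix-4 Booth digits in {-2,...,2}.

    Digits are returned least significant first; an odd width is sign-extended
    by one bit so that the bits can be grouped in pairs.
    """
    if width % 2:
        width += 1
    digits = []
    prev = 0  # implicit bit b_{-1}
    for j in range(0, width, 2):
        b0 = (multiplier >> j) & 1
        b1 = (multiplier >> (j + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_partial_products(multiplicand: int, multiplier: int, width: int) -> List[int]:
    """One partial product per Booth digit, already shifted to its weight 4**j."""
    digits = booth_radix4_digits(multiplier, width)
    return [d * multiplicand * 4 ** j for j, d in enumerate(digits)]


products = booth_partial_products(5, 6, width=4)
print(products, sum(products))
```

Radix-4 recoding roughly halves the number of partial products compared with one product per multiplier bit, which is why the rows fed into the summation tree are fewer than the operand width.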
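Table 2 shows the implementation results for fast multiplication. Multipliers built by combining the Booth algorithm with the proposed compression tree are denoted "Booth+prop. ILP", and those built by combining the Booth algorithm with a two-input adder tree are denoted "Booth+Adder tree". Compared with the adder tree, the proposed compression tree reduces the average number of slices significantly, by 27.86%, while the average maximum clock frequency increases slightly (by 0.14%). In other words, under the same area constraint, the method of the present invention for implementing fast algorithms in a high-level synthesis tool achieves better speed performance.
Table 2
Figure PCTCN2022142014-appb-000020
From the specific implementation process described above, it can be concluded that the technical solution provided by the present invention has the following advantages over the prior art:
By cascading and binding generalized parallel counters, the method improves the speed of the synthesized circuit without sacrificing area, increases the clock frequency, enables fast linear programming, and can be widely used in application scenarios that involve designing fast FPGA algorithms.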
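In some alternative embodiments, the functions/operations noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functionality/operations involved. In addition, the embodiments presented and described in the flow charts of the present invention are provided by way of example, for the purpose of giving a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logical flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated, one or more of the functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be understood that a detailed discussion of the actual implementation of each module is not necessary for understanding the invention. Rather, given the attributes, functions and internal relationships of the various functional modules in the apparatus disclosed herein, the actual implementation of the modules will be within the routine skill of an engineer. A person skilled in the art, applying ordinary skill, can therefore implement the invention set forth in the claims without undue experimentation. It will also be understood that the specific concepts disclosed are merely illustrative and are not intended to limit the scope of the invention, which is determined by the full scope of the appended claims and their equivalents.
Logic and/or steps represented in the flow charts or otherwise described herein may, for example, be considered a sequenced list of executable instructions for implementing the logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them).
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.
The above is a specific description of preferred implementations of the present invention, but the invention is not limited to the above embodiments. Those skilled in the art can make various equivalent variations or substitutions without departing from the spirit of the invention, and these equivalent variations or substitutions are all included within the scope defined by the claims of this application.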

Claims (10)

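  1. A fast linear programming method for high-level synthesis, characterized by comprising the following steps:
    constructing a compression tree library, the compression tree library including several compression tree models, the compression tree models being used to describe the inputs, outputs and area cost of a hardware circuit;
    generating integer linear programming constraints based on the compression tree library, and constructing an integer linear programming model according to the integer linear programming constraints;
    solving the integer linear programming model, and generating a compression tree network description from the solution;
    integrating the compression tree network description to obtain the target hardware circuit description.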
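  2. The fast linear programming method for high-level synthesis according to claim 1, characterized in that the step of constructing the compression tree library includes:
    constructing an adder dot diagram from the multi-operand addition in the fast carry chain;
    determining the output of the compression tree model from the output dots described in the adder dot diagram, and determining the input of the compression tree model from the input dots described in the adder dot diagram;
    determining the area cost from the compression tree value, the output bit width of the compression tree model, and the area efficiency of the compression tree model; the compression tree value is calculated from the number of input bits of the compression tree model.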
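  3. The fast linear programming method for high-level synthesis according to claim 2, characterized in that the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints includes:
    summing all compression tree models according to a first constraint; the first constraint is:
    $P_{s,k,i} = E_{s,k,i} + F_{s,k,i} + G_{s,k,i} + H_{s,k,i} + R_{s,k,i} + M_{s,l,i} + M_{s,m,i} + M_{s,h,i}$
    where the compression tree models include dedicated compression trees and non-dedicated compression trees; $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$; $P_{s,k,i}$ is the number of compression tree models of type k located in column i of stage s; $E_{s,k,i}$ is the number of non-dedicated compression trees of type k located in column i of stage s; $F_{s,k,i}$ is the number of dedicated compression trees of type k in column i of stage s that are mapped for binding; $G_{s,k,i}$ is the number of compression tree models of type k in column i of stage s that are mapped for binding; $H_{s,k,i}$ is the number of dedicated compression trees of type k in column i of stage s that are used for cascading; $R_{s,k,i}$ is the number of compression trees of type k, drawn from all compression tree model types, in column i of stage s that are used for cascading; $M_{s,l,i}$ is the least significant component of the row adder located in column i of stage s; $M_{s,m,i}$ is the middle significant component of the row adder located in column i of stage s; $M_{s,h,i}$ is the most significant component of the row adder located in column i of stage s; $i_{max}$ is the maximum number of columns of the compression tree model; $S_{max}$ is the maximum number of stages of the compression tree model.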
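  4. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compression tree library and constructing an integer linear programming model according to the integer linear programming constraints further includes:
    in the non-output stages, determining the input of the compression tree model according to a second constraint; the second constraint is:
    Figure PCTCN2022142014-appb-100001
    where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$; $K_e$ is the total number of compression tree model types, $i_k$ is the number of input columns of compression tree model type k, $I_{k,i}$ is the number of inputs of compression tree model type k located in column i, and $N_{s,i}$ is the number of input bits located in column i of stage s.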
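  5. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compressor tree library and constructing an integer linear programming model from the integer linear programming constraints further comprises:
    determining the number of output bits of the compressor tree models according to a third constraint; the third constraint being:
    Figure PCTCN2022142014-appb-100002
    where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$; $K_e$ is the total number of compressor tree model types; $o_k$ is the number of output columns of compressor tree model type $k$; $Q_{k,i}$ is the number of outputs of compressor tree model type $k$ located at column $i$; $N_{s,i}$ is the number of input bits located at column $i$ in stage $s$.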
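  6. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compressor tree library and constructing an integer linear programming model from the integer linear programming constraints further comprises:
    connecting the row adders according to a fourth constraint; the fourth constraint being:
    $$M_{s,h,i+1} + M_{s,m,i+1} = M_{s,m,i} + M_{s,l,i}$$
    where $i \in [0, i_{max}-1]$, $s \in [0, S_{max}-1]$.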
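  7. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compressor tree library and constructing an integer linear programming model from the integer linear programming constraints further comprises:
    limiting the number of bits of the integer linear programming model in the output stage according to a fifth constraint; the fifth constraint being:
    Figure PCTCN2022142014-appb-100003
    where $i \in [0, i_{max}]$, $N_{s,i}$ is the number of input bits located at column $i$ in stage $s$; $\gamma$ is an integer determined by the final adder.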
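  8. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compressor tree library and constructing an integer linear programming model from the integer linear programming constraints further comprises:
    determining the cascading relationship of the compressor tree models according to a sixth constraint; the sixth constraint being:
    Figure PCTCN2022142014-appb-100004
    where $i \in [0, i_{max}]$, $s \in [1, S_{max}-2]$; $K_3$ is the number of dedicated compressor tree types used for cascading, and $K_4$ is the number of all compressor tree model types used for cascading; $o_k$ is the number of output columns of compressor tree model type $k$.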
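  9. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compressor tree library and constructing an integer linear programming model from the integer linear programming constraints further comprises:
    determining, according to a seventh constraint, the compressor tree models in each cascading stage of the integer linear programming model; the seventh constraint being:
    Figure PCTCN2022142014-appb-100005
    where $i \in [0, i_{max}]$, $s \in [1, S_{max}-2]$; $K_3$ is the number of dedicated compressor tree types used for cascading, and $K_4$ is the number of all compressor tree model types used for cascading; $o_k$ is the number of output columns of compressor tree model type $k$.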
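  10. The fast linear programming method for high-level synthesis according to claim 3, characterized in that the step of generating integer linear programming constraints based on the compressor tree library and constructing an integer linear programming model from the integer linear programming constraints further comprises:
    performing resource binding on the compressor tree models according to an eighth constraint; the eighth constraint being:
    Figure PCTCN2022142014-appb-100006
    determining the objective function of the integer linear programming model by minimizing hardware resources, the objective function being:
    Figure PCTCN2022142014-appb-100007
    where $K_1$ is the number of dedicated compressor tree types used for binding in the first mapping; $K_2$ is the number of all compressor tree model types used for binding in the second mapping; $K_e$ is the total number of compressor tree model types; $A_k$ is the area of compressor tree model type $k$ in equivalent LUT6s.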
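
The flow recited in the claims (a library of compressors with inputs, outputs, and area cost; per-column bit counts per stage; an area-minimizing selection; a row-adder linkage condition) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the patented formulation: the two library entries and their area values are hypothetical, the names `Compressor`, `apply_stage`, `cheapest_single_stage`, and `row_adder_linked` are invented here, an exhaustive search over one stage stands in for a real integer linear programming solver, unused bits are assumed to pass to the next stage unchanged, and the fourth constraint is read as a sum on each side of the equality.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Compressor:
    """Hypothetical library entry: bits per column (LSB first) and an area cost."""
    name: str
    inputs: tuple   # inputs[j]  = bits consumed from column i + j
    outputs: tuple  # outputs[j] = bits produced into column i + j
    area: float     # assumed cost in LUT6 equivalents


# Illustrative library only; the area values are assumptions, not patent data.
LIBRARY = (
    Compressor("FA(3;2)", (3,), (1, 1), 1.0),
    Compressor("HA(2;2)", (2,), (1, 1), 1.0),
)


def apply_stage(heights, placement, library=LIBRARY):
    """Return next-stage column heights, or None if a column is over-consumed.

    heights[i] plays the role of N_{s,i}; placement[(k, i)] is the number of
    compressors of type k anchored at column i. Unused bits pass through.
    """
    span = max(max(len(c.inputs), len(c.outputs)) for c in library)
    padded = list(heights) + [0] * span
    consumed = [0] * len(padded)
    produced = [0] * len(padded)
    for (k, i), count in placement.items():
        for j, bits in enumerate(library[k].inputs):
            consumed[i + j] += count * bits
        for j, bits in enumerate(library[k].outputs):
            produced[i + j] += count * bits
    if any(c > h for c, h in zip(consumed, padded)):
        return None
    return [h - c + p for h, c, p in zip(padded, consumed, produced)]


def area_cost(placement, library=LIBRARY):
    """Objective: total area, i.e. the sum of A_k times the count of type k."""
    return sum(library[k].area * count for (k, _), count in placement.items())


def cheapest_single_stage(heights, gamma, max_count=2, library=LIBRARY):
    """Brute-force the cheapest one-stage placement leaving at most gamma bits per column."""
    slots = [(k, i) for k in range(len(library)) for i in range(len(heights))]
    best = None
    for counts in product(range(max_count + 1), repeat=len(slots)):
        placement = dict(zip(slots, counts))
        nxt = apply_stage(heights, placement, library)
        if nxt is None or max(nxt) > gamma:
            continue
        cost = area_cost(placement, library)
        if best is None or cost < best[0]:
            best = (cost, placement, nxt)
    return best


def row_adder_linked(m_low, m_mid, m_high):
    """Check M_h[i+1] + M_m[i+1] == M_m[i] + M_l[i] for each adjacent column pair."""
    return all(
        m_high[i + 1] + m_mid[i + 1] == m_mid[i] + m_low[i]
        for i in range(len(m_low) - 1)
    )
```

Calling `cheapest_single_stage([3, 3], gamma=2)` searches every placement of up to two compressors of each type per column and returns a `(cost, placement, next_heights)` tuple, or `None` when no single stage suffices. A practical implementation would instead hand the same variables and constraints, extended over all stages, to an integer linear programming solver; the exhaustive search here only keeps the sketch dependency-free.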
PCT/CN2022/142014 2022-09-22 2022-12-26 Fast linear programming method for high-level synthesis WO2024060446A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211170165.4 2022-09-22
CN202211170165.4A CN115438614A (zh) 2022-09-22 2022-09-22 Fast linear programming method for high-level synthesis

Publications (1)

Publication Number Publication Date
WO2024060446A1 (zh) 2024-03-28

Family

ID=84248821

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/142014 WO2024060446A1 (zh) 2022-09-22 2022-12-26 Fast linear programming method for high-level synthesis

Country Status (2)

Country Link
CN (1) CN115438614A (zh)
WO (1) WO2024060446A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115438614A (zh) * 2022-09-22 2022-12-06 中山大学 Fast linear programming method for high-level synthesis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110153709A1 (en) * 2009-12-23 2011-06-23 Juinn-Dar Huang Delay optimal compressor tree synthesis for lut-based fpgas
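CN104063558A (zh) * 2014-07-08 2014-09-24 领佰思自动化科技(上海)有限公司 Linear-programming-based channel routing method for large-scale integrated circuits
CN106682258A (zh) * 2016-11-16 2017-05-17 中山大学 Method and system for optimizing multi-operand addition in a high-level synthesis tool
CN115438614A (zh) * 2022-09-22 2022-12-06 中山大学 Fast linear programming method for high-level synthesis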

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TU, LE ET AL.: "Improved Synthesis of Compressor Trees in High-Level Synthesis for Modern FPGAs", IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, vol. 37, no. 12, 12 December 2018 (2018-12-12), pages 3206 - 3210, XP011697656, ISSN: 0278-0070, DOI: 10.1109/TCAD.2018.2801241 *

Also Published As

Publication number Publication date
CN115438614A (zh) 2022-12-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22959444

Country of ref document: EP

Kind code of ref document: A1