CN104779951B - Optimized implementation method for a general-purpose bidirectional counter based on FPGA - Google Patents

Info

Publication number
CN104779951B
Authority
CN
China
Prior art keywords
look
logic
input signal
input
carry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410011494.3A
Other languages
Chinese (zh)
Other versions
CN104779951A (en)
Inventor
樊平
耿嘉
刘明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingwei Qili Beijing Technology Co ltd
Original Assignee
Capital Microelectronics Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Capital Microelectronics Beijing Technology Co Ltd
Priority to CN201410011494.3A
Publication of CN104779951A
Application granted
Publication of CN104779951B
Anticipated expiration

Landscapes

  • Logic Circuits (AREA)

Abstract

The present invention relates to an optimized implementation method for a general-purpose bidirectional counter based on an FPGA. The method includes: outputting a first addend through the logic of a first look-up table; feeding all four logic input signals of the counter into a second look-up table, which outputs a sum after logical operation; continuously selecting, through a winding structure, the first addend as the first input signal of a carry gate, where the inputs of the winding structure are the first addend and one logic input signal of the second look-up table; selecting, by the carry gate according to the sum, the first input signal or a second input signal to obtain a carry output signal, where the second input signal is the carry input signal; and performing an XOR logic operation on the carry input signal and the sum to obtain the output result of the current bit of the counter. The method saves one look-up table for the logical operation of every bit and also effectively reduces logic delay.

Description

Optimized implementation method for a general-purpose bidirectional counter based on FPGA
Technical field
The present invention relates to the technical field of integrated circuits, and in particular to an optimized implementation method for a general-purpose bidirectional counter based on an FPGA.
Background
A field-programmable gate array (FPGA) is a logic device with abundant hardware resources, powerful parallel processing capability, and flexible reconfigurability. These features have led to increasingly wide use of FPGAs in data processing, communications, networking, and many other fields.
Addition is the most common logic structure, and the main reason an FPGA contains arithmetic logic structures is to optimize the speed and implementation of addition. Fig. 1 is a register-transfer-level (RTL) view of the target counter. Logic mapping of the RTL logic shown in Fig. 1 yields the logic mapping diagram of a common general-purpose bidirectional counter shown in Fig. 2. After logic mapping, the number of look-up tables (LUTs) used is the logic resource usage that needs attention. As can be seen, in the prior art every carry bit requires one 4-input LUT to implement the add/subtract operation, and every bit whose input is not a single addend requires additional LUTs to generate the addend signals. In Fig. 2, each bit uses three LUTs to implement the logical operations of the first addend, the second addend, and the sum of the current bit respectively, and the counter accumulation is then realized through the carry chain structure. In the logic structure of the general-purpose bidirectional counter shown in Fig. 1, the logic resource usage depends on the bit width of the input data, specifically: number of LUTs = 3 × input data bit width. The larger the bit width of the input data, the more the logic resource usage multiplies. How to make full use of the architectural features of the FPGA to optimize the design, minimize resource usage, and increase speed is therefore a problem to be solved.
Summary of the invention
The purpose of the present invention is to address the defects of the prior art by providing an optimized implementation method for a general-purpose bidirectional counter based on an FPGA. The method saves one look-up table for the logical operation of every bit and also effectively reduces logic delay, thereby optimizing chip efficiency and area.
In a first aspect, an embodiment of the present invention provides an optimized implementation method for a general-purpose bidirectional counter based on an FPGA, including:
outputting a first addend through the logic of a first look-up table;
feeding all four logic input signals of the counter into a second look-up table, and outputting a sum after logical operation;
continuously selecting, through a winding structure, the first addend as the first input signal of a carry gate, wherein the inputs of the winding structure are the first addend and one logic input signal of the second look-up table;
selecting, by the carry gate according to the sum, the first input signal or a second input signal to obtain a carry output signal, wherein the second input signal is the carry input signal;
performing an XOR logic operation on the carry input signal and the sum to obtain the output result of the current bit of the counter.
Preferably, outputting the first addend through the logic of the first look-up table is specifically:
configuring the first look-up table so that it outputs the first addend according to the input selection signal and load signal.
Further preferably, the method also includes: a constant is configured and stored in the first look-up table, and the selection signal selects the positive or negative value of the constant to participate in the logical operation of the first addend.
Preferably, feeding all four logic input signals of the counter into the second look-up table and outputting the sum after logical operation is specifically:
configuring the second look-up table so that it outputs the sum according to the input selection signal, load signal, carry signal, and data input signal.
Further preferably, the method also includes: a constant is configured and stored in the second look-up table, and the selection signal selects the positive or negative value of the constant to participate in the logical operation of the sum.
Preferably, the FPGA is specifically a CME M5 or CME M7 FPGA device.
The optimized implementation method provided by the embodiments of the present invention is based on the winding structure specific to CME M5/M7. The winding structure continuously selects the first addend as the first input signal of the carry gate and invalidates the signal connection from a logic input signal of the second look-up table to the carry gate. As a result, the logic inputs of the first addend and the second addend can be merged into one 4-input LUT, and a single 4-input look-up table suffices to output the sum of the current bit of the counter according to the input selection signal, load signal, carry signal, and data input signal. One look-up table is thereby saved for the logical operation of every bit, and logic delay is also effectively reduced, optimizing chip efficiency and area.
Brief description of the drawings
Fig. 1 is a register-transfer-level view of the target counter provided by the prior art;
Fig. 2 is a logic mapping diagram of a general-purpose bidirectional counter provided by the prior art;
Fig. 3 is a logic unit architecture diagram of a CME M5/M7 FPGA device provided by an embodiment of the present invention;
Fig. 4 shows an optimized implementation method for a general-purpose bidirectional counter based on an FPGA provided by an embodiment of the present invention;
Fig. 5 is a logic mapping diagram of an optimized general-purpose bidirectional counter provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
The method in the following embodiments is implemented on CME M5 or CME M7 FPGA devices. To make the technical solution easier to understand, the logic structure of CME M5/M7 FPGA devices is first described briefly.
Fig. 3 is a schematic diagram of one logic unit in a CME M5/M7 FPGA device; the device contains many such logic units. A logic unit consists of three 4-input LUTs (0lut4, 40lut4, and 41lut4), two registers, and carry, cascade, and arithmetic logic. A LUT is in essence a RAM. Current FPGAs mostly use 4-input LUTs, so each LUT can be regarded as a RAM with 4 address lines. After a user describes a logic circuit through a schematic or an HDL, the FPGA development software automatically computes all possible results of the logic circuit and writes them into the RAM in advance. Each time signals are input for a logical operation, this is equivalent to inputting an address, looking up the table, and outputting the content stored at that address. A LUT can implement the same function as a logic circuit. Unlike a logic circuit, however, the function implemented by a LUT is determined by its inputs rather than by circuit complexity, and producing the LUT result incurs a certain addressing delay. Reducing LUT usage can therefore effectively reduce the logic delay of the chip and optimize chip efficiency and area.
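The table-lookup behaviour of a 4-input LUT described above can be sketched in Python. This is only an illustrative model: the helper names `build_lut4` and `read_lut4`, the address bit order, and the example function are assumptions and do not come from the patent or from any CME tool.

```python
from itertools import product


def build_lut4(func):
    """Precompute the 16-entry truth table of a 4-input Boolean function.

    Entry k holds func applied to the four bits of address k
    (assumed order: in3 is the most significant address bit).
    """
    return [
        func((k >> 3) & 1, (k >> 2) & 1, (k >> 1) & 1, k & 1) & 1
        for k in range(16)
    ]


def read_lut4(table, in3, in2, in1, in0):
    """Evaluate the LUT by using the four inputs as a RAM address."""
    return table[(in3 << 3) | (in2 << 2) | (in1 << 1) | in0]


# Hypothetical example function, chosen only for illustration.
def example(a, b, c, d):
    return (a & b) ^ (c | d)


table = build_lut4(example)

# The stored table reproduces the logic function for every input combination.
for a, b, c, d in product((0, 1), repeat=4):
    assert read_lut4(table, a, b, c, d) == example(a, b, c, d)
```

Whatever the complexity of `example`, the model always stores 16 bits and performs one lookup, which mirrors the statement that the cost of a LUT depends on its inputs rather than on circuit complexity.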
Unlike other FPGA chips, the structure of CME M5/M7 FPGA devices includes a gate a0 that serves as a winding structure. It can continuously select the output signal of 0lut4 while invalidating the signal connection from one logic input signal of 40lut4 to gate c1. The optimized implementation method provided in the following embodiments is realized with this winding structure.
To ease comparison with prior-art Fig. 2, the drawings for the following embodiments show only the resources used to implement the general-purpose bidirectional counter in CME M5/M7 devices. The elements in those drawings correspond one-to-one to Fig. 3 as follows: the first look-up table LUT1 to 0lut4, the second look-up table LUT2 to 40lut4, the winding structure a0 to gate a0, the carry gate MUXCY to gate c1, and the XOR XORCY to the XOR gate.
Fig. 4 shows the optimized implementation method provided by an embodiment of the present invention, and Fig. 5 is the logic mapping diagram of an optimized general-purpose bidirectional counter provided by an embodiment of the present invention. A general-purpose bidirectional counter consists of multiple 1-bit adders with a carry chain structure, and the user can specify data input of any width. The method is described in further detail below with reference to Fig. 4 and Fig. 5.
As shown in Fig. 5, the logical operation of each bit of the addition is completed by one 4-input first look-up table LUT1, one 4-input second look-up table LUT2, one winding structure a0, one carry gate MUXCY, and one XOR XORCY. LUT1 and LUT2 can each handle any logical combination of at most 4 variables. In CME M5 or CME M7, LUT1 equals one operand of the addition; this is a mandatory constraint of the CME M5/M7 architecture.
As shown in Fig. 4, the method includes:
Step 401: output the first addend through the logic of the first look-up table.
Specifically, the first look-up table LUT1 is configured so that it outputs the first addend according to the input selection signal select and load signal load.
According to Fig. 1:
fLUT1 = ~load & (select & delta[i] | ~select & -delta[i])    (Formula 1)
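Formula 1 can be written out as a small Python sketch. It assumes that `-delta[i]` denotes bit i of the two's-complement negation of the constant delta at the counter width; the names `f_lut1` and `first_addend` are illustrative only.

```python
def f_lut1(load, select, delta_bit, neg_delta_bit):
    """Bit i of the first addend, following Formula 1.

    All arguments are single bits (0 or 1). neg_delta_bit is assumed to be
    bit i of the two's-complement value of -delta.
    """
    not_load = load ^ 1
    not_select = select ^ 1
    return not_load & ((select & delta_bit) | (not_select & neg_delta_bit))


def first_addend(load, select, delta, width):
    """Assemble the first addend of a width-bit counter from its LUT1 bits."""
    neg_delta = (-delta) & ((1 << width) - 1)  # two's complement of delta
    value = 0
    for i in range(width):
        bit = f_lut1(load, select, (delta >> i) & 1, (neg_delta >> i) & 1)
        value |= bit << i
    return value
```

With `delta = 3` and `width = 4`, `select = 1` gives the addend 3, `select = 0` gives 13 (that is, -3 modulo 16), and `load = 1` forces the addend to 0 so that the loaded data passes through unchanged.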
Step 402: feed all four logic input signals of the counter into the second look-up table, and output the sum after logical operation.
Specifically, as can be seen from Fig. 1, the general-purpose counter has only 4 logical variables in total: the selection signal select, the load signal load, the carry signal count_out, and the data input signal data_in. The second look-up table LUT2 can therefore be configured to merge the logic inputs of the first addend and the second addend of the prior art shown in Fig. 2. All 4 input signals are fed into LUT2, and LUT2 outputs the sum of the adder according to the input selection signal select, load signal load, carry signal count_out, and data input signal data_in.
According to Fig. 1:
fLUT2 = fLUT1 xor (load & data_in[i] | ~load & count_out[i])    (Formula 2)
It can thus be seen that, because the logic of both the first addend A and the second addend B of the original Fig. 2 is fed into LUT2, the intermediate variable B is not needed in the logic of the adder, and the sum of the adder is obtained directly.
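A minimal Python sketch of Formula 2 follows. It treats the constant bits `delta[i]` and `-delta[i]` as configuration fixed per bit position, so that the function depends on only the four signals named in the text and fits one 16-entry table. The function names and the address bit order are assumptions made for illustration.

```python
def f_lut2(load, select, data_in_bit, count_bit, delta_bit, neg_delta_bit):
    """Sum bit produced by LUT2, following Formula 2 (all arguments are bits)."""
    # First addend A, as in Formula 1.
    a = (load ^ 1) & ((select & delta_bit) | ((select ^ 1) & neg_delta_bit))
    # Second addend B: loaded data when load = 1, current count otherwise.
    b = (load & data_in_bit) | ((load ^ 1) & count_bit)
    return a ^ b


def lut2_table(delta_bit, neg_delta_bit):
    """16-entry LUT2 content for one bit position.

    Assumed address order, most significant first: load, select, data_in, count.
    """
    return [
        f_lut2((k >> 3) & 1, (k >> 2) & 1, (k >> 1) & 1, k & 1,
               delta_bit, neg_delta_bit)
        for k in range(16)
    ]
```

Because the second addend B is computed inside the same table, no separate LUT is needed to produce it, which is the saving the step describes.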
Step 403: continuously select, through the winding structure, the first addend as the first input signal of the carry gate, wherein the inputs of the winding structure are the first addend and one logic input signal of the second look-up table.
Specifically, in an FPGA architecture the routing paths are fixed. Owing to an architectural constraint, there must be one wire that connects an input pin of LUT2 to MUXCY. The solution of the present invention uses the winding structure a0 available in CME M5/M7 to invalidate this connection. Specifically, a0 is a gate. One input of a0 is connected to the load signal at the input of LUT2, the other input of a0 receives the first addend output by LUT1, and the output of a0 is connected to MUXCY as one of its input signals. Because a0 continuously selects the first addend as the first input signal of the carry gate MUXCY, the load signal on the other path, which comes from the input pin of LUT2, is invalidated and is never selected.
Step 404: the carry gate selects the first input signal or the second input signal according to the sum, and obtains the carry output signal, wherein the second input signal is the carry input signal.
Specifically, the carry gates MUXCY are cascaded, so that multiple cascaded logic units form a vertical carry chain that carries from top to bottom, one logic unit at a time. The carry output signal of the previous logic unit serves as the second input signal, i.e. the carry input signal, of the carry gate MUXCY of the next logic unit. The sum output in step 402 serves as the select signal of the carry gate MUXCY, so that the carry gate selects the first input signal or the second input signal according to the sum and obtains the carry output signal of the current bit.
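The cascaded carry gates can be modelled with a short Python sketch. The text does not state the selection polarity of MUXCY, so the sketch assumes the usual carry-chain convention: when the sum bit is 1 the carry input is passed on, and when it is 0 the first addend becomes the carry output. The function name `carry_chain` is illustrative.

```python
def carry_chain(first_addend_bits, sum_bits, carry_in=0):
    """Propagate the carry through cascaded MUXCY gates, LSB first.

    Assumed polarity: sum bit 1 selects the carry input (second input),
    sum bit 0 selects the first addend (first input).
    Returns the carry output of every bit position.
    """
    carries = []
    carry = carry_in
    for a, s in zip(first_addend_bits, sum_bits):
        carry = carry if s else a  # one MUXCY stage
        carries.append(carry)
    return carries


# Example: 3 + 5 on 4 bits, bits listed LSB first.
a_bits = [1, 1, 0, 0]                              # first addend 3
b_bits = [1, 0, 1, 0]                              # second addend 5
s_bits = [a ^ b for a, b in zip(a_bits, b_bits)]   # LUT2 outputs
carry_outs = carry_chain(a_bits, s_bits)
```

Under this assumption the stage reproduces the carry of a full adder: if the two addend bits are equal the carry equals either of them, and otherwise the incoming carry is forwarded.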
Step 405: perform an XOR logic operation on the carry input signal and the sum to obtain the output result of the current bit of the counter.
Specifically, the sum output in step 402 and the carry input signal are used as the inputs of the XOR, and the output result of the current bit is computed.
Preferably, the initial carry input signal of the lowest-order logic unit is C1 = 0.
Further, a constant is configured and stored in the first look-up table, and the selection signal selects the positive value of the constant (delta, with reference to Fig. 1) or its negative value (-delta, with reference to Fig. 1) to participate in the logical operation of the first addend. Likewise, a constant is configured and stored in the second look-up table, and the selection signal selects the positive or negative value of the constant to participate in the logical operation of the sum. Selecting the positive or negative value of the constant according to the selection signal realizes the function of a bidirectional counter.
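Steps 401 to 405 can be combined into a bit-level Python model of one counter update. This is a sketch under stated assumptions, not the patented circuit itself: `-delta` is taken as the two's complement of delta at the counter width, MUXCY is assumed to pass the carry input when the sum bit is 1 and the first addend otherwise, and the name `count_step` is hypothetical.

```python
def count_step(count, data_in, load, select, delta, width):
    """Compute the next counter value bit by bit (LUT1, LUT2, MUXCY, XORCY)."""
    mask = (1 << width) - 1
    neg_delta = (-delta) & mask      # assumed two's-complement constant
    carry = 0                        # initial carry input of the lowest bit
    result = 0
    for i in range(width):
        d = (delta >> i) & 1
        nd = (neg_delta >> i) & 1
        c = (count >> i) & 1
        x = (data_in >> i) & 1
        # Step 401: LUT1 outputs the first addend (Formula 1).
        a = (load ^ 1) & ((select & d) | ((select ^ 1) & nd))
        # Step 402: LUT2 outputs the sum bit (Formula 2).
        s = a ^ ((load & x) | ((load ^ 1) & c))
        # Step 405: XORCY combines the sum bit with the carry input.
        result |= (s ^ carry) << i
        # Steps 403/404: MUXCY picks the first addend or the carry input.
        carry = carry if s else a
    return result
```

For example, with `width = 4` and `delta = 3`, `count_step(5, 0, 0, 1, 3, 4)` counts up from 5 to 8, `count_step(1, 0, 0, 0, 3, 4)` counts down from 1 and wraps to 14, and `count_step(5, 9, 1, 1, 3, 4)` loads 9.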
According to the logic mapping diagram of the general-purpose bidirectional counter shown in Fig. 5, the logic resource usage of the general-purpose bidirectional counter after optimization by the method is:
Number of LUTs = 2 × input data bit width
It follows that the present invention uses the winding structure specific to CME M5 or CME M7 FPGA devices to invalidate the signal connection between the input of the second look-up table and the carry gate. A single 4-input look-up table can then output the sum of the adder according to the input selection signal, load signal, carry signal, and data input signal. One look-up table is thereby saved for the logical operation of every bit, and logic delay is also effectively reduced, optimizing chip efficiency and area.
Further, the present invention also provides a preferred embodiment that, on the basis of the previous embodiment, further reduces the number of first look-up tables.
Because delta[i] is a constant, Formula 1 can be analyzed for the different value assignments of delta[i] and -delta[i]:
fLUT1 = 0, when delta[i] = 0 and -delta[i] = 0;
fLUT1 = ~load & ~select, when delta[i] = 0 and -delta[i] = 1;
fLUT1 = ~load & select, when delta[i] = 1 and -delta[i] = 0;
fLUT1 = ~load, when delta[i] = 1 and -delta[i] = 1.    (Formula 3)
It follows that the number of LUT1 instances can be independent of the input data bit width: however many bits wide the input data is, 4 LUT1 instances suffice. The logic they implement outputs 0, ~load & ~select, ~load & select, and ~load respectively.
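The case analysis of Formula 1 can be checked by enumeration. The Python sketch below fixes each of the four possible pairs (delta[i], -delta[i]) and records the resulting truth table over (load, select); the helper names are illustrative.

```python
from itertools import product


def f_lut1(load, select, delta_bit, neg_delta_bit):
    """Formula 1 for one bit position (all arguments are bits)."""
    return (load ^ 1) & ((select & delta_bit) | ((select ^ 1) & neg_delta_bit))


def lut1_signature(delta_bit, neg_delta_bit):
    """Truth table over (load, select) in the order 00, 01, 10, 11."""
    return tuple(
        f_lut1(load, select, delta_bit, neg_delta_bit)
        for load, select in product((0, 1), repeat=2)
    )


# One truth table per assignment of the two constant bits.
signatures = {
    (d, nd): lut1_signature(d, nd) for d, nd in product((0, 1), repeat=2)
}

# Reference truth tables of the four functions named in the text.
reference = {
    "0": (0, 0, 0, 0),
    "~load & ~select": (1, 0, 0, 0),
    "~load & select": (0, 1, 0, 0),
    "~load": (1, 1, 0, 0),
}
```

Only four distinct functions occur, so every bit position can reuse one of four shared LUT1 instances regardless of the counter width.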
Therefore, the logic resource usage can be further optimized to:
Number of LUTs = input data bit width + 4
Further, because 0 is a constant and ~load depends only on the load signal, which is already an input of LUT2, these two cases can be merged with the input signals of the subsequent LUT2. The logic resource usage can therefore be further optimized to:
Number of LUTs = input data bit width + 2
It follows that, compared with the common logic mapping method of the prior art, the optimized implementation method proposed by the present invention can save at most about 2/3 of the LUTs while reducing logic delay by 1/2.
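The four resource formulas given in the text can be compared for a concrete bit width with a few lines of Python. The dictionary keys and function names are illustrative labels, not terms from the patent.

```python
def lut_usage(width):
    """LUT counts for a counter of the given input data bit width."""
    return {
        "prior_art": 3 * width,         # three LUTs per bit (Fig. 2)
        "optimized": 2 * width,         # LUT1 + LUT2 per bit
        "shared_lut1": width + 4,       # four shared LUT1 instances
        "merged_constants": width + 2,  # 0 and ~load folded into LUT2
    }


def best_saving(width):
    """Fraction of LUTs saved by the most optimized mapping vs. prior art."""
    usage = lut_usage(width)
    return 1 - usage["merged_constants"] / usage["prior_art"]


usage_32 = lut_usage(32)
saving_32 = best_saving(32)
```

For a 32-bit counter this gives 96, 64, 36, and 34 LUTs respectively, a saving of roughly 65%. The saving approaches 2/3 only as the bit width grows, which matches the "at most about 2/3" wording.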
It should be noted that, although the specific embodiments of the present invention are implemented on CME M5/M7 FPGA devices, the optimized implementation method provided by the present invention can equally be applied to FPGA devices of other architectures.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above explain the purpose, technical solution, and beneficial effects of the present invention in further detail. It should be understood that the foregoing is merely a description of specific embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (6)

1. An optimized implementation method for a general-purpose bidirectional counter based on an FPGA, characterized in that the method includes:
outputting a first addend through the logic of a first look-up table;
feeding all four logic input signals of the counter into a second look-up table, and outputting a sum after logical operation;
continuously selecting, through a gate, the first addend as the first input signal of a carry gate, wherein the inputs of the gate are the first addend and one logic input signal of the second look-up table;
selecting, by the carry gate according to the sum, the first input signal or a second input signal to obtain a carry output signal, wherein the second input signal is a carry input signal;
performing an XOR logic operation on the carry input signal and the sum to obtain the output result of the current bit of the counter.
2. The method according to claim 1, characterized in that outputting the first addend through the logic of the first look-up table is specifically:
configuring the first look-up table so that it outputs the first addend according to the input selection signal and load signal.
3. The method according to claim 2, characterized in that the method further includes: a constant is configured and stored in the first look-up table, and the selection signal selects the positive or negative value of the constant to participate in the logical operation of the first addend.
4. The method according to claim 1, characterized in that feeding all four logic input signals of the counter into the second look-up table and outputting the sum after logical operation is specifically:
configuring the second look-up table so that it outputs the sum according to the input selection signal, load signal, carry signal, and data input signal.
5. The method according to claim 4, characterized in that the method further includes: a constant is configured and stored in the second look-up table, and the selection signal selects the positive or negative value of the constant to participate in the logical operation of the sum.
6. The method according to claim 1, characterized in that the FPGA is specifically a CME M5 or CME M7 FPGA device.
CN201410011494.3A 2014-01-10 2014-01-10 The optimization implementation method of general bidirectional counter based on FPGA Active CN104779951B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410011494.3A CN104779951B (en) 2014-01-10 2014-01-10 The optimization implementation method of general bidirectional counter based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410011494.3A CN104779951B (en) 2014-01-10 2014-01-10 The optimization implementation method of general bidirectional counter based on FPGA

Publications (2)

Publication Number Publication Date
CN104779951A CN104779951A (en) 2015-07-15
CN104779951B true CN104779951B (en) 2018-07-13

Family

ID=53621229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410011494.3A Active CN104779951B (en) 2014-01-10 2014-01-10 The optimization implementation method of general bidirectional counter based on FPGA

Country Status (1)

Country Link
CN (1) CN104779951B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103259523A (en) * 2012-02-17 2013-08-21 京微雅格(北京)科技有限公司 Optimization method of addition chain and integrated circuit adopting addition chain
CN203204600U (en) * 2013-04-11 2013-09-18 上海安路信息科技有限公司 Enhanced five-input lookup table (LUT5) structure-based binary adder-subtractor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6518805B2 (en) * 2000-10-04 2003-02-11 Broadcom Corporation Programmable divider with built-in programmable delay chain for high-speed/low power application

Also Published As

Publication number Publication date
CN104779951A (en) 2015-07-15

Similar Documents

Publication Publication Date Title
CN105468335B (en) Pipeline-level operation device, data processing method and network-on-chip chip
Xie et al. FPGA realization of FIR filters for high-speed and medium-speed by using modified distributed arithmetic architectures
WO2021057085A1 (en) Hybrid precision storage-based depth neural network accelerator
CN109871949A (en) Convolutional neural networks accelerator and accelerated method
Julio et al. Energy-efficient Gaussian filter for image processing using approximate adder circuits
Diouri et al. Comparison study of hardware architectures performance between FPGA and DSP processors for implementing digital signal processing algorithms: Application of FIR digital filter
CN109271137A (en) A kind of modular multiplication device and coprocessor based on public key encryption algorithm
Khan et al. Comparative analysis of different algorithm for design of high-speed multiplier accumulator unit (MAC)
CN107092462B (en) 64-bit asynchronous multiplier based on FPGA
Parandeh-Afshar et al. Improving FPGA performance for carry-save arithmetic
CN104779951B (en) The optimization implementation method of general bidirectional counter based on FPGA
CN105874713B (en) A kind of expansible configurable logic element and FPGA device
Khurshid et al. High Efficiency Generalized Parallel Counters for Look‐Up Table Based FPGAs
CN108255463A (en) A kind of digital logical operation method, circuit and fpga chip
Raghul et al. Design and Implementation of Approximate Truncated adder using kogge stone adder for low power applications
CN106649905A (en) Technology mapping method by utilizing carry chain
CN105874712B (en) The bit full adder and FPGA device that can skip
Reddy et al. A Hybrid Approach to Optimized n-bit Multiplier Design on FPGA Leveraging Recursive-Wallace Tree with AI Centric Enhancements for FIR Filters and Neural Network Inference
CN105447217B (en) Four based on FPGA select the process mapping method of a selector
Khurshid et al. Technology optimised fixed-point bit-parallel multiplier for LUT-based FPGAs
Narkhede et al. Design and implementation of an efficient instruction set for ternary processor
Anumandla et al. SoC based floating point implementation of differential evolution algorithm using FPGA
CN104679216B (en) A kind of data path means and its control method
Mora et al. Partial product reduction based on look-up tables
CN113031913B (en) Multiplier, data processing method, device and chip

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240819

Address after: 601, Floor 6, Building 5, Yard 8, Kegu 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing, 100176 (Yizhuang Cluster, High-end Industrial Zone, Beijing Pilot Free Trade Zone)

Patentee after: Jingwei Qili (Beijing) Technology Co.,Ltd.

Country or region after: China

Address before: 20th Floor, Building B, Tiangong Building, No. 30 Yuan Road, Haidian District, Beijing 100083

Patentee before: CAPITAL MICROELECTRONICS Co.,Ltd.

Country or region before: China