Summary of the invention
In view of the drawbacks of the prior art, the purpose of the present invention is to provide an FPGA-based optimized implementation method for a general bidirectional counter. The method saves one look-up table resource for every logic operation and also effectively reduces logic delay, thereby optimizing chip efficiency and area.
In a first aspect, an embodiment of the present invention provides an FPGA-based optimized implementation method for a general bidirectional counter, including:
outputting a first addend through the logic of a first look-up table;
inputting all four logic input signals of the counter into a second look-up table, and outputting a sum after the logic operation;
continuously gating, through a winding structure, the first addend as the first input signal of a carry gate, wherein the inputs of the winding structure are the first addend and one logic input signal of the second look-up table;
gating, by the carry gate according to the sum, either the first input signal or a second input signal to obtain a carry output signal, wherein the second input signal is the carry input signal;
performing an XOR logic operation on the carry input signal and the sum to obtain the output result of the current bit of the counter.
Preferably, outputting the first addend through the logic of the first look-up table specifically includes:
configuring the first look-up table so that it outputs the first addend as a logic function of the input selection signal and load signal.
Further preferably, the method further includes configuring the first look-up table to store a constant, and selecting, by means of the selection signal, the positive value or the negative value of the constant to participate in the logic operation of the first addend.
Preferably, inputting all four logic input signals of the counter into the second look-up table and outputting the sum after the logic operation specifically includes:
configuring the second look-up table so that it outputs the sum as a logic function of the input selection signal, load signal, carry signal and data input signal.
Further preferably, the method further includes configuring the second look-up table to store a constant, and selecting, by means of the selection signal, the positive value or the negative value of the constant to participate in the logic operation of the sum.
Preferably, the FPGA is specifically a CME M5 or CME M7 FPGA device.
The FPGA-based optimized implementation method for a general bidirectional counter provided by the embodiments of the present invention uses the winding structure specific to CME M5/M7 to continuously gate the first addend as the first input signal of the carry gate, and to invalidate the signal connection from a logic input signal of the second look-up table to the carry gate. A single 4-input LUT can therefore merge the logic inputs of the first addend and the second addend: using only one 4-input look-up table, the sum of the current bit of the counter is output as a logic function of the input selection signal, load signal, carry signal and data input signal. Each logic operation thus saves one look-up table resource, logic delay is effectively reduced, and chip efficiency and area are optimized.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
The methods in the following embodiments of the present invention are implemented on CME M5 or CME M7 FPGA devices. For a better understanding of the technical solutions provided by the embodiments, the logical structure of CME M5/M7 FPGA devices is first briefly described.
Fig. 3 is a schematic diagram of the architecture of one logic unit in a CME M5/M7 FPGA device; such a device contains multiple logic units of this kind. One logic unit consists of three 4-input LUTs (0lut4, 40lut4 and 41lut4), 2 registers, and carry, cascade and arithmetic logic. A LUT is essentially a RAM. Most FPGAs currently use 4-input LUTs, so each LUT can be regarded as a RAM with 4 address lines. After a user describes a logic circuit by schematic or in an HDL, the FPGA development software automatically computes all possible results of the logic circuit and writes them into the RAM in advance. Performing a logic operation on each set of input signals is then equivalent to supplying an address, looking up the table, and outputting the content stored at that address. A LUT can thus realize the same function as a logic circuit. Unlike a logic circuit, however, the function realized by a LUT is determined by its inputs rather than by the complexity of the circuit, and producing a LUT result incurs a certain addressing delay. Reducing the number of LUTs used can therefore effectively reduce the logic delay of the chip and optimize chip efficiency and area.
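The table-lookup view of a 4-input LUT described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names build_lut4 and read_lut4 and the example function are hypothetical and do not come from the CME tool chain.

```python
from itertools import product

def build_lut4(func):
    """Precompute the 16-entry truth table of a 4-input Boolean function,
    mimicking how development software fills the LUT RAM in advance."""
    table = [0] * 16
    for a3, a2, a1, a0 in product((0, 1), repeat=4):
        address = (a3 << 3) | (a2 << 2) | (a1 << 1) | a0
        table[address] = func(a3, a2, a1, a0) & 1
    return table

def read_lut4(table, a3, a2, a1, a0):
    """Evaluate the function by addressing the table with the 4 inputs."""
    return table[(a3 << 3) | (a2 << 2) | (a1 << 1) | a0]

# Hypothetical example function, chosen only for illustration.
def example(a3, a2, a1, a0):
    return (a3 & a2) ^ (a1 | a0)

table = build_lut4(example)
```

Whatever the complexity of the function passed to build_lut4, the lookup itself is always a single indexed read of a 16-entry table.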
Unlike other FPGA chips, the structure of CME M5/M7 FPGA devices includes a gate a0 that serves as a winding structure. It can continuously gate the output signal of 0lut4 while invalidating the signal connection from one logic input signal of 40lut4 to gate c1. The FPGA-based optimized implementation method for a general bidirectional counter provided by each of the following embodiments is realized with this winding structure.
For ease of comparison with the prior art in Fig. 2, the drawings of the following embodiments show only the resources used to implement the general bidirectional counter in a CME M5/M7 device. The elements in those drawings correspond one-to-one to elements of Fig. 3: the first look-up table LUT1 to 0lut4, the second look-up table LUT2 to 40lut4, the winding structure a0 to gate a0, the carry gate MUXCY to gate c1, and the exclusive-OR XORCY to the XOR gate.
Fig. 4 shows the FPGA-based optimized implementation method for a general bidirectional counter provided by an embodiment of the present invention, and Fig. 5 is the logic mapping diagram of an optimized general bidirectional counter provided by an embodiment of the present invention. A general bidirectional counter is composed of multiple 1-bit adders connected in a carry chain structure, and the user can specify data inputs of any width. The optimized implementation method of the present invention is described in further detail below with reference to Fig. 4 and Fig. 5.
As shown in Fig. 5, the logic operation of each addition is completed by one 4-input first look-up table LUT1, one 4-input second look-up table LUT2, one winding structure a0, one carry gate MUXCY and one exclusive-OR XORCY. LUT1 and LUT2 can each handle any logical combination of at most 4 variables. In CME M5 or CME M7, LUT1 is equivalent to one operand of the addition; this is a mandatory constraint of the CME M5 or CME M7 architecture.
As further shown in Fig. 4, the method includes:
Step 401: output the first addend through the logic of the first look-up table.
Specifically, the first look-up table LUT1 is configured so that it outputs the first addend as a logic function of the input selection signal select and load signal load.
According to Fig. 1:
fLUT1 = ~load & (select & delta[i] | ~select & -delta[i])    (Formula 1)
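Formula 1 can be evaluated bit by bit as in the following minimal sketch. The function name f_lut1 and the parameter names delta_i and neg_delta_i (standing for delta[i] and -delta[i]) are assumptions introduced for illustration; all values are single bits, 0 or 1.

```python
def f_lut1(load, select, delta_i, neg_delta_i):
    """First addend per Formula 1.

    ~load is written as (1 - load) because the signals are single bits.
    """
    return (1 - load) & ((select & delta_i) | ((1 - select) & neg_delta_i))

# With load asserted, the first addend is forced to 0.
# Otherwise select picks the bit of delta (1) or of -delta (0).
for load in (0, 1):
    for select in (0, 1):
        print(load, select, f_lut1(load, select, delta_i=1, neg_delta_i=0))
```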
Step 402: input all four logic input signals of the counter into the second look-up table, and output the sum after the logic operation.
Specifically, as can be seen from Fig. 1, the general counter has only 4 logic variables in total: the selection signal select, the load signal load, the carry signal count_out and the data input signal data_in. The second look-up table LUT2 can therefore be configured to merge the logic inputs of the first addend and the second addend of the prior art shown in Fig. 2: all 4 input signals are fed into LUT2, which outputs the sum of the adder as a logic function of select, load, count_out and data_in.
According to Fig. 1:
fLUT2 = fLUT1 xor (load & data_in[i] | ~load & count_out[i])    (Formula 2)
It can thus be seen that, because the logic of both the first addend A and the second addend B of the original Fig. 2 is input into LUT2, the intermediate variable B is no longer needed in the logic processing of the adder, and the sum of the adder is obtained directly.
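The merge of both addends into one 4-input table can be sketched as follows. The function names are hypothetical, and the bits of the constant (delta_i, neg_delta_i) are treated as fixed per bit position, so that only select, load, count_out[i] and data_in[i] remain as LUT inputs; this is an illustration of Formula 2, not the vendor mapping flow.

```python
from itertools import product

def first_addend(load, select, delta_i, neg_delta_i):
    """Addend A, per Formula 1."""
    return (1 - load) & ((select & delta_i) | ((1 - select) & neg_delta_i))

def second_addend(load, data_in_i, count_out_i):
    """Addend B: the loaded data bit, or the current count bit."""
    return (load & data_in_i) | ((1 - load) & count_out_i)

def f_lut2(select, load, count_out_i, data_in_i, delta_i, neg_delta_i):
    """Sum bit per Formula 2: A xor B, computed in one step."""
    a = first_addend(load, select, delta_i, neg_delta_i)
    b = second_addend(load, data_in_i, count_out_i)
    return a ^ b

def lut2_table(delta_i, neg_delta_i):
    """16-entry LUT2 content for one bit position of the constant."""
    return [f_lut2(s, l, c, d, delta_i, neg_delta_i)
            for s, l, c, d in product((0, 1), repeat=4)]

print(lut2_table(delta_i=1, neg_delta_i=0))
```

Because the table has exactly 16 entries, the sum fits in a single 4-input LUT and B never needs a look-up table of its own.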
Step 403: continuously gate, through the winding structure, the first addend as the first input signal of the carry gate, wherein the inputs of the winding structure are the first addend and one logic input signal of the second look-up table.
Specifically, routing paths in the FPGA architecture are fixed, and under this architectural constraint there must be a trace connecting one input pin of LUT2 to MUXCY. The solution of the present invention uses the winding structure a0 available in CME M5/M7 to invalidate this connection. a0 is a gate: one of its inputs is connected to the load signal at the input of LUT2, the other input receives the first addend output by LUT1, and its output is connected to MUXCY as one input signal of MUXCY. a0 continuously gates the first addend as the first input signal of the carry gate MUXCY, so that the load signal arriving on the other path from the input pin of LUT2 is invalidated and is never gated.
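The behaviour of a0 can be modelled as a two-input selector whose choice is fixed at configuration time. The sketch below is a behavioural assumption only: the function name gate_a0 and the flag pick_lut1 are hypothetical and do not describe the actual CME configuration interface.

```python
def gate_a0(lut1_out, lut2_load_pin, pick_lut1=True):
    """Model of gate a0.

    pick_lut1 stands for a hypothetical static configuration choice; when it
    is True, the LUT1 output is always passed on and the trace from the LUT2
    input pin has no effect on the carry gate.
    """
    return lut1_out if pick_lut1 else lut2_load_pin

# With the static choice in place, the value seen by MUXCY follows the
# first addend regardless of the load pin.
for lut1_out in (0, 1):
    for load_pin in (0, 1):
        print(lut1_out, load_pin, gate_a0(lut1_out, load_pin))
```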
Step 404: the carry gate gates the first input signal or the second input signal according to the sum to obtain the carry output signal, wherein the second input signal is the carry input signal.
Specifically, the carry gates MUXCY are cascaded, so that multiple cascaded logic units form a vertical carry chain in which the carry propagates from top to bottom, one logic unit at a time. The carry output signal of the previous logic unit serves as the second input signal, i.e. the carry input signal, of the carry gate MUXCY of the next logic unit. The sum output in step 402 is used as the gating signal of the carry gate MUXCY, so the carry gate selects the first input signal or the second input signal according to the sum and obtains the carry output signal of the current bit.
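The selection performed by the carry gate can be checked against ordinary full-adder arithmetic. The text does not state the selection polarity, so the sketch assumes the usual carry-chain convention: when the sum bit is 1 the carry input is passed on, and when it is 0 the first addend is passed on. The function name muxcy is used here only as a label for this model.

```python
def muxcy(sum_bit, first_addend, carry_in):
    """Carry gate under the assumed polarity:
    sum_bit == 1 propagates carry_in, sum_bit == 0 selects the first addend."""
    return carry_in if sum_bit else first_addend

def check_against_full_adder():
    """Compare the gate with the carry of a + b + carry_in for all inputs."""
    for a in (0, 1):
        for b in (0, 1):
            for carry_in in (0, 1):
                expected = (a + b + carry_in) >> 1
                if muxcy(a ^ b, a, carry_in) != expected:
                    return False
    return True

print(check_against_full_adder())
```

The check works because a and b are equal whenever their XOR is 0, so either addend then determines the carry; this is why only the first addend needs to reach the carry gate.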
Step 405: perform an XOR logic operation on the carry input signal and the sum to obtain the output result of the current bit of the counter.
Specifically, the sum output in step 402 and the carry input signal are used as the logic inputs of the exclusive-OR, and the output result of the current bit is computed.
Preferably, the initial carry input signal of the lowest-order logic unit is C1 = 0.
Further, the first look-up table is configured to store a constant, and the selection signal selects the positive value of the constant (delta, as shown in Fig. 1) or its negative value (-delta, as shown in Fig. 1) to participate in the logic operation of the first addend. Likewise, the second look-up table is configured to store the constant, and the selection signal selects its positive or negative value to participate in the logic operation of the sum. Selecting the positive or negative value of the constant according to the selection signal realizes the function of a bidirectional counter.
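Steps 401 to 405 can be put together in a small bit-level simulation. The sketch rests on several assumptions that the text does not spell out: -delta is taken as the two's-complement constant of delta at the given width, the carry gate passes the carry input when the sum bit is 1 and the first addend otherwise, and the function name counter_step and its argument order are hypothetical.

```python
def counter_step(count, data_in, load, select, delta, width):
    """Compute the next counter value bit by bit along the carry chain."""
    mask = (1 << width) - 1
    neg_delta = (-delta) & mask          # assumed two's-complement constant
    carry = 0                            # initial carry input C1 = 0
    result = 0
    for i in range(width):
        delta_i = (delta >> i) & 1
        neg_delta_i = (neg_delta >> i) & 1
        count_i = (count >> i) & 1
        data_i = (data_in >> i) & 1
        # Step 401: first addend (Formula 1)
        a = (1 - load) & ((select & delta_i) | ((1 - select) & neg_delta_i))
        # Step 402: sum bit (Formula 2)
        s = a ^ ((load & data_i) | ((1 - load) & count_i))
        # Step 405: XORCY gives the output bit from sum and carry input
        result |= (s ^ carry) << i
        # Steps 403 and 404: MUXCY produces the carry for the next bit
        carry = carry if s else a
    return result

print(counter_step(count=5, data_in=0, load=0, select=1, delta=3, width=8))
print(counter_step(count=5, data_in=0, load=0, select=0, delta=3, width=8))
```

Under these assumptions, select = 1 counts up by delta, select = 0 counts down by delta (modulo 2 to the power of width), and load = 1 replaces the count with data_in.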
According to the logic mapping diagram of the general bidirectional counter shown in Fig. 4, the logic resource usage of the general bidirectional counter after optimization by this method is:
LUT quantity = 2 × input data bit width
It follows that the present invention uses the winding structure specific to CME M5 or CME M7 FPGA devices to invalidate the signal connection between the input of the second look-up table and the carry gate. A single 4-input look-up table can then output the sum of the adder as a logic function of the input selection signal, load signal, carry signal and data input signal. Each logic operation therefore saves one look-up table resource, logic delay is effectively reduced, and chip efficiency and area are optimized.
Further, the present invention also provides a preferred embodiment that, on the basis of the previous embodiment, further optimizes the number of first look-up tables LUT1.
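Because delta[i] is a constant, Formula 1 can be analyzed for the different values assigned to delta[i] and -delta[i]:
fLUT1 = 0, when delta[i] = 0 and -delta[i] = 0
fLUT1 = ~load & ~select, when delta[i] = 0 and -delta[i] = 1
fLUT1 = ~load & select, when delta[i] = 1 and -delta[i] = 0
fLUT1 = ~load, when delta[i] = 1 and -delta[i] = 1
(Formula 3)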
It follows that the number of LUT1s can be independent of the input data bit width: however many bits wide the input data is, 4 LUT1s suffice. The logic they implement outputs 0, ~load & ~select, ~load & select and ~load, respectively.
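The claim that at most 4 distinct LUT1 functions are needed can be checked by enumerating the truth tables over (load, select) for every bit of a constant. In this sketch, -delta is assumed to be the two's-complement constant at the given width, and the function names are hypothetical.

```python
from itertools import product

def lut1_truth_table(delta_i, neg_delta_i):
    """Truth table of Formula 1 over (load, select) for fixed constant bits."""
    return tuple(
        (1 - load) & ((select & delta_i) | ((1 - select) & neg_delta_i))
        for load, select in product((0, 1), repeat=2)
    )

def distinct_lut1_tables(delta, width):
    """Set of distinct LUT1 truth tables needed across all bit positions."""
    neg_delta = (-delta) & ((1 << width) - 1)   # assumed two's complement
    return {
        lut1_truth_table((delta >> i) & 1, (neg_delta >> i) & 1)
        for i in range(width)
    }

# Entries are ordered (load, select) = (0,0), (0,1), (1,0), (1,1).
print(lut1_truth_table(0, 0))  # constant 0
print(lut1_truth_table(0, 1))  # ~load & ~select
print(lut1_truth_table(1, 0))  # ~load & select
print(lut1_truth_table(1, 1))  # ~load
print(len(distinct_lut1_tables(delta=3, width=32)))
```

Since each bit position contributes one of only four possible pairs of constant bits, the set can never hold more than four tables, whatever the width.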
Therefore, the logic resource usage can be further optimized to:
LUT quantity = input data bit width + 4
Further, because 0 and ~load are constant, they can be merged with the input signals of the subsequent LUT2, so the logic resource usage can be further optimized to:
LUT quantity = input data bit width + 2
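The three LUT-count formulas given in the text can be compared with a short piece of worked arithmetic. The function name lut_usage, the dictionary keys and the width of 32 bits are illustrative assumptions; only the formulas themselves come from the text.

```python
def lut_usage(width):
    """LUT counts for the three optimization levels stated in the text."""
    return {
        "lut1_and_lut2_per_bit": 2 * width,   # LUT quantity = 2 x bit width
        "shared_lut1": width + 4,             # LUT quantity = bit width + 4
        "constants_merged": width + 2,        # LUT quantity = bit width + 2
    }

for name, count in lut_usage(32).items():
    print(name, count)
```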
It follows that, compared with the common logic mapping method of the prior art, the FPGA-based optimized implementation method for a general bidirectional counter proposed by the present invention can save up to about 2/3 of the LUTs while reducing logic delay by 1/2.
It should be noted that, although the specific embodiments of the present invention are implemented on CME M5/M7 FPGA devices, the optimized implementation method provided by the present invention is equally applicable to FPGA devices of other architectures.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The specific embodiments described above further explain in detail the purpose, technical solutions and beneficial effects of the present invention. It should be understood that the foregoing is merely a specific embodiment of the present invention and is not intended to limit its scope of protection; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.