CN107092462A - 64-bit asynchronous multiplier based on FPGA - Google Patents

64-bit asynchronous multiplier based on FPGA

Info

Publication number
CN107092462A
Authority
CN
China
Prior art keywords
counter
multipliers
asynchronous
selector
control unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710214226.5A
Other languages
Chinese (zh)
Other versions
CN107092462B (en)
Inventor
何安平
吴尽昭
刘晓庆
冯广博
郭慧波
熊菊霞
王娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201710214226.5A
Publication of CN107092462A
Application granted
Publication of CN107092462B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Logic Circuits (AREA)

Abstract

The invention discloses a 64-bit asynchronous multiplier based on an FPGA. The 64-bit asynchronous multiplier comprises an 8*64 multiplier, selectors MUX0, MUX1 and MUX2, a compressor, counters Count0, Count1 and Count2, several registers, a carry-lookahead adder (CLA) and a control unit. The control unit is a pipeline built from Click asynchronous controllers; it resolves the handshake signals through the handshake communication of the asynchronous controllers and generates four groups of trigger signals in sequence. The selectors MUX0, MUX1 and MUX2, the compressor, the counters Count0, Count1 and Count2, the registers and the carry-lookahead adder CLA perform the corresponding data transfer, compression, accumulation and output operations according to the four groups of trigger signals. The invention computes faster and consumes less energy.

Description

A 64-bit asynchronous multiplier based on FPGA
Technical Field
The present invention relates to a 64-bit asynchronous multiplier based on a field-programmable gate array (FPGA).
Background
Since transistor technology emerged in the 1970s, synchronous design has become almost synonymous with digital system design. However, current processes are approaching the manufacturing limit: the transition from 12 nm to 7 nm has slowed down and is "very likely the first departure from Moore's Law" (John Gustafson, chief designer at AMD). Clock skew, power distribution and similar problems brought about by the great advances in manufacturing processes are severe challenges for synchronous design. Synchronous design cannot by itself solve these problems; it can only rely heavily on the GALS (globally asynchronous, locally synchronous) design method, that is, multi-core technology that uses a small amount of asynchronous circuitry, to mitigate them.
Modern asynchronous design is based on the micropipeline design method. Its core is the asynchronous controller circuit, which implements the handshake communication protocol and coordinates circuit functions. Compared with a clocked scheme, an asynchronous circuit uses a local communication model and completes asynchronous control with a handshake protocol; it needs no huge clock distribution network, which solves the clock skew problem. An asynchronous circuit consumes almost no power when idle, so the power consumption of the whole system is effectively controlled. This asynchronous design method has clear advantages in low power consumption, low electromagnetic radiation, low heat dissipation, modularity and other respects.
A digital multiplier is a binary arithmetic logic unit. Because digital circuit systems are built on Boolean logic, a mechanism is needed to convert arithmetic into logic, and this mechanism is the essence of a digital multiplier algorithm. Digital multiplier algorithms are relatively mature. The most intuitive is the array algorithm: starting from the low bit of the multiplier, each partial product with the multiplicand is computed in turn, and the partial products are then added to obtain the product. For an n-bit multiplier this requires n(n+1) full adders and n² AND gates; a multiplier implementing this algorithm is slow and has large area and high power consumption.
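The array algorithm described above can be illustrated with a minimal behavioral sketch. It is a software model only, not a gate-level description; the function name array_multiply and the parameter n are chosen here for illustration and do not appear in the patent.

```python
def array_multiply(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit integers by summing shifted partial products."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    product = 0
    for i in range(n):            # start from the low bit of the multiplier
        if (b >> i) & 1:          # bit i of the multiplier gates the multiplicand (one row of AND gates)
            product += a << i     # accumulate the partial product at weight 2**i
    return product
```

Each loop iteration corresponds to one row of the array: n partial products are formed and then added one after another, which is why the hardware version is slow and large.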
The Booth algorithm is a widely used and efficient way to implement a multiplier. It first computes the partial product of the multiplicand with each segment of the multiplier, then compresses and sums the partial products to obtain the final product. Generating and merging the partial products is the key step: partial-product calculation affects not only the calculation speed but also the size of the whole multiplier. An improved Booth algorithm keeps the basic framework of the classical Booth algorithm (shift, compress, sum) but drops the method of obtaining each multiplier segment's partial product by a subtraction after the shift; instead, it retains several products during shifting and sums them after multiple compressions. This improvement strengthens cohesion inside each functional module, reduces coupling between modules, and simplifies the multiplier control circuit.
However, because the Booth algorithm divides the multiplier into several segments, the multiplication reduces to the sum of the partial products of the multiplicand with each multiplier segment. Specifically, according to the binary pattern of each multiplier segment, the multiplication of that segment by the multiplicand is mapped to equivalent shift and subtraction operations to obtain the partial product for the segment; the partial products are then either added repeatedly or added once after multiple compressions. This algorithm is slower than the improved algorithms, and its speed is quite limited. Moreover, most digital designs today use synchronous circuits, whose clocking scheme requires a huge clock distribution network and suffers from clock skew and a series of related problems.
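As an illustration of mapping multiplier segments to shifts and subtractions, the following sketch uses radix-4 Booth recoding. The radix is an assumption made here (the text does not state the segment width), and the names BOOTH_DIGIT and booth_radix4_multiply are hypothetical.

```python
# Radix-4 Booth digit for each overlapping 3-bit window (y[2k+1], y[2k], y[2k-1]).
BOOTH_DIGIT = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}

def booth_radix4_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Unsigned n-bit multiply (n even) using radix-4 Booth recoding of the multiplier."""
    assert n % 2 == 0
    y = (multiplier & ((1 << n) - 1)) << 1     # append the implicit y[-1] = 0 below bit 0
    product = 0
    for k in range(n // 2 + 1):                # one extra segment covers the unsigned top bit
        digit = BOOTH_DIGIT[(y >> (2 * k)) & 0b111]
        if digit == 0:
            continue
        partial = multiplicand << (abs(digit) - 1)   # |digit| = 1: no shift, |digit| = 2: shift by one
        if digit < 0:
            product -= partial << (2 * k)      # negative digit: subtract the shifted multiplicand
        else:
            product += partial << (2 * k)
    return product
```

Every segment costs at most one shift and one addition or subtraction, so an n-bit multiplier yields about n/2 partial products instead of n.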
Summary of the Invention
The object of the present invention is to provide a 64-bit asynchronous multiplier based on an FPGA that computes faster and consumes less energy.
The present invention is achieved as follows. A 64-bit asynchronous multiplier based on an FPGA comprises an 8*64 multiplier, a selector MUX0, a selector MUX1, a selector MUX2, a compressor, a counter Count0, a counter Count1, a counter Count2, several registers, a carry-lookahead adder CLA and a control unit, wherein:
the control unit is a pipeline built from Click asynchronous controllers; it resolves the handshake signals through the handshake communication of the asynchronous controllers and generates four groups of trigger signals in sequence;
the counter Count0 is configured to, after receiving the first group of trigger signals from the control unit, control the selector MUX0 so that the input signals are operated on in the 8*64 multiplier, the results being stored in 8 registers respectively;
the registers are configured to store the output values of the preceding 8*64 multiplier and, after receiving the second group of trigger signals from the control unit, to pass the output values of the 8*64 multiplier onward;
the counter Count1 is configured to, after receiving the third group of trigger signals from the control unit, control through the selector MUX1 the values in the 8 registers so that they are compressed in the compressor in a set order;
the counter Count2 is configured to, after receiving the fourth group of trigger signals, control the selector MUX2 to select the output value of the preceding compressor and, according to the judgment result, either feed the output value back to the preceding compressor to continue compression with the data of the 8 registers, or pass the output value to the carry-lookahead adder CLA;
the carry-lookahead adder CLA sums the received output values and outputs the result.
Preferably, for the counter Count0, the input signals of the 8*64 multiplier are a 64-bit input signal A and an 8-bit input signal B.
A good digital multiplier is a core component of processors and algorithm chips. It is the basis and core of all kinds of complex calculations and is especially critical for high-performance real-time digital signal processing and image processing; the efficiency of the multiplier directly affects the performance of the chip. The efficiency of a digital multiplier is reflected mainly in two aspects, area and speed. The choice of design method and implementation algorithm has a large effect on the area and speed of the multiplier.
The present invention proposes an improved Booth multiplication algorithm whose core idea is to shift first, then compress, and finally sum. This reduces the coupling between modules and helps simplify the control circuit.
In addition, following a purely asynchronous circuit design method, the present invention uses a Click micropipeline with a bundled-data two-phase handshake communication protocol and, following the strategy of separating control from data processing, implements a 64-bit asynchronous multiplier based on this improved algorithm and verifies it on an FPGA.
1. Asynchronous control principle based on the micropipeline
The core of asynchronous design is the asynchronous controller circuit, which implements the handshake communication protocol and coordinates circuit functions. There are currently three mainstream classes of asynchronous controller unit: CElement, GasP and Click. The CElement was proposed by Muller in the 1950s and is the most widely used asynchronous control unit. It implements a handshake protocol based on "data-bound"; because this circuit imposes no constraint on the data during handshake communication, a large amount of timing verification is needed afterwards to guarantee that the circuit is correct. GasP and Click circuits use a bundled-data handshake protocol that separates communication and data management into different events. This event separation guarantees the timing of the circuit in principle, and the corresponding timing analysis and guarantees considerably simplify asynchronous design. We use a pipeline built from Click asynchronous controllers as the control unit; the micropipeline repeatedly invokes the modules of the multiplier to perform operations, thereby completing the final multiplier algorithm.
2. Click circuits and the two-phase single-rail handshake protocol
Click circuits were first proposed in 2010 by Peeters, Willem et al. and implement a bundled-data two-phase handshake communication protocol. Asynchronous controllers communicate by handshaking on the Req (request) and Ack (acknowledge) signals; data is transferred between two signal transitions, and the Fire signal manages the data transfer, as shown in Fig. 1.
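The two-phase handshake can be sketched with a simplified behavioral model. This is an assumption-laden illustration, not the circuit of Fig. 1: the class ClickStage, the function run_pipeline and the rule "fire when the input request differs from the stage's phase and the output acknowledge equals it" are a common textbook abstraction of a Click element, and the source and sink behavior is invented for the demonstration.

```python
class ClickStage:
    """Simplified two-phase stage: one phase bit drives both in_ack and out_req."""

    def __init__(self):
        self.phase = False   # toggles once per token (two-phase signalling)
        self.data = None     # bundled data captured on fire
        self.fires = 0

    def step(self, in_req, in_data, out_ack):
        # Fire when a new request is pending and the downstream stage is free.
        fire = (in_req != self.phase) and (out_ack == self.phase)
        if fire:
            self.data = in_data
            self.phase = not self.phase
            self.fires += 1
        return fire


def run_pipeline(tokens, n_stages=3):
    """Push tokens through a chain of stages; return received data and fire counts."""
    stages = [ClickStage() for _ in range(n_stages)]
    src_req, src_data, sink_ack = False, None, False
    pending, received = list(tokens), []
    for _ in range(10 * (len(tokens) + n_stages)):
        if pending and src_req == stages[0].phase:     # previous request acknowledged
            src_data = pending.pop(0)
            src_req = not src_req                      # a transition is a request
        for i in reversed(range(n_stages)):            # downstream first, so data is not overwritten
            in_req = src_req if i == 0 else stages[i - 1].phase
            in_data = src_data if i == 0 else stages[i - 1].data
            out_ack = sink_ack if i == n_stages - 1 else stages[i + 1].phase
            stages[i].step(in_req, in_data, out_ack)
        if stages[-1].phase != sink_ack:               # sink consumes and acknowledges
            received.append(stages[-1].data)
            sink_ack = stages[-1].phase
    return received, [s.fires for s in stages]
```

In this model every token causes exactly one fire event per stage, and both rising and falling transitions of Req and Ack carry meaning, which is what distinguishes two-phase from four-phase signalling.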
3. Asynchronous micropipeline control circuit
The control circuit of the 64-bit asynchronous multiplier uses an asynchronous pipeline to strictly control the operation timing of each module. The micropipeline of the multiplier contains 19 Click circuits in total and generates the corresponding 19 Fire signals, as shown in Fig. 2. Because an asynchronous circuit uses the handshake protocol to generate a local clock for each pipeline stage, replacing the global clock of a synchronous integrated circuit, it needs no huge clock distribution network. This naturally avoids problems of synchronous integrated circuits such as clock drift and high power consumption, achieves average-case performance, and gives good reusability and robustness. When the request signal in_R enters the micropipeline structure, it propagates stage by stage in order until the acknowledge signal in_A is finally obtained. By invoking the micropipeline control unit repeatedly, the arithmetic operation of the whole multiplier is completed.
The asynchronous pipeline control circuit does not only output the trigger signals Fire; the micropipeline control unit also involves control parts such as counters. The whole multiplier needs three counters in total to drive the different selectors, and the selectors in turn control the data path and realize a looping pipeline structure.
Compared with the shortcomings and deficiencies of the prior art, the present invention has the following advantages:
(1) compared with a synchronous multiplier of the same architecture, the asynchronous multiplier proposed by the present invention computes faster with roughly unchanged energy consumption and area; each calculation takes about 150 ns, and the product of any two 64-bit binary numbers is completed quickly;
(2) the design is not constrained by the intrinsic frequency of the FPGA; the communication delay between modules inside the micropipeline is as low as 1.5 ns, and the design needs no huge clock distribution network and has no clock skew problem;
(3) the present invention is highly modular and well suited to hierarchical design.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the bundled-data two-phase handshake communication protocol;
Fig. 2 is a schematic structural diagram of the micropipeline control circuit;
Fig. 3 is a structural diagram of the logic modules in the 64-bit asynchronous multiplier based on FPGA of the present invention;
Fig. 4 is a structural diagram of the logic function of the 8*64 multiplier;
Fig. 5 is a simulation diagram of the multiplier.
Detailed Description
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
The invention discloses a 64-bit asynchronous multiplier based on an FPGA. As shown in Fig. 3, the 64-bit asynchronous multiplier comprises an 8*64 multiplier, a selector MUX0, a selector MUX1, a selector MUX2, a compressor, a counter Count0, a counter Count1, a counter Count2, several registers, a carry-lookahead adder CLA and a control unit (the micropipeline in Fig. 3), wherein:
the control unit is a pipeline built from Click asynchronous controllers; it resolves the handshake signals through the handshake communication of the asynchronous controllers and generates four groups of trigger signals in sequence;
the counter Count0 is configured to, after receiving the first group of trigger signals from the control unit, control the selector MUX0 so that the 64-bit input signal A and the 8-bit input signal B are operated on in the 8*64 multiplier, the results being stored in 8 registers respectively;
the registers are configured to store the output values of the preceding 8*64 multiplier and, after receiving the second group of trigger signals from the control unit, to pass the output values of the 8*64 multiplier onward;
the counter Count1 is configured to, after receiving the third group of trigger signals from the control unit, control through the selector MUX1 the values in the 8 registers so that they are compressed in the compressor in a set order;
the counter Count2, after receiving a series of trigger signals, controls the selector MUX2 to select whether the output value of the preceding compressor is fed back to continue compression with the data of the 8 registers in the preceding compressor, or is passed to the carry-lookahead adder CLA;
the carry-lookahead adder CLA sums the received output values and outputs the result.
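The final summation step can be illustrated with a bit-level sketch of carry-lookahead addition based on generate and propagate signals. The function name cla_add and the default width of 128 bits (the width of a 64-by-64 product) are assumptions made for this illustration; the patent does not give the internal structure of its CLA.

```python
def cla_add(a: int, b: int, width: int = 128, carry_in: int = 0):
    """Add two unsigned integers using generate/propagate terms; returns (sum, carry_out)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]   # generate: both bits are 1
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]   # propagate: exactly one bit is 1
    carries = [carry_in]
    for i in range(width):
        # c[i+1] = g[i] OR (p[i] AND c[i]); hardware flattens this recurrence
        # into two-level logic per group instead of rippling bit by bit.
        carries.append(g[i] | (p[i] & carries[i]))
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i                         # sum bit = p XOR carry-in
    return total, carries[width]
```

Software evaluates the carry recurrence sequentially; the speed advantage of a real CLA comes from computing all carries of a group in parallel from the g and p terms.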
In an embodiment of the present invention, as shown in Fig. 3, the multiplier is built from functional modules including the 8*64 multiplier, the selectors MUX0, MUX1 and MUX2, the compressor (Compressor), the three counters Count0, Count1 and Count2, and the final carry-lookahead adder CLA. Count0 controls the selector to divide A into groups of 8 bits, 8 groups in total; each group is operated on with B in the 8*64 multiplier, and the resulting values are stored in 8 registers respectively. Count1 controls the compression of the values in the 8 registers in the compressor, 7 compressions in total. Count2 selects whether the compressor output value is fed back or passed to the carry-lookahead adder CLA for summation. The compressor first compresses the values in FF1 and FF2, then compresses the resulting value with the value in FF3, and so on. When the last compression is complete, the resulting compressed value is passed to the carry-lookahead adder CLA, which produces the final output value of the 64-bit multiplier.
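The data flow of this embodiment can be sketched as a behavioral model. Several points are assumptions made for the illustration rather than statements of the patent: each 8-bit group of A is weighted by 2^(8i) before compression, each compression is modeled as a word-level carry-save step, the final Python addition stands in for the CLA, and the names csa and multiply_64x64 are hypothetical.

```python
MASK64 = (1 << 64) - 1

def csa(x: int, y: int, z: int):
    """Carry-save step: three operands in, (sum, carry) out, with x + y + z == sum + carry."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c

def multiply_64x64(a: int, b: int):
    """Return (product, number_of_compressions) for two unsigned 64-bit operands."""
    a &= MASK64
    b &= MASK64
    # Count0/MUX0: select the 8-bit groups of A one after another; the 8*64
    # multiplier result of each group is stored in FF1..FF8.
    regs = []
    for i in range(8):
        group = (a >> (8 * i)) & 0xFF
        regs.append((group * b) << (8 * i))    # weighting by 2**(8*i) is an assumption
    # Count1/MUX1: compress FF1 with FF2, then the result with FF3, and so on.
    s, c = regs[0], 0
    compressions = 0
    for i in range(1, 8):
        s, c = csa(s, c, regs[i])              # Count2/MUX2 feeds (s, c) back each round
        compressions += 1
    # After the last compression the pair goes to the carry-lookahead adder.
    return s + c, compressions
```

Keeping the running result in carry-save form means that no carry propagation is needed until the single final addition.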
In an embodiment of the present invention, the working principle of the control unit, analyzed through the handshake communication of the asynchronous controllers, is shown in Fig. 3 and specifically includes:
(1) fire0: the register FF0 holds two values, wi_a_64bit and wi_b_64bit. fire0 triggers FF0 to pass these two values onward: wi_a_64bit reaches the selector MUX0, and wi_b_64bit goes directly to the 8*64 multiplier, where the two values wait together for the trigger signal of the first calculation. Meanwhile the micropipeline passes the handshake signal onward and generates the fire1 signal.
(2) fire1, fire3, fire5 to fire15: these 8 trigger signals drive the counter Count0. The counter in turn controls the selector MUX0, which divides the wi_a_64bit value held in the selector; the 8 output groups reach the 8*64 multiplier, are operated on with the value of wi_b_64bit, and the results are stored in 8 registers.
(3) fire2, fire4, fire6 to fire16: the registers FF1-FF8 store the 8 output values of the preceding multiplier. Triggered by these 8 fire signals, the registers pass the output values of the 8*64 multiplier onward.
(4) fire4, fire6, fire8 to fire16: these 7 trigger signals drive the counter to count from 0 to 6. When the counter is 0, the values in FF1 and FF2 are passed through the selector to the compressor (Compressor) for compression. The main function of the selector MUX1 is to select the input values to be compressed. The data path uses a 7-stage looping pipeline structure with a compressor tree (Compressor_128bit). Because the parallel computing capability of an ordinary adder is limited, the present invention uses 4-2 compressors, which compress 4 inputs into 2 outputs in parallel and halve the number of partial products. A 4-2 compressor consists of two one-bit full adders in series; compression of the high bits does not depend on the carry from the low bits, so parallelism is high, circuit complexity is low, and operation speed is high, which improves the overall efficiency of the multiplier.
(5) fire5, fire7, fire9 to fire17: these signals mainly control whether the selector MUX2 feeds the output value of the preceding compressor back or passes it to the carry-lookahead adder CLA. For example, when the fire5 trigger signal arrives, the first compressed value is fed back to the preceding selector MUX1, and MUX1 directs the value in FF3 to be compressed with the fed-back value; this operation continues until the fire15 signal arrives. When the fire17 signal arrives, the compressed value is passed onward to the CLA for further calculation, and the computed value is stored in the register that follows the adder CLA.
(6) fire18: the last signal triggers the FF1 register of the carry-lookahead adder CLA and outputs the final product data.
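The six groups of Fire signals listed above can be tabulated in a small sketch that checks that they account for all 19 signals. The role descriptions and the names FIRE_SCHEDULE and roles_of are paraphrases introduced here for illustration only.

```python
# Fire-signal indices per role, transcribed from items (1) to (6) above.
FIRE_SCHEDULE = {
    "FF0 forwards the operands": [0],
    "Count0/MUX0 selects a group of A for the 8*64 multiplier": list(range(1, 16, 2)),
    "FF1-FF8 latch and forward the multiplier output": list(range(2, 17, 2)),
    "Count1/MUX1 selects inputs for compression": list(range(4, 17, 2)),
    "Count2/MUX2 feeds back or forwards to the CLA": list(range(5, 18, 2)),
    "final product is output": [18],
}

def roles_of(fire_index: int):
    """Return every role that a given Fire signal plays in the schedule."""
    return [role for role, indices in FIRE_SCHEDULE.items() if fire_index in indices]
```

For example, roles_of(4) returns two roles, because fire4 both latches a register and starts the first compression; the union of all index lists is exactly 0 to 18.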
In an embodiment of the present invention, for the counter Count0, the calculation performed in the 8*64 multiplier on the 64-bit input signal A and the 8-bit input signal B is shown in Fig. 4.
As can be seen from Fig. 4, the input signals of the multiplier are A and B, where A is 64 bits wide and B is 8 bits wide. The input A is divided into groups of 8 bits, 8 groups in total, from [7:0] to [63:56]. These 8 groups of A are each fed together with B into 8 multipliers, each of which consists of 4 shifter (Shifter) circuits and a compressor (Compressor), as in the structure of Multiplier1 in Fig. 4.
Multiplier1 is one of the main components of the 8*64 multiplier. Its inputs are A[7:0] and B. A[7:0] is first divided into groups of two bits, giving the 4 groups (A7A6)(A5A4)(A3A2)(A1A0); each group is fed together with B into one of 4 shift encoders, finally yielding two 15-bit binary values. The whole 8*64 multiplier contains 8 such 8-bit multipliers. The 8 groups of A are each calculated with B in these first-stage multipliers, finally yielding 16 15-bit binary values, which completes the first stage of the operation.
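The two-bit grouping in Multiplier1 can be illustrated with a minimal sketch in which each group selects shifted copies of B. This is an assumed reading of the shift encoders; the patent does not give their internal logic, the 15-bit output format is not modeled, and the name multiply_byte_by_groups is hypothetical.

```python
def multiply_byte_by_groups(a8: int, b: int) -> int:
    """Multiply an 8-bit value by b using four 2-bit groups and shifts only."""
    a8 &= 0xFF
    partials = []
    for k in range(4):                            # groups (A1A0), (A3A2), (A5A4), (A7A6)
        group = (a8 >> (2 * k)) & 0b11
        low = b if group & 0b01 else 0            # low bit of the group selects B
        high = (b << 1) if group & 0b10 else 0    # high bit of the group selects 2B
        partials.append((low + high) << (2 * k))  # the group's weight is 4**k
    return sum(partials)
```

In hardware the final sum would be produced by the compressor rather than by a chain of adders.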
In the second stage, these 16 binary values pass through a 4-2 compressor tree, which compresses them. A 4-2 compressor compresses 4 inputs into 2 outputs in parallel and halves the number of partial products. It consists of two one-bit full adders in series; compression of the high bits does not depend on the carry from the low bits, so parallelism is high, circuit complexity is low, and operation speed is high. Through this series of calculations, the multiplier and compressor finally produce 2 output values, S and C. These values are stored in the 8 registers and continue to be processed in the 64-bit multiplier.
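The 4-2 compression step can be sketched at word level as two carry-save full-adder stages in series. This is a simplified model: a bit-level 4-2 compressor passes an intermediate carry between neighboring bit positions, which the word-level shifts below reproduce; the function names are chosen here for illustration.

```python
def full_adder_word(x: int, y: int, z: int):
    """Bitwise full adder over whole words: returns (sum, carry) with x + y + z == sum + carry."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1    # carries move one position to the left
    return s, c

def compress_4_2(a: int, b: int, c: int, d: int):
    """Compress four operands into two (S, C) such that a + b + c + d == S + C."""
    s1, c1 = full_adder_word(a, b, c)         # first full-adder stage
    s2, c2 = full_adder_word(s1, c1, d)       # second full-adder stage in series
    return s2, c2
```

Neither stage propagates a carry across the word, so the delay is independent of the operand width; only the final addition of S and C needs a carry-propagating adder.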
The present invention implements this improved Booth algorithm using an asynchronous design method. The control part uses a micropipeline built from Click asynchronous controllers, which are easy to analyze for timing, and the functional circuits are implemented in combinational logic. The two are tied together by flip-flops: the asynchronous micropipeline indirectly maintains the computation time of the combinational circuits by managing when the flip-flops are enabled. The three cooperate to complete one or more multiplications and form a data-path (Data-Path) style computing structure.
The present invention uses a Click micropipeline with the bundled-data two-phase handshake communication protocol to implement the asynchronous circuit. The asynchronous circuit uses a local communication model and completes asynchronous control with the handshake protocol.
The low-coupling Booth algorithm proposed by the present invention increases the number of partial-product compressions and defers the addition (subtraction) to the end. By separating the functions of the modules, the algorithm improves computational efficiency and is particularly well suited to asynchronous control. Furthermore, the present invention uses the asynchronous micropipeline mechanism together with combinational functional modules to perform the shift, compression and addition functions; the design is highly modular and its flow is simple and clear.
Compared with a synchronous multiplier of the same architecture, the asynchronous multiplier proposed by the present invention computes faster with roughly unchanged energy consumption and area; each calculation takes about 150 ns, and the product of any two 64-bit binary numbers is completed quickly.
The 64-bit asynchronous multiplier was designed and simulated on the Vivado platform (Vivado is Xilinx's complete design flow tool from RTL to bitstream), with Verilog-1995 as the hardware description language. The FPGA (Field-Programmable Gate Array) used is a Xilinx Virtex-7 (xc7vx550tffg1158-2). Fig. 5 shows the waveform of one simulation, in which wi_a_64bit=103741655961231 is multiplied by wi_b_64bit=112381656513586. Test code was written in the Vivado simulation file TestBench, and a timing simulation was then run to obtain the final calculation result.
As can be seen from Fig. 5, when inR goes high, the handshake communication of the asynchronous controllers starts and the multiplier begins to calculate; a total of 19 fire signals and 2 counters control the multiplier data path. In terms of the resources used by the circuit, 3695 LUTs are occupied, 1.07% of the total, and 3335 registers are occupied, 0.48% of the total.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (2)

1. A 64-bit asynchronous multiplier based on an FPGA, characterized in that the 64-bit asynchronous multiplier comprises an 8*64 multiplier, a selector MUX0, a selector MUX1, a selector MUX2, a compressor, a counter Count0, a counter Count1, a counter Count2, several registers, a carry-lookahead adder CLA and a control unit, wherein:
the control unit is a pipeline built from Click asynchronous controllers; it resolves the handshake signals through the handshake communication of the asynchronous controllers and generates four groups of trigger signals in sequence;
the counter Count0 is configured to, after receiving the first group of trigger signals from the control unit, control the selector MUX0 so that the input signals are operated on in the 8*64 multiplier, the results being stored in 8 registers respectively;
the registers are configured to store the output values of the preceding 8*64 multiplier and, after receiving the second group of trigger signals from the control unit, to pass the output values of the 8*64 multiplier onward;
the counter Count1 is configured to, after receiving the third group of trigger signals from the control unit, control through the selector MUX1 the values in the 8 registers so that they are compressed in the compressor in a set order;
the counter Count2 is configured to, after receiving the fourth group of trigger signals, control the selector MUX2 to select the output value of the preceding compressor and, according to the judgment result, either feed the output value back to the preceding compressor to continue compression with the data of the 8 registers, or pass the output value to the carry-lookahead adder CLA;
the carry-lookahead adder CLA sums the received output values and outputs the result.
2. The 64-bit asynchronous multiplier based on an FPGA according to claim 1, characterized in that, for the counter Count0, the input signals of the 8*64 multiplier are a 64-bit input signal A and an 8-bit input signal B.
CN201710214226.5A 2017-04-01 2017-04-01 64-bit asynchronous multiplier based on FPGA Active CN107092462B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710214226.5A CN107092462B (en) 2017-04-01 2017-04-01 64-bit asynchronous multiplier based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710214226.5A CN107092462B (en) 2017-04-01 2017-04-01 64-bit asynchronous multiplier based on FPGA

Publications (2)

Publication Number Publication Date
CN107092462A true CN107092462A (en) 2017-08-25
CN107092462B CN107092462B (en) 2020-10-09

Family

ID=59646295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710214226.5A Active CN107092462B (en) 2017-04-01 2017-04-01 64-bit asynchronous multiplier based on FPGA

Country Status (1)

Country Link
CN (1) CN107092462B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060008080A1 (en) * 2004-07-09 2006-01-12 Nec Electronics Corporation Modular-multiplication computing unit and information processing unit
CN101504599A (en) * 2009-03-16 2009-08-12 西安电子科技大学 Special instruction set micro-processing system suitable for digital signal processing application

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
肖鹏: "基于FPGA的高速双精度浮点乘法器设计", 《微电子学与计算机》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537331A (en) * 2018-04-04 2018-09-14 清华大学 A kind of restructural convolutional neural networks accelerating circuit based on asynchronous logic
CN113407239A (en) * 2021-06-09 2021-09-17 中山大学 Assembly line processor based on asynchronous single track
CN113407239B (en) * 2021-06-09 2023-06-13 中山大学 Pipeline processor based on asynchronous monorail
WO2023179325A1 (en) * 2022-03-24 2023-09-28 华为技术有限公司 Chip, signal processing method, and electronic device

Also Published As

Publication number Publication date
CN107092462B (en) 2020-10-09

Similar Documents

Publication Publication Date Title
Gong et al. MALOC: A fully pipelined FPGA accelerator for convolutional neural networks with all layers mapped on chip
CN104899182B (en) A kind of Matrix Multiplication accelerated method for supporting variable partitioned blocks
CN103677739B (en) A kind of configurable multiply accumulating arithmetic element and composition thereof multiply accumulating computing array
CN100470464C (en) Multiplier based on improved Montgomey's algorithm
CN107092462A (en) A kind of 64 Asynchronous Multipliers based on FPGA
CN109828744A (en) A kind of configurable floating point vector multiplication IP kernel based on FPGA
CN104145281A (en) Neural network computing apparatus and system, and method therefor
CN105183425B (en) A kind of fixation bit wide multiplier with high-precision low complex degree characteristic
CN101221491B (en) Point addition system of elliptic curve cipher system
CN110058840A (en) A kind of low-consumption multiplier based on 4-Booth coding
Stevens et al. Energy and performance models for synchronous and asynchronous communication
CN107544942A (en) A kind of VLSI design methods of Fast Fourier Transform (FFT)
CN106775577B (en) A kind of design method of the non-precision redundant manipulators multiplier of high-performance
CN102364456A (en) 64-point fast Fourier transform (FFT) calculator
CN109472734A (en) A kind of target detection network and its implementation based on FPGA
Yin et al. FPGA-based high-performance CNN accelerator architecture with high DSP utilization and efficient scheduling mode
CN103078729A (en) Dual-precision chaotic signal generator based on FPGA (field programmable gate array)
CN104407836A (en) Device and method of carrying out cascaded multiply accumulation operation by utilizing fixed-point multiplier
Ranganathan et al. A linear array processor with dynamic frequency clocking for image processing applications
CN202395792U (en) Double precision chaotic signal generator based on FPGA
CN108108812A (en) For the efficiently configurable convolutional calculation accelerator of convolutional neural networks
CN107368459A (en) The dispatching method of Reconfigurable Computation structure based on Arbitrary Dimensions matrix multiplication
CN106168897A (en) Resources conservation circuit structure for deep stream aquation pulsation finite impulse response filter
Srivastava et al. Operation-dependent frequency scaling using desynchronization
Msadaa et al. A SoPC FPGA implementing of an enhanced parallel CFAR architecture

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant