CN108762719A - Parallel generalized inner product reconfigurable controller - Google Patents
- Publication number
- CN108762719A (application number CN201810497969.2A)
- Authority
- CN
- China
- Prior art keywords
- address
- intermediate result
- generalized
- computing module
- inner product
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
- G06F12/0646—Configuration or reconfiguration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/483—Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Computer Hardware Design (AREA)
- Nonlinear Science (AREA)
- Complex Calculations (AREA)
- Logic Circuits (AREA)
Abstract
The parallel generalized inner product reconfigurable controller of the present invention comprises: an intermediate result computing module, which receives source data, computes the intermediate result vector Y_L from the source data, generates the address of vector Y_L, and stores it in a bank; each time the computation of one intermediate result vector Y_L is completed, a completion signal is generated and sent to the final result computing module as a start signal; a final result computing module, which reads data into a complex multiply-accumulator to compute the L-th element Z_L of the result matrix, generates the address of Z_L, and stores it in a bank; and a storage data address processing module, which performs data selection according to a ping-pong operation selection signal and generates the correct bank address signals. Advantageous effects: the computation time is short and the storage resource utilization is high, which meets the high real-time requirement of obtaining test statistics when performing non-homogeneous detection in many signal detection applications.
Description
Technical field
The invention belongs to the technical field of non-homogeneous detection, and in particular relates to a parallel generalized inner product reconfigurable controller.
Background art
Space-time adaptive processing (STAP) is a technique for detecting moving targets. Conventional STAP algorithms must estimate the clutter covariance matrix. When secondary data are used for this estimate, the secondary data must satisfy the independent and identically distributed (i.i.d.) condition in order to limit performance loss.
In practical applications, the echo of the detected signal is contaminated not only by natural clutter but also by man-made non-homogeneous interference, so the i.i.d. condition is often not satisfied.
To deal with interfering targets in the samples, Melvin first proposed the idea of the non-homogeneity detector (NHD), which removes the samples that contain interfering targets in order to suppress their influence on the clutter covariance matrix estimate. The basic idea of NHD is to set a test statistic, based on the difference in statistical characteristics between samples contaminated by interfering targets and the other samples, that distinguishes the two kinds of samples.
For the choice of NHD test statistic, Gerlach et al. of the US Naval Research Laboratory proposed two criteria: the generalized inner product (GIP) and the adaptive power residue. Let X_L denote the L-th sample among the initial samples; its autocorrelation matrix is expressed as [formula omitted], where T is the clutter-plus-noise covariance matrix. Let [symbol omitted] denote the sample covariance matrix formed from L samples; the GIP value of each sample can then be expressed as [formula omitted]. Based on the GIP value of each sample, interfering targets can be rejected effectively.
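For reference, the generalized inner product is commonly written in the STAP literature in the form below. This is the usual textbook form and not a reconstruction of the formulas missing from this text; the symbols \hat{R} (sample covariance estimate) and K (number of samples) are notation assumed here.

```latex
\hat{R} = \frac{1}{K}\sum_{k=1}^{K} X_k X_k^{H}, \qquad
\mathrm{GIP}_L = X_L^{H}\,\hat{R}^{-1}\,X_L
```

Under this form, a two-stage evaluation (first a vector-matrix product, then an inner product with the same sample vector) yields the quadratic form, assuming the square matrix used in the first stage holds the inverse covariance estimate.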
The clutter suppression capability of the generalized inner product non-homogeneous detection method depends on the total number of samples: the larger the sample size, the more accurate the clutter covariance matrix data and the stronger the clutter suppression. A software implementation of generalized inner product non-homogeneous detection suffers from limited precision and long run time when processing a large number of samples, and therefore struggles to meet the high real-time requirements of practical non-homogeneous detection.
Summary of the invention
The purpose of the present invention is to overcome the deficiencies of the background art described above by providing a parallel generalized inner product reconfigurable controller that better meets the high real-time and large-point-count computation demands of practical applications. This is achieved by the following technical solution:
The parallel generalized inner product reconfigurable controller comprises:
An intermediate result computing module, which receives source data, computes the intermediate result vector Y_L from the source data, generates the address of vector Y_L, and stores it in a bank; each time the computation of one intermediate result vector Y_L is completed, it generates a completion signal and sends it to the final result computing module as a start signal;
A final result computing module, in which an address generator continuously generates the addresses of the elements of row X_L of matrix X and of the elements of the corresponding intermediate result vector Y_L; the data read are fed into a complex multiply-accumulator to obtain the L-th element Z_L of the 1×N result matrix Z, the address of Z_L is generated, and the result is stored in a bank;
A storage data address processing module, which performs data selection according to the ping-pong operation selection signal and, at the same time, processes the signals from the intermediate result computing module and the final result computing module that target the same bank, so as to generate the correct bank address signals.
In a further design of the hardware implementation of the parallel generalized inner product operation, Y_L is computed by multiply-accumulating X_L with each column of the square matrix T; the order (number of rows and columns) of the square matrix T equals the number of columns of matrix X, and the multiply-accumulate process is carried out by multi-way parallel computation.
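The multiply-accumulate of X_L with each column of T can be sketched in software as follows. This is a minimal illustration, not the patented hardware: the function name `intermediate_result` and the `ways` parameter are hypothetical, the matrices are plain Python lists of complex numbers, and the inner loop over a column group stands in for lanes that would run concurrently in hardware.

```python
def intermediate_result(x_l, t, ways=4):
    """Compute Y_L[j] = sum_i x_l[i] * t[i][j], `ways` columns of T at a time.

    x_l  : list of M complex numbers (one sample vector X_L)
    t    : M x M list of lists of complex numbers (square matrix T)
    ways : number of columns processed together (assumed lane count)
    """
    m = len(x_l)
    y_l = [0j] * m
    for base in range(0, m, ways):
        cols = range(base, min(base + ways, m))
        acc = {j: 0j for j in cols}          # one accumulator per lane
        for i in range(m):
            for j in cols:                   # lanes; sequential only in this sketch
                acc[j] += x_l[i] * t[i][j]
        for j in cols:
            y_l[j] = acc[j]
    return y_l


# Small usage example with a 2 x 2 matrix.
x = [1 + 1j, 2 - 1j]
t = [[1 + 0j, 0 + 1j],
     [2 + 0j, 1 - 1j]]
y = intermediate_result(x, t)
```

The result does not depend on `ways`; the grouping only changes how many columns share one pass over X_L.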
In a further design of the hardware implementation of the parallel generalized inner product operation, the intermediate result computing module is implemented in a four-way parallel manner.
In a further design of the hardware implementation of the parallel generalized inner product operation, the source data of the intermediate result computing module are stored as follows: matrix T is stored by row in bank0-bank3 and, once these are full, continues by row in bank4-bank7; matrix X is stored by row in bank8-bank11.
In a further design of the hardware implementation of the parallel generalized inner product operation, the intermediate results of the intermediate result computing module are stored as follows: odd-numbered terms are stored in bank12 and even-numbered terms are stored in bank13.
In a further design of the hardware implementation of the parallel generalized inner product operation, the intermediate result computing module computes the intermediate result as follows: during one operation, the address generator first generates the addresses of the elements of one column X_L of X and of four columns of matrix T, and the corresponding matrix element data are fetched and fed into the complex multiply-accumulators to obtain the intermediate result Y_L; the address generator then generates the intermediate result storage address, and the intermediate result is stored in a bank.
In a further design of the hardware implementation of the parallel generalized inner product operation, the final result computing module computes the final result as follows: when the final result computing module receives the intermediate-result completion signal, the address generator continuously generates the addresses of the elements of row X_L of matrix X and of the elements of the corresponding intermediate result vector Y_L; these are fed simultaneously into the complex multiply-accumulator to obtain the final result Z_L, the address generator generates the final result storage address, and the final result is stored in a bank.
In a further design of the hardware implementation of the parallel generalized inner product operation, the complex multiplier is a pipelined single-precision floating-point arithmetic unit with a latency of 4 clock cycles, and the memory access latency of the complex multiplier is set to 6 cycles.
In a further design of the hardware implementation of the parallel generalized inner product operation, there are five complex multiply-accumulators, four of which compute the intermediate results in four-way parallel while the fifth computes the final result synchronously.
In a further design of the hardware implementation of the parallel generalized inner product operation, each complex multiply-accumulator consists of one complex multiplier and three complex adders, and the area after DC synthesis in a 40 nm CMOS process is 19993.56 μm².
Advantages of the present invention
The parallel generalized inner product reconfigurable controller provided by the invention adopts the strategy of computing one final result element immediately after computing one intermediate result, so that the time to compute Z_{L-1} can be hidden within the time to compute Y_L; the computation time is short and the storage resource utilization is high. The controller can meet the high real-time requirement of obtaining test statistics when performing non-homogeneous detection in many signal detection applications.
Description of the drawings
Fig. 1 is an architecture diagram of the parallel generalized inner product reconfigurable controller.
Fig. 2 is a schematic diagram of the data storage for the parallel generalized inner product.
Fig. 3 is a schematic flow diagram of the parallel generalized inner product algorithm.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment.
As shown in Fig. 1, the parallel generalized inner product reconfigurable controller of this embodiment, taking four-way parallelism as an example, consists mainly of three submodules: the intermediate result computing module, the final result computing module, and the storage data address processing module. The intermediate result computing module computes the intermediate results; the final result computing module computes the final results; the storage data address processing module handles the bank addresses and related signals.
The intermediate result computing module computes the intermediate result vector Y_L in a fully pipelined manner. This includes generating the addresses of the elements of column X_L, performing the inner-product multiply-accumulate operation between the elements of X_L and each column of the M×M square matrix T to obtain the intermediate result vector Y_L, generating the address of vector Y_L, and storing it in a bank. Each time the computation of one Y_L is completed, a completion signal is issued to the final result computing module as the start signal for one of its computations.
In the final result computing module, an address generator continuously generates the addresses of the elements of row X_L of matrix X and of the elements of the corresponding intermediate result vector Y_L; the data read are fed into a complex multiply-accumulator to obtain the L-th element Z_L of the 1×N result matrix Z, the address of Z_L is generated, and the result is stored in a bank.
The storage data address processing module performs data selection according to the ping-pong operation selection signal and, at the same time, processes the signals from the intermediate result computing module and the final result computing module that target the same bank, so as to generate the correct bank address and related signals.
As shown in Fig. 1, the storage unit comprises 15 banks: matrix T is stored in bank0-7, matrix X in bank8-11, the intermediate results Y_L in bank12 and bank13, and the final parallel generalized inner product matrix in bank14. The arithmetic unit comprises 5 complex multiply-accumulators: complex multiply-accumulators 0-3 compute the intermediate results in four-way parallel, and complex multiply-accumulator 4 computes the final result at the same time.
Fig. 2 shows the data storage scheme for the parallel generalized inner product. The source data are stored as follows: matrix T is stored by row in bank0-bank3 and, once these are full, continues by row in bank4-bank7; matrix X is stored by row in bank8-bank11. This layout facilitates four-way parallel operation when computing the intermediate result Y_L and also simplifies the design of the corresponding DMA module. For the intermediate results Y_L, the odd-numbered terms Y_1, Y_3, ... are stored in bank12 (each overwriting the previous one), and the even-numbered terms Y_2, Y_4, ... are stored in bank13 (each overwriting the previous one). The final generalized inner product matrix is stored in bank14.
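The odd/even placement of Y_L amounts to a ping-pong buffer between bank12 and bank13. The sketch below only models which bank is selected for a given index L; the function names are hypothetical, and the real module also arbitrates address signals that this sketch does not model.

```python
Y_ODD_BANK = 12   # holds Y_1, Y_3, ... (each overwrites the previous one)
Y_EVEN_BANK = 13  # holds Y_2, Y_4, ... (each overwrites the previous one)


def y_write_bank(l):
    """Bank that receives Y_l, with l counted from 1 as in the text."""
    return Y_ODD_BANK if l % 2 == 1 else Y_EVEN_BANK


def y_read_bank_for_z(l):
    """Bank read when computing Z_l: the bank that Y_l was written to."""
    return y_write_bank(l)


# While Y_l is being written, Z_{l-1} reads the other bank,
# so the writer and the reader never target the same bank.
for l in range(2, 10):
    assert y_write_bank(l) != y_read_bank_for_z(l - 1)
```

Because consecutive indices alternate banks, only two Y vectors need to be kept at any time, which is consistent with the overwrite behaviour described above.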
As shown in Fig. 3, the parallel generalized inner product algorithm computes the intermediate result as follows: during one operation, address generator 1 first generates the addresses of the elements of one column X_L of X and of four columns of matrix T, and the corresponding matrix element data are fetched and fed into the complex multiply-accumulators to obtain the intermediate result Y_L; address generator 2 then generates the intermediate result storage address, and the intermediate result is stored in a bank.
Likewise, the parallel generalized inner product algorithm computes the final result as follows: during one operation, when the module receives the intermediate-result completion signal, address generator 1 continuously generates the addresses of the elements of row X_L of matrix X and of the elements of the corresponding intermediate result vector Y_L. These are fed simultaneously into the complex multiply-accumulator to obtain the final result Z_L; address generator 2 then generates the final result storage address, and the final result is stored in a bank.
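The final-result step is a single complex multiply-accumulate over the elements of X_L and Y_L. A minimal software sketch follows; the function name `final_result` is hypothetical, and the elements are multiplied as given because the text does not state whether either operand is conjugated.

```python
def final_result(x_l, y_l):
    """Z_L = sum_i x_l[i] * y_l[i], accumulated one element pair at a time."""
    acc = 0j
    for x, y in zip(x_l, y_l):
        acc += x * y   # one complex multiply and one complex add per pair
    return acc


# Usage with two-element vectors.
z = final_result([1 + 1j, 2 - 1j], [5 - 1j, -2j])
```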
One complete computation of the hardware implementation of the parallel generalized inner product algorithm of the present invention comprises the following steps:
Step 1) Set L = 1; the computation starts from the first row of matrix X;
Step 2) Compute the intermediate result Y_L.
Computing the intermediate result Y_L comprises the following steps:
Step 2-1) According to the addresses generated by the address generator submodule, fetch the elements of X_L and (T_1 T_2 T_3 T_4) in turn and feed them into the multiply-accumulate submodule for complex multiply-accumulate, obtaining (Y_L1 Y_L2 Y_L3 Y_L4);
Step 2-2) According to the addresses generated by the address generator submodule, write (Y_L1 Y_L2 Y_L3 Y_L4) in turn into the intermediate result bank, while fetching the next group of four columns of T matrix elements and X_L; repeat steps 2-1) and 2-2) until the computation of Y_L is complete;
Step 3) Compute the final result Z_L. This proceeds in parallel with steps 1) and 2): if Y_{L-1} has been generated, fetch the elements of X_{L-1} and Y_{L-1} in turn according to the addresses generated by the address generator, perform complex multiply-accumulate to obtain Z_{L-1}, and write the final result into the final result bank according to the address generated by the address generator;
Step 4) If L < N, set L = L + 1 and jump to step 2);
Step 5) Fetch the elements of X_N and Y_N in turn, perform complex multiply-accumulate to obtain Z_N, store it in a bank, and complete the inner product operation.
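The five steps can be emulated sequentially in software to check the data flow. In the sketch below, the function name `gip_pipeline`, the two-slot list standing in for the two intermediate result banks, and the choice to index samples as rows of `x_rows` are all assumptions made for illustration; the overlap of steps 2) and 3) is only indicated by ordering, since plain Python executes them one after the other.

```python
def gip_pipeline(x_rows, t):
    """Return [Z_1, ..., Z_N] for N >= 1 sample vectors and an M x M matrix T.

    x_rows : list of N sample vectors, each a list of M complex numbers
    t      : M x M list of lists of complex numbers
    """
    n, m = len(x_rows), len(t)
    z = [0j] * n
    y_slots = [None, None]            # stand-ins for the two Y banks (ping-pong)

    for l in range(1, n + 1):         # steps 1) and 4): L runs from 1 to N
        x_l = x_rows[l - 1]
        # Step 2): Y_L[j] = sum_i X_L[i] * T[i][j], stored in slot L mod 2
        y_slots[l % 2] = [sum(x_l[i] * t[i][j] for i in range(m))
                          for j in range(m)]
        # Step 3): Z_{L-1} from the other slot; concurrent with step 2) in hardware
        if l > 1:
            y_prev = y_slots[(l - 1) % 2]
            z[l - 2] = sum(a * b for a, b in zip(x_rows[l - 2], y_prev))

    # Step 5): the last element Z_N has no later Y computation to hide behind
    z[n - 1] = sum(a * b for a, b in zip(x_rows[n - 1], y_slots[n % 2]))
    return z


# Usage: two samples of length two.
t = [[1 + 0j, 0 + 1j],
     [2 + 0j, 1 - 1j]]
z = gip_pipeline([[1 + 1j, 2 - 1j], [1 + 0j, 0j]], t)
```

Each iteration writes one slot and reads the other, which mirrors the odd/even bank alternation of the intermediate results.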
The complex multipliers and complex adders used in the parallel generalized inner product reconfigurable controller of this embodiment are pipelined single-precision floating-point arithmetic units with a latency of 4 clock cycles, and the memory access latency is 6 cycles. According to EDA simulation and synthesis tools, the operating frequency reaches 1 GHz.
The parallel generalized inner product reconfigurable controller of this embodiment uses five complex multiply-accumulators in total, four of which compute the intermediate results in four-way parallel while the fifth computes the final result synchronously. Each complex multiply-accumulator consists of one complex multiplier and three complex adders, and the area after DC synthesis in a 40 nm CMOS process is 19993.56 μm².
The parallel generalized inner product reconfigurable controller of this embodiment adopts the strategy of computing one final result element immediately after computing one intermediate result, so that the time to compute Z_{L-1} can be hidden within the time to compute Y_L. Compared with a method that computes the final results only after all intermediate results have been computed, the computation time is shorter and the storage resource utilization is higher.
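A rough timing model illustrates the saving. The symbols t_Y and t_Z (time to compute one Y_L and one Z_L) are assumptions introduced here rather than figures from the patent, and the model assumes t_Z ≤ t_Y while ignoring pipeline fill and memory access latency.

```latex
T_{\text{sequential}} = N\,(t_Y + t_Z), \qquad
T_{\text{overlapped}} = N\,t_Y + t_Z, \qquad
T_{\text{sequential}} - T_{\text{overlapped}} = (N-1)\,t_Z
```

Under these assumptions only the last element Z_N is not hidden behind a Y computation. The overlapped schedule also needs storage for just two intermediate vectors, since each Y_L can be overwritten once its Z_L has been produced.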
The parallel generalized inner product reconfigurable controller of this embodiment is characterized by fast computation, a flexible and variable number of points, and high storage resource utilization. It can meet the high real-time requirement of obtaining test statistics in digital signal processing with large data volumes, for example when performing non-homogeneous detection in real-time signal detection applications.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.
Claims (10)
1. A parallel generalized inner product reconfigurable controller, characterized by comprising:
An intermediate result computing module, which receives source data, computes the intermediate result vector Y_L from the source data, generates the address of vector Y_L, and stores it in a bank; each time the computation of one intermediate result vector Y_L is completed, it generates a completion signal and sends it to the final result computing module as a start signal;
A final result computing module, in which an address generator continuously generates the addresses of the elements of row X_L of matrix X and of the elements of the corresponding intermediate result vector Y_L; the data read are fed into a complex multiply-accumulator to compute the L-th element Z_L of the 1×N result matrix Z, the address of Z_L is generated, and the result is stored in a bank;
A storage data address processing module, which performs data selection according to the ping-pong operation selection signal and, at the same time, processes the signals from the intermediate result computing module and the final result computing module that target the same bank, generating the correct bank address signals.
2. The hardware implementation method of the parallel generalized inner product operation according to claim 1, characterized in that: Y_L is computed by multiply-accumulating X_L with each column of the square matrix T; the order (number of rows and columns) of the square matrix T equals the number of columns of matrix X, and the multiply-accumulate process is carried out by multi-way parallel computation.
3. The parallel generalized inner product reconfigurable controller according to claim 2, characterized in that: the intermediate result computing module is implemented in a four-way parallel manner.
4. The parallel generalized inner product reconfigurable controller according to claim 3, characterized in that: the source data of the intermediate result computing module are stored as follows: matrix T is stored by row in bank0-bank3 and, once these are full, continues by row in bank4-bank7; matrix X is stored by row in bank8-bank11.
5. The parallel generalized inner product reconfigurable controller according to claim 3, characterized in that: the intermediate results of the intermediate result computing module are stored as follows: odd-numbered terms are stored in bank12 and even-numbered terms are stored in bank13.
6. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: the intermediate result computing module computes the intermediate result as follows: during one operation, the address generator first generates the addresses of the elements of one column X_L of X and of four columns of matrix T, and the corresponding matrix element data are fetched and fed into the complex multiply-accumulators to obtain the intermediate result Y_L; the address generator then generates the intermediate result storage address, and the intermediate result is stored in a bank.
7. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: the final result computing module computes the final result as follows: when the final result computing module receives the intermediate-result completion signal, the address generator continuously generates the addresses of the elements of row X_L of matrix X and of the elements of the corresponding intermediate result vector Y_L; these are fed simultaneously into the complex multiply-accumulator to obtain the final result Z_L, the address generator generates the final result storage address, and the final result is stored in a bank.
8. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: the complex multiplier is a pipelined single-precision floating-point arithmetic unit with a latency of 4 clock cycles, and the memory access latency of the complex multiplier is set to 6 cycles.
9. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: there are five complex multiply-accumulators, four of which compute the intermediate results in four-way parallel while the fifth computes the final result synchronously.
10. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: each complex multiply-accumulator consists of one complex multiplier and three complex adders, and the area after DC synthesis in a 40 nm CMOS process is 19993.56 μm².
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810497969.2A CN108762719B (en) | 2018-05-21 | 2018-05-21 | Parallel generalized inner product reconstruction controller |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810497969.2A CN108762719B (en) | 2018-05-21 | 2018-05-21 | Parallel generalized inner product reconstruction controller |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108762719A (en) | 2018-11-06 |
CN108762719B (en) | 2023-06-06 |
Family ID: 64004919
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810497969.2A Active CN108762719B (en) | 2018-05-21 | 2018-05-21 | Parallel generalized inner product reconstruction controller |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108762719B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110795687A (en) * | 2019-10-29 | 2020-02-14 | 南京宁麒智能计算芯片研究院有限公司 | Hierarchical segmentation system and method for autocorrelation algorithm |
CN110796193A (en) * | 2019-10-29 | 2020-02-14 | 南京宁麒智能计算芯片研究院有限公司 | Reconfigurable KNN algorithm-based hardware implementation system and method |
CN111045965A (en) * | 2019-10-25 | 2020-04-21 | 南京大学 | Hardware implementation method for multi-channel conflict-free splitting, computer equipment and readable storage medium for operating method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5276902A (en) * | 1988-11-07 | 1994-01-04 | Fujitsu Limited | Memory access system for vector data processed or to be processed by a vector processor |
CN104794002A (en) * | 2014-12-29 | 2015-07-22 | 南京大学 | Multi-channel parallel dividing method based on specific resources and hardware architecture of multi-channel parallel dividing method based on specific resources |
CN106855618A (en) * | 2017-03-06 | 2017-06-16 | 西安电子科技大学 | Based on the interference sample elimination method under broad sense inner product General Cell |
CN106940815A (en) * | 2017-02-13 | 2017-07-11 | 西安交通大学 | A kind of programmable convolutional neural networks Crypto Coprocessor IP Core |
Non-Patent Citations (1)
Title |
---|
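Zhang Duoli et al.: "High-speed implementation of the two-dimensional high-precision MUSIC algorithm", Journal of Hefei University of Technology (Natural Science Edition) * |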
Also Published As
Publication number | Publication date |
---|---|
CN108762719B (en) | 2023-06-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||