CN108762719A - Parallel generalized inner product reconfigurable controller - Google Patents

Parallel generalized inner product reconfigurable controller

Info

Publication number
CN108762719A
Authority
CN
China
Prior art keywords
address
intermediate result
generalized
computing module
inner product
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810497969.2A
Other languages
Chinese (zh)
Other versions
CN108762719B (en)
Inventor
李丽
祁鹏展
鲍贤亮
宋文清
李伟
何书专
潘红兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201810497969.2A
Publication of CN108762719A
Application granted
Publication of CN108762719B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0646 Configuration or reconfiguration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Nonlinear Science (AREA)
  • Complex Calculations (AREA)
  • Logic Circuits (AREA)

Abstract

The parallel generalized inner product reconfigurable controller of the present invention comprises: an intermediate result computing module, which receives source data, computes the intermediate result vector Y_L from the source data, generates the address of vector Y_L, and stores it in a bank; each time the computation of an intermediate result vector Y_L is completed, it generates a completion signal and sends it to the final result computing module as an enable signal; a final result computing module, which reads the data into a complex multiply-accumulator to compute the L-th element Z_L of the result matrix, generates the address of Z_L, and stores it in a bank; and a storage data address processing module, which performs data selection according to the ping-pong operation selection signal and generates correct bank address signals. Beneficial effects: the computation time is short and the storage resource utilization is high, so the controller can meet the high real-time requirement for obtaining test statistics when performing non-homogeneous detection in many signal detection applications.

Description

Parallel generalized inner product reconfigurable controller
Technical field
The invention belongs to the field of non-homogeneous detection technology, and in particular relates to a parallel generalized inner product reconfigurable controller.
Background
Space-time adaptive processing (STAP) is a technique for detecting moving targets. Conventional STAP algorithms must estimate the clutter covariance matrix. When the clutter covariance matrix is estimated from secondary data, the secondary data must be independent and identically distributed (i.i.d.) to limit the performance loss.
In practical applications, the detected signal echoes are contaminated not only by natural clutter but also by man-made non-homogeneous interference, so the i.i.d. condition is often not satisfied.
To deal with interfering targets in the samples, Melvin first proposed the idea of the non-homogeneity detector (NHD), which suppresses their influence on the clutter covariance matrix estimate by rejecting the samples that contain interfering targets. The basic idea of NHD is to set a test statistic that separates the two kinds of samples, based on the difference in statistical characteristics between samples contaminated by interfering targets and the other samples.
For the choice of NHD test statistic, Gerlach et al. of the US Naval Research Laboratory proposed two criteria: the generalized inner product (GIP) and the adaptive power residue. Let X_L denote the L-th sample among the initial samples; its autocorrelation matrix is then expressed in terms of T, the clutter-plus-noise covariance matrix. With the sample covariance matrix formed from the L samples, a GIP value can be written for each sample, and interfering targets can be effectively rejected according to the GIP value of each sample.
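For reference, the GIP statistic is commonly written in the NHD literature in the form below. This is a hedged reminder only: the symbols \hat{R} and K and the superscript H (conjugate transpose) are assumptions taken from the standard formulation, not from this document, and the patent's own notation and normalization may differ.

```latex
\hat{R} = \frac{1}{K}\sum_{k=1}^{K} X_k X_k^{H},
\qquad
\mathrm{GIP}_L = X_L^{H}\,\hat{R}^{-1}\,X_L
```

Under this reading, the statistic is a vector-matrix product followed by an inner product, which is the two-stage structure (intermediate result, then final result) that the controller implements in hardware.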
The clutter suppression capability of the generalized inner product non-homogeneous detection method depends on the sample size: the larger the sample size, the more faithful the clutter covariance matrix data and the stronger the clutter suppression. When implemented in software, the method suffers from limited precision and long run times on large sample sets, and therefore struggles to meet the high real-time requirements of practical non-homogeneous detection.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings described in the background above by proposing a parallel generalized inner product reconfigurable controller that better meets the high real-time and large-point-count computing demands of practical applications. This is achieved through the following technical solution:
The parallel generalized inner product reconfigurable controller comprises:
An intermediate result computing module, which receives source data, computes the intermediate result vector Y_L from the source data, generates the address of vector Y_L, and stores it in a bank. Each time the computation of an intermediate result vector Y_L is completed, it generates a completion signal and sends it to the final result computing module as an enable signal;
A final result computing module, in which an address generator continuously generates the addresses of the elements of row X_L of matrix X and of the corresponding intermediate result vector Y_L; the data read out enter a complex multiply-accumulator to obtain the L-th element Z_L of the result matrix Z (1×N); the module then generates the address of Z_L and stores it in a bank;
A storage data address processing module, which performs data selection according to the ping-pong operation selection signal and also processes the signals from the intermediate result computing module and the final result computing module that target the same bank, generating correct bank address signals.
In a further design of the hardware implementation of the parallel generalized inner product operation, computing Y_L consists of multiply-accumulating X_L with each column of the square matrix T; the order (number of rows and columns) of T equals the number of columns of matrix X, and the multiply-accumulate process is carried out by multi-channel parallel computation.
In a further design of the hardware implementation of the parallel generalized inner product operation, the intermediate result computing module uses a four-way parallel implementation.
In a further design of the hardware implementation of the parallel generalized inner product operation, the source data of the intermediate result computing module are stored as follows: matrix T is stored by row in bank0-bank3 and, once these are full, continues by row in bank4-bank7; matrix X is stored by row in bank8-bank11.
In a further design of the hardware implementation of the parallel generalized inner product operation, the intermediate results of the intermediate result computing module are stored as follows: odd-numbered terms are stored in bank12 and even-numbered terms in bank13.
In a further design of the hardware implementation of the parallel generalized inner product operation, the intermediate result computing module computes the intermediate result as follows: in one operation, the address generator first generates the addresses of the elements of one column X_L of X and of four columns of matrix T, and the corresponding matrix element data are fetched and fed to the complex multiply-accumulators to obtain the intermediate result Y_L; the address generator then generates the intermediate result storage address, and the intermediate result is stored in a bank.
In a further design of the hardware implementation of the parallel generalized inner product operation, the final result computing module computes the final result as follows: when the final result computing module receives the intermediate-result completion signal, the address generator continuously generates the addresses of the elements of row X_L of matrix X and of the corresponding intermediate result vector Y_L; these are fed together to the complex multiply-accumulator to obtain the final result Z_L, the address generator generates the final result storage address, and the final result is stored in a bank.
In a further design of the hardware implementation of the parallel generalized inner product operation, the complex multiplier is a pipelined single-precision floating-point arithmetic unit with a latency of 4 clock cycles, and the memory access latency of the complex multiplier is set to 6 cycles.
In a further design of the hardware implementation of the parallel generalized inner product operation, there are five complex multiply-accumulators, four of which compute the intermediate results in four-way parallel while the fifth computes the final result concurrently.
In a further design of the hardware implementation of the parallel generalized inner product operation, each complex multiply-accumulator consists of one complex multiplier and three complex adders, with an area of 19993.56 μm² after DC synthesis in a 40 nm CMOS process.
Advantages of the present invention
The parallel generalized inner product reconfigurable controller provided by the invention computes one final result element immediately after each intermediate result is computed, so the time to compute Z_{L-1} is hidden within the time to compute Y_L; the computation time is short and the storage resource utilization is high. The controller can meet the high real-time requirement for obtaining test statistics when performing non-homogeneous detection in many signal detection applications.
Description of the drawings
Fig. 1 is an architecture diagram of the parallel generalized inner product reconfigurable controller.
Fig. 2 is a schematic diagram of data storage for the parallel generalized inner product.
Fig. 3 is a schematic diagram of the computation flow of the parallel generalized inner product algorithm.
Detailed description
The present invention is described in detail below with reference to the drawings and a specific embodiment.
As shown in Fig. 1, the parallel generalized inner product reconfigurable controller of this embodiment, taking four-way parallelism as an example, consists mainly of three submodules: the intermediate result computing module, the final result computing module, and the storage data address processing module. The intermediate result computing module computes the intermediate results; the final result computing module computes the final results; the storage data address processing module handles bank addresses and related signals.
The intermediate result computing module computes the intermediate result vector Y_L in a fully pipelined manner: it generates the addresses of the elements of column X_L, performs the inner-product multiply-accumulate of the elements of X_L with each column of the square matrix T (M×M) to obtain the intermediate result vector Y_L, generates the address of Y_L, and stores it in a bank. Each time the computation of a Y_L is completed, it issues a completion signal to the final result computing module, which serves as the enable signal for one computation of that module.
In the final result computing module, an address generator continuously generates the addresses of the elements of row X_L of matrix X and of the corresponding intermediate result vector Y_L; the data read out enter a complex multiply-accumulator to obtain the L-th element Z_L of the result matrix Z (1×N); the module then generates the address of Z_L and stores it in a bank.
The storage data address processing module performs data selection according to the ping-pong operation selection signal and also processes the signals from the intermediate result computing module and the final result computing module that target the same bank, generating correct bank addresses and related signals.
As shown in Fig. 1, the storage unit comprises 15 banks: matrix T is stored in bank0-7, matrix X in bank8-11, the intermediate results Y_L in bank12 and bank13, and the final parallel generalized inner product matrix in bank14. The arithmetic unit comprises 5 complex multiply-accumulators: complex multiply-accumulators 0-3 compute the intermediate results in four-way parallel, and complex multiply-accumulator 4 computes the final result at the same time.
Fig. 2 shows the data storage scheme for the parallel generalized inner product. The source data are stored as follows: matrix T is stored by row in bank0-bank3 and, once these are full, continues by row in bank4-bank7; matrix X is stored by row in bank8-bank11. This layout facilitates four-way parallel operation when computing the intermediate result Y_L and also simplifies the design of the corresponding DMA module. For the intermediate results Y_L, the odd-numbered terms Y_1, Y_3, ... are stored in bank12 (each new term overwriting the previous one), and the even-numbered terms Y_2, Y_4, ... are stored in bank13 (each new term overwriting the previous one). The final generalized inner product matrix is stored in bank14.
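The bank assignment described above can be summarized in a few lines of Python. This is a minimal sketch: the bank numbers and the odd/even rule come from the text, while the names `BANK_MAP` and `intermediate_bank` are hypothetical and the word-level address mapping inside each bank is not modeled.

```python
# Bank roles as listed in the text (15 banks in total).
BANK_MAP = {
    "T": list(range(0, 8)),    # bank0-bank7: matrix T
    "X": list(range(8, 12)),   # bank8-bank11: matrix X
    "Y": [12, 13],             # bank12/bank13: intermediate results Y_L
    "Z": [14],                 # bank14: final generalized inner product results
}


def intermediate_bank(l: int) -> int:
    """Return the bank that holds Y_l (l counted from 1, as in the text).

    Odd terms go to bank12 and even terms to bank13, so consecutive
    results alternate between the two banks (ping-pong).
    """
    if l < 1:
        raise ValueError("l is 1-based")
    return 12 if l % 2 == 1 else 13


if __name__ == "__main__":
    for l in range(1, 6):
        # While Y_l is written to one bank, Y_(l-1) can be read from the other.
        print(f"Y_{l} -> bank{intermediate_bank(l)}")
```

Alternating between two banks is what lets the final result computation read Y_{L-1} while the next intermediate result is still being written.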
As shown in Fig. 3, the parallel generalized inner product algorithm computes the intermediate result as follows: in one operation, address generator 1 first generates the addresses of the elements of one column X_L of X and of four columns of matrix T, and the corresponding matrix element data are fetched and fed to the complex multiply-accumulators to obtain the intermediate result Y_L; address generator 2 then generates the intermediate result storage address, and the intermediate result is stored in a bank.
Similarly, the algorithm computes the final result as follows: in one operation, when the module receives the intermediate-result completion signal, address generator 1 continuously generates the addresses of the elements of row X_L of matrix X and of the corresponding intermediate result vector Y_L. These are fed together to the complex multiply-accumulator to obtain the final result Z_L; address generator 2 then generates the final result storage address, and the final result is stored in a bank.
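As a software reference for the two flows above, the following sketch computes Y_L four columns of T at a time and then reduces X_L and Y_L to Z_L. It is a plain Python model under stated assumptions: X_L is treated as a length-M list of complex numbers, T as an M×M list of rows, and the function names are hypothetical. The source does not state where complex conjugation is applied, so none is applied here; any conjugation required by the GIP definition is assumed to be folded into the stored data.

```python
from typing import List

Vector = List[complex]
Matrix = List[List[complex]]

WAYS = 4  # four-way parallel intermediate computation, as in the embodiment


def intermediate_result(x_l: Vector, t: Matrix) -> Vector:
    """Y_L[j] = sum_i X_L[i] * T[i][j], produced WAYS columns per pass."""
    m = len(x_l)
    y_l = [0j] * m
    for base in range(0, m, WAYS):
        cols = range(base, min(base + WAYS, m))
        acc = {j: 0j for j in cols}      # one accumulator per parallel MAC
        for i in range(m):               # stream X_L once per group of columns
            for j in cols:
                acc[j] += x_l[i] * t[i][j]
        for j in cols:
            y_l[j] = acc[j]              # write the group back to the Y bank
    return y_l


def final_result(x_l: Vector, y_l: Vector) -> complex:
    """Z_L = sum_i X_L[i] * Y_L[i], the job of the fifth multiply-accumulator."""
    z_l = 0j
    for a, b in zip(x_l, y_l):
        z_l += a * b
    return z_l


if __name__ == "__main__":
    x = [1 + 1j, 2 + 0j]
    t = [[1 + 0j, 0j], [0j, 1 + 0j]]     # identity, so Y_L equals X_L
    y = intermediate_result(x, t)
    print(y, final_result(x, y))
```

The inner loop over `cols` stands in for the four multiply-accumulators working side by side; in hardware those four accumulations would happen in the same cycle.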
One complete computation in the hardware implementation of the parallel generalized inner product algorithm of the present invention comprises the following steps:
Step 1) Set L = 1 and start the computation from the first row of matrix X;
Step 2) Compute the intermediate result Y_L.
Computing the intermediate result Y_L comprises the following steps:
Step 2-1) According to the addresses generated by the address generator submodule, fetch the elements of X_L and (T_1 T_2 T_3 T_4) in turn and send them to the multiply-accumulate submodule for complex multiply-accumulate operations, obtaining (Y_L1 Y_L2 Y_L3 Y_L4);
Step 2-2) According to the addresses generated by the address generator submodule, write (Y_L1 Y_L2 Y_L3 Y_L4) in turn into the intermediate result bank, fetch the next group of four columns of T matrix elements together with X_L, and repeat steps 2-1) and 2-2) until the computation of Y_L is complete;
Step 3) Compute the final result Z_L. This runs concurrently with steps 1) and 2): if Y_{L-1} has been generated, fetch the elements of X_{L-1} and Y_{L-1} in turn according to the addresses generated by the address generator and perform complex multiply-accumulate to obtain Z_{L-1}, then write the final result into the final result bank according to the address generated by the address generator;
Step 4) If L < N, set L = L + 1 and jump to step 2);
Step 5) Fetch the elements of X_N and Y_N in turn and perform complex multiply-accumulate to obtain Z_N, store it in a bank, and complete the inner product operation.
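The step sequence above can be traced with a small scheduling model. The sketch below is illustrative only: it records, for each iteration, which Y is being computed and which Z is computed alongside it, following steps 1) to 5). The function name `schedule` and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple


def schedule(n: int) -> List[Tuple[Optional[int], Optional[int]]]:
    """Return (index of Y being computed, index of Z computed concurrently).

    Iteration L computes Y_L while Z_(L-1) is computed from the previous
    intermediate result; a final slot computes Z_N on its own (step 5).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    slots: List[Tuple[Optional[int], Optional[int]]] = []
    l = 1                                   # step 1
    while True:
        z_index = l - 1 if l > 1 else None  # step 3: only if Y_(L-1) exists
        slots.append((l, z_index))          # step 2 runs alongside step 3
        if l < n:                           # step 4
            l += 1
        else:
            break
    slots.append((None, n))                 # step 5: Z_N closes the run
    return slots


if __name__ == "__main__":
    for y_idx, z_idx in schedule(3):
        print(f"compute Y_{y_idx}" if y_idx else "idle", "|",
              f"compute Z_{z_idx}" if z_idx else "idle")
```

Only the first and last slots leave one side idle, which is why the cost of the final-result computation is mostly hidden for large N.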
In the parallel generalized inner product reconfigurable controller of this embodiment, the complex multipliers and complex adders are pipelined single-precision floating-point arithmetic units with a latency of 4 clock cycles, and the memory access latency is 6 cycles. With EDA simulation and synthesis tools, the operating frequency reaches 1 GHz.
The parallel generalized inner product reconfigurable controller of this embodiment uses five complex multiply-accumulators in total, four of which compute the intermediate results in four-way parallel while the fifth computes the final result concurrently. Each complex multiply-accumulator consists of one complex multiplier and three complex adders, with an area of 19993.56 μm² after DC synthesis in a 40 nm CMOS process.
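For orientation, a complex multiply-accumulate can be written out in real arithmetic as below. This is the textbook decomposition, not the internal structure of the patented unit: the source only states that each unit contains one complex multiplier and three complex adders and does not say how they are wired. The function name `complex_mac` is hypothetical.

```python
from typing import Tuple

Complex = Tuple[float, float]  # (real, imaginary)


def complex_mac(acc: Complex, a: Complex, b: Complex) -> Complex:
    """Return acc + a * b using only real multiplications and additions."""
    ar, ai = a
    br, bi = b
    # (ar + j*ai) * (br + j*bi) = (ar*br - ai*bi) + j*(ar*bi + ai*br)
    prod_re = ar * br - ai * bi
    prod_im = ar * bi + ai * br
    return (acc[0] + prod_re, acc[1] + prod_im)


if __name__ == "__main__":
    acc = (0.0, 0.0)
    for a, b in [((1.0, 2.0), (3.0, 4.0)), ((0.0, 1.0), (0.0, 1.0))]:
        acc = complex_mac(acc, a, b)
    print(acc)
```

Each call costs four real multiplications and four real additions or subtractions, which is the work a single-precision complex multiply-accumulate has to pipeline.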
The parallel generalized inner product reconfigurable controller of this embodiment computes one final result element immediately after each intermediate result is computed, so the time to compute Z_{L-1} is hidden within the time to compute Y_L. Compared with a method that computes the final results only after all intermediate results have been computed, the computation time is shorter and the storage resource utilization is higher.
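The benefit of overlapping can be estimated with a rough cycle count. The model below is an assumption-laden sketch, not a result from the source: it counts one multiply-accumulate per unit per cycle, ignores pipeline fill and memory latency, assumes N samples of length M with four-way parallel computation of Y_L, and takes a fully serial schedule as the baseline. All names are hypothetical.

```python
import math


def cycles_intermediate(m: int, ways: int = 4) -> int:
    """Cycles to compute one Y_L: ceil(M / ways) column groups of M MACs."""
    return math.ceil(m / ways) * m


def cycles_final(m: int) -> int:
    """Cycles to compute one Z_L: M MACs on the fifth unit."""
    return m


def total_overlapped(n: int, m: int, ways: int = 4) -> int:
    """Z_(L-1) runs while Y_L is computed; only the last Z_N is exposed."""
    per_slot = max(cycles_intermediate(m, ways), cycles_final(m))
    return n * per_slot + cycles_final(m)


def total_sequential(n: int, m: int, ways: int = 4) -> int:
    """Baseline: all Y_L first, then all Z_L one after another."""
    return n * cycles_intermediate(m, ways) + n * cycles_final(m)


def intermediate_storage_words(n: int, m: int, overlapped: bool) -> int:
    """Two ping-pong Y buffers when overlapped, N buffers otherwise."""
    return 2 * m if overlapped else n * m


if __name__ == "__main__":
    n, m = 64, 16
    print(total_overlapped(n, m), total_sequential(n, m))
    print(intermediate_storage_words(n, m, True),
          intermediate_storage_words(n, m, False))
```

Under these assumptions the overlapped schedule saves roughly (N - 1)·M cycles and keeps only two intermediate vectors in memory instead of N, which is one way to read the storage claim.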
The parallel generalized inner product reconfigurable controller of this embodiment features high computing speed, a flexibly variable number of points, and high storage resource utilization. It can serve digital signal processing with large data volumes, for example meeting the high real-time requirement for obtaining test statistics when performing non-homogeneous detection in real-time signal detection applications.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be defined by the claims.

Claims (10)

1. A parallel generalized inner product reconfigurable controller, characterized by comprising:
an intermediate result computing module, which receives source data, computes the intermediate result vector Y_L from the source data, generates the address of vector Y_L, and stores it in a bank; each time the computation of an intermediate result vector Y_L is completed, it generates a completion signal and sends it to the final result computing module as an enable signal;
a final result computing module, in which an address generator continuously generates the addresses of the elements of row X_L of matrix X and of the corresponding intermediate result vector Y_L; the data read out enter a complex multiply-accumulator, which computes the L-th element Z_L of the result matrix Z (1×N); the module generates the address of Z_L and stores it in a bank;
a storage data address processing module, which performs data selection according to the ping-pong operation selection signal and also processes the signals from the intermediate result computing module and the final result computing module that target the same bank, generating correct bank address signals.
2. The hardware implementation method of the parallel generalized inner product operation according to claim 1, characterized in that: computing Y_L consists of multiply-accumulating X_L with each column of the square matrix T; the order (number of rows and columns) of T equals the number of columns of matrix X, and the multiply-accumulate process is carried out by multi-channel parallel computation.
3. The parallel generalized inner product reconfigurable controller according to claim 2, characterized in that: the intermediate result computing module uses a four-way parallel implementation.
4. The parallel generalized inner product reconfigurable controller according to claim 3, characterized in that: the source data of the intermediate result computing module are stored as follows: matrix T is stored by row in bank0-bank3 and, once these are full, continues by row in bank4-bank7; matrix X is stored by row in bank8-bank11.
5. The parallel generalized inner product reconfigurable controller according to claim 3, characterized in that: the intermediate results of the intermediate result computing module are stored as follows: odd-numbered terms are stored in bank12 and even-numbered terms in bank13.
6. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: the intermediate result computing module computes the intermediate result as follows: in one operation, the address generator first generates the addresses of the elements of one column X_L of X and of four columns of matrix T, and the corresponding matrix element data are fetched and fed to the complex multiply-accumulators to obtain the intermediate result Y_L; the address generator then generates the intermediate result storage address, and the intermediate result is stored in a bank.
7. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: the final result computing module computes the final result as follows: when the final result computing module receives the intermediate-result completion signal, the address generator continuously generates the addresses of the elements of row X_L of matrix X and of the corresponding intermediate result vector Y_L; these are fed together to the complex multiply-accumulator to obtain the final result Z_L, the address generator generates the final result storage address, and the final result is stored in a bank.
8. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: the complex multiplier is a pipelined single-precision floating-point arithmetic unit with a latency of 4 clock cycles, and the memory access latency of the complex multiplier is set to 6 cycles.
9. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: there are five complex multiply-accumulators, four of which compute the intermediate results in four-way parallel while the fifth computes the final result concurrently.
10. The parallel generalized inner product reconfigurable controller according to claim 1, characterized in that: each complex multiply-accumulator consists of one complex multiplier and three complex adders, with an area of 19993.56 μm² after DC synthesis in a 40 nm CMOS process.
CN201810497969.2A 2018-05-21 2018-05-21 Parallel generalized inner product reconstruction controller Active CN108762719B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810497969.2A CN108762719B (en) 2018-05-21 2018-05-21 Parallel generalized inner product reconstruction controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810497969.2A CN108762719B (en) 2018-05-21 2018-05-21 Parallel generalized inner product reconstruction controller

Publications (2)

Publication Number Publication Date
CN108762719A (en) 2018-11-06
CN108762719B (en) 2023-06-06

Family

ID=64004919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810497969.2A Active CN108762719B (en) 2018-05-21 2018-05-21 Parallel generalized inner product reconstruction controller

Country Status (1)

Country Link
CN (1) CN108762719B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795687A (en) * 2019-10-29 2020-02-14 南京宁麒智能计算芯片研究院有限公司 Hierarchical segmentation system and method for autocorrelation algorithm
CN110796193A (en) * 2019-10-29 2020-02-14 南京宁麒智能计算芯片研究院有限公司 Reconfigurable KNN algorithm-based hardware implementation system and method
CN111045965A (en) * 2019-10-25 2020-04-21 南京大学 Hardware implementation method for multi-channel conflict-free splitting, computer equipment and readable storage medium for operating method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5276902A (en) * 1988-11-07 1994-01-04 Fujitsu Limited Memory access system for vector data processed or to be processed by a vector processor
CN104794002A (en) * 2014-12-29 2015-07-22 南京大学 Multi-channel parallel dividing method based on specific resources and hardware architecture of multi-channel parallel dividing method based on specific resources
CN106855618A (en) * 2017-03-06 2017-06-16 西安电子科技大学 Based on the interference sample elimination method under broad sense inner product General Cell
CN106940815A (en) * 2017-02-13 2017-07-11 西安交通大学 A kind of programmable convolutional neural networks Crypto Coprocessor IP Core

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张多利等: "二维高精度MUSIC算法的高速实现", 《合肥工业大学学报(自然科学版)》 *

Also Published As

Publication number Publication date
CN108762719B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
JP7474586B2 (en) Tensor Computation Data Flow Accelerator Semiconductor Circuit
Su et al. Neural network based reinforcement learning acceleration on fpga platforms
CN103955447B (en) FFT accelerator based on DSP chip
CN103543984B (en) Modified form balance throughput data path architecture for special related application
CN108762719A (en) A kind of parallel broad sense inner product reconfigurable controller
CN106445471A (en) Processor and method for executing matrix multiplication on processor
US8543633B2 (en) Modified Gram-Schmidt core implemented in a single field programmable gate array architecture
KR20180123846A (en) Logical-3d array reconfigurable accelerator for convolutional neural networks
US4769779A (en) Systolic complex multiplier
JP7087825B2 (en) Learning device and learning method
KR20190073535A (en) Hardware double buffering using special purpose operation unit
CN101847137B (en) FFT processor for realizing 2FFT-based calculation
CN102945224A (en) High-speed variable point FFT (Fast Fourier Transform) processor based on FPGA (Field-Programmable Gate Array) and processing method of high-speed variable point FFT processor
WO2018027706A1 (en) Fft processor and algorithm
CN109194307A (en) Data processing method and system
CN103543983B (en) For improving the novel data access method of the FIR operating characteristics in balance throughput data path architecture
CN102129419B (en) Based on the processor of fast fourier transform
JP7435602B2 (en) Computing equipment and computing systems
CN109446478A (en) A kind of complex covariance matrix computing system based on iteration and restructural mode
CN106021188A (en) Parallel hardware architecture and parallel computing method for floating point matrix inversion
CN103699355B (en) Variable-order pipeline serial multiply-accumulator
RU2294561C2 (en) Device for hardware realization of probability genetic algorithms
WO1992000563A1 (en) A number theory mapping generator for addressing matrix structures
Singhal et al. Efficient parallel architecture for fixed-coefficient and variable-coefficient FIR filters using distributed arithmetic
Sotiropoulos et al. A fast parallel matrix multiplication reconfigurable unit utilized in face recognitions systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant