JP2016514330A5 - Google Patents

Publication number
JP2016514330A5
Authority
JP
Japan
Prior art keywords
vector
radix
accumulator
output
multiplication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
JP2016500848A
Other languages
English (en)
Japanese (ja)
Other versions
JP2016514330A (ja)
Filing date
Publication date
Priority claimed from US13/798,599 (US9275014B2)
Application filed
Publication of JP2016514330A
Publication of JP2016514330A5
Current legal status: Ceased

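JP2016500848A 2013-03-13 2014-03-07 Vector processing engines having programmable data path configurations for providing multi-mode radix-2^X butterfly vector processing circuits, and related vector processors, systems, and methods Ceased JP2016514330A (ja)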

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/798,599 US9275014B2 (en) 2013-03-13 2013-03-13 Vector processing engines having programmable data path configurations for providing multi-mode radix-2^X butterfly vector processing circuits, and related vector processors, systems, and methods
US13/798,599 2013-03-13
PCT/US2014/021782 WO2014164298A2 (en) 2013-03-13 2014-03-07 Vector processing engines having programmable data path configurations for providing multi-mode radix-2^X butterfly vector processing circuits, and related vector processors, systems, and methods

Publications (2)

Publication Number Publication Date
JP2016514330A (ja) 2016-05-19
JP2016514330A5 2017-03-16

Family

ID=50473780

Family Applications (1)

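Application Number Status Publication Priority Date Filing Date Title
JP2016500848A Ceased JP2016514330A (ja) 2013-03-13 2014-03-07 Vector processing engines having programmable data path configurations for providing multi-mode radix-2^X butterfly vector processing circuits, and related vector processors, systems, and methods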

Country Status (7)

Country Link
US (1) US9275014B2
EP (1) EP2972988A2
JP (1) JP2016514330A
KR (1) KR20150132287A
CN (1) CN104969215B
TW (1) TWI601066B
WO (1) WO2014164298A2

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9495154B2 (en) 2013-03-13 2016-11-15 Qualcomm Incorporated Vector processing engines having programmable data path configurations for providing multi-mode vector processing, and related vector processors, systems, and methods
US9684509B2 (en) 2013-11-15 2017-06-20 Qualcomm Incorporated Vector processing engines (VPEs) employing merging circuitry in data flow paths between execution units and vector data memory to provide in-flight merging of output vector data stored to vector data memory, and related vector processing instructions, systems, and methods
US9880845B2 (en) 2013-11-15 2018-01-30 Qualcomm Incorporated Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format-converting of input vector data to execution units for vector processing operations, and related vector processor systems and methods
US9619227B2 (en) 2013-11-15 2017-04-11 Qualcomm Incorporated Vector processing engines (VPEs) employing tapped-delay line(s) for providing precision correlation / covariance vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods
US9792118B2 (en) 2013-11-15 2017-10-17 Qualcomm Incorporated Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods
US9977676B2 (en) 2013-11-15 2018-05-22 Qualcomm Incorporated Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods
US11544214B2 (en) * 2015-02-02 2023-01-03 Optimum Semiconductor Technologies, Inc. Monolithic vector processor configured to operate on variable length vectors using a vector length register
US9817791B2 (en) 2015-04-04 2017-11-14 Texas Instruments Incorporated Low energy accelerator processor architecture with short parallel instruction word
US9952865B2 (en) 2015-04-04 2018-04-24 Texas Instruments Incorporated Low energy accelerator processor architecture with short parallel instruction word and non-orthogonal register data file
US11847427B2 (en) 2015-04-04 2023-12-19 Texas Instruments Incorporated Load store circuit with dedicated single or dual bit shift circuit and opcodes for low power accelerator processor
US10349251B2 (en) * 2015-12-31 2019-07-09 Cavium, Llc Methods and apparatus for twiddle factor generation for use with a programmable mixed-radix DFT/IDFT processor
US10311018B2 (en) * 2015-12-31 2019-06-04 Cavium, Llc Methods and apparatus for a vector subsystem for use with a programmable mixed-radix DFT/IDFT processor
US10503474B2 (en) 2015-12-31 2019-12-10 Texas Instruments Incorporated Methods and instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition
US10210135B2 (en) * 2015-12-31 2019-02-19 Cavium, Llc Methods and apparatus for providing a programmable mixed-radix DFT/IDFT processor using vector engines
CN105718424B (zh) * 2016-01-26 2018-11-02 北京空间飞行器总体设计部 A parallel fast Fourier transform processing method
GB2553783B (en) * 2016-09-13 2020-11-04 Advanced Risc Mach Ltd Vector multiply-add instruction
US10401412B2 (en) 2016-12-16 2019-09-03 Texas Instruments Incorporated Line fault signature analysis
US10489877B2 (en) 2017-04-24 2019-11-26 Intel Corporation Compute optimization mechanism
US10331445B2 (en) 2017-05-24 2019-06-25 Microsoft Technology Licensing, Llc Multifunction vector processor circuits
US11803377B2 (en) * 2017-09-08 2023-10-31 Oracle International Corporation Efficient direct convolution using SIMD instructions
US10910061B2 (en) * 2018-03-14 2021-02-02 Silicon Storage Technology, Inc. Method and apparatus for programming analog neural memory in a deep learning artificial neural network
WO2019232091A1 (en) * 2018-05-29 2019-12-05 Jaber Technology Holdings Us Inc. Radix-2^3 fast Fourier transform for an embedded digital signal processor
US11277455B2 (en) 2018-06-07 2022-03-15 Mellanox Technologies, Ltd. Streaming system
US20200106828A1 (en) * 2018-10-02 2020-04-02 Mellanox Technologies, Ltd. Parallel Computation Network Device
US10942985B2 (en) * 2018-12-29 2021-03-09 Intel Corporation Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US11625393B2 (en) 2019-02-19 2023-04-11 Mellanox Technologies, Ltd. High performance computing system
EP3699770B1 (en) 2019-02-25 2025-05-21 Mellanox Technologies, Ltd. Collective communication system and methods
US11132198B2 (en) * 2019-08-29 2021-09-28 International Business Machines Corporation Instruction handling for accumulation of register results in a microprocessor
CN110780842A (zh) * 2019-10-25 2020-02-11 无锡恒鼎超级计算中心有限公司 Parallel optimization method for three-dimensional acoustoelastic simulation of ships based on the Sunway (神威) architecture
US12061910B2 (en) 2019-12-05 2024-08-13 International Business Machines Corporation Dispatching multiply and accumulate operations based on accumulator register index number
US11750699B2 (en) 2020-01-15 2023-09-05 Mellanox Technologies, Ltd. Small message aggregation
US11252027B2 (en) 2020-01-23 2022-02-15 Mellanox Technologies, Ltd. Network element supporting flexible data reduction operations
US11876885B2 (en) 2020-07-02 2024-01-16 Mellanox Technologies, Ltd. Clock queue with arming and/or self-arming features
US11556378B2 (en) 2020-12-14 2023-01-17 Mellanox Technologies, Ltd. Offloading execution of a multi-task parameter-dependent operation to a network device
CN112800387B (zh) * 2021-03-30 2021-08-03 芯翼信息科技(上海)有限公司 Radix-6 butterfly operation unit, method, electronic device, and storage medium
US12309070B2 (en) 2022-04-07 2025-05-20 Nvidia Corporation In-network message aggregation for efficient small message transport
US11922237B1 (en) 2022-09-12 2024-03-05 Mellanox Technologies, Ltd. Single-step collective operations
US12489657B2 (en) 2023-08-17 2025-12-02 Mellanox Technologies, Ltd. In-network compute operation spreading

Family Cites Families (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04252372A (ja) * 1991-01-28 1992-09-08 Nippon Telegr & Teleph Corp <Ntt> Variable pipeline structure
EP0681236B1 (en) 1994-05-05 2000-11-22 Conexant Systems, Inc. Space vector data path
US5805875A (en) 1996-09-13 1998-09-08 International Computer Science Institute Vector processing system with multi-operation, run-time configurable pipelines
US6006245A (en) 1996-12-20 1999-12-21 Compaq Computer Corporation Enhanced fast fourier transform technique on vector processor with operand routing and slot-selectable operation
JP3951071B2 (ja) * 1997-05-02 2007-08-01 ソニー株式会社 Arithmetic device and arithmetic method
US6061705A (en) 1998-01-21 2000-05-09 Telefonaktiebolaget Lm Ericsson Power and area efficient fast fourier transform processor
WO1999045462A1 (de) 1998-03-03 1999-09-10 Siemens Aktiengesellschaft Data path for signal processing processors
JP3940542B2 (ja) 2000-03-13 2007-07-04 株式会社ルネサステクノロジ Data processor and data processing system
JP2003016051A (ja) * 2001-06-29 2003-01-17 Nec Corp Complex vector arithmetic processor
US7107305B2 (en) * 2001-10-05 2006-09-12 Intel Corporation Multiply-accumulate (MAC) unit for single-instruction/multiple-data (SIMD) instructions
US6986021B2 (en) * 2001-11-30 2006-01-10 Quick Silver Technology, Inc. Apparatus, method, system and executable module for configuration and operation of adaptive integrated circuitry having fixed, application specific computational elements
US7159099B2 (en) 2002-06-28 2007-01-02 Motorola, Inc. Streaming vector processor with reconfigurable interconnection switch
AU2003286131A1 (en) 2002-08-07 2004-03-19 Pact Xpp Technologies Ag Method and device for processing data
US20040193837A1 (en) 2003-03-31 2004-09-30 Patrick Devaney CPU datapaths and local memory that executes either vector or superscalar instructions
US7702712B2 (en) 2003-12-05 2010-04-20 Qualcomm Incorporated FFT architecture and method
US7272751B2 (en) 2004-01-15 2007-09-18 International Business Machines Corporation Error detection during processor idle cycles
KR100985110B1 (ko) 2004-01-28 2010-10-05 삼성전자주식회사 4:2 CSA cell with a simple structure and 4:2 carry-save addition method
US7299342B2 (en) * 2005-05-24 2007-11-20 Coresonic Ab Complex vector executing clustered SIMD micro-architecture DSP with accelerator coupled complex ALU paths each further including short multiplier/accumulator using two's complement
US7415595B2 (en) * 2005-05-24 2008-08-19 Coresonic Ab Data processing without processor core intervention by chain of accelerators selectively coupled by programmable interconnect network and to memory
WO2007018553A1 (en) * 2005-08-08 2007-02-15 Commasic Inc. Multi-mode wireless broadband signal processor system and method
US20070106718A1 (en) 2005-11-04 2007-05-10 Shum Hoi L Fast fourier transform on a single-instruction-stream, multiple-data-stream processor
US8024394B2 (en) 2006-02-06 2011-09-20 Via Technologies, Inc. Dual mode floating point multiply accumulate unit
US7519646B2 (en) 2006-10-26 2009-04-14 Intel Corporation Reconfigurable SIMD vector processing system
US8051123B1 (en) 2006-12-15 2011-11-01 Nvidia Corporation Multipurpose functional unit with double-precision and filtering operations
DE102007014808A1 (de) 2007-03-28 2008-10-02 Texas Instruments Deutschland Gmbh Multiply unit and multiply-and-add unit
US8320478B2 (en) 2008-12-19 2012-11-27 Entropic Communications, Inc. System and method for generating a signal with a random low peak to average power ratio waveform for an orthogonal frequency division multiplexing system
US20110072236A1 (en) 2009-09-20 2011-03-24 Mimar Tibet Method for efficient and parallel color space conversion in a programmable processor
CN102768654A (zh) 2011-05-05 2012-11-07 中兴通讯股份有限公司 Device having FFT radix-2 butterfly operation processing capability and method for implementing the operation
JP2013025468A (ja) * 2011-07-19 2013-02-04 Hitachi Advanced Digital Inc Fast Fourier transform device
DE102011108576A1 (de) 2011-07-27 2013-01-31 Texas Instruments Deutschland Gmbh Self-timed multiplier unit
CN102375805B (zh) 2011-10-31 2014-04-02 中国人民解放军国防科学技术大学 SIMD-based FFT parallel computation method for vector processors
CN102637124B (zh) 2012-03-22 2015-09-30 中国电子科技集团公司第五十八研究所 Parallel processing device and method for the radix-4 FFT algorithm
US20140280407A1 (en) 2013-03-13 2014-09-18 Qualcomm Incorporated Vector processing carry-save accumulators employing redundant carry-save format to reduce carry propagation, and related vector processors, systems, and methods
US9495154B2 (en) 2013-03-13 2016-11-15 Qualcomm Incorporated Vector processing engines having programmable data path configurations for providing multi-mode vector processing, and related vector processors, systems, and methods

Similar Documents

Publication Publication Date Title
JP2016514330A5
JP2016517570A5
Jensen et al. A fast fractional difference algorithm
CA2711027C (en) Low power fir filter in multi-mac architecture
JP2016537726A5
KR20210092751A (ko) Dot product calculator and operation method thereof
KR20140092852A (ko) Vector processor having an instruction set with a vector convolution function for FIR filtering
JP2016537722A5
JP2016537723A5
JP2016537724A5
EP2490141A3 (en) Method of, and apparatus for, stream scheduling in parallel pipelined hardware
JP2018523237A (ja) SIMD multiply and horizontal reduction operations
Chen et al. High throughput energy efficient parallel FFT architecture on FPGAs
KR20120072226A (ko) Fast Fourier transformer
CN102799564A (zh) FFT parallel method based on a multi-core DSP platform
EP3451240A1 (en) Apparatus and method for performing auto-learning operation of artificial neural network
Revanna et al. A scalable FFT processor architecture for OFDM based communication systems
CN103677735A (zh) Data processing device and digital signal processor
Basiri et al. An efficient hardware based MAC design in digital filters with complex numbers
CN106201999B (zh) Mixed-radix DFT/IDFT parallel reading and computation method and device
Nguyen et al. An FPGA-based implementation of a pipelined FFT processor for high-speed signal processing applications
Nagaraju et al. Implementation of high speed and area efficient MAC unit for industrial applications
Krishna et al. Design and implementation of time-frequency distributions for real-time applications using field programmable gate array
Nouri et al. Design and evaluation of correlation accelerator in IEEE-802.11 a/g receiver using a template-based coarse-grained reconfigurable array
Gao et al. A general markov framework for page importance computation