JP2016514330A5 - Google Patents
- Publication number
- JP2016514330A5 (application JP2016500848A)
- Authority
- JP
- Japan
- Prior art keywords
- vector
- radix
- accumulator
- output
- multiplication
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
- 238000009825 accumulation Methods 0.000 claims description 33
- 238000000034 method Methods 0.000 claims description 11
- 241000545442 Radix Species 0.000 claims description 3
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/798,599 US9275014B2 (en) | 2013-03-13 | 2013-03-13 | Vector processing engines having programmable data path configurations for providing multi-mode radix-2^x butterfly vector processing circuits, and related vector processors, systems, and methods |
| US13/798,599 | 2013-03-13 | ||
| PCT/US2014/021782 WO2014164298A2 (en) | 2013-03-13 | 2014-03-07 | Vector processing engines having programmable data path configurations for providing multi-mode radix-2^x butterfly vector processing circuits, and related vector processors, systems, and methods |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2016514330A (ja) | 2016-05-19 |
| JP2016514330A5 | 2017-03-16 |
Family
ID=50473780
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2016500848A Ceased JP2016514330A (ja) | 2013-03-13 | 2014-03-07 | Vector processing engines having programmable data path configurations for providing multi-mode radix-2^x butterfly vector processing circuits, and related vector processors, systems, and methods |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US9275014B2 |
| EP (1) | EP2972988A2 |
| JP (1) | JP2016514330A |
| KR (1) | KR20150132287A |
| CN (1) | CN104969215B |
| TW (1) | TWI601066B |
| WO (1) | WO2014164298A2 |
Families Citing this family (38)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9495154B2 (en) | 2013-03-13 | 2016-11-15 | Qualcomm Incorporated | Vector processing engines having programmable data path configurations for providing multi-mode vector processing, and related vector processors, systems, and methods |
| US9684509B2 (en) | 2013-11-15 | 2017-06-20 | Qualcomm Incorporated | Vector processing engines (VPEs) employing merging circuitry in data flow paths between execution units and vector data memory to provide in-flight merging of output vector data stored to vector data memory, and related vector processing instructions, systems, and methods |
| US9880845B2 (en) | 2013-11-15 | 2018-01-30 | Qualcomm Incorporated | Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format-converting of input vector data to execution units for vector processing operations, and related vector processor systems and methods |
| US9619227B2 (en) | 2013-11-15 | 2017-04-11 | Qualcomm Incorporated | Vector processing engines (VPEs) employing tapped-delay line(s) for providing precision correlation / covariance vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods |
| US9792118B2 (en) | 2013-11-15 | 2017-10-17 | Qualcomm Incorporated | Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods |
| US9977676B2 (en) | 2013-11-15 | 2018-05-22 | Qualcomm Incorporated | Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods |
| US11544214B2 (en) * | 2015-02-02 | 2023-01-03 | Optimum Semiconductor Technologies, Inc. | Monolithic vector processor configured to operate on variable length vectors using a vector length register |
| US9817791B2 (en) | 2015-04-04 | 2017-11-14 | Texas Instruments Incorporated | Low energy accelerator processor architecture with short parallel instruction word |
| US9952865B2 (en) | 2015-04-04 | 2018-04-24 | Texas Instruments Incorporated | Low energy accelerator processor architecture with short parallel instruction word and non-orthogonal register data file |
| US11847427B2 (en) | 2015-04-04 | 2023-12-19 | Texas Instruments Incorporated | Load store circuit with dedicated single or dual bit shift circuit and opcodes for low power accelerator processor |
| US10349251B2 (en) * | 2015-12-31 | 2019-07-09 | Cavium, Llc | Methods and apparatus for twiddle factor generation for use with a programmable mixed-radix DFT/IDFT processor |
| US10311018B2 (en) * | 2015-12-31 | 2019-06-04 | Cavium, Llc | Methods and apparatus for a vector subsystem for use with a programmable mixed-radix DFT/IDFT processor |
| US10503474B2 (en) | 2015-12-31 | 2019-12-10 | Texas Instruments Incorporated | Methods and instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition |
| US10210135B2 (en) * | 2015-12-31 | 2019-02-19 | Cavium, Llc | Methods and apparatus for providing a programmable mixed-radix DFT/IDFT processor using vector engines |
| CN105718424B (zh) * | 2016-01-26 | 2018-11-02 | 北京空间飞行器总体设计部 | Parallel fast Fourier transform processing method |
| GB2553783B (en) * | 2016-09-13 | 2020-11-04 | Advanced Risc Mach Ltd | Vector multiply-add instruction |
| US10401412B2 (en) | 2016-12-16 | 2019-09-03 | Texas Instruments Incorporated | Line fault signature analysis |
| US10489877B2 (en) | 2017-04-24 | 2019-11-26 | Intel Corporation | Compute optimization mechanism |
| US10331445B2 (en) | 2017-05-24 | 2019-06-25 | Microsoft Technology Licensing, Llc | Multifunction vector processor circuits |
| US11803377B2 (en) * | 2017-09-08 | 2023-10-31 | Oracle International Corporation | Efficient direct convolution using SIMD instructions |
| US10910061B2 (en) * | 2018-03-14 | 2021-02-02 | Silicon Storage Technology, Inc. | Method and apparatus for programming analog neural memory in a deep learning artificial neural network |
| WO2019232091A1 (en) * | 2018-05-29 | 2019-12-05 | Jaber Technology Holdings Us Inc. | Radix-2^3 fast fourier transform for an embedded digital signal processor |
| US11277455B2 (en) | 2018-06-07 | 2022-03-15 | Mellanox Technologies, Ltd. | Streaming system |
| US20200106828A1 (en) * | 2018-10-02 | 2020-04-02 | Mellanox Technologies, Ltd. | Parallel Computation Network Device |
| US10942985B2 (en) * | 2018-12-29 | 2021-03-09 | Intel Corporation | Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions |
| US11625393B2 (en) | 2019-02-19 | 2023-04-11 | Mellanox Technologies, Ltd. | High performance computing system |
| EP3699770B1 (en) | 2019-02-25 | 2025-05-21 | Mellanox Technologies, Ltd. | Collective communication system and methods |
| US11132198B2 (en) * | 2019-08-29 | 2021-09-28 | International Business Machines Corporation | Instruction handling for accumulation of register results in a microprocessor |
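| CN110780842A (zh) * | 2019-10-25 | 2020-02-11 | 无锡恒鼎超级计算中心有限公司 | Parallel optimization method for three-dimensional acoustic-elastic simulation computation of ships based on the Sunway architecture |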
| US12061910B2 (en) | 2019-12-05 | 2024-08-13 | International Business Machines Corporation | Dispatching multiply and accumulate operations based on accumulator register index number |
| US11750699B2 (en) | 2020-01-15 | 2023-09-05 | Mellanox Technologies, Ltd. | Small message aggregation |
| US11252027B2 (en) | 2020-01-23 | 2022-02-15 | Mellanox Technologies, Ltd. | Network element supporting flexible data reduction operations |
| US11876885B2 (en) | 2020-07-02 | 2024-01-16 | Mellanox Technologies, Ltd. | Clock queue with arming and/or self-arming features |
| US11556378B2 (en) | 2020-12-14 | 2023-01-17 | Mellanox Technologies, Ltd. | Offloading execution of a multi-task parameter-dependent operation to a network device |
| CN112800387B (zh) * | 2021-03-30 | 2021-08-03 | 芯翼信息科技(上海)有限公司 | Radix-6 butterfly operation unit, method, electronic device, and storage medium |
| US12309070B2 (en) | 2022-04-07 | 2025-05-20 | Nvidia Corporation | In-network message aggregation for efficient small message transport |
| US11922237B1 (en) | 2022-09-12 | 2024-03-05 | Mellanox Technologies, Ltd. | Single-step collective operations |
| US12489657B2 (en) | 2023-08-17 | 2025-12-02 | Mellanox Technologies, Ltd. | In-network compute operation spreading |
Family Cites Families (34)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH04252372A (ja) * | 1991-01-28 | 1992-09-08 | Nippon Telegr & Teleph Corp <Ntt> | Variable pipeline structure |
| EP0681236B1 (en) | 1994-05-05 | 2000-11-22 | Conexant Systems, Inc. | Space vector data path |
| US5805875A (en) | 1996-09-13 | 1998-09-08 | International Computer Science Institute | Vector processing system with multi-operation, run-time configurable pipelines |
| US6006245A (en) | 1996-12-20 | 1999-12-21 | Compaq Computer Corporation | Enhanced fast fourier transform technique on vector processor with operand routing and slot-selectable operation |
| JP3951071B2 (ja) * | 1997-05-02 | 2007-08-01 | ソニー株式会社 | Arithmetic device and arithmetic method |
| US6061705A (en) | 1998-01-21 | 2000-05-09 | Telefonaktiebolaget Lm Ericsson | Power and area efficient fast fourier transform processor |
| WO1999045462A1 (de) | 1998-03-03 | 1999-09-10 | Siemens Aktiengesellschaft | Data path for signal processing processors |
| JP3940542B2 (ja) | 2000-03-13 | 2007-07-04 | 株式会社ルネサステクノロジ | Data processor and data processing system |
| JP2003016051A (ja) * | 2001-06-29 | 2003-01-17 | Nec Corp | Complex vector arithmetic processor |
| US7107305B2 (en) * | 2001-10-05 | 2006-09-12 | Intel Corporation | Multiply-accumulate (MAC) unit for single-instruction/multiple-data (SIMD) instructions |
| US6986021B2 (en) * | 2001-11-30 | 2006-01-10 | Quick Silver Technology, Inc. | Apparatus, method, system and executable module for configuration and operation of adaptive integrated circuitry having fixed, application specific computational elements |
| US7159099B2 (en) | 2002-06-28 | 2007-01-02 | Motorola, Inc. | Streaming vector processor with reconfigurable interconnection switch |
| AU2003286131A1 (en) | 2002-08-07 | 2004-03-19 | Pact Xpp Technologies Ag | Method and device for processing data |
| US20040193837A1 (en) | 2003-03-31 | 2004-09-30 | Patrick Devaney | CPU datapaths and local memory that executes either vector or superscalar instructions |
| US7702712B2 (en) | 2003-12-05 | 2010-04-20 | Qualcomm Incorporated | FFT architecture and method |
| US7272751B2 (en) | 2004-01-15 | 2007-09-18 | International Business Machines Corporation | Error detection during processor idle cycles |
| KR100985110B1 (ko) | 2004-01-28 | 2010-10-05 | 삼성전자주식회사 | 4:2 CSA cell with a simple structure and 4:2 carry-save addition method |
| US7299342B2 (en) * | 2005-05-24 | 2007-11-20 | Coresonic Ab | Complex vector executing clustered SIMD micro-architecture DSP with accelerator coupled complex ALU paths each further including short multiplier/accumulator using two's complement |
| US7415595B2 (en) * | 2005-05-24 | 2008-08-19 | Coresonic Ab | Data processing without processor core intervention by chain of accelerators selectively coupled by programmable interconnect network and to memory |
| WO2007018553A1 (en) * | 2005-08-08 | 2007-02-15 | Commasic Inc. | Multi-mode wireless broadband signal processor system and method |
| US20070106718A1 (en) | 2005-11-04 | 2007-05-10 | Shum Hoi L | Fast fourier transform on a single-instruction-stream, multiple-data-stream processor |
| US8024394B2 (en) | 2006-02-06 | 2011-09-20 | Via Technologies, Inc. | Dual mode floating point multiply accumulate unit |
| US7519646B2 (en) | 2006-10-26 | 2009-04-14 | Intel Corporation | Reconfigurable SIMD vector processing system |
| US8051123B1 (en) | 2006-12-15 | 2011-11-01 | Nvidia Corporation | Multipurpose functional unit with double-precision and filtering operations |
| DE102007014808A1 (de) | 2007-03-28 | 2008-10-02 | Texas Instruments Deutschland Gmbh | Multiplier unit and multiply-and-add unit |
| US8320478B2 (en) | 2008-12-19 | 2012-11-27 | Entropic Communications, Inc. | System and method for generating a signal with a random low peak to average power ratio waveform for an orthogonal frequency division multiplexing system |
| US20110072236A1 (en) | 2009-09-20 | 2011-03-24 | Mimar Tibet | Method for efficient and parallel color space conversion in a programmable processor |
| CN102768654A (zh) | 2011-05-05 | 2012-11-07 | 中兴通讯股份有限公司 | Device with FFT radix-2 butterfly operation processing capability and method for implementing the operation |
| JP2013025468A (ja) * | 2011-07-19 | 2013-02-04 | Hitachi Advanced Digital Inc | Fast Fourier transform device |
| DE102011108576A1 (de) | 2011-07-27 | 2013-01-31 | Texas Instruments Deutschland Gmbh | Self-timed multiplier unit |
| CN102375805B (zh) | 2011-10-31 | 2014-04-02 | 中国人民解放军国防科学技术大学 | SIMD-based FFT parallel computing method for vector processors |
| CN102637124B (zh) | 2012-03-22 | 2015-09-30 | 中国电子科技集团公司第五十八研究所 | Parallel processing device and method for a radix-4 FFT algorithm |
| US20140280407A1 (en) | 2013-03-13 | 2014-09-18 | Qualcomm Incorporated | Vector processing carry-save accumulators employing redundant carry-save format to reduce carry propagation, and related vector processors, systems, and methods |
| US9495154B2 (en) | 2013-03-13 | 2016-11-15 | Qualcomm Incorporated | Vector processing engines having programmable data path configurations for providing multi-mode vector processing, and related vector processors, systems, and methods |
- 2013-03-13: US application US13/798,599 (US9275014B2), not active: Expired - Fee Related
- 2014-03-06: TW application TW103107652A (TWI601066B), not active: IP Right Cessation
- 2014-03-07: JP application JP2016500848A (JP2016514330A), not active: Ceased
- 2014-03-07: EP application EP14716472.7A (EP2972988A2), not active: Withdrawn
- 2014-03-07: CN application CN201480007022.9A (CN104969215B), not active: Expired - Fee Related
- 2014-03-07: WO application PCT/US2014/021782 (WO2014164298A2), not active: Ceased
- 2014-03-07: KR application KR1020157028376A (KR20150132287A), not active: Withdrawn
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP2016514330A5 | | |
| JP2016517570A5 | | |
| Jensen et al. | | A fast fractional difference algorithm |
| CA2711027C (en) | | Low power FIR filter in multi-MAC architecture |
| JP2016537726A5 | | |
| KR20210092751A (ko) | | Dot product calculator and operation method thereof |
| KR20140092852A (ko) | | Vector processor having an instruction set with a vector convolution function for FIR filtering |
| JP2016537722A5 | | |
| JP2016537723A5 | | |
| JP2016537724A5 | | |
| EP2490141A3 (en) | | Method of, and apparatus for, stream scheduling in parallel pipelined hardware |
| JP2018523237A (ja) | | SIMD multiplication and horizontal aggregation operations |
| Chen et al. | | High throughput energy efficient parallel FFT architecture on FPGAs |
| KR20120072226A (ko) | | Fast Fourier transformer |
| CN102799564A (zh) | | FFT parallel method based on a multi-core DSP platform |
| EP3451240A1 (en) | | Apparatus and method for performing auto-learning operation of artificial neural network |
| Revanna et al. | | A scalable FFT processor architecture for OFDM based communication systems |
| CN103677735A (zh) | | Data processing device and digital signal processor |
| Basiri et al. | | An efficient hardware based MAC design in digital filters with complex numbers |
| CN106201999B (zh) | | Mixed-radix DFT/IDFT parallel reading and computing method and device |
| Nguyen et al. | | An FPGA-based implementation of a pipelined FFT processor for high-speed signal processing applications |
| Nagaraju et al. | | Implementation of high speed and area efficient MAC unit for industrial applications |
| Krishna et al. | | Design and implementation of time-frequency distributions for real-time applications using field programmable gate array |
| Nouri et al. | | Design and evaluation of correlation accelerator in IEEE-802.11 a/g receiver using a template-based coarse-grained reconfigurable array |
| Gao et al. | | A general Markov framework for page importance computation |