JP2011100452A5 - Google Patents

Publication number
JP2011100452A5
Authority
JP
Japan
Prior art keywords
matrix
simd
format
segment
rows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2010243281A
Other languages
English (en)
Japanese (ja)
Other versions
JP5689282B2 (ja)
JP2011100452A (ja)
Filing date
2010-10-29
Publication date
2014-08-14
Priority claimed from US12/612,037 (US8539201B2)
Application filed
Publication of JP2011100452A
Publication of JP2011100452A5
Application granted
Publication of JP5689282B2
Expired - Fee Related (current legal status)
Anticipated expiration

JP2010243281A 2009-11-04 2010-10-29 Computer-implemented method, computer-readable storage medium, and system for transposing a matrix on a SIMD multi-core processor architecture Expired - Fee Related JP5689282B2 (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/612037 2009-11-04
US12/612,037 US8539201B2 (en) 2009-11-04 2009-11-04 Transposing array data on SIMD multi-core processor architectures

Publications (3)

Publication Number Publication Date
JP2011100452A JP2011100452A (ja) 2011-05-19
JP2011100452A5 JP2011100452A5 2014-08-14
JP5689282B2 JP5689282B2 (ja) 2015-03-25

Family

ID=43926625

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2010243281A Expired - Fee Related JP5689282B2 (ja) 2009-11-04 2010-10-29 Computer-implemented method, computer-readable storage medium, and system for transposing a matrix on a SIMD multi-core processor architecture

Country Status (4)

Country Link
US (1) US8539201B2
JP (1) JP5689282B2
KR (1) KR20110079495A
CN (1) CN102053948B

Families Citing this family (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2138937A4 (en) * 2007-04-12 2011-01-26 Nec Corp DATA PROCESSING DEVICE OF ARRAY PROCESSOR TYPE
US8484276B2 (en) * 2009-03-18 2013-07-09 International Business Machines Corporation Processing array data on SIMD multi-core processor architectures
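JP5760532B2 (ja) * 2011-03-14 2015-08-12 Ricoh Co., Ltd. Processor device and operation method thereof
JP6078923B2 (ja) * 2011-10-14 2017-02-15 Panasonic Intellectual Property Management Co., Ltd. Transposition operation device, integrated circuit thereof, and transposition processing method
SG10201604445RA (en) * 2011-12-01 2016-07-28 Univ Singapore Polymorphic heterogeneous multi-core architecture
CN102521209B (zh) * 2011-12-12 2015-03-11 Inspur Electronic Information Industry Co., Ltd. Design method for a parallel multiprocessor computer
CN108681465B (zh) * 2011-12-22 2022-08-02 Intel Corporation Processor, processor core, and system for generating integer sequences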
US20140164733A1 (en) * 2011-12-30 2014-06-12 Ashish Jha Transpose instruction
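KR101893796B1 (ko) 2012-08-16 2018-10-04 Samsung Electronics Co., Ltd. Method and apparatus for dynamic data configuration
CN102929724B (zh) * 2012-11-06 2016-04-13 Wuxi Jiangnan Institute of Computing Technology Multi-level memory access method and discrete memory access method based on heterogeneous many-core processors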
US9412063B2 (en) 2013-12-27 2016-08-09 International Business Machines Corporation Transform architecture for multiple neurosynaptic core circuits
US9406015B2 (en) 2013-12-27 2016-08-02 International Business Machines Corporation Transform for a neurosynaptic core circuit
TWI570573B (zh) * 2014-07-08 2017-02-11 Industrial Technology Research Institute Matrix transposition circuit
SE539721C2 (en) * 2014-07-09 2017-11-07 Device and method for performing a Fourier transform on a three dimensional data set
KR102452945B1 (ko) * 2015-08-27 2022-10-11 Samsung Electronics Co., Ltd. Method and apparatus for performing Fourier transform
US10635909B2 (en) * 2015-12-30 2020-04-28 Texas Instruments Incorporated Vehicle control with efficient iterative triangulation
US10095445B2 (en) * 2016-03-29 2018-10-09 Western Digital Technologies, Inc. Systems and methods for offloading processing from a host to storage processing units using an interconnect network
US10275243B2 (en) 2016-07-02 2019-04-30 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
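KR102526754B1 (ko) 2016-07-13 2023-04-27 Samsung Electronics Co., Ltd. Method and apparatus for processing three-dimensional images
KR102654862B1 (ko) * 2016-08-31 2024-04-05 Samsung Electronics Co., Ltd. Image processing method and apparatus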
EP3998539B1 (en) * 2016-12-30 2025-03-26 INTEL Corporation Deep learning hardware
US11748625B2 (en) 2016-12-30 2023-09-05 Intel Corporation Distributed convolution for neural networks
US10169296B2 (en) * 2016-12-30 2019-01-01 Intel Corporation Distributed matrix multiplication for neural networks
US11163565B2 (en) 2017-03-20 2021-11-02 Intel Corporation Systems, methods, and apparatuses for dot production operations
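CN107168683B (zh) * 2017-05-05 2020-06-09 Institute of Software, Chinese Academy of Sciences High-performance implementation method for GEMM dense matrix multiplication on the Sunway 26010 many-core CPU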
WO2019009870A1 (en) 2017-07-01 2019-01-10 Intel Corporation SAVE BACKGROUND TO VARIABLE BACKUP STATUS SIZE
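KR102494412B1 (ko) * 2017-11-28 2023-02-03 Samsung Electronics Co., Ltd. Electronic device for performing frequency conversion of image data using SIMD operations, and operating method of the electronic device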
US11669326B2 (en) 2017-12-29 2023-06-06 Intel Corporation Systems, methods, and apparatuses for dot product operations
US11809869B2 (en) 2017-12-29 2023-11-07 Intel Corporation Systems and methods to store a tile register pair to memory
US11816483B2 (en) 2017-12-29 2023-11-14 Intel Corporation Systems, methods, and apparatuses for matrix operations
US11789729B2 (en) 2017-12-29 2023-10-17 Intel Corporation Systems and methods for computing dot products of nibbles in two tile operands
US11093247B2 (en) 2017-12-29 2021-08-17 Intel Corporation Systems and methods to load a tile register pair
US11023235B2 (en) 2017-12-29 2021-06-01 Intel Corporation Systems and methods to zero a tile register pair
US10664287B2 (en) 2018-03-30 2020-05-26 Intel Corporation Systems and methods for implementing chained tile operations
US11093579B2 (en) 2018-09-05 2021-08-17 Intel Corporation FP16-S7E8 mixed precision for deep learning and other algorithms
US10970076B2 (en) 2018-09-14 2021-04-06 Intel Corporation Systems and methods for performing instructions specifying ternary tile logic operations
US11579883B2 (en) 2018-09-14 2023-02-14 Intel Corporation Systems and methods for performing horizontal tile operations
US10990396B2 (en) 2018-09-27 2021-04-27 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US10719323B2 (en) 2018-09-27 2020-07-21 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US10866786B2 (en) 2018-09-27 2020-12-15 Intel Corporation Systems and methods for performing instructions to transpose rectangular tiles
US10896043B2 (en) 2018-09-28 2021-01-19 Intel Corporation Systems for performing instructions for fast element unpacking into 2-dimensional registers
US10963256B2 (en) 2018-09-28 2021-03-30 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US10929143B2 (en) 2018-09-28 2021-02-23 Intel Corporation Method and apparatus for efficient matrix alignment in a systolic array
US10963246B2 (en) 2018-11-09 2021-03-30 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
CN111338974B (zh) * 2018-12-19 2025-05-16 Advanced Micro Devices, Inc. Tiling algorithm for a matrix math instruction set
US10929503B2 (en) 2018-12-21 2021-02-23 Intel Corporation Apparatus and method for a masked multiply instruction to support neural network pruning operations
US11294671B2 (en) 2018-12-26 2022-04-05 Intel Corporation Systems and methods for performing duplicate detection instructions on 2D data
US11886875B2 (en) 2018-12-26 2024-01-30 Intel Corporation Systems and methods for performing nibble-sized operations on matrix elements
US20200210517A1 (en) 2018-12-27 2020-07-02 Intel Corporation Systems and methods to accelerate multiplication of sparse matrices
US10942985B2 (en) 2018-12-29 2021-03-09 Intel Corporation Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US10922077B2 (en) 2018-12-29 2021-02-16 Intel Corporation Apparatuses, methods, and systems for stencil configuration and computation instructions
US10853559B2 (en) 2019-03-27 2020-12-01 Charter Communications Operating, Llc Symmetric text replacement
US11016731B2 (en) 2019-03-29 2021-05-25 Intel Corporation Using Fuzzy-Jbit location of floating-point multiply-accumulate results
US11269630B2 (en) 2019-03-29 2022-03-08 Intel Corporation Interleaved pipeline of floating-point adders
US11175891B2 (en) 2019-03-30 2021-11-16 Intel Corporation Systems and methods to perform floating-point addition with selected rounding
US10990397B2 (en) 2019-03-30 2021-04-27 Intel Corporation Apparatuses, methods, and systems for transpose instructions of a matrix operations accelerator
US11403097B2 (en) 2019-06-26 2022-08-02 Intel Corporation Systems and methods to skip inconsequential matrix operations
US11334647B2 (en) 2019-06-29 2022-05-17 Intel Corporation Apparatuses, methods, and systems for enhanced matrix multiplier architecture
US11714875B2 (en) 2019-12-28 2023-08-01 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
CN111444134A (zh) * 2020-03-24 2020-07-24 Shandong University Acceleration and optimization method and system for parallel PME in molecular dynamics simulation software
US11593454B2 (en) * 2020-06-02 2023-02-28 Intel Corporation Matrix operation optimization mechanism
US12112167B2 (en) 2020-06-27 2024-10-08 Intel Corporation Matrix data scatter and gather between rows and irregularly spaced memory locations
US11972230B2 (en) 2020-06-27 2024-04-30 Intel Corporation Matrix transpose and multiply
US11941395B2 (en) 2020-09-26 2024-03-26 Intel Corporation Apparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions
CN112433760B (zh) * 2020-11-27 2022-09-23 Hygon Information Technology Co., Ltd. Data sorting method and data sorting circuit
US12474928B2 (en) 2020-12-22 2025-11-18 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from strided data element positions in a first dimension from three source two-dimensional arrays in a result two-dimensional array
US12001887B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for aligning tiles of a matrix operations accelerator
US12001385B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator
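KR102909878B1 (ko) 2021-06-01 2026-01-08 SK hynix Inc. Memory device, semiconductor system, and data processing system
KR102527829B1 (ko) * 2021-08-19 2023-04-28 Korea University of Technology and Education Industry-University Cooperation Foundation Matrix-transpose-based 2D-FFT computation device using a CPU and a GPU, and data computation method using the same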
US20240020129A1 (en) * 2022-07-14 2024-01-18 Nxp Usa, Inc. Self-Ordering Fast Fourier Transform For Single Instruction Multiple Data Engines
CN120677468A (zh) * 2023-02-22 2025-09-19 Denso Corporation Arithmetic processing device
US20250208870A1 (en) * 2023-12-20 2025-06-26 Sony Interactive Entertainment Inc. Speeding Up Memory Access

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7197625B1 (en) * 1997-10-09 2007-03-27 Mips Technologies, Inc. Alignment and ordering of vector elements for single instruction multiple data processing
US6243730B1 (en) * 1999-05-04 2001-06-05 Sony Electronics, Inc. Methods and systems for performing short integer chen IDCT algorithm with fused multiply/add
US6625721B1 (en) * 1999-07-26 2003-09-23 Intel Corporation Registers for 2-D matrix processing
US20030084081A1 (en) * 2001-10-27 2003-05-01 Bedros Hanounik Method and apparatus for transposing a two dimensional array
US6963341B1 (en) * 2002-06-03 2005-11-08 Tibet MIMAR Fast and flexible scan conversion and matrix transpose in a SIMD processor
US7386703B2 (en) * 2003-11-18 2008-06-10 International Business Machines Corporation Two dimensional addressing of a matrix-vector register array
US20070106718A1 (en) * 2005-11-04 2007-05-10 Shum Hoi L Fast fourier transform on a single-instruction-stream, multiple-data-stream processor
US7937567B1 (en) * 2006-11-01 2011-05-03 Nvidia Corporation Methods for scalably exploiting parallelism in a parallel processing system
US7979672B2 (en) * 2008-07-25 2011-07-12 International Business Machines Corporation Multi-core processors for 3D array transposition by logically retrieving in-place physically transposed sub-array data
US8484276B2 (en) * 2009-03-18 2013-07-09 International Business Machines Corporation Processing array data on SIMD multi-core processor architectures

Similar Documents

Publication Publication Date Title
JP2011100452A5
JP5689282B2 (ja) Computer-implemented method, computer-readable storage medium, and system for transposing a matrix on a SIMD multi-core processor architecture
US11704548B2 (en) Multicast network and memory transfer optimizations for neural network hardware acceleration
CN102375805B (zh) SIMD-based FFT parallel computing method for vector processors
US9582474B2 (en) Method and apparatus for performing a FFT computation
CN105190542B (zh) Method for providing a scalable computing structure, computing device, and printing device
US20180121388A1 (en) Symmetric block sparse matrix-vector multiplication
TW201907397A (zh) Permuting in a matrix-vector processor
US7640284B1 (en) Bit reversal methods for a parallel processor
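JP6429428B2 (ja) Two-dimensional discrete Fourier transform with simultaneous edge artifact removal for real-time applications
CN117633418A (zh) Multi-dimensional fast Fourier transform acceleration method based on matrix operations
CN110727911A (zh) Matrix operation method and device, storage medium, and terminal
JP5706754B2 (ja) Data processing device and data processing method
CN109416755B (zh) Artificial intelligence parallel processing method and device, readable storage medium, and terminal
CN117649473A (zh) Multi-graphics-queue rendering method, device, and storage medium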
US9098449B2 (en) FFT accelerator
US10223763B2 (en) Apparatus and method for performing fourier transform
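WO2013097235A1 (zh) Parallel bit reversal device and method
CN111368250B (zh) Data processing system, method, and device based on Fourier transform/inverse transform
JP5654373B2 (ja) Arithmetic device, arithmetic method, and program
CN115562850A (zh) Data processing method, apparatus, device, and storage medium
TW201017529A (en) Computation and addressing method of a general sized memory-based FFT processor
JP2019530091A5
CN120067503B (zh) Method, apparatus, and computing device for performing FFT computation
CN114282160B (zh) Data processing device, integrated circuit chip, equipment, and method implemented thereby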