KR20110079495A - Transposing array data on SIMD multi-core processor architectures - Google Patents

Transposing array data on SIMD multi-core processor architectures

Info

Publication number
KR20110079495A
Authority
KR
South Korea
Prior art keywords
matrix
simd
format
rows
transposed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
KR1020100108204A
Other languages
English (en)
Korean (ko)
Inventor
제프리 에스 맥앨리스터
넬슨 라미레즈
티모시 제이 멀린스
마크 브랜스포드
Original Assignee
International Business Machines Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2009-11-04
Filing date
2010-11-02
Publication date
2011-07-07
Application filed by International Business Machines Corporation
Publication of KR20110079495A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G06F7/78 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor
    • G06F7/785 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor having a sequence of storage locations each being individually accessible for both enqueue and dequeue operations, e.g. using a RAM
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G06F17/142 Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/38 Indexing scheme relating to groups G06F7/38 - G06F7/575
    • G06F2207/3804 Details
    • G06F2207/3808 Details concerning the type of numbers or the way they are handled
    • G06F2207/3828 Multigauge devices, i.e. capable of handling packed numbers without unpacking them

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Discrete Mathematics (AREA)
  • Complex Calculations (AREA)
  • Image Processing (AREA)
KR1020100108204A 2009-11-04 2010-11-02 Transposing array data on SIMD multi-core processor architectures Ceased KR20110079495A (ko)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/612,037 US8539201B2 (en) 2009-11-04 2009-11-04 Transposing array data on SIMD multi-core processor architectures
US12/612,037 2009-11-04

Publications (1)

Publication Number Publication Date
KR20110079495A (ko) 2011-07-07

Family

ID=43926625

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020100108204A Ceased KR20110079495A (ko) 2009-11-04 2010-11-02 Transposing array data on SIMD multi-core processor architectures

Country Status (4)

Country Link
US (1) US8539201B2
JP (1) JP5689282B2
KR (1) KR20110079495A
CN (1) CN102053948B

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342282B2 (en) 2012-08-16 2016-05-17 Samsung Electronics Co., Ltd. Method and apparatus for dynamic data configuration
KR20170025097A (ko) * 2015-08-27 2017-03-08 Samsung Electronics Co., Ltd. Method and apparatus for performing Fourier transform
WO2018174926A1 (en) * 2017-03-20 2018-09-27 Intel Corporation Systems, methods, and apparatuses for tile transpose
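KR20190061981A (ko) * 2017-11-28 2019-06-05 Samsung Electronics Co., Ltd. Electronic device for performing frequency transform of image data using SIMD operations, and method of operating the electronic device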
US10866786B2 (en) 2018-09-27 2020-12-15 Intel Corporation Systems and methods for performing instructions to transpose rectangular tiles
US10896043B2 (en) 2018-09-28 2021-01-19 Intel Corporation Systems for performing instructions for fast element unpacking into 2-dimensional registers
US10922077B2 (en) 2018-12-29 2021-02-16 Intel Corporation Apparatuses, methods, and systems for stencil configuration and computation instructions
US10929143B2 (en) 2018-09-28 2021-02-23 Intel Corporation Method and apparatus for efficient matrix alignment in a systolic array
US10929503B2 (en) 2018-12-21 2021-02-23 Intel Corporation Apparatus and method for a masked multiply instruction to support neural network pruning operations
US10942985B2 (en) 2018-12-29 2021-03-09 Intel Corporation Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US10963256B2 (en) 2018-09-28 2021-03-30 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US10963246B2 (en) 2018-11-09 2021-03-30 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
US10970076B2 (en) 2018-09-14 2021-04-06 Intel Corporation Systems and methods for performing instructions specifying ternary tile logic operations
US10990396B2 (en) 2018-09-27 2021-04-27 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US10990397B2 (en) 2019-03-30 2021-04-27 Intel Corporation Apparatuses, methods, and systems for transpose instructions of a matrix operations accelerator
US11016731B2 (en) 2019-03-29 2021-05-25 Intel Corporation Using Fuzzy-Jbit location of floating-point multiply-accumulate results
US11023235B2 (en) 2017-12-29 2021-06-01 Intel Corporation Systems and methods to zero a tile register pair
US11048508B2 (en) 2016-07-02 2021-06-29 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US11093247B2 (en) 2017-12-29 2021-08-17 Intel Corporation Systems and methods to load a tile register pair
US11093579B2 (en) 2018-09-05 2021-08-17 Intel Corporation FP16-S7E8 mixed precision for deep learning and other algorithms
US11175891B2 (en) 2019-03-30 2021-11-16 Intel Corporation Systems and methods to perform floating-point addition with selected rounding
US11249761B2 (en) 2018-09-27 2022-02-15 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US11269630B2 (en) 2019-03-29 2022-03-08 Intel Corporation Interleaved pipeline of floating-point adders
US11275588B2 (en) 2017-07-01 2022-03-15 Intel Corporation Context save with variable save state size
US11294671B2 (en) 2018-12-26 2022-04-05 Intel Corporation Systems and methods for performing duplicate detection instructions on 2D data
US11334647B2 (en) 2019-06-29 2022-05-17 Intel Corporation Apparatuses, methods, and systems for enhanced matrix multiplier architecture
US11403097B2 (en) 2019-06-26 2022-08-02 Intel Corporation Systems and methods to skip inconsequential matrix operations
US11416260B2 (en) 2018-03-30 2022-08-16 Intel Corporation Systems and methods for implementing chained tile operations
US11579883B2 (en) 2018-09-14 2023-02-14 Intel Corporation Systems and methods for performing horizontal tile operations
US11669326B2 (en) 2017-12-29 2023-06-06 Intel Corporation Systems, methods, and apparatuses for dot product operations
US11699468B2 (en) 2021-06-01 2023-07-11 SK Hynix Inc. Memory device, semiconductor system, and data processing system
US11714875B2 (en) 2019-12-28 2023-08-01 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US11789729B2 (en) 2017-12-29 2023-10-17 Intel Corporation Systems and methods for computing dot products of nibbles in two tile operands
US11809869B2 (en) 2017-12-29 2023-11-07 Intel Corporation Systems and methods to store a tile register pair to memory
US11816483B2 (en) 2017-12-29 2023-11-14 Intel Corporation Systems, methods, and apparatuses for matrix operations
US11847185B2 (en) 2018-12-27 2023-12-19 Intel Corporation Systems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements
US11886875B2 (en) 2018-12-26 2024-01-30 Intel Corporation Systems and methods for performing nibble-sized operations on matrix elements
US11941395B2 (en) 2020-09-26 2024-03-26 Intel Corporation Apparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions
US11972230B2 (en) 2020-06-27 2024-04-30 Intel Corporation Matrix transpose and multiply
US12001385B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator
US12001887B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for aligning tiles of a matrix operations accelerator
US12112167B2 (en) 2020-06-27 2024-10-08 Intel Corporation Matrix data scatter and gather between rows and irregularly spaced memory locations
US12474928B2 (en) 2020-12-22 2025-11-18 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from strided data element positions in a first dimension from three source two-dimensional arrays in a result two-dimensional array

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008129900A1 (ja) * 2007-04-12 2008-10-30 Nec Corporation Array processor type data processing apparatus
US8484276B2 (en) * 2009-03-18 2013-07-09 International Business Machines Corporation Processing array data on SIMD multi-core processor architectures
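JP5760532B2 (ja) * 2011-03-14 2015-08-12 Ricoh Company, Ltd. Processor device and operation method thereof
JP6078923B2 (ja) * 2011-10-14 2017-02-15 Panasonic Intellectual Property Management Co., Ltd. Transposition operation device, integrated circuit thereof, and transposition processing method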
SG10201604445RA (en) * 2011-12-01 2016-07-28 Univ Singapore Polymorphic heterogeneous multi-core architecture
CN102521209B (zh) * 2011-12-12 2015-03-11 Inspur Electronic Information Industry Co., Ltd. Design method for a parallel multiprocessor computer
US9898283B2 (en) * 2011-12-22 2018-02-20 Intel Corporation Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset
EP2798475A4 (en) * 2011-12-30 2016-07-13 Intel Corp TRANSPOSED INSTRUCTION
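CN102929724B (zh) * 2012-11-06 2016-04-13 Wuxi Jiangnan Institute of Computing Technology Multi-level memory access method and discrete memory access method based on a heterogeneous many-core processor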
US9406015B2 (en) 2013-12-27 2016-08-02 International Business Machines Corporation Transform for a neurosynaptic core circuit
US9412063B2 (en) 2013-12-27 2016-08-09 International Business Machines Corporation Transform architecture for multiple neurosynaptic core circuits
TWI570573B (zh) 2014-07-08 2017-02-11 Industrial Technology Research Institute Matrix transposing circuit
SE539721C2 (en) * 2014-07-09 2017-11-07 Device and method for performing a Fourier transform on a three dimensional data set
US10635909B2 (en) * 2015-12-30 2020-04-28 Texas Instruments Incorporated Vehicle control with efficient iterative triangulation
US10095445B2 (en) * 2016-03-29 2018-10-09 Western Digital Technologies, Inc. Systems and methods for offloading processing from a host to storage processing units using an interconnect network
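KR102526754B1 (ko) 2016-07-13 2023-04-27 Samsung Electronics Co., Ltd. Method and apparatus for processing three-dimensional images
KR102654862B1 (ko) * 2016-08-31 2024-04-05 Samsung Electronics Co., Ltd. Method and apparatus for image processing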
EP4160449A1 (en) * 2016-12-30 2023-04-05 Intel Corporation Deep learning hardware
US10169296B2 (en) 2016-12-30 2019-01-01 Intel Corporation Distributed matrix multiplication for neural networks
US11748625B2 (en) 2016-12-30 2023-09-05 Intel Corporation Distributed convolution for neural networks
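CN107168683B (zh) * 2017-05-05 2020-06-09 Institute of Software, Chinese Academy of Sciences High-performance implementation method for GEMM dense matrix multiplication on the Shenwei 26010 many-core CPU
CN111338974B (zh) * 2018-12-19 2025-05-16 Advanced Micro Devices, Inc. Tiling algorithm for a matrix math instruction set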
US10853559B2 (en) 2019-03-27 2020-12-01 Charter Communications Operating, Llc Symmetric text replacement
CN111444134A (zh) * 2020-03-24 2020-07-24 Shandong University Acceleration and optimization method and system for parallel PME in molecular dynamics simulation software
US11593454B2 (en) * 2020-06-02 2023-02-28 Intel Corporation Matrix operation optimization mechanism
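CN112433760B (zh) * 2020-11-27 2022-09-23 Hygon Information Technology Co., Ltd. Data sorting method and data sorting circuit
KR102527829B1 (ko) * 2021-08-19 2023-04-28 Korea University of Technology and Education Industry-University Cooperation Foundation Matrix-transposition-based 2D-FFT operation device using a CPU and GPU, and data operation method using the same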
US20240020129A1 (en) * 2022-07-14 2024-01-18 Nxp Usa, Inc. Self-Ordering Fast Fourier Transform For Single Instruction Multiple Data Engines
EP4671986A1 (en) * 2023-02-22 2025-12-31 Denso Corporation ARITHMETIC PROCESSING DEVICE
WO2025136620A1 (en) * 2023-12-20 2025-06-26 Sony Interactive Entertainment Inc. Speeding up memory access

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7197625B1 (en) * 1997-10-09 2007-03-27 Mips Technologies, Inc. Alignment and ordering of vector elements for single instruction multiple data processing
US6243730B1 (en) * 1999-05-04 2001-06-05 Sony Electronics, Inc. Methods and systems for performing short integer chen IDCT algorithm with fused multiply/add
US6625721B1 (en) * 1999-07-26 2003-09-23 Intel Corporation Registers for 2-D matrix processing
US20030084081A1 (en) * 2001-10-27 2003-05-01 Bedros Hanounik Method and apparatus for transposing a two dimensional array
US6963341B1 (en) * 2002-06-03 2005-11-08 Tibet MIMAR Fast and flexible scan conversion and matrix transpose in a SIMD processor
US7386703B2 (en) * 2003-11-18 2008-06-10 International Business Machines Corporation Two dimensional addressing of a matrix-vector register array
US20070106718A1 (en) * 2005-11-04 2007-05-10 Shum Hoi L Fast fourier transform on a single-instruction-stream, multiple-data-stream processor
US7937567B1 (en) * 2006-11-01 2011-05-03 Nvidia Corporation Methods for scalably exploiting parallelism in a parallel processing system
US7979672B2 (en) * 2008-07-25 2011-07-12 International Business Machines Corporation Multi-core processors for 3D array transposition by logically retrieving in-place physically transposed sub-array data
US8484276B2 (en) * 2009-03-18 2013-07-09 International Business Machines Corporation Processing array data on SIMD multi-core processor architectures

Cited By (95)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342282B2 (en) 2012-08-16 2016-05-17 Samsung Electronics Co., Ltd. Method and apparatus for dynamic data configuration
KR20170025097A (ko) * 2015-08-27 2017-03-08 Samsung Electronics Co., Ltd. Method and apparatus for performing Fourier transform
US11698787B2 (en) 2016-07-02 2023-07-11 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US12204898B2 (en) 2016-07-02 2025-01-21 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US11048508B2 (en) 2016-07-02 2021-06-29 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US12050912B2 (en) 2016-07-02 2024-07-30 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US11977886B2 (en) 2017-03-20 2024-05-07 Intel Corporation Systems, methods, and apparatuses for tile store
US11847452B2 (en) 2017-03-20 2023-12-19 Intel Corporation Systems, methods, and apparatus for tile configuration
US12314717B2 (en) 2017-03-20 2025-05-27 Intel Corporation Systems, methods, and apparatuses for dot production operations
US12282773B2 (en) 2017-03-20 2025-04-22 Intel Corporation Systems, methods, and apparatus for tile configuration
US12260213B2 (en) 2017-03-20 2025-03-25 Intel Corporation Systems, methods, and apparatuses for matrix add, subtract, and multiply
US12536020B2 (en) 2017-03-20 2026-01-27 Intel Corporation Systems, methods, and apparatuses for tile store
US12182571B2 (en) 2017-03-20 2024-12-31 Intel Corporation Systems, methods, and apparatuses for tile load, multiplication and accumulation
US12147804B2 (en) 2017-03-20 2024-11-19 Intel Corporation Systems, methods, and apparatuses for tile matrix multiplication and accumulation
US12124847B2 (en) 2017-03-20 2024-10-22 Intel Corporation Systems, methods, and apparatuses for tile transpose
US12106100B2 (en) 2017-03-20 2024-10-01 Intel Corporation Systems, methods, and apparatuses for matrix operations
WO2018174927A1 (en) * 2017-03-20 2018-09-27 Intel Corporation Systems, methods, and apparatuses for tile diagonal
US11288069B2 (en) 2017-03-20 2022-03-29 Intel Corporation Systems, methods, and apparatuses for tile store
US11360770B2 (en) 2017-03-20 2022-06-14 Intel Corporation Systems, methods, and apparatuses for zeroing a matrix
US10877756B2 (en) 2017-03-20 2020-12-29 Intel Corporation Systems, methods, and apparatuses for tile diagonal
US12039332B2 (en) 2017-03-20 2024-07-16 Intel Corporation Systems, methods, and apparatus for matrix move
US11080048B2 (en) 2017-03-20 2021-08-03 Intel Corporation Systems, methods, and apparatus for tile configuration
US11086623B2 (en) 2017-03-20 2021-08-10 Intel Corporation Systems, methods, and apparatuses for tile matrix multiplication and accumulation
US11714642B2 (en) 2017-03-20 2023-08-01 Intel Corporation Systems, methods, and apparatuses for tile store
US11288068B2 (en) 2017-03-20 2022-03-29 Intel Corporation Systems, methods, and apparatus for matrix move
US11163565B2 (en) 2017-03-20 2021-11-02 Intel Corporation Systems, methods, and apparatuses for dot production operations
WO2018174926A1 (en) * 2017-03-20 2018-09-27 Intel Corporation Systems, methods, and apparatuses for tile transpose
US11200055B2 (en) 2017-03-20 2021-12-14 Intel Corporation Systems, methods, and apparatuses for matrix add, subtract, and multiply
US11567765B2 (en) 2017-03-20 2023-01-31 Intel Corporation Systems, methods, and apparatuses for tile load
US11263008B2 (en) 2017-03-20 2022-03-01 Intel Corporation Systems, methods, and apparatuses for tile broadcast
US11275588B2 (en) 2017-07-01 2022-03-15 Intel Corporation Context save with variable save state size
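KR20190061981A (ko) * 2017-11-28 2019-06-05 Samsung Electronics Co., Ltd. Electronic device for performing frequency transform of image data using SIMD operations, and method of operating the electronic device
WO2019107708A1 (ko) * 2017-11-28 2019-06-06 Samsung Electronics Co., Ltd. Electronic device for performing frequency transform of image data using SIMD operations, and method of operating the electronic device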
US11645077B2 (en) 2017-12-29 2023-05-09 Intel Corporation Systems and methods to zero a tile register pair
US11609762B2 (en) 2017-12-29 2023-03-21 Intel Corporation Systems and methods to load a tile register pair
US11669326B2 (en) 2017-12-29 2023-06-06 Intel Corporation Systems, methods, and apparatuses for dot product operations
US12182568B2 (en) 2017-12-29 2024-12-31 Intel Corporation Systems and methods for computing dot products of nibbles in two tile operands
US12236242B2 (en) 2017-12-29 2025-02-25 Intel Corporation Systems and methods to load a tile register pair
US12282525B2 (en) 2017-12-29 2025-04-22 Intel Corporation Systems, methods, and apparatuses for matrix operations
US11023235B2 (en) 2017-12-29 2021-06-01 Intel Corporation Systems and methods to zero a tile register pair
US12293186B2 (en) 2017-12-29 2025-05-06 Intel Corporation Systems and methods to store a tile register pair to memory
US11816483B2 (en) 2017-12-29 2023-11-14 Intel Corporation Systems, methods, and apparatuses for matrix operations
US11809869B2 (en) 2017-12-29 2023-11-07 Intel Corporation Systems and methods to store a tile register pair to memory
US11093247B2 (en) 2017-12-29 2021-08-17 Intel Corporation Systems and methods to load a tile register pair
US11789729B2 (en) 2017-12-29 2023-10-17 Intel Corporation Systems and methods for computing dot products of nibbles in two tile operands
US11416260B2 (en) 2018-03-30 2022-08-16 Intel Corporation Systems and methods for implementing chained tile operations
US11093579B2 (en) 2018-09-05 2021-08-17 Intel Corporation FP16-S7E8 mixed precision for deep learning and other algorithms
US11579883B2 (en) 2018-09-14 2023-02-14 Intel Corporation Systems and methods for performing horizontal tile operations
US10970076B2 (en) 2018-09-14 2021-04-06 Intel Corporation Systems and methods for performing instructions specifying ternary tile logic operations
US12265826B2 (en) 2018-09-27 2025-04-01 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US11403071B2 (en) 2018-09-27 2022-08-02 Intel Corporation Systems and methods for performing instructions to transpose rectangular tiles
US12175246B2 (en) 2018-09-27 2024-12-24 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US10866786B2 (en) 2018-09-27 2020-12-15 Intel Corporation Systems and methods for performing instructions to transpose rectangular tiles
US11714648B2 (en) 2018-09-27 2023-08-01 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US10990396B2 (en) 2018-09-27 2021-04-27 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US11748103B2 (en) 2018-09-27 2023-09-05 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US11579880B2 (en) 2018-09-27 2023-02-14 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US11249761B2 (en) 2018-09-27 2022-02-15 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US12461745B2 (en) 2018-09-27 2025-11-04 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US11954489B2 (en) 2018-09-27 2024-04-09 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US11392381B2 (en) 2018-09-28 2022-07-19 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US10929143B2 (en) 2018-09-28 2021-02-23 Intel Corporation Method and apparatus for efficient matrix alignment in a systolic array
US11507376B2 (en) 2018-09-28 2022-11-22 Intel Corporation Systems for performing instructions for fast element unpacking into 2-dimensional registers
US10896043B2 (en) 2018-09-28 2021-01-19 Intel Corporation Systems for performing instructions for fast element unpacking into 2-dimensional registers
US10963256B2 (en) 2018-09-28 2021-03-30 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US11954490B2 (en) 2018-09-28 2024-04-09 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US11675590B2 (en) 2018-09-28 2023-06-13 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US11893389B2 (en) 2018-11-09 2024-02-06 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
US11614936B2 (en) 2018-11-09 2023-03-28 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
US12307250B2 (en) 2018-11-09 2025-05-20 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
US10963246B2 (en) 2018-11-09 2021-03-30 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
US10929503B2 (en) 2018-12-21 2021-02-23 Intel Corporation Apparatus and method for a masked multiply instruction to support neural network pruning operations
US11886875B2 (en) 2018-12-26 2024-01-30 Intel Corporation Systems and methods for performing nibble-sized operations on matrix elements
US11294671B2 (en) 2018-12-26 2022-04-05 Intel Corporation Systems and methods for performing duplicate detection instructions on 2D data
US11847185B2 (en) 2018-12-27 2023-12-19 Intel Corporation Systems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements
US12287843B2 (en) 2018-12-27 2025-04-29 Intel Corporation Systems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements
US10942985B2 (en) 2018-12-29 2021-03-09 Intel Corporation Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US10922077B2 (en) 2018-12-29 2021-02-16 Intel Corporation Apparatuses, methods, and systems for stencil configuration and computation instructions
US11016731B2 (en) 2019-03-29 2021-05-25 Intel Corporation Using Fuzzy-Jbit location of floating-point multiply-accumulate results
US11269630B2 (en) 2019-03-29 2022-03-08 Intel Corporation Interleaved pipeline of floating-point adders
US10990397B2 (en) 2019-03-30 2021-04-27 Intel Corporation Apparatuses, methods, and systems for transpose instructions of a matrix operations accelerator
US11175891B2 (en) 2019-03-30 2021-11-16 Intel Corporation Systems and methods to perform floating-point addition with selected rounding
US11900114B2 (en) 2019-06-26 2024-02-13 Intel Corporation Systems and methods to skip inconsequential matrix operations
US11403097B2 (en) 2019-06-26 2022-08-02 Intel Corporation Systems and methods to skip inconsequential matrix operations
US11334647B2 (en) 2019-06-29 2022-05-17 Intel Corporation Apparatuses, methods, and systems for enhanced matrix multiplier architecture
US11714875B2 (en) 2019-12-28 2023-08-01 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US12204605B2 (en) 2019-12-28 2025-01-21 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US11972230B2 (en) 2020-06-27 2024-04-30 Intel Corporation Matrix transpose and multiply
US12112167B2 (en) 2020-06-27 2024-10-08 Intel Corporation Matrix data scatter and gather between rows and irregularly spaced memory locations
US12405770B2 (en) 2020-06-27 2025-09-02 Intel Corporation Matrix transpose and multiply
US11941395B2 (en) 2020-09-26 2024-03-26 Intel Corporation Apparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions
US12474928B2 (en) 2020-12-22 2025-11-18 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from strided data element positions in a first dimension from three source two-dimensional arrays in a result two-dimensional array
US12001385B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator
US12001887B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for aligning tiles of a matrix operations accelerator
US11699468B2 (en) 2021-06-01 2023-07-11 SK Hynix Inc. Memory device, semiconductor system, and data processing system

Also Published As

Publication number Publication date
CN102053948B (zh) 2013-09-11
CN102053948A (zh) 2011-05-11
JP2011100452A (ja) 2011-05-19
US20110107060A1 (en) 2011-05-05
JP5689282B2 (ja) 2015-03-25
US8539201B2 (en) 2013-09-17

Similar Documents

Publication Publication Date Title
KR20110079495A (ko) Transposing array data on SIMD multi-core processor architectures
JP7374236B2 (ja) Accelerated mathematical engine
US8484276B2 (en) Processing array data on SIMD multi-core processor architectures
CN111859273B (zh) Matrix multiplier
US11803377B2 (en) Efficient direct convolution using SIMD instructions
JP7401513B2 (ja) Sparse matrix multiplication in hardware
US20080208944A1 (en) Digital signal processor structure for performing length-scalable fast fourier transformation
US7836116B1 (en) Fast fourier transforms and related transforms using cooperative thread arrays
US7461114B2 (en) Fourier transform apparatus
US7640284B1 (en) Bit reversal methods for a parallel processor
US20100191791A1 (en) Method and apparatus for evaluation of multi-dimensional discrete fourier transforms
WO2010067324A1 (en) A method of operating a computing device to perform memoization
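JPS63136167A (ja) Orthogonal transform processor
CN102652315A (zh) Information processing apparatus, control method therefor, program, and computer-readable storage medium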
Cormen et al. Performing out-of-core FFTs on parallel disk systems
CN104699624B (zh) Conflict-free memory access method for parallel FFT computation
Al Badawi et al. Faster number theoretic transform on graphics processors for ring learning with errors based cryptography
CN119836622A (zh) Multiple outer product instructions
US6728742B1 (en) Data storage patterns for fast fourier transforms
US7657587B2 (en) Multi-dimensional fast fourier transform
KR20100033979A (ko) Addressing device for parallel processors
Mahmood et al. Algorithm and architecture optimization for 2D discrete Fourier transforms with simultaneous edge artifact removal
Sorokin et al. Conflict-free parallel access scheme for mixed-radix FFT supporting I/O permutations
CN117786293A (zh) Matrix device and operation method thereof
US6438568B1 (en) Method and apparatus for optimizing conversion of input data to output data

Legal Events

Date Code Title Description
PA0109 Patent application

St.27 status event code: A-0-1-A10-A12-nap-PA0109

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

R17-X000 Change to representative recorded

St.27 status event code: A-3-3-R10-R17-oth-X000

R17-X000 Change to representative recorded

St.27 status event code: A-3-3-R10-R17-oth-X000

A201 Request for examination
PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

D13-X000 Search requested

St.27 status event code: A-1-2-D10-D13-srh-X000

D14-X000 Search report completed

St.27 status event code: A-1-2-D10-D14-srh-X000

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

E601 Decision to refuse application
PE0601 Decision on rejection of patent

St.27 status event code: N-2-6-B10-B15-exm-PE0601

P22-X000 Classification modified

St.27 status event code: A-2-2-P10-P22-nap-X000