TWI270007B - Method and apparatus for shuffling data - Google Patents

Method and apparatus for shuffling data Download PDF

Info

Publication number
TWI270007B
TWI270007B TW93118830A TW93118830A TWI270007B TW I270007 B TWI270007 B TW I270007B TW 93118830 A TW93118830 A TW 93118830A TW 93118830 A TW93118830 A TW 93118830A TW I270007 B TWI270007 B TW I270007B
Authority
TW
Taiwan
Prior art keywords
data
operand
zero
shuffling
mask
Prior art date
Application number
TW93118830A
Other languages
English (en)
Chinese (zh)
Other versions
TW200515279A (en
Inventor
William Macy Jr
Eric Debes
Patrice Roussel
Huy Nguyen
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp filed Critical Intel Corp
Publication of TW200515279A publication Critical patent/TW200515279A/zh
Application granted granted Critical
Publication of TWI270007B publication Critical patent/TWI270007B/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001Arithmetic instructions
    • G06F9/30014Arithmetic instructions with variable precision
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30018Bit or string instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G06F9/30038Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30105Register structure
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30105Register structure
    • G06F9/30109Register structure having multiple operands in a single register
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • G06F9/3013Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145Instruction analysis, e.g. decoding, instruction word fields
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30025Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Executing Machine-Instructions (AREA)
  • Communication Control (AREA)
  • Error Detection And Correction (AREA)
  • Detection And Prevention Of Errors In Transmission (AREA)
  • Image Analysis (AREA)
  • Radar Systems Or Details Thereof (AREA)
TW93118830A 2003-06-30 2004-06-28 Method and apparatus for shuffling data TWI270007B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/611,344 US20040054877A1 (en) 2001-10-29 2003-06-30 Method and apparatus for shuffling data

Publications (2)

Publication Number Publication Date
TW200515279A TW200515279A (en) 2005-05-01
TWI270007B true TWI270007B (en) 2007-01-01

Family

ID=34062338

Family Applications (1)

Application Number Title Priority Date Filing Date
TW93118830A TWI270007B (en) 2003-06-30 2004-06-28 Method and apparatus for shuffling data

Country Status (10)

Country Link
US (8) US20040054877A1 (enExample)
EP (1) EP1639452B1 (enExample)
JP (4) JP4607105B2 (enExample)
KR (1) KR100831472B1 (enExample)
CN (2) CN100492278C (enExample)
AT (1) ATE442624T1 (enExample)
DE (1) DE602004023081D1 (enExample)
RU (1) RU2316808C2 (enExample)
TW (1) TWI270007B (enExample)
WO (1) WO2005006183A2 (enExample)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI637276B (zh) * 2014-12-27 2018-10-01 美商英特爾股份有限公司 執行向量位元混洗的方法與裝置

Families Citing this family (123)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7739319B2 (en) * 2001-10-29 2010-06-15 Intel Corporation Method and apparatus for parallel table lookup using SIMD instructions
US20040054877A1 (en) 2001-10-29 2004-03-18 Macy William W. Method and apparatus for shuffling data
US7925891B2 (en) * 2003-04-18 2011-04-12 Via Technologies, Inc. Apparatus and method for employing cryptographic functions to generate a message digest
US7647557B2 (en) * 2005-06-29 2010-01-12 Intel Corporation Techniques for shuffling video information
US8212823B2 (en) * 2005-09-28 2012-07-03 Synopsys, Inc. Systems and methods for accelerating sub-pixel interpolation in video processing applications
US20070106883A1 (en) * 2005-11-07 2007-05-10 Choquette Jack H Efficient Streaming of Un-Aligned Load/Store Instructions that Save Unused Non-Aligned Data in a Scratch Register for the Next Instruction
US20070226469A1 (en) * 2006-03-06 2007-09-27 James Wilson Permutable address processor and method
US8290095B2 (en) 2006-03-23 2012-10-16 Qualcomm Incorporated Viterbi pack instruction
US20080071851A1 (en) * 2006-09-20 2008-03-20 Ronen Zohar Instruction and logic for performing a dot-product operation
US20080077772A1 (en) * 2006-09-22 2008-03-27 Ronen Zohar Method and apparatus for performing select operations
US9069547B2 (en) 2006-09-22 2015-06-30 Intel Corporation Instruction and logic for processing text strings
US7536532B2 (en) * 2006-09-27 2009-05-19 International Business Machines Corporation Merge operations of data arrays based on SIMD instructions
JP4686435B2 (ja) * 2006-10-27 2011-05-25 株式会社東芝 演算装置
US7962718B2 (en) * 2007-10-12 2011-06-14 Freescale Semiconductor, Inc. Methods for performing extended table lookups using SIMD vector permutation instructions that support out-of-range index values
US8700884B2 (en) * 2007-10-12 2014-04-15 Freescale Semiconductor, Inc. Single-instruction multiple-data vector permutation instruction and method for performing table lookups for in-range index values and determining constant values for out-of-range index values
US8515052B2 (en) 2007-12-17 2013-08-20 Wai Wu Parallel signal processing system and method
US8078836B2 (en) 2007-12-30 2011-12-13 Intel Corporation Vector shuffle instructions operating on multiple lanes each having a plurality of data elements using a common set of per-lane control bits
GB2456775B (en) * 2008-01-22 2012-10-31 Advanced Risc Mach Ltd Apparatus and method for performing permutation operations on data
US9513905B2 (en) 2008-03-28 2016-12-06 Intel Corporation Vector instructions to enable efficient synchronization and parallel reduction operations
WO2009144681A1 (en) * 2008-05-30 2009-12-03 Nxp B.V. Vector shuffle with write enable
US8195921B2 (en) * 2008-07-09 2012-06-05 Oracle America, Inc. Method and apparatus for decoding multithreaded instructions of a microprocessor
JP5375114B2 (ja) * 2009-01-16 2013-12-25 富士通株式会社 プロセッサ
JP5438551B2 (ja) * 2009-04-23 2014-03-12 新日鉄住金ソリューションズ株式会社 情報処理装置、情報処理方法及びプログラム
US9507670B2 (en) * 2010-06-14 2016-11-29 Veeam Software Ag Selective processing of file system objects for image level backups
CN103460181B (zh) * 2011-03-30 2017-10-24 飞思卡尔半导体公司 集成电路器件和执行其位操纵的方法
US20120254588A1 (en) * 2011-04-01 2012-10-04 Jesus Corbal San Adrian Systems, apparatuses, and methods for blending two source operands into a single destination using a writemask
US20120278591A1 (en) * 2011-04-27 2012-11-01 Advanced Micro Devices, Inc. Crossbar switch module having data movement instruction processor module and methods for implementing the same
KR101918464B1 (ko) 2011-09-14 2018-11-15 삼성전자 주식회사 스위즐드 버추얼 레지스터 기반의 프로세서 및 스위즐 패턴 제공 장치
WO2013057872A1 (ja) * 2011-10-18 2013-04-25 パナソニック株式会社 シャッフルパターン生成回路、プロセッサ、シャッフルパターン生成方法、命令
US8984499B2 (en) 2011-12-15 2015-03-17 Intel Corporation Methods to optimize a program loop via vector instructions using a shuffle table and a blend table
US10223111B2 (en) 2011-12-22 2019-03-05 Intel Corporation Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset
US9513918B2 (en) * 2011-12-22 2016-12-06 Intel Corporation Apparatus and method for performing permute operations
WO2013095564A1 (en) 2011-12-22 2013-06-27 Intel Corporation Processors, methods, systems, and instructions to generate sequences of integers in numerical order that differ by a constant stride
US9898283B2 (en) 2011-12-22 2018-02-20 Intel Corporation Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset
CN104011646B (zh) 2011-12-22 2018-03-27 英特尔公司 用于产生按照数值顺序的连续整数的序列的处理器、方法、系统和指令
US9524168B2 (en) 2011-12-23 2016-12-20 Intel Corporation Apparatus and method for shuffling floating point or integer values
WO2013095611A1 (en) * 2011-12-23 2013-06-27 Intel Corporation Apparatus and method for performing a permute operation
US10037205B2 (en) * 2011-12-23 2018-07-31 Intel Corporation Instruction and logic to provide vector blend and permute functionality
JP5935319B2 (ja) * 2011-12-26 2016-06-15 富士通株式会社 回路エミュレーション装置、回路エミュレーション方法及び回路エミュレーションプログラム
US8683296B2 (en) 2011-12-30 2014-03-25 Streamscale, Inc. Accelerated erasure coding system and method
US8914706B2 (en) 2011-12-30 2014-12-16 Streamscale, Inc. Using parity data for concurrent data authentication, correction, compression, and encryption
US9329863B2 (en) 2012-03-13 2016-05-03 International Business Machines Corporation Load register on condition with zero or immediate instruction
JP5730812B2 (ja) * 2012-05-02 2015-06-10 日本電信電話株式会社 演算装置、その方法およびプログラム
US9268683B1 (en) * 2012-05-14 2016-02-23 Kandou Labs, S.A. Storage method and apparatus for random access memory using codeword storage
WO2014004486A2 (en) * 2012-06-26 2014-01-03 Dunling Li Low delay low complexity lossless compression system
US9953436B2 (en) 2012-06-26 2018-04-24 BTS Software Solutions, LLC Low delay low complexity lossless compression system
US9218182B2 (en) * 2012-06-29 2015-12-22 Intel Corporation Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op)
US9342479B2 (en) * 2012-08-23 2016-05-17 Qualcomm Incorporated Systems and methods of data extraction in a vector processor
US9804840B2 (en) 2013-01-23 2017-10-31 International Business Machines Corporation Vector Galois Field Multiply Sum and Accumulate instruction
US9471308B2 (en) 2013-01-23 2016-10-18 International Business Machines Corporation Vector floating point test data class immediate instruction
US9823924B2 (en) 2013-01-23 2017-11-21 International Business Machines Corporation Vector element rotate and insert under mask instruction
US9778932B2 (en) * 2013-01-23 2017-10-03 International Business Machines Corporation Vector generate mask instruction
US9715385B2 (en) 2013-01-23 2017-07-25 International Business Machines Corporation Vector exception code
US9513906B2 (en) 2013-01-23 2016-12-06 International Business Machines Corporation Vector checksum instruction
US9207942B2 (en) * 2013-03-15 2015-12-08 Intel Corporation Systems, apparatuses,and methods for zeroing of bits in a data element
US9405539B2 (en) * 2013-07-31 2016-08-02 Intel Corporation Providing vector sub-byte decompression functionality
US11768689B2 (en) 2013-08-08 2023-09-26 Movidius Limited Apparatus, systems, and methods for low power computational imaging
US10001993B2 (en) 2013-08-08 2018-06-19 Linear Algebra Technologies Limited Variable-length instruction buffer management
CN103501348A (zh) * 2013-10-16 2014-01-08 华仪风能有限公司 一种风力发电机组主控系统与监控系统的通讯方法及系统
US9582419B2 (en) * 2013-10-25 2017-02-28 Arm Limited Data processing device and method for interleaved storage of data elements
KR102122406B1 (ko) 2013-11-06 2020-06-12 삼성전자주식회사 셔플 명령어 처리 장치 및 방법
US9880845B2 (en) * 2013-11-15 2018-01-30 Qualcomm Incorporated Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format-converting of input vector data to execution units for vector processing operations, and related vector processor systems and methods
RU2644528C2 (ru) * 2013-12-23 2018-02-12 Интел Корпорейшн Инструкция и логика для идентификации инструкций для удаления в многопоточном процессоре с изменением последовательности
US9552209B2 (en) 2013-12-27 2017-01-24 Intel Corporation Functional unit for instruction execution pipeline capable of shifting different chunks of a packed data operand by different amounts
US9274835B2 (en) 2014-01-06 2016-03-01 International Business Machines Corporation Data shuffling in a non-uniform memory access device
US9256534B2 (en) 2014-01-06 2016-02-09 International Business Machines Corporation Data shuffling in a non-uniform memory access device
US10223113B2 (en) * 2014-03-27 2019-03-05 Intel Corporation Processors, methods, systems, and instructions to store consecutive source elements to unmasked result elements with propagation to masked result elements
JP6419205B2 (ja) * 2014-03-28 2018-11-07 インテル・コーポレーション プロセッサ、方法、システム、コンピュータシステム、およびコンピュータ可読記憶媒体
US9996579B2 (en) * 2014-06-26 2018-06-12 Amazon Technologies, Inc. Fast color searching
US10169803B2 (en) 2014-06-26 2019-01-01 Amazon Technologies, Inc. Color based social networking recommendations
US9424039B2 (en) 2014-07-09 2016-08-23 Intel Corporation Instruction for implementing vector loops of iterations having an iteration dependent condition
EP3175355B1 (en) * 2014-07-30 2018-07-18 Linear Algebra Technologies Limited Method and apparatus for instruction prefetch
US9619214B2 (en) 2014-08-13 2017-04-11 International Business Machines Corporation Compiler optimizations for vector instructions
JP2017199045A (ja) * 2014-09-02 2017-11-02 パナソニックIpマネジメント株式会社 プロセッサ及びデータ並び替え方法
US9785649B1 (en) 2014-09-02 2017-10-10 Amazon Technologies, Inc. Hue-based color naming for an image
US10133570B2 (en) * 2014-09-19 2018-11-20 Intel Corporation Processors, methods, systems, and instructions to select and consolidate active data elements in a register under mask into a least significant portion of result, and to indicate a number of data elements consolidated
EP3001307B1 (en) * 2014-09-25 2019-11-13 Intel Corporation Bit shuffle processors, methods, systems, and instructions
US10169014B2 (en) 2014-12-19 2019-01-01 International Business Machines Corporation Compiler method for generating instructions for vector operations in a multi-endian instruction set
US10296334B2 (en) * 2014-12-27 2019-05-21 Intel Corporation Method and apparatus for performing a vector bit gather
KR20160139823A (ko) 2015-05-28 2016-12-07 손규호 두 키 값과 바이트 중첩을 이용한 다중 자료형 데이터 패킹 또는 패킹 복원 방법
US10001995B2 (en) * 2015-06-02 2018-06-19 Intel Corporation Packed data alignment plus compute instructions, processors, methods, and systems
CN105022609A (zh) * 2015-08-05 2015-11-04 浪潮(北京)电子信息产业有限公司 一种数据混洗方法和数据混洗单元
US9880821B2 (en) 2015-08-17 2018-01-30 International Business Machines Corporation Compiler optimizations for vector operations that are reformatting-resistant
US10503502B2 (en) 2015-09-25 2019-12-10 Intel Corporation Data element rearrangement, processors, methods, systems, and instructions
US10620957B2 (en) * 2015-10-22 2020-04-14 Texas Instruments Incorporated Method for forming constant extensions in the same execute packet in a VLIW processor
US20170177351A1 (en) * 2015-12-18 2017-06-22 Intel Corporation Instructions and Logic for Even and Odd Vector Get Operations
US10338920B2 (en) * 2015-12-18 2019-07-02 Intel Corporation Instructions and logic for get-multiple-vector-elements operations
US9946541B2 (en) * 2015-12-18 2018-04-17 Intel Corporation Systems, apparatuses, and method for strided access
US20170177350A1 (en) * 2015-12-18 2017-06-22 Intel Corporation Instructions and Logic for Set-Multiple-Vector-Elements Operations
US20170177354A1 (en) 2015-12-18 2017-06-22 Intel Corporation Instructions and Logic for Vector-Based Bit Manipulation
US10467006B2 (en) 2015-12-20 2019-11-05 Intel Corporation Permutating vector data scattered in a temporary destination into elements of a destination register based on a permutation factor
US10565207B2 (en) * 2016-04-12 2020-02-18 Hsilin Huang Method, system and program product for mask-based compression of a sparse matrix
US10331830B1 (en) * 2016-06-13 2019-06-25 Apple Inc. Heterogeneous logic gate simulation using SIMD instructions
US10592468B2 (en) * 2016-07-13 2020-03-17 Qualcomm Incorporated Shuffler circuit for lane shuffle in SIMD architecture
US10169040B2 (en) * 2016-11-16 2019-01-01 Ceva D.S.P. Ltd. System and method for sample rate conversion
CN106775587B (zh) * 2016-11-30 2020-04-14 上海兆芯集成电路有限公司 计算机指令的执行方法以及使用此方法的装置
EP3336691B1 (en) * 2016-12-13 2022-04-06 ARM Limited Replicate elements instruction
EP3336692B1 (en) 2016-12-13 2020-04-29 Arm Ltd Replicate partition instruction
US9959247B1 (en) 2017-02-17 2018-05-01 Google Llc Permuting in a matrix-vector processor
US10140239B1 (en) * 2017-05-23 2018-11-27 Texas Instruments Incorporated Superimposing butterfly network controls for pattern combinations
US11194630B2 (en) 2017-05-30 2021-12-07 Microsoft Technology Licensing, Llc Grouped shuffling of partition vertices
US10970081B2 (en) * 2017-06-29 2021-04-06 Advanced Micro Devices, Inc. Stream processor with decoupled crossbar for cross lane operations
US10530397B2 (en) 2017-07-17 2020-01-07 Texas Instruments Incorporated Butterfly network on load data return
CN109324981B (zh) * 2017-07-31 2023-08-15 伊姆西Ip控股有限责任公司 高速缓存管理系统和方法
US10460416B1 (en) * 2017-10-17 2019-10-29 Xilinx, Inc. Inline image preprocessing for convolution operations using a matrix multiplier on an integrated circuit
US10956125B2 (en) 2017-12-21 2021-03-23 International Business Machines Corporation Data shuffling with hierarchical tuple spaces
US10891274B2 (en) 2017-12-21 2021-01-12 International Business Machines Corporation Data shuffling with hierarchical tuple spaces
US11789734B2 (en) * 2018-08-30 2023-10-17 Advanced Micro Devices, Inc. Padded vectorization with compile time known masks
US10620958B1 (en) 2018-12-03 2020-04-14 Advanced Micro Devices, Inc. Crossbar between clients and a cache
CN109783054B (zh) * 2018-12-20 2021-03-09 中国科学院计算技术研究所 一种rsfq fft处理器的蝶形运算处理方法及系统
CN112650496B (zh) * 2019-10-09 2024-04-26 安徽寒武纪信息科技有限公司 混洗方法及计算装置
CN112631596B (zh) * 2019-10-09 2024-09-10 安徽寒武纪信息科技有限公司 混洗方法及计算装置
US12205015B2 (en) * 2020-04-08 2025-01-21 AutoBrains Technologies Ltd. Convolutional neural network with building blocks
US11200239B2 (en) 2020-04-24 2021-12-14 International Business Machines Corporation Processing multiple data sets to generate a merged location-based data set
US12112167B2 (en) 2020-06-27 2024-10-08 Intel Corporation Matrix data scatter and gather between rows and irregularly spaced memory locations
US12118226B2 (en) * 2020-11-20 2024-10-15 Samsung Electronics Co., Ltd. Systems, methods, and devices for shuffle acceleration
KR102381644B1 (ko) * 2020-11-27 2022-04-01 한국전자기술연구원 고속 이차원 FFT 신호처리를 위한 데이터 정렬 방법 및 이를 적용한 SoC
US12474928B2 (en) 2020-12-22 2025-11-18 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from strided data element positions in a first dimension from three source two-dimensional arrays in a result two-dimensional array
US20220197974A1 (en) * 2020-12-22 2022-06-23 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from two source two-dimensional arrays indicated by permute control elements in a result two-dimensional array
CN114297138B (zh) * 2021-12-10 2023-12-26 龙芯中科技术股份有限公司 向量混洗方法、处理器及电子设备
GB2617828B (en) * 2022-04-13 2024-06-19 Advanced Risc Mach Ltd Technique for handling data elements stored in an array storage
KR102777611B1 (ko) * 2022-06-03 2025-03-06 국립창원대학교 산학협력단 텐서플로우 라이트의 메모리 관리 방법 및 시스템
CN115061731B (zh) * 2022-06-23 2023-05-23 摩尔线程智能科技(北京)有限责任公司 混洗电路和方法、以及芯片和集成电路装置

Family Cites Families (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3711692A (en) 1971-03-15 1973-01-16 Goodyear Aerospace Corp Determination of number of ones in a data field by addition
US3723715A (en) 1971-08-25 1973-03-27 Ibm Fast modulo threshold operator binary adder for multi-number additions
US4139899A (en) 1976-10-18 1979-02-13 Burroughs Corporation Shift network having a mask generator and a rotator
US4161784A (en) 1978-01-05 1979-07-17 Honeywell Information Systems, Inc. Microprogrammable floating point arithmetic unit capable of performing arithmetic operations on long and short operands
US4418383A (en) 1980-06-30 1983-11-29 International Business Machines Corporation Data flow component for processor and microprocessor systems
US4393468A (en) 1981-03-26 1983-07-12 Advanced Micro Devices, Inc. Bit slice microprogrammable processor for signal processing applications
JPS57209570A (en) 1981-06-19 1982-12-22 Fujitsu Ltd Vector processing device
US4498177A (en) 1982-08-30 1985-02-05 Sperry Corporation M Out of N code checker circuit
US4569016A (en) 1983-06-30 1986-02-04 International Business Machines Corporation Mechanism for implementing one machine cycle executable mask and rotate instructions in a primitive instruction set computing system
US4707800A (en) 1985-03-04 1987-11-17 Raytheon Company Adder/substractor for variable length numbers
JPS6297060A (ja) 1985-10-23 1987-05-06 Mitsubishi Electric Corp デイジタルシグナルプロセツサ
US4989168A (en) 1987-11-30 1991-01-29 Fujitsu Limited Multiplying unit in a computer system, capable of population counting
US5019968A (en) 1988-03-29 1991-05-28 Yulan Wang Three-dimensional vector processor
DE68925666T2 (de) 1988-10-07 1996-09-26 Ibm Prozessoren für wortorganisierte Daten
US4903228A (en) 1988-11-09 1990-02-20 International Business Machines Corporation Single cycle merge/logic unit
KR920007505B1 (ko) 1989-02-02 1992-09-04 정호선 신경회로망을 이용한 곱셈기
US5081698A (en) 1989-02-14 1992-01-14 Intel Corporation Method and apparatus for graphics display data manipulation
US5497497A (en) 1989-11-03 1996-03-05 Compaq Computer Corp. Method and apparatus for resetting multiple processors using a common ROM
US5168571A (en) 1990-01-24 1992-12-01 International Business Machines Corporation System for aligning bytes of variable multi-bytes length operand based on alu byte length and a number of unprocessed byte data
FR2666472B1 (fr) * 1990-08-31 1992-10-16 Alcatel Nv Systeme de memorisation temporaire d'information comprenant une memoire tampon enregistrant des donnees en blocs de donnees de longueur fixe ou variable.
US5268995A (en) 1990-11-21 1993-12-07 Motorola, Inc. Method for executing graphics Z-compare and pixel merge instructions in a data processor
US5680161A (en) 1991-04-03 1997-10-21 Radius Inc. Method and apparatus for high speed graphics data compression
US5187679A (en) 1991-06-05 1993-02-16 International Business Machines Corporation Generalized 7/3 counters
US5321810A (en) 1991-08-21 1994-06-14 Digital Equipment Corporation Address method for computer graphics system
US5423010A (en) 1992-01-24 1995-06-06 C-Cube Microsystems Structure and method for packing and unpacking a stream of N-bit data to and from a stream of N-bit data words
JP2642039B2 (ja) 1992-05-22 1997-08-20 インターナショナル・ビジネス・マシーンズ・コーポレイション アレイ・プロセッサ
US5426783A (en) 1992-11-02 1995-06-20 Amdahl Corporation System for processing eight bytes or less by the move, pack and unpack instruction of the ESA/390 instruction set
US5408670A (en) 1992-12-18 1995-04-18 Xerox Corporation Performing arithmetic in parallel on composite operands with packed multi-bit components
US5465374A (en) 1993-01-12 1995-11-07 International Business Machines Corporation Processor for processing data string by byte-by-byte
US5568415A (en) * 1993-02-19 1996-10-22 Digital Equipment Corporation Content addressable memory having a pair of memory cells storing don't care states for address translation
US5524256A (en) 1993-05-07 1996-06-04 Apple Computer, Inc. Method and system for reordering bytes in a data stream
JPH0721034A (ja) 1993-06-28 1995-01-24 Fujitsu Ltd 文字列複写処理方法
US5625374A (en) * 1993-09-07 1997-04-29 Apple Computer, Inc. Method for parallel interpolation of images
US5390135A (en) 1993-11-29 1995-02-14 Hewlett-Packard Parallel shift and add circuit and method
US5487159A (en) 1993-12-23 1996-01-23 Unisys Corporation System for processing shift, mask, and merge operations in one instruction
US5399135A (en) 1993-12-29 1995-03-21 Azzouni; Paul Forearm workout bar
US5781457A (en) 1994-03-08 1998-07-14 Exponential Technology, Inc. Merge/mask, rotate/shift, and boolean operations from two instruction sets executed in a vectored mux on a dual-ALU
US5594437A (en) 1994-08-01 1997-01-14 Motorola, Inc. Circuit and method of unpacking a serial bitstream
US5579253A (en) 1994-09-02 1996-11-26 Lee; Ruby B. Computer multiply instruction with a subresult selection option
US6275834B1 (en) 1994-12-01 2001-08-14 Intel Corporation Apparatus for performing packed shift operations
US5819101A (en) 1994-12-02 1998-10-06 Intel Corporation Method for packing a plurality of packed data elements in response to a pack instruction
US5636352A (en) 1994-12-16 1997-06-03 International Business Machines Corporation Method and apparatus for utilizing condensed instructions
TW388982B (en) * 1995-03-31 2000-05-01 Samsung Electronics Co Ltd Memory controller which executes read and write commands out of order
GB9509989D0 (en) 1995-05-17 1995-07-12 Sgs Thomson Microelectronics Manipulation of data
US6381690B1 (en) 1995-08-01 2002-04-30 Hewlett-Packard Company Processor for performing subword permutations and combinations
CN1149469C (zh) 1995-08-31 2004-05-12 英特尔公司 支持分组数据的处理器
US7085795B2 (en) * 2001-10-29 2006-08-01 Intel Corporation Apparatus and method for efficient filtering and convolution of content data
US5819117A (en) 1995-10-10 1998-10-06 Microunity Systems Engineering, Inc. Method and system for facilitating byte ordering interfacing of a computer system
US5838984A (en) 1996-08-19 1998-11-17 Samsung Electronics Co., Ltd. Single-instruction-multiple-data processing using multiple banks of vector registers
US5909572A (en) 1996-12-02 1999-06-01 Compaq Computer Corp. System and method for conditionally moving an operand from a source register to a destination register
US6061521A (en) * 1996-12-02 2000-05-09 Compaq Computer Corp. Computer having multimedia operations executable as two distinct sets of operations within a single instruction cycle
DE19654846A1 (de) * 1996-12-27 1998-07-09 Pact Inf Tech Gmbh Verfahren zum selbständigen dynamischen Umladen von Datenflußprozessoren (DFPs) sowie Bausteinen mit zwei- oder mehrdimensionalen programmierbaren Zellstrukturen (FPGAs, DPGAs, o. dgl.)
US5933650A (en) 1997-10-09 1999-08-03 Mips Technologies, Inc. Alignment and ordering of vector elements for single instruction multiple data processing
US6223277B1 (en) 1997-11-21 2001-04-24 Texas Instruments Incorporated Data processing circuit with packed data structure capability
US6211892B1 (en) 1998-03-31 2001-04-03 Intel Corporation System and method for performing an intra-add operation
US6192467B1 (en) 1998-03-31 2001-02-20 Intel Corporation Executing partial-width packed data instructions
US6307553B1 (en) 1998-03-31 2001-10-23 Mohammad Abdallah System and method for performing a MOVHPS-MOVLPS instruction
US6041404A (en) * 1998-03-31 2000-03-21 Intel Corporation Dual function system and method for shuffling packed data elements
US6122725A (en) * 1998-03-31 2000-09-19 Intel Corporation Executing partial-width packed data instructions
US6288723B1 (en) 1998-04-01 2001-09-11 Intel Corporation Method and apparatus for converting data format to a graphics card
US6115812A (en) 1998-04-01 2000-09-05 Intel Corporation Method and apparatus for efficient vertical SIMD computations
US5996057A (en) * 1998-04-17 1999-11-30 Apple Data processing system and method of permutation with replication within a vector register file
US6098087A (en) 1998-04-23 2000-08-01 Infineon Technologies North America Corp. Method and apparatus for performing shift operations on packed data
US6263426B1 (en) 1998-04-30 2001-07-17 Intel Corporation Conversion from packed floating point data to packed 8-bit integer data in different architectural registers
JP3869947B2 (ja) * 1998-08-04 2007-01-17 株式会社日立製作所 並列処理プロセッサ、および、並列処理方法
US20020002666A1 (en) * 1998-10-12 2002-01-03 Carole Dulong Conditional operand selection using mask operations
US6405300B1 (en) * 1999-03-22 2002-06-11 Sun Microsystems, Inc. Combining results of selectively executed remaining sub-instructions with that of emulated sub-instruction causing exception in VLIW processor
US6484255B1 (en) 1999-09-20 2002-11-19 Intel Corporation Selective writing of data elements from packed data based upon a mask using predication
US6446198B1 (en) 1999-09-30 2002-09-03 Apple Computer, Inc. Vectorized table lookup
US6546480B1 (en) 1999-10-01 2003-04-08 Hitachi, Ltd. Instructions for arithmetic operations on vectored data
US20050188182A1 (en) * 1999-12-30 2005-08-25 Texas Instruments Incorporated Microprocessor having a set of byte intermingling instructions
US6745319B1 (en) 2000-02-18 2004-06-01 Texas Instruments Incorporated Microprocessor with instructions for shuffling and dealing data
WO2001067235A2 (en) * 2000-03-08 2001-09-13 Sun Microsystems, Inc. Processing architecture having sub-word shuffling and opcode modification
AU2001239077A1 (en) 2000-03-15 2001-09-24 Digital Accelerator Corporation Coding of digital video with high motion content
US7155601B2 (en) 2001-02-14 2006-12-26 Intel Corporation Multi-element operand sub-portion shuffle instruction execution
KR100446235B1 (ko) 2001-05-07 2004-08-30 엘지전자 주식회사 다중 후보를 이용한 움직임 벡터 병합 탐색 방법
US7162607B2 (en) * 2001-08-31 2007-01-09 Intel Corporation Apparatus and method for a data storage device with a plurality of randomly located data
US7272622B2 (en) 2001-10-29 2007-09-18 Intel Corporation Method and apparatus for parallel shift right merge of data
US20040054877A1 (en) 2001-10-29 2004-03-18 Macy William W. Method and apparatus for shuffling data
US7739319B2 (en) 2001-10-29 2010-06-15 Intel Corporation Method and apparatus for parallel table lookup using SIMD instructions
US7631025B2 (en) * 2001-10-29 2009-12-08 Intel Corporation Method and apparatus for rearranging data between multiple registers
US7685212B2 (en) 2001-10-29 2010-03-23 Intel Corporation Fast full search motion estimation with SIMD merge instruction
US7343389B2 (en) * 2002-05-02 2008-03-11 Intel Corporation Apparatus and method for SIMD modular multiplication
US6914938B2 (en) 2002-06-18 2005-07-05 Motorola, Inc. Interlaced video motion estimation

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI637276B (zh) * 2014-12-27 2018-10-01 美商英特爾股份有限公司 執行向量位元混洗的方法與裝置
TWI646436B (zh) * 2014-12-27 2019-01-01 美商英特爾股份有限公司 執行向量位元混洗的方法與裝置
US10296489B2 (en) 2014-12-27 2019-05-21 Intel Corporation Method and apparatus for performing a vector bit shuffle

Also Published As

Publication number Publication date
JP5490645B2 (ja) 2014-05-14
US9229718B2 (en) 2016-01-05
US10152323B2 (en) 2018-12-11
JP2011138541A (ja) 2011-07-14
CN101620525A (zh) 2010-01-06
EP1639452B1 (en) 2009-09-09
US20150121039A1 (en) 2015-04-30
US20110029759A1 (en) 2011-02-03
WO2005006183A3 (en) 2005-12-08
HK1083657A1 (en) 2006-07-07
CN100492278C (zh) 2009-05-27
CN1813241A (zh) 2006-08-02
WO2005006183A2 (en) 2005-01-20
DE602004023081D1 (de) 2009-10-22
JP5535965B2 (ja) 2014-07-02
US20120272047A1 (en) 2012-10-25
US20090265523A1 (en) 2009-10-22
US20130007416A1 (en) 2013-01-03
JP5567181B2 (ja) 2014-08-06
JP2010282649A (ja) 2010-12-16
US20040054877A1 (en) 2004-03-18
KR100831472B1 (ko) 2008-05-22
JP2013229037A (ja) 2013-11-07
RU2006102503A (ru) 2006-06-27
KR20060040611A (ko) 2006-05-10
US9477472B2 (en) 2016-10-25
US9229719B2 (en) 2016-01-05
JP4607105B2 (ja) 2011-01-05
JP2007526536A (ja) 2007-09-13
TW200515279A (en) 2005-05-01
RU2316808C2 (ru) 2008-02-10
EP1639452A2 (en) 2006-03-29
US20150154023A1 (en) 2015-06-04
US8214626B2 (en) 2012-07-03
US8225075B2 (en) 2012-07-17
ATE442624T1 (de) 2009-09-15
US8688959B2 (en) 2014-04-01
US20170039066A1 (en) 2017-02-09
CN101620525B (zh) 2014-01-29

Similar Documents

Publication Publication Date Title
TWI270007B (en) Method and apparatus for shuffling data
US10474466B2 (en) SIMD sign operation
JP4480997B2 (ja) Simd整数乗算上位丸めシフト
US7739319B2 (en) Method and apparatus for parallel table lookup using SIMD instructions
US7539714B2 (en) Method, apparatus, and instruction for performing a sign operation that multiplies
US20040054878A1 (en) Method and apparatus for rearranging data between multiple registers

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees